[01:52:20] Analytics, Datasets-General-or-Unknown, Services: Many error 500 from pageviews API "Error in Cassandra table storage backend" - https://phabricator.wikimedia.org/T125345#1992693 (mobrovac) Open>Resolved a:mobrovac The situation seems to have calmed down in the last 12h, so resolving. [03:17:08] Analytics, Research-and-Data-Archive: Historical analysis of edit productivity for English Wikipedia - https://phabricator.wikimedia.org/T99172#1992722 (DarTar) [07:29:31] Analytics, Datasets-General-or-Unknown, Services: Many error 500 from pageviews API "Error in Cassandra table storage backend" - https://phabricator.wikimedia.org/T125345#1993037 (Nemo_bis) Hm, yes, the example URL produces no error 500 for me now (I don't know whether the treeviews code changed). Nic... [08:35:51] o/ [09:19:38] hi elukey [09:44:18] Analytics-Cluster, DBA, Collaboration-Team-Current: Replicate Echo tables to analytics-store - https://phabricator.wikimedia.org/T115275#1993284 (jcrespo) Open>stalled [10:03:47] Analytics-Kanban, hardware-requests, operations, Patch-For-Review: 8 x 3 SSDs for AQS nodes. - https://phabricator.wikimedia.org/T124947#1993367 (JAllemandou) I +1ed the CR for changing restbase read consistency to one (it'll be good even with SSDs). The change on code for replication factor set to... 
[10:14:31] joal: the hadoop cluster looks very fine today, very good job in the past days :) [10:14:57] elukey: thanks :) [10:15:30] elukey: hadoop usually works it's life without issues, but when it fails, it's a mess :D [10:19:50] joal: this is kafka with broken disks http://devopsreactions.tumblr.com/post/128174792963/when-your-deploy-breaks-other-stuff [10:25:55] :) [11:03:07] (PS19) Joal: Daily last_access uniques oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/216341 (https://phabricator.wikimedia.org/T92977) (owner: Madhuvishy) [11:26:32] (PS20) Joal: Daily last_access uniques oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/216341 (https://phabricator.wikimedia.org/T92977) (owner: Madhuvishy) [11:26:53] (PS1) Joal: Monthly last_access uniques oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/268087 (https://bugzilla.wikimedia.org/124678) [11:53:36] (PS4) Joal: Revert "Revert "Remove mobile webrequest_source merging it in text"" [analytics/refinery] - https://gerrit.wikimedia.org/r/267891 (https://phabricator.wikimedia.org/T122651) (owner: Ottomata) [11:54:08] (PS21) Joal: Daily last_access uniques oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/216341 (https://phabricator.wikimedia.org/T92977) (owner: Madhuvishy) [11:54:36] (PS2) Joal: Monthly last_access uniques oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/268087 (https://bugzilla.wikimedia.org/124678) [12:13:41] * joal is away for a moment :) [12:21:56] Analytics-Kanban: Dashiki visualization that shows a hierarchy [13 pts] {lama} - https://phabricator.wikimedia.org/T124296#1993626 (mforns) a:Milimetric>mforns [12:30:19] Analytics-Tech-community-metrics, DevRel-February-2016: What is contributors.html for, in contrast to who_contributes_code.html and sc[m,r]-contributors.html and top-contributors.html? - https://phabricator.wikimedia.org/T118522#1993630 (Lcanasdiaz) Thanks for your patch @aklapper! This panel is not upstr... 
[13:21:37] Analytics-Cluster, operations: Kafka Broker disk usage is imbalanced - https://phabricator.wikimedia.org/T99105#1993728 (elukey) Actual status: kafka1014.eqiad.wmnet: Filesystem Size Used Avail Use% Mounted on /dev/sda3 1.8T 618G 1.2T 35% /var/spool/kafka/a /dev/sdb3 1.8T... [13:43:16] Analytics-Cluster, operations: Kafka Broker disk usage is imbalanced - https://phabricator.wikimedia.org/T99105#1993811 (JAllemandou) @ottomata: It indeed seems that message keys are actually input to the partitioner :) [13:48:03] Analytics-Cluster, operations: Kafka Broker disk usage is imbalanced - https://phabricator.wikimedia.org/T99105#1993827 (JAllemandou) Having 'per-schema' topics might also impact. [13:59:11] (CR) Elukey: "LGTM, small comment that might not be relevant at all :)" (1 comment) [analytics/camus] (wmf) - https://gerrit.wikimedia.org/r/267167 (https://phabricator.wikimedia.org/T125144) (owner: Ottomata) [14:07:26] (CR) Joal: JsonStringMessageDecoder can now find timestamps using dotted notation, e.g. ("a.b.ts") (1 comment) [analytics/camus] (wmf) - https://gerrit.wikimedia.org/r/267167 (https://phabricator.wikimedia.org/T125144) (owner: Ottomata) [14:09:26] MOrrning! [14:09:32] Hullllo [14:20:35] joal: am checking some emails. how did the data move go? [14:20:55] you were right yesterday, needed to be parallelized [14:21:03] I did that this morning, still running [14:21:14] Some more (really cool) reading: http://data-artisans.com/extending-the-yahoo-streaming-benchmark/ [14:21:35] * joal wants to try flink for real [14:22:31] ja my twitter friend (who no longer works there) is elsewhere now and they are leaning towards flink too [14:24:38] reading article [14:25:04] joal: if you do the data move in recent to old order, maybe we can get started sooner? :D [14:25:36] ottomata: did it the opposite ... 
But, since I didn't start feb, we can still go I think [14:25:57] I'll start feb when we stop the load and refine jobs [14:26:05] oh yea ok [14:26:25] Take your email/article/coffee time, let's start after :) [14:26:27] wait we can go? [14:26:33] i'm reading your article now... :p [14:26:37] huhu [14:26:44] When you're done, let's go :) [14:28:12] whoa joal "Flink supports windows on event time" [14:28:12] ? [14:28:20] like a timestamp in the event? [14:28:28] yeah ottomata, currently reading more on that :D [14:30:35] "Flink started to saturate the network links to Kafka at around 3 million events/sec and thus the throughput was limited at that point." whoa [14:31:02] Yes, and you'll see the throughput when actually creating data inside flink itself ... [14:31:06] Impressive [14:31:28] But the funnier thing comes after --> getting rid of the DB ! [14:39:39] huh yeah i don't quite get that. [14:39:51] they added a listener that responded to queries at that stage? [14:40:01] heh, don't know much about akka/actors [14:40:34] every flink worker adds a thread dedicated to providing the worker internal state to external stuff (queries) [14:41:09] does the worker have access to all the other workers' internal state? [14:41:34] haha [14:41:35] joal: http://data-artisans.com/how-we-selected-apache-flink-at-otto-group/ [14:42:17] yessir, otto-group will do some flink in the near future I guess ;) [14:42:19] heheh [14:42:26] joal: let's do this thang! (can we?) [14:42:29] i didn't quite follow [14:42:32] so, data move is done for jan? [14:42:45] in the move for Jan and Dec [14:42:58] When we stop load/refine, I start Feb [14:43:19] ok ? [14:43:30] Shall we review the plan (I updated stuff this morning)? [14:43:44] and maybe sync in batcave, would be easier? [14:44:11] ah [14:44:16] yeah let's sync [14:44:17] k [14:44:18] batcave [14:53:36] ottomata: hello! 
Whenever you have time, can you tell me more about "The main Kafka clusters use RAID for reliability reasons" [14:53:55] (CR) Ottomata: [C: 2 V: 2] Revert "Revert "Remove mobile webrequest_source merging it in text"" [analytics/refinery] - https://gerrit.wikimedia.org/r/267891 (https://phabricator.wikimedia.org/T122651) (owner: Ottomata) [14:54:21] (CR) Ottomata: [C: 2 V: 2] Daily last_access uniques oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/216341 (https://phabricator.wikimedia.org/T92977) (owner: Madhuvishy) [14:54:30] (CR) Ottomata: [C: 2 V: 2] Monthly last_access uniques oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/268087 (https://bugzilla.wikimedia.org/124678) (owner: Joal) [14:54:39] elukey: hi! [14:54:50] joal and I are busy restarting a bunch of oozie jobs and such [14:54:56] we are in batcave if you are interested [14:55:45] gimme 5 and I'll join :) [15:04:55] (PS1) Joal: Remove mobile dep from last_access_uniques jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/268120 [15:06:11] (CR) Ottomata: [C: 2 V: 2] Remove mobile dep from last_access_uniques jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/268120 (owner: Joal) [15:14:57] (PS1) Joal: Correct aqs_hourly job (typo in quey) [analytics/refinery] - https://gerrit.wikimedia.org/r/268123 [15:17:46] (CR) Ottomata: [C: 2 V: 2] Correct aqs_hourly job (typo in quey) [analytics/refinery] - https://gerrit.wikimedia.org/r/268123 (owner: Joal) [15:18:35] oozie jobs -jobtype coordinator -filter status=RUNNING [15:36:16] oozie job --info 0140731-150922143436497-oozie-oozi-C -offset 1500 [15:59:39] (PS1) Ottomata: Fixing symlinks for recent 0.0.26 refinery artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/268137 [16:00:06] (CR) Ottomata: [C: 2 V: 2] Fixing symlinks for recent 0.0.26 refinery artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/268137 (owner: Ottomata) [16:11:02] Analytics-Kanban, hardware-requests, operations, 
Patch-For-Review: 8 x 3 SSDs for AQS nodes. - https://phabricator.wikimedia.org/T124947#1994142 (Eevans) >>! In T124947#1991215, @Ottomata wrote: > Hold on this, it seems will be replacing the aqs1xxx nodes since they are out of warranty. Moar memory... [16:37:55] Analytics-Kanban, Patch-For-Review: Reorganize oozie jobs to not use mobile cache webrequest_source {hawk} [13 pts] - https://phabricator.wikimedia.org/T122651#1994263 (Ottomata) We are mostly done! Our process was here: https://etherpad.wikimedia.org/p/refinery_mobile2text We still need to wait for some... [16:44:07] Analytics-Kanban, Patch-For-Review: Reorganize oozie jobs to not use mobile cache webrequest_source {hawk} [13 pts] - https://phabricator.wikimedia.org/T122651#1994306 (BBlack) [16:58:01] Analytics-Cluster, Analytics-Kanban, Patch-For-Review, Performance: Implement Unique Devices report on cluster using x-analytics header & last access date {bear} [13 pts] - https://phabricator.wikimedia.org/T92977#1994351 (Nuria) [17:01:44] madhuvishy: standup ? [17:02:07] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Create and maintain an Analytics Cluster in Beta Cluster in labs. - https://phabricator.wikimedia.org/T109859#1994384 (Ottomata) a:Ottomata [17:08:53] Analytics-Kanban, Patch-For-Review: Provide weekly app session metrics separately for Android and iOS, and move to 7 day counts [13 pts] - https://phabricator.wikimedia.org/T117615#1994434 (Nuria) Open>Resolved [17:11:32] Analytics: Expose the results of the global metric at a public link, that's available immediately for the API {kudu} [8 pts] - https://phabricator.wikimedia.org/T118310#1994473 (Milimetric) [17:25:33] joal: https://gerrit.wikimedia.org/r/#/c/216341/21/oozie/last_access_uniques/daily/last_access_uniques_daily.hql [17:25:36] nuria_: so, more on uniques ? 
[17:25:51] has an item for "uniques_underestimate," [17:25:57] yes [17:26:23] a-team: sorry i couldn't make it to standup early - I was up really late last night talking wedding dates to parents! [17:26:24] joal: but we are not using that on final data right? (as table only has underestimate +offset) [17:26:37] madhuvishy: np, you can send an e-scrum [17:26:43] no nuria_ , we have the 3 columns in table [17:26:49] as we discussed the other day [17:26:56] I can rename though if you prefer [17:27:19] joal: ah SORRY [17:27:26] madhuvishy and I had these names before [17:27:35] and they mean something different [17:27:37] i see [17:27:40] Arrrrf [17:27:47] Rename? [17:27:49] joal: you have underestimate+offset=total [17:27:55] joal: no, it's fine [17:27:56] yes m'dame [17:27:59] ok [17:28:19] underestimate = without the fresh sessions [17:28:28] joal: it's that on our early calculations we had an overestimate and underestimate but those do not apply here [17:28:33] joal: right, i get it [17:28:34] nope [17:28:37] ok cool :) [17:28:46] joal: do we have a ticket for monthly jobs? [17:28:51] done [17:29:00] joal: ah , both are done? [17:29:16] (currently in review, cause I need to wait before the jobs to start, but done) [17:29:21] yes [17:30:03] nuria_: https://gerrit.wikimedia.org/r/#/c/268087/ [17:30:23] joal: excellent many thanks joal & madhuvishy [17:30:46] I had the easy part ;) Thanks madhu for all the tough research work ! 
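(Editor's note: the column relationship discussed above, "underestimate+offset=total", can be sketched as follows. This is only an illustration under stated assumptions: `uniques_underestimate` is the name mentioned in the conversation, while `uniques_offset`, `uniques_estimate`, and the `UniquesRow` class are hypothetical names introduced here, and the sample numbers are made up. How the offset itself is computed is not shown in this log.)

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UniquesRow:
    """One row of a uniques table with the three columns discussed in the chat."""

    # Name taken from the conversation: the count without the fresh sessions.
    uniques_underestimate: int
    # Hypothetical column name for the offset added on top of the underestimate.
    uniques_offset: int

    @property
    def uniques_estimate(self) -> int:
        # Hypothetical name for the third column: total = underestimate + offset.
        return self.uniques_underestimate + self.uniques_offset


# Made-up numbers, purely to show the arithmetic.
row = UniquesRow(uniques_underestimate=1000, uniques_offset=250)
print(row.uniques_estimate)
```

Storing all three columns, as joal describes, lets a reader of the table check the total against its two parts without recomputing either one.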
[17:31:43] ottomata: never mind the text load job --> Index reset (probable varnishkafka restart) [17:31:54] ottomata: from cp1008.wikimedia.org [17:37:01] nuria_: we decided with Madhu to run uniques monthly for December even if we'll miss 3 days [17:37:08] nuria_: Do you agree/confirm? [17:37:42] joal: yes, I will send an e-mail announcing it [17:37:53] joal: and i will mention that caveat [17:38:05] ok, not today though, let's wait for december and january data to be there :) [17:38:18] joal: yes, agreed [17:38:27] joal: we will let it bake a bit [17:38:43] yup, looking at numbers as well :) [17:38:49] joal: yes [17:39:38] joal: november numbers are here: https://docs.google.com/spreadsheets/d/1ay0SQeYrwQtaxszUEzj0AuVa9olHsfkgUCTe6kVF3E0/edit#gid=0 [17:39:54] joal: so those we can use too to cross check everything [17:39:59] joal: great ok! [17:40:00] thanks [17:40:07] great nuria_ [17:48:23] Analytics, Analytics-Cluster, operations: Kafka Broker disk usage is imbalanced - https://phabricator.wikimedia.org/T99105#1994605 (Nuria) [17:59:37] ottomata: updated budget with box replacements [17:59:58] ah not sure i was finished.. :) [18:00:39] Analytics, Analytics-Cluster, operations: Kafka Broker disk usage is imbalanced - https://phabricator.wikimedia.org/T99105#1994684 (Ottomata) p:Normal>Lowest [18:27:58] ottomata: I can't get eventlogging-processor to run - in some dependency hell. it says pykafka.rdkafka not found or something [18:28:16] sudo apt-get install librdkafka1 [18:28:20] try that madhuvishy [18:28:22] sounds weird though [18:28:25] ottomata: activating role::eventlogging fails too [18:28:38] a-team: going offline! 
talk with you tomorrow :) [18:28:39] https://www.irccloud.com/pastebin/aUkXcCTZ/ [18:32:24] ottomata: librdkafka1 is installed [18:32:28] https://www.irccloud.com/pastebin/kcTvUPz2/ [18:33:03] madhuvishy: ok, with you in a little while, i need to make some lunch, and i think Jeff Green's kafkatee is broken in production [18:33:12] ottomata: alright np [18:34:25] (PS5) Madhuvishy: [WIP] Setup fabric to deploy dashboards powered by dashiki [analytics/dashiki] - https://gerrit.wikimedia.org/r/259437 (https://phabricator.wikimedia.org/T110351) [18:34:46] bye elukey ! [18:35:21] milimetric: i updated the fabric deployer a bit - it's nicer I think - but we should chat for a few minutes so i know what the next steps are. let me know if you have some time :) [18:47:00] hue is kaput right? [18:49:27] madhuvishy: is there any way to see jobs running other than hue or oozie jobs -len 10? [18:50:07] madhuvishy: ah, this works: oozie jobs -jobtype bundle -filter status=RUNNING [18:51:55] nuria_: yeah I usually use hue - sometimes the yarn scheduler, although it's more job level [18:52:00] works for me nuria_ [18:52:16] joal: hue.wikimedia.org works? [18:52:26] it does for me at least [18:53:06] Going for dinner, back after for a check on jobs [18:53:09] later ! [18:53:11] joal: for me hue times out and https://yarn.wikimedia.org/ gives error, is there a new domain for yarn? [18:54:45] cc madhuvishy [18:56:31] nuria_: you have to proxy now [18:56:31] madhuvishy: ah i see [18:56:31] Ottomata disabled the public url [18:56:31] madhuvishy: k [18:56:31] Uhh [18:56:31] madhuvishy: batcave? [18:56:31] milimetric: sure. Joining in 2 minutes [19:03:04] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event [3 pts] - https://phabricator.wikimedia.org/T125423#1994963 (Nuria) I dropped table and now it has 8 new events from today: MariaDB [log]> select count(*) from M... 
[19:32:39] madhuvishy: [19:32:43] pykafka 2.2.0? [19:32:47] we don't have that in apt [19:32:55] looks like you got leftover problems from your pip install [19:33:29] ottomata: aah [19:33:44] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event [3 pts] - https://phabricator.wikimedia.org/T125423#1995137 (Nuria) Set up consumer, messages are already duplicated in the topic eventlogging-valid-mixed [19:33:47] ottomata: should i just get a new instance? [19:33:52] madhuvishy: yea might be easiest [19:33:59] ottomata: ok cool [19:38:34] milimetric: uwsgi looks alright now - dont know what was happening a few minutes ago but - [19:38:37] https://www.irccloud.com/pastebin/84XJjIW1/ [19:38:58] milimetric: shows up when i tried to log in with google [19:39:10] cool, that makes sense, it was just serving an old version [19:39:22] we'll have to see if that comes back, seems like some problem restarting the service maybe? [19:39:45] milimetric: yeah [19:39:47] maybe [19:39:56] but dont know what's with the google exception [19:43:48] (PS1) Madhuvishy: Make google javascript origin configurable [analytics/wikimetrics-deploy] - https://gerrit.wikimedia.org/r/268198 [19:46:49] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event [3 pts] - https://phabricator.wikimedia.org/T125423#1995208 (Niedzielski) @Nuria, weird! I just saw the events in the table jump from 8 to 10 but now it's stuck t... [19:54:00] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event [3 pts] - https://phabricator.wikimedia.org/T125423#1995243 (Nuria) Events are duplicated already on the client side stream, these are identical events with ident... 
[19:55:37] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event [3 pts] - https://phabricator.wikimedia.org/T125423#1995250 (Nuria) @Niedzielski: events are inserted in batches so they will take a while to make it to db [19:57:24] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event [3 pts] - https://phabricator.wikimedia.org/T125423#1995258 (Niedzielski) @Nuria, ah, I'm used to see them kind of go through in a few seconds. It's been about te... [20:03:39] (PS2) MarkTraceur: Add illustration queries for enwiki [analytics/limn-multimedia-data] - https://gerrit.wikimedia.org/r/267722 [20:03:55] (PS6) Madhuvishy: Setup fabric to deploy dashboards powered by dashiki [analytics/dashiki] - https://gerrit.wikimedia.org/r/259437 (https://phabricator.wikimedia.org/T110351) [20:07:17] (PS7) Ori.livneh: Set up fabric to deploy dashboards powered by dashiki [analytics/dashiki] - https://gerrit.wikimedia.org/r/259437 (https://phabricator.wikimedia.org/T110351) (owner: Madhuvishy) [20:13:17] (PS1) MarkTraceur: Add dewiki to illustrations query config [analytics/limn-multimedia-data] - https://gerrit.wikimedia.org/r/268206 [20:14:30] Analytics team: did we stop providing sample webrequest data? https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequests_sampled [20:14:44] on Hive, the last daily file i can see is from november 23 [20:14:51] tbayer@stat1002:~$ ls /a/squid/archive/sampled/ | tail [20:36:05] HaeB: we turned off udp2log see the readme in that dir [20:36:07] look in /a/log [20:42:17] (PS8) Madhuvishy: Set up fabric to deploy dashboards powered by dashiki [analytics/dashiki] - https://gerrit.wikimedia.org/r/259437 (https://phabricator.wikimedia.org/T110351) [20:43:42] ottomata: still there ? 
[20:45:46] (PS1) Joal: Correct typo in last_access_uniques monthly job [analytics/refinery] - https://gerrit.wikimedia.org/r/268213 [20:45:56] ottomata: --^ [20:46:09] I think that's the last one for today [20:46:42] ottomata: With your permission, we merge that patch, and I correct the code straight in production (one comma) [20:55:55] (CR) Joal: [C: 2 V: 2] "Self merging typo." [analytics/refinery] - https://gerrit.wikimedia.org/r/268213 (owner: Joal) [20:57:45] (PS1) Joal: Update oozie diagram to last status [analytics/refinery] - https://gerrit.wikimedia.org/r/268216 [20:58:12] ottomata: thanks for the explanation! i don't see a readme in /a/log though [20:58:24] ...and the two readmes in /a/squid/archive/sampled/ don't mention it [20:58:57] in any case, should the discontinuation be mentioned in the wikitech documentation in the first place? [20:59:03] https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequests_sampled [21:00:07] i can add the information there now myself if you like, but it saves time for everyone if it's updated as soon as the change is made [21:00:10] joal: go right ahead! [21:00:14] sorry was in 1:1 [21:00:19] np ottomata [21:00:19] oh they don't!? [21:00:39] HaeB: we sent an email out at least, thought for sure I put it in the READMEs [21:00:41] if not i'll update those [21:00:55] !log Manually update last_access_uniques_monthly code on HDFS (patch merged) [21:03:44] ok, but apparently not on any mailing lists i'm on (searched my inbox for " webrequests sampled " and "squid sampled") [21:07:46] HaeB: i can't either. bah, i am remembering wrong i guess... [21:07:49] apologies [21:07:59] ok, thanks for updating the wiki page [21:08:09] BTW, the reason i was asking is that this data formed the basis for the most recent longterm analysis of how our google referral traffic has been developing ( https://github.com/wikimedia-research/memo/ ) ... 
[21:08:47] ...it looks like now that this stream has stopped, we have no way of directly comparing this year-over-year [21:08:50] aye, you should probably use the /a/log stuff if you can, the formats should be the same, and it will be less lossy [21:09:10] discovery has a nice dashboard for this, but it only goes back to october 2015 [21:09:20] HaeB: the format is the same in /a/log/webrequest/archive/sampled [21:09:41] so you should be able to combine the datasets [21:09:48] beware though [21:09:54] folks want to delete the historical data [21:10:30] https://phabricator.wikimedia.org/T92342 [21:12:35] ottomata: aware of EL-kafka lag ? [21:13:23] no [21:13:37] looking [21:13:40] Can I leave it to you ? [21:14:03] Could be time to get to bed for me :) [21:14:44] yes [21:14:47] gooooodnight! [21:14:49] Thanks mate [21:14:57] Bye a-team ! tomoooooorrow :) [21:15:04] ottomata: i see, interesting [21:15:14] ottomata: i see, interesting [21:15:17] it's ok now, btw [22:21:32] nuria_: hey is task https://phabricator.wikimedia.org/T124063 going to update https://vital-signs.wmflabs.org ? [22:22:01] kevinator: will change code but graphs should be identical, makes sense? [22:22:41] yes, makes sense. I wasn't sure from the task that it would affect vital signs. [22:23:05] community member just emailed me directly because there has not been any new data in vital signs since Jan 21 [22:23:14] I'll point him to the phab task. Any ETA? [22:23:21] kevinator: for edits? or pageviews? [22:23:42] pageviews [22:24:13] kevinator: I did not know that, it should not be the case [22:24:25] hmm [22:24:57] ottomata: you around? [22:25:01] ja [22:25:03] hiya [22:25:06] hey [22:25:27] madhuvishy: in the wikimetrics move, did you disable the pageview data that comes from aggregator depot? 
[22:25:32] https://gerrit.wikimedia.org/r/#/c/267924/ should reduce the pageview api read latency [22:25:54] if you have a moment to check & possibly deploy, that would be great [22:26:05] nuria_: no - it's still getting it from there - but it's just git cloned manually - the puppet code doesn't clone automatically [22:26:19] gwicke: i'd just be merge monkey, and can do, but i am about to head out for the day [22:26:22] after the puppet run, RB will need to be restarted to use the new config [22:26:40] nuria_: only vitalsigns depends on it - so i'm temporarily serving data from the endpoint it expects [22:26:41] you think it's ok to merge this even if i'm heading out? [22:26:43] madhuvishy: right, there must have been a cron that updated depot though [22:26:47] git pull that is [22:26:54] nuria_: yes - that's not happening now [22:26:57] it was in the old puppet [22:27:06] or should we wait til morning? [22:27:08] ottomata: if you don't have time to wait for puppet & restart rb, then it's perhaps better to wait [22:27:15] i can ask joal to ping me and watch [22:27:16] nuria_: right now it's not being updated unless you manually git pull [22:27:19] otherwise the system will be in a strange state [22:27:34] aye ok [22:27:36] madhuvishy: ok, I think we need to do that automatically [22:28:01] nuria_: well once your dashiki consumes from pageview api change is done - we won't need it [22:28:18] madhuvishy: as even when we have pageviews from api endpoint we might want to have both methods for a bit to see whether latencies from api are acceptable [22:28:48] nuria_: that's fine - we can just keep updating periodically manually until we figure it out [22:28:54] madhuvishy: the medians are way too high so i suspect until we get more hardware we will not switch completely [22:29:28] madhuvishy: I think we need a cron in puppet (we can remove it later), otherwise we will forget [22:29:39] if we do want it - we should move the data to some other place like some 
datasets.wikimedia place that milimetric showed me [22:29:41] gwicke: will do it with joal in my morning [22:29:52] ottomata: kk, thx! [22:29:55] yup! [22:29:59] laters all! [22:30:15] nuria_: i'd rather not add things that are not relevant to wikimetrics there - let's move the data elsewhere if that's what we want [22:52:35] Analytics: Process new PV hourly dumps for Vital Signs {hawk} - https://phabricator.wikimedia.org/T94592#1995937 (kevinator) [22:53:44] Analytics: Process new PV hourly dumps for Vital Signs {hawk} - https://phabricator.wikimedia.org/T94592#1167628 (kevinator) I'm gonna mark this as a duplicate of {T124063} [22:54:51] Analytics: Process new PV hourly dumps for Vital Signs {hawk} - https://phabricator.wikimedia.org/T94592#1995955 (kevinator) [22:54:53] Analytics-Kanban: Make Dashiki get pageview data from pageview API {melc} - https://phabricator.wikimedia.org/T124063#1944457 (kevinator) [23:04:25] Analytics: Add regexps that match the bots that follow the User-Agent policy {hawk} - https://phabricator.wikimedia.org/T125731#1996003 (mforns) NEW [23:07:37] milimetric: I've fixed the fonts/ thing for now by also uploading fonts to the root - so edit-analysis-test.wmflabs is doing good now. Feel free to review the fabric patch [23:07:47] i moved it to in Code review [23:16:19] madhuvishy: There are two things 1) the short term fix so vital signs works and 2) work to move it elsewhere. 2) will be taken care of with pageview api move but until then vital signs should work like it did prior so i'd say we add the cron for git pull short term and remove it once we no longer need it. [23:17:51] nuria_: we already talked about this - milimetric and I discussed it and the first plan was I immediately move the datasets to a public folder that currently has other datasets. then he said let's just manually update until we make the pageview api move. it's just git pulling once in a while - and vitalsigns is fine. 
[23:19:38] madhuvishy: the problem I see with that plan is that we forget to pull manually and then we have incomplete data, now we are missing two weeks [23:21:00] nuria_: well it is temporary. i just pulled it so it should be okay - we can talk about it at standup tomorrow and take a call again if you want - i really don't want to put the puppet code in the wikimetrics module [23:21:15] madhuvishy: but that's ok, i can make a regular cron in the box. it doesn't have to be in puppet [23:21:45] nuria_: hmmm that's also kinda bad then we have stray processes in the instance [23:21:56] and we'd just forget to clean it up [23:22:55] madhuvishy: sure, not the best but I think that is a better alternative to having to do it by hand and forgetting which is the current state of affairs [23:23:20] nuria_: will it take very long to move it to the pageview api setup? [23:24:02] madhuvishy: as i said before, even when the whole code is done if latencies are not good we will not move it [23:25:53] nuria_: which is fine - at which point let's move the data out of the wikimetrics server - serve it from elsewhere. datasets.wmflabs or pageviews.wmflabs - i'm fine making a puppet role to do that which will clone and update these repos [23:26:03] just not okay with adding it to this instance [23:26:36] I can pull it every day until then [23:29:11] jaja, let's just do a cron on the instance, it can be removed after, so we do as little manual work as possible. i can do that [23:29:51] it's an ok tradeoff [23:51:29] Analytics-Kanban: Remove cron on wikimetrics instance that updates vital signs [1] - https://phabricator.wikimedia.org/T125751#1996333 (Nuria) NEW a:Nuria [23:54:25] madhuvishy: created task so we do not forget to remove cron (linked to pageview api task) [23:54:26] https://phabricator.wikimedia.org/T125751?workflow=124063 [23:54:34] ok [23:54:45] madhuvishy: do not worry, we will not forget