[00:37:54] Are you on your way already? [00:38:29] Must have missed you at the elevators downstairs [06:43:23] 10Analytics, 10Analytics-Kanban: Upgrade Analytics infrastructure to Debian Stretch - https://phabricator.wikimedia.org/T192642 (10elukey) [06:51:13] 10Analytics, 10decommission, 10User-Elukey: Decommission analytics100[1,2] - https://phabricator.wikimedia.org/T205507 (10elukey) p:05Triage>03Normal [06:51:49] 10Analytics, 10Operations, 10decommission, 10ops-eqiad, 10User-Elukey: Decommission analytics100[1,2] - https://phabricator.wikimedia.org/T205507 (10elukey) [06:58:15] 10Analytics, 10Operations, 10decommission, 10ops-eqiad, and 2 others: Decommission analytics100[1,2] - https://phabricator.wikimedia.org/T205507 (10elukey) [06:58:49] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Replace the Analytics HDFS/Yarn masters (hardware refresh) - https://phabricator.wikimedia.org/T203635 (10elukey) [07:04:12] 10Analytics, 10Analytics-Kanban, 10User-Elukey: Q1 2018/19 Analytics procurement - https://phabricator.wikimedia.org/T198694 (10elukey) [07:15:12] 10Analytics: Replace the Analytics Hadoop coordinator - Hive/Oozie/etc... (hardware refresh) - https://phabricator.wikimedia.org/T205509 (10elukey) p:05Triage>03Normal [07:15:46] 10Analytics: Replace the Analytics Hadoop coordinator - Hive/Oozie/etc... (hardware refresh) - https://phabricator.wikimedia.org/T205509 (10elukey) [07:37:54] 10Analytics, 10Operations, 10Research-management: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843 (10elukey) Getting back to this task: the plan is now to "free" stat1005 and move all users to a new host, stat1007. This will allow us to reboot/compile/etc.. on stat1005 without impac... 
[07:38:07] 10Analytics, 10Operations, 10Research-management, 10User-Elukey: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843 (10elukey) [07:38:55] 10Analytics, 10Operations, 10ops-eqiad, 10Patch-For-Review: setup/install an-coord1001/wmf7621 - https://phabricator.wikimedia.org/T204970 (10elukey) Any chance that this work can be done before the end of next week? If so I'll plan some maintenance time for Hadoop :) [07:51:14] !log stop mysql consumers on eventlog1002 as prep step for db maintenance [07:51:16] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [08:02:59] Morning elukey - 5 mins check before back to kids - everything seems ok this mroning, right? [08:03:13] joal: yep! [08:03:19] Great :) [08:03:21] Bonjour! [08:03:50] Bonjour à toi :) [08:22:19] !log start mysql consumers on eventlog1002 after maintenance [08:22:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [08:22:26] eventlogging master upgraded! [08:23:50] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Test role::analytics_cluster::coordinator on Debian Stretch - https://phabricator.wikimedia.org/T204060 (10elukey) Fixed during the offsite, everything works! [08:23:57] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Test role::analytics_cluster::coordinator on Debian Stretch - https://phabricator.wikimedia.org/T204060 (10elukey) [10:27:38] 10Analytics, 10Analytics-Kanban: Reboot Analytics hosts for kernel security upgrades - https://phabricator.wikimedia.org/T203165 (10elukey) [10:28:07] 10Analytics, 10Operations, 10ops-eqiad: analytics1068 doesn't boot - https://phabricator.wikimedia.org/T203244 (10elukey) Any news from Dell ? [11:11:23] * elukey lunch! [11:35:14] 10Analytics, 10Performance-Team (Radar): Rename column on old hive data for a few tables - https://phabricator.wikimedia.org/T204922 (10Gilles) 05Open>03declined I'll give views a shot, thanks. I wasn't aware those were available. 
[12:08:55] 10Analytics, 10Analytics-Kanban: Make hover info-box on bar charts consistent with line charts - https://phabricator.wikimedia.org/T205461 (10fdans) [12:38:08] 10Analytics-Kanban, 10Beta-Cluster-Infrastructure, 10Operations, 10Patch-For-Review, and 2 others: Prometheus resources in deployment-prep to create grafana graphs of EventLogging - https://phabricator.wikimedia.org/T204088 (10fgiunchedi) >>! In T204088#4616379, @Ottomata wrote: > BTW, I updated https://wi... [13:24:49] 10Analytics-Kanban, 10Beta-Cluster-Infrastructure, 10Operations, 10Patch-For-Review, and 2 others: Prometheus resources in deployment-prep to create grafana graphs of EventLogging - https://phabricator.wikimedia.org/T204088 (10Ottomata) I didn't realize that either! Documenting. [13:32:37] joal: you around? wanna bc on revision-score real quick? [13:33:19] ottomata: I think he is out until standup, kids day [13:33:21] morning :) [13:34:47] ohoo k [13:34:49] hiyaaa [13:36:56] I'd need to reboot an-master1001 to clear out a weird systemd state (already fixed in an-master1002 this morning) [13:37:09] so if nobody opposes I am going to failover to an-master1002 briefly [13:39:01] 10Analytics, 10Analytics-Kanban: Add index to mediawiki_page_create_3 - https://phabricator.wikimedia.org/T204572 (10Ottomata) On db1108 (eventlog log database replica) I ran: ``` alter table mediawiki_page_create_3 add index `ix_mediawiki_page_create_3_database_rev_timestamp_T204572` (`database`(64),`rev_tim... 
[13:40:23] 10Analytics, 10Analytics-EventLogging, 10EventBus, 10Core Platform Team Goals, and 2 others: Modern Event Platform: Stream Configuration Service - https://phabricator.wikimedia.org/T205319 (10Ottomata) [13:40:48] rebooting [13:45:17] I tried to -r this chan but it doesn't work [13:45:18] sigh [13:47:27] reboot done! [13:53:00] (03PS3) 10Ottomata: Remove potentially dangerous Refine Config defaults [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/462820 (https://phabricator.wikimedia.org/T203804) [13:57:45] mforns: hiya! please let me know when you have a moment to chat about T205441 and https://gerrit.wikimedia.org/r/#/c/analytics/reportupdater/+/462732/ [13:57:46] T205441: 'group' parameter in Reportupdater for automatic chgrp of generated reports - https://phabricator.wikimedia.org/T205441 [14:00:23] 10Analytics: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T205544 (10Marostegui) [14:00:36] 10Analytics: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T205544 (10Marostegui) p:05Triage>03Normal [14:03:57] 10Analytics: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T205544 (10elukey) I agree, let's pick a couple of big tables and convert them. [14:06:53] (03CR) 10Ottomata: [C: 032] Remove potentially dangerous Refine Config defaults [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/462820 (https://phabricator.wikimedia.org/T203804) (owner: 10Ottomata) [14:25:47] 10Analytics: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T205544 (10Banyek) wouldn't be better if we skip toku, but compress the tables? 
I am not sure if it is not good to have the same kind of storage engines everywhere. Less hidden caveats ,etc. [14:26:29] 10Analytics: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T205544 (10Marostegui) >>! In T205544#4619129, @Banyek wrote: > wouldn't be better if we skip toku, but compress the tables? > I am not sure if it is not good to have the same kind of storage engines everywhere. Less hidden caveats ,et... [14:27:48] 10Analytics: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T205544 (10Banyek) ah, ok [14:33:32] a-team: I'm still busted from last night, I think I'm going to skip standup and take a nap [14:33:57] ok dan, rake rest [14:33:57] fdans / mforns: could one of you review https://gerrit.wikimedia.org/r/#/c/analytics/wikistats2/+/440971/, it'd be nice to deploy it soon and make the annotation pages on meta [14:34:04] ok [14:34:12] milimetric: you made it tho! have a nice nap [14:34:28] yeah :) I'm alive [14:34:44] what happened? Are you ok? [14:34:51] mforns: I’ll review annotations and you can take a look at pages to date? that’s ready too :) [14:35:09] ok fdans! [14:36:19] bearloga, sorry I forgot about that yesterday... I have a meeting now, but in 30 mins we can look at that if you can? [14:37:44] i also will miss standup, we got better use of data meeting today [14:37:48] elukey: no, totally fine, just baby woke up 8 times screaming for mama [14:37:58] ahhh right your night alone! [14:38:04] but you made it! [14:51:41] ottomata: so I just discovered that our journald logs are under /run, that is on tmpfs :D [14:51:52] I was wondering why they were wiped after a reboot [14:52:05] no bueno [14:52:25] I'll add the rsyslog rules to dump (when needed) on a file [14:53:34] 10Analytics: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T205544 (10jcrespo) Some of them may be on innodb on purpuse because we hit a bug (T109069), some may not be on purpose because alter/import tables happened. 
This is the last tokudb server aside from analytics. TokuDB works mostly ok fo... [14:53:54] bearloga, I'm looking into your RU change [14:53:59] ooohhhh right ok great elukey [14:59:39] mforns: do you think https://gerrit.wikimedia.org/r/#/c/analytics/wikistats2/+/458784/ is ready to merge? [15:00:12] nuria, fdans asked me to review this, will do after I finish my current review for bearloga [15:00:21] mforns: sounds good [15:00:54] mforns: thanks! :) [15:01:47] * fdans joins standup one hour earlier as part of his Wednesday routine [15:01:48] mforns: I have no idea where the logic is buried for default parameters so I would like some help with that, please and thank you [15:02:39] (03PS2) 10Mforns: [WIP] Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) (owner: 10Bearloga) [15:03:07] bearloga, I just pushed a small patch that adds that logic, I haven't tested, but should be something like that [15:03:26] bearloga, please, can you check that it works? [15:03:31] 10Analytics: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T205544 (10Banyek) a:03Banyek I can take care of this [15:03:53] also, I think flake8 is complaining about an except statement? [15:05:00] bearloga, it would be cool also to add 1 test that checks the group thing, if it is not super-time consuming.. [15:05:27] mforns: oh I see the defaults thing now. good job! 
and yeah, I'll try to add a test [15:05:37] thanks :] [15:05:47] (03CR) 10jerkins-bot: [V: 04-1] [WIP] Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) (owner: 10Bearloga) [15:09:20] 10Analytics, 10User-Banyek: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T205544 (10Banyek) [15:11:31] (03PS1) 10Bearloga: Blank patch to confirm CI [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/463091 [15:14:38] (03CR) 10jerkins-bot: [V: 04-1] Blank patch to confirm CI [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/463091 (owner: 10Bearloga) [15:15:31] mforns: ^ [15:16:18] bearloga, yes, I think the syntax checker does not allow for "naked" excepts no? [15:17:13] mforns: sure looks like it but I'm not the codebase maintainer ¯\_(ツ)_/¯ [15:18:21] bearloga, I see now what you mean! I thought it was the new code that triggered this erro... [15:18:22] mforns: can you please make the necessary changes so I can rebase on top of the correct syntax? [15:18:45] that's weird... flake8 might have been updated maybe? [15:19:16] bearloga, will do! but I need to go to Scrum of Scrums meeting in short, is it OK if I do later? [15:20:36] mforns: thanks! and no problem [15:26:16] (03CR) 10Mforns: "I think code looks great now!" (031 comment) [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/458784 (https://phabricator.wikimedia.org/T203180) (owner: 10Fdans) [15:28:06] mforns: whats do you suggest instead of repeating the config between new-pages and pages-to-date? [15:28:38] what do you mean? do we need new-pages config 2 times? 
[15:29:42] I might have misunderstood, but I think the config for new-pages exists 2 times, one before pages-to-date and one after it [15:36:34] mforns: ohhh damn I'm stupid [15:36:35] sorry [15:37:01] xD, no problem, it was just a copy paste thing [15:37:02] I thought you were suggesting that we avoid repeating code between the two metrics [15:52:12] a-team: my cat decided to become Indiana Jones and after some jumps ended up injuring its paw, so I'd need to take her to the vet and skip standup sorry :( [15:52:22] I'll send e-scrum and re-join later [15:52:31] oh... [15:53:21] oh elukey that sucks, give her all the a.team snuggles [15:54:13] (03PS1) 10Fdans: Make the line chart tooltip move with cursor [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/463102 (https://phabricator.wikimedia.org/T205461) [15:56:14] (03PS9) 10Fdans: Add pages to date metric [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/458784 (https://phabricator.wikimedia.org/T203180) [15:58:55] fdans: ah, i missed that too on Cr [16:01:16] 10Analytics, 10Analytics-Kanban: Add index to mediawiki_page_create_3 - https://phabricator.wikimedia.org/T204572 (10JAllemandou) > @nettrom_WMF what can we do to get your queries moved to Hive instead of MySQL? You can use the hive table `event.mediawiki_page_create`. I +1 this idea, even with the possibil... [16:01:44] milimetric, ottomata , joal: standdduppp [16:02:02] ohh sorry will miss today! 
better use of data meeting is now [16:02:27] I’m still napping, dizzy from last night, will take half day [16:03:20] 10Analytics, 10Analytics-Kanban: Add index to mediawiki_page_create_3 - https://phabricator.wikimedia.org/T204572 (10Nuria) [16:16:54] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Wikistats: add functions you apply to dimensional data such as "accumulate" - https://phabricator.wikimedia.org/T203180 (10Nuria) We need to fix bug with '0' on dashboard {F26212555} [16:21:25] 10Analytics, 10Analytics-Kanban: Change dashboard order in wikistats - https://phabricator.wikimedia.org/T205555 (10Nuria) [16:40:57] (03PS4) 10Joal: Correct end-point documentation [analytics/aqs] - 10https://gerrit.wikimedia.org/r/456573 (https://phabricator.wikimedia.org/T203403) [16:41:05] elukey: merging that last one the ready [16:41:23] (03CR) 10Joal: [V: 032 C: 032] "Merging for deploy" [analytics/aqs] - 10https://gerrit.wikimedia.org/r/456573 (https://phabricator.wikimedia.org/T203403) (owner: 10Joal) [16:42:22] https://dist.apache.org/repos/dist/dev/bigtop/bigtop-1.3.0-RC1/ \o/ [16:42:28] joal: ack! [16:42:53] elukey: we were waiting for that one were we? [16:42:57] git up [16:42:59] oops [16:44:49] joal: yes basically, surely to test Hive 2.x [16:44:57] they are voting the release in these days [16:45:04] \o/ [16:45:47] ooo [16:46:24] do the RCs get .debs, do you know? [16:47:10] not sure, but there is a procedure to build all the deb packages via docker container IIUC [16:47:18] wow [16:47:35] in theory if the RC is accepted then the debs will be uploaded [16:47:41] so we can wait a couple of days [16:47:43] elukey: something we'll be willing to if we go for testing hive2 is to also bring Tez [16:47:46] I thinkg [17:00:33] joal: ready to deploy or after 1:1? [17:01:20] After 1-1, fighting with docker [17:01:23] sorry elukey :( [17:01:33] no issue! 
will re-check in a few then :) [17:18:23] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, and 2 others: RFC: Modern Event Platform: Scalable Event Intake Service - https://phabricator.wikimedia.org/T201963 (10Joe) I'll write down here some questions I'd like to discuss in this evening's meeting: - In general, rewriting from... [17:23:26] (03PS1) 10Joal: Update aqs to 91b287e [analytics/aqs/deploy] - 10https://gerrit.wikimedia.org/r/463116 [17:30:31] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10Joe) >I want to produce events and get a synchronous response if my event is produced so I can build production backend... [17:34:52] nuria: T178587 [17:34:53] T178587: Update wikimedia-history revision data with deleted field (and find it a new name?) - https://phabricator.wikimedia.org/T178587 [17:35:20] nuria: T179692 [17:35:21] T179692: Enhance mediawiki-history page reconstruction with best historical information possible - https://phabricator.wikimedia.org/T179692 [17:35:33] 10Analytics: Update wikimedia-history revision data with deleted field (and find it a new name?) - https://phabricator.wikimedia.org/T178587 (10Nuria) [17:35:35] 10Analytics, 10Analytics-Kanban: Raise Edit Data Quality to the point where we can offer snapshots on Cloud (labs) environment - https://phabricator.wikimedia.org/T204953 (10Nuria) [17:36:57] 10Analytics, 10Analytics-Kanban: Raise Edit Data Quality to the point where we can offer snapshots on Cloud (labs) environment - https://phabricator.wikimedia.org/T204953 (10Nuria) [17:36:59] 10Analytics: Enhance mediawiki-history page reconstruction with best historical information possible - https://phabricator.wikimedia.org/T179692 (10Nuria) [17:38:43] elukey: ready for deploy? 
[17:38:53] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10daniel) > anything other than the "fire and forget" pattern is a potential cause of latencies and even of outages. Ins... [17:39:38] mforns: to ingest schema readingdepth we do not really need to do any changes to refine as this schema does not have arrays, will asign ticket to fdans [17:40:15] (03CR) 10Joal: [V: 032 C: 032] "LGTM! Merging for deploy" [analytics/aqs/deploy] - 10https://gerrit.wikimedia.org/r/463116 (owner: 10Joal) [17:40:34] joal: yep! [17:44:37] elukey: scaping in a minute :) [17:45:07] elukey: shall we let operatuin chan know? [17:45:15] 10Analytics: Ingest data into druid for readingDepth schema - https://phabricator.wikimedia.org/T205562 (10Nuria) [17:45:58] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10Ottomata) > The only reason I can think of for this is to allow the intake to validate the event, and tell the sender w... [17:49:45] !log Deploy AQS from scap [17:49:46] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:50:28] joal: sorryyyy I was writing in a task [17:50:31] seen all the updates [17:50:46] elukey: currently checking aqs1004 after canary deploy [17:53:35] elukey: all good from aqs1004 - continuing rolling deploy ? [17:54:09] fdans: is this change on top of "pages to date" https://gerrit.wikimedia.org/r/#/c/analytics/wikistats2/+/463102/ [17:54:18] joal: +1 [17:54:32] fdans: does not lokk like it from git history but i see "pages to date " metric when i download the dashboard [17:55:08] nuria: that's weird... 
no, the base should be origin/master [17:55:20] fdans: do try it and let me know [17:55:58] (03PS10) 10Fdans: Add pages to date metric [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/458784 (https://phabricator.wikimedia.org/T203180) [17:56:17] nuria: looking [17:57:01] nuria: confirming that I don't see pages-to-date with that change [17:57:20] fdans: ah wait no, local cached copy [17:57:26] fdans: my mistake [17:57:59] nuria: I wonder if there is any magic we can use in web pack so that js is never cached when using dis-dev [17:58:17] goddammit safari autocorrect, who invited you [17:58:45] fdans: ya, totally doable [17:59:03] fdans: but hard to do that and incremental builds [17:59:23] fdans: i think python server should do that one sec [18:00:37] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, and 2 others: RFC: Modern Event Platform: Scalable Event Intake Service - https://phabricator.wikimedia.org/T201963 (10Ottomata) I'll try answer these in order of increasing difficulty: > Why would a re-implementation use node instead... [18:00:54] joal: all good? [18:01:29] elukey: yes sir ! just finished [18:01:51] elukey: I prefer to take my time, go gently and watch charts when deploying that beast :) [18:01:52] fdans: mm no, it does not by default [18:02:02] ah good good :) [18:02:08] the graphs looks good to me [18:02:12] same [18:02:52] elukey: I'm gonna write the patch for restbase, and maybe I'll mange to have it deployed tonight or tomorrow :) [18:02:57] Many thanks elukey ! [18:03:15] I release you from my evening-will-to-deploy :D [18:04:04] as always you did all the work :D [18:04:09] have a good evening! [18:04:11] * elukey off [18:04:19] nuria, re. readingdepth yes, but this schema is not the one that we were analyzing before no? [18:04:22] bye elukey! 
[18:04:28] bye elukey :) [18:04:59] mforns: there are two schemas currently being use in audiences : readingdepth and pageissues [18:05:07] ok ok [18:05:27] mforns: pageissues is the one with the array fields that cannot be imported [18:05:34] yea ok [18:08:21] joal: "top editors" is top by number of edits right? [18:08:29] correct nuria [18:08:36] joal: per our conversation before [18:09:14] since top-contributors could be thought of as contributors-having-added-most-bytes (as a meaning of having added value) [18:09:20] nuria: --^ [18:09:46] (03CR) 10Nuria: Add most top editors and top edited pages metrics (033 comments) [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882) (owner: 10Fdans) [18:10:28] fdans: some comments to this patch, we can talk about them in irc if you are arround [18:11:49] nuria: that CR is still wip, all the aqs code will be removed once the endpoints are updated :) [18:12:45] fdans: right for intervals ,there are couple other things , aqs was just deployed by joal [18:13:06] nuria: I need a restbase deploy before we can test for real [18:13:25] joal: i see is peter doing that? [18:14:08] nuria: I do hope - preparing a patch now, and I have a pending deploy that have been waiting for 2 weeks, so maybe he'll accept to do it now :) [18:14:23] nuria: yeah the naming, will change that, thanks for the CR :) [18:16:46] (03Abandoned) 10Mforns: Blank patch to confirm CI [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/463091 (owner: 10Bearloga) [18:20:56] uyuyuy [18:21:28] * fdans realizes of a pretty embarrassing bug in the Wikistats dashboard [18:21:46] (03CR) 10Mforns: "Bearloga, thanks for pointing out the flake8 problem!" 
[analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) (owner: 10Bearloga) [18:27:46] 10Analytics-Kanban, 10Analytics-Wikistats: Last 12 months aggregate is actually taking the first 12 months - https://phabricator.wikimedia.org/T205565 (10fdans) [18:28:43] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10Nuria) >>I want to produce events and get a synchronous response if my event is produced so I can build production back... [18:29:53] I'm having trouble reaching bast4001 [18:32:20] groceryheist: I am not sure we can help you with that, maybewikimedia-dev? [18:32:49] hmm, I was able to ssh yesterday, but not today [18:32:56] (03PS1) 10Fdans: Fix last 12 months aggregate [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/463126 (https://phabricator.wikimedia.org/T205565) [18:33:48] mobrovac: we are preparing a restbase patch (cc joal) do you think you would be able to deploy it today/tomorrow? [18:35:22] mobrovac, nuria: https://github.com/wikimedia/restbase/pull/1070 [18:36:41] (03CR) 10jerkins-bot: [V: 04-1] Fix last 12 months aggregate [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/463126 (https://phabricator.wikimedia.org/T205565) (owner: 10Fdans) [18:38:10] Gone for diner, will be back after [18:38:25] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10Ottomata) > Agreed. That is why results of having sent an event are to be received on a callback if at all. I think the... [18:38:48] nuria, joal: we are in the process of deploying some big changes to rb and parsoid, so it will have to wait at least a couple of days :/ [18:40:20] ottomata: yt? 
[18:41:01] mobrovac: ok, we need this deploy for our goals so if we could have it deployed by thursday EOD it will be great [18:41:08] bast4001 is having maitenance, switching to 2001 worked [18:41:42] (03PS2) 10Fdans: Fix last 12 months aggregate [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/463126 (https://phabricator.wikimedia.org/T205565) [18:42:25] nuria: fyi this one is pretty critical ^ [18:43:18] 10Analytics, 10ORES, 10Scoring-platform-team, 10revscoring, 10artificial-intelligence: [Investigate] Use PMML for prediction model serialization - https://phabricator.wikimedia.org/T173244 (10awight) a:05awight>03None [18:44:55] nuria: yo am here [18:48:32] fdans: looking [18:49:26] ottomata: sooo, what i was saying on MEDP ticket is that I do not think we need the synchronous use case, the eventbus callback error now is async , right? [18:50:17] nuria: i think there is a confusion of blocking vs async [18:51:18] ottomata: the way i understand it synchronous is blocking [18:51:39] ottomata: were you thinking differently? [18:51:48] k maybe i need it rephrased then. i'm just calling out a difference between [18:51:49] fire and forget [18:51:53] and waiting for a response [18:51:57] (which might be async) [18:52:13] nuria: why do we have a deadline on Joal PR? [18:53:56] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10Ottomata) Maybe I need to change the wording. These use cases are just calling out the difference between fire-and-forg... 
[18:54:19] ottomata: right [18:55:00] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10Ottomata) [18:55:05] ottomata: fire and forget is actually neither but the response to eventbus when on error i think most people will describe it as asyn error callback [18:55:19] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10Ottomata) Changed wording to: > - As an **engineer**, I want to produce events and get an appropriate response if my e... [18:55:49] (03PS3) 10Bearloga: Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) [18:56:51] Pchelolo: the dealine is not a hard one, ideally we want to deploy what we need to be able to finish our goals this week but if you have a major deployment today and it cannot be done we will need to wait a couple days i guess [18:58:57] (03CR) 10jerkins-bot: [V: 04-1] Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) (owner: 10Bearloga) [19:00:01] nuria: that patch doesn't affect anything we're doing, but in the perfect world I would want to separate deploys. We can just merge it and deploy together with tomorrow big one. But that implies a risk we will have to undeploy it back [19:00:17] mforns: alright, the tests pass. wanna double check that I've written the test correctly? [19:00:34] Pchelolo: really, it can wait . [19:01:04] ok. I will try to get it done tomorrow. 
But not today definitely, sorry about that [19:01:52] we need to think on T204981 because we can not continue doing these double changes and double deploys for things as easy as this one [19:01:53] T204981: Keeping Node services documentation in sync - https://phabricator.wikimedia.org/T204981 [19:02:01] bearloga, looking! [19:04:16] mforns: idk what groups are available so I just used the group to which the testing process belongs to, but idk if that's a thorough enough test of the group-changing feature [19:04:44] fdans: I see, that fixes one of the bugs but there is still a 0.0% yoy that looks wrong [19:06:06] bearloga, yea, that's a tricky test to do... [19:06:22] fdans: take a look and see if you get what i mean [19:07:39] mforns: maybe using grp.getgrall and using whatever the group is the first entry in that result would be better [19:07:47] bearloga, theoretically the file will have the testing process group anyway, no? [19:07:59] aha, makes sense [19:08:43] mforns: yeah, so at best the current test just tests that things don't go haywire (which isn't necessarily useless). gonna try that getgrall approach now! :) [19:09:21] bearloga, I wonder if os.chown will need some special permissions in some cases? [19:10:00] if you chown to a group where RU does not have reading or writing rights? [19:11:35] also, bearloga, if you want you can remove the asserts for output_lines, those are tested elsewhere already no? [19:11:38] mforns: like, if output directory is owned by group A but we specify the output file to be owned by group B? 
[19:12:22] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Remove webrequest misc analytics related jobs and code after cache misc -> text merge is complete - https://phabricator.wikimedia.org/T200822 (10Nuria) 05Open>03Resolved [19:12:32] 10Analytics-Kanban, 10Patch-For-Review: Refactor Refine job scalaopt to use property files and CLI overrides - https://phabricator.wikimedia.org/T203804 (10Nuria) 05Open>03Resolved [19:12:41] bearloga, yes, my concern was if we specify the output file to be owned by a group that user executing update_reports.py has no write permission [19:12:45] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Replace the Analytics HDFS/Yarn masters (hardware refresh) - https://phabricator.wikimedia.org/T203635 (10Nuria) 05Open>03Resolved [19:13:11] mforns: oh, that's what you meant [19:13:11] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Test role::analytics_cluster::coordinator on Debian Stretch - https://phabricator.wikimedia.org/T204060 (10Nuria) 05stalled>03Resolved [19:13:29] bearloga, but the individual user will continue to be there, so I guess no problem, right? 
[19:13:38] mforns: that's my thinking too [19:13:47] ok [19:13:49] makes sense [19:13:59] mforns: that as long as update_reports.py is at least executed by the same user even if not same group, that should be fine [19:14:12] 10Analytics, 10Analytics-Kanban: Add index to mediawiki_page_create_3 - https://phabricator.wikimedia.org/T204572 (10Nuria) 05Open>03Resolved [19:14:13] aha [19:16:10] 10Analytics, 10Analytics-Kanban: Ingest data into druid for readingDepth schema - https://phabricator.wikimedia.org/T205562 (10Nuria) [19:16:35] 10Analytics: Review parent task for any potential pageview definition improvements - https://phabricator.wikimedia.org/T156656 (10Nuria) p:05Low>03High [19:17:56] 10Analytics, 10Research: Refactor pagecounts-ez generation - https://phabricator.wikimedia.org/T192474 (10Nuria) p:05Low>03High [19:18:04] 10Analytics: Generate pagecounts-ez data back to 2008 - https://phabricator.wikimedia.org/T188041 (10Nuria) p:05Low>03High [19:18:31] 10Analytics: Whitelist analytics.wikimedia.org and stats.wikimedia.org in ad blockers - https://phabricator.wikimedia.org/T182816 (10Nuria) p:05Low>03Triage [19:18:58] 10Analytics: Remove request for font.googleapis.com from analytics.wikimedia.org - https://phabricator.wikimedia.org/T182804 (10Nuria) p:05Low>03Triage [19:19:58] (03CR) 10Nuria: "Anything preventing us from merging this changeset?" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/454631 (https://phabricator.wikimedia.org/T199900) (owner: 10Mforns) [19:20:32] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Patch-For-Review: [EL sanitization] Store the old salt for 2 extra weeks - https://phabricator.wikimedia.org/T199900 (10Nuria) p:05Normal>03High [19:20:45] (03CR) 10Mforns: "No, it's fixed, tested and ready." 
[analytics/refinery] - 10https://gerrit.wikimedia.org/r/454631 (https://phabricator.wikimedia.org/T199900) (owner: 10Mforns) [19:22:05] 10Analytics: Move the Analytics Refinery to Python 3 - https://phabricator.wikimedia.org/T204735 (10Nuria) a:05fdans>03None [19:22:55] 10Analytics, 10Analytics-Wikistats, 10Operations, 10Traffic, 10Regression: [Regression] stats.wikipedia.org redirect no longer works ("Domain not served here") - https://phabricator.wikimedia.org/T126281 (10Nuria) 05Open>03Resolved [19:23:19] 10Analytics, 10Analytics-Kanban: Upgrade Analytics infrastructure to Debian Stretch - https://phabricator.wikimedia.org/T192642 (10Nuria) p:05Normal>03Triage [19:23:23] 10Analytics, 10Analytics-Kanban: Upgrade Analytics infrastructure to Debian Stretch - https://phabricator.wikimedia.org/T192642 (10Nuria) p:05Triage>03Normal [19:23:38] 10Analytics-Kanban: Upgrade Analytics infrastructure to Debian Stretch - https://phabricator.wikimedia.org/T192642 (10Nuria) [19:24:19] (03PS4) 10Bearloga: Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) [19:24:51] 10Analytics, 10Performance-Team (Radar): Evaluate alternate means to send X-Analytics information from Varnish to Hadoop. 
- https://phabricator.wikimedia.org/T196558 (10Nuria) p:05Low>03Triage [19:25:17] 10Analytics, 10Product-Analytics, 10Reading List Service, 10Reading-Infrastructure-Team-Backlog, and 3 others: [EPIC] Reading List Sync service analytics - https://phabricator.wikimedia.org/T191859 (10Nuria) p:05Low>03Triage [19:25:47] (03PS5) 10Bearloga: Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) [19:28:44] (03CR) 10jerkins-bot: [V: 04-1] Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) (owner: 10Bearloga) [19:29:38] (03CR) 10Nuria: [V: 032 C: 032] Allow backup of last rotated salt for a given period in saltrotate [analytics/refinery] - 10https://gerrit.wikimedia.org/r/454631 (https://phabricator.wikimedia.org/T199900) (owner: 10Mforns) [19:30:13] 10Analytics, 10Analytics-Wikistats: Wikispecial wikis WikiStats Zeitgeist should include talk - https://phabricator.wikimedia.org/T37195 (10Nuria) 05Open>03declined [19:30:18] (03PS6) 10Bearloga: Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) [19:31:36] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: 'group' parameter in Reportupdater for automatic chgrp of generated reports - https://phabricator.wikimedia.org/T205441 (10Nuria) [19:37:44] (03CR) 10Nuria: "There is I think still a bug with 0.0% YoY" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/463126 (https://phabricator.wikimedia.org/T205565) (owner: 10Fdans) [19:43:12] (03CR) 10jerkins-bot: [V: 04-1] Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) (owner: 10Bearloga) [19:44:17] 
ottomata: are you still here? [19:44:24] ya heya [19:44:28] cool :) [19:45:12] ottomata: would you mind having a look at the wikihadoop readmes? I'd like to move forward by the end of this week if possible [19:45:32] oh my god this mistake is so silly [19:45:50] (03PS7) 10Bearloga: Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) [19:45:58] sorry for clogging up the chat log, folks [19:46:23] joal looking! [19:47:48] ottomata: I also have the XML-dump importer pending - if you can before leaving :S https://gerrit.wikimedia.org/r/c/analytics/refinery/+/456654 [19:49:07] (03CR) 10jerkins-bot: [V: 04-1] Add support for group param for chgrp-ing generated reports [analytics/reportupdater] - 10https://gerrit.wikimedia.org/r/462732 (https://phabricator.wikimedia.org/T205441) (owner: 10Bearloga) [19:52:36] (03CR) 10Milimetric: Annotate wikistats (031 comment) [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/440971 (https://phabricator.wikimedia.org/T194705) (owner: 10Milimetric) [19:52:55] mforns: "Build timed out (after 3 minutes). Marking the build as aborted." I wonder if that's because the first group is root, so maybe there are issues giving ownership of the file to that group? I'm worried about getting gid & name of whatever the second group is in case there aren't 2 groups, although I suppose I could build a check for that? thoughts? [19:53:36] bearloga, hmmm [19:54:06] bearloga, one option is to mock os.chown, but not sure how much work that will be... [19:54:34] like, mock it and just assert that it receives the right param [19:54:54] this way we can pass any group [19:55:20] (03CR) 10Ottomata: "python function docs plz! 
Otherwise +1" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/456654 (https://phabricator.wikimedia.org/T202489) (owner: 10Joal) [19:55:21] another option is leaving that feature without unit test [19:55:28] joal: readmes seem good [19:55:36] i don't have time atm to totally get into the details :/ [19:55:44] gotta run to a different location before IRC RFC meeting in an hour [19:55:59] (03CR) 10Ottomata: [C: 031] "one nit for now" (031 comment) [analytics/wikihadoop] - 10https://gerrit.wikimedia.org/r/337004 (owner: 10Joal) [19:56:03] i don't want to block you though [19:56:24] so i'll do a more thorough post review ok? [19:56:27] ottomata: I hear that - If you don't have time and if ok, we can merge as-is and we'll review-modify next week? It can also wait for a round of comments later this evening (I'll work them tomorrow) [19:56:59] mforns: I'd be wary of mocking something like os.chown (+ it's beyond my capability) and given that we don't want to assume anything about the test env beyond "there is at least one group" then I guess a unit test-less option looks like the way to go :\ [19:57:09] yes [19:57:14] yikes joal also [19:57:17] https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&panelId=46&fullscreen&orgId=1&var-cluster=druid_analytics&var-druid_datasource=All&from=now-2d&to=now [19:57:35] wow [19:57:36] bearloga, I was looking at the code, and other os.* methods are mocked there already [19:57:45] I think it wouldn't be so complicated [19:57:45] this is unexpected :( [19:58:06] bearloga, let me see if you can set up expected params with MagicMock() [19:58:30] hm - This coincides with master-swaps [19:58:43] yeah [19:58:57] ottomata: Is the hadoop config hardcoded on druid? [19:59:22] no [19:59:27] but maybe druid reads them on startup and needs a restart? [19:59:44] i'm going to bounce coordinators, ok? 
[19:59:59] ottomata: I'm confident that we've not applied any druid patch yesterday - meaning no puppet specific run, nor restart [20:00:08] yes, coords been running for 3 weeks [20:00:38] !log rolling restart of druid coordinators to hopefully pick up hadoop master config change [20:00:39] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:00:52] bearloga, I think you can do: os.chown = MagicMock() and then: os.chown.assert_called_once_with(group) [20:01:33] you can use any group for that, because the call to os.chown() won't have any effect [20:01:36] ottomata: Thanks for the rolling - I confirm new config is in files [20:02:09] ottomata: no need to do it now, but we'll also need to restart druid public [20:02:31] ottomata: Was there an alarm about the missing segments? [20:02:37] yes [20:02:42] but druid public looks ok? [20:02:42] https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&panelId=46&fullscreen&orgId=1&edit&from=now-2d&to=now&var-datasource=eqiad%20prometheus%2Fanalytics&var-cluster=druid_public&var-druid_datasource=All [20:02:49] is it that stuff is just not loaded there frequently? [20:03:03] ottomata: nothing to reload (monthly reloads only) [20:03:10] bearloga, also look at setUp and tearDown, there's some code that restores the mocked method to normal [20:03:13] k [20:03:29] bearloga, do you want to pair on that? [20:04:23] ok, gotta run, will check back on this soon [20:04:55] i wonder if other services need to be bounced too [20:04:58] wouldn't be surprised [20:05:31] ottomata: hm - Do we have other services hdfs-dependent? [20:07:42] mforns: oh neat! and yeah, would love to pair. would tomorrow work for you? 
[20:07:50] sure [20:08:58] bearloga, I start working around 12 UTC [20:14:40] ok - I think Andrew's restart of coordinators was not what should have been done - historical nodes actually pull data from hdfs - Those should have been the ones restarted [20:14:44] mwarf [20:15:01] I'll wait for Andrew to be back online [20:15:20] mforns: cool! :D [20:17:34] (03PS6) 10Joal: Code reorg and scala refacto [analytics/wikihadoop] - 10https://gerrit.wikimedia.org/r/337004 [20:19:07] ok - no right to restart druid :S [20:33:49] joal: o/ [20:33:52] you still there? [20:34:03] Hi elukey - Was waiting for Andrew :) [20:34:09] elukey: You're my savior :) [20:34:15] as usual should I say [20:34:35] elukey: I think druid-historical nodes need a bump [20:34:43] please :) [20:34:45] I saw the email and realized that I haven't checked druid after the master swap [20:34:48] sigh [20:34:55] private and public clusters? [20:35:00] didn't do it either - We were concentrated on kafka ... [20:35:09] private is a lot more urgent [20:35:19] public has 1 indexation per month, can wait for tomorrow [20:35:28] all right doing private [20:35:51] elukey: also, didn't we have alarms on missing segments? 
[20:36:41] * joal looks avidly at under-replicated-segments charts and expects a down-trend [20:36:42] yes we do, but sadly icinga doesn't join this chan since it is +r [20:36:49] (only registered users allowed) [20:36:53] ops is working on it [20:36:56] * joal cries away [20:37:04] ok makes sense [20:37:16] basically when icinga is a registered user we'll see it in here again [20:37:27] today I tried to remove +r and we got tons of spam [20:37:51] I don't understand why it happened only now though [20:37:58] elukey: druid-coordinator UI is funny - red everywhere :( [20:38:19] so just completed the historical restarts [20:38:36] right - this explains the all-red [20:39:08] historicals need to read back everything from HDFS - There'll be some data-transfer [20:39:53] !log rolling restart of all the druid historicals on Druid private/analytics [20:39:54] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:41:46] Many thanks for having shown up elukey - This saves my beginning of night :) [20:42:15] I realized only now about the page (I got the icinga alarm) [20:42:17] :( [20:42:47] graphs are looking better [20:43:08] still ~240Gb to be copied but we'll get there [20:43:09] joal: i mean other druid services [20:43:17] ? [20:43:44] right ottomata - the one that needed to be bounced was historical [20:43:49] "ottomata: hm - Do we have other services hdfs-dependent?" [20:43:50] ah! [20:43:52] did you bounce it? [20:43:53] ottomata: took me a minute to realize [20:43:56] or someone? [20:44:01] ottomata: elukey did :) [20:44:05] oh elukey always saving the day! [20:44:11] indeed [20:44:21] * joal sends wikilove to elukey :) [20:44:22] so things ok now? it's catching up? [20:44:28] gently catching up [20:44:34] great, do we still need to do public? [20:44:35] ottomata: I saw the page only now :( [20:44:44] we can do it tomorrow (me and joseph) [20:44:54] s'ok elukey you should be having fun! 
i just had to run for a minute (and have RFC and then choir rehearsal!) [20:44:57] ok [20:44:59] great [20:45:24] elukey: let's do public tomorrow (we'll need to be gentle not to break AQS) [20:45:27] ah one thing.. since this chan is now +r (only registered) to fight the spammers, icinga doesn't log [20:45:41] Riccardo and Filippo are working on it [20:46:17] joal: sure, we can do one historical at a time, waiting for it to recover all segments [20:46:25] Sounds good elukey :) [20:47:20] ottomata: I'll merge the wikihadoop tomorrow and manually release an archiva version to move - I'll be very happy if we go for tougher post-review though :) [20:47:24] joal: the coordinator ui looks better now :) [20:47:29] Yessir [20:48:09] ottomata: completely forgot to check druid after the hadoop masters swap, and I knew it needed some care [20:48:12] :( [20:49:11] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10Ottomata) > (also: I find it easier to translates feature requests to technical implementations, instead of SCRUM stori... [20:49:33] joal: +1 [20:49:44] elukey: no worries i would have done the same thing! [20:49:51] Good luck with RFC and enjoy singing ottomata ;) [20:50:44] ottomata: email sent to internal to keep archive happy [20:51:17] Maaaaan - Still another 35 emails in phab folder ... Is it that time of the year, or is it something else? [20:51:34] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, and 2 others: RFC: Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201963 (10Ottomata) [20:55:12] all righhhttt o/ [20:55:22] have a good time off ottomata!! 
[20:55:25] bye elukey :) [20:56:24] latrersrs [21:01:44] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10Joe) >>! In T201068#4619940, @Ottomata wrote: >> For things that need acknowledgement, we should have a way to check a... [21:05:36] 10Analytics, 10Analytics-Kanban, 10Readers-Web-Backlog, 10Wikimedia-Site-requests, and 2 others: Turn on MinervaErrorLogSamplingRate (Schema:WebClientError) - https://phabricator.wikimedia.org/T203814 (10Jdlrobson) [21:06:08] 10Analytics, 10Analytics-Kanban, 10Readers-Web-Backlog, 10Wikimedia-Site-requests, and 2 others: Turn on MinervaErrorLogSamplingRate (Schema:WebClientError) - https://phabricator.wikimedia.org/T203814 (10Jdlrobson) I've revised this task to be about turning this on on the beta cluster. I've also created T2... [21:06:47] joal: can you run simple queries on hive? i am getting errors all around [21:07:02] didn't try recently - spark works though [21:07:15] joal: can you try a simple one on 1005? [21:07:57] joal: but i was running these like hive -f blah.hql > out.txt as recently as sunday [21:08:14] trying one from stat1004 [21:40:59] joal: hive no work on 1005 [21:41:03] https://www.irccloud.com/pastebin/yQUFjCB5/ [21:41:17] worked for me nuria [21:41:21] on stat1005 [21:41:54] Like I ran a few queries without problem [21:42:07] seems related to tmp folder permission nuria [21:43:05] nuria: /tmp/nuria has no x permission - can't be accessed [21:43:05] joal: hive works on 1004 [21:43:21] joal: it is lying! [21:43:32] nuria: not related to hive I think, but rather to some config on stat1005 [21:43:42] joal: right, agreed [21:43:44] hive worked fine for me on both [21:43:53] ottomata: yt? 
[21:44:19] nuria: can you try on stat1005: chmod 755 /tmp/nuria please [21:44:26] and then hive [21:47:05] joal: you know me, dumb, changed permissions on hadoop /tmp/ not machine tmp [21:47:21] huhu - that happens too :) [21:55:09] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (10Nuria) >This is the only reliable way to get message acknowledgement without creating a potential tight coupling betwee... [21:58:32] nuria: working better? [21:58:50] joal: yessir thank you [21:58:54] np :) [21:59:16] \o/ ! sharing my job with you folks - I have an answer to https://phabricator.wikimedia.org/T205457!! [22:00:04] joal: nice, i want to know [22:01:04] nuria: dumb forgotten thing - pageview-dumps have been made so as to mimic old webstatcollector ones (more or less) [22:01:12] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, and 2 others: RFC: Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201963 (10kchapman) TechCom hosted an IRC meeting regarding this RFC today: * Minutes: https://tools.wmflabs.org/meetbot/wikimedia-offi... [22:01:19] joal: yes [22:02:08] nuria: And in webstatcollector world, hours are shifted by 1 - For us, 2018-09-25T23:00 is hour starting at 23, while for webstatcollector it is hour ENDING at 23 [22:03:26] so in order to find matching results, you need to load 1 day of pageviews starting at hour 1 and ending the first hour of next day [22:03:37] nuria: do you mind if I reword this task https://phabricator.wikimedia.org/T197829 ? 
I have a similar concern but maybe more broad and I was hoping to coopt that to look after both [22:04:15] chasemp: sounds great [22:04:29] tx [22:04:48] joal: i tried shifting dates a little in turnilo but got nowhere [22:05:10] nuria: there are also minor glitches when aggregating page at title level (SUM(pageviews) != SUM(projectview)) because of some special chars in pagetitles, but this is really minor [22:05:17] i gotta go to rehearsal bye allll! [22:05:41] I'll sum my findings in the task tomorrow :) [22:07:50] Ok done for tonight :) See you tomorrow team :) [22:17:40] nuria: how do you reproduce the 0% YoY? [22:18:16] fdans: on bambara wikipedia on dashboard see "edits" [22:18:38] nuria: but that's because there were exactly the same number of edits on the last two months [22:20:24] fdans: but that is YoY, right? "says 0.00% YoY" [22:21:26] ohhh sorry, I see what you mean [22:21:44] fdans: k [22:22:41] nuria: na, not a bug, same thing :) [22:23:06] 11 editors on Aug 18, 11 editors on Aug 17 :D [22:25:56] fdans: i see the 0.00 is the strange thing then cause it should say 0% [22:27:55] nuria: you mean the decimals? yeah I guess we could change the filters [22:28:26] but we should merge this patch asap because it corrects a pretty major thing [22:31:13] fdans: yes, agreed [22:32:06] (03CR) 10Nuria: "No bug on YoY just format is strange and we need to change 0.00% to be 0.0%" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/463126 (https://phabricator.wikimedia.org/T205565) (owner: 10Fdans) [22:32:08] (03CR) 10Nuria: [C: 032] Fix last 12 months aggregate [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/463126 (https://phabricator.wikimedia.org/T205565) (owner: 10Fdans) [22:32:33] fdans: did you build it on top of master ? 
[22:36:12] nuria: yup, although to test it I cherry-picked the change on top of pages to date [22:36:24] but it's ok to merge [22:45:40] (03PS3) 10Fdans: Fix last 12 months aggregate [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/463126 (https://phabricator.wikimedia.org/T205565) [22:46:01] (03CR) 10Fdans: [V: 032] Fix last 12 months aggregate [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/463126 (https://phabricator.wikimedia.org/T205565) (owner: 10Fdans) [22:46:39] nuria: ah, right, it had the pages to date as parent revision, I just rebased and merged, bed now! [23:36:49] 10Analytics, 10Analytics-Kanban, 10Readers-Web-Backlog, 10Wikimedia-Site-requests, and 2 others: Turn on MinervaErrorLogSamplingRate (Schema:WebClientError) - https://phabricator.wikimedia.org/T203814 (10Nuria) Sounds fine, note that beta cluster has no hadoop component so you will need to consume errors d...