[00:46:59] (CR) Milimetric: Allow breakdown filtering in top metrics (3 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463964 (https://phabricator.wikimedia.org/T205725) (owner: Fdans)
[02:37:10] Analytics, Analytics-EventLogging, Analytics-Kanban: Deprecation Information for EventLogging ResourceLoader modules - https://phabricator.wikimedia.org/T205744 (Krinkle)
[03:51:40] Analytics, Analytics-EventLogging, Analytics-Kanban: Deprecation Information for EventLogging ResourceLoader modules - https://phabricator.wikimedia.org/T205744 (Krinkle) @Milimetric Very nice :) I made some minor edits to link to debug mode, added a mention of `mw.track()`, and phrased the recommen...
[05:23:59] Analytics, Pageviews-API: No results for Special:BlankPage or Special:BlankPage/RTRC - https://phabricator.wikimedia.org/T151363 (Krinkle) >>! In T151363#2827483, @Milimetric wrote: > Currently BlankPage is one of the special pages we don't count as pageviews explicitly: > > https://github.com/wikimedia...
[07:06:58] Hi team
[07:07:32] for when you're back mforns: You have been considering turnilo == druid - It is not
[07:08:11] mforns: If you tunnel to druid-coordinator and check for datasources, event_ReadingDepth is gone (datasource has been deleted in druid)
[07:08:38] mforns: However, it is indeed still present in turnilo, which keeps datasources even if not present anymore in druid
[07:09:13] mforns: Solution is to restart turnilo to drop exisiting self-created datasource-configs
[07:11:09] elukey: Could you please restart turnilo to confirm my understanding of the datasource issue?
[07:11:15] elukey: o/ !
[07:12:43] done :)
[07:13:10] confirmed: datasource gone :) Thanks elukey :)
[07:13:52] :)
[07:14:09] I am checking now if it is possible to add analytics-admins to the analytics-tool hosts
[07:14:16] elukey: how is going with an-coord01?
[07:14:32] I need to fix lvm partitions but now it works :)
[07:14:38] \o/ !
[07:14:51] as FYI this is happening this evening https://phabricator.wikimedia.org/T201039#4638675
[07:15:11] there is a potential 30m downtime for multiple hadoop hosts, all in the same switch/row
[07:15:47] but best case scenario is a couple of seconds network blip
[07:15:47] k elukey - I assume we have cover for critical services (masters and journal)?
[07:16:06] what do you mean?
[07:16:07] I like the best-case, the worst less, but still very acceptable :)
[07:17:12] I learned about maintenance yesterday while seeing Arzhel mentioning it on the chan :(
[07:17:12] elukey: The hadoop-masters and journal-nodes have been spanned over multiple rows, making them resilient to this failure scenario right?
[07:17:15] just confirming
[07:17:40] on an-coord1002 will be impacted, and I think one journal node
[07:17:50] yep yep they are spread among multiple rows
[07:19:17] elukey: have you seen cloudera-hortonworks merge?
[07:20:27] saw it now, I am now even more convinced that BigTop is the best way forward :D
[07:20:35] that big monster scares me
[07:21:11] indeed - Plus we never know which release they'll keep :)
[07:21:37] * joal tries not to think of the big monsters in the closet
[07:31:29] Moving piwik to matomo1001 in a bit
[08:00:18] (PS5) Joal: Add MediawikiXMLDumpsConverter spark job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/463370 (https://phabricator.wikimedia.org/T202490)
[08:02:51] (CR) Joal: "Another bunch of changes" (4 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/463370 (https://phabricator.wikimedia.org/T202490) (owner: Joal)
[08:10:34] (CR) Joal: [V: 2 C: 2] "Merging" [analytics/refinery] - https://gerrit.wikimedia.org/r/459780 (https://phabricator.wikimedia.org/T202489) (owner: Joal)
[08:11:01] aaaand piwik on matomo1001 \o/
[08:11:11] Yay !
[08:39:00] heya teaam :]
[08:39:30] Hi mforns - turnilo dataset gone (see IRC backlog)
[08:41:08] joal, reading
[08:52:35] makes sense, thanks joal and elukey :]
[09:11:04] (PS2) Mforns: Correct and improve EventLoggingToDruid time measure bucketing [analytics/refinery/source] - https://gerrit.wikimedia.org/r/464171 (https://phabricator.wikimedia.org/T205641)
[10:58:00] * elukey lunch!
[12:04:24] (PS4) Fdans: Allow breakdown filtering in top metrics [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463964 (https://phabricator.wikimedia.org/T205725)
[12:04:29] (CR) Fdans: "Sorry for the classic Dan-Fran tug of war CR " (3 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463964 (https://phabricator.wikimedia.org/T205725) (owner: Fdans)
[13:19:41] Analytics, Operations, hardware-requests: Allow Analytics team members to restart Turnilo and Superset - https://phabricator.wikimedia.org/T206217 (elukey) p:Triage>Normal
[13:20:12] Analytics, Operations, SRE-Access-Requests: Allow Analytics team members to restart Turnilo and Superset - https://phabricator.wikimedia.org/T206217 (elukey)
[13:20:29] ottomata: o/ ---^
[13:20:38] yoohoo
[13:20:58] analytics-admins is fine no?
[13:21:00] that's what this is for!
[13:21:21] Analytics, Operations, SRE-Access-Requests: Allow Analytics team members to restart Turnilo and Superset - https://phabricator.wikimedia.org/T206217 (Ottomata) I think `analytics-admins` is the right group let's keep using it!
[13:21:49] ottomata: sure I am ok with it, but it kinda deploys a ton of sudoers rules that are not needed on analytics-tool*
[13:22:01] it makes sense on hadoop nodes
[13:22:07] but imho not elsewhere
[13:22:24] (CR) Ottomata: Add MediawikiXMLDumpsConverter spark job (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/463370 (https://phabricator.wikimedia.org/T202490) (owner: Joal)
[13:22:49] elukey: sudoer rules?
[13:22:51] oh
[13:22:53] hm
[13:23:08] that is a little funny. man we do users weird here
[13:23:22] kinda weird that the sudoer rules go with the group rather than with the group+role/node
[13:23:32] i dunno, i think its ok though
[13:23:37] i think that's better than creating a new group
[13:23:42] why? :D
[13:23:58] because then that's more groups to deal with in admin.yaml, and we really want all the same people in that group
[13:24:33] it really is analytics-admins, there's no reason to have a ui-admins other than to differentiate sudo rules
[13:24:43] we want the analytics-admins to be able to restart services
[13:26:29] sure but we don't have granularity
[13:26:42] I mean, I am ok with whatever, but we always strive for clarity
[13:26:46] and this seems the opposite
[13:28:02] i think it only seems that way because of a puppet admin module limitation
[13:28:12] there is no granularity here from the user group level
[13:28:19] if it weren't for extra uneeded sudo rules
[13:28:28] we'd allow analytics-admins to do this
[13:28:40] we'd only create a ui-admins group if we had people that we wanted to be in ui-admins but not in admins
[13:30:13] I still think that analytics-admins, in the context of the admin module, is overloaded. And if you see the members in it you can understand why
[13:30:28] ?
[13:30:50] we have one user (that is trusted of course) ended up in there because he needed special access, and turned out to be able to restart a ton of things
[13:30:54] oh
[13:30:56] rather than min privilege
[13:30:58] i didn't know he was in there
[13:31:03] that is weird i agree
[13:31:05] what was that for?
[13:31:33] There was a task to do some data analytics requiring more privileges
[13:31:36] i added it!
[13:31:40] looking
[13:32:13] https://phabricator.wikimedia.org/T178802
[13:32:16] this is of course an example, we completely trust the "extra" user, I was just going to reason out loud about pros/cons
[13:32:17] don't remember what this is about at all
[13:32:20] let's ask fi we can remove him
[13:32:39] Analytics-Kanban, Operations, SRE-Access-Requests, Patch-For-Review: Add Tilman to analytics-admins - https://phabricator.wikimedia.org/T178802 (Ottomata) @HaeB do you still need this? Can we roll this back?
[13:32:39] I did it a couple of months ago and we decided to postpone, might be a good time now
[13:32:41] no, in this case you are right about overloaded, i think its more than an example
[13:32:56] but the ideal is that the analytics-admins have priviledges to administer analytics cluster things
[13:32:59] including ui tools
[13:46:06] ottomata: change for comment has been made, I think CR-history doesn't move forward with patches - You can have a look at patch 5: https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/463370/5/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/mediawikihistory/MediawikiXMLDumpsConverter.scala
[13:58:30] (PS8) Joal: Add python script importing xml dumps onto hdfs [analytics/refinery] - https://gerrit.wikimedia.org/r/456654 (https://phabricator.wikimedia.org/T202489)
[13:59:01] (CR) Joal: "Version not yet tested, but with more comments :)" [analytics/refinery] - https://gerrit.wikimedia.org/r/456654 (https://phabricator.wikimedia.org/T202489) (owner: Joal)
[14:07:10] oh weird joal, i thought i compared that
[14:07:11] ok cool!
[14:07:13] let's do it!
[14:07:35] (CR) Ottomata: [C: 1] Add MediawikiXMLDumpsConverter spark job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/463370 (https://phabricator.wikimedia.org/T202490) (owner: Joal)
[14:07:45] elukey: +1 https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/464563/ ?
[14:08:22] ottomata: yep seems good!
[14:09:13] ottomata: on qs - is accept sent only once?
[14:10:24] RFC seems to suggest so
[14:10:49] (PS1) Ottomata: Add accept header to webrequest raw and refined tables [analytics/refinery] - https://gerrit.wikimedia.org/r/464574 (https://phabricator.wikimedia.org/T170606)
[14:11:07] elukey: like multiple settings? I don't think so, usually it is comma separated or something?
[14:11:15] or semicolon?
[14:11:39] yeah comma
[14:11:52] yeah https://tools.ietf.org/html/rfc7231#page-38
[14:12:10] okok good
[14:12:12] (CR) Ottomata: "Needs https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/464563/ deployed first" [analytics/refinery] - https://gerrit.wikimedia.org/r/464574 (https://phabricator.wikimedia.org/T170606) (owner: Ottomata)
[14:14:25] (CR) Milimetric: Allow breakdown filtering in top metrics (2 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463964 (https://phabricator.wikimedia.org/T205725) (owner: Fdans)
[14:15:18] fdans: dude, I love our CRs. We're so different it's great
[14:41:36] PROBLEM - Number of segments reported as unavailable by the Druid Coordinators of the Analytics cluster on einsteinium is CRITICAL: 1536 gt 200 https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&panelId=46&fullscreen&orgId=1&var-cluster=druid_analytics&var-druid_datasource=All
[14:42:50] ReadingDepth segments going way up
[14:43:11] mforns: ---^
[14:43:17] are you loading segments?
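On the Accept header exchange at 14:09-14:12: the comma-separated form discussed there can be illustrated with a small sketch. This is not the code from the webrequest change; `parse_accept` is a hypothetical helper, and it assumes only what RFC 7231 describes (media ranges separated by commas, optional `;q=` weights) plus the general HTTP rule that repeated header fields can be joined with commas. Other media-type parameters are ignored here for brevity.

```python
from typing import List, Tuple


def parse_accept(values: List[str]) -> List[Tuple[str, float]]:
    """Parse one or more raw Accept header values into (media_range, q) pairs.

    Hypothetical helper for illustration only. Repeated header fields are
    joined with commas first, then each media range is split off and its
    optional q weight is read (defaulting to 1.0).
    """
    combined = ", ".join(values)
    ranges = []
    for part in combined.split(","):
        part = part.strip()
        if not part:
            continue
        pieces = [p.strip() for p in part.split(";")]
        media_range = pieces[0].lower()
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        ranges.append((media_range, q))
    # sorted() is stable, so equal weights keep their original order
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)
```

Called with `["text/html, application/json;q=0.9", "*/*;q=0.1"]`, the sketch treats the two field values as one comma-separated list, which is why a single string column is enough to hold the header in the logs.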
[14:43:37] RECOVERY - Number of segments reported as unavailable by the Druid Coordinators of the Analytics cluster on einsteinium is OK: (C)200 gt (W)180 gt 0 https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&panelId=46&fullscreen&orgId=1&var-cluster=druid_analytics&var-druid_datasource=All
[14:52:06] (PS1) Fdans: Adds logic and configuration for project families [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T188550)
[14:55:40] Analytics, Analytics-Kanban, Operations, Traffic, and 2 others: Add Accept header to webrequest logs - https://phabricator.wikimedia.org/T170606 (Nuria) Let's please update docs: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest
[14:55:42] fdans / mforns: I made this hotfix https://gerrit.wikimedia.org/r/#/c/analytics/wikistats2/+/464428/ but the issue isn't too terrible, and I'm thinking of making a better change to incorporate adjustedGraphData back into graphModel
[14:55:53] I started doing that and decided it was kind of tricky, so wanted to chat first
[14:56:03] let me know if you're interested, if not I'll just go ahead and do it
[14:56:13] (but objections are ok too)
[15:01:58] Analytics, Operations, SRE-Access-Requests: Allow Analytics team members to restart Turnilo and Superset - https://phabricator.wikimedia.org/T206217 (Nuria) +1 to analytics-admins
[15:04:45] (CR) Nuria: [C: 1] "Looks good, let's merge once we have tested and loaded it." (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/464171 (https://phabricator.wikimedia.org/T205641) (owner: Mforns)
[15:05:18] elukey, yes, loading...
[15:13:10] (CR) Nuria: [C: -1] "We really need to look at what metrics do we have available for all families before merging these. The dashboard will look strange as ther" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T188550) (owner: Fdans)
[15:18:34] (CR) Nuria: [C: -1] "Current code is getting top editors for "July": https://wikimedia.org/api/rest_v1/metrics/editors/top-by-edits/ar.wikipedia.org/all-editor" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464375 (https://phabricator.wikimedia.org/T205915) (owner: Fdans)
[15:19:23] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Top metrics. Implement failsafe mechanism for when current month computations are not available - https://phabricator.wikimedia.org/T205915 (Nuria) See bug regarding top editors, data displays for July rather than August: {F26307440}
[15:22:03] nuria: this is once again a timezone thing
[15:22:16] fdans: right, it is.
[15:22:23] changing to use the utc functions
[15:26:24] (CR) Mforns: "Hm, if we did the AQS endpoint with meta information about available intervals, we would solve those problems. We wouldn't need adjustedGr" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464428 (https://phabricator.wikimedia.org/T206171) (owner: Milimetric)
[15:27:20] mforns: yeah, but we decided not to do that for a while, right?
[15:27:28] milimetric, yea...
[15:27:28] like, for now we're just going to do the guess and check thing for tops
[15:27:37] (CR) Nuria: [C: -1] "Some UX suggestions:" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T188550) (owner: Fdans)
[15:28:48] milimetric: right, we are going to go with "convention rather than configuration" idea
[15:29:34] nuria: just for now, but long term we're going to implement that second endpoint that tells you the time range, I made a task for it
[15:29:37] ok, will +2
[15:29:41] (CR) Mforns: [C: 2] Filter out annotations based on adjustedGraphData [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464428 (https://phabricator.wikimedia.org/T206171) (owner: Milimetric)
[15:29:42] mforns: +2 what?
[15:29:45] nono
[15:29:50] I don't like it
[15:30:01] I'm going to fix adjusted properly, this is too ugly and the bug isn't super bad
[15:30:46] (CR) Nuria: [C: -1] Filter out annotations based on adjustedGraphData (1 comment) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464428 (https://phabricator.wikimedia.org/T206171) (owner: Milimetric)
[15:31:18] (CR) Nuria: "Sorry, wrong gerrit chageset for my last comment!" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464428 (https://phabricator.wikimedia.org/T206171) (owner: Milimetric)
[15:31:42] :) we crossed the streams aaaaaah
[15:32:17] (CR) Nuria: [C: -1] Adds logic and configuration for project families (1 comment) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T188550) (owner: Fdans)
[15:42:12] (Merged) jenkins-bot: Filter out annotations based on adjustedGraphData [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464428 (https://phabricator.wikimedia.org/T206171) (owner: Milimetric)
[15:42:24] aaaaah, no!
[15:42:29] everything is going wrong :)
[15:42:45] (PS1) Milimetric: Revert "Filter out annotations based on adjustedGraphData" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464593
[15:42:57] (CR) Milimetric: [V: 2 C: 2] Revert "Filter out annotations based on adjustedGraphData" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464593 (owner: Milimetric)
[15:45:47] milimetric, sorry for not catching problems in review... :/
[15:46:08] oh, no problem!
[15:46:26] it's a tricky thing because it works fine but it sets up consumers of data downstream for failure
[15:46:43] hm
[15:46:49] it's forking data basically
[15:47:09] fdans: please sign up for presenting here if you are interested: https://office.wikimedia.org/wiki/Technology/5_Minute_Demo
[15:48:03] nuria: donew
[15:48:11] fdans: k grasias
[15:49:11] nuria: for the timezones stuff I'm avoiding the use of Date since it's a very simple subtraction
[15:49:18] (PS3) Mforns: Correct and improve EventLoggingToDruid time measure bucketing [analytics/refinery/source] - https://gerrit.wikimedia.org/r/464171 (https://phabricator.wikimedia.org/T205641)
[15:49:42] fdans: i haven't looked at code in detail but i will today
[15:50:08] (PS2) Fdans: Allow several attempts to get latest data in top metrics [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464375 (https://phabricator.wikimedia.org/T205915)
[15:53:13] (PS1) Elukey: Add an-coord1001 to the list of targets [analytics/refinery/scap] - https://gerrit.wikimedia.org/r/464596 (https://phabricator.wikimedia.org/T205509)
[15:53:33] (CR) Elukey: [V: 2 C: 2] Add an-coord1001 to the list of targets [analytics/refinery/scap] - https://gerrit.wikimedia.org/r/464596 (https://phabricator.wikimedia.org/T205509) (owner: Elukey)
[15:58:44] fdans: but not using date makes harder to go back when you are at the border of the year 2018/01
[15:58:56] fdans: to 2017/12
[15:59:18] nuria: maaaaah not that harder :) there's actually less lines of code
[16:00:20] not true, same number of lines
[16:06:15] (PS1) Milimetric: Remove adjustedGraphData and update children [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464599 (https://phabricator.wikimedia.org/T206171)
[16:32:45] (CR) Joal: [C: -1] "Comment inline ..." (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/464574 (https://phabricator.wikimedia.org/T170606) (owner: Ottomata)
[16:35:22] i'm trying to join the lab meeting mut google says someone needs to let me in
[16:35:33] sub(mut,but)
[16:40:14] (PS2) Ottomata: Add accept header to webrequest raw and refined tables [analytics/refinery] - https://gerrit.wikimedia.org/r/464574 (https://phabricator.wikimedia.org/T170606)
[16:50:35] (CR) Ottomata: [C: 1] "One nit but +1!" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/456654 (https://phabricator.wikimedia.org/T202489) (owner: Joal)
[16:50:41] ping fdans
[16:53:00] Analytics: Research: add participant list to some of AQS edit api operations - https://phabricator.wikimedia.org/T206137 (Ottomata) p:Triage>Normal
[16:54:17] Analytics: Research: add participant list to some of AQS edit api operations - https://phabricator.wikimedia.org/T206137 (Ottomata) We need cohort definition somehow?
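The month arithmetic debated between 15:49 and 16:00 (stepping back from 2018/01 to 2017/12 without a date object), combined with the "several attempts" idea from the top-metrics change, might look roughly like the sketch below. It is written in Python purely for illustration; the actual wikistats2 patch is not shown in this log, and `previous_month` and `months_to_try` are hypothetical names.

```python
from typing import List, Tuple


def previous_month(year: int, month: int) -> Tuple[int, int]:
    """Return the (year, month) preceding the given one, using plain integers.

    Working on integers avoids any local-timezone conversion, which is the
    concern raised in the chat. January rolls back to December of the
    previous year.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if month == 1:
        return year - 1, 12
    return year, month - 1


def months_to_try(year: int, month: int, attempts: int = 3) -> List[Tuple[int, int]]:
    """List candidate months, newest first, starting at (year, month).

    Hypothetical helper: a caller would request data for each candidate in
    turn and stop at the first month that actually returns results.
    """
    candidates = []
    for _ in range(attempts):
        candidates.append((year, month))
        year, month = previous_month(year, month)
    return candidates
```

For a start of January 2018 and three attempts, the candidates cross the year border as nuria describes: 2018/01, then 2017/12, then 2017/11.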
[16:54:32] Analytics: Research: add participant list to some of AQS edit api operations - https://phabricator.wikimedia.org/T206137 (Ottomata) p:Normal>Low
[16:56:37] Analytics, Analytics-Kanban: /var/log/refinery/sqoop-mediawiki-private.log does not rotate - https://phabricator.wikimedia.org/T206020 (Ottomata)
[16:56:56] Analytics, Analytics-Kanban: /var/log/refinery/sqoop-mediawiki-private.log does not rotate - https://phabricator.wikimedia.org/T206020 (Ottomata) p:Triage>Normal
[16:57:06] Analytics, Analytics-Kanban: /var/log/refinery/sqoop-mediawiki-private.log does not rotate - https://phabricator.wikimedia.org/T206020 (Ottomata) p:Normal>High
[16:58:33] Analytics: Return "available time range" custom header with AQS responses - https://phabricator.wikimedia.org/T205949 (Ottomata) p:Triage>Low
[16:59:05] Analytics: Return "available time range" custom header with AQS responses - https://phabricator.wikimedia.org/T205949 (Milimetric) p:Low>Normal
[17:00:21] Analytics, Analytics-EventLogging, Wikipedia-iOS-App-Backlog: MobileWikiAppiOSSearch validation errors adding noise to EventLogging error - https://phabricator.wikimedia.org/T205910 (Ottomata) a:JMinor Josh, assigning this to you to find the right person to work on it. Too many validation errors...
[17:05:12] Analytics, Analytics-Kanban: /var/log/refinery/sqoop-mediawiki-private.log does not rotate - https://phabricator.wikimedia.org/T206020 (elukey) We do have a logrotate config on an1003: ``` elukey@analytics1003:/etc/logrotate.d$ cat refinery # Note: This file is managed by Puppet. # /var/log/refinery/*....
[17:05:24] Analytics, Analytics-Cluster, Analytics-Kanban, Contributors-Analysis, Product-Analytics: Add change tag tables to monthly mediawiki_history sqoop - https://phabricator.wikimedia.org/T205940 (Nuria) a:Milimetric>fdans
[17:07:04] I'm looking to work with the serversideaccountcreation eventlogging data in the Data Lake, but it only goes back to late Nov 2017. Is it possible to import older data, or should I grab that from MySQL and import it into my own database?
[17:09:36] Analytics, Analytics-Cluster, Analytics-Kanban, Contributors-Analysis, Product-Analytics: Attempting to select all columns of mediawiki_history sometimes fails with a cryptic error message - https://phabricator.wikimedia.org/T205367 (Ottomata) p:Normal>High
[17:09:38] Analytics, Analytics-Cluster, Analytics-Kanban, Contributors-Analysis, Product-Analytics: Attempting to select all columns of mediawiki_history sometimes fails with a cryptic error message - https://phabricator.wikimedia.org/T205367 (Ottomata)
[17:20:40] !log bounce druid-brokers on druid100[4-6] after network maintenance
[17:20:41] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:41:22] nuria: https://phabricator.wikimedia.org/T206020#4642888 :)
[17:42:17] elukey: nice thank you, onbiously me ninja when looking at puppet
[17:42:32] I was puzzled as well, that log dir looked weird
[17:42:43] not sure why we logrotate every 100MB
[17:42:58] anyhow, if you want to take a stab in fixing/improving it send a code review :)
[17:45:09] Nettrom: you "importing" mysql eventlogging data for serversideaccountcreation , right? Ya, for older data not in hadoop you can sqoop table from MySQL, we have done that for several other tables: https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging#Hadoop._Archived_Data
[17:45:49] Nettrom: you can sqoop this one and it can be deleted from mysql when done, sqooping will give you atable in the format of MySQL column wise which is different than hadoop's format
[17:46:10] Nettrom: chelsyx has dealt with sqoop before so she can also assist if this does not sound familiar
[17:47:01] * elukey off!
[17:47:55] nuria: thanks! after thinking about it, this goes into the "nice to have" bucket, so I'll use the data that's available for now and put "get more data" into the "possible future work" column
[17:49:16] Nettrom: ok, so you know scooping is quite easy , just time consuming cause it might take hours to consume a large table
[17:49:44] Nettrom: see https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging#Hadoop._Archived_Data
[17:50:48] nuria: yes, I used sqoop for some MySQL <-> Data Lake transfers when I needed pageview data 18 months ago, it's a neat tool :)
[17:51:10] will think about it and see what I do, tight schedule this time around, unfortunately
[17:57:23] nuria, you think I can include this in refinery deploy now? https://gerrit.wikimedia.org/r/#/c/analytics/refinery/source/+/464171/
[17:57:55] I just added the fix for exit status (harmless) errors
[17:58:09] mforns: yes, cause we just loaded all data with that right?
[17:58:14] (CR) Nuria: [C: 2] Correct and improve EventLoggingToDruid time measure bucketing [analytics/refinery/source] - https://gerrit.wikimedia.org/r/464171 (https://phabricator.wikimedia.org/T205641) (owner: Mforns)
[17:58:14] yes
[17:58:20] thanks!
[17:58:27] (CR) Nuria: [V: 2 C: 2] Correct and improve EventLoggingToDruid time measure bucketing [analytics/refinery/source] - https://gerrit.wikimedia.org/r/464171 (https://phabricator.wikimedia.org/T205641) (owner: Mforns)
[17:58:51] mforns: ya, that way we can include cron for navigation timing
[17:58:58] yea!
[18:00:00] joal, do you want this merged before I deploy now?
[18:00:01] https://gerrit.wikimedia.org/r/#/c/analytics/refinery/source/+/463370/
[18:03:29] mforns: https://phabricator.wikimedia.org/T166414
[18:03:51] cool
[18:37:08] bearloga: do we have the sitemap report on commons now?
[19:18:30] (PS1) Mforns: Update changelog.md for v0.0.76 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/464630
[19:19:20] (CR) Mforns: [V: 2 C: 2] Update changelog.md for v0.0.76 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/464630 (owner: Mforns)
[19:22:26] !log Started deployment of refinery-source
[19:22:28] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:24:42] (CR) Ppchelko: Add accept header to webrequest raw and refined tables (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/464574 (https://phabricator.wikimedia.org/T170606) (owner: Ottomata)
[19:34:08] (CR) Ottomata: "Nono it doesn't. We just keep the schema up to date in these files for testing purposes. We will alter the table to add the field." (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/464574 (https://phabricator.wikimedia.org/T170606) (owner: Ottomata)
[19:50:11] !log Finished deployment of refinery-source
[19:50:13] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:52:55] !log Started deployment of refinery
[19:52:57] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:57:58] milimetric, I'm deploying refinery changes, should I restart geoeditors oozie job when I'm done?
[20:12:25] ottomata, I'm deploying refinery using scap and one of the 5 targets failed
[20:12:47] by the logs, I'd say it's an-coord1001
[20:18:19] hmm oh that's a new one
[20:18:21] lemme check
[20:18:41] i saw that luca added that today,...
[20:19:39] hmm id unno why he'd merge that in scap but not have it on puppet on the ndoe
[20:19:41] let's revert that
[20:20:36] (PS1) Ottomata: Revert "Add an-coord1001 to the list of targets" [analytics/refinery/scap] - https://gerrit.wikimedia.org/r/464660
[20:20:51] (CR) Ottomata: [V: 2 C: 2] Revert "Add an-coord1001 to the list of targets" [analytics/refinery/scap] - https://gerrit.wikimedia.org/r/464660 (owner: Ottomata)
[20:20:58] mforns: git pull in refinery/scap
[20:21:02] or whevere that is
[20:21:14] yeah
[20:21:16] in there
[20:21:17] then try again
[20:21:28] ottomata, thanks!
[20:21:58] ok, it says: 1 file changed, 1 deletion(-)
[20:23:24] ottomata, OK it worked!
[20:23:25] ok
[20:23:27] cool
[20:23:29] thannnnnks!
[20:23:32] yupppers
[20:33:51] !log Finished deployment of refinery
[20:33:53] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:48:10] nice mforns thnaks for deploying
[20:48:21] np!
[20:49:10] for https://turnilo.wikimedia.org/#event_PageIssues let's also add ua fields to schema on crontab
[20:49:25] does that sound ok mforns ?
[20:49:57] yea. makes sense, will reload data, it's short (starts sept 19)
[20:50:58] mforns: k thank you
[20:51:57] joal: do you have a moment for a question about: https://gerrit.wikimedia.org/r/#/c/464574/1/hive/webrequest/create_webrequest_table.hql@44
[20:55:07] mforns: was there a change in the geoeditors? Sorry I forgot
[20:55:40] ah, yes, the SLA one
[20:55:55] yeah, you could restart the job... but... lemme make sure it's not running now
[20:56:01] ok
[20:56:33] mforns: yeah, you can restart it
[20:56:46] ok, thx!
[21:03:02] milimetric, there are 3 coordinators for geoeditors: druid, monthly and load
[21:03:16] milimetric, which ones should I restart?
[21:08:10] mforns: I think only load (my proof is just that that's the only one that changed: https://github.com/wikimedia/analytics-refinery/commit/b2d2b722d523a8a240ad751a9d5523b57a0072c4)
[21:08:20] Analytics, Wikipedia-iOS-App-Backlog, iOS-app-Bugs, Services (watching): iOS app is hitting rate limit on AQS endpoints - https://phabricator.wikimedia.org/T206263 (Pchelolo)
[21:08:31] ok, I was searching for that, thanks!
[21:14:46] Analytics, Analytics-EventLogging, Analytics-Kanban: Deprecation Information for EventLogging ResourceLoader modules - https://phabricator.wikimedia.org/T205744 (Milimetric) right, makes sense, I think I understood all this but used all the wrong words. I'll be more careful.
[21:20:59] ottomata: last time we met we moved the Stream Intake RFC back to "In Discussion" but you/I can move it back to Request IRC meeting anytime, just lemme know if you think enough time has passed since your last comment
[21:21:27] After reading that and considering it more, I'm with you on the service-template-node option
[21:21:51] and I think typescript would be awesome but that's just a bonus
[21:21:55] Analytics-Kanban, Analytics-Wikistats: Wikistats 2.0 Remaining reports. - https://phabricator.wikimedia.org/T186121 (Nuria)
[21:21:58] Analytics-Kanban, Analytics-Wikistats: Per family metrics in wikistats UI (wikipedia, wikinews) - https://phabricator.wikimedia.org/T205663 (Nuria) Invalid>Open
[21:22:09] Analytics-Kanban, Analytics-Wikistats: Per family metrics in wikistats UI (wikipedia, wikinews) - https://phabricator.wikimedia.org/T205663 (Nuria)
[21:22:16] Analytics, Analytics-Wikistats, Patch-For-Review: Wikistats 2.0: allow to view stats for all language versions (a.k.a. Project families) - https://phabricator.wikimedia.org/T188550 (Nuria)
[21:27:27] Analytics, Analytics-Wikimetrics, Patch-For-Review: Wikimetrics docker build/test environment is broken - https://phabricator.wikimedia.org/T193780 (Milimetric) no worries about the flake stuff, @sbassett, I'll fix it
[21:29:51] milimetric, restarted the job :]
[21:33:24] thanks!
[21:38:30] Analytics, Research: Create labeled dataset for bot identification - https://phabricator.wikimedia.org/T206267 (Nuria) p:Triage>Normal
[21:39:06] Analytics, Wikipedia-iOS-App-Backlog, iOS-app-Bugs, Services (watching): iOS app is hitting rate limit on AQS endpoints - https://phabricator.wikimedia.org/T206263 (Nuria) This is a reported ip for bots/spam: https://www.proxydocker.com/en/proxy/79.161.248.1 UA is probably just a fake one and it...
[21:39:28] Analytics, Research: [Open question] Improve bot identification at scale - https://phabricator.wikimedia.org/T138207 (Nuria)
[21:39:31] Analytics, Wikipedia-iOS-App-Backlog, iOS-app-Bugs, Services (watching): iOS app is hitting rate limit on AQS endpoints - https://phabricator.wikimedia.org/T206263 (Nuria)
[21:41:31] Analytics, Wikipedia-iOS-App-Backlog, iOS-app-Bugs, Services (watching): iOS app is hitting rate limit on AQS endpoints - https://phabricator.wikimedia.org/T206263 (Pchelolo) Thanks @Nuria! I think it's safe to say we can close this as "Invalid"
[21:42:21] Analytics, Wikipedia-iOS-App-Backlog, iOS-app-Bugs, Services (watching): iOS app is hitting rate limit on AQS endpoints - https://phabricator.wikimedia.org/T206263 (Nuria) you can live it open, it is just a good data point for pateerns of bots so we can calculate features on those for classifier
[21:46:49] (PS1) Milimetric: Clean up flake8 style errors [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/464724
[21:47:36] Analytics, Analytics-Wikimetrics, Patch-For-Review: Wikimetrics docker build/test environment is broken - https://phabricator.wikimedia.org/T193780 (sbassett) Sounds good, thanks.
[21:50:19] (CR) Nuria: [V: 2 C: 2] Clean up flake8 style errors [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/464724 (owner: Milimetric)
[21:51:18] (Merged) jenkins-bot: Clean up flake8 style errors [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/464724 (owner: Milimetric)
[21:53:01] wikimedia/analytics-wikimetrics#2 (master - 3fb8810 : Dan Andreescu): The build has errored.
[21:53:02] Change view : https://github.com/wikimedia/analytics-wikimetrics/compare/ee6c5a08fab3...3fb8810a6a5a
[21:53:02] Build details : https://travis-ci.com/wikimedia/analytics-wikimetrics/builds/86946909
[21:59:23] Analytics, Services: Evaluate using TypeScript on node projects - https://phabricator.wikimedia.org/T206268 (Milimetric)
[21:59:56] Analytics, Services (watching): Evaluate using TypeScript on node projects - https://phabricator.wikimedia.org/T206268 (Pchelolo)
[22:01:02] Analytics, Services (watching): Consider converting AQS to TypeScript - https://phabricator.wikimedia.org/T206269 (Milimetric) p:Triage>Low
[22:19:42] Analytics, Analytics-Kanban, Wikimedia-Site-requests, MobileFrontend (MobileFrontend.js), and 2 others: Turn on MinervaErrorLogSamplingRate (Schema:WebClientError) - https://phabricator.wikimedia.org/T203814 (Jdlrobson)
[22:50:54] Analytics, Analytics-Kanban: Logrotate of refinery rotating on size rather than time - https://phabricator.wikimedia.org/T206020 (Nuria)
[22:58:36] Analytics, Analytics-Kanban, Patch-For-Review: Logrotate of refinery rotating on size rather than time - https://phabricator.wikimedia.org/T206020 (Nuria) @elukey : I think (please let me know if you disagree) that a logrotate on time is more intuitive than size so we have a clear point in time when...
[22:58:52] Analytics, Analytics-Kanban, Patch-For-Review: Logrotate of refinery rotating on size rather than time - https://phabricator.wikimedia.org/T206020 (Nuria)
[22:59:48] Analytics-Kanban: Drop old mediawiki_history_reduced snapshots - https://phabricator.wikimedia.org/T197888 (Nuria) a:fdans
[23:38:18] (CR) Nuria: [C: 1] "Just one nit but otherwise working fine!" (1 comment) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/464375 (https://phabricator.wikimedia.org/T205915) (owner: Fdans)
[23:40:54] Analytics, Analytics-Cluster, Analytics-Kanban, Contributors-Analysis, Product-Analytics: Attempting to select all columns of mediawiki_history sometimes fails with a cryptic error message - https://phabricator.wikimedia.org/T205367 (Neil_P._Quinn_WMF) @Ottomata, from my perspective, we can c...