[00:30:17] (PS1) Mforns: Add EventLoggingSanitizationMonitor.scala [analytics/refinery/source] - https://gerrit.wikimedia.org/r/478126 (https://phabricator.wikimedia.org/T202429)
[04:57:41] Analytics, Readers-Web-Backlog: % of "none" referers seems too high - https://phabricator.wikimedia.org/T195880 (Tbayer)
[05:15:03] Analytics, Readers-Web-Backlog: % of "none" referers seems too high - https://phabricator.wikimedia.org/T195880 (Tbayer) See also T211077 (TLDR: it looks like a lot of formerly "unknown" referrers on Chrome Mobile are now, since around September 13, classified as "external (search engine)")
[05:15:53] Analytics, Product-Analytics: Investigate referrer class change on Chrome Mobile from September 13, 2018 - https://phabricator.wikimedia.org/T211077 (Tbayer) See also the observations in T195880 (note: "none" != "unknown")
[06:24:54] Analytics, Product-Analytics: Investigate referrer class change on Chrome Mobile from September 13, 2018 - https://phabricator.wikimedia.org/T211077 (Nuria) It does coincide with an upgrade, right? All those chrome mobile views are coming from Android and Android 8 overtakes all other versions about when...
[06:45:36] Analytics, Analytics-Kanban, DBA, Data-Services, and 3 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Marostegui) >>! In T210693#4801008, @Milimetric wrote: > Thanks very much @Anomie, I understand my misunderstand...
[06:59:49] Analytics, DBA, Data-Services, User-Banyek, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Marostegui) @nuria do you think we should close this as it is decided we'll go for a host with the same specs and config than the rest of cl...
[07:02:30] Analytics, DBA, Data-Services, User-Banyek, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (elukey) >>! In T210749#4805215, @Marostegui wrote: > @nuria do you think we should close this as it is decided we'll go for a host with the...
[07:03:01] joal: o/
[07:03:16] for some reason I forgot an-worker1084 yesterday, another node coming up :P
[07:59:49] going afk for 10/20 mins, brb!
[08:01:54] Hi elukey - MOAR NODZ !
[08:03:07] 4.99 TB RAM :) Hello, world :)
[08:08:36] (PS1) Joal: Comment change_tag table from mediawiki load job [analytics/refinery] - https://gerrit.wikimedia.org/r/478146
[08:10:13] !log manually create /wmf/data/raw/mediawiki/tables/change_tag/snapshot=2018-11/_SUCCESS on hdfs to unlock mw-history-load and therefore mw-history-reduced
[08:10:14] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:26:21] :D
[08:29:25] Trying one last time to make the mediawiki-history-load job work - If it doesn't, let's wait until Monday to deploy the patch above --^
[08:30:57] failed - Let's wait for the patch to be merged and deployed, then restart the load job without the change_tag table
[08:31:39] Note: The mediawiki-history-reduced job has been unlocked since it only depends on wmf_raw.mediawiki_project_namespace_map table, which has been repaired
[09:25:38] * elukey is cleaning up all the old home dirs on stat/notebooks
[09:25:41] booooring
[09:30:56] * joal sends some friday-thoughts to elukey :)
[09:31:54] I hope to reach a clean state, and from now on we have the SRE team pinging us to clean up HDFS and homes when a user is not in the WMF anymore (and had access to stat/notebooks/etc..)
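The manual fix logged at [08:10:13] relies on a done-marker convention: a downstream job waits until an empty `_SUCCESS` file appears in the snapshot directory. A minimal sketch of that gating logic, assuming a local directory as a stand-in for HDFS; the helper names (`snapshot_dir`, `is_ready`, `mark_ready`, `demo`) are hypothetical and are not taken from the refinery code:

```python
import tempfile
from pathlib import Path
from typing import Tuple

SUCCESS_FLAG = "_SUCCESS"  # marker file name, as in the !log entry


def snapshot_dir(base: Path, table: str, snapshot: str) -> Path:
    """Build <base>/<table>/snapshot=<snapshot>, mirroring the path layout in the log."""
    return base / table / f"snapshot={snapshot}"


def is_ready(base: Path, table: str, snapshot: str) -> bool:
    """A downstream job treats the snapshot as complete only if the flag file exists."""
    return (snapshot_dir(base, table, snapshot) / SUCCESS_FLAG).is_file()


def mark_ready(base: Path, table: str, snapshot: str) -> Path:
    """Create an empty flag file by hand (local analogue of the manual fix)."""
    target = snapshot_dir(base, table, snapshot)
    target.mkdir(parents=True, exist_ok=True)
    flag = target / SUCCESS_FLAG
    flag.touch()
    return flag


def demo() -> Tuple[bool, bool]:
    """Return readiness before and after the flag is created, using a temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        before = is_ready(base, "change_tag", "2018-11")
        mark_ready(base, "change_tag", "2018-11")
        after = is_ready(base, "change_tag", "2018-11")
        return before, after
```

On the cluster itself the equivalent of `mark_ready` would be an HDFS command rather than a local `touch`; the log does not say which command was used.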
[09:32:27] but I can nuke only the obvious ones, for the rest I need Leila's approval :P
[09:39:59] joal: just merged a change to add AAAA IPv6 records to an-master* and stat*
[09:40:17] should be a noop but I am raising it as FYI
[09:40:45] the SRE team is looking forward to enable CI checks for DNS changes and some of our records are causing the linter/checker to fail now
[09:40:58] the other patch to merge is for all the an-workers etc..
[09:41:05] elukey: ack!
[09:44:55] * joal tries to remember what DNS stand for ... Do Not Simplify ? Don't Nuke Systems ? Diverting Noob Sillies most probably
[09:46:52] ahahha
[09:58:48] * ema lols
[10:46:15] (brb)
[11:39:00] * elukey lunch + errand!
[12:35:57] (CR) Michael Große: "I looked through the results of the properties used for references more than 10000 times." [analytics/wmde/toolkit-analyzer] - https://gerrit.wikimedia.org/r/475807 (https://phabricator.wikimedia.org/T209399) (owner: Michael Große)
[12:46:58] joal: amazing!
[13:10:38] (PS6) Michael Große: Update metric's items and properties automatically [analytics/wmde/toolkit-analyzer] - https://gerrit.wikimedia.org/r/475807 (https://phabricator.wikimedia.org/T209399)
[13:10:56] (CR) jerkins-bot: [V: -1] Update metric's items and properties automatically [analytics/wmde/toolkit-analyzer] - https://gerrit.wikimedia.org/r/475807 (https://phabricator.wikimedia.org/T209399) (owner: Michael Große)
[13:32:28] (CR) Milimetric: [C: 2] Comment change_tag table from mediawiki load job [analytics/refinery] - https://gerrit.wikimedia.org/r/478146 (owner: Joal)
[13:32:39] (CR) Milimetric: [V: 2 C: 2] Comment change_tag table from mediawiki load job [analytics/refinery] - https://gerrit.wikimedia.org/r/478146 (owner: Joal)
[13:48:49] (PS7) Michael Große: Update metric's items and properties automatically [analytics/wmde/toolkit-analyzer] - https://gerrit.wikimedia.org/r/475807 (https://phabricator.wikimedia.org/T209399)
[13:54:58] heya teaam :]
[14:31:55] Analytics, Analytics-Kanban, DBA, Data-Services, and 3 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Milimetric) >>! In T210693#4805204, @Marostegui wrote: > Whilst we still discuss if this JOIN "feature" can do w...
[14:34:29] I think this ^ is the most coherent thing I've ever written in my life (https://phabricator.wikimedia.org/T210693#4805955)
[14:43:57] Analytics, Analytics-Kanban, DBA, Data-Services, and 3 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Marostegui) Thanks for the detailed analysis. My proposal was more in the line to check that the JOIN is always...
[15:12:04] Analytics-EventLogging, Analytics-Kanban: [EventLogging Sanitization] Enable older-than-90-day purging of unsanitized EL database (event) in Hive - https://phabricator.wikimedia.org/T209503 (mforns) Hi @leila! Have you guys copied the data you need? Thanks
[15:13:02] Analytics, User-Elukey: Varnishkafka error to investigate: Required feature not supported by broker - https://phabricator.wikimedia.org/T210939 (elukey) It happened on the 30th on various cp hosts, and in most of the Jumbo brokers I can see something like the following (repeated multiple times and for d...
[15:16:33] Analytics, Analytics-Kanban, DBA, Data-Services, and 3 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Milimetric) To duplicate something like check_private_data in Hadoop, I'd guess a day to write it and a couple o...
[15:22:31] Analytics, Analytics-Kanban, DBA, Data-Services, and 3 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Marostegui) >>!
In T210693#4806084, @Milimetric wrote: > To duplicate something like check_private_data in Hadoo...
[15:23:56] Analytics, Research, WMDE-Analytics-Engineering, User-Addshore, User-Elukey: Phase out and replace analytics-store (multisource) - https://phabricator.wikimedia.org/T172410 (Banyek) @elukey the ports are mapped from 3311 to 3318 along with the section names (eg. s3 will be on 3313, s5 on 3315...
[15:24:24] we had a very brief outage on Nov 30th for Kafka Jumbo
[15:24:28] 4/5 minutes
[15:50:57] Analytics, DBA, Data-Services, User-Banyek, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Nuria) >@Nuria do you think we should close this as it is decided we'll go for a host with the same specs and config than the rest of clouds...
[15:57:37] Analytics, Dumps-Generation, ORES, Scoring-platform-team, and 3 others: Decide whether we will include raw features - https://phabricator.wikimedia.org/T211069 (awight) Offsite chat suggests that there's value in actually storing the raw "root" data sources that we build the feature tree from. L...
[15:57:43] Analytics, Readers-Web-Backlog: % of "none" referers seems too high - https://phabricator.wikimedia.org/T195880 (Nuria) FYI that comment above is tied to android 8 upgrade. I think this ticket can be closed, it should be linked on docs for future reference as it does not really have any actions. None ref...
[16:27:58] Analytics-EventLogging, Analytics-Kanban: [EventLogging Sanitization] Enable older-than-90-day purging of unsanitized EL database (event) in Hive - https://phabricator.wikimedia.org/T209503 (Miriam) Hey @mforns! Not yet, we will be done by next week. Thanks!
[16:28:21] elukey: hiya! would you be able to give hip access to hue.wikimedia.org?
his username is jdl
[16:28:26] please and thank you :)
[16:32:14] (CR) Nuria: [C: -1] Add EventLoggingSanitizationMonitor.scala (4 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/478126 (https://phabricator.wikimedia.org/T202429) (owner: Mforns)
[16:33:28] bearloga: hi! he is already in all the right groups afaics, it shouldn't be a problem
[16:35:59] elukey: Hey Luca, what account should be used for that login, my wikitech one?
[16:36:44] hip: o/ - checking, I think so since we sync from LDAP IIRC
[16:37:41] (5 min and I'll check, in the middle of a dns change :)
[16:47:54] hip: can you try to log in with 'jdl' ?
[16:48:43] works now!
[16:49:05] gooooood
[16:50:30] yay! \o/
[17:00:14] Analytics, Readers-Web-Backlog: % of "none" referers seems too high - https://phabricator.wikimedia.org/T195880 (JKatzWMF) @Nuria while bot detection certainly plays a role, I am nervous about classifying this as an issue that can be more or less fixed with better bot detection. Other sites have somethi...
[17:12:00] (PS2) Nettrom: Add EditAttemptStep schema to whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/471038 (https://phabricator.wikimedia.org/T208332)
[17:13:07] Analytics, Analytics-Kanban, Operations, netops: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster - https://phabricator.wikimedia.org/T207321 (Andrew) *bump*
[17:14:30] (CR) Nettrom: [C: -1] "> Uploaded patch set 2."
[analytics/refinery] - https://gerrit.wikimedia.org/r/471038 (https://phabricator.wikimedia.org/T208332) (owner: Nettrom)
[17:20:40] (PS3) Nettrom: Add EditAttemptStep schema to whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/471038 (https://phabricator.wikimedia.org/T208332)
[17:26:45] Analytics, Growth-Team, Product-Analytics, Patch-For-Review: Add EditAttemptStep properties to the schema whitelist - https://phabricator.wikimedia.org/T208332 (nettrom_WMF) After discussing this with @Neil_P._Quinn_WMF, we propose to remove the information that identifies what page the user was...
[17:33:09] Analytics, Analytics-Kanban: Link to User Contribution page in wikistats UI rather than user page - https://phabricator.wikimedia.org/T210422 (Nuria) a:Milimetric>fdans
[17:34:38] Analytics, Analytics-Kanban: Wikistats2 UX bug: table option should not be available in table graph selected - https://phabricator.wikimedia.org/T210424 (Nuria) a:fdans
[17:35:24] * elukey off!
[17:57:18] Analytics, Readers-Web-Backlog: % of "none" referers seems too high - https://phabricator.wikimedia.org/T195880 (Nuria) Not disagreeing in that browsers play a major factor. Per analysis above referrer "none" is caused by 1) bots + 2) older browsers (or browsers that do not understand referrer policy). W...
[17:59:13] (CR) Nuria: [V: 2 C: 2] Add EditAttemptStep schema to whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/471038 (https://phabricator.wikimedia.org/T208332) (owner: Nettrom)
[18:00:07] (CR) Nuria: [V: 2 C: 2] Comment change_tag table from mediawiki load job [analytics/refinery] - https://gerrit.wikimedia.org/r/478146 (owner: Joal)
[18:30:47] Analytics, Readers-Web-Backlog: % of "none" referers seems too high - https://phabricator.wikimedia.org/T195880 (JKatzWMF) Makes sense. Thanks for clarifying!
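The port scheme Banyek describes at [15:23:56] (ports 3311 to 3318, with s3 on 3313 and s5 on 3315) suggests that section sN listens on port 3310 + N. A small sketch under that assumption; the message is truncated in the log, so the rule for the remaining sections is inferred, and `section_port` is a hypothetical helper:

```python
BASE_PORT = 3310        # assumption: section sN listens on 3310 + N
SECTIONS = range(1, 9)  # s1..s8, matching the 3311-3318 range quoted in the log


def section_port(section: str) -> int:
    """Return the port for a section name such as 's3'."""
    if not section.startswith("s") or not section[1:].isdecimal():
        raise ValueError(f"not a section name: {section!r}")
    number = int(section[1:])
    if number not in SECTIONS:
        raise ValueError(f"section out of range: {section!r}")
    return BASE_PORT + number
```

Rejecting anything outside s1 to s8 keeps the helper from inventing ports for sections the message does not cover.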
[20:06:27] Analytics, Product-Analytics: Bug: can't make a YoY time series chart in Superset - https://phabricator.wikimedia.org/T210687 (Tbayer) For some reason @JKatzWMF was more recently able to get Superset to make a YoY chart here: https://goo.gl/Vg88uv
[20:18:14] Analytics, Product-Analytics: Bug: can't make a YoY time series chart in Superset - https://phabricator.wikimedia.org/T210687 (JKatzWMF) Yeah, I've only gotten it to work well by looking at 1 year and the previous year overlaid. I tried but have not been able to create a "periodicity pivot" chart (or...
[21:15:56] Analytics, Product-Analytics: Investigate referrer class change on Chrome Mobile from September 13, 2018 - https://phabricator.wikimedia.org/T211077 (Nuria) While android 8 plays a factor you can also see below Android 8 referrers switching from unknown to "search engine" at about Sep 13th so there is m...
[22:04:10] hey a-team! I'm digging into blocking information in the mediawiki history tables on the Data Lake. From what I can tell, blocks of IPs are not captured in that dataset, is that so?
[22:05:28] Nettrom: we grab them out of the logging table, and include them with event_user_blocks and event_user_blocks_historical
[22:06:07] but we haven't looked at those fields very closely, to see whether they capture most or a minority of what actually happens in the ipblocks table
[22:06:21] (there are no metrics based on it, so we just haven't gotten around to looking)
[22:43:47] Analytics, Product-Analytics: Bug: can't make a YoY time series chart in Superset - https://phabricator.wikimedia.org/T210687 (Nuria) Assigning to fdans, please take a look at logs and file bug with upstream , tool is deployed to analytics-tool1003.eqiad.wmnet
[22:44:09] Analytics, Analytics-Kanban, Product-Analytics: Bug: can't make a YoY time series chart in Superset - https://phabricator.wikimedia.org/T210687 (Nuria) a:fdans