[01:02:55] PROBLEM - Check if the Hadoop HDFS Fuse mountpoint is readable on notebook1003 is CRITICAL: connect to address 10.64.21.109 port 5666: Connection refused
[01:33:48] Analytics, Operations, Research, Article-Recommendation, and 3 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (Dzahn) >>! In T213566#4898998, @bmansurov wrote: > @Dzahn please feel free to invite a senior SRE to the discussion. Hi @...
[07:35:57] RECOVERY - Check if the Hadoop HDFS Fuse mountpoint is readable on notebook1003 is OK: OK
[07:47:50] Analytics, Operations, ops-eqiad, Patch-For-Review, User-Elukey: Degraded RAID on analytics1054 - https://phabricator.wikimedia.org/T213038 (elukey) Open→Resolved All good thanks a lot @Cmjohnson !
[07:54:03] Analytics, Operations, ops-eqiad, Patch-For-Review: Broken disk on analytics1056 - https://phabricator.wikimedia.org/T214057 (elukey) Open→Resolved all good thanks @Cmjohnson !
[08:02:11] Analytics, Analytics-Kanban, DBA, User-Elukey: Review dbstore1002's non-wiki databases and decide which ones needs to be migrated to the new multi instance setup - https://phabricator.wikimedia.org/T212487 (elukey) >>! In T212487#4906144, @jcrespo wrote: > @elukey Not sure if part of this task, o...
[09:28:42] Analytics: Check home leftovers of user imarlier (Ian Marlier) - https://phabricator.wikimedia.org/T213702 (Gilles) The data on Hive seems to be about sitemaps. This is a project Ian volunteered to work on, but that we (Performance Team) don't intend on pursuing, since it's unrelated to performance. The fol...
[09:51:07] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[09:51:28] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[11:54:31] Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Patch-For-Review: ExternalLinksChange Logging instrumentation is completely broken - https://phabricator.wikimedia.org/T162365 (Samwalton9) Open→Invalid We're starting again on this work, so this task is no longer nee...
[15:10:17] Analytics: Check home leftovers of user imarlier (Ian Marlier) - https://phabricator.wikimedia.org/T213702 (mpopov) >>! In T213702#4908349, @Gilles wrote: > The folks who (may) still work on sitemaps might be interested in that data, though. @ovasileva @mpopov ? Looks like it was some preliminary stuff on I...
[16:11:02] PROBLEM - Check if the Hadoop HDFS Fuse mountpoint is readable on notebook1003 is CRITICAL: connect to address 10.64.21.109 port 5666: Connection refused
[16:43:33] milimetric: o/ how long is the recentchanges event stream data kept around? Is it stored in a database too or is it temporarily available in kafka only?
[16:52:24] Analytics, EventBus, Parsoid, Reading-Infrastructure-Team-Backlog, and 2 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (bmansurov) p:Triage→Normal
[16:57:15] Analytics, EventBus, Parsoid, Reading-Infrastructure-Team-Backlog, and 3 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (bmansurov)
[17:14:17] Analytics, Analytics-EventLogging, Analytics-Kanban, MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), and 3 others: Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (AndyRussG) >>! In T187207#4902806, @Nuria wrote: > Ping here @AndyR...
[17:25:51] Analytics, EventBus, Parsoid, Reading-Infrastructure-Team-Backlog, and 3 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (bmansurov)
[17:41:42] RECOVERY - Check if the Hadoop HDFS Fuse mountpoint is readable on notebook1003 is OK: OK
[17:45:39] Analytics, EventBus, Parsoid, Reading-Infrastructure-Team-Backlog, and 3 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (bmansurov)
[18:07:17] Analytics, Analytics-Kanban, Page-Issue-Warnings, Patch-For-Review: event_pageissues Turnilo view contains no valid data from before January 5 - https://phabricator.wikimedia.org/T214136 (mforns) @Tbayer This is finished! Please check that pageissues and readingdepth contain the data that you ex...
[18:09:04] Analytics, Analytics-Kanban, Patch-For-Review: Finalize eventlogging to druid ingestion - https://phabricator.wikimedia.org/T206342 (mforns)
[18:13:15] (PS3) Mforns: Adapt EventLogging/WhiteListSanitization to new way of storing salts [analytics/refinery/source] - https://gerrit.wikimedia.org/r/484708 (https://phabricator.wikimedia.org/T212014)
[18:13:58] (PS4) Mforns: Make saltrotate store salts with timestamps as file name. [analytics/refinery] - https://gerrit.wikimedia.org/r/484250 (https://phabricator.wikimedia.org/T212014)
[18:20:46] Analytics, EventBus, Parsoid, Reading-Infrastructure-Team-Backlog, and 3 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (Nuria) Clarifying: ChnageProp consumes EventBus data just like EventStreams consumes EventBus data. So you cannot "use" changepro...
[18:25:58] Analytics, Core Platform Team, EventBus, Parsoid, and 4 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (bmansurov) Thanks, @Nuria. I'll wait and see what other teams have to say before drafting a solution. Since we're interested in identifying links,...
[18:36:19] Analytics, Core Platform Team, EventBus, Parsoid, and 4 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (Nuria) @bmansurov ah I think I understand what you meant, now sorry: if mediawiki cannot generate the diff you are interested on at the time the p...
[18:58:09] Analytics, Research, Article-Recommendation: Generate article recommendations in Hadoop for use in production - https://phabricator.wikimedia.org/T210844 (Nuria) I think this the one I can think of: https://github.com/wikimedia/wikimedia-discovery-analytics/blob/master/oozie/transfer_to_es/coordinato...
[19:00:26] Analytics, Core Platform Team, EventBus, Parsoid, and 4 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (Pchelolo) The task at hand is very easy. There's LinksUpdateComplete hook in MW core, it gets the LinksUpdate which contains the list of external l...
[19:05:08] Analytics, Core Platform Team, EventBus, Parsoid, and 4 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (Pchelolo) Moving to services-blocked until #research comes up with the schema for the event.
[19:16:46] Analytics, Core Platform Team, EventBus, Parsoid, and 4 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (bmansurov) @Pchelolo good to hear. Besides the links themselves, will we be able to extract metadata associated with the change too? We'll need the...
[19:26:07] Analytics, Core Platform Team, EventBus, Parsoid, and 4 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (Pchelolo) @bmansurov look at the event schema for properties change I've linked, it contains all of the metadata you need and it's generated from t...
[19:29:31] Analytics, Core Platform Team, EventBus, Parsoid, and 4 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (bmansurov) @Pchelolo yes, replacing "properties" with "links" should do it. I don't think we need the full list of links, just the changes. Also,...
[19:32:32] Analytics, Core Platform Team, EventBus, Parsoid, and 4 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (Pchelolo) > @Pchelolo yes, replacing "properties" with "links" should do it. I don't think we need the full list of links, just the changes. Perfe...
[19:45:29] Analytics, Core Platform Team, EventBus, Parsoid, and 5 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (Nuria) @bmansurov I think you need to consider also couple more things: a list of links can be very lengthy, do we have a limit for how much this f...
[20:01:35] Analytics, Core Platform Team, EventBus, Parsoid, and 5 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (Pchelolo) > Are links url encoded? (we probably want them to be so). I agree with @Nuria on this. The 'meta.uri' is encoded, we need to be consis...
[20:11:45] Analytics, Dumps-Generation, ORES, Scoring-platform-team, and 3 others: Decide whether we will include raw features - https://phabricator.wikimedia.org/T211069 (awight) Just for fun, I elaborated on the quick estimate based on existing `w_cache` files. Note that these are not the "root" data sou...
[20:31:28] Analytics, Knowledge-Integrity, Research, Epic, Patch-For-Review: Citation Usage: run third round of data collection - https://phabricator.wikimedia.org/T213969 (RyanSteinberg) Hi @bmansurov I interacted with [[ https://en.wikipedia.beta.wmflabs.org/wiki/Brown_bear | a beta cluster page ]] an...
[20:39:33] Analytics, Knowledge-Integrity, Research, Epic, Patch-For-Review: Citation Usage: run third round of data collection - https://phabricator.wikimedia.org/T213969 (bmansurov) Hi @RyanSteinberg, I forgot to mention that you need to follow [[ https://wikitech.wikimedia.org/wiki/Analytics/Systems/...
[21:29:55] Analytics, Knowledge-Integrity, Research, Epic, Patch-For-Review: Citation Usage: run third round of data collection - https://phabricator.wikimedia.org/T213969 (RyanSteinberg) @bmansurov I don't think I have access to `deployment-eventlog05.deployment-prep.eqiad.wmflabs` or any of the wmflab...
[21:32:06] Analytics, Knowledge-Integrity, Research, Epic, Patch-For-Review: Citation Usage: run third round of data collection - https://phabricator.wikimedia.org/T213969 (bmansurov) @RyanSteinberg I see. Let's see if @Miriam and @tizianopiccardi can verify the data. I think they should have access.
[21:40:02] Analytics, Research, WMDE-Analytics-Engineering, User-Addshore, User-Elukey: Provide tools for querying MediaWiki replica databases without having to specify the shard - https://phabricator.wikimedia.org/T212386 (Nuria) @Neil_P._Quinn_WMF: You will use this tool something like: >python my...
[21:48:53] Analytics, Core Platform Team, EventBus, Parsoid, and 5 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (bmansurov) @Nuria @Pchelolo I'm encoding the URL in the above patch. How should we handle the situation with long list of links? Is that really a p...
[21:53:20] bmansurov: sorry was off today, I think only 90 days, same as the recentchanges table in prod.
[21:53:43] milimetric: nw, thanks!
[22:00:40] Analytics, Core Platform Team, EventBus, Parsoid, and 5 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (Pchelolo) > How should we handle the situation with long list of links? Is that really a problem? Should we create multiple events? You should not...
[22:52:40] Analytics, Analytics-Kanban, Page-Issue-Warnings, Patch-For-Review: event_pageissues Turnilo view contains no valid data from before January 5 - https://phabricator.wikimedia.org/T214136 (Tbayer) >>! In T214136#4909559, @mforns wrote: > @Tbayer > This is finished! Please check that pageissues an...
[23:29:17] Analytics, good first bug: Productionize and run 2018 job for Global Innovation Index from Hadoop Geowiki data - https://phabricator.wikimedia.org/T190535 (Nuria) Moving this to Kanban and assigning this work to @fdans
[23:29:28] Analytics, good first bug: Productionize and run 2018 job for Global Innovation Index from Hadoop Geowiki data - https://phabricator.wikimedia.org/T190535 (Nuria) a:fdans
[23:35:40] (CR) Nuria: [C: -1] "Just couple nits." (5 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/484657 (https://phabricator.wikimedia.org/T206894) (owner: Fdans)
[23:36:57] Analytics, Analytics-Kanban, Datasets-General-or-Unknown, Patch-For-Review: cron job rsyncing dumps webserver logs to stat1005 is broken - https://phabricator.wikimedia.org/T211330 (Nuria) Open→Resolved
[23:37:14] Analytics, Analytics-Kanban: Create Office Hours for Team Analytics - https://phabricator.wikimedia.org/T211609 (Nuria) Open→Resolved
[23:38:06] Analytics, Analytics-Kanban: virtualpageview_hourly lacks data from December 17 on - https://phabricator.wikimedia.org/T213602 (Nuria) Open→Resolved
[23:38:56] Analytics, Analytics-Kanban, Operations, Patch-For-Review: Allow the deployment of users without SSH access - https://phabricator.wikimedia.org/T212949 (Nuria) Open→Resolved
[23:39:24] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Nuria)
[23:39:28] Analytics, Analytics-Kanban, DBA, User-Elukey: Review dbstore1002's non-wiki databases and decide which ones needs to be migrated to the new multi instance setup - https://phabricator.wikimedia.org/T212487 (Nuria) Open→Resolved
[23:44:16] Analytics: Refine: Use Spark SQL instead of Hive JDBC - https://phabricator.wikimedia.org/T209453 (Nuria)
[23:44:51] Analytics, Core Platform Team, EventBus, Parsoid, and 5 others: How to surface link changes as a stream? - https://phabricator.wikimedia.org/T214706 (Nuria) p:Normal→Triage
[23:52:17] Analytics, Analytics-EventLogging, Analytics-Kanban, Performance-Team (Radar), Readers-Web-Backlog (Tracking): Make it easier to enable EventLogging's debug mode - https://phabricator.wikimedia.org/T188640 (Nuria) To be honest I do not see us working on this feature soon, the debug mode is in...
[23:52:32] Analytics, Analytics-EventLogging, Performance-Team (Radar), Readers-Web-Backlog (Tracking): Make it easier to enable EventLogging's debug mode - https://phabricator.wikimedia.org/T188640 (Nuria) p:High→Low a:Milimetric→None
[23:53:04] Analytics: Generate pagecounts-ez data back to 2008 - https://phabricator.wikimedia.org/T188041 (Nuria)
[23:54:32] Analytics, Analytics-Data-Quality, Analytics-Kanban, Growth-Team, and 2 others: Add EditAttemptStep properties to the schema whitelist - https://phabricator.wikimedia.org/T208332 (Nuria) My mistake here cause I did not realize since recent whitelist changes need to be deployed to be applied.
[23:54:43] Analytics, Analytics-Data-Quality, Analytics-Kanban, Growth-Team, and 2 others: Add EditAttemptStep properties to the schema whitelist - https://phabricator.wikimedia.org/T208332 (Nuria) Open→Resolved