[02:48:55] Analytics: Comments missing in mediawiki_history table - https://phabricator.wikimedia.org/T211535 (Tgr)
[06:35:50] Analytics, DBA, Data-Services, User-Banyek, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Marostegui) Open>Resolved a:Marostegui Closing this as the hardware has been decided to be purchased and will be followed up at:...
[07:28:31] goood morning
[07:28:41] just added the AAAA dns records to analytics103*
[07:28:52] will proceed with the rest during the day
[08:15:42] Morning elukey
[08:15:57] elukey: May I update the druid datasource for AQS?
[08:19:53] joal: sure!
[08:20:17] elukey: Thanks :) everything seems alright with it
[08:21:02] super, going to merge the next dns change in a bit
[08:21:07] k
[08:24:07] merged the AAAA records for analytics104*
[08:24:15] next batch will be the rest
[08:32:08] joal: aqs1004 ready (and depooled)
[08:32:19] Analytics: Comments missing in mediawiki_history table - https://phabricator.wikimedia.org/T211535 (JAllemandou) Thanks for pointing this out @Tgr. This is a known issue due indeed to the comment-storage change. Comments are unavailable for snapshots 2018-10 and 2018-11 and we are currently devising a soluti...
[08:33:54] eqi aqs1004
[08:33:58] oops
[08:34:09] o/
[08:34:19] hi fdans
[08:35:50] elukey: Good for me :)
[08:36:15] joal: so good to apply to the rest?
[08:36:21] yessir
[08:40:23] joal: done!
[08:44:44] joal: the rest of the AAAA records are pushed
[08:45:23] now I need to do the an-worker ones, but the SRE's dns CI job should be unblocked (i.e. not doing any -1 due to our records)
[09:22:44] joal: https://www.youtube.com/watch?v=pr_9jF-wL3M
[09:22:54] Make Hadoop great again!
[09:29:01] (and this was 2y ago) [09:29:45] it could be the real alternative to Cloudera/Hortonworks [09:33:08] it also seems that Hops is used by Spotify [09:33:16] with their huge cluster [09:33:21] that is not bad [09:35:25] elukey: only concern is if at some points hops decides to stop being opensource - But I guess we'll see at this point [09:38:41] yep this is a good concern indeed [09:38:54] but they'll probably shoot themselves on the feet [09:39:03] if they think about it [09:39:22] there's no way they could survive with all the other players doing open source [09:39:38] * joal tries not to forget the pivot mess [09:39:58] yeah that was a bit different though :) [09:40:09] anyway, it is a good point [09:41:16] but except from that, it sounds a good move :) [09:43:24] \o/ ! Think I finally managed to have a better error checking script for webrequest :) [09:44:06] nice! [09:44:19] Man it's been painful :( [09:48:54] But the solution is simpler [09:49:02] And accounts for everything [10:04:29] (03PS3) 10Fdans: Adds logic and configuration for project families [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T205665) [10:07:34] (03CR) 10jerkins-bot: [V: 04-1] Adds logic and configuration for project families [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T205665) (owner: 10Fdans) [10:10:06] (03PS1) 10Joal: Update webrequest data-loss false positive check [analytics/refinery] - 10https://gerrit.wikimedia.org/r/478626 (https://phabricator.wikimedia.org/T211000) [10:10:16] elukey: sorry, very verbose commit message --^ [10:14:07] (03PS4) 10Fdans: Adds logic and configuration for project families [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T205665) [10:16:44] (03CR) 10jerkins-bot: [V: 04-1] Adds logic and configuration for project families [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/464583 
(https://phabricator.wikimedia.org/T205665) (owner: 10Fdans) [10:17:23] 10Analytics, 10Analytics-Kanban, 10DBA, 10Data-Services, 10Core Platform Team Backlog (Watching / External): Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10JAllemandou) a:05JAllemandou>03Milimetric [10:44:28] joal: I gave it a first pass and it looks very good :) [10:44:46] one thing that I'd like to know is if this is happening now [10:45:09] more specifically in the middle of the hour (so not around the edges) [10:45:32] or if it was a specific weirdness triggered by the Varnish change [10:45:45] (if we already discussed this I forgot sorry :) [10:46:37] elukey: I'm pretty sure it is happening now -- However at a smaller scale [10:47:16] elukey: this makes me think I should provide a different script and modify the one I sent to be more informative in that regard [10:48:25] (03PS5) 10Fdans: Adds logic and configuration for project families [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T205665) [11:13:05] (03PS1) 10Fdans: Don't display change chart button if there's only one chart available [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/478638 (https://phabricator.wikimedia.org/T210424) [11:16:13] (03PS1) 10Fdans: Replaces userpage link with contributions list [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/478639 (https://phabricator.wikimedia.org/T210422) [11:16:57] (03PS2) 10Fdans: Replace userpage link with contributions list [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/478639 (https://phabricator.wikimedia.org/T210422) [11:21:43] joal: sorry just read, what changes do you have in mind? [11:40:16] * elukey lunch + errand! 
[13:07:25] (03PS2) 10Joal: Update webrequest data-loss false positive check [analytics/refinery] - 10https://gerrit.wikimedia.org/r/478626 (https://phabricator.wikimedia.org/T211000) [14:34:23] joal: catching up, new snapshot looks good, shall I start the other sqoop now? [14:34:56] milimetric: indeed, it's been deployed to AQS and is live :) [14:35:09] yeah, I checked it in wikistats, looks good [14:35:10] milimetric: Please let's start the new prcess to test :) [14:35:14] ok, will do [14:35:15] (03CR) 10Lucas Werkmeister (WMDE): "What’s the status of this after your discussion with Lydia? Is there anything left to be done or can it be merged?" [analytics/wmde/toolkit-analyzer] - 10https://gerrit.wikimedia.org/r/475807 (https://phabricator.wikimedia.org/T209399) (owner: 10Michael Große) [14:35:28] Thanks milimetric :) [14:39:36] 10Analytics, 10Analytics-Kanban, 10Operations, 10netops: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster - https://phabricator.wikimedia.org/T207321 (10Ottomata) AH a task! I missed that. Making now. [14:40:08] elukey: o/ are you ok with the name ca-worker, etc. in cloud-analytics ? [14:40:12] for vm machine names? [14:40:28] ottomata: o/ sure! [14:41:26] also [14:41:27] numbering [14:41:31] should we do like we do in prod? [14:41:34] ca-worker1001 ? [14:41:34] or [14:41:44] should we try to match what the horizon interface automaticaly generates? [14:41:45] e.g. [14:41:51] ca-worker-1, ca-worker-2 [14:41:53] ? [14:42:43] or maybe the 1001 etc. will confuse people because it looks like prod eqiad network stuff [14:42:54] ca-worker01 [14:42:54] ? [14:42:55] elukey: ^ [14:42:58] any preference :) [14:42:59] ? 
[14:43:46] (in an interview will answer in 10 mins sorry :)
[14:44:53] (CR) Michael Große: "> Patch Set 7:" [analytics/wmde/toolkit-analyzer] - https://gerrit.wikimedia.org/r/475807 (https://phabricator.wikimedia.org/T209399) (owner: Michael Große)
[14:49:48] joal: interesting thing I never thought of - can we sqoop from private and labsdb at the same time? We'll find out!
[14:50:12] I guess we do with cu_changes already, and the yarn job name should be unique
[14:50:46] anyway, let's log what's going on
[14:51:52] !log trying the labsdb/analytics-store combination sqoop, live logs in /home/milimetric/sqoop-[private-]log.log on stat1004
[14:51:53] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:56:01] (CR) Ottomata: "Cool! Wonder where this should live if it is no longer hive/ ?" [analytics/refinery] - https://gerrit.wikimedia.org/r/478626 (https://phabricator.wikimedia.org/T211000) (owner: Joal)
[15:03:37] ottomata: here I am sorry
[15:03:38] soooo naming
[15:04:02] we are discussing the names of the vms right? I like ca-worker-1/2/etc..
[15:04:09] seems straightforwardd
[15:04:15] *straightforward
[15:04:41] (brb)
[15:05:46] Analytics, Analytics-Kanban, Product-Analytics: Bug: can't make a YoY time series chart in Superset - https://phabricator.wikimedia.org/T210687 (fdans) After digging a lil bit, I'm pretty sure this issue was fixed with this change: https://github.com/apache/incubator-superset/pull/5931 Current vers...
[15:08:09] (CR) Ottomata: [C: 1] Add EventLoggingSanitizationMonitor.scala (4 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/478126 (https://phabricator.wikimedia.org/T202429) (owner: Mforns)
[15:11:56] fdans: I was about to merge your wikistats stuff and saw the big linting change: 4269763665
[15:11:59] oops...
[15:12:07] https://gerrit.wikimedia.org/r/#/c/analytics/wikistats2/+/471371/ [15:12:36] I'm going to try and merge that if you don't have any objections [15:13:03] milimetric: wat nooo then i’ll have like a thousand conflicts [15:13:18] ok, then we should've abandoned this patch a long time ago [15:13:48] hm... actually I don't love some of it, I want superfluous commas [15:14:04] ok, let's merge yours and then I'll rework this patch [15:14:15] I do LOVE that it found more double quotes, because I HATE double quotes [15:15:06] (03PS8) 10Michael Große: Update metric's items and properties automatically [analytics/wmde/toolkit-analyzer] - 10https://gerrit.wikimedia.org/r/475807 (https://phabricator.wikimedia.org/T209399) [15:16:59] "milimetric" [15:17:02] * elukey runs away [15:18:34] elukey: lol, no just in javascript. So I'm safe with you 'cause you wouldn't stoop to writing javascript just to upset me :) [15:18:49] elukey: yup vms [15:18:49] ok! [15:19:39] 10Analytics, 10Analytics-Kanban: Set up 5 decidated VMs on the cloudvirtan hardware in the cloud-analytics project - https://phabricator.wikimedia.org/T211599 (10Ottomata) p:05Triage>03High [15:22:46] omg the ocd squad [15:24:37] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10MW-1.33-notes (1.33.0-wmf.6; 2018-11-27): Deprecation Information for EventLogging ResourceLoader modules - https://phabricator.wikimedia.org/T205744 (10Petar.petkovic) [15:41:55] 10Analytics, 10Analytics-Kanban: Upgrade Superset to 0.28.1 - https://phabricator.wikimedia.org/T211605 (10fdans) [15:44:36] 10Analytics, 10Product-Analytics: As a user of Superset I would like it to be up-to-date so I'm not blocked by bugs that have already been fixed - https://phabricator.wikimedia.org/T211606 (10mpopov) [15:49:49] (03CR) 10Milimetric: [C: 04-1] Adds logic and configuration for project families (0314 comments) [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T205665) (owner: 
10Fdans) [15:50:37] 10Analytics, 10Analytics-Kanban: Set up 5 decidated VMs on the cloudvirtan hardware in the cloud-analytics project - https://phabricator.wikimedia.org/T211599 (10Andrew) I created these 5 VMs and added profile::labs::lvm::srv so that most of the storage is available on /srv. I hope they do the trick! [15:50:38] hey team :] [15:52:28] (03CR) 10Milimetric: "I'm sorry I'm just now seeing this. Thanks for doing the work. We have a couple of bigger features in the mix right now, we'll merge tho" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/471371 (https://phabricator.wikimedia.org/T208697) (owner: 10John Erling Blad) [15:52:52] (03CR) 10Mforns: [C: 032] Replace userpage link with contributions list [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/478639 (https://phabricator.wikimedia.org/T210422) (owner: 10Fdans) [15:55:32] (03CR) 10Mforns: [C: 032] Don't display change chart button if there's only one chart available [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/478638 (https://phabricator.wikimedia.org/T210424) (owner: 10Fdans) [15:59:41] 10Analytics, 10Analytics-Kanban: Set up 5 decidated VMs on the cloudvirtan hardware in the cloud-analytics project - https://phabricator.wikimedia.org/T211599 (10Ottomata) AWESOOOME. WIll start to work with these today oh boy! [16:00:03] 10Analytics, 10Product-Analytics: Investigate referrer class change on Chrome Mobile from September 13, 2018 - https://phabricator.wikimedia.org/T211077 (10Nuria) While change seems sharp on the 12 months plot it actually happens in the course of couple weeks from September 9th to September 22nd. The fact that... [16:01:26] fdans: holaaaa.... standup! 
[16:39:50] 10Analytics: Create Office Hours for Team Analytics - https://phabricator.wikimedia.org/T211609 (10Milimetric) [16:41:32] (03PS6) 10Fdans: Adds logic and configuration for project families [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T205665) [16:41:34] (03CR) 10Fdans: Adds logic and configuration for project families (0313 comments) [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T205665) (owner: 10Fdans) [16:42:25] 10Analytics, 10Analytics-Kanban: Create Office Hours for Team Analytics - https://phabricator.wikimedia.org/T211609 (10fdans) [16:43:54] 10Analytics, 10Analytics-Kanban: Upgrade Superset to 0.28.1 - https://phabricator.wikimedia.org/T211605 (10fdans) p:05Triage>03High [16:47:00] 10Analytics: Comments missing in mediawiki_history table - https://phabricator.wikimedia.org/T211535 (10fdans) [16:47:08] 10Analytics, 10Analytics-Kanban, 10DBA, 10Data-Services, 10Core Platform Team Backlog (Watching / External): Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (10fdans) [16:52:05] 10Analytics: "Edit" equivalent of pageviews daily available to use in Turnilo and Superset - https://phabricator.wikimedia.org/T211173 (10fdans) We'll be working on this on Q3 2019. This is easier to achieve if you don't need article title. 
[16:52:48] 10Analytics: "Edit" equivalent of pageviews daily available to use in Turnilo and Superset - https://phabricator.wikimedia.org/T211173 (10fdans) p:05Triage>03High [16:52:56] 10Analytics, 10Product-Analytics: Investigate referrer class change on Chrome Mobile from September 13, 2018 - https://phabricator.wikimedia.org/T211077 (10fdans) a:03Nuria [16:53:38] 10Analytics, 10Product-Analytics: Investigate referrer class change on Chrome Mobile from September 13, 2018 - https://phabricator.wikimedia.org/T211077 (10fdans) p:05Triage>03Normal [16:55:25] 10Analytics, 10Dumps-Generation, 10ORES, 10Scoring-platform-team, and 3 others: Decide whether we will include raw features - https://phabricator.wikimedia.org/T211069 (10fdans) Sounds ok to us! [17:00:45] 10Analytics: Give access to Superset to Pau - https://phabricator.wikimedia.org/T211036 (10fdans) [17:01:29] 10Analytics: Give access to Superset to Pau - https://phabricator.wikimedia.org/T211036 (10fdans) @Pginer-WMF can you check whether you have access now? 
https://superset.wikimedia.org/ [17:02:33] 10Analytics: Give access to Superset to Pau - https://phabricator.wikimedia.org/T211036 (10fdans) p:05Triage>03Normal [17:05:57] 10Analytics, 10Analytics-Kanban, 10Research: POC More efficient Bot filtering on pageview data - https://phabricator.wikimedia.org/T211359 (10fdans) a:03Nuria [17:06:57] 10Analytics, 10Analytics-Kanban, 10Datasets-General-or-Unknown, 10Patch-For-Review: cron job rsyncing dumps webserver logs to stat1005 is broken - https://phabricator.wikimedia.org/T211330 (10fdans) p:05Normal>03High [17:08:43] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10Services (done): Refinery Spark HiveExtensions schema merge should support merging of arrays with struct elements - https://phabricator.wikimedia.org/T210465 (10fdans) 05Open>03Resolved [17:09:42] 10Analytics, 10Analytics-Kanban: Create Office Hours for Team Analytics - https://phabricator.wikimedia.org/T211609 (10Milimetric) p:05Triage>03Normal [17:17:05] 10Analytics, 10Operations, 10Performance-Team, 10Traffic: Only serve debug HTTP headers when x-wikimedia-debug is present - https://phabricator.wikimedia.org/T210484 (10Anomie) > server I note that with X-Wikimedia-Debug it seems you have to specify a backend, so this wouldn't be terribly useful there eit... [17:38:56] 10Analytics: Give access to Superset to Pau - https://phabricator.wikimedia.org/T211036 (10Pginer-WMF) >>! In T211036#4811133, @fdans wrote: > @Pginer-WMF can you check whether you have access now? https://superset.wikimedia.org/ > > Also please confirm that your wikitech username is pginer. I still cannot acc... [17:40:12] (03CR) 10Nuria: [C: 04-1] "If I select "unique devices" and split by access site I get errors on console and a "something went wrong". 
I think we agreed the "split"" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T205665) (owner: 10Fdans) [17:44:33] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Add new wikis to analytics - https://phabricator.wikimedia.org/T209822 (10Nuria) @fdans did we check these wikis are on the newly scooped snapshot? [17:45:05] 10Analytics, 10Analytics-Kanban, 10Pageviews-API, 10Patch-For-Review: Pageviews top endpoint in descending order as of 2018-11-20 - https://phabricator.wikimedia.org/T210091 (10Nuria) 05Open>03Resolved [17:45:54] 10Analytics, 10Research, 10WMDE-Analytics-Engineering, 10User-Addshore, 10User-Elukey: Phase out and replace analytics-store (multisource) - https://phabricator.wikimedia.org/T172410 (10bmansurov) @elukey where can I see the mappings of wikishared, log, and centralauth to sX? [17:48:13] 10Analytics, 10ORES, 10Scoring-platform-team: Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (10Nuria) p:05High>03Triage [17:49:42] nuria: o/ - when you have a min can you approve https://phabricator.wikimedia.org/T211095 ? [17:49:46] fdans: can you follow up on this task to make sure changes (sessionId, pageId) have been submitted and task can be closed? [17:50:07] elukey: approved, thanks [17:50:08] 10Analytics, 10Operations, 10SRE-Access-Requests, 10Patch-For-Review: Grant fdans permissions to deploy AQS in prod, and accessing the aqs hosts - https://phabricator.wikimedia.org/T211095 (10Nuria) Approved. 
[17:50:30] thanks :) [17:50:56] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Update log_namespace, page_namespace from bigint to int - https://phabricator.wikimedia.org/T209179 (10Nuria) [17:51:01] 10Analytics, 10Analytics-Kanban: Refactor Mediawiki-Database ingestion - https://phabricator.wikimedia.org/T209178 (10Nuria) [17:51:03] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Update log_namespace, page_namespace from bigint to int - https://phabricator.wikimedia.org/T209179 (10Nuria) 05Open>03Resolved [17:51:17] 10Analytics, 10Analytics-Kanban, 10Analytics-Wikistats, 10Patch-For-Review: Performance tweaks for state management in wikistats - https://phabricator.wikimedia.org/T207352 (10Nuria) 05Open>03Resolved [17:51:52] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Allow hadoop prod jobs to preempt resource over default queue - https://phabricator.wikimedia.org/T208208 (10Nuria) 05Open>03Resolved [17:52:05] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: is-yarn-app-running script should output the running application id - https://phabricator.wikimedia.org/T206555 (10Nuria) 05Open>03Resolved [17:52:54] (03CR) 10Nuria: "Nice, thanks for doing these changes." 
[analytics/dashiki] - 10https://gerrit.wikimedia.org/r/476195 (https://phabricator.wikimedia.org/T210570) (owner: 10Milimetric) [17:53:07] 10Analytics, 10Analytics-Dashiki, 10Analytics-Kanban, 10Patch-For-Review: Dashiki should filter out empty newlines - https://phabricator.wikimedia.org/T210570 (10Nuria) 05Open>03Resolved [17:53:54] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: [EventLogging Sanitization] Fix passing of input_path_regex params to Refine - https://phabricator.wikimedia.org/T210110 (10Nuria) 05Open>03Resolved [17:58:15] 10Analytics, 10Analytics-Kanban, 10Research, 10Patch-For-Review: Automate XML-to-parquet transformation for XML dumps (oozie job) - https://phabricator.wikimedia.org/T202490 (10Nuria) @JAllemandou I think we still need to document the table right? https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Me... [17:58:27] 10Analytics, 10Analytics-Kanban: Upgrade Matomo to 3.6.1 or 3.7.0 - https://phabricator.wikimedia.org/T209808 (10Nuria) 05Open>03Resolved [17:59:45] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10Readers-Web-Backlog (Tracking): Ingest data aggregate ReadingDepth data into Druid - https://phabricator.wikimedia.org/T205562 (10Nuria) 05Open>03Resolved [17:59:50] 10Analytics, 10Analytics-Kanban: Update EventLogging kafkacat examples to use jumbo - https://phabricator.wikimedia.org/T209635 (10Nuria) 05Open>03Resolved [18:01:22] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Move turnilo to nodejs 10 - https://phabricator.wikimedia.org/T210705 (10Nuria) 05Open>03Resolved [18:10:31] fdans: you can deploy aqs now :) [18:36:04] chelsyx: yt? I have a question [18:38:17] fdans: i pinged you on couple tasks that were still open , let's finish those before starting a new one if that is OK with you? 
[18:43:27] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Contributors-Analysis, 10Product-Analytics: Hive join fails when using a HiveServer2 client - https://phabricator.wikimedia.org/T206279 (10Nuria) 05Open>03Resolved [18:43:59] 10Analytics, 10Analytics-Kanban, 10Product-Analytics, 10Patch-For-Review: Hive query fails with local join - https://phabricator.wikimedia.org/T209536 (10Nuria) 05Open>03Resolved [18:54:43] 10Analytics, 10Research, 10WMDE-Analytics-Engineering, 10User-Addshore, 10User-Elukey: Phase out and replace analytics-store (multisource) - https://phabricator.wikimedia.org/T172410 (10elukey) @bmansurov: * log is not anymore on dbstore1002/analytics-store, but you can find it in analytics-slave (db1108... [18:56:25] * elukey off! [19:20:45] 10Analytics, 10Operations, 10Performance-Team, 10Traffic: Only serve debug HTTP headers when x-wikimedia-debug is present - https://phabricator.wikimedia.org/T210484 (10Krinkle) >>! In T210484#4811185, @Anomie wrote: >> server > > I note that with X-Wikimedia-Debug it seems you have to specify a backend,... [19:23:16] 10Analytics, 10Analytics-Kanban, 10Research, 10Patch-For-Review: Automate XML-to-parquet transformation for XML dumps (oozie job) - https://phabricator.wikimedia.org/T202490 (10JAllemandou) OMG ! Done ... Sorry for having skipped that one :S [19:25:04] (03CR) 10Joal: "> Patch Set 2:" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/478626 (https://phabricator.wikimedia.org/T211000) (owner: 10Joal) [19:25:47] (03CR) 10Ottomata: [C: 032] "OHHH right this is not the actual sequence calculation job. Very cool!" 
[analytics/refinery] - 10https://gerrit.wikimedia.org/r/478626 (https://phabricator.wikimedia.org/T211000) (owner: 10Joal) [19:25:53] (03CR) 10Ottomata: [V: 032 C: 032] Update webrequest data-loss false positive check [analytics/refinery] - 10https://gerrit.wikimedia.org/r/478626 (https://phabricator.wikimedia.org/T211000) (owner: 10Joal) [19:28:42] 10Analytics-EventLogging, 10Analytics-Kanban: [EventLogging Sanitization] Enable older-than-90-day purging of unsanitized EL database (event) in Hive - https://phabricator.wikimedia.org/T209503 (10mforns) @leila and @Miriam, can you please leave me a couple days to execute the deletion script before the Christ... [19:29:59] 10Analytics-EventLogging, 10Analytics-Kanban: [EventLogging Sanitization] Enable older-than-90-day purging of unsanitized EL database (event) in Hive - https://phabricator.wikimedia.org/T209503 (10mforns) Oh, and if you guys need any help copying/formatting that data, you can ping me and I'll try to help. [19:37:40] !log Manually deleting old druid-public snapshots that were not following datasource naming convention (- instead of _) [19:37:42] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:25:12] 10Analytics, 10Anti-Harassment, 10Product-Analytics: Mediawiki history has no data on IP blocks - https://phabricator.wikimedia.org/T211627 (10nettrom_WMF) [20:46:14] 10Analytics, 10Anti-Harassment, 10Product-Analytics: Mediawiki history has no data on IP blocks - https://phabricator.wikimedia.org/T211627 (10JAllemandou) Hi @nettrom_WMF , Indeed the mediawiki_history table doesn't contain historical blocks (or actually group, but it's not relevant). The approach taken whe... 
[20:49:46] 10Analytics, 10Operations, 10SRE-Access-Requests, 10Patch-For-Review: Grant fdans permissions to deploy AQS in prod, and accessing the aqs hosts - https://phabricator.wikimedia.org/T211095 (10Dzahn) approved in SRE-2018-12-10#Access_Requests pending manager approval which is now done [21:07:02] 10Analytics, 10Operations, 10SRE-Access-Requests, 10Patch-For-Review: Grant fdans permissions to deploy AQS in prod, and accessing the aqs hosts - https://phabricator.wikimedia.org/T211095 (10Dzahn) @fdans said: > There are two things I'd like to do .. - "Deploy AQS with scap from deployment.eqiad.wmnet"... [21:07:27] 10Analytics, 10Operations, 10Performance-Team, 10Traffic: Only serve debug HTTP headers when x-wikimedia-debug is present - https://phabricator.wikimedia.org/T210484 (10Gilles) a:03Gilles [21:07:52] 10Analytics, 10Operations, 10SRE-Access-Requests, 10Patch-For-Review: Grant fdans permissions to deploy AQS in prod, and accessing the aqs hosts - https://phabricator.wikimedia.org/T211095 (10Dzahn) 05Open>03Resolved a:03Dzahn [21:12:46] 10Analytics, 10Analytics-Kanban, 10Research, 10Patch-For-Review: Automate XML-to-parquet transformation for XML dumps (oozie job) - https://phabricator.wikimedia.org/T202490 (10Nuria) Ok, docs at : https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawki_wikitext_history [21:20:20] ottomata: (cc mforns ) want to talk about refinemonitor? [21:20:55] nuria: sure let's do it via IRC [21:21:17] ottomata: k , ahem ... 
let me read your comments
[21:21:53] k
[21:22:36] nuria: i think your objection has to do with the way ConfigHelper (which I wrote) works
[21:22:41] we wanted the ability to load from either CLI or from Config files
[21:22:45] so we couldn't use scopt anymore
[21:22:50] we went with Profig
[21:22:50] ottomata: aham
[21:23:03] which does not have built in support for help messages
[21:23:17] so, the help message stuff is done manually, like https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/refine/Refine.scala#L63
[21:25:25] ottomata: right, i saw that. i do not love it cause it makes code all mixed up with plain english text hard to read
[21:25:43] ottomata: but i get that that is a minor point that mforns should not have to change in his change
[21:26:38] I believe the change is consistent with how we use ConfigHelper elsewhere
[21:26:56] mforns: ya, it is
[21:27:37] and I appreciate the way this structure prevents code repetition
[21:28:18] because we can merge the param description from RefineMonitor with the one specific to EventLoggingSanitizationMonitor without repeating code
[21:28:35] mforns: right yes
[21:28:46] nuria: how do you feel about docopt say you love it because i love it
[21:28:47] :p
[21:28:49] but yea, I also see the logic is a bit split
[21:30:03] nuria: would it be that different with scopt? or python argparse? those work the same
[21:30:29] .param('-f', '--force', help='Force the thing to do the thing', default=False)
[21:30:32] or whatever
[21:30:51] ottomata: ok, let me say we do not need to change anything now but i think this type of code is a mix of functional plus oo but it is neither here nor there
[21:31:23] aye we agree mforns should proceed, but now you've started a discussion so oh boyyy!
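The `.param(...)` line at 21:30:29 is shorthand rather than a real call. A minimal sketch of the same idea with Python's standard `argparse`, where each parameter carries its own help text and the usage message is generated from those definitions, might look like this. The program name and the `--since` option are hypothetical, chosen only for illustration; they are not taken from the refinery code.

```python
import argparse


def build_parser():
    # Hypothetical program name; not an actual refinery job.
    parser = argparse.ArgumentParser(
        prog="refine-monitor-sketch",
        description="Illustrative parser: help text lives next to each parameter.",
    )
    # Mirrors the shorthand from the chat: a boolean flag with its own help string.
    parser.add_argument(
        "-f", "--force",
        action="store_true",
        default=False,
        help="Force the thing to do the thing",
    )
    # Hypothetical second option, to show how help text accumulates per parameter.
    parser.add_argument(
        "--since",
        type=int,
        default=24,
        help="Look back this many hours (assumed option, for illustration)",
    )
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--force"])
help_text = build_parser().format_help()
```

The help strings are still plain English sitting inside code, which is the objection raised at 21:25:25. The difference from a hand-written usage block is only that the library assembles the message.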
[21:31:38] ottomata: if class A uses class B i think the convention is that you have a constructor where class B is passed to class A like
[21:31:41] heheh
[21:31:41] how would you like to see an arg parser lib fix your complaint?
[21:33:09] ottomata: AobjectConstructor(B BObject) so B is used by A and if i have to write a unit test I can mock B
[21:33:35] in this case RefineMonitor is used by SanitizeMonitor?
[21:33:50] so you want SanitizeMonitor to be passed RefineMonitor somehow?
[21:34:35] ottomata: in this case eventloggingsanitize monitor uses refine (or wraps it, or decorates it) or it could even be i do not know a filter pattern cause it is one criteria plus another additional criteria applied
[21:35:08] ottomata: but the structure of what happens is not apparent, not that we have to make it so in this one change
[21:36:37] nuria: am confused your comments were about usage docs?
[21:37:26] ottomata: ya, i know, it is more a complaint about overall structure but it is nothing that we will resolve in this change
[21:38:44] ottomata: let me give you an example
[21:39:34] ottomata: if we had a formal structure i would expect that if we add 1 more config mandatory parameter to refine the class that uses refine will not compile
[21:39:57] *the class that uses refine, in this case, EventLoggingSanitizationMonitor
[21:40:23] ottomata: cause a non backwards compatible addition of a parameter should trigger a compile fail
[21:41:22] ottomata: in this case i do not think that would happen because there is no formal structure as to what is required (in terms of config or constructor) for Refine to work
[21:41:42] nuria, wouldn't that force EventLoggingSanitizationMonitor to repeat all RefineMonitor params?
[21:42:42] mforns: no, i do not think so.
[21:43:27] Analytics, Analytics-Kanban: Increase quota in cloud-analytics project for zookeeper nodes - https://phabricator.wikimedia.org/T211634 (Ottomata) p:Triage>High
[21:43:45] we could do something analogous to the scala case class copy function
[21:44:05] like: RefineMonitor.Config.copy(newParameter = newValue)
[21:44:38] ottomata: is stat1004 ok? I'm kind of just timing out connecting to it
[21:45:03] I had two terminals printing logs that I had to force-close, just trying to check on my screens there and can't get back in
[21:45:12] milimetric: seems ok to me
[21:45:24] Analytics, Analytics-Kanban: Increase quota in cloud-analytics project for zookeeper nodes - https://phabricator.wikimedia.org/T211634 (Andrew) Looks like you have room for two more smalls with existing quotas -- is that space spoken for already?
[21:46:32] ottomata, mforns: anyways, i do not think we need to change anything right now, I just feel that we are in this not so well structured in-between of OO and functional and purely scripting code. purely oo would give us compile fails for issues that now we will find at runtime. Now, these classes are just wrappers so maybe they do not require this much thought
[21:47:16] mforns: please merge if andrew feels it is ok to do so
[21:47:26] ok, will do
[21:47:56] Analytics, Analytics-Kanban: Increase quota in cloud-analytics project for zookeeper nodes - https://phabricator.wikimedia.org/T211634 (Ottomata) OH! I think you are right, I was reading that wrong. Ok, I'll work with this for now. Thanks!
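Two ideas from the RefineMonitor exchange above can be sketched together: a config type whose required parameters are declared formally, and a `copy`-style derivation of one config from another so the wrapping job does not repeat every parameter. The sketch below is in Python and is only an analogy. The class and field names are hypothetical and do not reflect the actual refinery-source classes. One caveat: Python reports a missing required field as a `TypeError` at construction time, whereas the compile-time failure asked for in the chat is something a Scala case class without default values would give.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RefineMonitorConfig:
    # Hypothetical fields. Neither has a default, so both are required.
    input_path: str
    since_hours: int


class RefineMonitor:
    def __init__(self, config: RefineMonitorConfig):
        self.config = config

    def describe(self) -> str:
        return f"checking {self.config.input_path} for the last {self.config.since_hours}h"


class SanitizationMonitor:
    # The collaborator is passed in through the constructor,
    # so a unit test can hand in a stub instead of a real RefineMonitor.
    def __init__(self, refine_monitor):
        self.refine_monitor = refine_monitor

    def run(self) -> str:
        return "sanitized: " + self.refine_monitor.describe()


base = RefineMonitorConfig(input_path="/data/raw", since_hours=24)

# Analogue of Scala's case class copy: override one field, keep the rest.
derived = replace(base, input_path="/data/sanitized")

result = SanitizationMonitor(RefineMonitor(derived)).run()
```

With this shape, adding a new required field to `RefineMonitorConfig` breaks every construction site that omits it, while `replace` keeps the derived config from restating the unchanged parameters.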
[21:48:07] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Presto cluster online and usable with test data pushed from analytics prod infrastructure accessible by Cloud (labs) users - https://phabricator.wikimedia.org/T204951 (10Ottomata) [21:48:10] 10Analytics, 10Analytics-Kanban: Increase quota in cloud-analytics project for zookeeper nodes - https://phabricator.wikimedia.org/T211634 (10Ottomata) 05Open>03declined [21:48:32] (03CR) 10Mforns: [V: 032 C: 032] "Merging after discussion in IRC with people involved in this CR." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/478126 (https://phabricator.wikimedia.org/T202429) (owner: 10Mforns) [21:50:05] thanks ottomata and nuria for your thoughts on that! [21:54:33] (03CR) 10Milimetric: Adds logic and configuration for project families (033 comments) [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/464583 (https://phabricator.wikimedia.org/T205665) (owner: 10Fdans) [21:55:52] (fyi: my connection problems were bast4001 not working, using bast4002 for now) [21:56:05] cool [21:56:10] good to know [22:01:02] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Update sqoop to work with the new schema - https://phabricator.wikimedia.org/T210541 (10Milimetric) Update: running a big sqoop against all wikis with the change referenced here. Actor and Comment tables finished for all wikis in 5 hours. The main sqoop... [22:02:48] joal: what's the difference between https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawki_wikitext_history and the existing page https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_history ? ;) [22:05:37] HaeB: they are different tables [22:05:43] the wikitext one contains page content [22:05:58] you can join them! [22:11:55] thanks ottomata ! sorry for being a bit opaque - i was getting at the point that it would be nice to add a brief explanation to each page that the other history table exists and may be preferable for use case X (e.g. 
i guess that if one doesn't need to access the wikitext content, one should use the mediawiki_history one, right?) [22:16:54] HaeB: I was trying to edit and add some clarity, but I think we don't know enough about the data yet to do that. Right now it's just a straight import from dumps with not much vetting of the two datasets to see how well they join and work together [22:18:00] mediawiki_history is the end of a long process where we parse broken records out of the database. The _wikitext_ version includes the content, but we just need to do more work to know exactly what we should say about the two datasets, and how they compare [22:26:31] 10Analytics, 10Developer-Advocacy, 10Product-Analytics, 10Documentation: Develop EventLogging schema for documentation feedback gadget - https://phabricator.wikimedia.org/T211638 (10srishakatux) p:05Triage>03Normal [22:32:22] 10Analytics, 10Developer-Advocacy, 10Product-Analytics, 10Documentation: Develop EventLogging schema for documentation feedback gadget - https://phabricator.wikimedia.org/T211638 (10srishakatux) Ideally, we want to collect users votes on a document and page ids associated with it. Based on this, I've creat... [23:07:01] 10Analytics, 10Analytics-Kanban: Grafana, icinga, prometheus in cloud-analytics project - https://phabricator.wikimedia.org/T211640 (10Ottomata) p:05Triage>03High [23:10:38] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Presto cluster online and usable with test data pushed from analytics prod infrastructure accessible by Cloud (labs) users - https://phabricator.wikimedia.org/T204951 (10Ottomata) Status update! cloud-analytics Hadoop cluster is up and running! Tomorrow... [23:26:17] 10Analytics, 10Analytics-Kanban: Grafana, icinga, prometheus in cloud-analytics project - https://phabricator.wikimedia.org/T211640 (10Andrew) There isn't a great monitoring solution available for VMs. I can add your project to shinken so that basic VM stats (up/down/puppet failures, etc.) 
have monitoring and...