[00:19:13] @milimetric @nuria If I understand it right, for deploying the dashboard to staging instance, I would need to add the hostname here https://wikitech.wikimedia.org/wiki/Hiera:Dashiki/host/dashiki-staging-01 and access to the lab project. Is that all I would need, also how can I get access to the labs project?
[00:24:37] srish_aka_tux: for access to labs project your team are the ones in command
[00:26:48] srish_aka_tux: also you need to create atop dns level to the dashboard like cloud_vps-edits.wmflabs.wikimedia.org , for this one also your team can help you best
[00:28:31] @nuria okay thank you :)
[00:29:08] srish_aka_tux: you need to install fabric to deploy.See DEPLOY section on readme: https://github.com/wikimedia/analytics-dashiki/blob/master/README.md
[00:29:51] i don’t think anyone but us has rights to deploy to those instances on labs, nuria
[00:30:26] oh you covered that, ok
[00:30:59] usually we do the deploy, but I don’t mind sharing
[00:31:21] milimetric: I think srish_aka_tux can be added to be a deployer by anyone that is an administrator to the project , us or her team i think
[00:31:58] yep
[09:25:49] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253 (elukey) >>! In T222253#5633384, @Ottomata wrote: > Plan here: https://etherpad.wikimedia.org/p/analytics-spark Left some nits but looks good!
[10:27:16] Analytics, Analytics-Kanban: Prepare the Hadoop Analytics cluster for Kerberos - https://phabricator.wikimedia.org/T237269 (elukey)
[11:00:14] !log testing load of top metric from mediarequests with corrected quotemarks escaping
[11:00:15] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:24:40] Analytics, Analytics-Kanban: Add Sakizaya Wikipedia to analytics setup - https://phabricator.wikimedia.org/T237378 (jhsoby)
[11:28:32] * elukey lunch!
[11:33:35] (PS4) Fdans: Add historical backfilling for mediarequest tops [analytics/refinery] - https://gerrit.wikimedia.org/r/545583 (https://phabricator.wikimedia.org/T228149)
[11:34:39] (CR) Fdans: [V: +2 C: +2] "Merging this and the patch on top of it with the double quotes since it's been tested ad nauseam and we have +1 on the cr on top" [analytics/refinery] - https://gerrit.wikimedia.org/r/545583 (https://phabricator.wikimedia.org/T228149) (owner: Fdans)
[11:35:11] (PS3) Fdans: Escape double quotes in file urls [analytics/refinery] - https://gerrit.wikimedia.org/r/546882
[11:35:40] (CR) Fdans: [V: +2 C: +2] "tested, merging" [analytics/refinery] - https://gerrit.wikimedia.org/r/546882 (owner: Fdans)
[11:47:51] Analytics-Kanban, Better Use Of Data, Event-Platform, Product-Infrastructure-Team-Backlog, and 4 others: Create new eventgate-logging deployment in k8s with helmfile - https://phabricator.wikimedia.org/T236386 (fgiunchedi) >>! In T236386#5633635, @Ottomata wrote: > Hm, another issue! > > For Mir...
[12:02:32] Analytics, ArticlePlaceholder, Wikidata, Wikidata-Campsite, wikidata-tech-focus: ArticlePlaceholder dashboard stopped tracking page views - https://phabricator.wikimedia.org/T236895 (Addshore) p:Triage→Low
[12:25:02] Analytics, incubator.wikimedia.org: Create dashiki dashboard / small tool to track statistics about incubated wikis - https://phabricator.wikimedia.org/T237389 (fdans)
[12:39:48] Analytics, Analytics-Kanban, Operations, Traffic, and 2 others: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (BBlack) >>! In T233661#5634559, @elukey wrote: >>>! In T233661#5632172, @BBlack wrote: >> Agreed, let's not go down that road right here...
[12:41:07] Analytics, Analytics-Kanban, Operations, Traffic, and 2 others: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (BBlack) >>! In T233661#5633768, @Nuria wrote: > @BBlack: once we deploy the VCL/varnish-kafka chnages we need to change our refine pipel...
[13:01:43] (PS1) Joal: Add TLS information to webrequest hive/druid [analytics/refinery] - https://gerrit.wikimedia.org/r/548736 (https://phabricator.wikimedia.org/T233661)
[13:01:56] Hi team - Whenever one of you is ready --^
[13:02:06] Analytics, incubator.wikimedia.org: Create dashiki dashboard / small tool to track statistics about incubated wikis - https://phabricator.wikimedia.org/T237389 (jhsoby) Thanks for creating this, @fdans! The tool we currently use to monitor activity in the Incubator is "Catanalysis" by @Pathoschild. For...
[13:02:29] (CR) Joal: [V: +1] "Fully verified (hive+druid)" [analytics/refinery] - https://gerrit.wikimedia.org/r/548736 (https://phabricator.wikimedia.org/T233661) (owner: Joal)
[13:04:13] (PS2) Joal: Add TLS information to webrequest hive/druid [analytics/refinery] - https://gerrit.wikimedia.org/r/548736 (https://phabricator.wikimedia.org/T233661)
[13:36:23] hello joal!!!
[13:46:16] fdans, hello !
[13:46:19] fdans: how are you?
[13:46:48] joal: pretty good, missed you yesterday in the meetings, all good?
[13:47:50] yessir, busy period of the year for me (some teaching, plus Melissa also has more workload than sometimes ...) Thank you team for covering for me :)
[13:49:18] fdans: How may I help you?
[13:50:01] joal: I was just checking in but actuallyyyy, do you have a second in the batcave?
[14:03:09] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Correct namespace zero editor counts on geoeditors_monthly table on hive and druid - https://phabricator.wikimedia.org/T237072 (Milimetric) Status update: * nullified ns0 distinct editors in old data * re-ran monthly insert with...
[14:04:40] Analytics, Analytics-Kanban: Prepare the Hadoop Analytics cluster for Kerberos - https://phabricator.wikimedia.org/T237269 (elukey)
[14:07:02] o/
[14:07:51] o/
[14:13:01] pfff - internet is really flaky today
[14:13:05] Hi ottomata, hi elukey
[14:14:53] bonsoir joal!
[14:15:29] Analytics, Desktop Improvements, Event-Platform, Readers-Web-Backlog (Kanbanana-2019-20-Q2): [SPIKE 8hrs] How will the changes to eventlogging affect desktop improvements - https://phabricator.wikimedia.org/T233824 (jlinehan) Just wanted to chime in here as I'm really glad to see these discussion...
[14:15:45] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Correct namespace zero editor counts on geoeditors_monthly table on hive and druid - https://phabricator.wikimedia.org/T237072 (Ijon) It's true that I never requested ns0 counts separately. Count of all namespaces is fine.
[14:21:22] Analytics, Cloud-VPS, cloud-services-team (Kanban): Remove Analytics project hiera config from wikitech - https://phabricator.wikimedia.org/T237410 (Andrew)
[14:23:21] Analytics-Kanban, Better Use Of Data, Event-Platform, Product-Infrastructure-Team-Backlog, and 4 others: Create new eventgate-logging deployment in k8s with helmfile - https://phabricator.wikimedia.org/T236386 (Ottomata) > Perhaps mirrormaker could be warning us or similar if it detects there are...
[14:23:34] Analytics, New-Readers: Add KaiOS to the list of OS query options for pageviews in Turnilo - https://phabricator.wikimedia.org/T231998 (jlinehan) >>! In T231998#5632183, @SBisson wrote: > Anyone knows the maintainers of this project?
I don't, but isn't their information in the README? It looks like the...
[14:26:33] (CR) Ottomata: [C: +1] Add TLS information to webrequest hive/druid [analytics/refinery] - https://gerrit.wikimedia.org/r/548736 (https://phabricator.wikimedia.org/T233661) (owner: Joal)
[14:28:56] Analytics, Cloud-VPS, cloud-services-team (Kanban): Remove Analytics project hiera config from wikitech - https://phabricator.wikimedia.org/T237410 (Ottomata) Open→Resolved Oh! We don't need that for sure. Deleted!
[14:31:25] ottomata, elukey: I have experienced problems starting new labs VM, and cloud-team told me I was experiencing the same problem as https://phabricator.wikimedia.org/T234232
[14:31:52] any idea on what we need to change for labs to be back for us?
[14:32:49] joal: what kind of problems?
[14:33:11] Usually what I do is start the VM with no puppet role assigned, and wait for it to be available (ie ssh works)
[14:33:15] elukey: puppet failed to run because of a hiera misconfiguration
[14:33:20] ah there you go
[14:33:29] So no ssh etc
[14:33:57] ottomata: hi! are the new schema for sparql/query 1.0.0 deploy?
[14:34:01] in theory if you remove the role/profile and then wait a bit, it should run puppet
[14:34:01] ed
[14:34:08] hm
[14:34:22] elukey: I had no puppet role assigned (or didn't do it on purpose if one was)
[14:34:40] joal: ah interesting! Can you tell me the name of the vm?
[14:34:43] I'll take a look
[14:35:18] Analytics, Cloud-VPS, cloud-services-team (Kanban): Remove Analytics project hiera config from wikitech - https://phabricator.wikimedia.org/T237410 (Andrew) thanks :)
[14:37:31] ah that might be the issue --^
[14:37:35] may work now
[14:37:46] didn't check the task in depth :)
[14:37:50] joal: let me know if it works now
[14:37:55] otherwise I'll check
[14:38:54] sure elukey, will let you know
[14:39:01] when internet gets back in rack :(
[14:41:10] Analytics, Analytics-Kanban: Prepare the Hadoop Analytics cluster for Kerberos - https://phabricator.wikimedia.org/T237269 (elukey)
[14:42:55] Analytics, Analytics-Kanban: Prepare the Hadoop Analytics cluster for Kerberos - https://phabricator.wikimedia.org/T237269 (elukey)
[14:43:31] elukey: If you have time after standup this evening, let's spend a minute discussing the specs of the script for oozie-beros
[14:44:20] joal: sans aucun doute
[14:47:06] hello hello
[14:50:09] o/
[14:52:08] milimetric: when you are caffeinated and ready, I'd need a sanity check of https://phabricator.wikimedia.org/T233891
[14:52:16] (also when you have time, not urgent)
[14:53:27] elukey so you’re basically planning to do https://phabricator.wikimedia.org/T233891#5554625 right?
[14:53:50] Ok, I’ll review after I submit my second queue patch
[14:58:22] yep correct!
[14:58:36] only the drop table part though
[14:58:42] the rest I think it is not needed
[15:07:31] dcausse: o/
[15:07:40] nope! i haven't done a k8s releases, shall I?
[15:07:45] do you want to test in beta?
[15:07:49] it should be there
[15:08:04] ottomata: sure what is the endpoint in beta?
[15:10:02] Analytics, Analytics-Kanban, Patch-For-Review: Prepare the Hadoop Analytics cluster for Kerberos - https://phabricator.wikimedia.org/T237269 (elukey)
[15:11:11] dcausse: deployment-eventgate-3.deployment-prep.eqiad.wmflabs:8192/v1/events
[15:11:18] thanks
[15:11:54] oh hm actually!
this is view the analytics instance! it should work in prod too, since that is configured to look for schemas in the remote schema repo uri if not found locally
[15:11:56] ottomata: in beta the events live in kafka and that's it? I mean nothing would be consuming my partical queue?
[15:12:05] OH, but we need stream config...
[15:12:08] that isn't dynamic yet
[15:12:11] in prod
[15:12:15] dcausse: yes
[15:12:21] in the kafka-jumbo stuff
[15:12:22] um
[15:12:32] e.g.
[15:12:37] deployment-kafka-jumbo-1
[15:12:39] it should go there
[15:12:47] ottomata: shoult I file a task so that we can synchronize on what needs to be done on this part?
[15:13:02] you have a task already somewhere?
[15:13:09] lemme check
[15:13:34] dcausse: what is the stream (topic) name going to be?
[15:13:51] joal: I just realized that I can go one step further and kerberize hosts in Hadoop Analytics now, with keytabs etc.. It will allow us to test if everything is working or not
[15:13:57] wdqs.sparql-query ?
[15:14:07] ottomata: fine by me
[15:14:11] k
[15:14:45] ottomata: we could reuse this ticket: T101013
[15:14:55] T101013: Log Wikidata Query Service queries to the event gate infrastructure - https://phabricator.wikimedia.org/T101013
[15:15:02] perfect
[15:15:58] ottomata: err, about the topic: queryservice.sparql-query would be a better fit, we will have structured data soon (commons)
[15:16:37] ?
[15:16:39] I know Guillaume and Mat are trying to get rid of the assumption that sparql == wikidata == wdqs
[15:16:59] hm, ya that's why we made the schema wdqs agnostic
[15:17:02] would that be a different producer?
[15:17:06] if so, could be
[15:17:17] another stream: commons.sparql-query
[15:17:50] ottomata: I chose the stream in the event itself?
[15:18:33] I probably hardcoded something then
[15:18:56] hm?
[15:19:00] oh dcausse yes
[15:19:04] meta.stream must be set to the stream name
[15:19:04] Analytics, ChangeProp, Event-Platform, MediaWiki-JobQueue, and 5 others: [EPIC] Develop a JobQueue backend based on EventBus - https://phabricator.wikimedia.org/T157088 (eprodromou)
[15:19:15] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/548764/1/helmfile.d/services/eqiad/eventgate-analytics/values.yaml
[15:19:25] and must match an entry here
[15:19:26] ottomata: ok that makes sense now, I'll make this configurable then
[15:19:49] cool
[15:19:52] and agreed on having wdqs.sparql-query and commons.sparql-query separated
[15:20:01] so dcausse you plan to have multiple producers using this scheam with different streams in the (near) future?
[15:20:16] yes
[15:20:18] i can make this a regex stream name config here
[15:20:24] so we don't have to update the thing later
[15:20:24] Analytics, Analytics-Kanban, Operations, Traffic, and 2 others: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (elukey) >>! In T233661#5635537, @BBlack wrote: >>>! In T233661#5634559, @elukey wrote: >>>>! In T233661#5632172, @BBlack wrote: >>> Agree...
[15:20:35] ottomata: that would be great yes
[15:20:52] or i can add them each explicitly if we know what they are going to be
[15:21:03] if the # is small and we know the streams ahead of time, i think i prefer that
[15:21:26] ok lemme consult Guillaume and Mat about that
[15:21:30] k
[15:22:04] (PS3) Fdans: Add python script to generate intervals for long backfilling [analytics/refinery] - https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119)
[15:22:34] (CR) Fdans: Add python script to generate intervals for long backfilling (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) (owner: Fdans)
[15:23:34] Analytics, incubator.wikimedia.org: Create dashiki dashboard / small tool to track statistics about incubated wikis - https://phabricator.wikimedia.org/T237389 (StevenJ81) That's pretty much it, @jhsoby. I don't promise that when we see an alpha version of this, some further refinements won't come to us...
[15:25:00] (PS4) Fdans: Add python script to generate intervals for long backfilling [analytics/refinery] - https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119)
[15:26:12] fdans: excuse me, can you please stop spamming us? :D
[15:26:24] * elukey runs away
[15:27:04] * fdans pipes all notifications to elukey's PMs
[15:27:40] joal: I'm happy with the script to schedule the backfillings if you are
[15:27:49] we can set the crons
[15:37:17] (CR) Nuria: Add TLS information to webrequest hive/druid (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/548736 (https://phabricator.wikimedia.org/T233661) (owner: Joal)
[15:40:26] Analytics, New-Readers: Add KaiOS to the list of OS query options for pageviews in Turnilo - https://phabricator.wikimedia.org/T231998 (SBisson) >>! In T231998#5636055, @jlinehan wrote: > [...]
Maybe send one of them an e-mail, mention the PR is from the Wikimedia Foundation in service of Wikipedia, and...
[15:45:36] nuria: https://www.youtube.com/watch?v=0B52aW1KFkk :O
[15:46:08] elukey: nicee
[15:50:08] Analytics, Analytics-Kanban, Multimedia, Tool-Pageviews: Create script that returns oozie time intervals every time a coordinator is started from a cron job - https://phabricator.wikimedia.org/T237119 (Nuria) https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/547750/
[16:03:13] (CR) Mforns: [V: +2 C: +2] "LGTM! Let's discuss the backfilling after I talk with the team." [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/548306 (owner: Cicalese)
[16:06:27] hey elukey, just read your message about mysql navigationtiming
[16:06:55] (CR) Nuria: [C: -1] Add python script to generate intervals for long backfilling (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) (owner: Fdans)
[16:07:49] mforns: o/ - I pinged Dan since it is his ops week, but if you have time we can do it!
[16:08:18] elukey, I'm reading the task's comments, sure we can do that whenever is good for you!
[16:09:04] mforns: can you triple check that it is safe to drop those tables? Just a quick check to avoid "oh noes!" moments :D
[16:09:17] then if you can come to the bc to watch me dropping stuff it would be great :D
[16:09:58] ok elukey trying to understand
[16:12:12] nuria: thanks for the cr. I think the function "get_oozie_formatted_interval" is the only thing we need to test though.
The remaining lines you mention are not doing anything that needs testing, they're just getting arguments via docopt
[16:13:04] fdans: right, but even then let's do so in a format that does not require an if/else for tests
[16:13:19] fdans: on meeting can talk more in a bit
[16:19:21] Analytics, Analytics-EventLogging, Analytics-Kanban, Performance-Team (Radar): Drop Navigationtiming data entirely from mysql storage? - https://phabricator.wikimedia.org/T233891 (mforns) @elukey @Nuria @Gilles EventLogging data was first enabled in Hive on 2017-11-20T19:00:00Z. I believe that i...
[16:19:31] elukey, commented on the task
[16:20:43] (PS1) Fdans: Add date to daily cassandra loading job titles,reduce SLA hours [analytics/refinery] - https://gerrit.wikimedia.org/r/548776
[16:24:01] elukey, you think it's worth shaving to 2017-11-20T19 or we just drop 2018% ?
[16:25:49] mforns: ah snap sorry I didn't make myself clear, I am planning to drop only the tables listed, the rest I think can stay there.. what do you think? It will be way simpler
[16:27:12] elukey, so those tables, IIUC, are the newest tables created, right? The ones that hold data for 2018-2019?
[16:27:43] yes I believe so, but I wanted a sanity check :)
[16:28:51] ok, will review each table there
[16:29:02] thanks!
[16:29:05] dcausse: if your stuff works in beta, i will deploy the change in prod
[16:29:35] ottomata: sure, thank you, I'll ping you once I've done some testing!
[16:32:52] (CR) Mforns: "So simple of a change for a cool feature! Thanks" (1 comment) [analytics/reportupdater] - https://gerrit.wikimedia.org/r/547713 (owner: Awight)
[16:36:26] (PS1) Ottomata: HDFSCleaner - use Path toString in log messages instead of getName [analytics/refinery/source] - https://gerrit.wikimedia.org/r/548783 (https://phabricator.wikimedia.org/T235200)
[16:37:28] elukey, all good making sense, wanna batcave?
[16:38:56] mforns: sure!
[16:39:04] I'm there
[16:45:47] (CR) Ottomata: [C: +2] HDFSCleaner - use Path toString in log messages instead of getName [analytics/refinery/source] - https://gerrit.wikimedia.org/r/548783 (https://phabricator.wikimedia.org/T235200) (owner: Ottomata)
[16:55:50] Analytics, Analytics-EventLogging, Analytics-Kanban, Performance-Team (Radar): Drop Navigationtiming data entirely from mysql storage? - https://phabricator.wikimedia.org/T233891 (elukey) >>! In T233891#5554625, @elukey wrote: >>>! In T233891#5554355, @elukey wrote: >> To recap: >> >> * Drop tab...
[16:55:59] Analytics, Analytics-EventLogging, Analytics-Kanban, Performance-Team (Radar): Drop Navigationtiming data entirely from mysql storage? - https://phabricator.wikimedia.org/T233891 (elukey)
[16:56:50] Analytics, Analytics-EventLogging, Analytics-Kanban: Rerun sanitization before archiving eventlogging mysql data - https://phabricator.wikimedia.org/T236818 (elukey) a:elukey
[16:59:32] Analytics, WMDE-Analytics-Engineering, Wikidata, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Track WDQS updater UA in wikidata-special-entitydata grafana dashboard - https://phabricator.wikimedia.org/T218998 (Rosalie_WMDE) a:Rosalie_WMDE→None
[17:01:11] !log first run of HDFSCleaner on /tmp, should delete files older than 31 days
[17:01:23] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:03:38] nuria: standuup
[17:12:12] !log 2019-11-05T17:11:50.239 INFO HDFSCleaner Deleted 872360 files and directories in tmp
[17:12:16] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:29:30] (PS3) Joal: Add TLS information to webrequest hive/druid [analytics/refinery] - https://gerrit.wikimedia.org/r/548736 (https://phabricator.wikimedia.org/T233661)
[17:32:09] (CR) Joal: "commit message update and etherpad updated :)" [analytics/refinery] - https://gerrit.wikimedia.org/r/548736
(https://phabricator.wikimedia.org/T233661) (owner: Joal)
[17:45:27] Analytics, Analytics-Cluster, Analytics-Kanban: Create HDFS /tmp/ cleaner - https://phabricator.wikimedia.org/T235200 (Ottomata) Moving back to in progress to use HDFS Trash
[18:12:01] Analytics, ChangeProp, Event-Platform, MediaWiki-JobQueue, and 5 others: [EPIC] Develop a JobQueue backend based on EventBus - https://phabricator.wikimedia.org/T157088 (eprodromou)
[18:18:08] nuria: I added the problem to the section you created: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Geoeditors#Changes_and_known_problems_since_2019-09
[18:18:25] mforns: do you want to pair/vet the data?
[18:19:01] and nuria/mforns: more importantly, do we want to make a notebook for geoeditors itself since that was the problem, or are we just vetting the bucketed set?
[18:19:13] * milimetric does not deal well with ambiguity
[18:20:38] milimetric: I added a bunch of things to the notebook you had yesterday about geoeditors, since there is no easy way to share that one is now on myhomedir, we probably should put those on github or something.
[18:21:01] yeah, we should come up with a standard and smooth way to do this
[18:21:12] milimetric: it is called "Vet Geoeditors Bucketed" on /home/nuria on notebook3
[18:22:07] milimetric: let's see
[18:24:47] milimetric: i think we can have some code on that notebook for both the dataset and the files, do copy my notebook and let me know what you think, i added a section for the data , and maybe once we call it done we push it somewhere , we need to be careful not to push not public data long with it to gitgub, inthsi case i do not think it is a problem
[18:24:53] (CR) Joal: [C: +2] "Merging for deploy and upgrade" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/542226 (https://phabricator.wikimedia.org/T222253) (owner: Ottomata)
[18:36:02] (CR) Nuria: [C: +2] Add TLS information to webrequest hive/druid [analytics/refinery] - https://gerrit.wikimedia.org/r/548736 (https://phabricator.wikimedia.org/T233661) (owner: Joal)
[18:36:08] (PS1) Joal: Bump changelog.md to v0.0.105 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/548829
[18:36:47] (CR) Joal: [V: +2 C: +2] "Merging for deploy" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/548829 (owner: Joal)
[18:49:32] !log Make Jenkins release refinery-source v0.0.105 to archiva
[18:49:33] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:49:44] Gone for diner while jenkins is at work, will be back soon
[19:09:35] jo ready when you are!@
[19:09:38] joal*
[19:15:22] Analytics, Analytics-EventLogging, Analytics-Kanban, Performance-Team (Radar): Drop Navigationtiming data entirely from mysql storage? - https://phabricator.wikimedia.org/T233891 (Nuria) +1 to @elukey I think we are good
[19:16:50] (CR) Nuria: "Meant to be CR-ed?
(asking since there are none)" [analytics/refinery] - https://gerrit.wikimedia.org/r/548776 (owner: Fdans)
[19:48:39] Heya ottomata - Excuse me my internet had gone berzing again
[19:48:56] ottomata: source is in archiva- ok for me to bump the jars in refinery?
[19:49:03] ah!
[19:49:06] yes please let's go!
[19:49:12] ok
[19:49:42] git up
[19:49:44] oops
[19:50:56] Wow ottomata I just realized I had forgotten stuff in oozie patch - updating now
[19:51:52] ok!
[19:56:35] (PS2) Joal: Update oozie jobs to use spark 2.4.4 [analytics/refinery] - https://gerrit.wikimedia.org/r/548494 (https://phabricator.wikimedia.org/T222253)
[19:56:55] ottomata: --^
[19:57:19] ottomata: I had forgotten to bump the jar version ...
[19:57:42] (CR) Ottomata: [V: +2 C: +2] Update oozie jobs to use spark 2.4.4 [analytics/refinery] - https://gerrit.wikimedia.org/r/548494 (https://phabricator.wikimedia.org/T222253) (owner: Joal)
[19:57:48] merged joal ! :)
[19:58:21] Actually, not merged :)
[19:58:25] Just did it :)
[19:58:48] oh oops :)
[19:58:52] ok ready to scap - ok for you ottomata ?
[19:58:57] yup proceed!
[20:00:17] !log Deploying refinery using scap
[20:00:19] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:00:59] all good joal?
[20:02:59] So far yes ottomata - Except for a forgotten change in jars :(
[20:03:25] aye, we ready to stop jobs then?
[20:04:58] (PS1) Joal: Bump clickstream jar version for spark 2.4.4 [analytics/refinery] - https://gerrit.wikimedia.org/r/548867 (https://phabricator.wikimedia.org/T222253)
[20:04:58] joal: ^ ?
[20:05:01] oh more sorry
[20:05:02] ottomata: --^
[20:05:09] (CR) Ottomata: [V: +2 C: +2] Bump clickstream jar version for spark 2.4.4 [analytics/refinery] - https://gerrit.wikimedia.org/r/548867 (https://phabricator.wikimedia.org/T222253) (owner: Joal)
[20:05:13] scap away!
[20:05:16] I had forgotten one (glad I double checked
[20:05:57] ottomata: I'm beggining to stop jobs while scap proceed (we'll need another one)
[20:06:08] ok
[20:06:16] i'll stop refine jobs now too
[20:06:18] can't hurt!
[20:09:42] !log Deploying refinery using scap with missing patch
[20:09:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:10:48] Analytics, Analytics-EventLogging, QuickSurveys, MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), and 2 others: QuickSurveys EventLogging missing ~10% of interactions - https://phabricator.wikimedia.org/T220627 (Isaac) @Jdlrobson I took a look at the survey responses we had and our ability to match them...
[20:12:08] !log stopped refine jobs for Spark 2.4 upgrade - T222253
[20:12:11] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:12:11] T222253: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253
[20:12:48] joal oozie jobs stopped then?
[20:12:56] almost
[20:13:00] k
[20:13:54] yup we're good
[20:14:04] deploy in progress with last missing bit
[20:14:25] k
[20:14:34] this job is running
[20:14:37] could wait or kill
[20:14:44] doesn't matter i guess until we start restaring nodemanagers
[20:15:00] joal i'm going to install spark 2.4.4 ok?
[20:15:15] yessir
[20:15:22] tell me when ready for a test
[20:18:14] haven't used debdeploy much...i hope its working...
[20:18:21] !log Deploying refinery onto HDFS
[20:18:23] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:18:30] ottomata: can't help really on that matter :(
[20:18:44] ottomata: or, to say, I can be the yellow duck you talk to :)
[20:18:57] i think it is!
[20:19:04] debmonitor says it is!
[20:19:05] :)
[20:19:58] I'm debhappy as well then
[20:21:23] !log install spark 2.4.4-bin-hadoop2.6-1 cluster wide using debdeploy - T222253
[20:21:27] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:21:27] T222253: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253
[20:24:21] great!
[20:25:23] joal just curios
[20:25:24] https://yarn.wikimedia.org/cluster/app/application_1571142484661_87717
[20:25:25] ?
[20:25:51] Oh ! I thought I had killed that :)
[20:26:10] ottomata: Apache Ignite can be deployed in yarn naturally, so it was easy to do some tests
[20:26:22] heh
[20:26:30] somebody is using some memory!
[20:26:30] https://yarn.wikimedia.org/cluster/app/application_1571142484661_85644
[20:26:50] groceryheist: ^ :)
[20:27:03] just FYI in case you didn't know
[20:27:20] i think its ok
[20:28:11] ottomata: refinery has been deployed onto hdfs - We're ready for the sheebang
[20:28:19] we should wait for this first
[20:28:19] https://yarn.wikimedia.org/cluster/app/application_1571142484661_91881
[20:28:20] or kill it
[20:28:32] but i'm going to merge the puppet change to bump refine jar version
[20:29:30] ottomata: usdual runs of that task are very short - I think groceryheist job is slowing the whole cluster
[20:29:34] yeah
[20:29:41] it took a long time to get assigned
[20:30:07] ottomata: let's for it to finish as it managed to get some memory
[20:30:46] ottomata: I suggest killing groceryheist job - a huge step has barely started
[20:30:55] ottomata: i think that's okay
[20:31:18] ah its done
[20:31:20] wait hang on
[20:31:26] Hi groceryheist
[20:31:34] i think it's mostly cached
[20:31:35] it finished ya?
[20:31:38] oh
[20:31:40] our job finished
[20:31:41] anyway
[20:31:44] we can proceed joal
[20:31:46] ah cool
[20:31:54] thanks ya'll for upgrading spark !
[20:32:01] groceryheist: we're going to upgrade spark - there are chances your job will fail anyway
[20:32:25] ah
[20:32:30] looks like it failed anyway
[20:32:34] ok ottomata - let me know when I can test the new spark :)
[20:33:09] got a strange python utf-8 decoding error
[20:34:02] ottomata: should we kill notebooks kernels?
[20:36:54] hm dunno
[20:37:12] ok joal roll restarting nodemanagers 2 at a time
[20:37:28] k ottomata
[20:37:51] And by the way ottomata - Will there be changes needed in paws?
[20:37:53] !log roll restarting hadoop-yarn-nodemanagers to pick up spark 2.4.4 shuffle lib
[20:37:55] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:38:03] joal: shouldn't be....
[20:38:09] k
[20:38:15] they are configured to use the local spark2 bins
[20:38:19] so it should just pick up the new stuff
[20:38:58] great
[20:49:15] ok done joal
[20:49:17] test please!
[20:51:38] (PS3) Awight: Skip reports which start in the future [analytics/reportupdater] - https://gerrit.wikimedia.org/r/547713
[20:51:41] essir
[20:52:10] (CR) Awight: Skip reports which start in the future (1 comment) [analytics/reportupdater] - https://gerrit.wikimedia.org/r/547713 (owner: Awight)
[20:54:36] ottomata: confirmed!
[20:54:51] ottomata: shall I run an oozie job manully with previous dates to check?
[20:55:25] ottomata: you guys got those very many oozie alarms right/
[20:55:27] ?
[20:55:32] ottomata: I suggest rerunning denormalize-check, as it has no impact, just check
[20:55:35] yes nuria :)
[20:55:41] nuria: ya those are SLA alarms, jobs are paused
[20:55:52] denormalize-check?
[20:55:52] actually, jobs are killed :)
[20:55:55] yessir
[20:55:56] ottomata: yaya, triple Checking !
[20:56:26] joal: ok, you probably know better how to launch it than I, can you do?
[20:56:27] ottomata: denoramalize-check is a complex job having no impact on data really
[20:57:07] !log Starting denoramlize-check one month in advance to enforce a running job with new spark
[20:57:08] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:58:03] ottomata: https://hue.wikimedia.org/oozie/list_oozie_workflow/0005824-191031155137252-oozie-oozi-W/?coordinator_job_id=0005823-191031155137252-oozie-oozi-C
[20:58:38] ottomata: have you restarted oozie oozie to pick shared lib?
[20:58:42] oh no
[20:58:47] i haven't run puppet on an-coord
[20:58:53] sorry so oozie sharelib hasn't been created...
[20:58:59] probalby with puppet restarting refine jobs
[20:59:02] lemme see if i can do manually
[21:00:21] ok joal it exists now
[21:00:23] try again
[21:00:25] sure
[21:01:54] Better ottomata !
[21:02:33] coooool
[21:02:41] joal how long will that take to run?
[21:03:33] ottomata: matters of minutes - But my chec is done - oozie runs, spark doesn't fail - We can proceed !
[21:04:14] ok!
[21:04:44] !log re-enabling refine jobs after spark 2.4.4 upgrade
[21:04:47] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:05:54] ottomata: Shall I restart oozie jobs
[21:05:55] ?
[21:06:02] joal: yes please !
[21:06:17] !log restarting oozie jobs after spark 2.4.4 upgrade
[21:06:19] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:08:29] joal looking good to me so far!
[21:08:32] let's keep watching stuff
[21:10:28] Yup - Gently restarting, apis job runs hourly and has been successful
[21:11:39] nice
[21:12:12] (PS1) Milimetric: Update hiera config directions [analytics/dashiki] - https://gerrit.wikimedia.org/r/548887
[21:14:08] (CR) Milimetric: [V: +2 C: +2] Update hiera config directions [analytics/dashiki] - https://gerrit.wikimedia.org/r/548887 (owner: Milimetric)
[21:18:08] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253 (Ottomata)
[21:18:32] ottomata: joal cool if I retry my jobs?
[21:19:22] ok for me groceryheist
[21:24:22] groceryheist: spark 2.4.4 should be good to use!
[21:24:42] 6 months after you asked for it :p
[21:35:28] Analytics, Analytics-EventLogging, QuickSurveys, MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), and 2 others: QuickSurveys EventLogging missing ~10% of interactions - https://phabricator.wikimedia.org/T220627 (Nuria) @Isaac Something else to consider (once loading issues have been solved) is that for...
[21:35:58] (CR) Nuria: "Nice, thanks for doing this change." [analytics/dashiki] - https://gerrit.wikimedia.org/r/548887 (owner: Milimetric)
[21:38:19] Analytics, WMDE-Analytics-Engineering, WMDE-FUN-Funban-2019, WMDE-FUN-Sprint-2019-10-14, WMDE-New-Editors-Banner-Campaigns (Banner Campaign Autumn 2019): Implement banner design for WMDEs autum new editor recruitment campaign - https://phabricator.wikimedia.org/T235845 (awight) @GoranSMilovan...
[21:39:09] ottomata: everything seems fine (I assume you have checked refine?)
[21:39:28] ottomata: If ok for you I'll take my leave now :0
[21:46:37] i'm watching logs and refine jobs look good!
[21:46:39] i think we are good joal
[21:46:45] thank you so much for your help!
[21:46:45] we did it!
[21:47:49] \o/ :)
[21:50:04] * milimetric claps
[22:01:35] (PS1) Ottomata: HDFSCleaner improvements [analytics/refinery/source] - https://gerrit.wikimedia.org/r/548909 (https://phabricator.wikimedia.org/T235200)
[22:03:49] (PS2) Ottomata: HDFSCleaner improvements [analytics/refinery/source] - https://gerrit.wikimedia.org/r/548909 (https://phabricator.wikimedia.org/T235200)
[22:04:57] (PS3) Ottomata: HDFSCleaner improvements [analytics/refinery/source] - https://gerrit.wikimedia.org/r/548909 (https://phabricator.wikimedia.org/T235200)
[22:58:27] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Setup a proxy "wmcs-edits" for a dashiki instance - https://phabricator.wikimedia.org/T237481 (srishakatux)
[22:58:47] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Setup a proxy "wmcs-edits" for a dashiki instance - https://phabricator.wikimedia.org/T237481 (srishakatux) p:Triage→Normal
[23:00:09] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Setup a proxy "wmcs-edits" for a dashiki instance - https://phabricator.wikimedia.org/T237481 (srishakatux) Here it is https://wmcs-edits.wmflabs.org with help from @milimetric :)
[23:07:30] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Develop a tool or integrate feature in existing one to visualize WMCS edits data - https://phabricator.wikimedia.org/T226663 (srishakatux) #Developer-advocacy Hi team - here we have the first version of the dashboard: https://wmcs-edits.wm...
[23:07:36] PROBLEM - Check the last execution of hdfs-cleaner on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit hdfs-cleaner https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[23:14:03] byeeee!
[23:20:19] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Develop a tool or integrate feature in existing one to visualize WMCS edits data - https://phabricator.wikimedia.org/T226663 (Nuria) Nice @srishakatux We can also compute this data daily so you have more fine grained data should that be...