[00:44:25] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Develop a tool or integrate feature in existing one to visualize WMCS edits data - https://phabricator.wikimedia.org/T226663 (srishakatux)
[05:46:02] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253 (elukey) >>! In T222253#5577341, @Ottomata wrote: > @elukey mind if we upgrade to Spark 2.4.4 in the analytics test cluster and do some tests there? Please go ahea...
[06:03:29] Analytics, User-Elukey: CDH Jessie dependencies not available on Stretch - https://phabricator.wikimedia.org/T214364 (elukey) I found https://github.com/cloudera/hue/issues/629 that contain the fix to upgrade the libssl deps. Given the fact that Hue for CDH 5.X will not ever support python 3+ (see T23307...
[07:02:06] I think that we are ready for eventlogging on py3!
[07:02:18] will wait for more reviews but hopefully I'll deploy tomorrow
[07:02:34] the last man standing is Hue now, but it will be more painful (sigh)
[07:26:46] Analytics, Analytics-EventLogging, Wikimedia-Logstash, observability: Validation error for invalid value type should include property name - https://phabricator.wikimedia.org/T116719 (elukey) @Krinkle super old task with no updates, apologies :) Is it still valid in your opinion?
[08:03:04] Analytics, Analytics-Kanban, Operations, Wikimedia-Logstash, and 6 others: Move AQS logging to new logging pipeline - https://phabricator.wikimedia.org/T219928 (elukey) a:elukey
[08:08:53] Analytics: Update R on the statboxes - https://phabricator.wikimedia.org/T214598 (elukey) The update of R follows the Debian OS upgrade process sadly, since it is a big burden for us to package a more up to date stack. Having said that, we are in the process of moving to Debian Buster :) stat1005 is the fir...
[08:11:03] Analytics, Product-Analytics: Update R from 3.3.3 to 3.6.0 on stat and notebook machines - https://phabricator.wikimedia.org/T220542 (elukey) As FYI on stat1005 (running Buster) we have: ` elukey@stat1005:~$ R R version 3.5.2 (2018-12-20) -- "Eggshell Igloo" `
[10:21:53] joal: thanks for your comments yesterday. is there a reason you prefer coalesce(expr, else) instead of when(expr).otherwise(else)? taste or efficiency?
[10:27:23] Analytics: Move AQS to a more standard service configuration - https://phabricator.wikimedia.org/T235620 (elukey)
[10:48:24] Analytics: Update R on the statboxes - https://phabricator.wikimedia.org/T214598 (GoranSMilovanovic) @elukey Great, thanks for the update!
[10:55:36] * elukey lunch!
[11:53:18] Analytics, MinervaNeue, Performance-Team (Radar), Readers-Web-Backlog (Kanbanana-2019-20-Q2): MinervaClientError sends malformed events - https://phabricator.wikimedia.org/T234344 (Jdrewniak) Open→Declined > If our statsv client is producing request urls with multiple query strings, that'...
[12:11:16] Hi mgerlach - Pure taste, no efficiency difference AFAIK - I think I prefer null checks being coalesce, as it is the builtin syntax, while case or if are more general conditionals - But it's very personal :)
[12:47:35] hello joal, just added a note to the etherpad to deploy aqs as part of today's train
[12:47:44] so that we can start backfilling top files
[12:47:57] ack fdans :)
[12:48:02] thank you :)
[12:59:11] interesting drop in audio and increase in video around early 2017
[12:59:18] https://usercontent.irccloud-cdn.com/file/AEYeeRtF/Screen%20Shot%202019-10-16%20at%202.58.37%20PM.png
[13:00:06] (this metric is way more interesting when you exclude images)
[13:20:28] Analytics, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Move reportupdater reports that pull data from eventlogging mysql to pull data from hadoop - https://phabricator.wikimedia.org/T223414 (mforns) @CCicalese_WMF If you'd be fine with the solution in my prior comment, I could a...
[13:40:00] (CR) Joal: Update mediawiki-history dumper (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543216 (https://phabricator.wikimedia.org/T235269) (owner: Joal)
[13:43:27] Analytics, Analytics-Kanban, Patch-For-Review: MediaWiki history dumps have some events in 2025 - https://phabricator.wikimedia.org/T235269 (JAllemandou) From raw data: ` // spark2-shell --master yarn --driver-memory 4G --executor-memory 16G --executor-cores 4 --conf spark.executor.memoryOverhead=409...
[13:44:35] (PS2) Joal: Update mediawiki-history dumper [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543216 (https://phabricator.wikimedia.org/T235269)
[13:45:02] (CR) Joal: "Tested on cluster." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543216 (https://phabricator.wikimedia.org/T235269) (owner: Joal)
[13:45:27] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (elukey) @Bstorm any remaining doubts that we can discuss? :)
[13:55:02] (PS2) Joal: Add partition pruning to hive queries [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543210 (https://phabricator.wikimedia.org/T235283)
[14:33:16] Analytics, Analytics-Kanban, Operations, Patch-For-Review, User-Elukey: Archiva relies on a tmpfs directory that is wiped after each reboot - https://phabricator.wikimedia.org/T214366 (elukey)
[14:33:20] Analytics, Analytics-Kanban, Operations, Patch-For-Review, User-Elukey: Archiva relies on a tmpfs directory that is wiped after each reboot - https://phabricator.wikimedia.org/T214366 (elukey) a:elukey
[14:52:55] (CR) Nuria: [C: +2] "I think we can merge and keep an eye on reports." [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543210 (https://phabricator.wikimedia.org/T235283) (owner: Joal)
[14:54:01] fdans: NICEEE
[14:55:36] nuria: this metric is making me realise that it's pretty pressing to add simultaneous splitting and filtering on wikistats
[14:56:15] fdans: agreed, but i do not think that fits in this quarter
[14:56:35] Analytics, Fundraising-Backlog, Operations, SRE-Access-Requests: Banner History and page view data access for fundraising analysts - Jerrie and Erin - https://phabricator.wikimedia.org/T233636 (herron)
[14:57:48] Analytics, Fundraising-Backlog, Operations, SRE-Access-Requests: Banner History and page view data access for fundraising analysts - Jerrie and Erin - https://phabricator.wikimedia.org/T233636 (herron) Open→Resolved Transitioning this resolved as all subtasks have now been resolved. If a...
[14:58:26] Analytics, Research: Taxonomy of new user reading patterns - https://phabricator.wikimedia.org/T234188 (MGerlach) == Change of formatting in webrequest logs? == When looking at the number of registration events over time, I find that there are between 100-200 events per hour. However, at some point thi...
[15:03:39] Analytics, Research: Taxonomy of new user reading patterns - https://phabricator.wikimedia.org/T234188 (Ottomata) //Unhelpful comment below -- (I don't know why things changed)// This is why using webrequest for this kind of thing is difficult! The HTTP request patterns you are querying for are not desi...
[15:07:44] (PS5) Fdans: Add backfill queries for per referer mediarequests [analytics/refinery] - https://gerrit.wikimedia.org/r/541817 (https://phabricator.wikimedia.org/T228149)
[15:09:09] fdans: did the restbase deployment of top endpoints already happened?
[15:10:17] Analytics, Analytics-Kanban, Operations, Patch-For-Review, User-Elukey: Archiva relies on a tmpfs directory that is wiped after each reboot - https://phabricator.wikimedia.org/T214366 (elukey)
[15:10:18] nuria: nope, we're deploying aqs today so that the keyspace is created, then I'll start the backfilling and send the restbase PR
[15:10:26] fdans: k
[15:11:51] (CR) Fdans: [V: +2 C: +2] "These queries are already running, merging in case we need them again in the future" [analytics/refinery] - https://gerrit.wikimedia.org/r/541817 (https://phabricator.wikimedia.org/T228149) (owner: Fdans)
[15:14:42] Analytics, Research: Taxonomy of new user reading patterns - https://phabricator.wikimedia.org/T234188 (Nuria) +1 to @ottomatas's comment, webrequest data is very limited for this type of research. Mediawiki does not change dramatically frequently but it does change every time there is a release. Have...
[15:19:40] (CR) Mforns: "LGTM!!" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543210 (https://phabricator.wikimedia.org/T235283) (owner: Joal)
[15:38:55] Analytics, Better Use Of Data, Epic, Performance-Team (Radar), Product-Infrastructure-Team-Backlog (Kanban): Prototype client to log errors - https://phabricator.wikimedia.org/T235189 (LGoto)
[15:59:57] Analytics, Analytics-Cluster: Optimize archiva git-fat symlink script - https://phabricator.wikimedia.org/T235668 (Ottomata)
[16:01:52] Analytics, Research: Taxonomy of new user reading patterns - https://phabricator.wikimedia.org/T234188 (JAllemandou) I have an explanation for you @MGerlach, and will try to show you how I have found it: - What happened on 2019-07-23 in the analytics operation world? https://tools.wmflabs.org/sal/analyt...
[16:02:43] Analytics: Update R on the statboxes - https://phabricator.wikimedia.org/T214598 (mpopov)
[16:02:45] Analytics, Product-Analytics: Update R from 3.3.3 to 3.6.0 on stat and notebook machines - https://phabricator.wikimedia.org/T220542 (mpopov)
[16:03:26] Analytics, Product-Analytics: Update R from 3.3.3 to 3.6.0 on stat and notebook machines - https://phabricator.wikimedia.org/T220542 (mpopov) Thanks for the update @elukey!
[16:06:34] Analytics, Research: Taxonomy of new user reading patterns - https://phabricator.wikimedia.org/T234188 (MGerlach) That is fantastic @JAllemandou I was suspecting something along these lines but it was not sure where/how to track those changes. Should be possible to fix now. Thanks a lot.
[16:20:55] a-team: Hi everyone, my name is Lex Nasser, and I’m excited to start working with you all! I’m joining the Analytics team as a Software Engineering Intern for the next several months.
[16:20:55] As a bit of background, I’m currently studying Computer Science, Data Science, and Economics at UC Berkeley in the SF Bay Area. This past summer, I worked as a Software Engineering Intern at Roku, where I worked with Hive, Hadoop, and several other technologies employed by the Analytics team. Now I am excited to take these skills to the WMF, where I am hoping to learn a bunch and help further its
[16:20:55] Analytics work.
[16:20:56] My first project will most likely be the improvement of caching for the public user-request dataset, which I hope will provide the necessary grounding for further projects on the Analytics team.
[16:20:57] I’ve already met many of you, and hope to meet the rest of you soon — I’ll be present at the staff call in a few minutes. Looking forward to working together!
[16:21:00] Analytics, Fundraising-Backlog, Operations, SRE-Access-Requests: Banner History and page view data access for fundraising analysts - Jerrie and Erin - https://phabricator.wikimedia.org/T233636 (jrobell) Thank you @herron !
[16:21:59] lexnasser: o/
[16:22:06] Hi lex :)
[16:25:16] hi lexnasser!
[16:43:58] Analytics, Analytics-EventLogging, MediaWiki-General, WMF-Legal, Privacy: Collect IPs for pingback - https://phabricator.wikimedia.org/T191691 (sbassett) p:Triage→Normal
[17:12:02] Analytics, Analytics-EventLogging, QuickSurveys, Readers-Web-Backlog (Kanbanana-2019-20-Q2): QuickSurveys EventLogging missing ~10% of interactions - https://phabricator.wikimedia.org/T220627 (MBinder_WMF) Changed subtype based on Kickoff @phuedx
[17:17:17] Going for dinner team, back after for deploy
[17:24:26] ottomata: ok if I upgrade archiva?
[17:25:00] go for it!
[17:33:59] Analytics, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Move reportupdater reports that pull data from eventlogging mysql to pull data from hadoop - https://phabricator.wikimedia.org/T223414 (CCicalese_WMF) >>! In T223414#5576959, @mforns wrote: > @CCicalese_WMF > The only thing I...
[17:39:10] lexnasser: WELCOME
[17:39:53] lexnasser: you see EU folks dropping as it is getting later over there, let me know if you have issues filing the request for access to data, will schedule couple meetings to go over things once you get your ssh keys set up
[17:44:41] Analytics-General-or-Unknown, Analytics-Kanban, Privacy: analytics.wikimedia.org loads resources from third parties - https://phabricator.wikimedia.org/T156347 (sbassett)
[17:49:04] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics inter - https://phabricator.wikimedia.org/T235688 (Nuria)
[17:49:12] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (Nuria)
[17:49:14] hey mforns - What should I do for my reportupdater-queries to get deploy? I have no idea :(
[17:49:47] joal: there was a brief outage for the singapore DC so I will do the archiva upgrade tomorrow
[17:50:26] elukey: ok - Do you prefer me to wait for this upgrade before the deploy train?
[17:51:22] nono please go ahead, we can test archiva tomorrow with some basic builds
[17:51:29] Analytics: Request for a large request data set for caching research and tuning - https://phabricator.wikimedia.org/T225538 (lexnasser) a:lexnasser
[17:51:47] ack elukey - We won't test releasing, but that's probably ok :)
[17:54:04] nuria: if you have a minute can you double check https://gerrit.wikimedia.org/r/#/c/analytics/refinery/source/+/543216/ please?
[17:54:07] there will be time :D
[17:54:20] I'm sure there will elukey :)
[17:55:43] elukey: is your code for AQS ready or not yet?
[17:56:33] Same question for you mforns about the hive-partition having dots
[17:57:40] joal: ah that is puppet code, I'll merge when Filippo will review, feel free to discard it
[17:57:55] elukey: ack! Thanks :)
[17:58:21] nuria: For the public keypair, is that both my ssh-rsa and ssh-ed25519 public keys?
[17:58:23] elukey: Oh, and EL-py3?
[18:00:38] will do it tomorrow :)
[18:01:19] k elukey - Nothing for me then :)
[18:02:03] PROBLEM - Check the last execution of reportupdater-pingback on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:03:10] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (lexnasser) Approving as the relevant Wikimedia Foundation employee.
[18:03:39] PROBLEM - Check the last execution of refinery-import-page-history-dumps on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:04:05] PROBLEM - Check if the Hadoop HDFS Fuse mountpoint is readable on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration%23Fixing_HDFS_mount_at_/mnt/hdfs
[18:04:33] PROBLEM - Check the last execution of reportupdater-interlanguage on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:04:44] joal, to deploy reportupdater queries just need to merge the change
[18:05:05] mforns: I was thinking that, but was not sure
[18:05:10] thanks mforns :)
[18:05:17] PROBLEM - Check the last execution of reportupdater-published_cx2_translations on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:05:17] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (Nuria) Approved on my end, i think @lexnasser needs to provide ssh keys and sign NDA per https://wikitech.wikimedia.org/wiki/Production_access
[18:05:18] np!
[18:05:28] Maybe not now though, as they're all alarming
[18:05:29] do you guys need help with ru?
[18:05:51] well elukey, I must say I have no clue of what to do when RU is unhappy like that
[18:06:04] ah no wait stat1007 is borked
[18:06:08] this is why
[18:06:17] heh
[18:06:21] PROBLEM - Check the last execution of archive-maxmind-geoip-database on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:07:02] indeed - meh
[18:07:03] PROBLEM - Check the last execution of reportupdater-browser on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:07:34] should recover soon
[18:07:46] the oom killer did its job,
[18:07:56] I restarted the nagios daemon
[18:08:10] all right all good then :)
[18:08:17] elukey: do we know why?
[18:09:58] joal: somebody probably started a greedy python process ending up saturating the host
[18:10:16] sooner or later I'll have time to apply proper limits
[18:10:24] need to figure out how, it is tricky
[18:10:35] but it would remove a lot of noise/annoyances
[18:10:55] ack elukey - thanks
[18:11:07] thanks luca :]
[18:12:18] all right going to dinner, if you need me please ping!
[18:12:35] RECOVERY - Check the last execution of reportupdater-pingback on stat1007 is OK: OK: Status of the systemd unit reportupdater-pingback https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:12:49] Will I need ssh access to anything other than the analytics servers?
[18:13:19] mforns: ping again about dots in hive-partitions - Shall this be deployed?
[18:13:55] joal, for me it's good to go! I was waiting for a CR
[18:14:01] Ah
[18:14:06] joal, I mean
[18:14:11] RECOVERY - Check the last execution of refinery-import-page-history-dumps on stat1007 is OK: OK: Status of the systemd unit refinery-import-page-history-dumps https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:14:15] do you want to discuss this in tha cave now?
[18:14:20] joal, ^
[18:14:52] no worries mforns - It's been +1ed by Andrew, it's a 1 char change and I couldn't find in my mind a reason for not doing it
[18:14:56] Let's merge?
[18:15:05] RECOVERY - Check the last execution of reportupdater-interlanguage on stat1007 is OK: OK: Status of the systemd unit reportupdater-interlanguage https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:15:29] ok joal let's
[18:15:30] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (lexnasser)
[18:15:51] RECOVERY - Check the last execution of reportupdater-published_cx2_translations on stat1007 is OK: OK: Status of the systemd unit reportupdater-published_cx2_translations https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:16:07] (CR) Joal: [V: +2] "Merging for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/542441 (https://phabricator.wikimedia.org/T235268) (owner: Mforns)
[18:16:17] (CR) Joal: [V: +2 C: +2] Allow HivePartitions to have dots (.) in their values [analytics/refinery] - https://gerrit.wikimedia.org/r/542441 (https://phabricator.wikimedia.org/T235268) (owner: Mforns)
[18:16:25] ok
[18:16:53] RECOVERY - Check the last execution of archive-maxmind-geoip-database on stat1007 is OK: OK: Status of the systemd unit archive-maxmind-geoip-database https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:17:37] RECOVERY - Check the last execution of reportupdater-browser on stat1007 is OK: OK: Status of the systemd unit reportupdater-browser https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:18:51] Last thing I'm waiting for is mediawiki-history-dumper from nuria, when she has a minute
[18:19:39] ok first thing, I'm gonna deploy AQS - ottomata are you nearby, in case?
[18:20:39] Ah fdans - You have merged your patch for mediarequest-top without providing CQL test data I think
[18:28:42] Analytics, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Move reportupdater reports that pull data from eventlogging mysql to pull data from hadoop - https://phabricator.wikimedia.org/T223414 (mforns) @CCicalese_WMF Totally understand your point. > Is there any way we could have...
[18:29:03] mforns: ok for me merging my RU queries now that it has recovered?
[18:29:11] yes!
[18:29:15] joal, ^
[18:29:21] ack! Doing so
[18:29:47] (CR) Joal: [V: +2 C: +2] "Merging for deploy" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543210 (https://phabricator.wikimedia.org/T235283) (owner: Joal)
[18:31:16] joal, I've seen now that interlanguage queries treat $2 as inclusive.
[18:31:37] joal, but that's not an issue of your patch, it was like that before...
[18:31:46] Meh! Didn't notice mforns :(
[18:31:49] so, does not make sense to change it in this patch
[18:31:50] sorry for that
[18:31:53] no no
[18:32:03] Let's add a task
[18:32:08] this is something for the data owners to change
[18:32:10] yes
[18:32:33] joal: hello sorry am I still in time to send it?
[18:32:46] otherwise they will see differences in their metrics
[18:32:49] fdans: I was doing it, and noticed by the way a bug
[18:33:45] fdans: https://github.com/wikimedia/analytics-aqs/blob/master/v1/mediarequests.yaml#L266
[18:34:25] (╯°□°)╯︵ ┻━┻
[18:34:34] :)
[18:34:37] joal: sending patch :(
[18:34:39] RECOVERY - Check if the Hadoop HDFS Fuse mountpoint is readable on stat1007 is OK: OK https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration%23Fixing_HDFS_mount_at_/mnt/hdfs
[18:34:42] I'm sorry
[18:34:42] ffs
[18:34:51] fdans: ok - Sending for aqs-deploy CQL
[18:37:14] (PS1) Fdans: Correct copypaste bug in top mediarequests [analytics/aqs] - https://gerrit.wikimedia.org/r/543506
[18:37:38] joal ^
[18:38:32] fdans: shouldn't it be media_types?
[18:40:10] (PS2) Fdans: Correct copypaste bug in top mediarequests [analytics/aqs] - https://gerrit.wikimedia.org/r/543506
[18:40:42] joal: fixed, forgive my stupid foggy brain
[18:41:14] (CR) Joal: [C: +2] "Merging for deploy - Thanks for the quick turnaround" [analytics/aqs] - https://gerrit.wikimedia.org/r/543506 (owner: Fdans)
[18:41:26] thank you joal
[18:41:53] (Merged) jenkins-bot: Correct copypaste bug in top mediarequests [analytics/aqs] - https://gerrit.wikimedia.org/r/543506 (owner: Fdans)
[18:42:59] (PS1) Joal: Add mediarequest-top fake data in script [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/543633 (https://phabricator.wikimedia.org/T233716)
[18:43:05] fdans: --^ please :)
[18:43:43] (CR) Fdans: [V: +2 C: +2] Add mediarequest-top fake data in script [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/543633 (https://phabricator.wikimedia.org/T233716) (owner: Joal)
[18:45:42] !log Manually create mediarequest-top cassandra keyspace and tables, and add fake test data into it
[18:45:44] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:47:24] (PS1) Joal: Correct type in mediarequest-top fake data script [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/543656 (https://phabricator.wikimedia.org/T233716)
[18:47:31] fdans: --^ sorry :/
[18:47:55] (CR) Fdans: [V: +2 C: +2] Correct type in mediarequest-top fake data script [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/543656 (https://phabricator.wikimedia.org/T233716) (owner: Joal)
[18:48:04] joal: classic
[18:48:06] :)
[18:48:11] Thanks mate :)
[18:48:25] I don't know exactly why cassandra wants camelcase fields wrapped in quotes
[18:48:33] I have no clue :(
[18:49:18] Analytics, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Move reportupdater reports that pull data from eventlogging mysql to pull data from hadoop - https://phabricator.wikimedia.org/T223414 (CCicalese_WMF) @mforns Sounds great - thank you!
[18:50:28] joal: I see the keyspace created from aqs1004 :)
[18:51:04] (CR) Mforns: [C: +1] "LGTM! Thanks for doing this :] I just left 2 nitpicky-annoying comments. But feel free to merge like this." (3 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543216 (https://phabricator.wikimedia.org/T235269) (owner: Joal)
[18:52:46] (PS1) Joal: Update aqs to e3e4dae [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/543660
[18:53:25] Analytics, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Move reportupdater reports that pull data from eventlogging mysql to pull data from hadoop - https://phabricator.wikimedia.org/T223414 (mforns) @CCicalese_WMF Cool! I will then copy the data over to Hive and rerun the querie...
[18:55:10] (CR) Joal: Update mediawiki-history dumper (3 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543216 (https://phabricator.wikimedia.org/T235269) (owner: Joal)
[18:55:23] (PS3) Joal: Update mediawiki-history dumper [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543216 (https://phabricator.wikimedia.org/T235269)
[18:55:31] Thanks a lot for the review mforns
[18:56:27] fdans: do you mind triple checking this https://gerrit.wikimedia.org/r/543216?
[18:56:49] joal: looking
[18:58:15] (CR) Fdans: [C: +1] "Don't have much context on this, but I see nothing looking fishy" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543216 (https://phabricator.wikimedia.org/T235269) (owner: Joal)
[18:58:31] (CR) Joal: [V: +2 C: +2] "Merging for deploy" [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/543660 (owner: Joal)
[18:58:37] Thanks fdans
[18:58:50] ok - Let's wait for ottomata to be around before actually deploying :)
[18:59:26] joal: sounds good, I'll be around if yall need me :)
[18:59:34] Thanks fdans :)
[18:59:48] by the way fdans - shall I start coords for loading top?
[19:00:00] fdans: or do you prefer doing it tomorrow?
[19:00:16] joal: I can do it tomorrow, don't worry about it
[19:00:21] k
[19:01:19] Analytics, Analytics-EventLogging, Better Use Of Data, Event-Platform, and 4 others: Modern Event Platform: Stream Configuration: Implementation - https://phabricator.wikimedia.org/T233634 (Ottomata) Conclusions from today's meeting: - TODO: Make RL package file callbacks support args from e...
[19:02:02] joal: hi do you need me?
[19:02:16] ottomata: Just wanted to be sure you were nearby IF :)
[19:02:23] ottomata: going to deploy AQS now
[19:02:35] ok for you ottomata ?
[19:03:07] go go go
[19:04:53] ottomata: did 1007 went kaput
[19:05:15] nuria: it did
[19:05:35] nuria: elukey said the oom tool did its job - Not sure about what it is though
[19:05:39] hmmm seems ok now?
[19:06:01] (CR) Nuria: [C: +2] Update mediawiki-history dumper [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543216 (https://phabricator.wikimedia.org/T235269) (owner: Joal)
[19:06:13] aouch - failed deploy because of test
[19:06:40] Thanks nuria for the review
[19:09:44] Wow - I messed up badly my CQL :(
[19:11:40] (Merged) jenkins-bot: Update mediawiki-history dumper [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543216 (https://phabricator.wikimedia.org/T235269) (owner: Joal)
[19:15:22] (PS1) Joal: Correct top-mediarequest fake data script [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/543669 (https://phabricator.wikimedia.org/T233716)
[19:15:45] nuria: --^ if you have a minute - I messed it up 3 times :( I'm sorry for that
[19:18:32] (PS1) Joal: Add v0.0.103 to changelog.md [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543670
[19:18:56] ottomata, mforns --^ if any of you have a minute - If no answer in 5 minutes, I'll do it myself :)
[19:19:21] !log AQS deployed with mediarequest-top endpoint
[19:19:23] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:19:38] fdans: --^ :)
[19:19:40] (CR) Ottomata: [C: +1] Add v0.0.103 to changelog.md [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543670 (owner: Joal)
[19:19:46] Thanks ottomata :)
[19:20:08] (CR) Joal: [V: +2 C: +2] "Merging for release" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543670 (owner: Joal)
[19:20:31] thank you for your work joal, can’t wait to have this metric live!
[19:20:35] good night
[19:20:42] See you fdans
[19:29:43] !log Ask jenkins to release refinery-source v0.0.103 to archiva
[19:29:44] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:30:51] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (Bstorm) Well, so far, we generally coordinate on maintenance, reboots and service downtime with @ArielGlenn...
[19:32:30] (PS1) Joal: Bump mediawiki_history_dumps job jar version [analytics/refinery] - https://gerrit.wikimedia.org/r/543676 (https://phabricator.wikimedia.org/T235409)
[19:32:49] mforns: --^ if you have a minute please
[19:33:02] lokin
[19:33:05] lookin
[19:33:30] (CR) Mforns: [V: +2 C: +2] Bump mediawiki_history_dumps job jar version [analytics/refinery] - https://gerrit.wikimedia.org/r/543676 (https://phabricator.wikimedia.org/T235409) (owner: Joal)
[19:33:36] mforns: Thanks :)
[19:33:40] joal, can I merge?
[19:33:55] mforns: let's wait for jar to be available since you have not done so :)
[19:34:00] Thanks for asking :)
[19:34:08] hehehe ok
[19:45:45] !log Refinery-source v0.0.103 released to refinery
[19:45:46] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:46:07] (PS2) Joal: Bump mediawiki_history_dumps job jar version [analytics/refinery] - https://gerrit.wikimedia.org/r/543676 (https://phabricator.wikimedia.org/T235409)
[19:46:14] mforns: this is the one :) --^
[19:47:25] (CR) Mforns: [V: +2 C: +2] Bump mediawiki_history_dumps job jar version [analytics/refinery] - https://gerrit.wikimedia.org/r/543676 (https://phabricator.wikimedia.org/T235409) (owner: Joal)
[19:47:45] joal, merged :]
[19:47:49] Thanks mforns :)
[19:48:41] (CR) Joal: [V: +2 C: +2] "Merging (already fixed in prod)" [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/543669 (https://phabricator.wikimedia.org/T233716) (owner: Joal)
[20:05:00] Analytics, Analytics-Kanban, Patch-For-Review: Proof of concept: Entropy calculations can be used to alarm on anomalies for data quality metrics - https://phabricator.wikimedia.org/T215863 (mforns) The docs are here: https://wikitech.wikimedia.org/wiki/Analytics/Data_quality/Entropy_alarms Moving tas...
[20:08:00] !log Deployed refinery using scap
[20:08:02] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:15:42] (CR) Mforns: [V: +2 C: +2] "LGTM too!" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543008 (https://phabricator.wikimedia.org/T232671) (owner: Srishakatux)
[20:16:29] !log Deployed refinery onto HDFS
[20:16:30] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:16:40] mforns: question for you
[20:16:48] yep :]
[20:17:02] mforns: shall I regenerate the files for mediawiki-history-dumps 2019-09?
[20:17:47] joal, mmm dunno, theoretically whenever we improve sth in the format or contents of the dumps, we are not going to replace previous dumps, right?
[20:17:54] true
[20:18:10] ok I'll just restart the oozie job then :)
[20:18:12] for this time, though, considering it's still not out officially, I'm not sure
[20:19:34] Let's just restart it mforns - sounds good
[20:19:37] joal, maybe when we officially launch it, we can delete all versions older than the official release, to not confuse
[20:19:45] works for me
[20:20:26] !log Kill-restart mediawiki-history-dumps-coord to pick up changes
[20:20:28] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:20:33] joal, should we create a puppet patch to enable wmcs RU queries already?
[20:20:44] for Srishti?
[20:21:23] mforns: why not - query is planned to start 2019-11-01 (start date in october, monthly granularity) - So having it deployed shouldn't hurt
[20:21:38] joal, do you want me to do that?
[20:22:21] mforns: I wonder if she'd prefer us to do it, or learn to do it... She told us the project was a learning experience - Let's ask her :)
[20:22:26] srish_aka_tux: Hi!
[20:22:30] ok
[20:24:53] srish_aka_tux: mforns has merged the reportupdater-query patch, and now it needs a puppet patch to actually run (reportupdater needs a dedicated run for every folder)
[20:25:36] srish_aka_tux: We were wondering with mforns if you'd prefer us to write that patch, or do it yourself as you told us the project was a learning experience
[20:26:00] srish_aka_tux: Both are fine for us, let us know what you prefer :)
[20:27:09] !log upgrading to spark 2.4.4 in analytics test cluster
[20:27:11] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:28:14] @joal @mforns Okay, thanks for asking :) Do customers of your infra usually update the puppet patch on their own, or do you prefer to do it for them? If the former, then I can look into it :)
[20:28:40] I'll let mforns answer that :) --^
[20:29:43] srish_aka_tux, I don't recall an occasion where someone did it for us :]
[20:30:45] I can do that, and will add you as a reviewer, so that you can see what it is about. It's pretty simple, but puppet is always difficult to understand (at least for me)
[20:32:59] @mforns Haha, okay, then go ahead and add the patch :) I think the goal for me is to work only on the things that you would prefer for folks outside your team to do for their project and be aware of the process :)
[20:34:29] ok looks like decision has been made :)
[20:34:33] srish_aka_tux, ok no problem
[20:34:35] elukey: yt?
[20:34:47] (not urgent!)
[20:35:00] Ok team - I'm gonna get to bed :) Talk to you tomorrow
[20:35:07] bye joseph1
[20:35:07] !
[20:35:14] byeee
[20:37:02] srish_aka_tux, I just noticed that the name of the script in the reportupdater config does not match the name of the file...
[20:37:12] srish_aka_tux, sorry for not noticing this in the code review
[20:37:26] could you fix it in a new patch please?
[20:37:59] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253 (Ottomata) @elukey, I think we need either an hdfs or a spark keytab. https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/hadoop/spark2.p...
[20:50:16] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253 (Ottomata) Ah right, we have an hdfs keytab on hadoop masters and workers, just not coordinator. I ran ` sudo -u hdfs /usr/local/bin/kerberos-run-command hdfs ./sp...
[20:59:20] @mforns Sure! On it..
[21:07:04] milimetric: let's talk tomorrow about my comments on https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/530878/ which i think is the only thing holding this patch from being merged
[21:18:01] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253 (Ottomata) @joal FYI spark is upgraded to 2.4.4 in test cluster. shuffle service restart applied there too.
[21:19:25] Analytics, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Move reportupdater reports that pull data from eventlogging mysql to pull data from hadoop - https://phabricator.wikimedia.org/T223414 (Nuria) @CCicalese_WMF Do file another task for the pingback work that should include the h...
[21:33:22] (PS1) Srishakatux: Fix report name in wmcs config [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543709 (https://phabricator.wikimedia.org/T232671)
[21:46:35] (CR) Nuria: "I think you want to send a patch with diff, not a whole new addition of files as these files already exist: https://github.com/wikimedia/" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543709 (https://phabricator.wikimedia.org/T232671) (owner: Srishakatux)
[22:03:32] (CR) Srishakatux: "> I think you want to send a patch with diff, not a whole new" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543709 (https://phabricator.wikimedia.org/T232671) (owner: Srishakatux)
[22:08:13] (Abandoned) Srishakatux: Fix report name in wmcs config [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543709 (https://phabricator.wikimedia.org/T232671) (owner: Srishakatux)
[22:16:25] (PS1) Srishakatux: Fix report name in wmcs config [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543719 (https://phabricator.wikimedia.org/T232671)
[22:30:06] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Puppetize reportupdater to run wmcs reports - https://phabricator.wikimedia.org/T235718 (srishakatux)
[22:30:13] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Puppetize reportupdater to run wmcs reports - https://phabricator.wikimedia.org/T235718 (srishakatux) p:Triage→Normal
[22:31:29] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Puppetize reportupdater to run wmcs reports - https://phabricator.wikimedia.org/T235718 (srishakatux) @mforns for the documentation sake, I've filed this task :) You can share related updates here.
[22:44:09] (CR) Nuria: [C: +2] Fix report name in wmcs config [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543719 (https://phabricator.wikimedia.org/T232671) (owner: Srishakatux)
[23:06:43] Analytics, Analytics-EventLogging, Better Use Of Data, Event-Platform, and 4 others: Modern Event Platform: Stream Configuration: Implementation - https://phabricator.wikimedia.org/T233634 (Nuria) >Default behavior: if no config for stream, just return empty config. EL will always send event (no...
[23:13:40] PROBLEM - Check the last execution of archive-maxmind-geoip-database on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[23:14:20] PROBLEM - Check the last execution of reportupdater-browser on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[23:15:42] Analytics, Analytics-Kanban, User-Elukey: Create test Kerberos identities/accounts for some selected users in hadoop test cluster - https://phabricator.wikimedia.org/T212258 (Isaac) @elukey I'm having trouble ssh-ing into an-tool1006.eqiad.wmnet (`ssh isaacj@an-tool1006.eqiad.wmnet`) where it is not...
[23:24:12] RECOVERY - Check the last execution of archive-maxmind-geoip-database on stat1007 is OK: OK: Status of the systemd unit archive-maxmind-geoip-database https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[23:24:54] RECOVERY - Check the last execution of reportupdater-browser on stat1007 is OK: OK: Status of the systemd unit reportupdater-browser https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers