[01:16:09] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Setup Config:Dashiki:WMCSEdits on meta wiki - https://phabricator.wikimedia.org/T236223 (srishakatux)
[01:16:30] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Setup Config:Dashiki:WMCSEdits on meta wiki - https://phabricator.wikimedia.org/T236223 (srishakatux) p:Triage→High
[01:42:54] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Setup Config:Dashiki:WMCSEdits on meta wiki - https://phabricator.wikimedia.org/T236223 (srishakatux) @Milimetric I looked into the supported layouts and visualization a bit https://wikitech.wikimedia.org/wiki/Analytics/Tutorials/Dashboard...
[01:47:28] nuria: sure! I just closed this ticket by the way https://phabricator.wikimedia.org/T232992 I'd be interested to know if the algorithm you're using is able to detect it. There is not a "spike" in this case, rather sustained false traffic like xhamster
[01:48:15] Analytics, Pageviews-Anomaly: Manipulation of pageview statistics - https://phabricator.wikimedia.org/T232992 (Nuria)
[01:48:31] musikanimal: will look and setup meeting
[01:48:37] and I think we've identified a motive... it is surfaced in the mobile apps as a "trending topic". Looking at the mobile app numbers, specifically, you can see that it works. Genuine traffic is going to these articles because they are wrongfully surfaced as "trending"
[01:48:40] great! thanks
[01:49:08] musikanimal: so that will not be detected, as it is organic traffic
[01:49:13] musikanimal: right?
[01:49:31] the mobile-app traffic is, the mobile-web and desktop are not
[01:49:50] musikanimal: ah i see, will look in detail
[01:49:54] musikanimal: this one: https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2019-06-01&end=2019-10-01&pages=Line_shaft
[01:50:24] musikanimal: is more like the traffic pattern also would catch
[01:50:35] *algo not also
[01:50:45] yeah, that's what I figured :/
[04:00:54] Analytics-Kanban, Product-Analytics, Patch-For-Review: Make aggregate data on editors per country per wiki publicly available - https://phabricator.wikimedia.org/T131280 (Erik_Zachte) @milimetric thanks for bringing this to completion I see from https://wikitech.wikimedia.org/wiki/Analytics/Data_La...
[06:39:49] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (elukey) Thanks a lot for all the feedback! As next step I'd propose to file a puppet patch to: 1) add the h...
[07:39:39] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (MoritzMuehlenhoff) Resolved→Open Reopening, currently the same key is used in Cloud VPS and production, which is a security risk.
[07:57:50] (PS1) WMDE-leszek: Modify access rules [analytics/wmde/WD/WD_languagesLandscape] (refs/meta/config) - https://gerrit.wikimedia.org/r/545492
[07:58:03] (Abandoned) WMDE-leszek: Modify access rules [analytics/wmde/WD/WD_languagesLandscape] (refs/meta/config) - https://gerrit.wikimedia.org/r/545492 (owner: WMDE-leszek)
[07:58:33] (PS1) WMDE-leszek: Modify access rules [analytics/wmde/WD/WD_languagesLandscape] (refs/meta/config) - https://gerrit.wikimedia.org/r/545493
[07:59:57] (CR) Addshore: [V: +2 C: +2] Modify access rules [analytics/wmde/WD/WD_languagesLandscape] (refs/meta/config) - https://gerrit.wikimedia.org/r/545493 (owner: WMDE-leszek)
[08:57:19] (Abandoned) WMDE-leszek: Modify access rules [analytics/wmde/WD/WD_languagesLandscape] (refs/meta/config) - https://gerrit.wikimedia.org/r/545493 (owner: WMDE-leszek)
[09:39:45] (CR) Awight: New report for Reference Previews (1 comment) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/542419 (https://phabricator.wikimedia.org/T231529) (owner: Awight)
[09:44:26] Analytics, Performance-Team, Research, Security-Team, WMF-Legal: A Large-scale Study of Wikipedia Users' Quality of Experience: data release - https://phabricator.wikimedia.org/T217318 (Gilles) @Fsalutari has put together a sanitised version of the dataset(s) according to what we agreed on. I...
[09:55:13] Analytics, Android-app-Bugs, Wikipedia-Android-App-Backlog (Android-app-release-v2.7.29x-N-Nanaimo-Bar): App requests classified as pageviews that probably should not be so - https://phabricator.wikimedia.org/T229068 (Charlotte) Thanks @Nuria - the issues will be resolved once those on the older vers...
[09:55:25] Analytics, Android-app-Bugs, Wikipedia-Android-App-Backlog (Android-app-release-v2.7.29x-N-Nanaimo-Bar): App requests classified as pageviews that probably should not be so - https://phabricator.wikimedia.org/T229068 (Charlotte) Open→Resolved
[10:19:50] (PS1) MarcoAurelio: Add .gitreview [analytics/wmde/WD/WD_languagesLandscape] - https://gerrit.wikimedia.org/r/545510
[10:20:17] (CR) MarcoAurelio: [V: +2 C: +2] Add .gitreview [analytics/wmde/WD/WD_languagesLandscape] - https://gerrit.wikimedia.org/r/545510 (owner: MarcoAurelio)
[11:59:27] Analytics, Analytics-EventLogging, QuickSurveys, MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), and 2 others: QuickSurveys EventLogging missing ~10% of interactions - https://phabricator.wikimedia.org/T220627 (phuedx) Additionally, the Performance team is running their Perceived Performance survey rig...
[12:50:11] Analytics, Discovery, Event-Platform, Wikidata, and 3 others: Log Wikidata Query Service queries to the event gate infrastructure - https://phabricator.wikimedia.org/T101013 (Igorkim78) Added link to the task T236251: Add header returning time millis to first solution similar to TTFB measured in...
[12:51:01] neilpquinn: i am not good at messaging on wiki!
[12:51:10] thanks for docs, i edited a bit to fix a couple of things
[12:51:22] one of the reasons I don't want to maintain this stuff on wiki is that it just gets stale too fast
[12:51:39] puppet has this info, and for us SRE it is queryable via cumin
[12:55:14] Analytics, Discovery, Event-Platform, Wikidata, and 3 others: Log Wikidata Query Service queries to the event gate infrastructure - https://phabricator.wikimedia.org/T101013 (Ottomata) sparql/query schema is merged. We'll need to do an eventgate-analytics k8s deploy before it can be used. Let m...
[13:09:55] https://github.com/internetarchive/snakebite-py3
[13:09:57] \o/
[13:11:37] so the internet archive people apparently have a cdh5 cluster as we do
[13:20:55] also
[13:20:57] https://community.cloudera.com/t5/Support-Questions/Upgrade-unmanaged-CDH-Cluster-to-6-1-from-5-16/m-p/87771/highlight/true#M28599
[13:21:10] "Starting with Cloudera Manager 6 and CDH6, any new clusters (either CDH5 or CDH6) installed without Cloudera Manager and not managed by Cloudera Manager are no longer be supported."
[13:21:30] !
[13:21:37] well bye bye CDH
[13:22:18] well, the current packages were also not really supported unless you had a subscription, what does that change?
[13:22:25] hm
[13:22:37] ya i guess what does 'supported' mean?
[13:22:49] they still had lots of docs about how to install CDH without cloudera manager
[13:22:59] I read that as being able to buy a support contract at Cloudera
[13:22:59] if they stop making that possible or helping with that, things will be hard!
[13:23:03] aye
[13:24:23] which probably even makes sense, they can hardly provide sensible support for arbitrary setups
[13:25:06] the start of the forum post was somebody asking for a migration doc from cdh 5.16 to 6.x, and there is none..
[13:25:40] and we'd need cloudera manager to do anything probably :(
[13:25:42] Analytics, Discovery, Event-Platform, Wikidata, and 3 others: Log Wikidata Query Service queries to the event gate infrastructure - https://phabricator.wikimedia.org/T101013 (dcausse) @Igorkim78 thanks for the suggestion, we could perhaps log all response HTTP headers (we already log all request...
[13:26:29] too bad that in here we didn't find anybody from bigtop
[14:07:48] ottomata: yeah, I totally understand that - I'm all for just putting pointers to information elsewhere, which won't need to be updated. For example, maybe instead of listing all the Analytics Puppet roles _and_ which servers are in each role, we could list just the roles and say "search the puppet codebase for the role name to learn more" or something like that :)
[14:08:36] that's why I didn't bring back the hardware specs or the ports - they definitely didn't seem worth the effort of maintaining on-wiki
[14:38:35] (CR) Fdans: [V: +2 C: +2] "This job is already running without issues, merging. No deploy required." [analytics/refinery] - https://gerrit.wikimedia.org/r/543837 (https://phabricator.wikimedia.org/T228149) (owner: Fdans)
[14:42:41] elukey: thanks for fixing up my request ticket! I wasn't sure what it should look like and copied tickets from when we set up logstash on ganeti
[14:44:21] ebernhardson: np! I'll work on the task on Monday if nobody touches it, I am not working this week (at a conf) otherwise I'd have already done it :(
[14:45:19] ebernhardson: qq - today me and Joseph had a chat with somebody from the Airflow PMC, and they told us that hadoop operations are not really supported now (some dev time would be needed, since the actual code doesn't work with py3 etc..)
[14:45:35] what operators are you using? (out of curiosity)
[14:45:37] elukey: right, they only directly support python 2.7 with hdfs integration
[14:45:46] elukey: so i had to write a custom hdfs plugin that shells out to the hdfs cli
[14:46:04] it's *crazy* slow, an exists check takes like 5 sec to spin up a jvm, but whatever
[14:46:13] ? oof
[14:46:14] ouch :(
[14:46:31] in the scheme of things, 5 sec for a directory exists check isn't the end of the world at least :)
[14:46:45] ebernhardson: did you see https://github.com/internetarchive/snakebite-py3 ? that might unblock things
[14:47:01] elukey: for operators, basically only my own custom operators that tie into SparkSubmitHook, and custom SkeinHook that i wrote
[14:47:09] nice
[14:48:13] elukey: ahh, a 3.x python hdfs client would work if i can get it working. My hdfs integration class is probably good enough for the moment but if we actually want to watch hdfs and such a plugin for snakebite would make sense
[14:48:13] ottomata: we might want to package --^ and use it in refinery or other places
[14:48:29] ya
[14:49:00] (skein == python app to deploy and run arbitrary code on yarn)
[14:49:10] basically i use skein to call swift_upload.py
[14:51:38] me and joseph had a chat with an airflow dev, he said that they are moving to py3 and would require some dev help/time to have hadoop-related operators working
[14:51:45] could be interesting to contribute for us
[14:53:16] yea that might make sense
[14:53:38] i have to find time around everything else though of course :) I know you all are equally swamped with things
[14:53:54] I'll also have to figure out what kerberos means for airflow ... it has docs about it but i didn't really dig into it yet
[14:54:16] (PS1) Fdans: Add historical backfilling for mediarequest tops [analytics/refinery] - https://gerrit.wikimedia.org/r/545583 (https://phabricator.wikimedia.org/T228149)
[14:55:50] ebernhardson: yes please, let's work on it together if you want. We could probably use the hadoop test cluster to see what needs to be set to make it work
[15:04:21] (afk)
[15:05:30] hi James_F! is it okay if I add a couple of i18n translations to your footer patch?
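The workaround ebernhardson describes above, an HDFS existence check that shells out to the hdfs CLI, might look roughly like the sketch below. It is not the actual plugin: the function name `hdfs_path_exists` and the injectable `run` parameter are hypothetical, and the only CLI behaviour relied on is that `hdfs dfs -test -e PATH` exits with status 0 when the path exists. Each real call starts a JVM, which would account for the multi-second latency mentioned in the chat.

```python
import subprocess
from typing import Callable, Optional, Sequence


def _run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit status (no exception on failure)."""
    return subprocess.run(list(argv), check=False).returncode


def hdfs_path_exists(
    path: str,
    run: Optional[Callable[[Sequence[str]], int]] = None,
) -> bool:
    """Return True if `hdfs dfs -test -e PATH` exits with status 0.

    `run` is a hypothetical hook that lets callers (or tests) replace the
    real subprocess call; by default the hdfs CLI found on PATH is used.
    """
    runner = run if run is not None else _run_command
    return runner(["hdfs", "dfs", "-test", "-e", path]) == 0
```

Keeping the runner injectable means the check can be exercised on a machine without a Hadoop client, and could later be swapped for a native Python 3 client if one becomes usable.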
[15:13:39] (PS2) Fdans: Add historical backfilling for mediarequest tops [analytics/refinery] - https://gerrit.wikimedia.org/r/545583 (https://phabricator.wikimedia.org/T228149)
[15:14:30] (CR) Milimetric: [V: +2] Publish monthly geoeditor numbers [analytics/refinery] - https://gerrit.wikimedia.org/r/530878 (https://phabricator.wikimedia.org/T131280) (owner: Milimetric)
[15:30:46] fdans: They should go through TranslateWiki in the normal way otherwise they'll break the i18n update workflow, but you can do it locally.
[15:31:22] James_F: oohhh sure, of course :)
[15:34:47] (PS3) Fdans: Add historical backfilling for mediarequest tops [analytics/refinery] - https://gerrit.wikimedia.org/r/545583 (https://phabricator.wikimedia.org/T228149)
[16:00:16] (PS1) Mforns: Correct and optimize pingback reports [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/545607 (https://phabricator.wikimedia.org/T223414)
[16:15:40] (CR) Milimetric: [V: +2 C: +2] "I declare this to be the "judo change" because it flips all the queries upside down" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/545607 (https://phabricator.wikimedia.org/T223414) (owner: Mforns)
[16:23:29] Analytics, Analytics-EventLogging, QuickSurveys, MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), and 2 others: QuickSurveys EventLogging missing ~10% of interactions - https://phabricator.wikimedia.org/T220627 (Isaac) > Additionally, the Performance team is running their Perceived Performance survey ri...
[16:49:39] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (Dzahn) @lexnasser Please create a new SSH key that is not used in cloud and let us know the public part so we can update the production access.
[16:49:54] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (Dzahn) a:Dzahn→lexnasser
[16:56:18] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (lexnasser) Here's another public ED25519 key: AAAAC3NzaC1lZDI1NTE5AAAAIOBTDDmL8isvso6xqOJB5qkk3n8xuM0XxFc1Q34ZnZRj Let me know which service is associated with which k...
[17:04:20] nuria: Confirming that I now have access to Turnilo. Will check out and play around with the data in it later today.
[17:04:46] lexnasser: sounds good, let's touch base later via irc or tomorrow, whenever
[17:05:18] nuria: 👍
[18:21:51] !log deploying refinery with scap up to 1110d59c3983bcff4986bce1baf885f05ee06ba5
[18:21:53] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:31:47] !log deploying refinery with refinery-deploy-to-hdfs up to 1110d59c3983bcff4986bce1baf885f05ee06ba5
[18:31:48] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:37:09] !log refinery deployment done!
[18:37:10] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:41:10] milimetric, when I try to create the blacklist table, it complains that the path passed is not a directory
[18:41:26] should I remove the last filename?
[18:43:20] ah shoot, yes
[18:43:28] my script is wrong, sending patch
[18:44:30] mforns: arg! I forgot to update my change before merging because of the gerrit hiccups yesterday
[18:44:44] mforns: do you have time for me to deploy refinery again before starting the job?
[18:44:56] I had corrected this but hadn't sent the patch
[18:45:28] milimetric, sure! but if it's just that, no need to redeploy that now, no?
[18:45:49] I already created the table with non-typo path
[18:46:09] is there more stuff?
[18:46:12] mforns: no there's another fix too, to make the ranges 1-10 instead of 0-9
[18:46:17] ok ok
[18:46:23] no problem, let's redeploy
[18:47:00] (PS1) Milimetric: Add lost geoeditors patch set, gerrit hiccup [analytics/refinery] - https://gerrit.wikimedia.org/r/545633
[18:47:34] I'll deploy, my fault
[18:47:46] (CR) Milimetric: [V: +2 C: +2] Add lost geoeditors patch set, gerrit hiccup [analytics/refinery] - https://gerrit.wikimedia.org/r/545633 (owner: Milimetric)
[18:50:00] milimetric, isn't it possible to have distinct_editors be 0?
[18:50:39] milimetric, if so, cast(floor((distinct_editors - 1) / 10) * 10 + 1 as string) could be "-9" no?
[18:50:54] no, the rows in geoeditors_monthly are generated with a sum over actual rows, so it would never be zero
[18:51:07] oh I see
[18:51:23] I checked and it only reports countries that actually have data
[18:51:28] cool, redeploying!
[18:53:15] oh milimetric are you already deploying?
[18:53:33] yes, sorry, I said above ^
[18:53:47] ok! sorry
[18:54:06] let me know and I'll start the oozie job :]
[18:59:01] !log refinery deployment re-done to fix my mistake
[18:59:03] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:59:05] ok, mforns
[19:00:16] milimetric, this is the command I'll use: https://pastebin.com/8rwPC1ST
[19:00:19] is that OK?
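The bucket expression mforns quotes above can be checked in a few lines of Python. This is only a sketch: `bucket_start` is a hypothetical helper that mirrors `cast(floor((distinct_editors - 1) / 10) * 10 + 1 as string)`, assuming `/` performs floating-point division as it does in Hive. It illustrates that counts 1 to 10 fall in the bucket starting at "1", counts 11 to 20 in the bucket starting at "11", and that a count of 0 would indeed produce "-9", which is why it matters that geoeditors_monthly never holds a zero count.

```python
import math


def bucket_start(distinct_editors: int) -> str:
    """Lower bound of the 10-wide bucket, as a string.

    Mirrors: cast(floor((distinct_editors - 1) / 10) * 10 + 1 as string)
    (hypothetical helper name; `/` is true division, as assumed for the query).
    """
    return str(math.floor((distinct_editors - 1) / 10) * 10 + 1)


if __name__ == "__main__":
    # Print the bucket start for a few sample counts, including the 0 edge case.
    for n in (0, 1, 10, 11, 20, 21):
        print(n, "->", bucket_start(n))
```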
[19:01:13] looks great mforns
[19:01:42] k
[19:03:15] milimetric, https://hue.wikimedia.org/oozie/list_oozie_coordinator/0042999-190918123808661-oozie-oozi-C/
[20:30:45] PROBLEM - Check the last execution of reportupdater-wmcs on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:32:01] PROBLEM - Check the last execution of reportupdater-interlanguage on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:33:57] PROBLEM - Check the last execution of refinery-import-page-history-dumps on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:35:41] PROBLEM - Check the last execution of reportupdater-browser on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:36:05] PROBLEM - Check the last execution of archive-maxmind-geoip-database on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:36:07] PROBLEM - Check the last execution of reportupdater-pingback on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:41:21] RECOVERY - Check the last execution of reportupdater-wmcs on stat1007 is OK: OK: Status of the systemd unit reportupdater-wmcs https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:42:37] RECOVERY - Check the last execution of reportupdater-interlanguage on stat1007 is OK: OK: Status of the systemd unit reportupdater-interlanguage https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:44:31] RECOVERY - Check the last execution of refinery-import-page-history-dumps on stat1007 is OK: OK: Status of the systemd unit refinery-import-page-history-dumps https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:46:17] RECOVERY - Check the last execution of reportupdater-browser on stat1007 is OK: OK: Status of the systemd unit reportupdater-browser https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:46:39] RECOVERY - Check the last execution of archive-maxmind-geoip-database on stat1007 is OK: OK: Status of the systemd unit archive-maxmind-geoip-database https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:46:43] RECOVERY - Check the last execution of reportupdater-pingback on stat1007 is OK: OK: Status of the systemd unit reportupdater-pingback https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[20:48:26] ottomata: I think stat1007 is kaput
[20:48:45] ottomata: we are really going to have to look into restricting user limits
[20:50:06] GoranSM: Are you executing your scripts with nice?
[20:50:23] GoranSM: they are using the whole capacity of the machine
[20:51:14] ya GoranSM I think your R stuff is doing some crazy things.
[20:51:22] could you possibly use Spark or SparkR on the cluster?
[20:51:38] ottomata: can we kill that process?
[20:51:50] i think OOM killed something
[20:51:57] there's lots of free mem now
[20:52:37] hmm, but ya it is using lots of CPU
[20:53:38] ottomata: ya, like all cores
[20:54:13] it's been running for 11 hours already
[20:54:23] machine seems functional atm
[21:16:06] hm, now I see the alarms, thanks team for taking care
[21:27:01] musikanimal: if you are there i can show you how to do the pageview research
[22:13:05] are you getting the help you needed on CoC with new members and all?
[22:13:33] puf wrong chat
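On the question above about running scripts with nice on a shared host such as stat1007, one way to start a long job at lower CPU priority from Python is sketched below. The helper name `run_niced` and the default increment are illustrative assumptions, not an existing team tool, and the sketch assumes a POSIX system. Lower priority only makes the job yield CPU to other users; it does not cap memory or the number of cores used.

```python
import os
import subprocess
import sys
from typing import Sequence


def run_niced(argv: Sequence[str], increment: int = 10) -> subprocess.CompletedProcess:
    """Run a command with its niceness raised by `increment` (POSIX only).

    os.nice() is called in the child just before exec, so the parent's own
    priority is left unchanged. `run_niced` is a hypothetical helper name.
    """
    return subprocess.run(
        list(argv),
        preexec_fn=lambda: os.nice(increment),
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    # Example: the child reports its own niceness.
    result = run_niced([sys.executable, "-c", "import os; print(os.nice(0))"])
    print(result.stdout.strip())
```

The shell equivalent is to prefix the command with `nice -n 10`.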