[00:54:04] 10Analytics, 10Growth-Team, 10Product-Analytics: Growth: implement wider data purge window - https://phabricator.wikimedia.org/T237124 (10MMiller_WMF) @mforns -- I think that @nettrom_WMF can answer you on this. @nettrom_WMF and I decided today that we want to have a strategy for this by mid-December, and w... [01:34:39] 10Analytics, 10Editing-team, 10Performance-Team: VE edit data stopped at ~2019-11-24Z01:25 - https://phabricator.wikimedia.org/T239121 (10Nuria) Super nice explanation @Krinkle, shinny start of the week, really. And note to self I need to look at how these things are defined in graphite ("movingAverage(ve.$T... [02:21:47] (03PS1) 10Nuria: Create table to hold calculations of session features [analytics/refinery] - 10https://gerrit.wikimedia.org/r/552943 (https://phabricator.wikimedia.org/T238360) [06:51:27] ebernhardson: really weird, it should have been executed.. [06:51:39] anyway, the last patch seems to be https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/544990/ at this point [06:51:49] I left a couple of comments, then I think will be done! [06:54:12] 10Analytics, 10Operations, 10ops-eqiad, 10User-Elukey: Check if a GPU fits in any of the remaining stat or notebook hosts - https://phabricator.wikimedia.org/T220698 (10elukey) @RobH my bad! Thanks a lot for the patience @Cmjohnson, I'll add more pictures to the blog post when it will be allowed to be publ... [07:21:24] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10elukey) >>! In T237605#5691488, @Anomie wrote: >>>! In T237605#5690652, @elukey wrote: >> I will not comment the friendly introduction to ask for the credentials. > > If that was meant to criti... [07:25:00] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10elukey) >>! 
In T237605#5691161, @EBernhardson wrote: > I'd like to request two credentials for hadoop access: > > ebernhardson: typical personal usage, ebernhardson@wikimedia.org ` elukey@krb... [07:26:52] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10elukey) >>! In T237605#5691149, @nettrom_WMF wrote: > Like so many others, I'd like to request my credentials for access on stat100x and notebook100x. My username is `nettrom`. I'll keep an eye... [07:29:26] 10Analytics, 10Operations, 10ops-eqiad: analytics1057's BBU is faulty - https://phabricator.wikimedia.org/T239045 (10elukey) >>! In T239045#5691484, @Jclark-ctr wrote: > @elukey No spare bbu around @Jclark-ctr hi! In https://phabricator.wikimedia.org/T233080 analytics1032 needs to be decommed, maybe we can... [08:46:13] 10Analytics, 10Discovery-Search (Current work): "Wikidata Query Service Updater" should have 'bot' in the user agent to indicate is a tool - https://phabricator.wikimedia.org/T238106 (10Gehel) User agent is set in [[ https://github.com/wikimedia/wikidata-query-rdf/blob/master/tools/src/main/java/org/wikidata/q... [08:50:55] 10Analytics, 10User-Elukey: Redesign architecture of irc-recentchanges on top of Kafka - https://phabricator.wikimedia.org/T234234 (10elukey) >>! In T234234#5690523, @Ottomata wrote: > Qs: > > - Do we need to support full IRC spec? I see in [[ https://gist.github.com/paravoid/3419e0b5ae1f24b6ea21906a142f2f47... [08:53:31] 10Analytics, 10User-Elukey: Redesign architecture of irc-recentchanges on top of Kafka - https://phabricator.wikimedia.org/T234234 (10elukey) >>! In T234234#5690124, @Ottomata wrote: > > If I were to do this, I would try to do it in NodeJS with service-runner before Python, but I don't think that is a require... 
[09:12:31] * elukey goes to the doctor, be back later (available on the phone if needed) [09:52:31] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10dcausse) I'd need access as well, username: `dcausse` Thanks! [09:54:34] 10Analytics, 10Product-Analytics, 10SDC General, 10Wikidata: Data about how many file pages on Commons contain at least one structured data element - https://phabricator.wikimedia.org/T238878 (10Addshore) >>! In T238878#5684895, @Nuria wrote: > @Addshore : disclaimer: I know next to nothing about this but... [10:39:05] back! [10:40:28] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10elukey) >>! In T237605#5692512, @dcausse wrote: > I'd need access as well, username: `dcausse` > Thanks! ` elukey@krb1001:~$ sudo manage_principals.py create dcausse --email_address=dcausse@wik... [10:41:48] elukey: thanks! ^ [10:48:42] dcausse: np! Let me know if there are any doubts etc.. [10:51:49] 10Analytics, 10Operations, 10ops-eqiad: Degraded RAID on dbstore1003 - https://phabricator.wikimedia.org/T239217 (10Marostegui) [10:53:51] lovely [11:29:23] PROBLEM - Check the last execution of reportupdater-reference-previews on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:29:52] this is due to an issue with mediawiki etc.. [11:30:22] ah no wait stat1007 kaput [11:30:23] sigh [11:31:15] isaacj: o/ [11:31:22] are you there by any chance? 
[11:31:38] your script on stat1007 seems consuming a ton of resources [11:31:51] PROBLEM - Check the last execution of reportupdater-wmcs on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:32:47] PROBLEM - Check the last execution of archive-maxmind-geoip-database on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:34:21] PROBLEM - Check the last execution of refinery-import-siteinfo-dumps on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:34:27] PROBLEM - Check the last execution of reportupdater-pingback on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:35:05] PROBLEM - Check the last execution of reportupdater-browser on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:35:37] PROBLEM - Check the last execution of reportupdater-published_cx2_translations on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:36:28] rebooting stat1007.. [11:36:37] ok we definitely need user limit [11:36:45] I'll start working on it today [11:40:05] RECOVERY - Check the last execution of reportupdater-reference-previews on stat1007 is OK: OK: Status of the systemd unit reportupdater-reference-previews https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:40:29] lunch! 
[11:42:31] RECOVERY - Check the last execution of reportupdater-wmcs on stat1007 is OK: OK: Status of the systemd unit reportupdater-wmcs https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:43:29] RECOVERY - Check the last execution of archive-maxmind-geoip-database on stat1007 is OK: OK: Status of the systemd unit archive-maxmind-geoip-database https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:45:03] RECOVERY - Check the last execution of refinery-import-siteinfo-dumps on stat1007 is OK: OK: Status of the systemd unit refinery-import-siteinfo-dumps https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:45:09] RECOVERY - Check the last execution of reportupdater-pingback on stat1007 is OK: OK: Status of the systemd unit reportupdater-pingback https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:45:47] RECOVERY - Check the last execution of reportupdater-browser on stat1007 is OK: OK: Status of the systemd unit reportupdater-browser https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:46:17] RECOVERY - Check the last execution of reportupdater-published_cx2_translations on stat1007 is OK: OK: Status of the systemd unit reportupdater-published_cx2_translations https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [11:46:46] 10Analytics, 10Operations, 10ops-eqiad: Degraded RAID on dbstore1003 - https://phabricator.wikimedia.org/T239217 (10jbond) p:05Triage→03Normal [12:06:12] :( [12:06:17] Hi elukey [12:06:44] joal: I have a question regarding using graphframes in a pyspark notebook on swap? [12:07:12] Hi mgerlach - I guess you have issues getting the package onto the workers? 
[12:07:17] essentially how to do it ; ) [12:08:08] I saw this for getting a shell (but no notebook) https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Spark#pyspark_and_external_packages [12:08:38] and another piece to set up my own pyspark session https://wikitech.wikimedia.org/wiki/SWAP#Launching_as_SparkSession_in_a_Python_Notebook [12:09:03] I couldn't install findspark though [12:10:37] mgerlach: I'd go for this one: https://wikitech.wikimedia.org/wiki/SWAP#Permanent_custom_Spark_Notebook_kernels [12:11:15] mgerlach: Defining a new kernel on notebooks, based on an existing one (copy/paste), adding the package download extra command [12:11:41] joal: thanks; will try that one [12:11:58] mgerlach: please let me know if this does the trick, I have not done it myself [12:28:44] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Add bot and change_type dimensions to geoeditors-daily - https://phabricator.wikimedia.org/T238855 (10JAllemandou) [12:41:31] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10dr0ptp4kt) Access requested for me - `dr0ptp4kt` [12:44:38] joal: I edited the kernel.json; I can use the new kernel but cannot import graphframes [12:45:05] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10elukey) >>! In T237605#5693131, @dr0ptp4kt wrote: > Access requested for me - `dr0ptp4kt` ` elukey@krb1001:~$ sudo manage_principals.py create dr0ptp4kt --email_address=abaso@wikimedia.org Prin...
[12:45:23] "PYSPARK_SUBMIT_ARGS": "--conf spark.dynamicAllocation.maxExecutors=128 --master yarn pyspark-shell --packages graphframes:graphframes:0.6.0-spark2.3-s_2.11 --conf 'spark.driver.extraJavaOptions=-Dhttp.proxyHost=webproxy.eqiad.wmnet -Dhttp.proxyPort=8080 -Dhttps.proxyHost=webproxy.eqiad.wmnet -Dhttps.proxyPort=8080'" [12:45:55] I thought this would do the trick since it works with the shell https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Spark#pyspark_and_external_packages [12:46:11] I thought so as well mgerlach :( [12:46:37] what are external packages? [12:46:48] elukey: packages you download on the fly [12:46:59] from where? [12:47:17] joal: --^ [12:47:29] https://spark-packages.org/ [12:47:37] elukey: --^ [12:48:08] so the spark job should download stuff from the internet to hadoop before starting? [12:48:22] correct [12:48:59] ok so it cannot work if my understanding is correct, spark doesn't use any proxy to reach the internet [12:49:49] so I'd expect it to be blocked by the great analytics firewall :) [12:50:12] elukey: that's why we provide proxying options :) [12:50:46] ah ok just seen it, sorry [12:50:54] I don't particularly love it though [12:51:12] elukey: neither do I, but it's complicated to have better solutions :( [12:51:13] we are downloading stuff from the internet and making it run on our cluster [12:51:37] elukey: this is a long-running discussion with Andrew [12:52:18] it is kinda like what we have with pypi, but in the whole cluster (potentially) [12:53:02] mgerlach: what error do you get? [12:53:23] elukey: no error [12:53:36] elukey: indeed - now the question is how we can provide usability without putting too much risk - This is an open question [12:53:41] just the package is not there [12:56:31] not sure I am doing the right thing though [12:57:54] When you say 'not there' - you get an import error, right? [12:58:09] mgerlach: --^ [12:58:12] yes.
when using the notebook [12:58:49] more specifically, 'from graphframes import *' yields 'ImportError ...' [13:00:19] 10Analytics, 10Analytics-Kanban, 10User-Elukey: Repurpose db1108 as generic Analytics db replica - https://phabricator.wikimedia.org/T234826 (10elukey) I came up quickly with a new profile for multiinstance, but I have a doubt about the following: ` $basedir = '/opt/wmf-mariadb101' class { 'mariadb:... [13:03:38] mgerlach: I got it working with the example you pasted (not the extra kernel one) [13:04:07] mgerlach: you said you don't have findspark is that it? [13:04:26] needs to be installed? [13:04:34] correct :) using proxy [13:05:17] mgerlach: then minimal changes for the code made it work for me [13:05:19] mgerlach: https://pypi.org/project/findspark/ [13:05:51] how does your kernel.json look like? [13:06:01] mgerlach: I used a regular python one [13:06:32] mgerlach: https://gist.github.com/jobar/2c23eb96ed44e5950204597c02963179 [13:09:57] joal: indeed that works : ) thank you [13:10:45] np mgerlach :) [13:26:31] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10awight) Requesting credentials for myself, my user info is: ` sudo manage_principals.py create awight --email_address=adam.wight@wikimedia.de ` [13:28:12] awight: asking credentials like a boss --^ [13:28:19] command is perfect [13:28:57] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10elukey) >>! In T237605#5693281, @awight wrote: > Requesting credentials for myself, my user info is: > ` > sudo manage_principals.py create awight --email_address=adam.wight@wikimedia.de > ` `... [13:28:57] hehe, I'm like a human template engine ;-) [13:38:26] 10Analytics, 10Analytics-Kanban, 10User-Elukey: Repurpose db1108 as generic Analytics db replica - https://phabricator.wikimedia.org/T234826 (10Marostegui) >>! 
In T234826#5693200, @elukey wrote: > What could be the best approach to make the current mariadb config be part of an instance? Keep `misc_multiinsta... [13:44:42] 10Analytics, 10User-Elukey: Redesign architecture of irc-recentchanges on top of Kafka - https://phabricator.wikimedia.org/T234234 (10faidon) >>! In T234234#5690523, @Ottomata wrote: > - Do we need to support full IRC spec? I see in [[ https://gist.github.com/paravoid/3419e0b5ae1f24b6ea21906a142f2f47 | Faidon... [13:52:57] 10Analytics, 10Analytics-Kanban, 10Operations, 10SRE-Access-Requests: Add system user analytics-privatedata to the anaytics-privatedata-users group - https://phabricator.wikimedia.org/T238306 (10elukey) [13:53:26] 10Analytics: Doubts and questions about Kerberos and Hadoop - https://phabricator.wikimedia.org/T238560 (10elukey) [13:57:53] (03PS2) 10Joal: Add bot and non-edit actions to geoeditors-daily [analytics/refinery] - 10https://gerrit.wikimedia.org/r/552510 (https://phabricator.wikimedia.org/T238855) [14:40:10] heya teamm [14:40:30] joal, you recently upgraded the guaba lib in refinery no? [14:40:33] guava [14:40:46] mforns: in refinery-cassandra only [14:40:59] oh ok, you think we can downgrade it for core? [14:41:06] to v15 [14:41:33] or 16 [14:41:42] I'm having this exact problem: https://stackoverflow.com/questions/36427291/illegalaccesserror-to-guavas-stopwatch-from-org-apache-hadoop-mapreduce-lib-inp [14:42:12] mforns: I don't think it;s a good idea, our hadoop relies on 16 [14:42:12] why do you need 15? [14:42:36] well, in the link that I pasted, it says 15 works [14:42:48] hm - can we try 16? [14:42:49] 16 might work as well, though [14:42:52] yes! [14:43:06] 17 fails, they say [14:43:41] joal, ok, will try 16 then and let you know if it avoids the error! [14:44:10] k mforns - I wonder why we have not seen this error before - Are you using specific spark features? 
[14:44:27] joal, I think it might be because of the very small size of written files [14:44:53] bizarre mforns [14:44:53] it looks like the failing call (StopWatch) might be related to it [14:45:50] joal: o/ [14:45:54] do you have a min? [14:46:02] ah no sorry you guys are already discussing [14:46:03] later :) [14:46:09] elukey, no no go on! [14:46:17] we're kinda done [14:46:18] elukey: I can do 2 conversations :) batcave ? [14:46:56] joal: did you just fork one joseph for me? [14:47:04] :D [14:47:10] batcave yes! [14:47:28] :) [14:48:04] mforns: first time I see this one :( [14:48:43] aha [15:02:02] Gone for kifs [15:02:13] s/f/d/g [15:14:19] 10Analytics, 10Analytics-Kanban, 10Operations, 10Product-Analytics, and 2 others: notebook/stat server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10elukey) [15:35:06] joal: I raised limit to 60/90% of memory on the host [15:35:20] we can always refine and lower down [15:35:32] but I hope should be enough to stop a lot of failures [15:41:50] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10TJones) I'd also like to request access (tjones@wikimedia.org). Thanks! [15:58:56] 10Analytics: Doubts and questions about Kerberos and Hadoop - https://phabricator.wikimedia.org/T238560 (10mepps) @Nuria @elukey Thank you so much for moving this back! I appreciate all your questions and helpful explanations about the implications of this change. [16:00:46] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10elukey) >>! In T237605#5693788, @TJones wrote: > I'd also like to request access (tjones@wikimedia.org). Thanks! ` elukey@krb1001:~$ sudo manage_principals.py create tjones --email_address=tjon... 
[16:05:30] 10Analytics, 10Analytics-Kanban, 10Operations, 10Product-Analytics, and 2 others: notebook/stat server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10elukey) Applied the change to stat1004 and executed a little test: ` a = [] while True: print(len(a)) a.append(' ' * 10**6)... [16:05:47] joal: --^ [16:05:51] works perfectlu [16:05:57] *perfectly [16:07:49] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10EBernhardson) >> discovery-analytics: We run analytics jobs (submit to oozie, etc) from this user. The upcoming airflow installation will also submit our jobs as this user. I suppose send this t... [16:15:47] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10elukey) >>! In T237605#5693906, @EBernhardson wrote: >>> discovery-analytics: We run analytics jobs (submit to oozie, etc) from this user. The upcoming airflow installation will also submit our... [16:35:30] 10Analytics, 10Better Use Of Data, 10Event-Platform, 10Product-Infrastructure-Team-Backlog, 10Epic: Event Platform Client Libraries - https://phabricator.wikimedia.org/T228175 (10jlinehan) [16:41:35] 10Analytics, 10Analytics-Kanban, 10Operations, 10Product-Analytics, and 2 others: notebook/stat server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10elukey) I think we are on a good track, next steps: 1) add support for Buster (for stat1005) - needs Cloud team's review/approval:... 
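The memory test quoted in the T212824 comment above is truncated in the log. A bounded variant of the same idea might look like the sketch below. It is an illustration only: `allocate_chunks`, `chunk_bytes` and `max_chunks` are hypothetical names, not taken from the original test, and the cap is there so the loop ends even on a host with no memory limit. Whether a process under a systemd user limit sees a `MemoryError` or is killed or throttled depends on how the limit is enforced, and the sketch assumes neither.

```python
def allocate_chunks(chunk_bytes=10**6, max_chunks=50):
    """Allocate fixed-size chunks until a cap or a MemoryError is reached.

    Returns the number of chunks that were successfully allocated.
    """
    chunks = []
    try:
        while len(chunks) < max_chunks:
            # bytearray(n) allocates and zero-fills n bytes
            chunks.append(bytearray(chunk_bytes))
    except MemoryError:
        # Raised when the interpreter's allocator cannot satisfy a request
        pass
    return len(chunks)


if __name__ == "__main__":
    print(allocate_chunks())
```

Raising `max_chunks` step by step while watching the unit's memory accounting is a gentler way to probe a limit than an unbounded loop.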
[17:01:50] nuria: standuuppp [17:19:19] !log add systemd user limits to stat1004 [17:19:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:30:33] 10Analytics, 10Editing-team, 10Performance-Team: VE edit data stopped at ~2019-11-24Z01:25 - https://phabricator.wikimedia.org/T239121 (10Jdforrester-WMF) The data started back again at 2019-11-26Z14:43… [17:31:14] 10Analytics, 10Editing-team, 10Performance-Team: VE edit data stopped at ~2019-11-24Z01:25 - https://phabricator.wikimedia.org/T239121 (10Jdforrester-WMF) Potentially kicked into life by https://tools.wmflabs.org/sal/log/AW6oKay3vrJzePItdZSN ? [17:38:16] nuria: the page-links-change EventStream is broken (https://phabricator.wikimedia.org/T239220) and unfortunately Andrew Otto is on vacation. Is there anyone else who could look at this? It's pretty important to the Internet Archive as they use that stream to archive all our reference links. [17:41:19] kaldari: I think that we and CPT can take a look at it :) [17:41:31] awesome! [17:43:37] kaldari: it seems working for me though [17:43:42] oh really? [17:44:29] so it is [17:44:31] ah ok I think I know what's happening [17:44:37] https://grafana.wikimedia.org/d/000000336/eventstreams?orgId=1&refresh=1m&from=now-30d&to=now [17:44:50] there might be some overloading [17:45:18] the last jump seems around the 18th/19th [17:45:40] will try to follow up! 
[17:53:39] 10Analytics, 10Product-Analytics, 10Growth-Team (Current Sprint): Growth: implement wider data purge window - https://phabricator.wikimedia.org/T237124 (10MMiller_WMF) [17:54:04] ebernhardson: o/ [17:54:12] I left some comments to the last airflow patches [17:54:26] but those are minor ones, we should be close to have it working [18:05:05] 10Analytics: Import slots table through scoop - https://phabricator.wikimedia.org/T239127 (10Nuria) a:03JAllemandou [18:08:38] 10Analytics, 10Operations, 10ops-eqiad: Degraded RAID on dbstore1003 - https://phabricator.wikimedia.org/T239217 (10Cmjohnson) a:03Jclark-ctr I created a self-dispatch ticket. You have successfully submitted request SR1004377941. Assigning to @Jclark-ctr since I will be out of the area. [18:09:50] 10Analytics, 10Analytics-Kanban: Import slots/slots_roles and wikibase.wbc_entity_usage through scoop - https://phabricator.wikimedia.org/T239127 (10Nuria) [18:15:29] elukey: hey, i'll check it out. Also having an odd issue where sudo doesn't want to let me run the command as airflow user, not clear why yet. Probably some order of the sudoers import or something [18:16:26] also realized i probably have to change it, sudo will strip the AIRFLOW_HOME env variable, meaning it wont read /etc/airflow. Probably have to change the sudo command to use /usr/bin/env, or write a simple wrapper script and allow sudo to call wrapper [18:17:39] 10Analytics, 10User-Elukey: Redesign architecture of irc-recentchanges on top of Kafka - https://phabricator.wikimedia.org/T234234 (10Nuria) I think we will have availability to work on porting this next quarter. Regarding ownership Analytics is not staffed to own a critical feed like this one that could be co... 
[18:30:54] ebernhardson: the current sudoers rules allow you to sudo (as root) and systemctl restart or service restart [18:31:07] 10Analytics, 10Discovery-Search (Current work): "Wikidata Query Service Updater" should have 'bot' in the user agent to indicate is a tool - https://phabricator.wikimedia.org/T238106 (10Mstyles) a:03Mstyles Yep, I'll take a look [18:31:09] and you can only impersonate airflow with one of the rules [18:31:47] where is AIRFLOW_HOME required/defined? [18:31:56] we can add it to the systemd units for example [18:34:31] 10Analytics, 10DBA: add wbc_entity_usage to all labs projects? - https://phabricator.wikimedia.org/T239261 (10Nuria) [18:35:38] (going afk but will read later :) [18:35:40] * elukey off! [18:39:51] elukey: it's in the systemd units, the place that doesn't get it is this sudo rule: (airflow) NOPASSWD: /srv/deployment/search/airflow/venv/bin/airflow * [18:40:36] relatedly, this should work but asks for a password: sudo -u airflow /srv/deployment/search/airflow/venv/b [18:40:39] in/airflow -h [18:42:01] 10Analytics, 10Data-Services, 10cloud-services-team: add wbc_entity_usage to all labs projects? - https://phabricator.wikimedia.org/T239261 (10jcrespo) [18:44:51] (03PS3) 10Joal: Add bot and non-edit actions to geoeditors-daily [analytics/refinery] - 10https://gerrit.wikimedia.org/r/552510 (https://phabricator.wikimedia.org/T238855) [18:45:40] 10Analytics, 10Data-Services, 10cloud-services-team: add wbc_entity_usage to all labs projects? - https://phabricator.wikimedia.org/T239261 (10jcrespo) Although checking more closely, this should be closed as invalid- those wikis doesn't have wikidata enabled, so there is no such tables. Not all wikis have... 
[18:57:59] (03PS1) 10Joal: [WIP] Add slot, slot_roles and wbc_entity_usage to sqoop [analytics/refinery] - 10https://gerrit.wikimedia.org/r/553160 (https://phabricator.wikimedia.org/T239127) [18:59:04] ebernhardson: mmm /srv/deployment/search/airflow/venv/bin/airflow seems not there no? [19:00:15] now in theory we should have systemd units that executes as the airflow user, not us directly calling /srv/deployment/search/airflow/venv/bin/airflow as airflow no? [19:00:18] what is the use case? [19:01:00] elukey: running that is for doing this like issuing backfills [19:02:08] elukey: also, i hadn't noticed the airflow file was missing. Indeed this venv doesn't seem to have installed the dependencies, i probably have some deployment problems to fix [19:02:38] ack then [19:09:55] nuria: sqoop launched from a screen on 1004 (my user0 [19:26:26] joal: k [19:32:55] 10Analytics, 10Editing-team, 10Performance-Team, 10observability: VE edit data stopped at ~2019-11-24Z01:25 - https://phabricator.wikimedia.org/T239121 (10CDanis) [19:46:10] joal, nuria, the spark problem with guava is not depending on worker node... It affects all kind of nodes, plus I saw an example of a node that fails once and succeeds another time... [19:48:09] another curious fact, yesterday every third job would generate some retry, but today none of the ~40 jobs I ran has generated any retry, regardless of lib version [19:55:28] 10Analytics, 10Editing-team, 10Performance-Team, 10observability: VE edit data stopped at ~2019-11-24Z01:25 - https://phabricator.wikimedia.org/T239121 (10Krinkle) >>! In T239121#5694251, @Jdforrester-WMF wrote: > Potentially kicked into life by https://tools.wmflabs.org/sal/log/AW6oKay3vrJzePItdZSN ? Wel... 
[19:56:23] 10Analytics, 10Editing-team, 10Performance-Team, 10observability: VE edit data stopped at ~2019-11-24Z01:25 - https://phabricator.wikimedia.org/T239121 (10Jdforrester-WMF) Yeah, I was thinking about maybe a key server getting stuck somehow, and getting unlocked by the next deploy, but it seems very unlikely. [19:59:09] 10Analytics, 10Editing-team, 10Performance-Team, 10observability: VE edit data stopped at ~2019-11-24Z01:25 - https://phabricator.wikimedia.org/T239121 (10Krinkle) Yeah, bandwidth congestion of some kind somehow selectively causing these requests to get lost on the edge, or maybe later on in the `webperf10... [20:02:20] 10Analytics, 10Cloud-Services, 10Discovery, 10Discovery-Search, 10Elasticsearch: Create partitioned CirrusSearchElasticaWrite topic - https://phabricator.wikimedia.org/T239135 (10Pchelolo) 05Open→03Resolved Done. [20:09:25] nuria, also, what do you think about having a country whitelist (or blacklist) for the traffic_by_country, because there are many countries that are too small (in terms of population) [20:57:53] 10Analytics, 10Fundraising-Backlog, 10Fundraising Sprint A Wrinkle in Timezones, 10Fundraising Sprint X 2019: Identify source of discrepancy between HUE query in Count of event.impression and druid queries via turnilo/superset - https://phabricator.wikimedia.org/T204396 (10DStrine) [21:12:08] mforns: man that problem with guava is java hell at its best [21:12:17] mforns: whitelist +1 [21:12:33] mforns: i would actually make that whitelist quite small at first [21:12:40] yea.. [21:16:18] 10Analytics, 10Editing-team, 10Performance-Team, 10observability: VE edit data stopped at ~2019-11-24Z01:25 - https://phabricator.wikimedia.org/T239121 (10Nuria) @Jdforrester-WMF Is this data published from parsoid servers or browser clients? [21:23:17] (03CR) 10Nuria: [C: 03+1] "Virtual +2 if we have tested code works as written. 
I think is confusing that the monthly data would not have any bots or just 1 type of e" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/552510 (https://phabricator.wikimedia.org/T238855) (owner: 10Joal) [21:41:50] (03CR) 10Joal: "I think we should even go for removing the 'geoeditors' bit of it if we rename: editors_details_daily ?" (032 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/552510 (https://phabricator.wikimedia.org/T238855) (owner: 10Joal) [21:44:01] (03CR) 10Nuria: [WIP] Add slot, slot_roles and wbc_entity_usage to sqoop (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/553160 (https://phabricator.wikimedia.org/T239127) (owner: 10Joal) [21:45:58] (03CR) 10Nuria: [C: 03+1] "How about renaming table to editors_daily?" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/552510 (https://phabricator.wikimedia.org/T238855) (owner: 10Joal) [21:46:25] joal: scoops are executed in an-coord1003? [21:46:35] nope nuria, stat1004 [21:48:28] joal: i dreamed that then [21:48:57] Ah nuria! production ones you mean? an-coord1001 - The one I run (manual one), from stat1004 [21:49:02] joal: once scoop is done what i am going to do is getting a jupyter notebook together and calculate the distict page_id for just the top 20 wikis [21:49:03] sorry I didn't understand the question [21:49:13] joal: ahahahah [21:50:17] 10Analytics, 10Data-Services, 10cloud-services-team: add wbc_entity_usage to all labs projects? - https://phabricator.wikimedia.org/T239261 (10Nuria) Thanks for the fast response, resolving [21:50:20] 10Analytics, 10Data-Services, 10cloud-services-team: add wbc_entity_usage to all labs projects? 
- https://phabricator.wikimedia.org/T239261 (10Nuria) 05Open→03Declined [21:51:55] 10Analytics: Label high volume bot spikes in pageview data as automated traffic - https://phabricator.wikimedia.org/T238357 (10Isaac) Hey @Nuria -- I had been doing some of my own research on this as part of some background work around re-use of Wikimedia content. I wanted to throw in a few thoughts in case they... [21:57:20] mforns: I might have an explanation about the spark failures: you use oozie_spark_lib=spark2.3.1 - We have upgraded to 2.4.4: oozie_spark_lib=spark-2.4.4 [21:57:34] O.o! [21:57:43] mforns: Notice the '-' between spark and the version number [21:57:51] yea [21:58:33] joal, OK will restart the job with that [21:59:12] if this is it, I owe you a good beer [21:59:15] :D [21:59:49] (03CR) 10Joal: "Works for me!: /wmf/data/wmf/mediawiki_private/editors_daily - Let' wait for milimetric to confirm :)" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/552510 (https://phabricator.wikimedia.org/T238855) (owner: 10Joal) [22:01:46] (03CR) 10Joal: "Ok for on the fly filtering - Should we parse "https://noc.wikimedia.org/conf/highlight.php?file=dblists/wikidataclient.dblist" of somethi" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/553160 (https://phabricator.wikimedia.org/T239127) (owner: 10Joal) [22:04:30] musikanimal: ping me if you are there [22:04:47] mforns: oooohhhhhhhhh [22:04:52] nuria: hola! [22:05:07] musikanimal: do you have a user for notebooks? [22:05:21] musikanimal: as in , can you log in into notebook1003? [22:05:23] I don't know what notebooks is, so I guess not, hehe [22:05:42] 10Analytics, 10Analytics-Kanban, 10Operations, 10Product-Analytics, and 2 others: notebook/stat server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10JAllemandou) @elukey: We should apply the same treatment for stat1007 :) [22:05:52] musikanimal: let's try this, can you ssh to notebook1003.eqiad.wmnet? [22:06:18] yes! 
I'm in :) [22:06:22] musikanimal: do try and let me know cause I made a tool there for you to troubleshoot pageview spikes and I can show you how to use it [22:06:31] musikanimal: ok what is your home dir? [22:06:44] /home/musikanimal [22:07:17] musikanimal: ok, if you have 10 minutes, exit and make a tunnel like: [22:07:26] musikanimal: ssh -N notebook1003.eqiad.wmnet -L 8000:127.0.0.1:8000 [22:07:35] musikanimal: and go to http://localhost:8000 [22:08:16] works! [22:08:32] what are my credentials? LDAP? [22:08:37] ya [22:09:13] musikanimal: this is all documented here BTW: https://wikitech.wikimedia.org/wiki/PAWS [22:09:43] oh PAWS, I'm a tiny bit familiar [22:10:09] musikanimal: once you can see your directory, ssh into the machine again (in another terminal) and copy from my home dir: /home/nuria/Detailed_Pageview_Report.ipynb into your /home/musikanimal dir [22:10:33] musikanimal: let's jump in a hangout and I'll show you the rest cause it is easier sharing screen [22:30:01] 10Analytics, 10User-Elukey: Redesign architecture of irc-recentchanges on top of Kafka - https://phabricator.wikimedia.org/T234234 (10faidon) I think conceptually this belongs together with EventStreams, as a product offering and, by extension, to the same owners and maintainers. This is just another (non-HTTP... [22:41:19] 10Analytics, 10User-Elukey: Redesign architecture of irc-recentchanges on top of Kafka - https://phabricator.wikimedia.org/T234234 (10Nuria) I agree it is a similar product but its usage seems quite different, it seems to support quite a few critical bots. EventStreams availability and support are those of a t... [22:42:38] 10Analytics, 10User-Elukey: Redesign architecture of irc-recentchanges on top of Kafka - https://phabricator.wikimedia.org/T234234 (10Nuria) We can talk more offline as needed.
[23:05:58] (03CR) 10Nuria: "I think we should have a blacklist (i.e., not implemented in wikimania2017) and we can have that blacklist for now in the file itself, if t" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/553160 (https://phabricator.wikimedia.org/T239127) (owner: 10Joal) [23:19:32] 10Analytics, 10CPT Initiatives (MCR), 10Multi-Content-Revisions (Tech Debt), 10Patch-For-Review, 10Schema-change: Once MCR is deployed, drop the rev_text_id, rev_content_model, and rev_content_format fields from the revision table - https://phabricator.wikimedia.org/T184615 (10CCicalese_WMF) [23:29:32] 10Analytics: Label high volume bot spikes in pageview data as automated traffic - https://phabricator.wikimedia.org/T238357 (10Nuria) @Isaac weblight data will be excluded from the classification entirely, the way it gets to us it does not have any client IP that we can use. This is true for any other proxy as... [23:39:32] (03PS2) 10Nuria: Create table to hold calculations of session features [analytics/refinery] - 10https://gerrit.wikimedia.org/r/552943 (https://phabricator.wikimedia.org/T238360) [23:44:09] (03PS3) 10Nuria: Create table to hold calculations of session features [analytics/refinery] - 10https://gerrit.wikimedia.org/r/552943 (https://phabricator.wikimedia.org/T238360)