[00:01:03] 10Analytics, 10Analytics-Kanban, 10ArticlePlaceholder, 10Wikidata, and 4 others: ArticlePlaceholder dashboard stopped tracking page views - https://phabricator.wikimedia.org/T236895 (10Nuria) 05Open→03Resolved [02:21:35] (03PS1) 10Milimetric: [WIP] Detect pageviews as requested by KaiOS [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/580161 (https://phabricator.wikimedia.org/T244547) [03:17:53] (03CR) 10Nuria: "Thanks for cleaning up, please add unit tests for the functions you refactored." (032 comments) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/580161 (https://phabricator.wikimedia.org/T244547) (owner: 10Milimetric) [07:13:03] (03CR) 10Elukey: [C: 03+1] "LGTM, let me know when you want to deploy this.." [analytics/jupyterhub/deploy] - 10https://gerrit.wikimedia.org/r/580083 (owner: 10Joal) [07:14:29] bonjour [07:23:50] so all the kernels are deployed under /usr/local/share/jupyter/kernels. [07:54:58] 10Analytics, 10Analytics-Kanban, 10ArticlePlaceholder, 10Wikidata, and 4 others: ArticlePlaceholder dashboard stopped tracking page views - https://phabricator.wikimedia.org/T236895 (10Lydia_Pintscher) Thank you :) [09:05:52] (03PS2) 10Fdans: Remove graph trend from Wikistats detail page [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/577531 (https://phabricator.wikimedia.org/T212032) [09:05:54] (03PS3) 10Fdans: Remove graph trend from Wikistats detail page [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/577531 (https://phabricator.wikimedia.org/T212032) [09:10:20] 10Analytics, 10Operations, 10User-Elukey: Refactor Analytics POSIX groups in puppet to improve maintainability - https://phabricator.wikimedia.org/T246578 (10elukey) [09:10:39] 10Analytics, 10Analytics-Kanban, 10Operations, 10User-Elukey: Refactor Analytics POSIX groups in puppet to improve maintainability - https://phabricator.wikimedia.org/T246578 (10elukey) [09:53:29] need to go out to buy some stuff, it shouldn't take much but there 
could be a big line [10:18:59] 10Analytics, 10Analytics-Kanban, 10Release Pipeline, 10Patch-For-Review, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (10akosiaris) >>! In T238658#5973616, @colewhite wrote: >>>! In T238658#5972237, @akosiaris wrote: >> Sure it might very well be... [10:52:39] (03PS2) 10Fdans: Show correct 12 month period in the dashboard [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/577542 (https://phabricator.wikimedia.org/T238894) [10:53:07] (03CR) 10Fdans: Show correct 12 month period in the dashboard (031 comment) [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/577542 (https://phabricator.wikimedia.org/T238894) (owner: 10Fdans) [11:31:12] https://pypi.org/project/PyHive/0.6.2/ [11:31:15] \o/ [12:06:34] ok so I'll rebuild Superset later on and deploy again to test [12:06:40] if all is good, I'll deploy to prod [12:06:47] and configure superset to use presto [12:15:55] * elukey lunch! [13:03:09] 10Analytics, 10Analytics-SWAP, 10Product-Analytics: pip not accessible in new SWAP virtual environments - https://phabricator.wikimedia.org/T247752 (10nshahquinn-wmf) >>! In T247752#5971999, @nshahquinn-wmf wrote: > I remember @SNowick_WMF also experienced this when she first started using SWAP, and she may... [13:43:14] elukey: G1 helps but is not enough :) [13:52:21] elukey: I have managed to make the job suceeded by reducing parallelism [13:52:40] hi yall [13:53:52] Hi mforns [13:54:48] hi :] [14:05:18] joal: interesting! When did you run the job? [14:05:21] curious about the metrics [14:05:30] mforns: surprise for you [14:05:38] elukey: in the past few hours [14:05:48] current succesful one is finishing now [14:05:58] Like, as we type :) [14:06:00] mforns: check https://pypi.org/project/PyHive/ [14:06:15] I have seen that elukey - This is super great :) [14:08:16] joal: do you want to deploy the kernels? [14:09:46] sure elukey - let's do that! 
- from nuria's comment on https://gerrit.wikimedia.org/r/#/c/analytics/refinery/source/+/580077/, I think we're gonna need a new project or module for non-hadoop-necessary libs :) [14:10:41] joal: why do you want vegas in refinery-job? [14:11:25] ottomata: I'd like vegas in a project imported in notebooks - Since we import refinery-job and hive there for other reasons, why not there [14:11:37] But I get the point - Let's put vegas in a different place I don't mind [14:11:38] why not just put it in swap repo like we did with brunel? [14:11:42] the jars themselves? [14:11:54] feasible [14:12:57] Also, unfortunately Vegas last release was in 2017 - works well, but still, seems unmaintained [14:17:44] 10Analytics, 10Analytics-SWAP, 10Product-Analytics: pip not accessible in new SWAP virtual environments - https://phabricator.wikimedia.org/T247752 (10mpopov) I just have `export PATH=/home/bearloga/venv/bin:${PATH}` in my ~.bashrc file on all the machines. @elukey @Ottomata: perhaps that sort of thing can... [14:20:45] elukey, \o/ Commits on Feb 21, 2020 [14:20:45] @elukey [14:20:45] elukey [14:20:45] Fix CI errors reported (#312) … [14:20:55] coooooll [14:21:31] ottomata: morningggg [14:21:40] hellOoOO [14:21:42] I have a question for you when you are caffeinated [14:25:26] ah no I might have the answer [14:25:29] or a lead to it [14:25:33] self resolved [14:27:00] no I still need a brainbounce [14:27:59] 10Analytics, 10Analytics-SWAP, 10Product-Analytics: pip not accessible in new SWAP virtual environments - https://phabricator.wikimedia.org/T247752 (10SNowick_WMF) I use notebook1004, to fix this I added aliases in my .bash_profile and haven't had any issues since. alias py='/home/snowick/venv/bin/python3'... [14:29:58] ottomata: vegas jar is ~50M - is that ok for the git repo? 
[14:31:37] and actually there is no directly available java package, meaning we should create one [14:31:43] let's discuss that at standup [14:32:31] joal: ya it'll be ok [14:32:37] or we could use git fat :p [14:32:41] but naw [14:32:59] hm joal we could just check it in as an artifact in refinery [14:33:02] 10Analytics, 10Analytics-SWAP, 10Product-Analytics: pip not accessible in new SWAP virtual environments - https://phabricator.wikimedia.org/T247752 (10elukey) I think that Neil is right, it is probably something that I missed when upgrading SystemdSpawner on all the notebooks/stats. For example, on notebook1... [14:33:04] that would use git fat [14:33:12] as long as that is deployed to swap hosts [14:33:30] oh elukey hi [14:33:32] soorry yes [14:33:49] ( somotimes i see first pings but forget to check back for follow upafter I answer sorry) [14:33:51] now is good! [14:34:02] ottomata: we'd need a project to gather dependencies - there is no such "shaded" vegas version [14:34:03] ottomata: np, I am trying to find what I missed when deploying jupyterhub :) [14:34:10] joal: oh i see [14:34:32] ottomata: We can create a new refinery module (refinery-libs ?) [14:34:41] joal: i kinda doubt anyone will use it except for you and maybe me but not reallyl [14:34:50] agreed ottomata [14:34:54] maybe you can just make a custom kernel and do it for yourself? [14:35:01] that's what I did :) [14:35:04] oh [14:35:05] heheh [14:35:36] elukey: whatt's up? [14:36:00] ottomata: it is about https://phabricator.wikimedia.org/T247752 - only notebook1003 works [14:36:06] ah [14:36:08] and 1004? [14:36:11] nope [14:36:12] or not 1004? 
[14:36:13] oh [14:36:14] hm [14:36:27] but I recall one thing, that might be related, even if I don't find how [14:37:15] under /srv/jupyterhub on notebook1003 I see jupyter-venv and venv [14:37:21] meanwhile on 1003 only venv [14:37:37] this is due to me, since by mistake I removed jupyter-venv from there [14:37:41] because I thought it was not needed [14:37:52] (judging from the create_environment.sh etc..) [14:38:18] I always wanted to ask why it was there, but then I saw the default on create_environment.sh and thought it was a leftover [14:38:26] could it be related? [14:38:46] just want to triple check that there is no hidden thing that I don't see [14:39:06] I see c.SystemdSpawner.extra_paths = ['/home/{USERNAME}/venv/bin'] in jupyterhub_config.py [14:39:09] that should work on all [14:39:31] seems unrelated but ya I think you want to leave jupyter-venv (just from reading create_virtualenv.sh) [14:39:37] hm [14:39:52] hmm [14:40:08] but the systemd unit uses /srv/jupyterhub/venv [14:40:09] ? [14:40:24] ah but my readme says to create that [14:40:25] ok. [14:40:40] i think that discrepancy is my fault, probably a leftover from something [14:40:58] we should probably change the default to /srv/jupyterhub/venv [14:41:02] since that is what everything else refers to [14:41:06] but ya, seems unrelated [14:41:11] ok perfect [14:44:30] elukey: this seems to be an issue with the terminal kernel somehow [14:44:48] the venv is sourced correctly for the notebook server [14:44:56] and python kernels have the correct env [14:45:00] but the terminal one doesn't [14:45:23] 10Analytics, 10Analytics-Kanban, 10Product-Analytics: Definition of not rich tech content metrics for tunning session - https://phabricator.wikimedia.org/T247417 (10mpopov) Hi @Nuria, please add a description explaining what is being asked for here. Also, can you please specify if this is a request for the P...
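(Editor's sketch, not part of the conversation.) The check being discussed is whether the per-user venv directory named in the `c.SystemdSpawner.extra_paths` setting quoted above actually ends up in `PATH` for a given process. A minimal Python sketch of that check follows; the function names `expected_venv_bin` and `venv_on_path` are hypothetical, and it assumes the `/home/{USERNAME}/venv/bin` template from the chat and a Linux host where `PATH` entries are separated by `:`.

```python
import os


def expected_venv_bin(username):
    """Expand the /home/{USERNAME}/venv/bin template quoted in the chat."""
    return "/home/{USERNAME}/venv/bin".format(USERNAME=username)


def venv_on_path(username, path_value=None):
    """Return True if the user's venv bin directory is an entry of PATH.

    path_value defaults to the PATH of the current process; pass a string
    to check a value copied from a terminal or a notebook cell instead.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Assumption: Linux-style PATH, entries separated by ':'.
    entries = path_value.split(":")
    return expected_venv_bin(username) in entries
```

Run in a notebook cell and in a Jupyter terminal on the same host, a difference in the result would point at the terminal environment rather than at the spawner configuration.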
[14:45:46] but it is indeed a problem only on 1004 [14:46:30] and also on the stats I think [14:46:38] oh ya? [14:46:46] elukey: check a python notebook [14:46:49] i imagine it is fine [14:46:51] you can do [14:46:58] !echo $PATH [14:47:02] in a pythhon nb shell [14:47:18] 10Analytics, 10Analytics-Kanban, 10Product-Analytics: Technical contributors metrics definition - https://phabricator.wikimedia.org/T247419 (10mpopov) Hi @Nuria, can you please specify if this is a request for the Product Analytics team or if the team is tagged just so this task is on our radar? Also, any li... [14:50:08] ottomata: yep on a py3 kernel it is fine [14:50:27] i don't know where this terminal comes from :p [14:50:35] it isn't a kernelspec [14:52:45] it must be something that systemd spawner does [14:54:34] weird [14:54:35] elukey@stat1006:~$ echo $PATH [14:54:35] /usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games [14:54:42] this is from inside a terminal [14:55:48] elukey: yeah i dunno... very strange [14:55:53] i don't know what's up [14:55:57] https://github.com/jupyter/notebook/issues/2339 maybe related? [14:56:06] terminado is the notebook app that does this [14:56:14] i don't know why it doesn't get the venv sources [14:56:16] sourced [14:56:39] i gotta keep working on this memory leak though, sorry i couuldn't be more helpful :/ [14:56:56] oh no thanks a lot for the help, clarified a lot of doubts [14:56:58] :) [14:58:36] (03CR) 10Nuria: [C: 04-1] "I think we still need to remove the method from the graphmodel, correct? See my prior comment. It is not used on the dashboard." 
[analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/577531 (https://phabricator.wikimedia.org/T212032) (owner: 10Fdans) [14:59:22] 10Analytics, 10Analytics-Kanban, 10Product-Analytics: Definition of not rich tech content metrics for tunning session - https://phabricator.wikimedia.org/T247417 (10Nuria) This is part of the tree of tasks we are working together with @jwang [14:59:50] 10Analytics, 10Analytics-SWAP, 10Product-Analytics: pip not accessible in new SWAP virtual environments - https://phabricator.wikimedia.org/T247752 (10elukey) Ok so the $PATH is correctly picked up in kernels (I tested a py3 one) but not from the Terminal. Not sure what different on notebook1003, it has been... [15:02:31] sorry nuria, I'm not sure what I did wrong twice to not include the graphmodel diff, will make sure to include it now [15:02:41] fdans: no worries [15:16:51] when I tested my Oozie job, it errored on the action that called on the mark_directory_done workflow in the process of creating the done_file in hdfs with the error `FS014: Permission denied: user=lexnasser, access=WRITE, inode="/wmf/data/wmf/mediawiki_private":analytics:analytics-privatedata-users:drwxr-x`. [15:16:57] is this something do with kerberos or my user permissions (as opposed to using the root user) or the config files? [15:18:07] hm lexnasser is sounds like you ran the job as yourself and it was trying to write in the /wmf/data/wmf/mediawiki_private directory? [15:18:11] not sure if it should do that or not [15:18:14] sounds like not? [15:24:50] lexnasser: ottomata is right, the testing job should write to your dir under /home/lexnasser when you run it you need to override the paths such it uses that dir, let me give you an example [15:25:10] ottomata: yeah, don't intend for it to write into there. I don't reference that location anywhere in the mark_directory_done workflow, though. Is there another place in the config I should look at? 
[15:27:10] lexnasser: are you testing these code https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/576618/ [15:27:29] nuria: yes, a slight modification, but functionally the same [15:28:16] 10Analytics, 10Analytics-SWAP, 10Product-Analytics: pip not accessible in new SWAP virtual environments - https://phabricator.wikimedia.org/T247752 (10elukey) For some mysterious reasons, I can now see PATH properly populated in notebook1004: ` elukey@notebook1004:~$ echo $PATH /home/elukey/venv/bin:/usr/lo... [15:28:38] lexnasser: i see, your job should override paths, for example: [15:28:54] https://www.irccloud.com/pastebin/uWfWW3P6/ [15:29:34] lexnasser: this is an example of how to override a job to read/write from local dirs [15:29:39] nuria: Ok, I'll retry with overriding it. Thanks! [15:29:42] lexnasser: local to your homedir [15:56:10] (03PS1) 10Elukey: Update dependencies with PyHive 0.6.2 [analytics/superset/deploy] - 10https://gerrit.wikimedia.org/r/580363 (https://phabricator.wikimedia.org/T239903) [15:56:19] ta daaaan [15:57:13] going to cherry pick it on deploy1001 and test it in staging [15:58:48] 10Analytics, 10Core Platform Team Workboards (Initiatives): Should reportupdater Pingback reports be refactored? - https://phabricator.wikimedia.org/T246154 (10eprodromou) [16:02:09] ping ottomata STANDUP [16:02:27] AHHH [16:21:19] sorry team - internet too busy at home again [16:42:37] (03CR) 10Mforns: [C: 03+1] "LGTM!" [analytics/superset/deploy] - 10https://gerrit.wikimedia.org/r/580363 (https://phabricator.wikimedia.org/T239903) (owner: 10Elukey) [16:43:39] ottomata, wanna chat about the EventStreamConfig extension dependency? [16:44:44] sure [16:44:45] mforns, mind if I listen in? [16:44:51] not at all :) [16:45:00] of course [16:45:17] ottomata, hip, https://meet.google.com/kti-iybt-ekv [16:45:26] oh wait [16:45:30] there's people there too! [16:45:35] haha [16:45:39] break out rooms! 
[16:46:27] ottomata, hip: https://meet.google.com/ttr-mcwc-kxf [17:08:22] 10Analytics, 10Pageviews-Anomaly: Topviews Analysis of the Hungarian Wikipedia is flooded with spam - https://phabricator.wikimedia.org/T237282 (10Nuria) Update on this, we have deployed our identifying code to the pageview pipeline and it is being run in shadow mode (meaning end users do not yet see the resul... [17:09:57] 10Analytics, 10Research-Backlog: [Open question] Improve bot identification at scale - https://phabricator.wikimedia.org/T138207 (10Nuria) Update on this, the code has been deployed and is running in shadow mode (meaning that end users do not yet see the results of the bot/no bot classification) [17:10:50] 10Analytics, 10Research-Backlog: [Open question] Improve bot identification at scale - https://phabricator.wikimedia.org/T138207 (10Nuria) [17:10:52] 10Analytics, 10Pageviews-Anomaly: Manipulation of pageview statistics - https://phabricator.wikimedia.org/T232992 (10Nuria) [17:11:08] 10Analytics, 10Pageviews-Anomaly: Manipulation of pageview statistics German Wikipedia - https://phabricator.wikimedia.org/T232992 (10Nuria) [17:12:27] stat1005 is back! [17:22:42] (03CR) 10Elukey: [V: 03+2 C: 03+2] Update dependencies with PyHive 0.6.2 [analytics/superset/deploy] - 10https://gerrit.wikimedia.org/r/580363 (https://phabricator.wikimedia.org/T239903) (owner: 10Elukey) [17:24:20] wha hey how elukey ? [17:24:42] ottomata: I asked dcops to try another switch port, apparently multiple ones were fried [17:24:54] huh [17:24:54] cool [17:25:01] !log deploy superset to enable Presto and Kerberos (PyHive 0.6.2)
[17:25:02] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:27:24] mforns: https://superset.wikimedia.org/superset/dashboard/73/ enjoy :) [17:28:09] nuria: --^ [17:29:06] elukey, \\\\\\\o//////// [17:30:36] 10Analytics, 10Analytics-Kanban, 10Product-Analytics, 10Epic: Spark sessions can provision kerberos tickets in a more predictable manner - https://phabricator.wikimedia.org/T246132 (10nshahquinn-wmf) [17:33:00] 10Analytics, 10DC-Ops, 10Operations, 10netops: kafka-jumbo1006 and stat1005 network issues - https://phabricator.wikimedia.org/T247561 (10elukey) 05Open→03Resolved a:03elukey stat1005 is back, John and Papaul switched it to port /43. [17:36:35] 10Analytics, 10Analytics-Kanban, 10Product-Analytics, 10User-Elukey: Learn how to make dashboard on top of data on hadoop/hive - https://phabricator.wikimedia.org/T247329 (10kzimmerman) [17:38:07] ugi=elukey (auth:PROXY) via presto/an-presto1002.eqiad.wmnet@WIKIMEDIA (auth:KERBEROS) [17:38:10] niceee [17:38:38] the sql lab also works [17:38:50] this is massive, it could be a replacement of Hue [17:40:53] 10Analytics, 10Analytics-Kanban, 10Product-Analytics, 10Patch-For-Review, 10User-Elukey: Kerberize Superset to allow Presto queries - https://phabricator.wikimedia.org/T239903 (10elukey) Everything deployed on Superset, the dashboards that were broken are now working fine! Also tried to use the SQL Lab... [17:41:04] 10Analytics, 10Core Platform Team Workboards (Initiatives): Design Document that proposes an alternative architecture for historic data endpoints - https://phabricator.wikimedia.org/T241184 (10WDoranWMF) [17:52:19] 10Analytics, 10Analytics-Kanban, 10Product-Analytics, 10Patch-For-Review, 10User-Elukey: Kerberize Superset to allow Presto queries - https://phabricator.wikimedia.org/T239903 (10elukey) Authentication flow: - User authenticates via LDAP to superset.wikimedia.org - User is proxied by Superset to Presto... 
[18:06:04] really nice, the jmx_exporter seems to work ok for presto https://github.com/prometheus/jmx_exporter/pull/264 [18:06:36] if it works it would mean not packaging another exporter, that's great [18:07:46] 10Analytics, 10Analytics-Kanban, 10User-Elukey: Add Prometheus Presto metrics and dashboards - https://phabricator.wikimedia.org/T247884 (10elukey) [18:19:44] elukey: thanks for bringing back stat1005 [18:27:32] elukey: WOWWWWWW [18:27:58] elukey: this is going to be a game changer, let's let it bake for a bit and connie can start using it after a few days [18:28:19] yep! will try to add presto metrics tomorrow, should be doable without too much pain [18:28:26] so we'll know how presto behaves [18:28:50] mgerlach: np! Really sorry for the lag, networking issue :( [18:29:15] anyway, logging off :) o/ [18:32:24] byeeee [18:53:07] 10Analytics, 10Analytics-Kanban, 10Release Pipeline, 10Patch-For-Review, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (10Ottomata) @akosiaris when you find have a moment, I'm trying to set `debug.enabled=true` on the eventstreams canary pod. Som... [19:24:25] 10Quarry, 10DBA, 10Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (10Mike_Peel) Looking at CPU usage at https://grafana.wikimedia.org/d/000000607/cluster-overview?orgId=1&from=now-30d&to=now I can't see anything obvious that would explain thing... 
[20:37:57] 10Analytics, 10Event-Platform, 10Product-Analytics, 10CPT Initiatives (Modern Event Platform (TEC2)), 10Core Platform Team Workboards (Clinic Duty Team): Eventbus revisions are duplicated in event.mediawiki_revision_tags_change - https://phabricator.wikimedia.org/T218246 (10AMooney) [20:41:39] 10Quarry, 10DBA, 10Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (10zhuyifei1999) > @zhuyifei1999 Perhaps there could be some sort of a trusted user set on quarry that can run things for longer? How do you want such a list to be made? I obvio... [20:59:19] mforns: i think this will fix the jenkins problem https://gerrit.wikimedia.org/r/c/integration/config/+/580466 [20:59:35] lookin [21:00:30] oh, ottomata thanks for that! [21:03:02] 10Quarry, 10DBA, 10Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (10Mike_Peel) >>! In T246970#5977568, @zhuyifei1999 wrote: >> @zhuyifei1999 Perhaps there could be some sort of a trusted user set on quarry that can run things for longer? > >... [21:53:49] 10Quarry, 10DBA, 10Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (10zhuyifei1999) > a simple request system at https://www.mediawiki.org/wiki/Talk:Quarry I don't like the idea of flooding a help page with access requests (or perhaps there wil... [21:55:42] 10Quarry, 10DBA, 10Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (10Mike_Peel) I expect that there would be few requests. Phab would also work. [21:58:09] 10Quarry, 10DBA, 10Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (10bd808) >>! In T246970#5977262, @Mike_Peel wrote: > 3. Is there a way to request more direct access to the replicas, ideally with an example of how to run a MySQL query and out... 
[22:41:56] 10Analytics, 10Analytics-Kanban, 10Multimedia, 10Tool-Pageviews: Fix double encoding of urls on mediarequests api - https://phabricator.wikimedia.org/T244373 (10MusikAnimal) Thanks for the fix (and apologies for my delayed response)! Should this be closed? It does indeed work in Mediaviews now https://tool...
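(Editor's sketch, not part of the conversation.) The task title above mentions double encoding of URLs. As a general illustration of what that means, and not a description of the actual fix in the mediarequests API, the following Python sketch uses the standard library's `urllib.parse`. The helper names `encode_once` and `decode_fully` and the example file path are hypothetical.

```python
from urllib.parse import quote, unquote


def encode_once(value):
    """Percent-encode a value for use as a single URL path segment."""
    return quote(value, safe="")


def decode_fully(value, max_rounds=5):
    """Undo repeated percent-encoding by decoding until the value stops changing.

    Caveat: this is lossy if the original text legitimately contains
    a literal sequence such as '%20'.
    """
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value


# Hypothetical file path, used only for illustration.
original = "/wikipedia/commons/a/ab/My file.jpg"
once = encode_once(original)   # the space becomes %20
twice = encode_once(once)      # the '%' itself is encoded again: %2520
```

A client that encodes an already-encoded path sends the `twice` form, so a server that decodes only once ends up looking for the `once` form instead of the original name.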