[02:27:49] (CR) Nuria: "Looks good, did we test that piwik is processing these events? Should be possible on desktop." [analytics/wikistats2] - https://gerrit.wikimedia.org/r/583521 (https://phabricator.wikimedia.org/T247106) (owner: Fdans)
[02:27:54] (CR) Nuria: [C: +1] Instrument event tracking in TopicExplorer [analytics/wikistats2] - https://gerrit.wikimedia.org/r/583521 (https://phabricator.wikimedia.org/T247106) (owner: Fdans)
[07:23:41] Analytics, Analytics-Wikistats, Product-Analytics: Contribution inequality graphs for Wikistats - https://phabricator.wikimedia.org/T195033 (Quasipodo) Hello there! I'm thinking about the possibility of doing the GSoC with WMF this year, would you accept implementing this (and maybe add some other metrics) as a...
[08:07:08] (CR) Elukey: [V: +2 C: +2] Add refinery jars tp spark scala and sql kernels [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/580083 (owner: Joal)
[08:09:47] !log deployed new kernels for https://gerrit.wikimedia.org/r/580083 on stat1004
[08:09:49] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:09:59] joal: --^ ready for a test when you have time (not urgent)
[08:10:11] if positive I'll deploy it also in other places
[08:10:19] and re-create the same kernels for buster
[08:38:29] since it seems very quiet I'll go out to buy some groceries/etc.., it might take a bit but I have my phone with me in case you need :)
[08:38:46] * elukey afk! Out getting groceries, might take a while, but call me if urgent :)
[08:40:37] Hello there!
[08:42:06] would anyone from the Analytics team be interested in mentoring me for a GSoC project? I could propose to implement this one: https://phabricator.wikimedia.org/T195033 as I'm already very familiar with the topic, but I am open to listening to any other ideas / suggestions.
[08:42:14] My background:
[08:44:34] I'm finishing my master's in data science and I should be done by the middle of June. I have previously attended some wikimedia hackathons and I'm familiar with mediawiki analysis and data visualization (I'm the developer of WikiChron: http://wikichron.science/ ). I'm quite fluent in Python. You can also have a look at my github profile:
[08:44:34] https://github.com/Akronix/
[08:46:53] I have been working for two years in academia doing research on wikis (smaller than the wikipedia ones) and published some articles on the matter: https://www.akronix.es/publications.html
[08:58:44] Analytics, Analytics-EventLogging, Event-Platform, Patch-For-Review, Services (watching): Switch all eventgate clients to use new TLS port - https://phabricator.wikimedia.org/T242224 (akosiaris) p:Triage→Medium
[08:59:20] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 2 others: Set up TLS for eventgate-main and eventgate-analytics - https://phabricator.wikimedia.org/T241073 (akosiaris)
[09:52:12] Here is some information that GSoC mentors need to know, in case you are not familiar with it:
[09:52:14] GSoC mentors are typically required to commit 4-5 hours per week, per student. GSoC mentor responsibilities (also includes 'Selection process tips' which might help answer your previous question about likelihood of selection)
[09:54:04] Analytics, Operations, Product-Analytics, SRE-Access-Requests: Hive access for Sam Patton - https://phabricator.wikimedia.org/T248097 (Dzahn)
[10:03:38] Quasipodo: hi! I would send an email to analytics@ to have a broader audience, on IRC things might be missed (especially with different timezones etc..). Half of our team is currently at reduced work capacity due to constraints in various countries, so we might not be able to support/mentor somebody during the next months
[10:18:54] I see. Thank you for your response @elukey!
[10:19:24] Analytics, Technical blog: Techblog. Change url shape to better measure pageviews - https://phabricator.wikimedia.org/T248614 (Peachey88)
[10:19:34] Quasipodo: to be clear it is not a "no", please try to follow up, I was just warning you :)
[10:20:02] Will do, thanks :)
[10:24:11] time_firstbyte .... is that in seconds?
[10:24:13] :D
[10:25:40] addshore: I think ms but lemme triple check
[10:26:10] and what about "-" http code ? :P what does that mean? :D
[10:28:40] addshore: usually if there is a '-' it means that the request somehow did not end up in a response, or something went wrong collecting the data
[10:28:52] but if you have a specific use case it might be different
[10:28:55] and any idea about "int-front" varnish code? or is that a question better suited to varnished people?
[10:31:34] so it should be that the cache was hit in a varnish frontend, but lemme pull the docs
[10:33:43] addshore: https://wikitech.wikimedia.org/wiki/Caching_overview#Headers
[10:37:31] thanks, i tried searching for the codes but never found that page!
[10:44:48] Analytics, Operations, Performance-Team, Traffic, Patch-For-Review: Only serve debug HTTP headers when x-wikimedia-debug is present - https://phabricator.wikimedia.org/T210484 (ema) @Gilles: I see that `X-Varnish` is used by [[https://gerrit.wikimedia.org/g/mediawiki/extensions/MultimediaView...
[11:44:51] Analytics, Operations, Performance-Team, Traffic, Patch-For-Review: Only serve debug HTTP headers when x-wikimedia-debug is present - https://phabricator.wikimedia.org/T210484 (Gilles) No, that's some ancient performance logging that dates back to when Media Viewer was launched and we needed...
[12:13:08] Analytics, MediaWiki-extensions-WikimediaEvents, Core Platform Team Workboards (Clinic Duty Team), Performance-Team (Radar): Remove usage of MEDIAWIKI_JOB_RUNNER from WikimediaEvents extension - https://phabricator.wikimedia.org/T247130 (AMooney) @Clarakosi can you update the priority for this task?
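For readers following the "-" and "int-front" exchange above, here is a minimal sketch of how such a value could be split before looking each part up on the Caching overview page linked in the chat. It assumes only the `<disposition>-<layer>` shape visible in the conversation and treats a bare "-" as "no value recorded"; the names `CacheStatus` and `split_cache_status` are hypothetical, and the meaning of each disposition is deliberately not encoded here.

```python
from typing import NamedTuple, Optional


class CacheStatus(NamedTuple):
    """Hypothetical container for a cache status such as 'int-front'."""
    disposition: str          # e.g. 'int', 'hit', 'miss'
    layer: Optional[str]      # e.g. 'front'; None when no layer suffix is present


def split_cache_status(value: str) -> Optional[CacheStatus]:
    """Split '<disposition>-<layer>'; return None for an empty or '-' placeholder."""
    value = value.strip()
    if value in ("", "-"):
        # The chat suggests '-' means no response or a data collection problem.
        return None
    disposition, sep, layer = value.partition("-")
    return CacheStatus(disposition, layer if sep else None)
```

Keeping the placeholder as `None` rather than an empty tuple makes it easy to count such rows separately when checking whether a specific use case differs from the usual explanation.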
[12:34:10] Analytics, Analytics-Kanban, Research, User-Elukey: Add SWAP profile to stat1005 - https://phabricator.wikimedia.org/T245179 (elukey) Interesting: https://github.com/apache/incubator-toree/blob/master/RELEASE_NOTES.md 0.3.0 > Removed support for PySpark and Spark R in Toree (use specific kernels)
[12:50:42] (PS1) Elukey: Use toree 0.3.0 on buster [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583935 (https://phabricator.wikimedia.org/T245179)
[12:54:14] all right this should hopefully split kernels between stretch and buster --^
[13:00:52] (CR) Elukey: [V: +2 C: +2] "Trying to see if it works :)" [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583935 (https://phabricator.wikimedia.org/T245179) (owner: Elukey)
[13:06:53] mforns: holaaa
[13:07:23] mforns: if you have some time, can you please check if the spark yarn notebook works on stat1005?
[13:07:27] just re-installed the kernels
[13:18:10] * elukey lunch!
[14:58:07] (PS1) Elukey: Add two kernels to Buster [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583971 (https://phabricator.wikimedia.org/T245179)
[14:59:54] (Abandoned) Elukey: Add two kernels to Buster [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583971 (https://phabricator.wikimedia.org/T245179) (owner: Elukey)
[15:01:12] (PS1) Elukey: Add spark_yarn_pyspark_large to Buster's kernels [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583972 (https://phabricator.wikimedia.org/T245179)
[15:01:44] (CR) Elukey: [V: +2 C: +2] Add spark_yarn_pyspark_large to Buster's kernels [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583972 (https://phabricator.wikimedia.org/T245179) (owner: Elukey)
[15:06:46] Analytics, Technical blog: Techblog. Change url shape to better measure pageviews - https://phabricator.wikimedia.org/T248614 (bd808) The /%YEAR/%MONTH/%DAY/%SLUG permalink structure matches the historical techblog & blog layout. Right now we do not have legacy articles loaded into the blog, but {T243407...
[15:17:33] (PS1) Elukey: Rename Spark Buster kernels [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583981 (https://phabricator.wikimedia.org/T245179)
[15:18:27] (CR) Elukey: [V: +2 C: +2] Rename Spark Buster kernels [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583981 (https://phabricator.wikimedia.org/T245179) (owner: Elukey)
[15:22:22] (PS1) Elukey: Fix display name of the spark_yarn_scala_large kernel [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583983 (https://phabricator.wikimedia.org/T245179)
[15:23:04] (CR) Elukey: [V: +2 C: +2] Fix display name of the spark_yarn_scala_large kernel [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583983 (https://phabricator.wikimedia.org/T245179) (owner: Elukey)
[15:29:49] all right stat1005 looks better now
[15:30:55] still missing the sparkr kernels
[15:31:09] but those seem not to be handled by toree anymore
[15:31:57] so it should be a matter of replicating what we do for pyspark
[15:32:38] just fyi I'm trying to track down what's going on with the kaios requests (currently trying to remember/find the topic where raw eventlogging data streams into the processors)
[15:33:00] also I noticed all the refine fails, going to look at that in a bit
[15:33:14] I'll do more ops week next week since I was out basically most of my turn
[15:38:47] milimetric: hey! So most of the refine failures are related to something already WIP, there is an email in alerts@ about it
[15:39:16] found it! eventlogging-client-side
[15:39:40] yeah, I saw that, so I was going to double check that all of them get resolved by the WIP
[15:39:40] (the ones about TwoColConflictExit)
[15:43:18] Analytics, Analytics-Wikistats, Product-Analytics: Contribution inequality graphs for Wikistats - https://phabricator.wikimedia.org/T195033 (Milimetric) +1, I can mentor you unless someone else was already planning on doing it.
[15:50:31] (PS1) Elukey: Fold the kernel's README into the main one and add documentation [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583991 (https://phabricator.wikimedia.org/T245179)
[15:51:25] (PS2) Elukey: Fold the kernel's README into the main one and add documentation [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583991 (https://phabricator.wikimedia.org/T245179)
[15:51:54] (CR) Elukey: [V: +2 C: +2] Fold the kernel's README into the main one and add documentation [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/583991 (https://phabricator.wikimedia.org/T245179) (owner: Elukey)
[16:05:03] Analytics, Analytics-EventLogging, Analytics-Kanban, Product-Analytics: EventLogging does not properly classify KaiOS user agents - https://phabricator.wikimedia.org/T248560 (Nuria) a:Milimetric
[16:12:36] Analytics, Analytics-EventLogging, Analytics-Kanban, Product-Analytics: EventLogging does not properly classify KaiOS user agents - https://phabricator.wikimedia.org/T248560 (Milimetric) Ok, I'm looking into this. Will document my debug steps. 1. Looked at raw data: `kafkacat -C -b kafka-jumbo1...
[16:17:41] Analytics, Analytics-Kanban, Research: covid19 data preservation - https://phabricator.wikimedia.org/T248600 (Nuria) a:Nuria
[16:21:52] Analytics: Analytics Hardware for Fiscal Year 2019/2020 - https://phabricator.wikimedia.org/T244211 (elukey)
[16:21:54] Analytics, Analytics-Cluster, Analytics-Kanban: Hadoop Hardware Orders FY2019-2020 - https://phabricator.wikimedia.org/T243521 (elukey)
[16:22:10] Analytics: Analytics Hardware for Fiscal Year 2019/2020 - https://phabricator.wikimedia.org/T244211 (elukey)
[16:22:13] Analytics, Analytics-Cluster, Analytics-Kanban: Hadoop Hardware Orders FY2019-2020 - https://phabricator.wikimedia.org/T243521 (elukey)
[16:22:26] Analytics, Analytics-Cluster, Analytics-Kanban: Hadoop Hardware Orders FY2019-2020 - https://phabricator.wikimedia.org/T243521 (elukey)
[16:22:28] Analytics: Analytics Hardware for Fiscal Year 2019/2020 - https://phabricator.wikimedia.org/T244211 (elukey)
[16:27:27] leila: heya - you wish to join us in da cave? https://meet.google.com/rxb-bjxn-nip
[16:33:05] Analytics, Analytics-Kanban, Product-Analytics: Definition of not text content metrics for tunning session (rich media,: images and the linke) - https://phabricator.wikimedia.org/T247417 (Milimetric) For reference, the imagelinks table was sqooped in January to: `wmf_raw.mediawiki_imagelinks` (where...
[16:37:25] Analytics, Analytics-Cluster, Analytics-Kanban: Hadoop Hardware Orders FY2019-2020 - https://phabricator.wikimedia.org/T243521 (elukey) I had a chat with Willy about racking requirements of the new hosts (16 refreshed + GPU ones). We currently have this configuration: ` ROW A : (16) an-worker[1078-1...
[16:40:00] leila: we disconnected with fdans - ping us when you're around :)
[16:40:33] mforns: o/ time to test jupyter?
[16:41:20] joal: I'll be with you in 5 min
[16:41:29] ack leila
[16:41:31] fdans: --^
[16:43:32] elukey: yes!
[16:43:38] doing now
[16:43:42] ack thanks
[16:43:47] stat1005
[16:43:53] 1008 still not fixed
[16:44:55] joal: omw!
[16:49:34] ah snap mforns
[16:49:35] FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/share/jupyter/kernels/jupyter_spark_scala/bin/run.sh': '/usr/local/share/jupyter/kernels/jupyter_spark_scala/bin/run.sh'
[16:49:40] this is weird
[16:50:00] how the hell did it happen
[16:50:04] elukey: I get an error when I open a new notebook in jupyterlab: Error starting kernel : https://pastebin.com/QUuvmpjF
[16:50:27] yea exactly
[16:51:10] mforns: what kernel did you try?
[16:51:21] scala spark - YARN
[16:51:34] trying to repro
[16:51:53] k
[16:52:28] yes my bad, I thought one change was good but it isn't
[16:52:39] ah no wait
[16:52:55] yep
[16:53:02] :]
[16:53:35] mforns: lemme try a manual fix, if it works I'll commit it
[16:53:49] elukey: ok
[16:57:26] mforns: ok try now pliss
[16:57:38] omw
[16:58:08] elukey: no error yet
[16:58:27] elukey: it vvvvvvorked!
[16:58:41] wait, trying a more substantial query.
[16:58:47] Analytics, Technical blog: Techblog: Change URL permalink style to better measure pageviews - https://phabricator.wikimedia.org/T248614 (Aklapper)
[16:58:56] mforns: yesssssss
[16:59:09] don't change kernel because the other ones are still broken :P
[16:59:38] elukey: seems to be working fine!
[17:02:13] mforns: goooooood! Thanks a lot :)
[17:02:23] I am going to fix the kernel path issue now
[17:02:25] ok! thank you too
[17:02:28] k
[17:02:36] we are missing only the sparkr kernels
[17:02:58] but after that we'll have jupyter on buster with jupyterlab 1.1.0 toree 0.3.0 etc..
[17:04:23] Analytics, Technical blog: Techblog: Change URL permalink style to better measure pageviews - https://phabricator.wikimedia.org/T248614 (Aklapper) On the technical side, `Settings 🡒 Permalinks` is currently set to `Day and name`. This request asks to change it to `Post name`. The question is if we want...
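The FileNotFoundError above came from a kernel spec still pointing at a `run.sh` under a directory name that no longer existed. The following is a rough sketch of a check that could surface this kind of mismatch before users hit it. It assumes the standard Jupyter layout of one `kernel.json` per kernel directory with an `argv` list whose first entry is the launcher; the function name `missing_kernel_executables` is hypothetical and this is not the check used in analytics/jupyterhub/deploy.

```python
import json
import os
from pathlib import Path


def missing_kernel_executables(kernels_dir):
    """Map kernel directory name -> argv[0] for specs whose launcher path is absent.

    Only absolute argv[0] values are checked; bare commands such as 'python'
    are resolved via PATH at launch time and are skipped here.
    """
    missing = {}
    for spec in sorted(Path(kernels_dir).glob("*/kernel.json")):
        argv = json.loads(spec.read_text()).get("argv", [])
        if argv and os.path.isabs(argv[0]) and not os.path.exists(argv[0]):
            missing[spec.parent.name] = argv[0]
    return missing


if __name__ == "__main__":
    # Path taken from the error message in the log; adjust for other hosts.
    for name, path in missing_kernel_executables("/usr/local/share/jupyter/kernels").items():
        print(f"{name}: launcher not found at {path}")
```

Run after a directory rename, an empty result would suggest every spec's launcher path still resolves; any entry printed points at a `kernel.json` that needs its `argv` updated.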
[17:05:57] (PS1) Elukey: Fix kernel json paths after directory rename [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/584018 (https://phabricator.wikimedia.org/T245179)
[17:07:39] (CR) Elukey: [V: +2 C: +2] Fix kernel json paths after directory rename [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/584018 (https://phabricator.wikimedia.org/T245179) (owner: Elukey)
[17:14:39] nuria: would you have a minute about bots?
[17:14:59] joal: Let's book some time on Monday cause I have an interview briefly
[17:16:04] Analytics, Analytics-SWAP, Product-Analytics, Patch-For-Review: Upgrade all SWAP users to JupyterLab 1.0 - https://phabricator.wikimedia.org/T230724 (elukey) As FYI stat100[5,8] (Buster nodes) are running with JupyterLab 1.1.0, if anybody wants to test and provide feedback I'd be happy :)
[17:16:46] Analytics, Analytics-SWAP, Product-Analytics, Patch-For-Review, User-Elukey: Upgrade all SWAP users to JupyterLab 1.0 - https://phabricator.wikimedia.org/T230724 (elukey) a:Ottomata→elukey
[17:17:47] Analytics, Analytics-Cluster, Analytics-Kanban: Hadoop Hardware Orders FY2019-2020 - https://phabricator.wikimedia.org/T243521 (wiki_willy) Thanks for the sync up today @elukey. Just to recap here for later reference - we're running short on 10g ports in row B, so these will need to balanced out pri...
[17:18:08] Analytics, Technical blog: Techblog: Change URL permalink style to better measure pageviews - https://phabricator.wikimedia.org/T248614 (Nuria) > If we change the permalink structure we won't be able to keep links to old content working. can't we mod_rewrite it? as the change is an easy path rewrite
[17:19:11] np nuria
[17:20:05] nuria: just sent an invite - please move as it fits you
[17:20:22] isaacj: heya - have you seen my email?
[17:21:00] joal: yes -- in a meeting so haven't had a chance to test out but i'm excited to give it a try
[17:21:08] ok you're faster than I am nuria :)
[17:21:27] np isaacj - I'll be gone soon but please let me know :)
[17:22:14] thanks -- i'll send an email but any issues can wait till Monday :)
[17:22:59] Analytics, Analytics-Kanban, Research, Patch-For-Review, User-Elukey: Add SWAP profile to stat1005 - https://phabricator.wikimedia.org/T245179 (elukey) Ok so next steps are: 1) test more SWAP on stat1005/stat1008 2) add a SparkR kernel on buster
[17:27:28] going to log off, have a nice weekend folks!
[18:48:52] https://www.irccloud.com/pastebin/FZDfb2N0/
[18:49:26] hi Analytics team, I've been facing an issue with my HUE account for the last few weeks.. due to this, hue is not able to load the list of databases .. this is the error :
[18:49:26] java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
[18:50:42] pls let me know if I can try some things out to resolve this, or if I should open a task for analytics team to fix it. thanks!
[18:57:33] mayakpwiki: see how to fix it here: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hue
[18:59:16] thank you @nuria !
[19:50:00] Analytics, Product-Analytics (Kanban): SQL definition for wikidata metrics for tunning session - https://phabricator.wikimedia.org/T247099 (Isaac) @jwang Thanks for these additional analyses! Super fascinating -- I think based on these analyses I'd advocate for splitting out namespace 0 from the other co...
[19:50:56] Analytics, Product-Analytics (Kanban): SQL definition for wikidata metrics for tunning session - https://phabricator.wikimedia.org/T247099 (Isaac) > Super thanks to @Isaac for providing context here. I think the question remains as to the impact we are seeking cause as you mentioned earlier an infobox an...
[20:39:53] Analytics, Analytics-EventLogging, Analytics-Kanban, Product-Analytics: EventLogging does not properly classify KaiOS user agents - https://phabricator.wikimedia.org/T248560 (Milimetric) p:Triage→High
[22:06:57] Analytics, Analytics-Wikistats, Product-Analytics: Contribution inequality graphs for Wikistats - https://phabricator.wikimedia.org/T195033 (Quasipodo) According to the WMF GSoC coordinators is best if there are two mentors instead of one, so feel free anyone else to jump in :)
[23:35:18] (PS8) Mforns: Add dimensions to druid's pageview_hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/570681 (https://phabricator.wikimedia.org/T243090)
[23:36:49] (CR) Mforns: [V: +2] "I think all Joseph's comments are taken care of." [analytics/refinery] - https://gerrit.wikimedia.org/r/570681 (https://phabricator.wikimedia.org/T243090) (owner: Mforns)