[00:06:46] 10Quarry, 10DBA, 10Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (10bd808) I know that the backend query killer (pt-kill) was set to be more aggressive while the cluster was operating at a lower capacity due to an instance with corrupt data (h... [00:30:21] 10Analytics, 10Analytics-Kanban, 10Better Use Of Data, 10Desktop Improvements, and 8 others: Enable client side error logging in prod for small wiki - https://phabricator.wikimedia.org/T246030 (10Nuria) I am a bit confused here cause i do not see things working on test wiki. The code below puts an error on... [00:35:43] 10Analytics, 10Analytics-Kanban, 10Better Use Of Data, 10Desktop Improvements, and 8 others: Enable client side error logging in prod for small wiki - https://phabricator.wikimedia.org/T246030 (10Nuria) Is the mw.trackSubscribe working? [02:32:04] 10Analytics, 10Analytics-Kanban, 10Tools: Pie chart is missing on the WMCS Edits dashboard - https://phabricator.wikimedia.org/T246963 (10srishakatux) 05Open→03Resolved Thank you very much @mforns and @elukey :) [02:33:11] 10Analytics, 10Cloud-Services, 10Developer-Advocacy, 10Documentation: Set up a "WMCS Edits Dashboard" page on Meta - https://phabricator.wikimedia.org/T246932 (10srishakatux) 05Open→03Resolved [02:33:15] 10Analytics, 10Cloud-Services, 10Developer-Advocacy, 10Patch-For-Review: Create a WMCS edits dashboard via Dashiki - https://phabricator.wikimedia.org/T226663 (10srishakatux) [02:33:48] 10Analytics, 10Cloud-Services, 10Developer-Advocacy (Jan-Mar 2020): Further improvements to the WMCS edits dashboard - https://phabricator.wikimedia.org/T240040 (10srishakatux) 05Open→03Resolved (nothing more remaining here) [02:33:53] 10Analytics, 10Cloud-Services, 10Developer-Advocacy, 10Patch-For-Review: Create a WMCS edits dashboard via Dashiki - https://phabricator.wikimedia.org/T226663 (10srishakatux) [02:37:22] 10Analytics, 10Cloud-Services, 10Developer-Advocacy, 
10Patch-For-Review: Create a WMCS edits dashboard via Dashiki - https://phabricator.wikimedia.org/T226663 (10srishakatux) 05Open→03Resolved Relevant links: https://wmcs-edits.wmflabs.org/ https://meta.wikimedia.org/wiki/Config:Dashiki:WMCSEdits https... [06:07:12] 10Analytics, 10Analytics-Kanban, 10Better Use Of Data, 10Desktop Improvements, and 8 others: Enable client side error logging in prod for small wiki - https://phabricator.wikimedia.org/T246030 (10Tgr) `Script error` means the browser srubbed error details because the error did not originate from code belon... [06:25:58] 10Quarry, 10DBA, 10Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (10Marostegui) Query killer is indeed back to its normal running times. 300 seconds for web service (labsdb1009 and labsdb1010) and 14400 seconds for analytics (labsdb1011): ` 4... [06:33:10] 10Quarry, 10DBA, 10Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (10zhuyifei1999) Hmm I can see that quarry s indeed running on web rather than analytics. @Bstorm I see in the [[https://wikitech.wikimedia.org/wiki/Nova_Resource:Quarry/SAL|SA... [07:00:59] 10Analytics, 10Analytics-Data-Quality, 10Analytics-Kanban, 10Product-Analytics, 10Patch-For-Review: Bot field in edits_hourly dataset ignores username - https://phabricator.wikimedia.org/T244632 (10nshahquinn-wmf) 05Open→03Resolved Thanks for taking care of this, @Milimetric! Uploading patches to Ger... [07:38:55] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Kerberize Superset to allow Presto queries - https://phabricator.wikimedia.org/T239903 (10elukey) I was able to force the last version of the PyHive package in the Superset Staging venv on an-tool1005, and with the following config Superse... 
[07:40:08] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: profile::hive::site_hdfs should work with kerberos-run-command - https://phabricator.wikimedia.org/T240880 (10elukey) [07:56:44] (03PS3) 10WMDE-Fisch: Count disables of the TwoColumnConflict interface [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/577213 (https://phabricator.wikimedia.org/T246104) [07:57:44] (03CR) 10WMDE-Fisch: Count disables of the TwoColumnConflict interface (032 comments) [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/577213 (https://phabricator.wikimedia.org/T246104) (owner: 10WMDE-Fisch) [08:09:45] Good morning team [08:09:45] (03PS3) 10Joal: Fix wikidata article-placeholder job [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/576449 (https://phabricator.wikimedia.org/T236895) [08:09:57] (03CR) 10Joal: Fix wikidata article-placeholder job (032 comments) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/576449 (https://phabricator.wikimedia.org/T236895) (owner: 10Joal) [08:10:17] bonjour [08:10:25] \o [08:23:35] I'm starting to have some fun with Spark, but wondering what I can do to visualize results. [08:23:55] The spark scala kernels crash in SWAP, so maybe that's not a popular alternative... [08:24:57] awight: hi - it makes some time I have not used spark-scala in swap, but last time I tried it it worked :( [08:28:35] 10Analytics, 10Analytics-Kanban: Upgrade jupyterhub-systemdspawner from 0.9.9 to 0.13 to allow the use of systemd custom slices - https://phabricator.wikimedia.org/T247055 (10elukey) p:05Triage→03High [08:29:24] I can always use pyspark and matplotlib, but it seems like a shame to lose language compatibility with the library I'm building. 
[08:29:27] awight: indeed the spark kernels don't even start in swap - will investigate with elukey (I have an idea of where it might come from)
[08:29:37] awight: I very much agree
[08:29:52] awight: I'll ping you with updates hopefully soon
[08:29:56] Okay, thanks for the confirmation. There's no rush at all from my side.
[08:30:03] elukey: would you have a minute for opsy stuff?
[08:30:09] sure
[08:30:51] elukey: spark-scala fails on notebook1003 :( I think it is related to the last patch I made and Andrew merged about kernel settings consistency
[08:30:55] Ideally, I could get a bit modern and use d3 or observable, but happy to follow the lead for whatever people are doing in SWAP these days
[08:31:20] awight: the thing I had tested is named brunell - long time not used, but was doing the job
[08:32:05] kk, I see that discussed on wikitech.wmo
[08:32:06] what fails exactly? Can I see the errors?
[08:32:47] elukey: not even an error - When I start the kernel I get an alert window saying the kernel has died
[08:33:02] elukey: how can we look at logs for jupyter on the notebook machine?
[08:34:03] Analytics, Discovery, Discovery-Search (Current work): Data for events from wdqs needs to be deleted after 90 days and/or sanitized - https://phabricator.wikimedia.org/T247034 (dcausse)
[08:35:17] for single notebooks sudo journalctl -u jupyter-joal-singleuser.service
[08:35:27] otherwise journalctl -u jupyterhub
[08:35:37] but I don't see anything in yours now
[08:35:47] indeed
[08:35:52] weird
[08:35:56] jupyterhub maybe?
[08:36:11] is it on 1003 ?
[08:36:41] yes
[08:36:42] nothing in there either
[08:37:43] elukey: weird - I get logs for `Mar 04` when doing sudo journalctl -u jupyterhub
[08:37:58] elukey: could it be the machine is stuck?
[08:38:25] (PS1) Elukey: Upgrade jupyterhub-systemdspawner to 0.13 [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/577477 (https://phabricator.wikimedia.org/T247055)
[08:38:47] joal: can you try on 1004?
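Note on the journalctl commands quoted above: a minimal Python sketch of how the two unit names relate, assuming the per-user unit always follows the `jupyter-<user>-singleuser.service` pattern seen in this log. The helper name `journalctl_command` is hypothetical; it only builds the command line and does not run it.

```python
from typing import List, Optional


def journalctl_command(user: Optional[str] = None) -> List[str]:
    """Build the journalctl argv for the hub itself, or for one user's notebook server.

    Assumption: per-user units are named jupyter-<user>-singleuser.service,
    as in the commands quoted in the chat.
    """
    unit = "jupyterhub" if user is None else f"jupyter-{user}-singleuser.service"
    return ["sudo", "journalctl", "-u", unit]


# Print the two commands discussed above.
print(" ".join(journalctl_command("joal")))
print(" ".join(journalctl_command()))
```

If the single-user unit shows no logs, as happened here, the hub unit is the next place to look.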
[08:38:51] sure
[08:42:07] elukey: worked
[08:43:16] joal: I have refreshed jupyterhub on 1003 for a patch that I made (that doesn't work yet)
[08:43:30] I suspect that it is due to the OOM events
[08:43:41] the task above, hopefully, should solve this mess once and for all
[08:43:42] ok
[08:43:53] can you confirm that it works on 1003 as well?
[08:43:56] elukey: I saw that patch yesterday - it's great
[08:44:01] give me a minute :)
[08:45:25] Nope - 1004 is working but not 1003
[08:45:29] elukey: --^
[08:45:43] same issue still: `The kernel has died, and the automatic restart has failed`
[08:47:26] lovely
[08:48:21] another retry pliz :)
[08:49:21] joal: --^
[08:50:22] nope
[08:50:29] same again
[08:52:35] joal: forced the restart of your unit, I now see logs
[08:53:29] ok elukey
[08:53:34] shall I try again?
[08:55:53] yep
[08:59:56] elukey: all good now :)
[09:01:32] thanks a lot elukey
[09:03:44] o/
[09:04:06] Hi dcausse
[09:04:26] dcausse: Thanks for the callout in yesterday's meeting :)
[09:04:39] joal: well deserved!
[09:06:29] question regarding kerberos and long running jobs, a flink job I was running got killed after 4 or 5 days
[09:06:50] dcausse: this is expected - it should actually have been killed before :)
[09:06:57] :)
[09:07:15] dcausse: solution is to run the job as the analytics-search user, and pass the user keytab
[09:07:16] I am curious about the killing part - did it fail due to lack of permissions for X ?
[09:07:29] elukey: can you confirm that above --^?
[09:07:38] elukey: I can paste the output of the flink job manager
[09:08:00] joal: yes +1, I am wondering if flink offers something similar to spark
[09:08:05] almost surely yes
[09:08:09] dcausse: yes please!
[09:08:27] dcausse: it also depends on what you try to access - hdfs/hive/etc..
[09:09:13] elukey: it tries to access its workers, hdfs and kafka
[09:09:23] okok
[09:10:10] its workers were killed so it fails with java.net.ConnectException: Call From stat1004/10.64.5.104 to an-master1002.eqiad.wmnet:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apach
[09:10:14] https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#auth-with-external-systems
[09:11:08] yes I was on that page, so a keytab approach would work?
[09:11:11] so as joal was saying, I'd try with security.kerberos.login.keytab and security.kerberos.login.principal
[09:11:23] using the analytics-search user credentials
[09:11:30] ok I'll dig into this
[09:11:36] so something like:
[09:12:04] sudo -u analytics-search flink etc..
[09:12:10] using:
[09:12:46] security.kerberos.login.keytab = /etc/security/keytabs/analytics-search/analytics-search.keytab
[09:13:11] security.kerberos.login.principal = analytics-search/stat1007.eqiad.wmnet@WIKIMEDIA
[09:13:20] if you run this from stat1007
[09:13:27] where we have the credentials deployed
[09:13:49] from airflow it would be the same but replacing the hostname with an-airflow1001.eqiad.wmnet
[09:13:52] dcausse: --^
[09:14:05] elukey: great thank you! I'll try that for the next test
[09:19:17] elukey: looks like notebook1004 is not answering me either :(
[09:19:49] anything special happening?
[09:20:46] Analytics, Analytics-Kanban: "Month over month" i18n tag being mixed with locales - https://phabricator.wikimedia.org/T246750 (fdans) p:Triage→High
[09:21:40] joal: I applied the spawner change earlier on and jupyterhub got restarted, but in theory nothing more
[09:22:11] elukey: I get a very bizarre answer :(
[09:22:12] elukey: on stat1004:/srv/home/dcausse/flink-1.10.0/kerb_error.log (sadly I don't have the full logs, but the job started Friday Feb 28)
[09:22:36] joal: just restarted the notebook, can you check if it is better?
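Note on the Flink keytab settings suggested above: a small Python sketch that derives the two values for a given host, so the stat1007 and airflow variants can be compared side by side. The function name `flink_kerberos_settings` is hypothetical, and the sketch assumes the keytab path is identical on both hosts, which the chat implies but does not spell out. It prints the pairs in `key: value` form, as they would appear in `flink-conf.yaml`.

```python
def flink_kerberos_settings(user: str, host: str, realm: str = "WIKIMEDIA") -> dict:
    """Return the two Flink options named in the chat for a keytab-based login.

    Assumptions: the keytab lives under /etc/security/keytabs/<user>/ and the
    principal has the form <user>/<host>@<realm>, as in the quoted values.
    """
    return {
        "security.kerberos.login.keytab": f"/etc/security/keytabs/{user}/{user}.keytab",
        "security.kerberos.login.principal": f"{user}/{host}@{realm}",
    }


# Only the principal changes between the two hosts mentioned above.
for host in ("stat1007.eqiad.wmnet", "an-airflow1001.eqiad.wmnet"):
    for key, value in flink_kerberos_settings("analytics-search", host).items():
        print(f"{key}: {value}")
```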
[09:23:32] elukey: I get `ERR_EMPTY_RESPONSE`
[09:23:46] dcausse: ah yes kerb credentials expired
[09:24:02] yes
[09:24:24] it ran almost 5 days I think
[09:25:04] Analytics, Analytics-Wikistats: 'All" time range does not transfer well across metrics - https://phabricator.wikimedia.org/T227038 (fdans) p:High→Medium
[09:25:06] so in theory the 24h limit for kerberos applies to credentials, and it is enforced when you use them
[09:25:25] from the logs for example it started to complain when a connection to the hadoop master was needed
[09:25:36] at that point, the credentials were expired, and the connection was refused
[09:26:06] with the keytab there is a periodic refresh that happens (if flink behaves like spark, but I think so)
[09:26:13] so there shouldn't be any issue anymore
[09:26:15] dcausse: --^
[09:26:21] joal: maybe we can bc ?
[09:26:26] sure
[09:26:33] elukey: thanks!!
[09:45:31] !log roll restart of cassandra on AQS to pick up new openjdk upgrades
[09:45:32] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:59:13] elukey: another question on notebooks please
[09:59:34] elukey: in here https://wikitech.wikimedia.org/wiki/SWAP#Custom_Spark_Notebook_Kernels it says: # Activate your Jupyter virtualenv (if it isn't already activated):
[09:59:49] elukey: but, -rw-r--r-- 1 joal wikidev 2.1K Jan 13 2017 activate
[09:59:56] no x perm on the file :(
[10:00:27] joal: in what dir is it?
[10:00:41] notebook 1003 /home/joal/venv/bimn
[10:00:42] notebook 1003 /home/joal/venv/bin
[10:00:44] sorry
[10:01:59] actually elukey it worked by adding a dot - please excuse me - I think I'm not good at anything this morning :(
[10:02:56] joal: please don't say that, I have a never-ending backlog of things that I have asked of you in the past. You can basically use my time whenever you want for the next 10y :
[10:03:00] :D
[10:03:09] You're too nice ;)
[10:03:34] just created https://gerrit.wikimedia.org/r/#/c/operations/cookbooks/+/577525/2/cookbooks/sre/presto/roll-restart-workers.py
[10:03:50] one daemon at a time seems reasonable for the moment
[10:04:20] Yay :)
[10:05:29] (CR) Ladsgroup: [C: +1] "Thanks!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/576449 (https://phabricator.wikimedia.org/T236895) (owner: Joal)
[10:06:52] !log roll restart Presto daemons for openjdk upgrades
[10:06:52] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:18:04] Analytics: Check home/HDFS leftovers of flemmerich - https://phabricator.wikimedia.org/T246070 (elukey) ` ====== stat1004 ====== total 0 ls: cannot access '/var/userarchive/flemmerich.tar.bz2': No such file or directory ====== stat1005 ====== total 0 ls: cannot access '/var/userarchive/flemmerich.tar.bz2':...
[10:28:19] awight: I have managed to make vegas-viz work - But it's not very straightforward: you need to create a custom-kernel with a jar containing the dependency ... :S
[10:35:35] O_O
[10:36:14] awight: I'm gonna talk to my beloved ops elukey and Andrew to try to find a better way to make the libs accessible :)
[10:36:32] * joal is always afraid of big-round-eyes
[10:37:20] The nice thing awight is that the small test I have done with vegas worked like a charm
[10:45:26] vegas-viz looks nice! I'm not officially supposed to be going down this rabbithole, so no pressure to get the kernel in place.
[10:45:55] (03PS1) 10Fdans: Remove graph trend from Wikistats detail page [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/577531 (https://phabricator.wikimedia.org/T212032) [10:46:13] ack awight - I must say the request will be for me as well ;) [10:46:32] ;-) well then it sounds great [10:46:48] 10Analytics, 10Analytics-Kanban, 10Analytics-Wikistats, 10Patch-For-Review: Wikistats 2 pageviews trend figure is wrong - https://phabricator.wikimedia.org/T212032 (10fdans) a:03fdans [10:47:00] I organized my scala code into smaller modules, feeling quite good about basing future experiments on that repo... [10:49:23] (03CR) 10Thiemo Kreuz (WMDE): [C: 03+1] Count disables of the TwoColumnConflict interface [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/577213 (https://phabricator.wikimedia.org/T246104) (owner: 10WMDE-Fisch) [10:57:27] (03CR) 10Elukey: "Andrew: I am trying to build the repo via Docker and I realized that we are still using node 6 (that is not supported anymore). Should we " [analytics/jupyterhub/deploy] - 10https://gerrit.wikimedia.org/r/577477 (https://phabricator.wikimedia.org/T247055) (owner: 10Elukey) [10:59:29] 10Analytics, 10Analytics-Kanban, 10Release Pipeline, 10Patch-For-Review, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (10akosiaris) >>! In T238658#5946564, @Ottomata wrote: > FYI grafana dash here: > > https://grafana.wikimedia.org/d/znIuUcsWz/... [10:59:44] 10Analytics, 10Analytics-Wikistats: Annotations in wikistats that are only visible on "all" time range get bundled up (probably an issue we cannot resolve until we have a more granular time range) - https://phabricator.wikimedia.org/T200020 (10fdans) p:05High→03Medium @Milimetric I think when the annotatio... [11:01:38] 10Analytics: Weird performance of sqoop job on Edit Reconstruction - https://phabricator.wikimedia.org/T172579 (10fdans) This is on the wrong column, is this still an issue? 
[11:02:08] 10Analytics, 10Gerrit, 10Gerrit-Privilege-Requests, 10User-MarcoAurelio: Give access to Wikistats 2 to l10n-bot - https://phabricator.wikimedia.org/T245805 (10MarcoAurelio) 05Open→03Resolved Working smoothly now: https://gerrit.wikimedia.org/r/#/c/analytics/wikistats2/+/577291/ - Closed. [11:04:40] 10Analytics, 10Gerrit, 10Gerrit-Privilege-Requests, 10User-MarcoAurelio: Give access to Wikistats 2 to l10n-bot - https://phabricator.wikimedia.org/T245805 (10fdans) This is great, thank you @MarcoAurelio !! [11:06:09] 10Analytics: Remove Topic Explorer from Wikistats - https://phabricator.wikimedia.org/T247064 (10fdans) [11:07:59] 10Analytics, 10Analytics-Wikistats: WikiStats should recognize global bots - https://phabricator.wikimedia.org/T37196 (10fdans) [11:15:15] 10Analytics, 10Analytics-Wikistats: WikiStats should recognize global bots - https://phabricator.wikimedia.org/T37196 (10fdans) if the current breakdowns on editors metrics fulfill this let's close this task: https://stats.wikimedia.org/#/all-projects/contributing/top-editors/normal|table|last-month|editor_ty... [11:21:51] 10Analytics, 10Analytics-Wikistats: Siteviews of all Wikipedias per month - https://phabricator.wikimedia.org/T224963 (10fdans) p:05Low→03Medium Raising the priority of this as we move closer to working on cross-wiki views. I'll probably create a parent task for all usecase requests. [11:31:19] 10Analytics, 10Analytics-Wikistats: [Wikistats v2] Default selection for (active) editors is confusing for inexperienced users - https://phabricator.wikimedia.org/T213800 (10fdans) Question for team, should we only count users in this metric? [11:34:07] 10Analytics: Measure Community Backlog. - https://phabricator.wikimedia.org/T155497 (10fdans) p:05Medium→03Triage This is on the Wikistats column but I don't see any actionables related to Wikistats, and it hasn't been touched in 3 years so let's re-groom. 
[11:37:15] 10Analytics: Gather all constants related to mobile/responsiveness in config - https://phabricator.wikimedia.org/T190339 (10fdans) p:05Medium→03Low This has not proved to be a problem since we pushed the mobile site. This task's description is a bit vague so unless anyone has concrete issues with the mobile... [11:37:17] * elukey lunch! [11:41:35] 10Analytics: Serve legacy code only to legacy browsers - https://phabricator.wikimedia.org/T207311 (10fdans) Since we're doing already all kinds of magic to decide which bundle to serve (because i18n), we could totally add this and have a "light" and "heavy" bundle [11:43:20] 10Analytics: We need better UI addressing when are metrics publicly available - https://phabricator.wikimedia.org/T226403 (10fdans) p:05Medium→03Low This was a problem when the Mediawiki snapshots took until mid-month to get computed, but I don't think this is an issue anymore [11:46:09] 10Analytics: Make MapChart load async instead of parts of MapChart - https://phabricator.wikimedia.org/T229382 (10fdans) If we do this we should maaaaaybe increase the resolution of the map? [11:59:33] 10Analytics, 10Analytics-Kanban, 10Release Pipeline, 10Patch-For-Review, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (10akosiaris) @Ottomata merged and pooled 1 kubernetes host with a very low weight (1) in the pool and that on just in eqiad. It... 
[12:09:06] (03PS1) 10Fdans: Show correct 12 month period in the dashboard [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/577542 (https://phabricator.wikimedia.org/T238894) [12:09:33] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Legend for 12 months is not correct in graphs of analytics - https://phabricator.wikimedia.org/T238894 (10fdans) [12:11:07] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Legend for 12 months is not correct in graphs of analytics - https://phabricator.wikimedia.org/T238894 (10fdans) p:05Medium→03High [12:29:03] 10Analytics, 10Analytics-Kanban, 10Release Pipeline, 10Patch-For-Review, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (10fgiunchedi) >>! In T238658#5947574, @akosiaris wrote: >>>! In T238658#5946564, @Ottomata wrote: >> FYI grafana dash here: >>... [12:40:41] elukey: when you have a minute - https://wikitech.wikimedia.org/wiki/Analytics/Systems/AQS/Scaling/2020/Cluster_Expansion [13:37:31] hi elukey. Please, could you add me the test_gpu group on the stat1005? [13:38:01] gpu-testers [13:56:46] joal: ack, I'll do it before EOD ok? [13:57:00] dsaez: 5 euros fee! [13:57:01] :D [13:57:19] sure np elukey - I'm also planning on bumping druid-public snapshot for AQS if ok for you [13:57:53] +1 [14:03:50] dsaez: done! [14:09:57] 10Analytics: Upgrade AMD ROCm to latest upstream - https://phabricator.wikimedia.org/T247082 (10elukey) [14:11:35] hello analytics folks, apologies as I don't even know if you're responsible for this tool (or the data backing it), but a friend pointed it out to me: and something funny is going on with the article on the United States Senate [14:11:37] it is the #1 page viewed every day right now since early feb [14:11:39] https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2020-02-01&end=2020-02-29&pages=United_States_Senate [14:12:44] that does look weird! 
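Note on the pageviews link pasted above: a short Python sketch showing how that URL breaks down into query parameters, which makes it easier to vary the platform or date range when checking a suspicious page. The helper name `pageviews_url` is hypothetical; the parameter names and default values are copied from the link in the chat, not from the tool's documentation.

```python
from urllib.parse import urlencode

# Base address of the Pageviews tool, as it appears in the pasted link.
BASE = "https://tools.wmflabs.org/pageviews/"


def pageviews_url(page: str, start: str, end: str,
                  project: str = "en.wikipedia.org",
                  platform: str = "all-access",
                  agent: str = "user") -> str:
    """Rebuild a Pageviews link from its parts (dates are YYYY-MM-DD strings)."""
    params = {
        "project": project,
        "platform": platform,
        "agent": agent,
        "start": start,
        "end": end,
        "pages": page,
    }
    return BASE + "?" + urlencode(params)


print(pageviews_url("United_States_Senate", "2020-02-01", "2020-02-29"))
```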
[14:14:08] it's just 'mobile web' platform that looks affected [14:18:23] 10Analytics, 10Analytics-Kanban, 10Release Pipeline, 10Patch-For-Review, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (10Ottomata) Ping @colewhite on all the service runner prometheus stuff above in the last 2 comments! :) Re {T205870} and http... [14:18:46] could it be a bot not tagged as such? [14:18:52] elukey: seems likely [14:21:48] elukey: almost certainly: https://w.wiki/Jqz [14:22:10] amazon IP space [14:22:34] 10Analytics: Kerberos credential cache expiry time on notebook is different than the OS one - https://phabricator.wikimedia.org/T247084 (10elukey) [14:26:27] cdanis: ah yes makes sense! If you have time can you open a task tagging "Analytics" please? [14:26:32] otherwise I'll do it later on [14:29:00] 10Analytics: clear bot spam-scraping [[en:United States Senate]] not being detected as a bot - https://phabricator.wikimedia.org/T247085 (10CDanis) [14:32:36] 10Analytics: Kerberos credential cache expiry time on notebook is different than the OS one - https://phabricator.wikimedia.org/T247084 (10MoritzMuehlenhoff) With PrivateTmp the the named-spaced /tmp gets removed when the correspondent service units terminates. As such, every restart of Jupyter will affect it.... [14:33:08] elukey: ty! [14:33:51] :) [14:33:54] thanks for reporting! [14:46:08] joal: aqs1004 depooled and ready to test [14:47:26] elukey I owe you 5 eur, and 1 dinner for joal. [14:47:31] ahahhaha [14:47:43] :D [14:48:03] dsaez: I added you as subscriber to a task for upgrading tensorflow to 2.x, keep it in mind if you are planning work [14:48:12] cool, thx [14:48:38] good for me elukey - FULL DEPLOY :) [14:51:13] doing it! [14:51:16] ottomata: hiiiiii [14:51:46] helloooo! 
[14:57:04] elukey: wikistats updated, looks like deploy is done :) [14:57:18] thanks mate :) [14:57:29] goood [14:58:03] !log AQS new druid snapshot released (2020-02) [14:58:04] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:58:09] ottomata: any thoughts about bumping nodejs to 10 on notebooks ? [14:58:27] +a billion [14:58:38] elukey: is node 10 default on buster? [14:58:46] maybe we can just put swap there and try it? [14:58:52] on stat1005? [15:01:10] ottomata: I think that we should have already moved to it since node 6 is not supported anymore, let's not mention this to moritzm [15:01:38] before adding swap to 1005 I'd love to try the slice change [15:01:58] I can rebuild now the repo with nodejs 10 on docker [15:02:11] and then we upgrade it also on notebooks when we are ready to deploy [15:02:14] wdyt? [15:02:24] sure! [15:02:29] sounds good [15:02:35] super, working on it [15:02:56] yeah, buster has Node 10 only [15:03:10] but there's a Node 10 backport/component for Stretch [15:05:12] :) [15:07:53] 10Analytics, 10Operations, 10Wikidata, 10Wikidata-Query-Service: Deployment strategy and hardware requirement for new Flink based WDQS updater - https://phabricator.wikimedia.org/T247058 (10Ottomata) [15:17:09] 10Analytics, 10Analytics-Kanban, 10Release Pipeline, 10Patch-For-Review, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (10Ottomata) > merged and pooled 1 kubernetes host with a very low weight (1) in the pool and that on just in eqiad AWESOME! ye... 
[15:18:14] (03PS2) 10Elukey: Upgrade repo to nodejs 10 and jupyterhub-systemdspawner 0.13 [analytics/jupyterhub/deploy] - 10https://gerrit.wikimedia.org/r/577477 (https://phabricator.wikimedia.org/T247055) [15:20:54] (03CR) 10Ottomata: [C: 03+1] Upgrade repo to nodejs 10 and jupyterhub-systemdspawner 0.13 [analytics/jupyterhub/deploy] - 10https://gerrit.wikimedia.org/r/577477 (https://phabricator.wikimedia.org/T247055) (owner: 10Elukey) [15:21:11] Luca since you are in there...you might as well take this moment to try to build for buster too! [15:21:47] ottomata: ah yes good point! I'll do it right after this thing.. I was thinking how to test, then I realized that we have an-tool1006! [15:21:49] if you just build inside a buster image, it shoudl ljust work [15:21:54] :) [15:22:15] I am sending the same patch with a better commit msg [15:22:27] ok prepping the patch to use nodejs 10 [15:22:32] on an-tool1006 [15:23:10] (03PS3) 10Elukey: Upgrade nodejs and jupyterhub-systemdspawner [analytics/jupyterhub/deploy] - 10https://gerrit.wikimedia.org/r/577477 (https://phabricator.wikimedia.org/T247055) [15:26:32] 10Analytics, 10Operations, 10User-Elukey: Refactor Analytics POSIX groups in puppet to improve maintainability - https://phabricator.wikimedia.org/T246578 (10elukey) I checked all the `statistics-users` members and all except two are already in other groups (`analytics-privatedata-users`, `statistics-private... [15:37:54] 10Analytics, 10Release-Engineering-Team-TODO, 10Continuous-Integration-Infrastructure (phase-out-jessie), 10Release-Engineering-Team (CI & Testing services): Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (10hashar) @JAllemandou I guess we should pair o... 
[15:45:29] 10Analytics, 10Operations, 10Wikidata, 10Wikidata-Query-Service: Deployment strategy and hardware requirement for new Flink based WDQS updater - https://phabricator.wikimedia.org/T247058 (10Ottomata) A nice feature of Flink is its support for both batch and stream processing. Ideally, we'd be able to buil... [15:47:21] there you go https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/577590/ [15:48:33] 10Analytics, 10Better Use Of Data, 10Event-Platform, 10Patch-For-Review, 10Product-Infrastructure-Team-Backlog (Kanban): EventLogging MEP Upgrade - https://phabricator.wikimedia.org/T238544 (10jlinehan) @Krinkle and I talked for >2 hours about this and @phuedx also took time to fill me in on his perspect... [16:32:33] 10Analytics: Make MapChart load async instead of parts of MapChart - https://phabricator.wikimedia.org/T229382 (10Milimetric) Hm, maybe it'd be extra cool to have two different resolutions and use the smaller/rougher one on mobile. [16:33:18] 10Analytics: Serve legacy code only to legacy browsers - https://phabricator.wikimedia.org/T207311 (10Milimetric) We're doing the i18n magic in Apache? I didn't realize... if so, then sure! [16:35:11] 10Analytics: Measure Community Backlog. - https://phabricator.wikimedia.org/T155497 (10Milimetric) I actually don't think there's any direct way our team can do this. I'm deprioritizing and making it part of our collaboration with other teams. Essentially we need buy-in from product that this is a thing they'r... [16:37:31] 10Analytics: Weird performance of sqoop job on Edit Reconstruction - https://phabricator.wikimedia.org/T172579 (10Milimetric) The latest theory is that Joseph's proposed changes with how we weight and distribute work for different size tables would make this problem go away. I'm not sure, but we can deprioritiz... 
[16:47:20] 10Analytics, 10Analytics-Wikistats: Add flagged revision status statistics to Wikistats 2.0 - https://phabricator.wikimedia.org/T177951 (10Milimetric) p:05Medium→03Low [16:47:52] 10Analytics, 10Product-Analytics: SQL definition for wikidata metrics for tunning session - https://phabricator.wikimedia.org/T247099 (10Nuria) [16:48:16] 10Analytics, 10Product-Analytics: SQL definition for wikidata metrics for tunning session - https://phabricator.wikimedia.org/T247099 (10Nuria) a:03jwang [16:48:59] 10Analytics, 10Product-Analytics: Tech Tunning Session metrics - https://phabricator.wikimedia.org/T247100 (10Nuria) [16:49:01] Interesting post about how Netflix uses Kafka & Druid for ingesting over 2 million events per second, and querying over 1.5 trillion rows to get detailed insights into how their users are experiencing the service https://netflixtechblog.com/how-netflix-uses-druid-for-real-time-insights-to-ensure-a-high-quality-experience-19e1e8568d06 [16:49:22] 10Analytics, 10Product-Analytics: Tech Tunning Session metrics - https://phabricator.wikimedia.org/T247100 (10Nuria) [16:49:24] 10Analytics, 10Product-Analytics: SQL definition for wikidata metrics for tunning session - https://phabricator.wikimedia.org/T247099 (10Nuria) [16:50:12] 10Analytics, 10Product-Analytics: SQL definition for structure data in commons metrics - https://phabricator.wikimedia.org/T247101 (10Nuria) [16:58:12] 10Analytics, 10Better Use Of Data, 10Event-Platform, 10Patch-For-Review, 10Product-Infrastructure-Team-Backlog (Kanban): EventLogging MEP Upgrade - https://phabricator.wikimedia.org/T238544 (10mforns) Option 3 sounds great to me (given that @krinkle approves potential performance issues). I believe it's... [16:58:39] 10Quarry, 10DBA, 10Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (10bd808) >>! In T246970#5947203, @zhuyifei1999 wrote: > Can I revert to analytics? Yes. 
No matter why it was switched 8 months ago (?) it should be running against the analytic... [16:59:27] 10Analytics, 10Epic, 10Product-Analytics (Kanban): wmfdata's Kerberos check should require at least 8 hours of validity - https://phabricator.wikimedia.org/T247103 (10nshahquinn-wmf) [17:02:30] milimetric: nuria yoohooo [17:02:59] 10Analytics, 10Better Use Of Data, 10Event-Platform, 10Patch-For-Review, 10Product-Infrastructure-Team-Backlog (Kanban): EventLogging MEP Upgrade - https://phabricator.wikimedia.org/T238544 (10Ottomata) Agree! [17:05:06] 10Analytics, 10Epic, 10Product-Analytics (Kanban): wmfdata's Kerberos check should require at least 8 hours of validity - https://phabricator.wikimedia.org/T247103 (10nshahquinn-wmf) p:05Triage→03Medium [17:14:52] 10Analytics, 10Epic, 10Product-Analytics (Kanban): wmfdata's Kerberos check should require at least 24 hours of validity - https://phabricator.wikimedia.org/T247103 (10Nuria) [17:25:57] 10Analytics: Question box in wikistats, instrument usage with piwik - https://phabricator.wikimedia.org/T247106 (10Nuria) [17:34:19] 10Analytics: Measure Community Backlog. - https://phabricator.wikimedia.org/T155497 (10Nuria) Moving to deprioritized column. [17:41:29] ottomata: qq - it seems that jupyterhub is not configured in puppet to be deployed via scap, but the repo has some info about scap targets etc.. [17:41:39] (we do git::close afaics) [17:42:08] so should add scap configs etc..? 
[17:42:14] should I
[17:42:24] not sure if we didn't do it for a reason
[17:55:39] Analytics: Weird performance of sqoop job on Edit Reconstruction - https://phabricator.wikimedia.org/T172579 (Nuria) Open→Declined
[17:57:41] -.slice
[17:57:41] ├─user.slice
[17:57:41] │ ├─jupyter-elukey-singleuser.service
[17:57:42] │ │ └─29154 /home/elukey/venv/bin/python3 /home/elukey/venv/bin/jupyterhub-singleuser --port=50775
[17:57:45] yessssss
[17:57:46] \o/
[17:57:50] it works on an-tool1006
[17:59:36] Analytics: Check home/HDFS leftovers of flemmerich - https://phabricator.wikimedia.org/T246070 (leila) @flemmerich can you review the comment above and let @elukey know what can be purged? Anything that needs to be kept, I suggest we move it under @Isaac's directory if he's fine with it. flemmerich: I'd app...
[17:59:53] Analytics, Analytics-Kanban, Patch-For-Review: Upgrade jupyterhub-systemdspawner from 0.9.9 to 0.13 to allow the use of systemd custom slices - https://phabricator.wikimedia.org/T247055 (elukey) On an-tool1006 it seems working: ` -.slice ├─user.slice │ ├─jupyter-elukey-singleuser.service │ │ └─29154...
[18:00:10] Analytics, Analytics-Kanban, Release Pipeline, Patch-For-Review, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (colewhite) I created a [[ https://github.com/wikimedia/service-runner/pull/225 | PR to service-runner for the updates to heap...
[18:06:54] Gone for dinner :)
[18:10:41] Analytics: Check home/HDFS leftovers of flemmerich - https://phabricator.wikimedia.org/T246070 (Isaac) Thanks @leila for looping us in: I'll be taking care of saving the scripts from stat1007 and will ping when I am complete with that. The motivations database on Hive is just tables of redirects that were...
[18:10:49] Analytics, Operations, Wikidata, Wikidata-Query-Service: Deployment strategy and hardware requirement for new Flink based WDQS updater - https://phabricator.wikimedia.org/T247058 (Pchelolo) Yeah, @Gehel analysis is correct - change-prop is pretty simple and doesn't support any of the advanced fea...
[18:31:08] joal: doc looks good, will review it again on monday with fresh brain/eyes :)
[18:31:11] * elukey off!
[18:35:44] Analytics, Analytics-Kanban, Better Use Of Data, Desktop Improvements, and 8 others: Enable client side error logging in prod for small wiki - https://phabricator.wikimedia.org/T246030 (Nuria) @Tgr mmm, I think your last comment is incorrect. The gadget behaves the same as code like : window.se...
[18:51:11] Analytics, Analytics-EventLogging, Analytics-Kanban, DiscussionTools, and 5 others: New EventLogging queue doesn't log events in window.unload - https://phabricator.wikimedia.org/T246382 (ppelberg) Open→Resolved
[18:54:42] (CR) Nuria: "This change is on top of master but the language menu does not appear?" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/577531 (https://phabricator.wikimedia.org/T212032) (owner: Fdans)
[18:55:37] nuria: the language menu won’t appear in the dev build
[18:57:40] fdans: i see
[18:59:40] (CR) Nuria: Remove graph trend from Wikistats detail page (1 comment) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/577531 (https://phabricator.wikimedia.org/T212032) (owner: Fdans)
[19:08:35] Analytics, Analytics-Kanban, Patch-For-Review: Upgrade jupyterhub-systemdspawner from 0.9.9 to 0.13 to allow the use of systemd custom slices - https://phabricator.wikimedia.org/T247055 (Ottomata) SWAP is weird. You could use scap for it, but it would be mostly just doing a git pull, which is what p...
[19:14:00] Analytics, Analytics-Kanban, Patch-For-Review: Upgrade jupyterhub-systemdspawner from 0.9.9 to 0.13 to allow the use of systemd custom slices - https://phabricator.wikimedia.org/T247055 (elukey) >>! In T247055#5949471, @Ottomata wrote: > SWAP is weird. You could use scap for it, but it would be most...
[19:16:01] ottomata: lex needs access to hue
[19:16:05] cc lexnasser
[19:20:40] Done nuria and lexnasser
[19:24:33] ottomata: thanks, checking
[19:31:18] ottomata: login doesn't seem to be working - I'm using the same credentials as for Turnilo, is that correct?
[19:31:29] shell username, ldap password
[19:31:37] your username should be lexnasser
[19:31:39] ya?
[19:32:06] ottomata: perfect, that worked. Thanks!
[19:32:53] great!
[20:28:17] Analytics, Better Use Of Data, Event-Platform, Patch-For-Review, Product-Infrastructure-Team-Backlog (Kanban): EventLogging MEP Upgrade - https://phabricator.wikimedia.org/T238544 (Ottomata) Ok, I had a pass at doing this idea. Here's the basic gist: - EventLogging extension.json defines a...
[21:13:49] Analytics, Analytics-Kanban, Patch-For-Review: Upgrade jupyterhub-systemdspawner from 0.9.9 to 0.13 to allow the use of systemd custom slices - https://phabricator.wikimedia.org/T247055 (Ottomata) Yup!
[21:14:14] Analytics, MediaWiki-extensions-WikimediaEvents, Core Platform Team Workboards (Clinic Duty Team), Performance-Team (Radar): Remove usage of MEDIAWIKI_JOB_RUNNER from WikimediaEvents extension - https://phabricator.wikimedia.org/T247130 (Clarakosi)
[21:30:29] Analytics, Better Use Of Data, Desktop Improvements, Product-Infrastructure-Team-Backlog, and 7 others: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (Ottomata) @fgiunchedi very few error events are flowing in now! This is live on group0 wikis. Can we h...
[21:38:38] Analytics, Analytics-Kanban, Better Use Of Data, Desktop Improvements, and 8 others: Enable client side error logging in prod for small wiki - https://phabricator.wikimedia.org/T246030 (Tgr) I can see the network request for the `mediawiki.client.error` event triggered when I enable the gadget an...
[22:06:22] Analytics, Analytics-Kanban: Problem with Matomo page overlay - https://phabricator.wikimedia.org/T246046 (Varnent) Let me know if there's any code we should be adding to the sites. This did work previously, but perhaps VIP has changed something on their end? I know they have been beefing up their securi...
[23:47:21] Analytics, Discovery, Discovery-Search (Current work), Patch-For-Review: Data for events from wdqs needs to be deleted after 90 days and/or sanitized - https://phabricator.wikimedia.org/T247034 (Nuria) @dcausse similar to cirrus data some data can be kept long term if it is on the sanitization li...
[23:49:05] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review, User-Elukey: Kerberize Superset to allow Presto queries - https://phabricator.wikimedia.org/T239903 (Nuria)
[23:49:42] Analytics, MediaWiki-extensions-WikimediaEvents, Core Platform Team Workboards (Clinic Duty Team), Performance-Team (Radar): Remove usage of MEDIAWIKI_JOB_RUNNER from WikimediaEvents extension - https://phabricator.wikimedia.org/T247130 (Reedy)
[23:50:16] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review, User-Elukey: Kerberize Superset to allow Presto queries - https://phabricator.wikimedia.org/T239903 (Nuria) Pinging @cchen when this ticket is done let's get together so we can show you how to do dashboards on top of data on ha...
[23:53:32] Analytics, Analytics-Kanban: Hourly labeling of "automated" traffic before loading of pageviews into pageview_hourly - https://phabricator.wikimedia.org/T238361 (Nuria)
[23:54:31] (CR) Nuria: "Ping us if you have issues testing the job" [analytics/refinery] - https://gerrit.wikimedia.org/r/576618 (https://phabricator.wikimedia.org/T244597) (owner: Lex Nasser)
[23:55:33] (CR) Nuria: [C: -1] "The "3.57% over this time range" is still there, is it not?" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/577531 (https://phabricator.wikimedia.org/T212032) (owner: Fdans)