[05:41:43] (CR) Hoo man: [C: +2] Remove action counter in apiLogScanner.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/520918 (https://phabricator.wikimedia.org/T226292) (owner: Ladsgroup)
[05:42:07] (Merged) jenkins-bot: Remove action counter in apiLogScanner.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/520918 (https://phabricator.wikimedia.org/T226292) (owner: Ladsgroup)
[05:46:52] (CR) Hoo man: "I think it would be better to access the setting right where it's needed." (1 comment) [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/520903 (https://phabricator.wikimedia.org/T218710) (owner: Ladsgroup)
[07:03:02] !log add base::firewall to stat1004
[07:03:03] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:03:23] spark works fine afaics
[07:03:39] let's see how it goes :)
[07:04:03] if nothing pops up I'll proceed with the other stats one at the time
[07:04:08] and then notebooks
[07:04:58] Analytics: Enable Security (stronger authentication and data encryption) for the Analytics Hadoop cluster and its dependent services - https://phabricator.wikimedia.org/T211836 (elukey)
[07:06:14] Analytics, Operations, hardware-requests, User-Elukey: eqiad: 2 misc nodes for the Kerberos KDC service - https://phabricator.wikimedia.org/T227288 (elukey) Adding @Ottomata for a quick check about the next steps, but it sounds to me that having one kerberos host per DC seems the most flexible so...
[07:08:08] (PS1) Ladsgroup: Remove action counter in apiLogScanner.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/521218 (https://phabricator.wikimedia.org/T226292)
[07:08:37] (CR) Ladsgroup: [C: +2] Remove action counter in apiLogScanner.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/521218 (https://phabricator.wikimedia.org/T226292) (owner: Ladsgroup)
[07:08:59] (Merged) jenkins-bot: Remove action counter in apiLogScanner.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/521218 (https://phabricator.wikimedia.org/T226292) (owner: Ladsgroup)
[07:11:22] Analytics, Analytics-Cluster, Analytics-Kanban, User-Elukey: Enable base::firewall on stat boxes after restricting Spark REPL ports. - https://phabricator.wikimedia.org/T170826 (elukey) a:elukey
[07:15:38] Analytics, Operations, hardware-requests, User-Elukey: eqiad: 2 misc nodes for the Kerberos KDC service - https://phabricator.wikimedia.org/T227288 (MoritzMuehlenhoff) >>! In T227288#5308441, @elukey wrote: > If you think that we'll have a future use case for codfw, I am +1 to buy one misc node i...
[07:31:17] (CR) Ladsgroup: "> Patch Set 2:" (1 comment) [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/520903 (https://phabricator.wikimedia.org/T218710) (owner: Ladsgroup)
[07:34:37] AC remote doesn't work.
It's already a terrible day
[07:35:14] good morning :)
[07:49:31] Analytics, Operations, hardware-requests, User-Elukey: eqiad: 1 misc node for the Kerberos KDC service - https://phabricator.wikimedia.org/T227288 (elukey)
[07:51:26] Analytics, Operations, hardware-requests, User-Elukey: codfw: 1 misc node for the Kerberos KDC service - https://phabricator.wikimedia.org/T227425 (elukey)
[07:52:47] Analytics, Operations, hardware-requests, User-Elukey: eqiad: 1 misc node for the Kerberos KDC service - https://phabricator.wikimedia.org/T227288 (elukey) Amended this task and created T227425 :)
[08:07:50] helloooo elukey :)
[08:08:23] fdans: o/
[08:08:30] I was checking gerrit and I noticed https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/517085/
[08:08:35] do you need a merge?
[08:13:40] elukey: not yet! we have it scheduled for thursday
[08:20:36] ah okok nice
[08:32:00] (PS1) Ladsgroup: Fix file permissions of pagelinks_to_namespaces.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/521223
[08:40:07] (CR) Michael Große: [C: +2] Fix file permissions of pagelinks_to_namespaces.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/521223 (owner: Ladsgroup)
[08:40:46] (Merged) jenkins-bot: Fix file permissions of pagelinks_to_namespaces.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/521223 (owner: Ladsgroup)
[08:42:15] (PS1) Ladsgroup: Fix file permissions of pagelinks_to_namespaces.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/521225
[08:42:20] (CR) Ladsgroup: [C: +2] Fix file permissions of pagelinks_to_namespaces.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/521225 (owner: Ladsgroup)
[08:42:41] (Merged) jenkins-bot: Fix file permissions of pagelinks_to_namespaces.php [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/521225 (owner: Ladsgroup)
[10:36:45] * elukey lunch!
[10:51:47] Analytics, Operations, hardware-requests, User-Elukey: codfw: 1 misc node for the Kerberos KDC service - https://phabricator.wikimedia.org/T227425 (MoritzMuehlenhoff) p:Triage→Normal
[12:59:46] Analytics, Operations, hardware-requests, User-Elukey: codfw: 1 misc node for the Kerberos KDC service - https://phabricator.wikimedia.org/T227425 (elukey)
[12:59:48] Analytics, User-Elukey: Make the Kerberos infrastructure production ready - https://phabricator.wikimedia.org/T226089 (elukey)
[13:45:51] Analytics, Operations, hardware-requests, User-Elukey: eqiad: 1 misc node for the Kerberos KDC service - https://phabricator.wikimedia.org/T227288 (Ottomata) +1 for 1 eqiad and 1 codfw
[14:01:50] Analytics, Operations, Patch-For-Review, User-Elukey: Import AMD rocm packages in wikimedia-buster - https://phabricator.wikimedia.org/T224723 (elukey) ` root@install1002:~# reprepro --noskipold --component thirdparty/amd-rocm checkupdate buster-wikimedia Calculating packages to get... ` I am pr...
[14:33:26] Analytics: Refine JsonSchemaLoader uses should use JsonParser instead of YAMLParser to load JSON data - https://phabricator.wikimedia.org/T227484 (Ottomata)
[14:34:34] Analytics: Enable Security (stronger authentication and data encryption) for the Analytics Hadoop cluster and its dependent services - https://phabricator.wikimedia.org/T211836 (elukey)
[14:34:55] Analytics: Enable Security (stronger authentication and data encryption) for the Analytics Hadoop cluster and its dependent services - https://phabricator.wikimedia.org/T211836 (elukey)
[14:35:00] Analytics: Refine JsonSchemaLoader should use JsonParser instead of YAMLParser to load JSON data - https://phabricator.wikimedia.org/T227484 (Ottomata)
[14:44:09] Analytics, ChangeProp, EventBus, Reading-Infrastructure-Team-Backlog, and 2 others: Add change prop rule for new talk endpoint - https://phabricator.wikimedia.org/T226669 (Pchelolo) Open→Declined It's been decided to avoid caching the endpoint for now, thus we don't need purging for now.
[15:01:18] Analytics, Analytics-Kanban, Patch-For-Review: Generate edit totals by country by month/year - https://phabricator.wikimedia.org/T215655 (Milimetric) The monthly job has been merged and running ever since March 27th, it's just the yearly job that's broken. The data generated by the monthly job is do...
[15:02:08] ping fdans standu[p
[15:11:39] Analytics, Analytics-EventLogging, Analytics-Kanban, Performance-Team (Radar): EventLogging needs to enque events to avoid draining users' battery on mobile - https://phabricator.wikimedia.org/T225578 (Nuria) cc @Krinkle
[15:17:36] Analytics: Research. Alarm when one job is taking all the default queue resources?
- https://phabricator.wikimedia.org/T227491 (Nuria)
[15:20:27] (CR) Alaa Sarhan: Use config for wdqs host name (2 comments) [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/520903 (https://phabricator.wikimedia.org/T218710) (owner: Ladsgroup)
[15:21:33] Analytics, Analytics-Kanban: Alarming scripts for entrophy alarms. Anomaly detection and reporting. - https://phabricator.wikimedia.org/T227357 (Milimetric) p:Triage→High
[15:21:59] Analytics, User-Elukey: Move refinery to hive 2 actions - https://phabricator.wikimedia.org/T227257 (Milimetric) p:Triage→High
[15:28:08] Analytics, User-Elukey: Move refinery to hive 2 actions - https://phabricator.wikimedia.org/T227257 (Milimetric)
[15:32:58] Analytics, User-Elukey: Move refinery to hive 2 actions - https://phabricator.wikimedia.org/T227257 (Milimetric) Note: might be good to start with an oozie job like projectview_hourly, followed by pageview_hourly and webrequest load. That way we incrementally run bigger jobs and hopefully allow ourselve...
[15:33:28] Analytics, Analytics-SWAP, Product-Analytics: Enable widgets on Jupyter Labs on SWAP - https://phabricator.wikimedia.org/T227217 (Milimetric) p:Triage→Normal
[15:34:01] Analytics, Analytics-SWAP, Product-Analytics: Enable widgets on Jupyter Labs on SWAP - https://phabricator.wikimedia.org/T227217 (Ottomata) We need to upgrade the OSes to Debian Buster.
[15:34:27] Analytics, Analytics-Kanban, Cleanup, Operations, Patch-For-Review: Archive zookeeper puppet submodule - https://phabricator.wikimedia.org/T227164 (Milimetric) p:Triage→High
[15:36:38] Analytics, ExternalGuidance, Product-Analytics: [Bug] `init` and `mtinfo` event counts drop drastically since June 17 2019 - https://phabricator.wikimedia.org/T227150 (Milimetric) p:Triage→Unbreak!
[15:36:55] Analytics, ExternalGuidance, Product-Analytics: [Bug] `init` and `mtinfo` event counts drop drastically since June 17 2019 - https://phabricator.wikimedia.org/T227150 (Milimetric) p:Unbreak!→High
[15:37:12] Analytics, Analytics-Kanban, ExternalGuidance, Product-Analytics: [Bug] `init` and `mtinfo` event counts drop drastically since June 17 2019 - https://phabricator.wikimedia.org/T227150 (Milimetric) a:Milimetric
[15:37:57] Analytics, Release-Engineering-Team: issues with artifact cache in an-coord1001 - https://phabricator.wikimedia.org/T227132 (Milimetric) p:Triage→High
[15:38:08] Analytics, Analytics-Kanban, Release-Engineering-Team: issues with artifact cache in an-coord1001 - https://phabricator.wikimedia.org/T227132 (Milimetric) a:Ottomata
[15:39:28] Analytics: Move Eventstreams to kubernetes deployment pipeline - https://phabricator.wikimedia.org/T227122 (Milimetric) p:Triage→Normal
[15:40:19] Analytics: Refine JsonSchemaLoader should use JsonParser instead of YAMLParser to load JSON data - https://phabricator.wikimedia.org/T227484 (Milimetric) p:Triage→High
[15:40:23] Analytics: Make JSONSchema aware Refine merge in existing Hive schema to read data - https://phabricator.wikimedia.org/T227088 (Milimetric) p:Triage→High
[15:54:12] Analytics, Analytics-Wikistats: 'All" time range does not transfer well across metrics - https://phabricator.wikimedia.org/T227038 (Milimetric) p:Triage→High
[15:55:26] Analytics, Performance-Team: Release performance data on a regular schedule - https://phabricator.wikimedia.org/T205342 (Milimetric) @Gilles/@Krinkle whenever you're ready with the datasets you want to release, ping us and we can help set up a pipeline.
[15:57:39] Analytics, Operations, Patch-For-Review, Wikimedia-Incident: Move icinga alarm for the EventStreams external endpoint to SRE - https://phabricator.wikimedia.org/T227065 (Milimetric) p:Normal→High
[15:57:53] Analytics, Analytics-Kanban, Operations, Patch-For-Review, Wikimedia-Incident: Move icinga alarm for the EventStreams external endpoint to SRE - https://phabricator.wikimedia.org/T227065 (Milimetric) a:Ottomata
[16:16:06] Analytics: Review Bacula home backups set for stat100[56] - https://phabricator.wikimedia.org/T201165 (Nuria) Talked to #product-analytics on sync up about best practices of keeping code backed up in gerrit and data that needs to be backed up in hdfs, closing.
[16:33:33] Analytics, wikimediafoundation.org: Access to WikimediaFoundation.org analytics for Deb - https://phabricator.wikimedia.org/T227496 (Varnent)
[17:34:21] ottomata: Hi Andrew! I can't connect to the kernel on notebook1004. I've tried deleting the ~/.ipython directory, but it didn't solve the problem. Do you know if there's anything else I can try?
[17:34:41] hmmm maybe related to elukey's work? elukey did you enable firewall on notebooks?
[17:39:36] ottomata: nope only on stat1004
[17:39:45] hello chelsyx!
[17:39:51] ok will check gimme just a few
[17:40:00] oh
[17:40:06] chelsyx, do you tunnel through stat1004?
[17:40:48] chelsyx: is that what you mean by 'connect to the kernel'?
[17:41:20] ottomata: nope
[17:42:15] ottomata: when I start a notebook (even a new one), there is this "Connecting to kernel" message on the upper right corner with orange background
[17:42:54] ottomata: usually it will say "kernel is ready" after a few seconds, but now it just doesn't go away
[17:43:17] ottomata: and I can't run any code in the notebook
[17:43:22] ok so you can connect and log into jupyterhub
[17:43:28] hm
[17:43:33] hm
[17:43:48] chelsyx: just in case, what command do you use to ssh tunnel ?
[17:44:35] ottomata: `ssh -N nb4 & open http://localhost:8000/`
[17:44:47] ottomata: I define nb4 in my config file
[17:45:38] https://www.irccloud.com/pastebin/lGl7NvPs/
[17:45:57] ahh cool, ok was wondering how where the forwarding was, ok
[17:45:58] hm
[17:46:26] Analytics, wikimediafoundation.org: Access to WikimediaFoundation.org analytics for Deb - https://phabricator.wikimedia.org/T227496 (Nuria) Please add ldap username (it should be one word)
[17:46:31] chelsyx: do you have the same problem on notebook1003?
[17:47:32] and, is it all kernels?
[17:47:49] ottomata: nope, and I can run the notebook via command line on notebook1004, e.g. `venv/bin/jupyter nbconvert --execute --to html xxx.ipynb`
[17:48:05] oh so it is fine on 1003
[17:48:07] ok that is strange then.
[17:48:07] hmm
[17:48:32] chelsyx: is it started today or before?
[17:48:59] all kernels chelsyx?
[17:49:12] elukey: it started around last wednesday
[17:49:21] Analytics: Jan Dittrich would like to have access to superset - https://phabricator.wikimedia.org/T227093 (Nuria) Please add here your ldap user name
[17:49:50] ottomata: yes
[17:50:14] I can see a lot of errors in jupyter-chelsyx-singleuser.service on notebook1004
[17:50:17] and not on 1003
[17:50:23] yeah i see those too elukey
[17:50:27] https://github.com/jupyter/notebook/issues/4441 ?
[17:50:37] SingleUserNotebookApp ioloop:909] Exception in callback starting on the 24th though
[17:51:13] chelsyx: did you somehow upgrade the version of tornado your venv has?
[17:51:21] notebook1003 has 5.0.2
[17:51:24] 1004 has 6.0.3
[17:51:39] chelsyx: can you retry now?
[17:51:44] I restarted the unit
[17:52:01] ottomata: Maybe, I don't remember... sorry
[17:52:06] ah interesting!
[17:52:26] chelsyx: maybe some package you installed there upgraded it and caused this bug.
[17:52:50] you also have a newer version of notebook
[17:53:06] apparently this is fixed in your version thou, according to that github issue.
[17:53:09] It works now!
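The tornado mismatch noted above (5.0.2 on notebook1003, 6.0.3 on notebook1004) is the kind of drift that can be checked from inside each venv. A minimal sketch, assuming Python 3.8+ so that `importlib.metadata` is available; the helper names `installed_versions` and `version_drift` are hypothetical and are not part of Jupyter or of any tooling mentioned in the channel:

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for the interpreter running this code."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            versions[name] = None
    return versions


def version_drift(reference, candidate):
    """Return {package: (reference_version, candidate_version)} for every mismatch."""
    return {
        name: (reference.get(name), candidate.get(name))
        for name in sorted(set(reference) | set(candidate))
        if reference.get(name) != candidate.get(name)
    }


if __name__ == "__main__":
    # Run with each venv's own interpreter, then compare the two printed dicts.
    print(installed_versions(["tornado", "notebook"]))
```

Running this with the venv's interpreter on each host and feeding both results to `version_drift` would list only the packages whose versions differ.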
[17:53:16] huh!
[17:53:21] elukey: i bet
[17:53:28] somehow her notebook version got upgraded, which upgraded tornadoi
[17:53:32] when a new kernel started
[17:53:36] it used the new version of tornado
[17:53:41] to talk to the old version of notebook server
[17:53:53] so your restart made the new version of notebook server (with the fix) run
[17:54:17] possible yes
[17:54:27] welp, great fix :)
[17:54:33] ahahahah
[17:54:48] very technical
[17:54:49] Thanks so much ottomata and elukey !
[17:55:56] np!
[17:56:00] going afk for dinner o/
[17:57:35] o/
[18:03:27] ping ottomata MEP meeting
[18:03:36] OGH
[18:03:42] i thougth that was wednesady!
[18:03:45] sorry it got moved so much
[18:03:48] be there in 2 mins...
[18:35:57] ottomata: can you move your google doc for stream config design to "analytics" google drive? https://drive.google.com/drive/u/0/folders/1bcy6Iyb_bLwD1jcfjhL4vtKZvD-CN22L
[18:36:29] done
[18:37:59] ottomata: thnkssir
[18:39:57] Analytics, Product-Analytics: [BUG] Logging error of MobileWikiAppDailyStats for the iOS app - https://phabricator.wikimedia.org/T226219 (chelsyx)
[18:41:19] ottomata: if i want to be the analytics user i do sudo -u analytics?
[18:41:31] yup
[18:41:41] you want to become tghe user or just execute a command?
[18:46:10] Analytics-EventLogging, Analytics-Kanban, EventBus, Core Platform Team Backlog (Watching / External), and 3 others: Modern Event Platform: Stream Intake Service: Migrate Mediawiki Eventbus events to eventgate-main - https://phabricator.wikimedia.org/T211248 (Ottomata)
[18:48:44] (CR) Ladsgroup: Use config for wdqs host name (1 comment) [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/520903 (https://phabricator.wikimedia.org/T218710) (owner: Ladsgroup)
[19:00:00] Analytics, wikimediafoundation.org: Access to WikimediaFoundation.org analytics for Deb - https://phabricator.wikimedia.org/T227496 (Varnent) Is their LDAP the same as our Google Accounts username?
I was told previously it was associated with Wikitech, but perhaps I am remembering wrong or was told wrong...
[19:04:18] Analytics, GrowthExperiments, Product-Analytics, Growth-Team (Current Sprint): Homepage: instrumentation - https://phabricator.wikimedia.org/T216586 (nettrom_WMF)
[19:35:11] Analytics, wikimediafoundation.org: Access to WikimediaFoundation.org analytics for Deb - https://phabricator.wikimedia.org/T227496 (Nuria) ldap is associated with wikitech, normally 1 word
[19:36:59] Analytics, wikimediafoundation.org: Access to WikimediaFoundation.org analytics for Deb - https://phabricator.wikimedia.org/T227496 (Varnent) If it is their username, I suspect then that it is "Deb_Zierten" as they are [[ https://wikitech.wikimedia.org/wiki/User:Deb_Zierten | User:Deb Zierten ]].
[20:53:13] Analytics, wikimediafoundation.org: Access to WikimediaFoundation.org analytics for Deb - https://phabricator.wikimedia.org/T227496 (Nuria) Then they should have access already using "Deb_Zierten"
[21:26:01] Analytics, Analytics-Kanban, ExternalGuidance, Product-Analytics: [Bug] `init` and `mtinfo` event counts drop drastically since June 17 2019 - https://phabricator.wikimedia.org/T227150 (Nuria) Definitely something going on here for 2019-07-01 there are ("recorded") 25 "init" events but if I get...
[21:39:04] Analytics, wikimediafoundation.org: Access to WikimediaFoundation.org analytics for Deb - https://phabricator.wikimedia.org/T227496 (Varnent) Okay - so basically anyone that registers on Wikitech and knows the account password for our analytics instance can get access - you do not need to add any permiss...
[21:39:47] Analytics, wikimediafoundation.org: Access to WikimediaFoundation.org analytics for Deb - https://phabricator.wikimedia.org/T227496 (Nuria) >Okay - so basically anyone that registers on Wikitech and knows the account password for our analytics instance can get access That's correct
[21:39:59] Analytics, wikimediafoundation.org: Access to WikimediaFoundation.org analytics for Deb - https://phabricator.wikimedia.org/T227496 (Nuria) Open→Resolved
[21:40:04] Analytics, wikimediafoundation.org: Access to WikimediaFoundation.org analytics for Deb - https://phabricator.wikimedia.org/T227496 (Varnent) Awesome - thank you! Sorry for the confusion. :)
[23:13:10] Analytics, Analytics-Kanban, ExternalGuidance, Product-Analytics: [Bug] `init` and `mtinfo` event counts drop drastically since June 17 2019 - https://phabricator.wikimedia.org/T227150 (Nuria) Rerun refine for 07/01 to see if anything changes (doesn't see m like it might as latest refine hours ar...
[23:48:03] Analytics, Analytics-Kanban, ExternalGuidance, Product-Analytics: [Bug] `init` and `mtinfo` event counts drop drastically since June 17 2019 - https://phabricator.wikimedia.org/T227150 (Nuria) Nevermind, was able to re-refine (needed to remove REFINED flags as well as SUCCESS ones) but still the...
[23:59:45] Analytics, Analytics-Kanban, ExternalGuidance, Product-Analytics: [Bug] `init` and `mtinfo` event counts drop drastically since June 17 2019 - https://phabricator.wikimedia.org/T227150 (Nuria) I think this events are being filtered because their host is translate.googleusercontent.com which does...