[02:21:16] (PS2) Nuria: Refactoring eventlogging-specific hostname check [analytics/refinery/source] - https://gerrit.wikimedia.org/r/521577 (https://phabricator.wikimedia.org/T227150)
[02:22:38] (CR) Nuria: "Please try #2, removed logic that tried to have somewhat more generic code and left it eventlogging specific." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/521577 (https://phabricator.wikimedia.org/T227150) (owner: Nuria)
[02:37:27] Analytics, Pageviews-API, Tool-Pageviews: 429 Too Many Requests hit despite throttling to 100 req/sec - https://phabricator.wikimedia.org/T219857 (MusikAnimal) @Nuria @fdans Now I see "HyperSwitch request rate limit exceeded" (before it was 429s without a message), despite making no more than the max...
[05:28:19] Analytics, Analytics-Kanban, ExternalGuidance, Product-Analytics, Patch-For-Review: [Bug] `init` and `mtinfo` event counts drop drastically since June 17 2019 - https://phabricator.wikimedia.org/T227150 (Nuria) Also re-refined June, the sanitized data will get adjusted when the 2nd sweep of s...
[05:55:55] Analytics, Pageviews-API, Tool-Pageviews: 429 Too Many Requests hit despite throttling to 100 req/sec - https://phabricator.wikimedia.org/T219857 (Nuria) >when at least some should go through if it was only enforcing the 100 req/sec limit. Let's see, ratelimiting is enforced per IP for public APIs, o...
[06:20:35] Analytics, Operations, netops, LDAP: LDAP ldap-ro.eqiad.wikimedia.org not reachable from Analytics VLAN - https://phabricator.wikimedia.org/T227611 (elukey) From puppet I can see that the change for ldap-ro was reverted: ` elukey@notebook1003:~$ sudo grep ldap /var/log/puppet.log Jul 9 17:46:07...
[06:27:47] Analytics, Operations, Patch-For-Review, User-Elukey: Import AMD rocm packages in wikimedia-buster - https://phabricator.wikimedia.org/T224723 (elukey)
[06:27:59] Analytics, Analytics-Kanban, Operations, Patch-For-Review, User-Elukey: Import AMD rocm packages in wikimedia-buster - https://phabricator.wikimedia.org/T224723 (elukey)
[07:12:15] Analytics, Operations, netops, LDAP: LDAP ldap-ro.eqiad.wikimedia.org not reachable from Analytics VLAN - https://phabricator.wikimedia.org/T227611 (MoritzMuehlenhoff) There are two issues here: 1. We'll need to fix the ACLs so that the analytics VLAN can access the ldap-ro replicas, there's a w...
[07:18:11] Analytics, Operations, Patch-For-Review, User-Elukey: Investigate if a Prometheus exporter for the AMD GPU(s) can be easily created - https://phabricator.wikimedia.org/T220784 (elukey) Stalled→Open
[07:18:17] Analytics, Operations, Research-management, Patch-For-Review, User-Elukey: Remove computational bottlenecks in stats machine via adding a GPU that can be used to train ML models - https://phabricator.wikimedia.org/T148843 (elukey)
[07:20:56] Analytics, Operations, Patch-For-Review, User-Elukey: Investigate if a Prometheus exporter for the AMD GPU(s) can be easily created - https://phabricator.wikimedia.org/T220784 (elukey) I filed a code review to create the initial version of the node exporter, with the following metrics: * usage pe...
[07:21:12] Analytics, Analytics-Kanban, Operations, Patch-For-Review, User-Elukey: Investigate if a Prometheus exporter for the AMD GPU(s) can be easily created - https://phabricator.wikimedia.org/T220784 (elukey)
[07:32:24] Analytics, Analytics-Kanban, Operations, Patch-For-Review, User-Elukey: Investigate if a Prometheus exporter for the AMD GPU(s) can be easily created - https://phabricator.wikimedia.org/T220784 (MoritzMuehlenhoff) Parsing "radeontop -d" might also be an interesting data source.
[07:36:00] Analytics, Operations, netops, LDAP: LDAP ldap-ro.eqiad.wikimedia.org not reachable from Analytics VLAN - https://phabricator.wikimedia.org/T227611 (elukey) About 1. ` elukey@re0.cr1-eqiad# show | compare [edit firewall family inet filter analytics-in4 term ldap from destination-address]...
[08:25:21] Analytics, Analytics-Kanban, Discovery, Operations, and 2 others: Make hadoop cluster able to push to swift - https://phabricator.wikimedia.org/T219544 (fgiunchedi) Sounds good to me, note that there are rate limits in place for write operations (`modules/swift/templates/proxy-server.conf.erb`) i...
[09:28:05] Analytics, Analytics-Kanban, Cleanup, Operations, Patch-For-Review: Archive zookeeper puppet submodule - https://phabricator.wikimedia.org/T227164 (elukey)
[09:29:10] Analytics, Analytics-Kanban, Cleanup, Operations, Patch-For-Review: Archive zookeeper puppet submodule - https://phabricator.wikimedia.org/T227164 (elukey) >>! In T227164#5302590, @elukey wrote: > There are some pull requests to close in https://github.com/wikimedia/puppet-zookeeper/pulls and...
[10:19:54] Analytics, Operations, netops, LDAP, Patch-For-Review: LDAP ldap-ro.eqiad.wikimedia.org not reachable from Analytics VLAN - https://phabricator.wikimedia.org/T227611 (elukey) I am a little bit lost with LDAP config, since we use: 1) ldap-labs.eqiad.wikimedia.org in Jupyterhub's config withou...
[10:34:13] * elukey lunch!
[14:21:03] (CR) Ottomata: [C: +1] "Nice, 1 nit, but +1 :)" (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/521577 (https://phabricator.wikimedia.org/T227150) (owner: Nuria)
[14:42:31] (PS3) Fdans: Add file extension and media type classification to media files UDF [analytics/refinery/source] - https://gerrit.wikimedia.org/r/517641 (https://phabricator.wikimedia.org/T225911)
[14:42:41] (CR) Fdans: Add file extension and media type classification to media files UDF (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/517641 (https://phabricator.wikimedia.org/T225911) (owner: Fdans)
[14:44:05] (CR) jerkins-bot: [V: -1] Add file extension and media type classification to media files UDF [analytics/refinery/source] - https://gerrit.wikimedia.org/r/517641 (https://phabricator.wikimedia.org/T225911) (owner: Fdans)
[14:58:49] (PS4) Fdans: Add file extension and media type classification to media files UDF [analytics/refinery/source] - https://gerrit.wikimedia.org/r/517641 (https://phabricator.wikimedia.org/T225911)
[15:06:43] ottomata: will look at fetcher CR after standup
[15:07:27] fetcher CR?
[15:10:13] Analytics, Analytics-Kanban, Discovery-Search (Current work): Spike. Load search data into turnilo to test whether exploratory data can do away with some of the dashboards - https://phabricator.wikimedia.org/T216058 (Gehel) Any news on this? Can we do something to help this move forward?
[15:14:20] ottomata: schemafetcher
[15:14:35] ah
[15:14:39] k, ya i need to test that
[15:16:10] Analytics, Analytics-Kanban, Discovery-Search (Current work): Spike. Load search data into turnilo to test whether exploratory data can do away with some of the dashboards - https://phabricator.wikimedia.org/T216058 (Gehel) This will not move forward until Q2 (October). We'll talk about it again at t...
[15:34:27] ottomata: ops sync?
[15:34:37] not sure if I am in the wrong place again
[15:34:47] seems to be the new meet batcave
[15:36:56] oh my ok coming
[15:38:03] we can skip if you want, just seen the e-scrum
[15:58:41] (CR) Nuria: [C: -1] "Some comments around code clarity." (6 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/517641 (https://phabricator.wikimedia.org/T225911) (owner: Fdans)
[16:01:19] ping ottomata milimetric standdupppp
[16:01:29] Analytics, Analytics-Kanban, User-Elukey: Check if HDFS offers a way to prevent/limit/throttle users to overwhelm the HDFS Namenode - https://phabricator.wikimedia.org/T220702 (Nuria) Open→Resolved
[16:01:36] nuria: going to deployment pipeline tech talk, sent e-scrum
[16:01:58] sorry should’ve sent message, went to the doctor and will try to work in the afternoon
[16:12:03] Analytics, Analytics-Cluster, Analytics-Kanban, User-Elukey: Enable base::firewall on stat boxes after restricting Spark REPL ports. - https://phabricator.wikimedia.org/T170826 (elukey) >>! In T170826#5321126, @EBernhardson wrote: > Not sure if it's related, but starting this week when i start a...
[16:24:06] (PS3) Nuria: Refactoring eventlogging-specific hostname check [analytics/refinery/source] - https://gerrit.wikimedia.org/r/521577 (https://phabricator.wikimedia.org/T227150)
[16:24:32] (CR) Nuria: Refactoring eventlogging-specific hostname check (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/521577 (https://phabricator.wikimedia.org/T227150) (owner: Nuria)
[16:30:37] (PS4) Nuria: Refactoring eventlogging-specific hostname check [analytics/refinery/source] - https://gerrit.wikimedia.org/r/521577 (https://phabricator.wikimedia.org/T227150)
[16:31:04] (CR) Ottomata: [C: +1] Refactoring eventlogging-specific hostname check [analytics/refinery/source] - https://gerrit.wikimedia.org/r/521577 (https://phabricator.wikimedia.org/T227150) (owner: Nuria)
[16:40:20] Analytics, Pageviews-API, Tool-Pageviews: 429 Too Many Requests hit despite throttling to 100 req/sec - https://phabricator.wikimedia.org/T219857 (MusikAnimal) Thanks. I'm thinking what I might do is when I hit the first 429, make it pause for a bit before resuming making more requests. Or I could ju...
[16:54:25] Analytics, MobileFrontend, Readers-Web-Backlog (Readers-Web-Kanbanana-2019-20-Q1): Having trouble setting up MobileFrontend for development - https://phabricator.wikimedia.org/T226071 (ovasileva)
[16:59:08] Analytics, Analytics-Cluster, Analytics-Kanban: Beeline does not print full stack traces when a query fails - https://phabricator.wikimedia.org/T136858 (Nuria) Tested that with beeline --verbose -f select.hql > out.txt stack trace is like the one hive would provide so closing!
[16:59:40] nuria: \o/
[17:00:09] elukey: I KNOW
[17:00:53] elukey: did I miss the verbose 1st time around? or did we update since my last test?
[17:01:26] nuria: I think that between 5.10 and 5.15 (cdh) something changed, but I wasn't able to track it down in the change logs
[17:01:40] elukey: k
[17:01:45] elukey: in any case ALL good
[17:02:01] so we are unblocked with beeline, super
[17:02:21] now we should put an embargo for the a-team to use only beeline :D
[17:07:36] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review, User-Elukey: Enable base::firewall on stat boxes after restricting Spark REPL ports. - https://phabricator.wikimedia.org/T170826 (elukey) @EBernhardson should work now, let me know!
[17:10:07] cd
[18:09:11] * elukey off!
[18:22:07] we also have a beeline wrapper, so maybe we can always do --verbose on it?
[19:13:09] Analytics, ChangeProp, EventBus, Core Platform Team (Security, stability, performance and scalability (TEC1)): Enable controlled debug logging for change-prop - https://phabricator.wikimedia.org/T189621 (Pchelolo) p:Normal→Low
[19:35:08] Analytics-EventLogging, Analytics-Kanban, EventBus, Core Platform Team Backlog (Watching / External), and 3 others: Modern Event Platform: Stream Intake Service: Migrate Mediawiki Eventbus events to eventgate-main - https://phabricator.wikimedia.org/T211248 (Ottomata)
[19:47:23] Analytics, Analytics-Cluster, Analytics-Kanban: Beeline does not print full stack traces when a query fails - https://phabricator.wikimedia.org/T136858 (Milimetric) Resolved→Open I tested too and it seems a lot better. There's one issue and a minor nit: * beeline fails for me on stat1004 (w...
[20:06:23] (CR) Milimetric: "I just saw this. WHAT?!!! How did I manage to type the super bizarre character ﬁ instead of fi?!!" [analytics/refinery] - https://gerrit.wikimedia.org/r/519719 (owner: Joal)
[20:29:27] Analytics, Analytics-Data-Quality, Analytics-Kanban, Product-Analytics: page_creation_timestamp not always correct in mediawiki_history - https://phabricator.wikimedia.org/T214490 (Milimetric) done
[21:20:50] Analytics, EventBus, Core Platform Team Backlog (Later), Patch-For-Review, and 2 others: EventBus should make better use of DI - https://phabricator.wikimedia.org/T204295 (Pchelolo) Open→Declined After looking at it for a while, it feels like we're ok for now.
[21:35:22] hm, I now realize I can't start hive or beeline on stat1004, something's up
[21:35:33] hmmm
[21:38:52] it looks like the hive script has been modified and a line is commented out
[21:42:30] milimetric: try now
[21:48:15] Analytics, EventBus, Core Platform Team (Modern Event Platform (TEC2)), Wikimedia-production-error: EventBus rejecting events because of malformed characters in the comment - https://phabricator.wikimedia.org/T184698 (Pchelolo)
[21:50:34] (CR) Nuria: Use JsonParser to parse event data rather than YAMLParser (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/521552 (https://phabricator.wikimedia.org/T227484) (owner: Ottomata)
[22:23:25] Analytics, Analytics-Cluster, Analytics-Kanban, User-Elukey: Enable base::firewall on stat boxes after restricting Spark REPL ports. - https://phabricator.wikimedia.org/T170826 (EBernhardson) The UI now works, unfortunately broadcasts appear to be broken in spark. Repro with the following, compar...
[23:33:28] Analytics-Kanban, Product-Analytics: Make aggregate data on editors per country per wiki publicly available - https://phabricator.wikimedia.org/T131280 (Nuria) >The bucket size requested is 10 for editing data. This might be counter intuitive but the bucket size of the country has little to do with the p...