[06:38:37] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform-SRE, SRE, Patch-For-Review: Streamline Data Platform access approvals for WMF staff - https://phabricator.wikimedia.org/T370424#10261667 (MoritzMuehlenhoff) >>! In T370424#10051517, @BTullis wrote: > This sounds sensible to me,...
[06:40:38] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform-SRE, SRE, Patch-For-Review: Streamline Data Platform access approvals for WMF staff - https://phabricator.wikimedia.org/T370424#10261668 (MoritzMuehlenhoff) What's the rationale for treating non-staff different? Is it intention...
[08:12:13] btullis, brouberol: FYI, I've now uploaded a build of OpenJDK 8 for Bookworm to the component/jdk8 component for Bookworm
[08:12:29] thank you!
[08:12:45] I can see it https://apt.wikimedia.org/wikimedia/dists/bookworm-wikimedia/component/jdk8/
[12:12:36] Data-Engineering, CirrusSearch, Structured Data Engineering, Structured-Data-Backlog, and 2 others: Migrate image recommendation to use page_weighted_tags_changed stream - https://phabricator.wikimedia.org/T372912#10262265 (pfischer) @Ottomata, I was able to successfully produce schema-validated...
[13:06:01] Data-Engineering (Q2 2024 October 1st - December 31th), Data-Platform-SRE, SRE, Patch-For-Review: Streamline Data Platform access approvals for WMF staff - https://phabricator.wikimedia.org/T370424#10262307 (Ottomata) > Mid-term the approval management will move to Bitu/idm.wikimedia.org COOL! >...
[13:14:40] Data-Engineering, Metrics Platform, Event-Platform: Document instructions for deleting an event stream and its usages - https://phabricator.wikimedia.org/T360210#10262324 (Ottomata) This would have been helpful to have done for {T368678}. A premature schema deletion caused ops some week burden. cc @apas...
[13:15:24] Analytics, Data-Engineering, Event-Platform, Wikimedia-Performance-recommendation: Avoid extra HTTPS connections for most Event Platform beacons - https://phabricator.wikimedia.org/T263049#10262332 (Ottomata)
[13:16:32] Data-Engineering, Event-Platform: [session length] Change domain of event collection to avoid ad-blocker issue - https://phabricator.wikimedia.org/T280256#10262330 (Ottomata) →Duplicate dup:T263049
[13:17:48] Data-Engineering, MediaWiki-extensions-WikimediaEvents, Observability-Metrics, Event-Platform, and 3 others: Add Prometheus support to statsd.js via mw.track() - https://phabricator.wikimedia.org/T355837#10262337 (Ottomata) Relevant for our proposal: {T263049}
[13:17:50] Analytics, Data-Engineering, Event-Platform: EventGate throttling and DOS prevention - https://phabricator.wikimedia.org/T256891#10262342 (Ottomata) Looks like this can be handled by service mesh stuff now: https://wikitech.wikimedia.org/wiki/Ratelimit#Enable/opt_in_to_rate_limiting We should do th...
[13:21:06] Analytics, Data-Engineering, Event-Platform: EventGate and EventStreams rate limiting - https://phabricator.wikimedia.org/T256891#10262346 (Ottomata)
[13:21:24] Data-Engineering, cloud-services-team, EventStreams, stewardbots, and 3 others: Frequent `429 Client Error: Too Many Requests for url: https://stream.wikimedia.org/v2/stream/recentchange` errors in SULWatcher - https://phabricator.wikimedia.org/T329327#10262357 (Ottomata) Open→Declined...
[13:22:27] Data-Engineering, EventStreams, Pywikibot, Event-Platform: Error 429: too many requests for stream.wikimedia.org - https://phabricator.wikimedia.org/T308931#10262353 (Ottomata) Open→Declined Closing. Solution to be implemented in {T256891}
[13:23:56] Analytics, Data-Engineering, Event-Platform: EventGate and EventStreams rate limiting - https://phabricator.wikimedia.org/T256891#10262372 (Ottomata)
[13:27:27] Data-Engineering, serviceops, Event-Platform: Traffic for eventstreams-internal seems to be zero for the past months - https://phabricator.wikimedia.org/T348763#10262406 (Ottomata) Declined→Open
[13:28:53] Data-Engineering, serviceops, Event-Platform: Traffic for eventstreams-internal seems to be zero for the past months - https://phabricator.wikimedia.org/T348763#10262402 (Ottomata) Open→Declined @BTullis do you think it would be possible to add authentication and a public domain to this servi...
[13:33:17] Analytics-Radar, Data-Engineering, Event-Platform: Introduce EventBusSendUpdate - https://phabricator.wikimedia.org/T292123#10262464 (Ottomata) Open→Declined
[13:37:32] Analytics, Data-Engineering, Event-Platform, Release-Engineering-Team (Radar): Stop using puppet + git pull for auto deployment of schema repos - https://phabricator.wikimedia.org/T274901#10262481 (Ottomata) →Duplicate dup:T347421
[13:38:44] Data-Engineering, Data-Platform-SRE, Event-Platform: [NEEDS GROOMING] schema services should be moved to k8s - https://phabricator.wikimedia.org/T347421#10262483 (Ottomata)
[13:38:44] Data-Engineering, Event-Platform: Add schema diffing support to jsonschema-tools and run diff in CI - https://phabricator.wikimedia.org/T321850#10262494 (Ottomata) MR in jsonschema-tools: https://gitlab.wikimedia.org/repos/data-engineering/jsonschema-tools/-/merge_requests/51
[13:45:41] Data-Engineering, serviceops, Event-Platform: Make eventstreams-internal available to WMF staff without an ssh tunnel - https://phabricator.wikimedia.org/T348763#10262504 (Ottomata)
[13:46:10] Hi team!
[13:46:10] I'm a student at the University of Waterloo working on a data science project on Wikipedia metadata. Specifically, we'd like to query the analytics API for view counts and edit counts to visualize trends in Wikipedia metadata (e.g., finding the most vandalized or most influential pages of the year).
[13:46:11] We wrote a Python script to do this in batches of 1000 articles, but it seems we get rate limited quickly (we hit a 429 response). Can you help point us towards the right team to find a better way to get this data?
[13:46:11] Thank you,
[13:46:12] Matt
[13:47:32] Data-Engineering, CirrusSearch, Structured Data Engineering, Structured-Data-Backlog, and 2 others: Migrate image recommendation to use page_weighted_tags_changed stream - https://phabricator.wikimedia.org/T372912#10262513 (Ottomata) ACK, I see comment from David in CR.
[13:49:21] Pliny: hello! (fellow Waterloo grad here :)) if you don't hear back on this channel, I very much recommend filing a task on phabricator to discuss this since that's the best place to get a response
[13:49:34] https://phabricator.wikimedia.org
[13:49:48] you can also try emailing noc@wikimedia.org and I can forward your response to the right people internally
[13:50:09] wow!! small world! (this is our capstone project haha)
[13:50:49] ah ha! yeah, I graduated in 2014; Ian G's lab, CrySP
[13:51:06] so yeah, please file a task and/or email to noc@ and I will make sure it gets to the right person
[13:51:23] on it!!
[13:52:52] Data-Engineering, serviceops, Event-Platform: Make eventstreams-internal available to WMF staff without an ssh tunnel - https://phabricator.wikimedia.org/T348763#10262533 (BTullis) >>! In T348763#10262402, @Ottomata wrote: > @BTullis do you think it would be possible to add authentication and a publi...
[14:02:49] I made an account and a task here: https://phabricator.wikimedia.org/T378184
[14:02:57] thank you for your help
[14:03:13] thanks, I will add the right team
[14:23:03] Pliny: Apologies also for the delay. I can add some details to the ticket too, but my colleagues have pointed me to some resources that may help you.
[14:23:42] Firstly, there is this dump that you may be able to use: https://dumps.wikimedia.org/other/pageview_complete/readme.html
[14:24:49] In addition, we have this breakdown of pageviews per-country, using differential privacy: https://meta.wikimedia.org/wiki/Differential_privacy/Completed/Country-project-page
[14:38:11] Data-Engineering, serviceops, Event-Platform: Make eventstreams-internal available to WMF staff without an ssh tunnel - https://phabricator.wikimedia.org/T348763#10262664 (CDanis) >>! In T348763#10262531, @BTullis wrote: > * add an authenticating reverse proxy using [[https://gateway.envoyproxy.io/la...
[14:47:22] Data-Engineering, Event-Platform, MW-1.43-notes (1.43.0-wmf.27; 2024-10-15), Patch-For-Review, and 2 others: Delete redundant mobile- and desktopwebuiactions event in WikimediaEvents - https://phabricator.wikimedia.org/T376065#10262695 (ovasileva) Open→Resolved
[14:56:18] (PS1) GOlson: Added ios event sanitization [analytics/refinery] - https://gerrit.wikimedia.org/r/1083189
[15:30:12] (PS1) GOlson: Add items for event sanitization regarding iOS edit [analytics/refinery] - https://gerrit.wikimedia.org/r/1083192 (https://phabricator.wikimedia.org/T377259)
[15:30:44] (Abandoned) GOlson: Added ios event sanitization [analytics/refinery] - https://gerrit.wikimedia.org/r/1083189 (owner: GOlson)
[15:47:49] Data-Engineering, Research: Rate Limited for data science project - https://phabricator.wikimedia.org/T378184#10262876 (BTullis) Hi @Pliny2024 - I have tagged the #data-engineering and #research teams, who may be able to help you with this. I suspect that modifying the rate limits directly is unlikely to...
[15:55:54] Data-Engineering, serviceops, Event-Platform: Make eventstreams-internal available to WMF staff without an ssh tunnel - https://phabricator.wikimedia.org/T348763#10262910 (BTullis) >>! In T348763#10262664, @CDanis wrote: > But we're using [[ https://github.com/oauth2-proxy/oauth2-proxy | oauth2-proxy...
[16:01:41] Data-Engineering, MediaWiki-extensions-WikimediaEvents, Observability-Metrics, Event-Platform, and 3 others: Add Prometheus support to statsd.js via mw.track() - https://phabricator.wikimedia.org/T355837#10262922 (Ottomata) Other use cases: - {T319329} - {T327246}
[16:05:06] Analytics-Radar, Data-Engineering-Icebox, MediaWiki-General: Proposal: drop kafka-php dependency from MediaWiki - https://phabricator.wikimedia.org/T265966#10262934 (Ottomata) Open→Resolved a:Ottomata From [[ https://wikimedia.slack.com/archives/C05D7UDJZ5E/p1729867579618539 | Slack ]...
[17:06:34] (CR) Tsevener: [C:-1] Add items for event sanitization regarding iOS edit (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1083192 (https://phabricator.wikimedia.org/T377259) (owner: GOlson)
[17:34:12] (CR) Tsevener: [C:-1] Add items for event sanitization regarding iOS edit (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1083192 (https://phabricator.wikimedia.org/T377259) (owner: GOlson)
[17:34:28] (CR) Shay Nowick: Add items for event sanitization regarding iOS edit (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1083192 (https://phabricator.wikimedia.org/T377259) (owner: GOlson)
[18:25:45] (PS1) GOlson: Updated variable name [analytics/refinery] - https://gerrit.wikimedia.org/r/1083256 (https://phabricator.wikimedia.org/T376320)
[18:31:35] (Abandoned) GOlson: Updated variable name [analytics/refinery] - https://gerrit.wikimedia.org/r/1083256 (https://phabricator.wikimedia.org/T376320) (owner: GOlson)
[18:34:04] (PS2) GOlson: Add items for event sanitization regarding iOS edit [analytics/refinery] - https://gerrit.wikimedia.org/r/1083192 (https://phabricator.wikimedia.org/T377259)
[18:37:47] (PS3) GOlson: Add items for event sanitization regarding iOS edit [analytics/refinery] - https://gerrit.wikimedia.org/r/1083192 (https://phabricator.wikimedia.org/T377259)
[18:46:39] (CR) Tsevener: Add items for event sanitization regarding iOS edit (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1083192 (https://phabricator.wikimedia.org/T377259) (owner: GOlson)
[18:46:50] (CR) Tsevener: [C:+1] Add items for event sanitization regarding iOS edit [analytics/refinery] - https://gerrit.wikimedia.org/r/1083192 (https://phabricator.wikimedia.org/T377259) (owner: GOlson)