[00:25:05] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Pipeline - Performance improvements - https://phabricator.wikimedia.org/T422928#11837312 (Ottomata) Hm, well, it isn't really better, but it isn't worse? I'm going to just reduce batch size to 2 just to try i...
[03:08:02] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Pipeline - Performance improvements - https://phabricator.wikimedia.org/T422928#11837363 (Ottomata) Ah, I have been posting my recent updates on the wrong ticket! I meant to post them on {T421216}. I'll move...
[03:12:12] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Enrichment - Tuning & Backfilling configuration - https://phabricator.wikimedia.org/T421216#11837367 (Ottomata) I have been accidentally posting my recent updates on {T422928} instead of this ticket. I will r...
[03:13:12] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Enrichment - Tuning & Backfilling configuration - https://phabricator.wikimedia.org/T421216#11837374 (Ottomata) It is hard to tell but I don't think I see that much improvement over sync vs async batch_size=2....
[03:16:26] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Enrichment - Tuning & Backfilling configuration - https://phabricator.wikimedia.org/T421216#11837388 (Ottomata) I haven't been paying attention to production since I tried JEMALLOC, but either JEMALLOC or its...
[03:26:34] Data-Engineering, Data Pipelines: Pageviews API returning 404 for 2026-04-17 onward - https://phabricator.wikimedia.org/T423818#11837390 (Lovepeacejoy404) Pageviews hasn't worked for 3 days. It's no longer possible to count visits.
[05:34:10] Data-Engineering, Machine-Learning-Team, serviceops-deprecated: Enable ChangeProp to consume mediawiki.page_content_change.v1 - https://phabricator.wikimedia.org/T409469#11837512 (isarantopoulos) Declined→Resolved
[07:44:42] Data-Engineering, Data-Engineering-Radar, DBA, Schema-change-in-production: Drop il_to column from imagelinks table in wmf production - https://phabricator.wikimedia.org/T419635#11837741 (Marostegui) s6 codfw master has been switched so db2229 can now receive the schema change as a replica (T423837)
[08:00:35] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Pipeline - Performance improvements - https://phabricator.wikimedia.org/T422928#11837776 (dcausse) >>! In T422928#11833564, @Ottomata wrote: > Thanks @jmeybohm and @dcausse. I'd like to try this! > > So, IIU...
[08:13:30] Data-Engineering, Event-Platform: mediawiki.page_change.v1 event - Add revision is revert field - https://phabricator.wikimedia.org/T423583#11837827 (JAllemandou) If feasible, there are other revert related information that would be useful: * Which revision is reverted? * The timestamp-diff between cur...
[08:23:39] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Pipeline - Performance improvements - https://phabricator.wikimedia.org/T422928#11837860 (JMeybohm) >>! In T422928#11833564, @Ottomata wrote: > (QQ: which way is upstream and downstream? I would expect that re...
[08:48:13] Data-Engineering, Data-Platform-SRE (2026-03-27 - 2026-04-17), Event-Platform, good first task, Patch-For-Review: Flink base image should not install into system python environment - https://phabricator.wikimedia.org/T418525#11837968 (dcausse) >>! In T418525#11833996, @atsuko wrote: > Als...
[08:49:17] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to analytics_privatedata_users and SQL Lab for AnnieKim_WMDE - https://phabricator.wikimedia.org/T420500#11837972 (AnnieKim_WMDE) Uploaded my ssh public key, waiting to be added to groups.
[08:50:29] !log Test Kitchen edge-unique experiments (poll 135333) - adds: mobile-page-previews; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[08:50:31] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:27:55] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Enrichment - Tuning & Backfilling configuration - https://phabricator.wikimedia.org/T421216#11838701 (Ottomata) ===== Status update 2026-04-20 - Reducing checkpoint interval makes the pipeline stable(ish). -...
[12:30:03] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Pipeline - Performance improvements - https://phabricator.wikimedia.org/T422928#11838706 (Ottomata)
[12:48:19] Data-Engineering, Machine-Learning-Team, serviceops-deprecated: Enable ChangeProp to consume mediawiki.page_content_change.v1 - https://phabricator.wikimedia.org/T409469#11838726 (isarantopoulos) Resolved→Declined Reverting as it was accidentally resolved
[13:03:48] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Perform a one-time clean up of retained data sets in event_sanitize - https://phabricator.wikimedia.org/T417694#11838808 (EBernhardson) I don't believe I've used the event_santized tables either. We do use some of the data beyond 90 day...
[13:26:39] Data-Engineering: Load Google Search Console data into the Data Lake - https://phabricator.wikimedia.org/T420996#11838930 (JerryWang-WMF) p:Triage→High @Ahoelzl The data grow faster and BigQuery cost is high. Let's bump up the priority for a pipeline to ingest daily data from BigQuery to our own dat...
[14:22:40] Data-Engineering, Beta-Cluster-Infrastructure, MW-Interfaces-Team, WMF-JobQueue, Event-Platform: Jobs are not being processed in beta, April 2026 edition - https://phabricator.wikimedia.org/T423615#11839257 (Daimona) FTR, I checked the archived logs to see about when this broke. The relevant...
[14:23:33] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to analytics_privatedata_users and SQL Lab for AnnieKim_WMDE - https://phabricator.wikimedia.org/T420500#11839263 (Scott_French)
[14:26:00] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to analytics_privatedata_users and SQL Lab for AnnieKim_WMDE - https://phabricator.wikimedia.org/T420500#11839267 (Scott_French) @AnnieKim_WMDE - Please see https://wikitech.wikimedia.org/wiki/SRE/Production_access#Access_Request_Process f...
[14:30:14] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Backfill datasets affected by Nov 2025 automated traffic incident - https://phabricator.wikimedia.org/T421735#11839304 (mforns)
[14:34:23] Data-Engineering: Kindly requesting Kerberos password reset - https://phabricator.wikimedia.org/T423875#11839340 (Aklapper) Please see https://wikitech.wikimedia.org/wiki/Data_Platform/Systems/Kerberos#Get_a_password_for_Kerberos and add appropriate tags so tasks can be found - thanks!
[14:50:22] Data-Engineering, SRE-Access-Requests: Kindly requesting Kerberos password reset - https://phabricator.wikimedia.org/T423875#11839477 (ssingh)
[14:58:11] Data-Engineering, SRE-Access-Requests: Kindly requesting Kerberos password reset - https://phabricator.wikimedia.org/T423875#11839516 (ssingh) ` sukhe@krb1002:~$ sudo manage_principals.py reset-password mfischerwmf --email_address=mfischer@wikimedia.org Password reset successfully. Successfully sent emai...
[14:58:27] Data-Engineering, SRE-Access-Requests: Kindly requesting Kerberos password reset - https://phabricator.wikimedia.org/T423875#11839517 (ssingh) Open→Resolved
[15:34:52] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Enrichment - Tuning & Backfilling configuration - https://phabricator.wikimedia.org/T421216#11839805 (Ottomata) @dcausse pointed out that we are using mw-api-int-async, which has a timeout of 120s, not 60s! U...
[15:49:33] Data-Engineering, Image-Suggestions, Discovery-Search (2026.04.06 - 2026.05.01): ALIS data pipeline produced too many suggestions - https://phabricator.wikimedia.org/T423238#11839920 (pfischer)
[16:13:09] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Pipeline - Performance improvements - https://phabricator.wikimedia.org/T422928#11840141 (Ottomata)
[16:18:09] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Enrichment - Tuning & Backfilling configuration - https://phabricator.wikimedia.org/T421216#11840174 (Ottomata) Just applied our current settings to prod and deployed without any overrides: ` # production hel...
[16:53:17] Data-Engineering, Event-Platform: Streaming HTML & Edit Types - productionization checklist - https://phabricator.wikimedia.org/T423920 (Ottomata) NEW
[16:53:51] Data-Engineering, Event-Platform: Streaming HTML & Edit Types - productionization checklist - https://phabricator.wikimedia.org/T423920#11840429 (Ottomata)
[16:54:21] Data-Engineering, Event-Platform: Streaming HTML & Edit Types - productionization checklist - https://phabricator.wikimedia.org/T423920#11840430 (Ottomata)
[16:54:24] Data-Engineering (Q4 FS25/26 April 1st - June 30st), OKR-Work (WE1 FY2025-26): WE1.5.3 Productize Data for Monthly Active Moderator Actions - https://phabricator.wikimedia.org/T410940#11840431 (Ottomata)
[16:54:34] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Research, Event-Platform, Patch-For-Review: Event stream with latest revision HTML & parent revision HTML diff - https://phabricator.wikimedia.org/T360794#11840432 (Ottomata)
[16:56:09] Data-Engineering, Event-Platform: Streaming HTML & Edit Types - productionization checklist - https://phabricator.wikimedia.org/T423920#11840447 (Ottomata)
[17:13:20] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Traffic, Data-Platform-SRE (2026-03-27 - 2026-04-17), Patch-For-Review: Surge in webrequest validation check - https://phabricator.wikimedia.org/T422030#11840498 (Ahoelzl) a:JAllemandou
[18:13:27] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Commons, Data-Persistence, Epic, and 2 others: FY2025-26 WE 6.4.1: Move links tables of commons to a dedicated cluster - https://phabricator.wikimedia.org/T398709#11840848 (Ladsgroup)
[18:18:42] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Weekly core contributor metrics - MediaWiki event data source improvements for incremental MWH - https://phabricator.wikimedia.org/T423935 (Ottomata) NEW
[18:19:43] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Weekly core contributor metrics - MediaWiki event data source improvements for incremental MWH - https://phabricator.wikimedia.org/T423935#11840910 (Ottomata)
[18:19:52] Data-Engineering, tech-decision-forum, Event-Platform: MediaWiki Event Carried State Transfer - Problem Statement - https://phabricator.wikimedia.org/T291120#11840911 (Ottomata)
[18:20:29] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Weekly core contributor metrics - MediaWiki event data source improvements for incremental MWH - https://phabricator.wikimedia.org/T423935#11840913 (Ottomata)
[18:28:13] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: HTML Enrichment - Tuning & Backfilling configuration - https://phabricator.wikimedia.org/T421216#11840987 (Ottomata) And for staging, here are the current --set overrides in case we need to redeploy it before we re...
[19:21:05] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Perform a one-time clean up of retained data sets in event_sanitize - https://phabricator.wikimedia.org/T417694#11841263 (xcollazo) >>! In T417694#11809614, @xcollazo wrote: >>>! In T417694#11809583, @SNowick_WMF wrote: >> Confirming we...
[19:29:13] Data-Engineering (Q4 FS25/26 April 1st - June 30st), AQS2.0: Consider updating our heuristics for media type classification in AQS / wikistats - https://phabricator.wikimedia.org/T419882#11841315 (Snwachukwu) > Is there another I'm missing? 4. Get more accurate **media_classification** from the file...
[20:03:19] Data-Engineering (Q4 FS25/26 April 1st - June 30st), AQS2.0: Introduce a new AQS endpoint to expose video plays - https://phabricator.wikimedia.org/T415202#11841387 (Snwachukwu) Please see https://doc.wikimedia.org/generated-data-platform/aqs/analytics-api/reference/media-files.html#video-plays for new v...
[20:36:49] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Create mediawiki.user_change event stream - https://phabricator.wikimedia.org/T423952 (Ottomata) NEW
[20:37:02] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Create mediawiki.user_change event stream - https://phabricator.wikimedia.org/T423952#11841514 (Ottomata)
[20:37:12] Data-Engineering, tech-decision-forum, Event-Platform: MediaWiki Event Carried State Transfer - Problem Statement - https://phabricator.wikimedia.org/T291120#11841515 (Ottomata)
[20:41:58] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Add new API rate limiting fields from webrequest_logs to Turnilo view - https://phabricator.wikimedia.org/T419736#11841525 (HCoplin-WMF) Hey there 👋 Just following up on this again. I specifically needed this today, so hoping we can get the data into Turni...
[20:46:50] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Wikidata, Wikidata-Query-Service: Add a --output-dir argument to wikibase rdf and json dumps - https://phabricator.wikimedia.org/T401296#11841527 (xcollazo) Changes look good!
[20:55:10] Data-Engineering, Data Pipelines, Pageviews-API: Pageviews API returning 404 for 2026-04-17 onward - https://phabricator.wikimedia.org/T423818#11841539 (MusikAnimal)
[21:04:04] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Cassandra loading DAGs are failing, investigate and fix - https://phabricator.wikimedia.org/T423955 (amastilovic) NEW
[21:09:38] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Cassandra loading DAGs are failing, investigate and fix - https://phabricator.wikimedia.org/T423955#11841617 (amastilovic) After consultation with @Eevans we found out that the three hosts we're using (`aqs1010-a.eqiad.wmnet, aqs1011-a.eqiad.wmnet, aqs1012...
[22:19:12] (CR) Xcollazo: querypage: MostCategories: Include all content namespaces (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1267966 (https://phabricator.wikimedia.org/T413362) (owner: Zabe)
[22:25:24] (CR) Xcollazo: querypage: Add UnusedCategories.hql (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/1268292 (https://phabricator.wikimedia.org/T422448) (owner: Zabe)
[22:29:27] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Perform a one-time clean up of retained data sets in event_sanitize - https://phabricator.wikimedia.org/T417694#11841846 (xcollazo) >>! In T417694#11841263, @xcollazo wrote: >>>! In T417694#11809614, @xcollazo wrote: >>>>! In T417694#11...
[22:31:08] Data-Engineering (Q4 FS25/26 April 1st - June 30st), AQS2.0: Introduce a new AQS endpoint to expose video plays - https://phabricator.wikimedia.org/T415202#11841850 (Ladsgroup) Thank you! let me play with it and come back to you
[23:46:41] Data-Engineering, Data Pipelines, Pageviews-API: Pageviews API returning 404 for 2026-04-17 onward - https://phabricator.wikimedia.org/T423818#11842099 (mahmoud) Getting reports for https://top.hatnote.com/ as well. Subbed, looking forward to a fix!