[01:39:18] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Research, Event-Platform, Patch-For-Review: Implement stream of HTML content on mw.page_change event - https://phabricator.wikimedia.org/T360794#11634982 (Ottomata) > if we use difflib, we'd want to make sure 1) there's a standardized+simple...
[04:59:41] Data-Engineering (Q3 FY25/26 January 1st - March 31th), OKR-Work, Patch-For-Review: SDS 2.2.6 Improve experiment event data data lake management - https://phabricator.wikimedia.org/T414105#11635055 (AKhatun_WMF) Backfill complete. We now have data for ~20 days. Number of files ` akhatun@stat1008:~$...
[08:32:28] Data-Engineering, Wikidata, Wikidata Analytics, Data-Platform-SRE (2026-02-13 - 2026-03-06): WMDE test Airflow instance can't egress out to external URLs - https://phabricator.wikimedia.org/T417630#11635325 (Gehel)
[08:33:01] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE (2026-02-13 - 2026-03-06): Requesting Kerberos access for hmonroy - https://phabricator.wikimedia.org/T416729#11635327 (Gehel)
[08:33:50] Analytics, Data-Engineering, Data-Engineering-Radar, SRE, and 2 others: Discovery for Kafka cluster brokers - https://phabricator.wikimedia.org/T213561#11635329 (Gehel)
[09:15:17] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data Pipelines, Wikidata, Wikidata Analytics: NEW FEATURE REQUEST: sqoop (all) user properties from mariadb to wmf_raw.mediawiki_user_properties - https://phabricator.wikimedia.org/T323456#11635411 (AndrewTavis_WMDE) Thank you all so much fo...
[10:18:15] Data-Engineering, Machine-Learning-Team, Event-Platform, Patch-For-Review: Add Multilingual RevertRisk predictions to mediawiki.page_revert_risk_prediction_change - https://phabricator.wikimedia.org/T415892#11635536 (achou) > How does this sound folks? @gkyziridis Sounds good to me! :) > EVENTGA...
[10:52:07] Data-Engineering, Movement-Insights: Investigate and repair pageviews and unique devices spike starting in Nov 2025 - https://phabricator.wikimedia.org/T416933#11635583 (GGoncalves-WMF) We have [[ https://app.asana.com/1/3758245663860/project/1210331828081618/task/1213320344653123 | Legal approval ]] to...
[11:04:00] !log Test Kitchen edge-unique experiments (poll 157950) - adds: none; removes: mobile-toc-abc; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[11:04:02] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:03:49] Data-Engineering, WMDE-TechWish-Sprint-2026-02-17-Beautiful-Beetroots: Run Spark Connect server in Analytics cluster - https://phabricator.wikimedia.org/T417997#11636696 (awight) Open→Declined After discussion with @xcollazo , I'll take a simpler path and write files to a temporary filesystem...
[16:13:23] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review: Adapt Sqoop for imagelinks schema changes - https://phabricator.wikimedia.org/T416481#11636745 (xcollazo) >>! In T416481#11636241, @Snwachukwu wrote: > @xcollazo Sounds good. Are there particular metrics you’d like us to look out f...
[16:26:48] Data-Engineering, Data-Engineering-Radar, Commons, Data-Persistence, and 5 others: Migrate file tables to a modern layout (image/oldimage; file/filerevision; add primary keys) - https://phabricator.wikimedia.org/T28741#11636783 (Zabe)
[17:26:22] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data Pipelines, Wikidata, Wikidata Analytics: NEW FEATURE REQUEST: sqoop (all) user properties from mariadb to wmf_raw.mediawiki_user_properties - https://phabricator.wikimedia.org/T323456#11637044 (Snwachukwu) > using the prefupdate data an...
[19:10:22] Data-Engineering, Dumps-Generation: Data missing from en.wiktionary.org February 2026 "MediaWiki Content File Exports" compared to "XML Database dump" - https://phabricator.wikimedia.org/T417596#11637275 (xcollazo) @JeffDoozan thank you for the detailed, reproducible bug report! ------- @APizzata-WMF:...
[19:11:47] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Airflow Refine: re-run of failed mapped tasks with changed stream config can cause silent data loss or data duplication - https://phabricator.wikimedia.org/T418021 (Ahoelzl) NEW
[19:19:03] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Airflow Refine: re-run of failed mapped tasks with changed stream config can cause silent data loss or data duplication - https://phabricator.wikimedia.org/T418021#11637327 (Ahoelzl) **Short-Term Mitigations** - Avoid rerunning entire DAG runs when c...
[19:33:03] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Airflow Refine: re-run of failed mapped tasks with changed stream config can cause silent data loss or data duplication - https://phabricator.wikimedia.org/T418021#11637367 (Ahoelzl) **Potential technical solution** Separate job that: - Reads stream co...
[20:09:29] Data-Engineering (Q3 FY25/26 January 1st - March 31th), DPE-Mediawiki-Content: Inconsistent page title styles in Mediawiki content current v1 dumps - https://phabricator.wikimedia.org/T410405#11637426 (xcollazo) >>! In T410405#11587747, @CodeReviewBot wrote: > xcollazo **merged** https://gitlab.wikimedia...
[20:30:57] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Movement-Insights (FY25-26 H2): dbt repository structure (Milestone 3) - https://phabricator.wikimedia.org/T416672#11637462 (amastilovic) Great writeup @JMonton-WMF ! A monorepo shared by all teams definitely sounds like the way to go about this. I’...
[20:47:33] Data-Engineering (Q3 FY25/26 January 1st - March 31th), DPE-Mediawiki-Content: Inconsistent page title styles in Mediawiki content current v1 dumps - https://phabricator.wikimedia.org/T410405#11637482 (xcollazo) Plan: 1) Merge https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requ...
[22:31:37] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Movement-Insights (FY25-26 H2): dbt repository structure (Milestone 3) - https://phabricator.wikimedia.org/T416672#11637626 (amastilovic) ===== Some thoughts on model naming guidelines ===== First, some basic facts and rules of working with models...
[22:33:05] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Movement-Insights (FY25-26 H2): dbt repository structure (Milestone 3) - https://phabricator.wikimedia.org/T416672#11637627 (amastilovic) >> dbt doesn't allow to have multiple files with the same name, even if they live in different folders > What h...
[23:24:54] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Weekly delivery cadence of core contributor metrics - https://phabricator.wikimedia.org/T418032 (Ahoelzl) NEW
[23:25:06] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Weekly delivery cadence of core contributor metrics - https://phabricator.wikimedia.org/T418032#11637718 (Ahoelzl) a:Ahoelzl