[00:29:12] Data-Engineering (Q3 FY25/26 January 1st - March 31th): enwiki File Export failed for 2026-03-01 - https://phabricator.wikimedia.org/T419291#11726108 (xcollazo) >>! In T419291#11723404, @xcollazo wrote: > Failed again. There was cluster instability due to T420168 and T415002 so retrying as is. > > https://y...
[00:46:19] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Build a set of configurable pre-scheduled DBT Airflow DAGs executing dbt-jobs models - https://phabricator.wikimedia.org/T419925#11726138 (amastilovic) https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/2087
[00:49:07] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Schedule three new monthly DBT models for Movement Insights - https://phabricator.wikimedia.org/T420069#11726140 (amastilovic) For some reason Phab link to GitLab doesn't seem to be working so here's the related `airflow-dags` change that's been merged...
[08:22:27] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Growth-Team, Image-Suggestions: Section Image Suggestions no longer available? - https://phabricator.wikimedia.org/T420244#11726574 (APizzata-WMF) [[ https://airflow-platform-eng.wikimedia.org/dags/check_bad_parsing/grid?dag_run_id=scheduled__20...
[08:28:09] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team: CentralAuth's localuser table contains many nulls and duplicate mappings - https://phabricator.wikimedia.org/T411116#11726584 (APizzata-WMF) > I can test this easily and come back with...
[08:42:06] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Implement list of JA3N-JA4H pairs to be tagged as automated into the bot detection pipeline - https://phabricator.wikimedia.org/T420412#11726623 (APizzata-WMF) > But we can always look at Iceberg snapshots. You mean we can look at the insert times and s...
[09:18:38] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: PyFlink: Handle messages bigger than max.size - https://phabricator.wikimedia.org/T420448#11726678 (JMonton-WMF) a:JMonton-WMF
[09:31:36] Data-Engineering, MediaWiki-extensions-EventLogging, Test Kitchen, Essential-Work: Remove mw.eventLog.id - https://phabricator.wikimedia.org/T408179#11726710 (phuedx)
[09:47:17] Data-Engineering (Q3 FY25/26 January 1st - March 31th), DPE-Mediawiki-Content: Use wmf.mediawiki_history as baseline for slo completeness - https://phabricator.wikimedia.org/T416312#11726722 (APizzata-WMF) The [[ https://airflow.wikimedia.org/dags/mw_content_history_slo_monthly/grid?search=mw_content_his...
[10:06:19] Data-Engineering, AQS2.0: Review and adapt wikistats / AQS video analytics to the introduction of MPEG-DASH - https://phabricator.wikimedia.org/T419879#11726781 (GGoncalves-WMF) We shouldn't need to capture new data, but we'll need to think about how we measure media playback based on what's already avai...
[10:06:42] Data-Engineering, AQS2.0: Review and adapt wikistats / AQS video analytics to the introduction of MPEG-DASH - https://phabricator.wikimedia.org/T419879#11726785 (GGoncalves-WMF) a:GGoncalves-WMF→Ahoelzl
[10:10:30] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team: CentralAuth's localuser table contains many nulls and duplicate mappings - https://phabricator.wikimedia.org/T411116#11726800 (JAllemandou) I think it's easier to make it happen this way...
[10:20:46] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Event-Platform: Fix PyFlink log levels - https://phabricator.wikimedia.org/T419997#11726812 (JMonton-WMF) We are also seeing real Warning messages like: ` Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connecti...
[10:25:32] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team: CentralAuth's localuser table contains many nulls and duplicate mappings - https://phabricator.wikimedia.org/T411116#11726828 (APizzata-WMF) > I think it's easier to make it happen this...
[10:40:03] Data-Engineering, Data-Engineering-Radar, Data-Persistence, DBA, and 4 others: ICU 72 upgrade: `categorylinks` table swap - https://phabricator.wikimedia.org/T419980#11726880 (Raine)
[10:58:14] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Build a set of configurable pre-scheduled DBT Airflow DAGs executing dbt-jobs models - https://phabricator.wikimedia.org/T419925#11726915 (GGoncalves-WMF) Thanks for the breakdown @amastilovic ! It's helpful to see the pros and cons. In general, I thin...
[11:41:47] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Implement list of JA3N-JA4H pairs to be tagged as automated into the bot detection pipeline - https://phabricator.wikimedia.org/T420412#11727048 (mforns) @APizzata-WMF Yes, you're right! Hm, maybe we don't even need to add it as a dataset in Airflow, si...
[11:57:03] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Implement list of JA3N-JA4H pairs to be tagged as automated into the bot detection pipeline - https://phabricator.wikimedia.org/T420412#11727144 (APizzata-WMF) > So, I guess we can skip table maintenance altogether? agreed :)
[12:18:57] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Growth-Team, Image-Suggestions: Section Image Suggestions no longer available? - https://phabricator.wikimedia.org/T420244#11727184 (Michael) Based on our instruments, the suggestions are back: {F73140547}
[12:53:33] Data-Engineering (Q3 FY25/26 January 1st - March 31th): HDFS usage dashboard is quadruple counting file counts and file sizes - https://phabricator.wikimedia.org/T418780#11727284 (Antoine_Quhen) 👍 Dataset updated. Data looks good. Checked with `hdfs dfs -count ...`
[13:02:28] !log Test Kitchen edge-unique experiments (poll 272827) - adds: none; removes: none; fields: attribution-research-short-baseline-run - xLab/MPIC/TK tips at https://w.wiki/FwuD
[13:02:29] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:34:18] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review: Refactor our existing Airflow dags to use EasyDAG & DagProperties - https://phabricator.wikimedia.org/T336738#11727484 (xcollazo) All usage of `VariableProperties` from within Airflow DAGs have now been migrated to `DagProperties`....
[13:35:36] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data-Platform-SRE (2026-03-06 - 2026-03-27): Analyze SQL queries generating metrics - https://phabricator.wikimedia.org/T420434#11727490 (JAllemandou)
[13:37:26] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review: Refactor our existing Airflow dags to use EasyDAG & DagProperties - https://phabricator.wikimedia.org/T336738#11727504 (xcollazo) Opened {T420582} to tackle that last bit of `VariableProperties` dependency.
[13:53:33] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team: CentralAuth's localuser table contains many nulls and duplicate mappings - https://phabricator.wikimedia.org/T411116#11727594 (JAllemandou) > Regarding the old data? Since we know the du...
[14:08:13] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data-Platform-SRE (2026-03-06 - 2026-03-27): Analyze SQL queries generating metrics - https://phabricator.wikimedia.org/T420434#11727733 (JAllemandou) After some time reading and processing the queries used to generate the metrics asked weekly, here...
[14:08:24] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team: CentralAuth's localuser table contains many nulls and duplicate mappings - https://phabricator.wikimedia.org/T411116#11727734 (APizzata-WMF) > Hm, this would mean partially incomplete da...
[14:11:15] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team: CentralAuth's localuser table contains many nulls and duplicate mappings - https://phabricator.wikimedia.org/T411116#11727752 (JAllemandou) > The file I am talking about is only made up...
[14:20:48] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Implement list of JA3N-JA4H pairs to be tagged as automated into the bot detection pipeline - https://phabricator.wikimedia.org/T420412#11727817 (JAllemandou) > If we went with a Hive table for the bot JA3N-JA4H list, would you prefer it being located i...
[14:30:10] !log Test Kitchen edge-unique experiments (poll 273086) - adds: none; removes: synth-test-new-external-path; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[14:30:11] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:31:58] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data-Platform-SRE (2026-03-06 - 2026-03-27): Analyze SQL queries generating metrics - https://phabricator.wikimedia.org/T420434#11727895 (Ottomata) I wonder if the [[ https://datahub.wikimedia.org/dataset/urn:li:dataset:(urn:li:dataPlatform:hive,wmf...
[15:02:59] Data-Engineering, Dumps-Generation, Wikimedia Enterprise: Include more namespaces in Wiktionary HTML dumps - https://phabricator.wikimedia.org/T303652#11728024 (TranqyPoo)
[15:03:26] (PS3) Snwachukwu: Extend mediarequest Cassandra loads with poster/plays for video-requests API [analytics/refinery] - https://gerrit.wikimedia.org/r/1250005 (https://phabricator.wikimedia.org/T415202)
[15:04:12] (CR) Snwachukwu: Extend mediarequest Cassandra loads with poster/plays for video-requests API (5 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/1250005 (https://phabricator.wikimedia.org/T415202) (owner: Snwachukwu)
[15:37:07] (CR) GoranSMilovanovic: [C:+1] Add BSD 3-Clause License to the repo backdated to first commit year [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/1243749 (owner: Andrew McAllister (WMDE))
[15:39:14] Data-Engineering, Event-Platform: mediawiki.page_change.v1 event - add a page type field - https://phabricator.wikimedia.org/T409462#11728282 (Ottomata)
[15:47:39] (CR) WMDE-leszek: [C:+2] "I think that's everyone consenting. Thank you, let's do it" [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/1243749 (owner: Andrew McAllister (WMDE))
[15:48:05] (Merged) jenkins-bot: Add BSD 3-Clause License to the repo backdated to first commit year [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/1243749 (owner: Andrew McAllister (WMDE))
[15:56:29] !log Test Kitchen edge-unique experiments (poll 1) - adds: attribution-research-short-baseline-run, share-highlight-baseline, logged-out-retention-round3, synth-aa-test-traffic-impact-3, growthexperiments-editattempt-anonwarning, synth-aa-test-traffic-impact-2, mobile-toc-abc2, synth-aa-test-traffic-impact-1; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[15:56:31] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:56:39] !log Test Kitchen mw-user experiment (poll 1) - adds: we-3-3-4-reading-list-test1, growthexperiments-revise-tone, we-3-3-4-reading-list-test1-en; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[15:56:41] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:36:50] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Data-Platform-SRE (2026-03-06 - 2026-03-27): Analyze SQL queries generating metrics - https://phabricator.wikimedia.org/T420434#11728748 (Ahoelzl) Interesting findings on the non-additivity ... this means we'll need to redefine some metrics to have...
[16:47:17] (PS9) Andrew McAllister (WMDE): Split active_user_changes.sql into user/temp account versions and run both [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/1243177 (https://phabricator.wikimedia.org/T416680)
[17:06:32] Data-Engineering (Q3 FY25/26 January 1st - March 31th), AQS2.0, Patch-For-Review: Alter AQS Cassandra tables in support of video plays endpoints - https://phabricator.wikimedia.org/T420008#11728874 (Snwachukwu) Thanks @Eevans. You can go ahead to deploy to staging.
[17:07:38] Data-Engineering (Q3 FY25/26 January 1st - March 31th): Implement list of JA3N-JA4H pairs to be tagged as automated into the bot detection pipeline - https://phabricator.wikimedia.org/T420412#11728888 (APizzata-WMF) >If we go for more "future-proof", I'd use Iceberg, with timestamp fields for insertion, upda...
[17:27:58] Data-Engineering (Q3 FY25/26 January 1st - March 31th), AQS2.0, Patch-For-Review: Alter AQS Cassandra tables in support of video plays endpoints - https://phabricator.wikimedia.org/T420008#11729015 (Eevans) >>! In T420008#11728874, @Snwachukwu wrote: > Thanks @Eevans. You can go ahead to deploy to st...
[17:28:34] Data-Engineering (Q3 FY25/26 January 1st - March 31th), AQS2.0, Patch-For-Review: Alter AQS Cassandra tables in support of video plays endpoints - https://phabricator.wikimedia.org/T420008#11729018 (Eevans) p:Triage→Medium
[18:24:56] Data-Engineering (Q3 FY25/26 January 1st - March 31th): HDFS usage dashboard is quadruple counting file counts and file sizes - https://phabricator.wikimedia.org/T418780#11729309 (xcollazo) Thanks @Antoine_Quhen ! For completeness, a link to the dashboard that was fixed: https://superset.wikimedia.org/super...
[18:32:05] (CR) Mforns: "The grouping sets refactor and the rename look great!" [analytics/refinery] - https://gerrit.wikimedia.org/r/1250005 (https://phabricator.wikimedia.org/T415202) (owner: Snwachukwu)
[19:06:36] Data-Engineering (Q3 FY25/26 January 1st - March 31th), MediaWiki-extensions-CentralAuth, MediaWiki-Platform-Team: CentralAuth's localuser table contains many nulls and duplicate mappings - https://phabricator.wikimedia.org/T411116#11729640 (xcollazo) >>! In T411116#11726800, @JAllemandou wrote: > I...
[19:19:18] Data-Engineering (Q3 FY25/26 January 1st - March 31th), Patch-For-Review: Monthly reconcile continues to emit a really large amount of events after user_id changes - https://phabricator.wikimedia.org/T419055#11729676 (xcollazo) `spark_process_reconciliation_events` ingest for 2026-03-16 failed multiple t...
[19:25:04] Data-Engineering, Test Kitchen: Logged in reader retention logging - https://phabricator.wikimedia.org/T420621#11729687 (Milimetric)
[19:35:48] (PS4) Snwachukwu: Extend mediarequest Cassandra loads with poster/plays for video-requests API [analytics/refinery] - https://gerrit.wikimedia.org/r/1250005 (https://phabricator.wikimedia.org/T415202)
[19:38:17] !log Test Kitchen edge-unique experiments (poll 617) - adds: synth-test-new-external-path; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[19:38:18] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:10:01] !log Test Kitchen mw-user experiment (poll 890) - adds: temp-accounts-enrollment-test; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[21:10:03] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log