[01:00:00] FIRING: [2x] MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[01:00:00] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[01:00:06] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[01:05:00] RESOLVED: [2x] MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[01:05:00] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[01:05:00] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[01:36:41] FIRING: MediawikiPageHtmlContentChangeEnrichHighKafkaConsumerLag: ...
[01:36:41] High Kafka consumer lag for mw_page_html_content_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Enrichment#Alerting - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-content-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_content_change_enrich - ...
[01:36:44] https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlContentChangeEnrichHighKafkaConsumerLag
[01:37:00] FIRING: MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[01:37:00] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[01:37:00] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[01:41:41] RESOLVED: MediawikiPageHtmlContentChangeEnrichHighKafkaConsumerLag: ...
[01:41:41] High Kafka consumer lag for mw_page_html_content_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Enrichment#Alerting - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-content-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_content_change_enrich - ...
[01:41:44] https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlContentChangeEnrichHighKafkaConsumerLag
[01:42:07] Data-Engineering, Datasets-General-or-Unknown: Missing DP files 2022-10-19.tsv, 2020-07-19.tsv, 2020-07-20.tsv - https://phabricator.wikimedia.org/T381057#11934131 (nshahquinn-wmf) @Effeietsanders did you find that the 3 datasets mentioned in descriptions are the only ones missing, or are they just examp...
[01:47:00] RESOLVED: MediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag: ...
[01:47:00] High Kafka consumer lag for mw_page_html_feature_counts_change_enrich in eqiad - https://wikitech.wikimedia.org/wiki/MediaWiki_Event_Enrichment/HTML_Feature_Counts_Enrichment#Alerting - ...
[01:47:00] https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-page-html-feature-counts-change-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_page_html_feature_counts_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageHtmlFeatureCountsChangeEnrichHighKafkaConsumerLag
[01:47:59] Analytics-Data-Problem: Netherlands (NL) absent from country_project_page flat files since 2023-11-09 - https://phabricator.wikimedia.org/T426559#11934133 (nshahquinn-wmf) >>! In T426559#11932608, @Effeietsanders wrote: > (initially posted on the wrong ticket) > It looks like NL has silently reappeared start...
[01:48:20] Analytics-Data-Problem, Data-Engineering: Netherlands (NL) absent from country_project_page flat files since 2023-11-09 - https://phabricator.wikimedia.org/T426559#11934134 (nshahquinn-wmf)
[01:57:02] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Engineering-Radar, Product-Analytics: Creating a Spark session causes a torrent of log spam - https://phabricator.wikimedia.org/T315024#11934136 (nshahquinn-wmf) For the record: @xcollazo fixed this by updating Wmfdata's Spark module to au...
[02:00:20] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Load Google Search Console data into the Data Lake - https://phabricator.wikimedia.org/T420996#11934139 (nshahquinn-wmf) Just noting here that @Ahoelzl confirmed in a Slack conversation that the destination Data Lake table will have the complete dataset (a...
[02:30:35] Data-Engineering, Data-Engineering-Wikistats, Pageviews-Anomaly: Sudden traffic increase on 1 November 2025 - https://phabricator.wikimedia.org/T412655#11934147 (nshahquinn-wmf) For the record: the data correction applying the new bot detection rule to December–March data has finished (T421735), and...
[06:27:59] (CR) KCVelaga: "I tested these changes, all looks good." [analytics/refinery] - https://gerrit.wikimedia.org/r/1288444 (https://phabricator.wikimedia.org/T419522) (owner: KCVelaga)
[06:31:47] Data-Engineering, Data-Engineering-Wikistats, Pageviews-Anomaly: Sudden traffic increase on 1 November 2025 - https://phabricator.wikimedia.org/T412655#11934416 (GGoncalves-WMF) Open→Resolved
[06:37:26] Data-Engineering, Data-Engineering-Radar, DBA, Schema-change-in-production: Drop il_to column from imagelinks table in wmf production - https://phabricator.wikimedia.org/T419635#11934434 (FCeratto-WMF)
[07:18:22] Data-Engineering, Data-Engineering-Radar, DBA, Schema-change-in-production: Drop il_to column from imagelinks table in wmf production - https://phabricator.wikimedia.org/T419635#11934609 (FCeratto-WMF)
[07:54:32] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review: Create an MVP data product of API requests - https://phabricator.wikimedia.org/T419522#11934779 (KCVelaga_WMF) Open→In progress
[07:55:01] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review, Product-Analytics (Kanban): Create an MVP data product of API requests - https://phabricator.wikimedia.org/T419522#11934783 (KCVelaga_WMF) p:Triage→High
[08:57:59] Data-Engineering, Data-Platform-SRE: Enable the opt-in use of conda-analytics-next with the DbtSkeinOperator in Airflow - https://phabricator.wikimedia.org/T426728 (BTullis) NEW
[09:10:08] !log Test Kitchen edge-unique experiments (poll 15308) - adds: none; removes: synth-aa-ncs-1; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[09:10:10] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:12:09] !log Test Kitchen edge-unique experiments (poll 15314) - adds: synth-aa-ncs-1; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[09:12:11] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:15:30] !log Test Kitchen edge-unique experiments (poll 15324) - adds: none; removes: synth-aa-ncs-1; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[09:15:32] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:17:11] !log Test Kitchen edge-unique experiments (poll 15329) - adds: synth-aa-ncs-1; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[09:17:13] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:24:03] !log Test Kitchen edge-unique experiments (poll 15528) - adds: share-highlight; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[10:24:04] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:50:25] Data-Engineering, Data-Engineering-Radar, SRE, SRE-Access-Requests: Requesting access to analytics_privatedata_users and SQL Lab for AnnieKim_WMDE - https://phabricator.wikimedia.org/T420500#11935531 (SLyngshede-WMF) @AnnieKim_WMDE Hi, did you get the access you needed?
[12:08:57] Analytics-Data-Problem, Data-Engineering: Netherlands (NL) absent from country_project_page flat files since 2023-11-09 - https://phabricator.wikimedia.org/T426559#11935955 (Effeietsanders) Is the list of countries that we should expect to be filtered out, public? For example, I'm noticing that the data...
[12:40:06] FIRING: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-analytics-external. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=000000026&var-service=eventgate-analytics-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[12:47:33] Data-Engineering, Data-Platform-SRE (2026-04-24 - 2026-05-15): Enable the opt-in use of conda-analytics-next with the DbtSkeinOperator in Airflow - https://phabricator.wikimedia.org/T426728#11936081 (Gehel)
[12:48:02] Data-Engineering, Data-Platform-SRE (2026-04-24 - 2026-05-15): Enable the opt-in use of conda-analytics-next with the DbtSkeinOperator in Airflow - https://phabricator.wikimedia.org/T426728#11936084 (Gehel) DPE SRE is waiting on #data-engineering to implement.
[12:52:07] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Platform-SRE (2026-04-24 - 2026-05-15), Patch-For-Review: Presto cluster improvements for concurrency and workload - https://phabricator.wikimedia.org/T424112#11936111 (Gehel)
[12:55:06] RESOLVED: EventgateProduceRateAnomaly: Significant produce rate deviation (+-25%) on eventgate-analytics-external. - https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate - https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-dc=000000026&var-service=eventgate-analytics-external - https://alerts.wikimedia.org/?q=alertname%3DEventgateProduceRateAnomaly
[13:04:44] Data-Engineering, Data-Platform-SRE (2026-04-24 - 2026-05-15): Enable Ceph S3 locations for Hive Metastore tables - https://phabricator.wikimedia.org/T425673#11936149 (Gehel) We need to have a security model before moving too far along. This is very much part of our general strategy, we just need to make...
[13:15:04] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Patch-For-Review: Add log_id to wmf.mediawiki_history - https://phabricator.wikimedia.org/T425986#11936225 (xcollazo) >>! In T425986#11932955, @xcollazo wrote: > Now [[ https://airflow.wikimedia.org/dags/mediawiki...
[13:23:56] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE: Alluxio for Improved Superset Query Performance - https://phabricator.wikimedia.org/T288252#11936264 (JAllemandou)
[13:24:00] Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE, Patch-For-Review: [Data Platform] Test Alluxio as cache layer for Presto - https://phabricator.wikimedia.org/T266641#11936262 (JAllemandou) →Duplicate dup:T288252
[13:30:15] Data-Engineering, Data-Platform-SRE (2026-04-24 - 2026-05-15): Enable Ceph S3 locations for Hive Metastore tables - https://phabricator.wikimedia.org/T425673#11936285 (fkaelin) Thanks for the update, can we link to the phab for the security model?
[13:53:14] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History, Patch-For-Review: Add log_id to wmf.mediawiki_history - https://phabricator.wikimedia.org/T425986#11936374 (xcollazo) `wmf.mediawiki_history`'s `snapshot='2026-04'` LGTM: ` spark-sql (default)> select count(1) fr...
[14:38:10] Data-Engineering, Data-Platform-SRE (2026-04-24 - 2026-05-15): Enable the opt-in use of conda-analytics-next with the DbtSkeinOperator in Airflow - https://phabricator.wikimedia.org/T426728#11936619 (amastilovic) I can take this on, great idea.
[15:36:11] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History: Spike: full-history revert detection for `mediawiki_history_incremental_v1` - https://phabricator.wikimedia.org/T426469#11937142 (xcollazo) == Spike results == Queries run against `wmf.mediawiki_history` (full histor...
[15:39:29] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History: Spike: full-history revert detection for `mediawiki_history_incremental_v1` - https://phabricator.wikimedia.org/T426469#11937180 (xcollazo) Notebook with results at P92610.
[15:41:10] Data-Engineering, Data-Engineering-Radar, Event-Platform, Machine-Learning-Team (Q4 FY2025-26): Add Multilingual RevertRisk predictions to mediawiki.page_revert_risk_prediction_change - https://phabricator.wikimedia.org/T415892#11937192 (gkyziridis) === Update === I can query `event_sanitized.me...
[15:43:42] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Movement-Insights: interwiki imports and its effects on revision data - https://phabricator.wikimedia.org/T425735#11937221 (xcollazo) New field now available on `snapshot='2026-04'` of `wmf.mediawiki_history`: ` `event_user_is_cross_wiki`...
[16:06:47] Data-Engineering, Data-Engineering-Radar, Event-Platform, Machine-Learning-Team (Q4 FY2025-26): Add Multilingual RevertRisk predictions to mediawiki.page_revert_risk_prediction_change - https://phabricator.wikimedia.org/T415892#11937306 (Ottomata) Great!
[16:47:42] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Improve dbt-jobs GitLab CI/CD - https://phabricator.wikimedia.org/T426361#11937502 (amastilovic)
[17:05:44] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History: Spike: full-history revert detection for `mediawiki_history_incremental_v1` - https://phabricator.wikimedia.org/T426469#11937528 (xcollazo) # Decision Agreed with @JAllemandou that a 90d window makes sense.
[17:08:03] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History: mediawiki_history_incremental_v1: schema specification for stakeholder review - https://phabricator.wikimedia.org/T425573#11937535 (xcollazo) We ran a dedicated spike (T426469) to work out the right revert-field desig...
[17:11:08] (CR) TChin: [V:+2 C:+2] Add refinery-source jars for v0.3.14 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/1288853 (owner: Maven-release-user)
[18:26:34] (PS8) Xcollazo: Add MWHistoryDeltaWriter and MWHistorySnapshotMerger to refinery-job-35 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1284858 (https://phabricator.wikimedia.org/T424350)
[18:29:14] (CR) Xcollazo: "@joal@wikimedia.org this patchset should now cover `event_type=revision` fully with the new 90d revert semantics discussed in T426469. For" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1284858 (https://phabricator.wikimedia.org/T424350) (owner: Xcollazo)
[18:50:59] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History: mediawiki_history_incremental_v1: schema specification for stakeholder review - https://phabricator.wikimedia.org/T425573#11938074 (xcollazo) @nshahquinn-wmf and @Isaac: A full redesign addressing these concerns woul...
[19:11:13] (PS2) Xcollazo: Add DDL for mediawiki_history_incremental_v1 Iceberg table [analytics/refinery] - https://gerrit.wikimedia.org/r/1287959 (https://phabricator.wikimedia.org/T425729)
[19:12:11] (CR) Xcollazo: "@joal@wikimedia.org first DDL version for your review that only includes `revision` events for now. This will evolve as we add `page` and " [analytics/refinery] - https://gerrit.wikimedia.org/r/1287959 (https://phabricator.wikimedia.org/T425729) (owner: Xcollazo)
[22:46:27] !log Test Kitchen edge-unique experiments (poll 17739) - adds: logged-out-retention-round12; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[22:46:28] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[23:52:42] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History: mediawiki_history_incremental_v1: schema specification for stakeholder review - https://phabricator.wikimedia.org/T425573#11938699 (xcollazo) Quick update on `revision_tags` in the incremental table. Tags are not cur...
[23:54:22] (PS9) Xcollazo: Add MWHistoryDeltaWriter and MWHistorySnapshotMerger to refinery-job-35 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1284858 (https://phabricator.wikimedia.org/T424350)
[23:56:46] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Load Google Search Console data into the Data Lake - https://phabricator.wikimedia.org/T420996#11938700 (Ahoelzl) **TIL** > Google Search Console's bulk export writes two separate tables per property, and they are not aggregations of each other. The site-l...