[04:18:15] Data-Engineering (Q3 2025 January 1st - March 31th): Enable Spark data lineage for all Airflow instances - https://phabricator.wikimedia.org/T386862#10702165 (tchin) @brouberol would you happen to have some insight into this issue?
[07:10:30] Data-Engineering (Q3 2025 January 1st - March 31th): Enable Spark data lineage for all Airflow instances - https://phabricator.wikimedia.org/T386862#10702314 (brouberol) The previous issue you linked to in ... was fixed by introducing the following patch: `lang=diff I've deployed a small config change to th...
[07:14:32] Data-Engineering (Q3 2025 January 1st - March 31th): Enable Spark data lineage for all Airflow instances - https://phabricator.wikimedia.org/T386862#10702315 (brouberol) What I'm thinking atm is that multiple things use `datahub_kafka_jumbo.host`: the python Kafka client used by airflow to interact with data...
[08:04:12] Data-Engineering, Machine-Learning-Team, Research, Event-Platform: Emit revision revert risk scores as a stream and expose in EventStreams API - https://phabricator.wikimedia.org/T326179#10702417 (achou) Sounds great! Thanks for all the input. We agreed that for this new stream, only `mediawiki.p...
[08:13:06] Data-Engineering (Q3 2025 January 1st - March 31th), Data Pipelines, Observability-Metrics, Essential-Work, and 2 others: Disable Data Platform Engineering generated graphite metrics and dashboards - https://phabricator.wikimedia.org/T372855#10702480 (AndrewTavis_WMDE) Happy to help, @Snwachukwu,...
[08:28:53] (CR) Gehel: "After discussion, we're creating "refinery-lightweight-job" module. This is slightly more than just bare JVM (we might be using HDFS in th" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953 (owner: Aqu)
[08:55:31] (PS7) Aqu: Create a new module to isolate lightweight jobs [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953
[09:34:08] Data-Engineering, Data-Engineering-Radar, Growth-Team, GrowthExperiments, and 6 others: mw.track: support for histogram metrics - https://phabricator.wikimedia.org/T383563#10702825 (Michael) One use-case for this was "migrated" from Graphite to Prometheus in [homepage: Add homepage_transfersize_b...
[09:51:12] (PS8) Aqu: Create a new module to isolate lightweight jobs [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1120953
[10:21:04] Data-Engineering, DPE-Mediawiki-Content, Dumps-Generation: mediawikiwiki fails some dumps tasks - https://phabricator.wikimedia.org/T390605#10703018 (BTullis) →Duplicate dup:T390839
[10:24:10] Data-Engineering, Dumps-Generation: skwikibooks dumps failing - https://phabricator.wikimedia.org/T375928#10703035 (BTullis) Open→Resolved a:BTullis The last two dumps of skwikibooks seem to have completed successfully, so I think that we can close this. https://dumps.wikimedia.org/skwik...
[10:39:08] Data-Engineering, Data-Platform-SRE, Java-Scala-Standardization, Discovery-Search (2025.03.22 - 2025.04.11): Migrate existing Java packages to deploying to Gitlab, including new version of parent pom, validation that all dependencies are available,... - https://phabricator.wikimedia.org/T367405#10703116
[11:00:45] !log roll-restarting the cephosd cluster for T389184
[11:00:48] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:00:48] T389184: Upgrade the Data Platform Ceph cluster to the latest point release: 18.2.4 - https://phabricator.wikimedia.org/T389184
[11:02:39] (CR) Milimetric: 1.Add a closed flag to the project namespace map dataset 2. Add a whether to sqoop flag by checking if wikidb exists in cloud replica. (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1125184 (https://phabricator.wikimedia.org/T241741) (owner: Milimetric)
[13:31:18] Data-Engineering, Infrastructure-Foundations: Elasticsearch dependency upgrade in spicerack - https://phabricator.wikimedia.org/T390860 (Volans) NEW
[14:00:08] Data-Engineering, Machine-Learning-Team, Research, Event-Platform: Emit revision revert risk scores as a stream and expose in EventStreams API - https://phabricator.wikimedia.org/T326179#10703868 (Ottomata) I personally prefer the general purpose `page` name as well. Article is a special case. A...
[14:06:14] Data-Engineering, Machine-Learning-Team, Research, Event-Platform: Emit revision revert risk scores as a stream and expose in EventStreams API - https://phabricator.wikimedia.org/T326179#10703937 (kostajh) >>! In T326179#10703868, @Ottomata wrote: > I personally prefer the general purpose `page`...
[14:08:41] Data-Engineering: Drop views of module_deps tables - https://phabricator.wikimedia.org/T388982#10703950 (joanna_borun)
[14:09:29] Data-Engineering: Drop views of module_deps tables - https://phabricator.wikimedia.org/T388982#10703956 (fnegri) Is this done? I see update-views was run by @Ladsgroup
[14:11:16] Data-Engineering: Drop views of module_deps tables - https://phabricator.wikimedia.org/T388982#10703999 (Ladsgroup) Open→Resolved a:Ladsgroup Yup!
[14:16:09] Data-Engineering, AbuseFilter, DBA, MediaWiki-extensions-IPReputation, and 4 others: AbuseFilter protected variables: Make it possible for protected variable values expire when the IP address expires - https://phabricator.wikimedia.org/T390873 (Dreamy_Jazz) NEW
[14:25:42] Data-Engineering, Data-Platform-SRE (2025.03.22 - 2025.04.11): Add HDFS alert at 20% free space - https://phabricator.wikimedia.org/T390875 (Gehel) NEW
[14:31:56] Data-Engineering, AbuseFilter, Data-Persistence, MediaWiki-extensions-IPReputation, and 4 others: AbuseFilter protected variables: Make it possible for protected variable values expire when the IP address expires - https://phabricator.wikimedia.org/T390873#10704215 (Marostegui)
[14:51:29] Data-Engineering, Data-Platform-SRE (2025.03.22 - 2025.04.11): Canary failure on airflow platform_eng intsance after migrating to Kubernetes - https://phabricator.wikimedia.org/T390727#10704309 (BTullis) The https://airflow-platform-eng.wikimedia.org instance is fully on Kubernetes now, so we should be g...
[15:19:21] Data-Engineering, Data-Platform-SRE (2025.03.22 - 2025.04.11): Canary failure on airflow platform_eng intsance after migrating to Kubernetes - https://phabricator.wikimedia.org/T390727#10704573 (xcollazo)
[15:19:53] Data-Engineering, Data-Platform-SRE (2025.03.22 - 2025.04.11): Canary failure on airflow platform_eng intsance after migrating to Kubernetes - https://phabricator.wikimedia.org/T390727#10704579 (xcollazo)
[15:59:22] Data-Engineering, Data-Engineering-Radar, Dumps-Generation, MediaWiki-Platform-Team, serviceops: Migrate WMF production from PHP 7.4 to PHP 8.1 - https://phabricator.wikimedia.org/T319432#10704726 (taavi)
[16:00:59] Data-Engineering, AbuseFilter, Data-Persistence, MediaWiki-extensions-IPReputation, and 4 others: AbuseFilter protected variables: Make it possible for protected variable values expire when the IP address expires - https://phabricator.wikimedia.org/T390873#10704740 (Bugreporter) Alternatively we...
[16:08:50] Data-Engineering (Q3 2025 January 1st - March 31th), DPE-Mediawiki-Content: Investigate and fix duplicate data on wmf_content.mediawiki_content_history_v1 for muswiki - https://phabricator.wikimedia.org/T388715#10704778 (xcollazo) Open→Resolved
[16:09:31] Data-Engineering, AQS2.0, Commons-Impact-Metrics, Patch-For-Review: Get more data to populate CIM time series datasets for the AQS local test environment - https://phabricator.wikimedia.org/T372989#10704783 (xcollazo)
[16:09:47] Data-Engineering (Q4 2025 April 1st - June 30th): Use the Spark-Iceberg built in CDC mechanism to PoC a replacement for wmf.wikimedia_wikitext_current - https://phabricator.wikimedia.org/T366544#10704784 (xcollazo)
[16:10:03] Data-Engineering (Q4 2025 April 1st - June 30th), DPE-Mediawiki-Content, Patch-For-Review: Figure root cause of silent failures when computing metrics for mediawiki_content_history_v1 - https://phabricator.wikimedia.org/T387033#10704785 (xcollazo)
[16:11:04] Data-Engineering (Q4 2025 April 1st - June 30th), DPE-Mediawiki-Content: mw_content_reconcile_mw_content_history_monthly is not sensing correctly - https://phabricator.wikimedia.org/T390783#10704787 (xcollazo) Open→In progress p:Triage→High
[16:12:26] Data-Engineering (Q4 2025 April 1st - June 30th), Data-Platform-SRE (2025.03.22 - 2025.04.11): Canary failure on airflow platform_eng intsance after migrating to Kubernetes - https://phabricator.wikimedia.org/T390727#10704798 (xcollazo)
[16:13:49] Data-Engineering (Q4 2025 April 1st - June 30th), Data-Platform-SRE (2025.03.22 - 2025.04.11): Canary failure on airflow platform_eng intsance after migrating to Kubernetes - https://phabricator.wikimedia.org/T390727#10704804 (xcollazo) Open→In progress p:Triage→High
[16:14:39] Data-Engineering (Q4 2025 April 1st - June 30th), Data-Platform-SRE (2025.03.22 - 2025.04.11): Canary failure on airflow platform_eng intsance after migrating to Kubernetes - https://phabricator.wikimedia.org/T390727#10704820 (xcollazo)
[16:14:40] Data-Engineering-Roadmap, DPE-Mediawiki-Content, Epic: Support downstream users in adopting mediawiki_content_history_v1 - https://phabricator.wikimedia.org/T387021#10704821 (xcollazo)
[16:26:09] Data-Engineering, Machine-Learning-Team, Research, Event-Platform: Emit revision revert risk scores as a stream and expose in EventStreams API - https://phabricator.wikimedia.org/T326179#10704864 (achou) > Does RevertRisk work for non main namespace revisions? No, it only works for Wikipedia main...
[16:46:01] Data-Engineering (Q4 2025 April 1st - June 30th), Data-Platform-SRE (2025.03.22 - 2025.04.11): Canary failure on airflow platform_eng intsance after migrating to Kubernetes - https://phabricator.wikimedia.org/T390727#10705046 (xcollazo) I've tested `test_generic_artifact_deployment_dag` and it was succes...
[16:53:48] Data-Engineering, Data-Platform-SRE (2025.03.22 - 2025.04.11): Add HDFS alert at 20% free space - https://phabricator.wikimedia.org/T390875#10705088 (Gehel) Open→Resolved a:Gehel
[17:02:26] Data-Engineering (Q4 2025 April 1st - June 30th), Data-Platform-SRE (2025.03.22 - 2025.04.11): Canary failure on airflow platform_eng intsance after migrating to Kubernetes - https://phabricator.wikimedia.org/T390727#10705117 (xcollazo) Actually, I will do a minor refactor of the canaries so that it is c...
[17:04:54] (PS1) KCVelaga: Add MinT for Readers stream to sanitization allow list [analytics/refinery] - https://gerrit.wikimedia.org/r/1133484 (https://phabricator.wikimedia.org/T372724)
[17:42:42] Data-Engineering, Data-Platform-SRE, Java-Scala-Standardization, Discovery-Search (2025.03.22 - 2025.04.11): Migrate existing Java packages to deploying to Gitlab, including new version of parent pom, validation that all dependencies are available,... - https://phabricator.wikimedia.org/T367405#10705307
[17:56:36] FIRING: MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[17:56:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[18:01:36] RESOLVED: MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[18:01:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[18:08:36] FIRING: MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[18:08:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[18:13:36] RESOLVED: MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[18:13:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[19:07:05] Data-Engineering, AbuseFilter, Data-Persistence, MediaWiki-extensions-IPReputation, and 4 others: AbuseFilter protected variables: Make it possible for protected variable values expire when the IP address expires - https://phabricator.wikimedia.org/T390873#10705701 (Dreamy_Jazz) >>! In T390873#10...
[19:08:51] (CR) Xcollazo: [C:+1] "LGTM, please merge at your convenience." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1127964 (https://phabricator.wikimedia.org/T384962) (owner: TChin)
[19:22:36] FIRING: MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[19:22:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[19:32:36] RESOLVED: MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[19:32:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[19:37:36] FIRING: MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[19:37:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[19:47:36] RESOLVED: MediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag: ...
[19:47:36] High Kafka consumer lag for mw_content_history_reconcile_enrich in eqiad - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=eqiad%20prometheus/k8s-dse&var-namespace=mw-content-history-reconcile-enrich&var-helm_release=production&var-operator_name=All&var-flink_job_name=mw_content_history_reconcile_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiContentHistoryReconcileEnrichHighKafkaConsumerLag
[23:00:02] Data-Engineering (Q3 2025 January 1st - March 31th), Essential-Work: Support for 4.3.11 - webrequest based scraping detection - https://phabricator.wikimedia.org/T388721#10706307 (Ahoelzl) a:mforns