[00:43:43] (PS4) Zabe: querypage: MostCategories: Include all content namespaces [analytics/refinery] - https://gerrit.wikimedia.org/r/1267966 (https://phabricator.wikimedia.org/T413362)
[00:44:31] (CR) Zabe: querypage: MostCategories: Include all content namespaces (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1267966 (https://phabricator.wikimedia.org/T413362) (owner: Zabe)
[04:18:59] Data-Engineering, Data-Platform: add x3 to the MediaWiki replicas - https://phabricator.wikimedia.org/T423979 (Novem_Linguae) NEW
[04:20:08] Data-Engineering, Data-Platform: add x4 to the MediaWiki replicas - https://phabricator.wikimedia.org/T423980 (Novem_Linguae) NEW
[04:20:50] Data-Engineering, Data-Platform: add x4 to the MediaWiki replicas - https://phabricator.wikimedia.org/T423980#11842482 (Novem_Linguae) Open→Stalled Marking as stalled since I don't think the x4 cluster is created yet. Can un-stall once they're done with their work there.
[04:34:16] Data-Engineering, Data Pipelines, Pageviews-API: Pageviews API returning 404 for 2026-04-17 onward - https://phabricator.wikimedia.org/T423818#11842503 (MusikAnimal) I'm told there was an outage that has now been fixed, and Data Engineering is working to backfill the data.
[07:09:53] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Weekly core contributor metrics - MediaWiki event data source improvements for incremental MWH - https://phabricator.wikimedia.org/T423935#11842665 (JAllemandou)
[07:26:22] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Add new API rate limiting fields from webrequest_logs to Turnilo view - https://phabricator.wikimedia.org/T419736#11842680 (KCVelaga_WMF) Theoretically we can also answer most of these questions with Superset. However, Superset can be slow with webrequest...
[08:40:51] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: HTML Enrichment - Alerting - https://phabricator.wikimedia.org/T423996 (JMonton-WMF) NEW
[08:51:50] Data-Engineering, Data-Engineering-Radar, Commons, Data-Persistence, and 6 others: Migrate file tables to a modern layout (image/oldimage; file/filerevision; add primary keys) - https://phabricator.wikimedia.org/T28741#11843021 (IKhitron) Please see {T423998}.
[08:53:42] (PS1) Joal: Update webrequest validation algorithm [analytics/refinery] - https://gerrit.wikimedia.org/r/1275815 (https://phabricator.wikimedia.org/T422030)
[08:59:23] Data-Engineering, Data-Engineering-Radar, cloud-services-team, Data-Services, DBA: Re-run maintainviews on all clouddb* and an-redacteddb1001.eqiad.wmnet - https://phabricator.wikimedia.org/T422459#11843047 (jeremyb)
[09:00:57] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Add new API rate limiting fields from webrequest_logs to Turnilo view - https://phabricator.wikimedia.org/T419736#11843056 (JAllemandou) >>! In T419736#11842680, @KCVelaga_WMF wrote: > Theoretically we can also answer most of these questions with Superset....
[09:21:21] Data-Engineering, Data-Engineering-Radar, cloud-services-team, Data-Services, DBA: Re-run maintainviews on all clouddb* and an-redacteddb1001.eqiad.wmnet - https://phabricator.wikimedia.org/T422459#11843118 (taavi)
[09:21:50] Data-Engineering, Data-Engineering-Radar, cloud-services-team, Data-Services, DBA: Re-run maintainviews on all clouddb* and an-redacteddb1001.eqiad.wmnet - https://phabricator.wikimedia.org/T422459#11843120 (taavi)
[09:34:46] Data-Engineering, Data-Engineering-Radar, Commons, Data-Persistence, and 6 others: Migrate file tables to a modern layout (image/oldimage; file/filerevision; add primary keys) - https://phabricator.wikimedia.org/T28741#11843191 (Zabe) >>! In T28741#11843014, @IKhitron wrote: > Please see {T423998...
[09:48:35] Data-Engineering, Data-Engineering-Radar, cloud-services-team, Data-Services, DBA: Re-run maintainviews on all clouddb* and an-redacteddb1001.eqiad.wmnet - https://phabricator.wikimedia.org/T422459#11843223 (FCeratto-WMF)
[10:24:33] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Audit and fix observability (logging and metrics) for pyflink jobs - https://phabricator.wikimedia.org/T418996#11843356 (JMonton-WMF) Here there is a new Grafana dashboard for [[ https://grafana-rw.wikimedia.org/d/...
[10:26:18] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Audit and fix observability (logging and metrics) for pyflink jobs - https://phabricator.wikimedia.org/T418996#11843358 (JMonton-WMF)
[11:08:01] Data-Engineering, SRE, SRE-Access-Requests: Requesting access to analytics_privatedata_users and SQL Lab for AnnieKim_WMDE - https://phabricator.wikimedia.org/T420500#11843467 (AnnieKim_WMDE) SSH Key has been suspended and deleted from Bitu Identity Manager and won't be used anywhere else. I believe...
[12:41:53] (CR) Xcollazo: [C:+1] Update webrequest validation algorithm [analytics/refinery] - https://gerrit.wikimedia.org/r/1275815 (https://phabricator.wikimedia.org/T422030) (owner: Joal)
[13:30:51] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Perform a one-time clean up of retained data sets in event_sanitize - https://phabricator.wikimedia.org/T417694#11844225 (xcollazo)
[13:48:14] (CR) Xcollazo: [C:+2] querypage: MostCategories: Include all content namespaces [analytics/refinery] - https://gerrit.wikimedia.org/r/1267966 (https://phabricator.wikimedia.org/T413362) (owner: Zabe)
[14:33:38] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Perform a one-time clean up of retained data sets in event_sanitize - https://phabricator.wikimedia.org/T417694#11844788 (xcollazo) Now removing all other "Orphaned datasets (removed from allowlist, HDFS data never deleted)" with the fo...
[14:43:28] Data-Engineering, Data-Engineering-Radar, Commons, Data-Persistence, and 6 others: Migrate file tables to a modern layout (image/oldimage; file/filerevision; add primary keys) - https://phabricator.wikimedia.org/T28741#11844856 (matmarex)
[15:03:39] Data-Engineering, Data Pipelines, Pageviews-API: Pageviews API returning 404 for 2026-04-17 onward - https://phabricator.wikimedia.org/T423818#11845161 (amastilovic) We have backfilled the Cassandra jobs that have been failing for the past 3 (now 4) days, would you guys mind taking a look if everythi...
[15:11:54] Data-Engineering (Q4 FS25/26 April 1st - June 30st): WE1.5 Consult on monthly active moderators data lake pipeline - https://phabricator.wikimedia.org/T419584#11845244 (xcollazo)
[15:11:57] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review: Add dbt base model for Wikipedia moderator actions metrics - https://phabricator.wikimedia.org/T423565#11845243 (xcollazo)
[15:12:18] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Patch-For-Review: Add dbt base model for Wikipedia moderator actions metrics - https://phabricator.wikimedia.org/T423565#11845246 (xcollazo) Open→In progress p:Triage→High a:xcollazo
[15:24:12] Data-Engineering, Event-Platform: Streaming HTML & Edit Types - productionization checklist - https://phabricator.wikimedia.org/T423920#11845309 (Ottomata)
[15:26:11] Data-Engineering, Data-Engineering-Radar, Event-Platform, Machine-Learning-Team (Q4 FY2025-26): Add Multilingual RevertRisk predictions to mediawiki.page_revert_risk_prediction_change - https://phabricator.wikimedia.org/T415892#11845348 (gkyziridis) I am pasting here some results from loading tes...
[15:31:31] Data-Engineering, Event-Platform: Streaming HTML & Edit Types - productionization checklist - https://phabricator.wikimedia.org/T423920#11845373 (Ottomata)
[15:37:17] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Perform a one-time clean up of retained data sets in event_sanitize - https://phabricator.wikimedia.org/T417694#11845394 (xcollazo) >>! In T417694#11844788, @xcollazo wrote: > Now removing all other "Orphaned datasets (removed from allo...
[15:38:27] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Perform a one-time clean up of retained data sets in event_sanitize - https://phabricator.wikimedia.org/T417694#11845396 (xcollazo)
[15:42:14] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform, Patch-For-Review: Audit and fix observability (logging and metrics) for pyflink jobs - https://phabricator.wikimedia.org/T418996#11845404 (JMonton-WMF)
[15:45:36] Data-Engineering, Data-Engineering-Icebox, Data Pipelines: Drop MediaViewer and MultimediaViewer* tables - https://phabricator.wikimedia.org/T311229#11845410 (xcollazo) As part of T417694, I checked and these tables have already been deleted but not logged here: ` hive (event)> show tables in event...
[15:46:09] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data Pipelines: Drop MediaViewer and MultimediaViewer* tables - https://phabricator.wikimedia.org/T311229#11845425 (xcollazo) Open→Resolved a:xcollazo
[16:42:04] !log Test Kitchen mw-user experiment (poll 141002) - adds: growthexperiments-revise-tone; removes: none; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD
[16:42:05] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:57:08] Data-Engineering: Drop sanitized ExternalGuidance data - https://phabricator.wikimedia.org/T304714#11845967 (xcollazo) > This task tracks dropping any/all sanitized ExternalGuidance data. This was done as part of T417694#11845394.
[16:57:32] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Drop sanitized ExternalGuidance data - https://phabricator.wikimedia.org/T304714#11845977 (xcollazo) Open→Resolved a:xcollazo
[17:00:07] Data-Engineering, Data Pipelines, Pageviews-API: Pageviews API returning 404 for 2026-04-17 onward - https://phabricator.wikimedia.org/T423818#11846005 (Jpsowin) oh yeah looking good now! thank you so much!
[17:00:39] Data-Engineering, Data-Engineering-Radar: Drop event.changeslistfiltergrouping table - https://phabricator.wikimedia.org/T317942#11846011 (xcollazo) As part of T417694, I checked and these tables have already been deleted but not logged here: ` hive (event)> show tables in event like 'change*'; OK tab_n...
[17:01:22] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Drop event.changeslistfiltergrouping table - https://phabricator.wikimedia.org/T317942#11846014 (xcollazo) Open→Resolved a:xcollazo
[17:02:43] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Perform a one-time clean up of retained data sets in event_sanitize - https://phabricator.wikimedia.org/T417694#11846027 (xcollazo) >>! In T417694#11813575, @phuedx wrote: > ... > When you delete these tables, could you please also dele...
[17:14:05] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Perform a one-time clean up of retained data sets in event_sanitize - https://phabricator.wikimedia.org/T417694#11846081 (xcollazo) Moving on to next set: | Schema | Files | Size | Last Write | Introduced by | Date | Commit | Task | No...
[17:16:46] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Add new API rate limiting fields from webrequest_logs to Turnilo view - https://phabricator.wikimedia.org/T419736#11846090 (GGoncalves-WMF) Sorry for the delay here, @Ahoelzl is looking for an assignee.
[17:22:42] (PS1) Xcollazo: Remove DesktopWebUIActionsTracking, MobileWebUIActionsTracking, ReadingDepth from sanitization allowlist [analytics/refinery] - https://gerrit.wikimedia.org/r/1275982 (https://phabricator.wikimedia.org/T417694)
[17:31:24] Data-Engineering, Event-Platform: mediawiki.page_change.v1 event - Add revision is revert field - https://phabricator.wikimedia.org/T423583#11846203 (Ottomata) Note: Joseph also would like it if we could get the list of revisions in between the current rev and the reverted-to-revision. This may be possi...
[18:24:53] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Traffic, Data-Platform-SRE (2026-03-27 - 2026-04-17), Patch-For-Review: Surge in webrequest validation check - https://phabricator.wikimedia.org/T422030#11846499 (JAllemandou) a:JAllemandou→xcollazo
[18:41:15] FIRING: HdfsRpcQueueLength: RPC queue length on the analytics-hadoop cluster is too high. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Namenode_RPC_length_queue/latency - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=54&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsRpcQueueLength
[18:47:15] FIRING: HdfsRpcQueueLatency: RPC queue latency on the analytics-hadoop cluster is too high. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Namenode_RPC_length_queue/latency - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=56&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsRpcQueueLatency
[18:51:15] RESOLVED: HdfsRpcQueueLength: RPC queue length on the analytics-hadoop cluster is too high. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Namenode_RPC_length_queue/latency - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=54&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsRpcQueueLength
[18:52:15] RESOLVED: HdfsRpcQueueLatency: RPC queue latency on the analytics-hadoop cluster is too high. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Namenode_RPC_length_queue/latency - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=56&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsRpcQueueLatency
[18:56:16] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work, Patch-For-Review: Perform a one-time clean up of retained data sets in event_sanitize - https://phabricator.wikimedia.org/T417694#11846640 (xcollazo) | Schema | In Allowlist | Data Range | Row Count (approx) | Anomaly | Introduced B...
[21:51:09] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Backfill newly productionized edit types dataset - https://phabricator.wikimedia.org/T421919#11847520 (AKhatun_WMF) Backfill is now complete. `akhatun.edit_type_v3` contains edit-type data from `ns0` and just Wikipedias. Uses `mwedittypes` v3.1.0 and `mwpa...
[22:59:42] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data Pipelines, Pageviews-API: Pageviews API returning 404 for 2026-04-17 onward - https://phabricator.wikimedia.org/T423818#11847764 (Ahoelzl) p:Triage→High a:amastilovic
[23:57:18] (CR) Zabe: "@xcollazo@wikimedia.org I'd presume this repo has no CI? Could you V+2 and merge this patch in that case? :)" [analytics/refinery] - https://gerrit.wikimedia.org/r/1267966 (https://phabricator.wikimedia.org/T413362) (owner: Zabe)