[06:30:16] Data-Engineering, Data-Platform-SRE: Package request: install elixir and erlang-otp to the analytics clients - https://phabricator.wikimedia.org/T362678#10311157 (awight)
[09:26:12] Data-Engineering: Make it possible to select the DAG deployment method - https://phabricator.wikimedia.org/T379279#10311383 (brouberol) Actually, we currently have: `lang=yaml gitsync: enabled: true ` by default. We could just have it set to `enabled: false` by default for all production instances.
[09:47:09] Data-Engineering (Q2 2024 October 1st - December 31th), Dumps 2.0 (Kanban Board), Patch-For-Review: Enable HA for the mw-dump-rev-content-reconcile-enrich flink application - https://phabricator.wikimedia.org/T375176#10311447 (tchin) For naming of the buckets, should we use the same pattern as `mw-pa...
[11:05:03] Data-Engineering, Product-Analytics, Wmfdata-Python: Support querying a range of hourly data partitions - https://phabricator.wikimedia.org/T294654#10311707 (nettrom_WMF) Merge request submitted: [[ https://gitlab.wikimedia.org/repos/data-engineering/wmfdata-python/-/merge_requests/65 ]]
[12:39:00] Data-Engineering, Data-Platform-SRE, Epic: Plan for a Hadoop and Hive upgrade for the Data Platform - https://phabricator.wikimedia.org/T379385#10311983 (BTullis) I have added [[https://issues.apache.org/jira/browse/BIGTOP-4218?focusedCommentId=17897497&page=com.atlassian.jira.plugin.system.issuetabp...
[12:50:23] Data-Engineering, Data-Platform-SRE, Epic: Plan for a Hadoop and Hive upgrade for the Data Platform - https://phabricator.wikimedia.org/T379385#10312007 (BTullis) I have done some initial tests to build the Hadoop and Hive packages against version 3.3.0 of Bigtop. ` btullis@marlin:~/wmf/bigtop$ doc...
[12:54:55] Data-Engineering, Data-Platform-SRE, Epic: Plan for a Hadoop and Hive upgrade for the Data Platform - https://phabricator.wikimedia.org/T379385#10312020 (BTullis) Now I am reading lots of information around how to do major version upgrades. e.g. [[https://www.slideshare.net/slideshow/migrating-your-...
[12:59:36] dse-k8s-etcd1003 will switch temporarily to DRBD to move to a new ganeti node, latencies might go up a bit
[13:05:01] Data-Engineering, Data-Platform-SRE, Epic: Plan for a Hadoop and Hive upgrade for the Data Platform - https://phabricator.wikimedia.org/T379385#10312032 (BTullis) This is a techblog post about how we switched from CDH to Bigtop in 2021. https://techblog.wikimedia.org/2021/05/07/upgrading-hadoop-in-ju...
[13:05:47] Data-Engineering, Data-Platform-SRE, Epic: Plan for a Hadoop and Hive upgrade for the Data Platform - https://phabricator.wikimedia.org/T379385#10312035 (BTullis) p:Triage→Medium
[13:06:50] Data-Engineering (Q2 2024 October 1st - December 31th), Dumps 2.0 (Kanban Board), Patch-For-Review: Enable HA for the mw-dump-rev-content-reconcile-enrich flink application - https://phabricator.wikimedia.org/T375176#10312038 (gmodena) > For naming of the buckets, should we use the same pattern as mw...
[13:15:01] Data-Engineering, Data-Platform-SRE, Epic: Plan for a Hadoop and Hive upgrade for the Data Platform - https://phabricator.wikimedia.org/T379385#10312052 (BTullis) >>! In T379385#10304807, @xcollazo wrote: > On the other hand, Uber did it: > https://www.uber.com/blog/hadoop-namenode-container/ > https...
[13:18:01] and dse-k8s-etcd1003 is back to normal
[13:57:58] Analytics, Data-Engineering, EventStreams, Wikidata, and 3 others: Expose rdf-streaming-updater.mutation content through EventStreams - https://phabricator.wikimedia.org/T294133#10312314 (dcausse) >>! In T294133#10306311, @Hannah_Bast wrote: > [...] > 4 ... Update that entity using the following...
[14:05:25] (PS21) Gmodena: hql: webrequest: add webrequest_frontend. [analytics/refinery] - https://gerrit.wikimedia.org/r/1012656 (https://phabricator.wikimedia.org/T378342)
[14:05:49] (PS2) Aqu: Add an option to ignore missing input folders in Refine [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1088346 (https://phabricator.wikimedia.org/T369845)
[14:08:03] (CR) Aqu: Add an option to ignore missing input folders in Refine (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1088346 (https://phabricator.wikimedia.org/T369845) (owner: Aqu)
[14:09:20] moritzm: Many thanks.
[14:09:24] (CR) Gmodena: hql: webrequest: add webrequest_frontend. (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1012656 (https://phabricator.wikimedia.org/T378342) (owner: Gmodena)
[14:12:05] (CR) Joal: [C:+1] hql: webrequest: add webrequest_frontend. (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1012656 (https://phabricator.wikimedia.org/T378342) (owner: Gmodena)
[14:13:54] Data-Engineering, CirrusSearch, MediaWiki-extensions-EventLogging, Metrics Platform, and 3 others: Error: Call to a member function getPageAsLinkTarget() on null - https://phabricator.wikimedia.org/T368543#10312420 (phuedx)
[14:31:03] (PS2) Milimetric: Include user_is_temp column in user table sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/1087964 (https://phabricator.wikimedia.org/T379631)
[14:39:37] (PS3) Milimetric: Update MediaWiki History to support Temp Accounts [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1088232 (https://phabricator.wikimedia.org/T379230) (owner: Mforns)
[14:39:37] (CR) Milimetric: "looks good, of course testing is the key but I have one comment on naming of the new fields" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1088232 (https://phabricator.wikimedia.org/T379230) (owner: Mforns)
[14:41:03] (CR) Milimetric: Modify MediaWiki History queries to support Temp Accounts (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1088342 (https://phabricator.wikimedia.org/T379230) (owner: Mforns)
[15:25:56] Data-Engineering-Icebox: Public Edit Data Lake: Mediawiki history snapshots available in SQL data store to cloud (labs) users - https://phabricator.wikimedia.org/T204950#10312806 (Ottomata)
[15:26:00] Data-Engineering, cloud-services-team, Data-Services, Epic: Plan a replacement for wiki replicas that is better suited to typical OLAP use cases than the MediaWiki OLTP schema - https://phabricator.wikimedia.org/T215858#10312805 (Ottomata)
[15:26:07] Data-Engineering-Icebox: Public Edit Data Lake: Mediawiki history snapshots available in SQL data store to cloud (labs) users - https://phabricator.wikimedia.org/T204950#10312811 (Ottomata)
[15:26:11] Analytics, Product-Analytics, Epic, Patch-For-Review: Data Lake incremental Data Updates - https://phabricator.wikimedia.org/T258511#10312812 (Ottomata)
[18:03:20] Data-Engineering, Data Products, MediaWiki-Platform-Team, MediaWiki-ResourceLoader, Schema-change: Drop unused module_deps table from MediaWiki schema - https://phabricator.wikimedia.org/T379661 (Krinkle) NEW
[18:04:39] Data-Engineering, Data Products, MediaWiki-Platform-Team, MediaWiki-ResourceLoader, Schema-change: Drop unused module_deps table from MediaWiki schema - https://phabricator.wikimedia.org/T379661#10313822 (Krinkle) I noticed this when helping out in the production error at {T379589}, which inv...
[18:27:20] Starting build #5 for job wikimedia-event-utilities-maven-release
[18:33:03] Project wikimedia-event-utilities-maven-release build #5: SUCCESS in 5 min 43 sec: https://integration.wikimedia.org/ci/job/wikimedia-event-utilities-maven-release/5/
[20:04:27] !log deploying refinery repo as part of deployment train
[20:04:29] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:21:41] Data-Engineering, Data-Engineering-Wikistats, Data Products, PageViewInfo, and 2 others: Pageviews Analysis 3.0 (Vue + Codex) - https://phabricator.wikimedia.org/T378549#10314313 (MusikAnimal) Summary of what was discussed today with the Data Engineering team: * The hope is to get this into Wiki...
[20:40:22] Data-Engineering, Data-Platform-SRE: Add relevant kafka clusters to defined airflow connections in puppet - https://phabricator.wikimedia.org/T379676 (Ottomata) NEW
[20:40:34] Data-Engineering, Data-Platform-SRE: Add relevant kafka clusters to defined airflow connections in puppet - https://phabricator.wikimedia.org/T379676#10314356 (Ottomata)
[21:22:37] !log Deployed refinery using scap, then deployed onto hdfs
[21:22:39] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:27:40] !log deploying airflow as part of weekly deployment train
[21:27:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log