[07:12:32] Data-Engineering, cloud-services-team, Data-Services, VPS-Projects, WMDE-References-FocusArea: Requesting Cloud VPS access to NFS mount /public/dumps - https://phabricator.wikimedia.org/T333549#10252851 (awight)
[07:24:23] (PS1) Gmodena: static_data: add annwiki to allowlist. [analytics/refinery] - https://gerrit.wikimedia.org/r/1082330 (https://phabricator.wikimedia.org/T376332)
[07:27:05] (PS2) Gmodena: static_data: add annwiki to allowlist. [analytics/refinery] - https://gerrit.wikimedia.org/r/1082330 (https://phabricator.wikimedia.org/T376332)
[07:29:07] (CR) Gmodena: static_data: add annwiki to allowlist. (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1082330 (https://phabricator.wikimedia.org/T376332) (owner: Gmodena)
[07:30:44] (PS3) Gmodena: static_data: add annwiki to allowlist. [analytics/refinery] - https://gerrit.wikimedia.org/r/1082330 (https://phabricator.wikimedia.org/T376332)
[09:27:02] Data-Engineering, Structured-Data-Backlog (Current Work): [L] Track commons deletion requests - https://phabricator.wikimedia.org/T370898#10253247 (Cparle) Ok the above all makes sense, but I'm not sure I understand what you're proposing that we change our current approach to. Could you spell it out step...
[09:28:51] Data-Engineering: [Maintenance] Safeguard VarnishKafka to HAProxy analytics transition - https://phabricator.wikimedia.org/T354694#10253252 (gmodena)
[09:30:11] Data-Engineering: [Maintenance] Safeguard VarnishKafka to HAProxy analytics transition - https://phabricator.wikimedia.org/T354694#10253259 (gmodena)
[09:34:21] Data-Engineering, Epic: [Maintenance] Safeguard VarnishKafka to HAProxy analytics transition - https://phabricator.wikimedia.org/T354694#10253298 (gmodena)
[09:42:02] Data-Engineering, Epic: load haproxykafka topics into HDFS via gobblin - https://phabricator.wikimedia.org/T377931 (gmodena) NEW
[09:49:17] Data-Engineering, Epic: load haproxykafka topics into HDFS via gobblin - https://phabricator.wikimedia.org/T377931#10253353 (gmodena) @Antoine_Quhen @Fabfur @Ottomata Previously (benthos iteration) we decided to use separate topics for Test and Production versions of the log shipper. This made sense wh...
[09:55:49] (PS1) Gmodena: gogglin: add webrequest_frontend pull job. [analytics/refinery] - https://gerrit.wikimedia.org/r/1082432 (https://phabricator.wikimedia.org/T377931)
[09:56:37] (PS2) Gmodena: gobblin: add webrequest_frontend pull job. [analytics/refinery] - https://gerrit.wikimedia.org/r/1082432 (https://phabricator.wikimedia.org/T377931)
[10:15:31] Data-Engineering, Dumps 2.0 (Kanban Board), Event-Platform: Update eventutilities_python wrappers to support Flink 1.20 - https://phabricator.wikimedia.org/T374359#10253417 (gmodena) > Now I'm backtracking the root cause of this type mismatch. The code path is triggered when a python function throws...
[11:54:17] Quarry, ChangeProp, collaboration-services, Infrastructure-Foundations, and 10 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10253657 (jijiki)
[12:11:33] Quarry, ChangeProp, cloud-services-team, collaboration-services, and 11 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10253695 (jijiki)
[13:35:01] Quarry, ChangeProp, cloud-services-team, collaboration-services, and 11 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10254048 (bking) Forgive the drive-by comment, but at the 6-month anniversary of this ticket, it might...
[13:54:49] Data-Engineering, Dumps 2.0 (Kanban Board), Event-Platform: Update eventutilities_python wrappers to support Flink 1.20 - https://phabricator.wikimedia.org/T374359#10254151 (gmodena) After some more digging, I think the issue comes by the return value of `flink_instant_of_datetime()` (our method) tri...
[14:16:52] (Abandoned) KCVelaga: Add elia to translation providers [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1076993 (https://phabricator.wikimedia.org/T357250) (owner: KCVelaga)
[14:17:10] Data-Engineering, Dumps 2.0 (Kanban Board), Event-Platform, Patch-For-Review: Update eventutilities_python wrappers to support Flink 1.20 - https://phabricator.wikimedia.org/T374359#10254277 (gmodena) > Down in beam, value will end up in a code path that triggers out_stream.write_int64(value.seco...
[15:13:10] Data-Engineering, Epic: [Maintenance] Safeguard VarnishKafka to HAProxy analytics transition - https://phabricator.wikimedia.org/T354694#10254542 (gmodena)
[15:15:21] Data-Engineering, Dumps 2.0 (Kanban Board), Event-Platform, Patch-For-Review: Update eventutilities_python wrappers to support Flink 1.20 - https://phabricator.wikimedia.org/T374359#10254571 (Ottomata) > If I cast int(value.seconds) Nice find!
[15:15:51] Data-Engineering, Dumps 2.0 (Kanban Board), Event-Platform, Patch-For-Review: Update eventutilities_python wrappers to support Flink 1.20 - https://phabricator.wikimedia.org/T374359#10254572 (Ottomata) Are we still hoping to merge include https://gitlab.wikimedia.org/repos/data-engineering/eventu...
[15:20:08] Data-Engineering, Epic, Patch-For-Review: load haproxykafka topics into HDFS via gobblin - https://phabricator.wikimedia.org/T377931#10254599 (Ottomata) > For data validation, we can set a cut off date, and expose it via webrequest only from that date onward. IIUC, `webrequest_frontend_text` then is...
[15:23:04] Data-Engineering, Structured-Data-Backlog (Current Work): [L] Track commons deletion requests - https://phabricator.wikimedia.org/T370898#10254614 (Ottomata) I'll let @xcollazo confirm, but I'm suggesting to use the new Dumps 2 (not yet production ready, but very soon!) `mediawiki_content_history` Iceber...
[15:26:36] Data-Engineering, Structured-Data-Backlog (Current Work): [L] Track commons deletion requests - https://phabricator.wikimedia.org/T370898#10254635 (Ottomata) But, @Cparle more generally, if we (DPE) were able to prioritize work for {T258511} and {T291120} and https://wikitech.wikimedia.org/wiki/MediaWiki...
[15:30:18] Data-Engineering, Structured-Data-Backlog (Current Work): [L] Track commons deletion requests - https://phabricator.wikimedia.org/T370898#10254661 (Cparle) Haha ok cool ... by 'squeakier' do you mean broadcasting what we need/want a bit louder? And when is Dumps 2.0 expected to land?
[15:32:19] Data-Engineering, Structured-Data-Backlog (Current Work): [L] Track commons deletion requests - https://phabricator.wikimedia.org/T370898#10254688 (Ottomata) > by 'squeakier' do you mean broadcasting what we need/want a bit louder? Yup! Be the [[ https://en.wikipedia.org/wiki/The_squeaky_wheel_gets_the_g...
[15:32:54] Data-Engineering, Data-Platform-SRE (2024.10.19 - 2024.11.08): Design a suitable DAG deployment method - https://phabricator.wikimedia.org/T368033#10254697 (amastilovic) @brouberol got it. You'll mount Ceph as a file system local to the Airflow instance, and HDFS sync will write to Ceph - effectively, to...
[15:42:33] Data-Engineering, Epic, Patch-For-Review: load haproxykafka topics into HDFS via gobblin - https://phabricator.wikimedia.org/T377931#10254751 (gmodena) >>! In T377931#10254599, @Ottomata wrote: >> For data validation, we can set a cut off date, and expose it via webrequest only from that date onward....
[15:43:29] Data-Engineering, Data-Platform-SRE (2024.10.19 - 2024.11.08): Design a suitable DAG deployment method - https://phabricator.wikimedia.org/T368033#10254764 (brouberol) That's exactly right! And until that materializes, we're running a git-sync process that syncs airflow-dags to the Ceph volume every 5 mi...
[15:46:14] Data-Engineering, Epic, Patch-For-Review: load haproxykafka topics into HDFS via gobblin - https://phabricator.wikimedia.org/T377931#10254781 (Ottomata) Okay! You might want to start finding and engaging with consumers ASAP so they aren't surprised by this. E.g. FRtech uses their own puppet reposit...
[15:51:45] Data-Engineering, Epic, Patch-For-Review: load haproxykafka topics into HDFS via gobblin - https://phabricator.wikimedia.org/T377931#10254814 (gmodena) > You might want to start finding and engaging with consumers ASAP so they aren't surprised by this. E.g. FRtech uses their own puppet repository and...
[15:54:04] !log deploying all airflow instances to pick up changes in T351388
[15:54:09] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:54:11] T351388: Make Airflow SparkSQL operator set fileoutputcommitter.algorithm.version=2 to avoid concurrent write issues - https://phabricator.wikimedia.org/T351388
[16:15:02] Data-Engineering, Cassandra, Data Pipelines, Data-Platform-SRE (2024.10.19 - 2024.11.08), Patch-For-Review: Create puppet resource for adding/updating/deleting secrets or other small files on HDFS - https://phabricator.wikimedia.org/T323692#10255057 (BTullis) This is still not working, unfort...
[16:50:18] (PS1) Snwachukwu: Add Query for cu_log table in sqoop. [analytics/refinery] - https://gerrit.wikimedia.org/r/1082515 (https://phabricator.wikimedia.org/T364398)
[17:05:26] Data-Engineering, Data-Engineering-Wikistats, Data Pipelines, I18n, and 2 others: Merge ks-Arab and ks-Deva to ks - https://phabricator.wikimedia.org/T314476#10255364 (MaryMunyoki)
[17:09:49] Data-Engineering, Data-Engineering-Wikistats, Data Pipelines, I18n, and 2 others: Merge ks-Arab and ks-Deva to ks - https://phabricator.wikimedia.org/T314476#10255370 (MaryMunyoki) a:srishakatux→Amire80
[17:28:17] Data-Engineering (Q2 2024 October 1st - December 31th): Write documentation on usage of RestExternalTaskSensor - https://phabricator.wikimedia.org/T378000 (amastilovic) NEW
[17:29:31] (CR) Xcollazo: [C:+2] static_data: add annwiki to allowlist. [analytics/refinery] - https://gerrit.wikimedia.org/r/1082330 (https://phabricator.wikimedia.org/T376332) (owner: Gmodena)
[17:42:04] Data-Engineering (Q2 2024 October 1st - December 31th): Write documentation on usage of RestExternalTaskSensor - https://phabricator.wikimedia.org/T378000#10255513 (Ottomata)
[17:42:05] Data-Engineering (Q1 2024 July 1st - September 30th), Dumps 2.0 (Kanban Board): Develop Airflow ExternalTaskSensor to orchestrate DAG dependencies - https://phabricator.wikimedia.org/T369900#10255514 (Ottomata)
[18:30:20] Data-Engineering (Q2 2024 October 1st - December 31th), Epic, Patch-For-Review: load haproxykafka topics into HDFS via gobblin - https://phabricator.wikimedia.org/T377931#10255745 (gmodena)
[18:30:50] Data-Engineering, Dumps 2.0 (Kanban Board), Event-Platform, Patch-For-Review: Update eventutilities_python wrappers to support Flink 1.20 - https://phabricator.wikimedia.org/T374359#10255756 (gmodena) a:gmodena
[19:03:44] (PS1) Mforns: Modify the automated traffic detection pipeline to include redirects [analytics/refinery] - https://gerrit.wikimedia.org/r/1082543 (https://phabricator.wikimedia.org/T375527)
[19:04:31] (CR) Mforns: [V:+1] Modify the automated traffic detection pipeline to include redirects [analytics/refinery] - https://gerrit.wikimedia.org/r/1082543 (https://phabricator.wikimedia.org/T375527) (owner: Mforns)
[19:24:54] Data-Engineering, CampaignEvents, EntitySchema, JsonConfig, and 15 others: Add namespace descriptions for Special:NamespaceInfo in WMF-deployed extensions - https://phabricator.wikimedia.org/T373070#10255972 (Msz2001)
[19:33:13] Data-Engineering (Q2 2024 October 1st - December 31th), Epic, Patch-For-Review: load haproxykafka topics into HDFS via gobblin - https://phabricator.wikimedia.org/T377931#10255997 (Fabfur) We've no issues in changing topic names even at the latest moment, as they will be completely configurable and t...
[20:11:47] Data-Engineering, MediaWiki-extensions-General, Documentation, Event-Platform: Update code comment links to Meta-Wiki schemas to new event platform - https://phabricator.wikimedia.org/T371305#10256117 (Ottomata) Thank you for the ticket! Someone should probably also check to see if this instrume...
[20:12:24] Data-Engineering, Event-Platform, Wikimedia-production-error: EventBus::send incorrectly assumes that JSON response can be converted to an associative array - https://phabricator.wikimedia.org/T371432#10256123 (Ottomata) Open→Resolved a:Ottomata
[20:13:02] Data-Engineering, Event-Platform, Wikimedia-production-error: EventBus::send incorrectly assumes that JSON response can be converted to an associative array - https://phabricator.wikimedia.org/T371432#10256125 (Ahoelzl)
[20:16:47] Data-Engineering, cloud-services-team, Data Products, Data-Platform-SRE, and 2 others: Hide rows in the globalblocks table when the associated globaluser row has gu_hidden_level as not 0 - https://phabricator.wikimedia.org/T371488#10256132 (Ottomata) @Milimetric @BTullis [[ https://docs.google.co...
[20:22:45] Data-Engineering, Add-Link, CirrusSearch, Growth-Team, and 4 others: revalidateLinkRecommendations.php fails periodically with JobQueueError: Could not enqueue jobs - https://phabricator.wikimedia.org/T371767#10256162 (Ottomata) Open→Resolved a:Ottomata Seems fine for now, there m...
[20:24:25] Data-Engineering, Data-Platform, Dumps 2.0, Event-Platform: [NEEDS INVESTIGATION][BUG] eventutilities_python operator metrics - https://phabricator.wikimedia.org/T373112#10256179 (Ahoelzl)
[20:25:20] Data-Engineering, cloud-services-team, Data Products, Data-Platform-SRE, and 2 others: Hide rows in the globalblocks table when the associated globaluser row has gu_hidden_level as not 0 - https://phabricator.wikimedia.org/T371488#10256180 (Dreamy_Jazz) Open→Resolved a:Dreamy_Jazz...
[20:28:23] Data-Engineering, CampaignEvents, Data Products, EntitySchema, and 16 others: Add namespace descriptions for Special:NamespaceInfo in WMF-deployed extensions - https://phabricator.wikimedia.org/T373070#10256174 (Ottomata) WMF doesn't use or maintain the Schema namespace in EventLogging anymore. T...
[20:28:39] Data-Engineering (Q2 2024 October 1st - December 31th): [SPIKE] Learn and document how to use Flink-CDC from MediaWiki MariaDB locally - https://phabricator.wikimedia.org/T373144#10256212 (Ahoelzl)
[20:28:43] Data-Engineering (Q2 2024 October 1st - December 31th), Epic: [Maintenance] Safeguard VarnishKafka to HAProxy analytics transition - https://phabricator.wikimedia.org/T354694#10256213 (Ahoelzl)
[20:31:26] Data-Engineering (Q1 2024 July 1st - September 30th): Allow maxLength changes for json schema compatibility - https://phabricator.wikimedia.org/T373633#10256242 (Ahoelzl)
[20:32:55] Data-Engineering (Q1 2024 July 1st - September 30th): Allow maxLength changes for json schema compatibility - https://phabricator.wikimedia.org/T373633#10256245 (Ahoelzl) Open→Resolved
[20:37:21] Data-Engineering, MediaWiki-extensions-WikimediaMaintenance, Event-Platform, Wikimedia-production-error: PHP Deprecated: Deprecated cross-wiki access to MediaWiki\Revision\RevisionRecord. Expected: the local wiki, Actual: 'guwwiki'. Pass expected $... - https://phabricator.wikimedia.org/T304528#10256267
[20:37:42] Data-Engineering, MediaWiki-extensions-WikimediaMaintenance, Event-Platform, Wikimedia-production-error: PHP Deprecated: Deprecated cross-wiki access to MediaWiki\Revision\RevisionRecord. Expected: the local wiki, Actual: 'guwwiki'. Pass expected $... - https://phabricator.wikimedia.org/T304528#10256275
[20:38:59] Data-Engineering, MediaWiki-extensions-WikimediaMaintenance, Event-Platform, Wikimedia-production-error: PHP Deprecated: Deprecated cross-wiki access to MediaWiki\Revision\RevisionRecord. Expected: the local wiki, Actual: 'guwwiki'. Pass expected $... - https://phabricator.wikimedia.org/T304528#10256277
[20:39:31] Data-Engineering, Discovery-Search (Current work): Datahub - ingest Hive discovery database - https://phabricator.wikimedia.org/T374118#10256293 (Ottomata) @BTullis @EBernhardson can this be resolved now that {T376657} is fixed?
[20:42:46] Data-Engineering (Q2 2024 October 1st - December 31th), Dumps 2.0, Discovery-Search (Current work), Patch-For-Review: [SPIKE] how can we support Spark producer/consumers in Event Platform - https://phabricator.wikimedia.org/T374341#10256299 (Ahoelzl)
[20:42:50] Data-Engineering (Q2 2024 October 1st - December 31th), Dumps 2.0, Discovery-Search (Current work), Patch-For-Review: [SPIKE] how can we support Spark producer/consumers in Event Platform - https://phabricator.wikimedia.org/T374341#10256301 (Ahoelzl)
[20:42:53] Data-Engineering, Discovery-Search (Current work): Datahub - ingest Hive discovery database - https://phabricator.wikimedia.org/T374118#10256303 (Ahoelzl)
[20:42:54] Data-Engineering, Epic: All things DataHub - https://phabricator.wikimedia.org/T369756#10256304 (Ahoelzl)
[20:42:59] Data-Engineering (Q2 2024 October 1st - December 31th), Dumps 2.0 (Kanban Board), Patch-For-Review: Update eventutilities_python wrappers to support Flink 1.20 - https://phabricator.wikimedia.org/T374359#10256310 (Ahoelzl)
[20:44:20] Data-Engineering, Data Products, MediaWiki-extensions-EventLogging, Temporary accounts: Prepare EventLogging for temp accounts - https://phabricator.wikimedia.org/T374812#10256331 (Ahoelzl)
[20:44:50] Data-Engineering, Data-Platform, DPE Temporary Accounts, Product-Analytics, and 2 others: Ensure performer attributes in schemas clarify if the user is a temporary account - https://phabricator.wikimedia.org/T374940#10256359 (Ahoelzl)
[20:46:14] Data-Engineering (Q2 2024 October 1st - December 31th): Move more of refine_hive_hourly dag logic into RefineConfiguration - https://phabricator.wikimedia.org/T375064#10256369 (Ahoelzl)
[20:47:21] Data-Engineering (Q2 2024 October 1st - December 31th), Dumps 2.0 (Kanban Board): [Data Quality] [SPIKE] Can we identify indicators to inform an SLO for event emission and intake? - https://phabricator.wikimedia.org/T345195#10256374 (Ahoelzl)
[20:48:10] Data-Engineering (Q2 2024 October 1st - December 31th), Dumps 2.0: [Event Platform] We should alert on EventBus performance degradation. - https://phabricator.wikimedia.org/T375197#10256378 (Ahoelzl)
[20:52:49] Data-Engineering, Research-engineering, Research-Freezer, Event-Platform: [Research Engineering Request] Productionized Edit Types - https://phabricator.wikimedia.org/T351225#10256395 (Ottomata) Related {T291120}
[20:53:24] Data-Engineering, Data Products, Traffic: Cookie % has been rejected because it is foreign and does not have the "Partitioned" attribute - https://phabricator.wikimedia.org/T375256#10256402 (Ottomata)
[20:58:12] Data-Engineering (Q2 2024 October 1st - December 31th): Update event-producing tools to overwrite `meta.dt` - https://phabricator.wikimedia.org/T376026#10256413 (Ahoelzl)
[20:58:13] Data-Engineering (Q2 2024 October 1st - December 31th): Update event-producing tools to overwrite `meta.dt` - https://phabricator.wikimedia.org/T376026#10256415 (Ahoelzl)
[20:59:33] Data-Engineering (Q2 2024 October 1st - December 31th), Dumps 2.0, Discovery-Search (Current work), Epic, Patch-For-Review: EPIC: Update flink jobs to support Flink 1.20 - https://phabricator.wikimedia.org/T376812#10256423 (Ahoelzl)
[21:01:34] Data-Engineering (Q2 2024 October 1st - December 31th), Discovery-Search (Current work), Dumps 2.0 (Kanban Board), Patch-For-Review: Bump eventutilities to support flink 1.20 - https://phabricator.wikimedia.org/T377130#10256430 (Ahoelzl)
[21:01:40] Data-Engineering (Q2 2024 October 1st - December 31th), Discovery-Search, Data-Platform-SRE (2024.10.19 - 2024.11.08): Create and distribute a flink base image with flink 1.20.0 - https://phabricator.wikimedia.org/T377134#10256432 (Ahoelzl)
[21:01:46] Data-Engineering (Q2 2024 October 1st - December 31th), Discovery-Search, Data-Platform-SRE (2024.10.19 - 2024.11.08): Upload an image with flink-k8s-operator version that supports flink 1.20 - https://phabricator.wikimedia.org/T377137#10256433 (Ahoelzl)
[21:03:25] Data-Engineering, Data Products: [refine] Add support for extra partitioning - https://phabricator.wikimedia.org/T377600#10256443 (Ottomata) Do you all need this this quarter Q2?
[21:03:39] Data-Engineering (Q2 2024 October 1st - December 31th): [Refine Refactoring] Refine Data Quality - late events, RefineMonitor refactor, etc. - https://phabricator.wikimedia.org/T377739#10256444 (Ahoelzl)
[21:04:47] Data-Engineering (Q2 2024 October 1st - December 31th), MediaWiki-extensions-WikimediaMaintenance, Wikimedia-production-error: PHP Deprecated: Deprecated cross-wiki access to MediaWiki\Revision\RevisionRecord. Expected: the local wiki, Actual: 'guwwiki... - https://phabricator.wikimedia.org/T304528#10256449