[09:40:38] Data-Engineering (Q2 2024 October 1st - December 31th): [SPIKE] Learn and document how to use Flink-CDC from MediaWiki MariaDB locally - https://phabricator.wikimedia.org/T373144#10266808 (NoZeroDay) I didn't see any errors being captured there but I'll try a different approach within the course of the week.
[10:02:26] Data-Engineering, serviceops, Event-Platform: Make eventstreams-internal available to WMF staff without an ssh tunnel - https://phabricator.wikimedia.org/T348763#10266916 (phuedx) Would it be valuable to move MPIC to using oauth2-proxy for consistency with these other systems?
[10:53:11] hey folks
[10:53:22] I have a couple of alerts to mention
[10:53:34] 1) an-test-master1001's puppet has been failing for a while
[10:53:52] 2) archiva seems almost out of space
[10:54:29] (both root partition and /var/lib/archiva)
[10:55:22] ah and also elastic1073 seems to be failing health checks :D
[10:58:41] Data-Engineering (Q2 2024 October 1st - December 31th): Deploy a staging airflow dag for webrequest refinement - https://phabricator.wikimedia.org/T378342 (gmodena) NEW
[11:08:03] elukey: looking at archiva
[11:09:01] ok, back to 82% disk usage
[11:10:21] thanks :)
[11:25:15] Data-Engineering, Data-Platform, Dumps-Generation, Trust and Safety Product Team, and 2 others: Hide autoblocks from the globalblocks table database dump - https://phabricator.wikimedia.org/T376726#10267257 (kostajh)
[11:27:19] elukey as per an-test-master, it seems b.tullis uses it as a test bed for a feature he's developing and that's currently not working, I'm going to mute it for a week (b.tullis is afk until Nov 4th)
[11:28:00] Data-Engineering, Data-Platform, Dumps-Generation, Trust and Safety Product Team, and 2 others: Hide autoblocks from the globalblocks table database dump - https://phabricator.wikimedia.org/T376726#10267256 (kostajh) >>! In T376726#10233103, @kostajh wrote: > Now that the patches are merged, I se...
[11:31:17] brouberol: please let's fix the issue, because it is technically a production host and without puppet running for days it doesn't show up in various places/reports
[11:33:50] ack, I'll fix it and b.tullis will come back to it in a week
[11:34:36] super thanks
[11:45:53] I've tagged you on the patch
[11:48:12] reviewed thanks! Shall we just comment the profile include?
[12:08:18] Analytics, Data-Engineering, Metrics Platform, Event-Platform: Client-side error logging should use Elastic Common Schema (ECS) fields when possible - https://phabricator.wikimedia.org/T267602#10267496 (phuedx)
[13:01:48] Data-Engineering (Q2 2024 October 1st - December 31th), Patch-For-Review: Deploy a staging airflow dag for webrequest refinement - https://phabricator.wikimedia.org/T378342#10267643 (gmodena)
[13:16:30] Data-Engineering (Q2 2024 October 1st - December 31th), Epic, Patch-For-Review: [Maintenance] Safeguard VarnishKafka to HAProxy analytics transition - https://phabricator.wikimedia.org/T354694#10267675 (gmodena)
[13:16:53] Data-Engineering (Q2 2024 October 1st - December 31th), Epic, Patch-For-Review: [Maintenance] Safeguard VarnishKafka to HAProxy analytics transition - https://phabricator.wikimedia.org/T354694#10267679 (gmodena)
[13:45:58] (PS2) Snwachukwu: Add Query for cu_log table in sqoop. [analytics/refinery] - https://gerrit.wikimedia.org/r/1082515 (https://phabricator.wikimedia.org/T364398)
[14:21:51] Data-Engineering, Data-Platform, Dumps-Generation, Trust and Safety Product Team, and 2 others: Hide autoblocks from the globalblocks table database dump - https://phabricator.wikimedia.org/T376726#10267916 (xcollazo) >>! In T376726#10267255, @kostajh wrote: >>>! In T376726#10233103, @kostajh wro...
[14:57:04] Data-Engineering, Data-Platform-SRE (2024.10.19 - 2024.11.08): MaxMind seems to be mapping the same IP to different countries - https://phabricator.wikimedia.org/T366369#10268046 (Gehel) p:Triage→High
[15:08:43] Quarry: Set query result retention time - https://phabricator.wikimedia.org/T360041#10268152 (rook) I appreciate the commentary. Though none of it gets at the central issue of PII, and the reality that quarry is not designed to keep data in perpetuity. Data persistence is an expensive process and not being a...
[15:54:07] Data-Engineering, Data Products, Data-Platform, Movement-Insights, and 3 others: Temporary Accounts Initiative (IP Masking) - Add user_is_temp to data tables - https://phabricator.wikimedia.org/T356701#10268490 (fkaelin)
[16:15:07] Data-Engineering (Q2 2024 October 1st - December 31th): [SPIKE] Learn and document how to use Flink-CDC from MediaWiki MariaDB locally - https://phabricator.wikimedia.org/T373144#10268629 (Ottomata) > Just sent this email to the flink and paimon user email groups. Response from mailing list: > Hi, Andrew....
[16:21:03] (CR) Mforns: [C:+1] "Looks good to me in general, but ignore the details of this table and sqoop specifics." [analytics/refinery] - https://gerrit.wikimedia.org/r/1082515 (https://phabricator.wikimedia.org/T364398) (owner: Snwachukwu)
[16:22:56] (CR) Mforns: [C:+1] Add Query for cu_log table in sqoop. (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1082515 (https://phabricator.wikimedia.org/T364398) (owner: Snwachukwu)
[16:40:43] Quarry: Set query result retention time - https://phabricator.wikimedia.org/T360041#10268826 (rook) Open→Resolved a:rook
[17:25:44] Data-Engineering, Wikidata, Wikidata Analytics, Wmfdata-Python: Add testing framework to wmfdata-python - https://phabricator.wikimedia.org/T349531#10269217 (nshahquinn-wmf) p:Low→High
[17:26:36] Data-Engineering, Wikidata, Wikidata Analytics, Wmfdata-Python: Add testing framework to wmfdata-python - https://phabricator.wikimedia.org/T349531#10269221 (nshahquinn-wmf) p:High→Medium
[17:36:03] Quarry: Set query result retention time - https://phabricator.wikimedia.org/T360041#10269341 (Novem_Linguae) > Though none of it gets at the central issue of PII I believe PII such as email addresses, password hashes, and IPs is scrubbed by the replicas? Quarry isn't a system I think of as having PII in...
[17:46:34] Quarry: Set query result retention time - https://phabricator.wikimedia.org/T360041#10269413 (rook) > I believe PII such as email addresses, password hashes, and IPs is scrubbed by the replicas? Quarry isn't a system I think of as having PII in it. All the data it queries is public, I think. The issue i...
[18:25:29] Data-Engineering, Data Products: [refine] Add support for extra partitioning - https://phabricator.wikimedia.org/T377600#10269683 (mforns) I think the timing depended on when is data going to start flowing in. Probably Q3?
[18:27:28] (PS3) Snwachukwu: Add Query for cu_log table in sqoop. [analytics/refinery] - https://gerrit.wikimedia.org/r/1082515 (https://phabricator.wikimedia.org/T364398)
[18:28:38] (CR) Snwachukwu: Add Query for cu_log table in sqoop. (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/1082515 (https://phabricator.wikimedia.org/T364398) (owner: Snwachukwu)
[20:51:04] Data-Engineering (Q2 2024 October 1st - December 31th), Movement-Insights, Data-Platform-SRE (2024.10.19 - 2024.11.08): 2024-10-10 Data Loss Incident - webrequest Hive table - https://phabricator.wikimedia.org/T376882#10270380 (Ottomata) Incident report has been moved to Wikitech. https://wikitech.w...