[00:38:07] Analytics, Analytics-Dashiki, Analytics-Kanban: npm install gives Verification failed while extracting mediawiki-storage@https://github.com/wikimedia/analytics-mediawiki-storage/archive/master.tar.gz - https://phabricator.wikimedia.org/T278982 (Urbanecm) >>! In T278982#7016272, @Milimetric wrote: > I...
[04:20:56] PROBLEM - Check unit status of monitor_refine_event_sanitized_main_delayed on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_event_sanitized_main_delayed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[04:24:48] PROBLEM - Check unit status of monitor_refine_event_sanitized_main_immediate on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_event_sanitized_main_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[05:23:19] Analytics, Product-Analytics: Aggregate table not working after superset upgrade - https://phabricator.wikimedia.org/T280784 (cchen) @razzi Never mind, I misunderstood it. I thought we will see both "druid.pageview_hourly" and "pageview_hourly" in the dropdown menu when click change dataset.
[06:37:48] Analytics-Radar, SRE, ops-eqiad: Try to move some new analytics worker nodes to different racks - https://phabricator.wikimedia.org/T276239 (elukey) @Cmjohnson hi! Any news about the worker nodes?
[06:50:56] Analytics, Patch-For-Review: Decommission analytics-tool1001 and all the CDH leftovers - https://phabricator.wikimedia.org/T280262 (elukey) Plan is: * downtime + disable-puppet + stop hue on an-tool1009 * merge https://gerrit.wikimedia.org/r/683786 * `sudo mysqldump hue > hue_30042021.sql` on an-coord1...
[06:56:36] !log stop hue to allow database rename (hue_next -> hue)
[06:56:38] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:04:33] !log hue restarted using the database 'hue' instead of 'hue_next'
[07:04:35] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:06:15] looks good afaics
[07:08:38] Analytics, Patch-For-Review: Decommission analytics-tool1001 and all the CDH leftovers - https://phabricator.wikimedia.org/T280262 (elukey) Everything looks good! Also dropped the `hue_next` database so it is less confusing when inspecting what we run on the various db nodes (basically we now have only t...
[07:09:00] Analytics, Analytics-Kanban, Patch-For-Review: Decommission analytics-tool1001 and all the CDH leftovers - https://phabricator.wikimedia.org/T280262 (elukey)
[07:09:23] wow finally done!
[07:09:27] I can't believe it
[07:29:01] (PS1) GoranSMilovanovic: T281316 [analytics/wmde/WD/WikidataAnalytics] - https://gerrit.wikimedia.org/r/683801
[07:29:28] (CR) GoranSMilovanovic: [V: +2 C: +2] T281316 [analytics/wmde/WD/WikidataAnalytics] - https://gerrit.wikimedia.org/r/683801 (owner: GoranSMilovanovic)
[08:22:56] (PS1) GoranSMilovanovic: T261906 [analytics/wmde/WD/WikidataAnalytics] - https://gerrit.wikimedia.org/r/683812
[08:23:06] (CR) GoranSMilovanovic: [V: +2 C: +2] T261906 [analytics/wmde/WD/WikidataAnalytics] - https://gerrit.wikimedia.org/r/683812 (owner: GoranSMilovanovic)
[08:49:59] * elukey bbiab!
[10:00:09] Analytics, WMDE-Analytics-Engineering, Wikidata, User-GoranSMilovanovic: WDCM_Sqoop_Clients.R fails from stat1004 (again) - https://phabricator.wikimedia.org/T281316 (elukey) @GoranSMilovanovic sure!
During the migration of the hosts where Hive Server/Metastore runs to Debian Buster, we encounter...
[10:27:55] * elukey afk! lunch
[11:52:45] ottomata: if you have a few minutes: https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/683351
[11:52:59] This should fix both the release process and the postmerge builds
[12:15:13] just a heads-up, I gotta go to the dentist in a bit and I have no idea how well it'll go so my availability is questionable in the afternoon
[12:15:29] noted hnowlan - thanks for pinging :)
[13:19:08] hellooo team :]
[13:24:10] (CR) Ottomata: [C: +2] Ensure that maven site generation works. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/683351 (owner: Gehel)
[13:24:24] ottomata: thanks!
[13:24:30] :) thank you
[13:30:10] ottomata: I'm preparing a patch to remove --skip-trash from the event (unsanitized) purging script. I'm removing the regex to check for underscores, is that OK? After the migration you're doing, will we be able to purge *all* event.* tables after 90 days?
[13:30:42] hmm, mforns yes, i have a job that does this in the test cluster
[13:30:47] I'm also removing the reference to WMDEBanner* tables, I imagine that by the time we merge this, we'll know
[13:30:51] i'm not quite ready to enable that data purge job, want to do a lot of double checking
[13:31:00] can you do that for the test cluster for now?
[13:31:10] ah! Then you already have the checksum?
[13:31:11] and then we can adapt for the prod cluster once we are ready?
[13:31:16] yes it's running in test now
[13:31:27] (PS12) Gehel: Report on test coverage [analytics/refinery/source] - https://gerrit.wikimedia.org/r/681933 (owner: Awight)
[13:31:32] see profile::analytics::refinery::job::test::data_purge
[13:31:41] but.... how many tables do we have in the test cluster?
[13:31:49] ok
[13:31:49] kerberos::systemd_timer { 'drop_event':
[13:32:53] mforns: not many at all!
[13:33:20] you think refinery-drop-older-than might not be able to handle it?
[13:37:06] ottomata: thanks for having taken care of the errors - I wanted to ask you before taking action, but I guess you've resolved it, right?
[13:41:08] ottomata: I think we'd better execute the deletion in chunks, for example starting with --older-than=1000, then --older-than=700, etc.
[13:41:41] joal: yeah the failure was just a hiccup because of a namenode restart (i think?)
[13:41:48] the second is a follow-up for the sanitization work
[13:41:50] i'm on that now
[13:42:00] mforns: when first applying the job?
[13:42:03] ottomata: since the jobs were sanitization ones I wasn't sure -
[13:42:04] or always?
[13:42:07] thanks ottomata :)
[13:42:16] ottomata: and if you have more free time, this should start reporting code coverage: https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/681933
[13:44:08] haha free time == +2 and submit :)
[13:44:18] (CR) Ottomata: [C: +2] Report on test coverage [analytics/refinery/source] - https://gerrit.wikimedia.org/r/681933 (owner: Awight)
[13:44:35] (CR) Ottomata: [V: +2 C: +2] Report on test coverage [analytics/refinery/source] - https://gerrit.wikimedia.org/r/681933 (owner: Awight)
[13:44:52] ottomata: no no, just the first time
[13:45:15] ottomata: btw, the job in the test cluster still has the --skip-trash flag
[13:46:48] mforns: i was just making it match the prod one
[13:46:54] if we are going to remove skip trash, let's do it for test now
[13:46:57] can you make a patch?
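
A minimal sketch of the chunked deletion idea from [13:41:08]: instead of purging everything older than 90 days in one pass, invoke refinery-drop-older-than repeatedly with a decreasing --older-than threshold so each run removes a bounded slice of partitions. Only --older-than and --skip-trash come from the conversation itself; the script location, the intermediate thresholds after 700, and the remaining required arguments are placeholders, not the real invocation.

    # Hypothetical wrapper, assuming refinery-drop-older-than is on $PATH.
    # EXTRA_ARGS stands in for the real flags (database/table regex, base
    # path, path format, and the checksum produced by a dry run), all
    # elided here.
    import subprocess

    DROP_SCRIPT = "refinery-drop-older-than"
    EXTRA_ARGS = []

    # Walk the retention threshold down in steps (following the
    # "1000, then 700, etc." example from the chat) and end at the
    # 90-day target, so no single run deletes an unbounded amount.
    for days in (1000, 700, 400, 200, 90):
        cmd = [DROP_SCRIPT, f"--older-than={days}", *EXTRA_ARGS]
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)

As the rest of the conversation notes, the script first needs a dry run to produce a checksum (mforns runs the DRY-RUN and only then updates the checksum), so a wrapper like this would only apply once that checksum is in place.
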
[13:49:32] ottomata: sure
[13:49:44] ottomata: it's getting the checksum now
[13:49:54] but it's taking a long time...
[13:50:10] maybe getting it in the test cluster will be faster
[13:52:22] aye
[13:52:37] Analytics, Analytics-Kanban, Event-Platform, Patch-For-Review: Sanitize and ingest all event tables into the event_sanitized database - https://phabricator.wikimedia.org/T273789 (Ottomata) ah, I need to reset the _REFINED flag mtimes for the hours I distcp-ed over, just like I did in https://phab...
[13:55:04] ottomata: and the coverage report is live: https://sonarcloud.io/dashboard?id=org.wikimedia.analytics.refinery%3Arefinery
[13:57:48] wow, didn't know refinery-source had that, or is it brand new?
[14:01:27] \o/
[14:01:33] Thanks a lot gehel <3
[14:01:51] Gone for kids
[14:06:46] nice gehel !
[14:06:48] mforns: brand new! :)
[14:14:02] elukey: razzi, since we want to do airflow work in q1
[14:14:09] i've got the airflow nodes scheduled for q1 procurement
[14:14:21] however the new db nodes i have in q2
[14:14:22] perhaps
[14:14:25] they should be q1 too?
[14:14:50] then we could set up the airflow dbs on them early? and prep them for multi-instance migration of an-coord dbs, but do the migrations later?
[14:15:10] that should be ok, right? would it be ok to have the airflow dbs running on the new nodes with replication, backup, etc.
[14:15:19] will that work with the backup repl to db1108?
[14:17:39] also, we have 3 airflow nodes
[14:17:42] do we really need 3?
[14:17:43] maybe just 2?
[14:20:47] Analytics, Analytics-Kanban, Patch-For-Review: Prep for replacing jupyter conda migration - https://phabricator.wikimedia.org/T262847 (Ottomata) a: Ottomata
[14:21:06] Analytics: Spike. Try to ML models distributted in jupyter notebooks with dask - https://phabricator.wikimedia.org/T243089 (Ottomata) https://analytics-zoo.readthedocs.io/en/latest/ ?
[14:21:49] Analytics, Analytics-Kanban, Patch-For-Review: Prep for replacing jupyter conda migration - https://phabricator.wikimedia.org/T262847 (Ottomata)
[14:22:02] Analytics, Analytics-Kanban, Patch-For-Review: Decomission SWAP - https://phabricator.wikimedia.org/T262847 (Ottomata)
[14:22:24] ottomata: o/ I wasn't aware of the airflow nodes, are those replacing ganeti instances?
[14:22:41] elukey: there was a thread in analytics internal with infra foundations
[14:22:48] we can't get 64G ganeti nodes
[14:22:54] the biggest ones they make now are 16G
[14:25:27] ottomata: yes but what airflow instances will need more than 16g of ram?
[14:25:39] elukey: .... i dunno i was just going off of what was in our spreadsheet
[14:25:40] it had 64G
[14:25:47] is that wrong?
[14:27:14] I think that we should scope out what we want to do, because the multi-stack solution relies on the fact that people will be able to ssh/tunnel (to access the airflow webserver/scheduler) only to their airflow host/instance
[14:28:04] if we want to have a single airflow host with multiple instances we'd also have to take into account extra work (puppet to allow multiple airflow daemons running at the same time with different users, etc..)
[14:28:06] elukey: i think we don't really have time to scope it out... we are just guessing?
[14:28:11] kinda like data gov
[14:28:32] hm, i think we should build airflow multi-instance to start with, but probably if we can run them on different nodes?
[14:28:33] not sure
[14:28:48] so ok, 16G is maybe enough then?
i dunno why the spreadsheet said 64G
[14:28:55] today is the deadline for the capex sheet
[14:29:47] the discovery instance runs with 8g of ram for example, but they do most of the work via spark
[14:29:50] elukey: should I just say 3 16G ganeti nodes then?
[14:30:04] ok
[14:30:09] the 64G must have been a mistake then
[14:30:27] mmm lemme think about how many instances we'll need
[14:30:36] surely ours, but that can go on the coords
[14:31:23] ml (still not sure about it but let's count it), product-analytics, platform-eng, research?
[14:31:55] 4x16g should be fine, but again if we want to get a single big node with 64G of ram it is fine
[14:31:58] ok 4 sounds good, for ganeti we can be less precise than real hw
[14:32:08] in that case we might need to have a backup node
[14:32:10] no let's go ganeti, much prefer the VMs if we don't know what we are doing
[14:32:52] i'll let infra foundations know about that request
[14:32:54] I think it is also better isolation, if a team starts hammering airflow with extra/wrong load it only affects a single vm
[14:32:57] yeah
[14:32:57] etc..
[14:33:01] agree
[14:33:11] +1 for the db nodes in q1 though
[14:33:14] ok cool
[14:33:17] no problem in having replication to db1108
[14:33:29] it could also be a good occasion to move matomo's db to it
[14:33:33] aye
[14:33:39] we don't have any procurements scheduled for Q2 then
[14:33:45] maybe I'll move one in just to spread it out
[14:34:03] a worker node refresh maybe?
[14:34:19] or presto?
[14:34:53] sure makes sense
[14:34:57] ok worker nodes.
[14:35:20] ah ottomata about HDFS capacity - I forgot that we still have 6 nodes pending, a lot of capacity missing
[14:35:43] that makes the refresh + current presto nodes repurpose super ok
[14:35:52] it will be +11 nodes from what we have now
[14:36:02] + the refresh (so extra cpu/ram)
[14:36:06] (and 10gs!!)
[14:36:37] aye right
[14:36:38] nice
[14:37:02] elukey: yeah i'll ask them about custom vs just using config I for worker nodes
[14:37:20] if they want us to just use config I for presto workers anyway, maybe no real reason to refresh?
[14:37:28] sorry, to repurpose*
[14:38:50] ah yes yes
[14:39:29] but config D might be interesting for those nodes if we run alluxio on them
[14:39:45] 256G of RAM, 2x ~2TB SSDs
[14:39:59] ottomata: --^
[14:40:36] I noticed them yesterday while reviewing hw for ML :D
[14:40:36] elukey: i think joseph wanted more cores
[14:40:58] i'm looking at config I without the disks
[14:41:02] maybe we can get more RAM too
[14:41:09] ack
[14:42:02] 512GB RAM, 36core x 2 procs, minimal storage
[14:44:24] Analytics, WMDE-Analytics-Engineering, Wikidata, User-GoranSMilovanovic: WDCM_Sqoop_Clients.R fails from stat1004 (again) - https://phabricator.wikimedia.org/T281316 (GoranSMilovanovic) @elukey Thank you. I was thinking along the following lines: - if due to any updates, upgrades, or other chan...
[14:46:40] Analytics, WMDE-Analytics-Engineering, Wikidata, User-GoranSMilovanovic: WDCM_Sqoop_Clients.R fails from stat1004 (again) - https://phabricator.wikimedia.org/T281316 (elukey) I think we should be fine from now on, I wouldn't add more complexity to what we have :)
[14:49:06] ahahahhaha
[14:53:09] Analytics, Data-release, Privacy Engineering, Research, Privacy: Apache Beam go prototype code for DP evaluation - https://phabricator.wikimedia.org/T280385 (Isaac) Just wanted to update with some of the work that @Htriedman has done and discussions we've had off-ticket (feel free to jump in...
[14:53:17] RECOVERY - Check unit status of monitor_refine_event_sanitized_main_delayed on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_event_sanitized_main_delayed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[14:55:49] Analytics, Analytics-Kanban, Event-Platform, Patch-For-Review: Sanitize and ingest all event tables into the event_sanitized database - https://phabricator.wikimedia.org/T273789 (Ottomata) Done.
[14:56:47] (Abandoned) Gehel: Use properties to configure compiler source and target versions. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615485 (https://phabricator.wikimedia.org/T258699) (owner: Gehel)
[14:56:50] (Abandoned) Gehel: Introduce ForbiddenAPI as a static analysis tool. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615722 (https://phabricator.wikimedia.org/T258699) (owner: Gehel)
[14:56:53] (Abandoned) Gehel: Introduce Takari Maven Wrapper. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615481 (https://phabricator.wikimedia.org/T258699) (owner: Gehel)
[14:56:56] (Abandoned) Gehel: Introduce Maven sortpom plugin. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615482 (owner: Gehel)
[14:57:00] mforns: if you make a patch for the new purge job you can use Bug: T273789
[14:57:00] T273789: Sanitize and ingest all event tables into the event_sanitized database - https://phabricator.wikimedia.org/T273789
[14:57:03] that is the last step there
[14:58:17] RECOVERY - Check unit status of monitor_refine_event_sanitized_main_immediate on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_event_sanitized_main_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[14:59:05] Analytics, WMDE-Analytics-Engineering, Wikidata, User-GoranSMilovanovic: WDCM_Sqoop_Clients.R fails from stat1004 (again) - https://phabricator.wikimedia.org/T281316 (GoranSMilovanovic) Open→Resolved Ok. In any case the fix to this script is easy if anything similar happens again. I will...
[15:07:35] Analytics, SRE: wmf-auto-restart.py + lsof + /mnt/hdfs may need to be tuned - https://phabricator.wikimedia.org/T278371 (elukey) Open→Declined Let's revisit this if anything happens again, it seems a sporadic issue.
[15:09:23] Analytics, Analytics-Kanban: Data drifts between superset_production on an-coord1001 and db1108 - https://phabricator.wikimedia.org/T279440 (elukey) @Ottomata @razzi I think that we should do this sooner rather than later, do you want me to do it or do you prefer to do it during May?
[15:22:44] Analytics, Analytics-Kanban: Data drifts between superset_production on an-coord1001 and db1108 - https://phabricator.wikimedia.org/T279440 (Ottomata) @elukey no preference, but if you do it can you sync with Razzi so he learns how as well? TY!
[15:32:25] ottomata: In the end I preferred to use production data, the test cluster data is so small, it won't test all use cases...
[15:32:56] I executed the DRY-RUN in prod, and it just finished, I'm checking the logs, and will update the checksum if all OK
[15:37:18] ottomata: about configs for presto - if we wish to co-locate alluxio we should have some storage
[15:39:52] (CR) Mholloway: [C: +1] "Couple more questions from me inline, but I wouldn't be opposed to merging as-is."
(2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: Neil P. Quinn-WMF)
[16:02:39] Analytics, Data-release, Privacy Engineering, Research, Privacy: Apache Beam go prototype code for DP evaluation - https://phabricator.wikimedia.org/T280385 (TedTed) I love the demo, congrats for your work on this! > I have some lingering confusion on the role of delta in the Apache-beam imp...
[16:14:03] Analytics, Data-release, Privacy Engineering, Research, Privacy: Apache Beam go prototype code for DP evaluation - https://phabricator.wikimedia.org/T280385 (TedTed) Side-note inspired by a remark from my colleague [Mirac](https://twitter.com/miracvbasaran): given that the sensitivity is [fix...
[16:14:18] * elukey bbiab!
[16:44:04] Analytics, AbuseFilter, BetaFeatures, BlueSpice, and 44 others: Prepare User group methods for hard deprecation - https://phabricator.wikimedia.org/T275148 (Vlad.shapik)
[16:47:26] Analytics, AbuseFilter, BetaFeatures, BlueSpice, and 45 others: Prepare User group methods for hard deprecation - https://phabricator.wikimedia.org/T275148 (Vlad.shapik) I propose to find all usages of group related methods in an extension or a skin and replace it. It shouldn't be partly replaced...
[16:54:48] Analytics, Event-Platform, Privacy Engineering, Product-Analytics, Privacy: Capture rev_is_revert event data in a stream different than mediawiki.revision-create - https://phabricator.wikimedia.org/T280538 (nettrom_WMF) I'm just here to mention that "2. Making the mediawiki.revision-tags-chan...
[17:09:33] Analytics, Product-Analytics: Add timestamps of important revision events to mediawiki_history - https://phabricator.wikimedia.org/T266375 (Ottomata) @isaac FYI, there is a `event.mediawiki_page_restrictions_change` table in Hive.
[17:13:44] (CR) Neil P. Quinn-WMF: "> Patch Set 8: Code-Review+1" (2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: Neil P. Quinn-WMF)
[17:23:42] (CR) Ottomata: Create content_translation_event schema (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: Neil P. Quinn-WMF)
[17:30:26] Analytics, Analytics-EventLogging, Event-Platform, Product-Data-Infrastructure, and 3 others: Replace usages of Linker::link() and Linker::linkKnown() in extension EventLogging - https://phabricator.wikimedia.org/T279328 (Mholloway) This could probably use review by a PET Clinic Duty engineer fam...
[17:34:28] (CR) Ottomata: Create content_translation_event schema (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: Neil P. Quinn-WMF)
[17:53:22] ottomata: https://issues.apache.org/jira/browse/SPARK-25815 :O
[17:53:53] https://engineering.linkedin.com/blog/2020/open-sourcing-kube2hadoop
[17:55:36] /o\
[17:57:39] but it is exactly our use case
[17:57:43] on a smaller scale :D
[18:00:02] going afk for the weekend, o/
[18:23:09] ottomata: I checked the deletion script, and it seems it would drop the correct Hive partitions for all tables. However, there's one problem in the deletion of data directories: the tables that have a datacenter partition would not be dropped, I think the optional (/datacenter=blah)? is missing from the regex. Will add!
[18:23:27] (PS3) BrandonXLF: Use one_or_none to handle non-existent queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/682030 (https://phabricator.wikimedia.org/T280915)
[18:24:53] ottomata: oh, maybe we could do just .* instead of [^/]+ at the beginning of the path_format so that it can also match /datacenter=blah
[18:25:02] trying
[18:28:31] (CR) BrandonXLF: Use one_or_none to handle non-existent queries (4 comments) [analytics/quarry/web] - https://gerrit.wikimedia.org/r/682030 (https://phabricator.wikimedia.org/T280915) (owner: BrandonXLF)
[18:30:02] hmm, mforns i didn't add that to the test drop job? I thought i did!
[18:31:15] wait...
[18:31:57] ottomata: oh, yes, so sorryyyy.... I messed up
[18:34:23] ottomata: can we put datacenter=[a-z]+ instead of datacenter=.+ ?
[18:34:38] is it always going to be a-z?
[18:35:35] i think i had an issue with that but i can't remember why
[18:35:48] i mean it is a-z atm, not sure if it will always be
[18:35:51] probably will be
[18:36:13] i guess mforns, a regex that didn't know about datacenter might be nice
[18:36:21] something that skipped partitions until the first date partition
[18:37:02] Analytics, Data-release, Privacy Engineering, Research, Privacy: Apache Beam go prototype code for DP evaluation - https://phabricator.wikimedia.org/T280385 (Htriedman) @TedTed, thanks for explaining thresholding and why δ is necessary, even with Laplace noise. Really useful to know what's ha...
[18:38:05] ottomata: that was my first go-to, but couldn't it be dangerous, like if for some reason the path is /wmf/data/event/../../../wmf/data/wmf/webrequest/ ?
[18:39:34] hmm yeah maybe but, does that work in hdfs?
[18:39:47] hmm yeah i guess it does
[18:40:16] yes, just tried
[18:41:52] ottomata: I think the regex should be as specific as possible to what we want to delete, to avoid it matching unexpected things..
[18:43:15] maybe: [^/.]+/[^/.]+
[18:43:46] ok sounds good
[18:44:55] k
[18:47:10] ottomata: hm... or even: [^/]+/(datacenter=[^/]+/)?
[18:47:26] I think that last one is better
[18:47:32] ok!
[18:48:17] k :]
[18:51:55] Analytics, Event-Platform: Deploy schema repos to analytics cluster and use local uris for analytics jobs - https://phabricator.wikimedia.org/T280017 (Ottomata) Drats, I just tried this, but realized that the local cloned schema repos will have to be available to the hadoop workers, so either in HDFS or...
[19:01:20] Analytics, Analytics-Kanban, Event-Platform: WikimediaEventUtilities and produce_canary_events job should use api-ro.discovery.wmnet instead of meta.wikimedia.org to get stream config - https://phabricator.wikimedia.org/T274951 (Ottomata)
[19:02:54] Analytics, Event-Platform: Schema tests should validate examples - https://phabricator.wikimedia.org/T275143 (Ottomata) Open→Invalid Closing as invalid as this already happens for non legacy schemas
[19:03:19] Analytics, Event-Platform, Inuka-Team: InukaPageView Event Platform Migration - https://phabricator.wikimedia.org/T267344 (Ottomata) Hi @sbisson checking in, how goes?
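
To make the path_format discussion from [18:24:53]-[18:47:26] concrete, here is a small Python sketch (the drop script itself is Python, so the regex dialect matches) comparing the candidate prefixes quoted above against two illustrative partition layouts, one with and one without a datacenter= level. The table names and dates are invented for the example; the patterns are the ones from the conversation.

    import re

    # Illustrative partition directories relative to a table's base path.
    # One layout has the datacenter= level, the other does not.
    paths = [
        "navigationtiming/datacenter=eqiad/year=2021/month=4/day=30/hour=18",
        "mediawiki_api_request/year=2021/month=4/day=30/hour=18",
    ]

    # Candidate path_format prefixes discussed above. [^/.]+ also excludes
    # dots, which rules out '..' components like the ones in the
    # /wmf/data/event/../../../ example from [18:38:05]; the second
    # candidate matches one table directory plus an optional datacenter=
    # partition.
    candidates = {
        "two dot-free segments": r"[^/.]+/[^/.]+",
        "optional datacenter partition": r"[^/]+/(datacenter=[^/]+/)?",
    }

    for name, prefix in candidates.items():
        print(name)
        for p in paths:
            m = re.match(prefix, p)
            print("  ", p, "->", repr(m.group(0)) if m else "no match")

Running this shows why the last pattern was preferred ("I think that last one is better"): it anchors on exactly one table directory and then optionally consumes the datacenter= partition, matching both layouts without reaching past that point.
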
[19:18:20] (PS1) Ottomata: Remove requiredness of fields from mediawiki common schema fragments [schemas/event/primary] - https://gerrit.wikimedia.org/r/683980 (https://phabricator.wikimedia.org/T275674)
[19:18:53] Analytics, Event-Platform, Product-Data-Infrastructure, Patch-For-Review: MEP: Schema fragments shouldn't require fields - https://phabricator.wikimedia.org/T275674 (Ottomata) a: Ottomata
[19:19:02] (CR) jerkins-bot: [V: -1] Remove requiredness of fields from mediawiki common schema fragments [schemas/event/primary] - https://gerrit.wikimedia.org/r/683980 (https://phabricator.wikimedia.org/T275674) (owner: Ottomata)
[19:19:03] Analytics, Analytics-Kanban, Event-Platform, Product-Data-Infrastructure, Patch-For-Review: MEP: Schema fragments shouldn't require fields - https://phabricator.wikimedia.org/T275674 (Ottomata)
[19:19:20] Analytics, Analytics-Kanban, Event-Platform, Product-Data-Infrastructure, Patch-For-Review: MEP: Schema fragments shouldn't require fields - https://phabricator.wikimedia.org/T275674 (Ottomata) p: Triage→Medium
[19:24:08] Analytics, Analytics-Kanban: Replace Camus by Gobblin - https://phabricator.wikimedia.org/T271232 (Ottomata) @Milimetric lemme know when you are trying to figure out how to integrate Gobblin with EventStreamConfig. We'll likely want to do https://phabricator.wikimedia.org/T273901#6879350 to make the ing...
[19:42:08] Analytics, Event-Platform, Research: TranslationRecommendation* Schemas Event Platform Migration - https://phabricator.wikimedia.org/T271163 (Ottomata) From what I can tell, these events are fully migrated on the clients. Proceeding with the rest of the backend migration.
[19:51:59] Analytics, Event-Platform, Research, Patch-For-Review: TranslationRecommendation* Schemas Event Platform Migration - https://phabricator.wikimedia.org/T271163 (Ottomata)
[19:53:29] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Upgrade the Hadoop masters to Debian Buster - https://phabricator.wikimedia.org/T278423 (razzi) Alright, here's my plan @elukey, perhaps we can discuss this next week and if it looks good we can plan the maintenance. ### Prep Backup /srv/hadoop...
[19:56:50] Analytics: Stop Refining mediawiki_job events in Hive - https://phabricator.wikimedia.org/T281605 (Ottomata)
[20:14:42] Analytics, Event-Platform, Research: TranslationRecommendation* Schemas Event Platform Migration - https://phabricator.wikimedia.org/T271163 (Ottomata) Open→Resolved
[20:14:47] Analytics, Analytics-EventLogging, Analytics-Kanban, Better Use Of Data, and 5 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[20:47:27] Analytics-Radar, SRE, ops-eqiad: Try to move some new analytics worker nodes to different racks - https://phabricator.wikimedia.org/T276239 (wiki_willy) Hi @elukey - the rack space in A7 is pending on T280203. @Cmjohnson - you should be able to complete the move to A2 though - you just need to decom...
[21:53:11] nuria
[21:53:26] hi!