[00:38:07] Analytics, Analytics-Dashiki, Analytics-Kanban: npm install gives Verification failed while extracting mediawiki-storage@https://github.com/wikimedia/analytics-mediawiki-storage/archive/master.tar.gz - https://phabricator.wikimedia.org/T278982 (Urbanecm) >>! In T278982#7016272, @Milimetric wrote: > I...
[04:20:56] PROBLEM - Check unit status of monitor_refine_event_sanitized_main_delayed on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_event_sanitized_main_delayed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[04:24:48] PROBLEM - Check unit status of monitor_refine_event_sanitized_main_immediate on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_event_sanitized_main_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[05:23:19] Analytics, Product-Analytics: Aggregate table not working after superset upgrade - https://phabricator.wikimedia.org/T280784 (cchen) @razzi Never mind, I misunderstood it. I thought we will see both "druid.pageview_hourly" and "pageview_hourly" in the dropdown menu when click change dataset.
[06:37:48] Analytics-Radar, SRE, ops-eqiad: Try to move some new analytics worker nodes to different racks - https://phabricator.wikimedia.org/T276239 (elukey) @Cmjohnson hi! Any news about the worker nodes?
[06:50:56] Analytics, Patch-For-Review: Decommission analytics-tool1001 and all the CDH leftovers - https://phabricator.wikimedia.org/T280262 (elukey) Plan is: * downtime + disable-puppet + stop hue on an-tool1009 * merge https://gerrit.wikimedia.org/r/683786 * `sudo mysqldump hue > hue_30042021.sql` on an-coord1...
[06:56:36] !log stop hue to allow database rename (hue_next -> hue)
[06:56:38] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:04:33] !log hue restarted using the database 'hue' instead of 'hue_next'
[07:04:35] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:06:15] looks good afaics
[07:08:38] Analytics, Patch-For-Review: Decommission analytics-tool1001 and all the CDH leftovers - https://phabricator.wikimedia.org/T280262 (elukey) Everything looks good! Also dropped the `hue_next` database so it is less confusing when inspecting what we run on the various db nodes (basically we now have only t...
[07:09:00] Analytics, Analytics-Kanban, Patch-For-Review: Decommission analytics-tool1001 and all the CDH leftovers - https://phabricator.wikimedia.org/T280262 (elukey)
[07:09:23] wow finally done!
[07:09:27] I can't believe it
[07:29:01] (PS1) GoranSMilovanovic: T281316 [analytics/wmde/WD/WikidataAnalytics] - https://gerrit.wikimedia.org/r/683801
[07:29:28] (CR) GoranSMilovanovic: [V: +2 C: +2] T281316 [analytics/wmde/WD/WikidataAnalytics] - https://gerrit.wikimedia.org/r/683801 (owner: GoranSMilovanovic)
[08:22:56] (PS1) GoranSMilovanovic: T261906 [analytics/wmde/WD/WikidataAnalytics] - https://gerrit.wikimedia.org/r/683812
[08:23:06] (CR) GoranSMilovanovic: [V: +2 C: +2] T261906 [analytics/wmde/WD/WikidataAnalytics] - https://gerrit.wikimedia.org/r/683812 (owner: GoranSMilovanovic)
[08:49:59] * elukey bbiab!
[10:00:09] Analytics, WMDE-Analytics-Engineering, Wikidata, User-GoranSMilovanovic: WDCM_Sqoop_Clients.R fails from stat1004 (again) - https://phabricator.wikimedia.org/T281316 (elukey) @GoranSMilovanovic sure!
During the migration of the hosts where Hive Server/Metastore runs to Debian Buster, we encounter...
[10:27:55] * elukey afk! lunch
[11:52:45] ottomata: if you have a few minutes: https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/683351
[11:52:59] This should fix both the release process and the postmerge builds
[12:15:13] just a heads-up, I gotta go to the dentist in a bit and I have no idea how well it'll go so my availability is questionable in the afternoon
[12:15:29] noted hnowlan - thanks for pinging :)
[13:19:08] hellooo team :]
[13:24:10] (CR) Ottomata: [C: +2] Ensure that maven site generation works. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/683351 (owner: Gehel)
[13:24:24] ottomata: thanks!
[13:24:30] :) thank you
[13:30:10] ottomata: I'm preparing a patch to remove --skip-trash from the event (unsanitized) purging script. I'm removing the regex to check for underscores, is that OK? After the migration you're doing, will we be able to purge *all* event.* tables after 90 days?
[13:30:42] hmm, mforns yes, i have a job that does this in the test cluster
[13:30:47] I'm also removing the reference to WMDEBanner* tables, I imagine that by the time we merge this, we'll know
[13:30:51] i'm not quite ready to enable that data purge job, want to do a lot of double checking
[13:31:00] can you do that for the test cluster for now?
[13:31:10] ah! Then you already have the checksum?
[13:31:11] and then we can adapt for the prod cluster once we are ready?
[13:31:16] yes it's running in test now
[13:31:27] (PS12) Gehel: Report on test coverage [analytics/refinery/source] - https://gerrit.wikimedia.org/r/681933 (owner: Awight)
[13:31:32] see profile::analytics::refinery::job::test::data_purge
[13:31:41] but.... how many tables do we have in the test cluster?
[13:31:49] ok
[13:31:49] kerberos::systemd_timer { 'drop_event':
[13:32:53] mforns: not many at all!
[13:33:20] you think refinery-drop-older-than might not be able to handle it?
[13:37:06] ottomata: thanks for having taken care of the errors - I wanted to ask you before taking action, but I guess you've resolved it, right?
[13:41:08] ottomata: I think we'd better execute the deletion in chunks, for example starting with --older-than=1000, then --older-than=700, etc.
[13:41:41] joal: yeah the failure was just a hiccup because of a namenode restart (i think?)
[13:41:48] the second is a follow-up for the sanitization work
[13:41:50] i'm on that now
[13:42:00] mforns: when first applying the job?
[13:42:03] ottomata: since the jobs were sanitization ones I wasn't sure -
[13:42:04] or always?
[13:42:07] thanks ottomata :)
[13:42:16] ottomata: and if you have more free time, this should start reporting code coverage: https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/681933
[13:44:08] haha free time == +2 and submit :)
[13:44:18] (CR) Ottomata: [C: +2] Report on test coverage [analytics/refinery/source] - https://gerrit.wikimedia.org/r/681933 (owner: Awight)
[13:44:35] (CR) Ottomata: [V: +2 C: +2] Report on test coverage [analytics/refinery/source] - https://gerrit.wikimedia.org/r/681933 (owner: Awight)
[13:44:52] ottomata: no no, just the first time
[13:45:15] ottomata: btw, the job in the test cluster still has the --skip-trash flag
[13:46:48] mforns: i was just making it match the prod one
[13:46:54] if we are going to remove skip trash, let's do it for test now
[13:46:57] can you make a patch?
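
A minimal sketch of the chunked deletion idea from [13:41:08]: instead of purging everything older than 90 days in one pass, invoke refinery-drop-older-than repeatedly with a decreasing --older-than threshold so each run removes a bounded slice of partitions. Only --older-than and --skip-trash come from the conversation itself; the script location, the intermediate thresholds after 700, and the remaining required arguments are placeholders, not the real invocation.

    # Hypothetical wrapper, assuming refinery-drop-older-than is on $PATH.
    # EXTRA_ARGS stands in for the real flags (database/table regex, base
    # path, path format, and the checksum produced by a dry run), all
    # elided here.
    import subprocess

    DROP_SCRIPT = "refinery-drop-older-than"
    EXTRA_ARGS = []

    # Walk the retention threshold down in steps (following the
    # "1000, then 700, etc." example from the chat) and end at the
    # 90-day target, so no single run deletes an unbounded amount.
    for days in (1000, 700, 400, 200, 90):
        cmd = [DROP_SCRIPT, f"--older-than={days}", *EXTRA_ARGS]
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)

As the rest of the conversation notes, the script first needs a dry run to produce a checksum (mforns runs the DRY-RUN and only then updates the checksum), so a wrapper like this would only apply once that checksum is in place.
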
[13:49:32] ottomata: sure
[13:49:44] ottomata: it's getting the checksum now
[13:49:54] but it's taking a long time...
[13:50:10] maybe getting it in the test cluster will be faster
[13:52:22] aye
[13:52:37] Analytics, Analytics-Kanban, Event-Platform, Patch-For-Review: Sanitize and ingest all event tables into the event_sanitized database - https://phabricator.wikimedia.org/T273789 (Ottomata) ah, I need to reset the _REFINED flag mtimes for the hours I distcp-ed over, just like I did in https://phab...
[13:55:04] ottomata: and the coverage report is live: https://sonarcloud.io/dashboard?id=org.wikimedia.analytics.refinery%3Arefinery
[13:57:48] wow, didn't know refinery-source had that, or is it brand new?
[14:01:27] \o/
[14:01:33] Thanks a lot gehel <3
[14:01:51] Gone for kids
[14:06:46] nice gehel !
[14:06:48] mforns: brand new! :)
[14:14:02] elukey: razzi, since we want to do airflow work in q1
[14:14:09] i've got the airflow nodes scheduled for q1 procurement
[14:14:21] however the new db nodes i have in q2
[14:14:22] perhaps
[14:14:25] they should be q1 too?
[14:14:50] then we could set up the airflow dbs on them early? and prep them for multi-instance migration of an-coord dbs, but do the migrations later?
[14:15:10] that should be ok, right? would it be ok to have the airflow dbs running on the new nodes with replication, backup, etc.
[14:15:19] will that work with the backup repl to db1108?
[14:17:39] also, we have 3 airflow nodes
[14:17:42] do we really need 3?
[14:17:43] maybe just 2?
[14:20:47] Analytics, Analytics-Kanban, Patch-For-Review: Prep for replacing jupyter conda migration - https://phabricator.wikimedia.org/T262847 (Ottomata) a: Ottomata
[14:21:06] Analytics: Spike. Try to ML models distributted in jupyter notebooks with dask - https://phabricator.wikimedia.org/T243089 (Ottomata) https://analytics-zoo.readthedocs.io/en/latest/ ?
[14:21:49] Analytics, Analytics-Kanban, Patch-For-Review: Prep for replacing jupyter conda migration - https://phabricator.wikimedia.org/T262847 (Ottomata)
[14:22:02] Analytics, Analytics-Kanban, Patch-For-Review: Decomission SWAP - https://phabricator.wikimedia.org/T262847 (Ottomata)
[14:22:24] ottomata: o/ I wasn't aware of the airflow nodes, are those replacing ganeti instances?
[14:22:41] elukey: there was a thread in analytics internal with infra foundations
[14:22:48] we can't get 64G ganeti nodes
[14:22:54] the biggest ones they make now are 16G
[14:25:27] ottomata: yes but what airflow instances will need more than 16g of ram?
[14:25:39] elukey: .... i dunno i was just going off of what was in our spreadsheet
[14:25:40] it had 64G
[14:25:47] is that wrong?
[14:27:14] I think that we should scope out what we want to do, because the multi-stack solution relies on the fact that people will be able to ssh/tunnel (to access the airflow webserver/scheduler) only to their airflow host/instance
[14:28:04] if we want to have a single airflow host with multiple instances we'd also have to take into account extra work (puppet to allow multiple airflow daemons running at the same time with different users, etc..)
[14:28:06] elukey: i think we don't really have time to scope it out... we are just guessing?
[14:28:11] kinda like data gov
[14:28:32] hm, i think we should build airflow multi-instance to start with, but probably if we can run them on different nodes?
[14:28:33] not sure
[14:28:48] so ok, 16G is maybe enough then?
i dunno why the spreadsheet said 64G
[14:28:55] today is the deadline for the capex sheet
[14:29:47] the discovery instance runs with 8g of ram for example, but they do most of the work via spark
[14:29:50] elukey: should I just say 3 16G ganeti nodes then?
[14:30:04] ok
[14:30:09] the 64G must have been a mistake then
[14:30:27] mmm lemme think about how many instances we'll need
[14:30:36] surely ours, but that can go on the coords
[14:31:23] ml (still not sure about it but let's count it), product-analytics, platform-eng, research?
[14:31:55] 4x16g should be fine, but again if we want to get a single big node with 64G of ram it is fine
[14:31:58] ok 4 sounds good, for ganeti we can be less precise than real hw
[14:32:08] in that case we might need to have a backup node
[14:32:10] no let's go ganeti, much prefer the VMs if we don't know what we are doing
[14:32:52] i'll let infra foundations know about that request
[14:32:54] I think it is also better isolation, if a team starts hammering airflow with extra/wrong load it only affects a single vm
[14:32:57] yeah
[14:32:57] etc..
[14:33:01] agree
[14:33:11] +1 for the db nodes in q1 though
[14:33:14] ok cool
[14:33:17] no problem in having replication to db1108
[14:33:29] it could also be a good occasion to move matomo's db to it
[14:33:33] aye
[14:33:39] we don't have any procurements scheduled for Q2 then
[14:33:45] maybe I'll move one in just to spread it out
[14:34:03] a worker node refresh maybe?
[14:34:19] or presto?
[14:34:53] sure makes sense
[14:34:57] ok worker nodes.
[14:35:20] ah ottomata about HDFS capacity - I forgot that we still have 6 nodes pending, a lot of capacity missing
[14:35:43] that makes the refresh + current presto nodes repurpose super ok
[14:35:52] it will be +11 nodes from what we have now
[14:36:02] + the refresh (so extra cpu/ram)
[14:36:06] (and 10gs!!)
[14:36:37] aye right
[14:36:38] nice
[14:37:02] elukey: yeah i'll ask them about custom vs just using config I for worker nodes
[14:37:20] if they want us to just use config I for presto workers anyway, maybe no real reason to refresh?
[14:37:28] sorry, to repurpose*
[14:38:50] ah yes yes
[14:39:29] but config D might be interesting for those nodes if we run alluxio on them
[14:39:45] 256G of RAM, 2x ~2TB SSDs
[14:39:59] ottomata: --^
[14:40:36] I noticed them yesterday while reviewing hw for ML :D
[14:40:36] elukey: i think joseph wanted more cores
[14:40:58] i'm looking at config I without the disks
[14:41:02] maybe we can get more RAM too
[14:41:09] ack
[14:42:02] 512GB RAM, 36core x 2 procs, minimal storage
[14:44:24] Analytics, WMDE-Analytics-Engineering, Wikidata, User-GoranSMilovanovic: WDCM_Sqoop_Clients.R fails from stat1004 (again) - https://phabricator.wikimedia.org/T281316 (GoranSMilovanovic) @elukey Thank you. I was thinking along the following lines: - if due to any updates, upgrades, or other chan...
[14:46:40] Analytics, WMDE-Analytics-Engineering, Wikidata, User-GoranSMilovanovic: WDCM_Sqoop_Clients.R fails from stat1004 (again) - https://phabricator.wikimedia.org/T281316 (elukey) I think we should be fine from now on, I wouldn't add more complexity to what we have :)
[14:49:06] ahahahhaha
[14:53:09] Analytics, Data-release, Privacy Engineering, Research, Privacy: Apache Beam go prototype code for DP evaluation - https://phabricator.wikimedia.org/T280385 (Isaac) Just wanted to update with some of the work that @Htriedman has done and discussions we've had off-ticket (feel free to jump in...
[14:53:17] RECOVERY - Check unit status of monitor_refine_event_sanitized_main_delayed on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_event_sanitized_main_delayed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[14:55:49] Analytics, Analytics-Kanban, Event-Platform, Patch-For-Review: Sanitize and ingest all event tables into the event_sanitized database - https://phabricator.wikimedia.org/T273789 (Ottomata) Done.
[14:56:47] (Abandoned) Gehel: Use properties to configure compiler source and target versions. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615485 (https://phabricator.wikimedia.org/T258699) (owner: Gehel)
[14:56:50] (Abandoned) Gehel: Introduce ForbiddenAPI as a static analysis tool. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615722 (https://phabricator.wikimedia.org/T258699) (owner: Gehel)
[14:56:53] (Abandoned) Gehel: Introduce Takari Maven Wrapper. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615481 (https://phabricator.wikimedia.org/T258699) (owner: Gehel)
[14:56:56] (Abandoned) Gehel: Introduce Maven sortpom plugin. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615482 (owner: Gehel)
[14:57:00] mforns: if you make a patch for the new purge job you can use Bug: T273789
[14:57:00] T273789: Sanitize and ingest all event tables into the event_sanitized database - https://phabricator.wikimedia.org/T273789
[14:57:03] that is the last step there
[14:58:17] RECOVERY - Check unit status of monitor_refine_event_sanitized_main_immediate on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_event_sanitized_main_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[14:59:05] Analytics, WMDE-Analytics-Engineering, Wikidata, User-GoranSMilovanovic: WDCM_Sqoop_Clients.R fails from stat1004 (again) - https://phabricator.wikimedia.org/T281316 (GoranSMilovanovic) Open→Resolved Ok. In any case the fix to this script is easy if anything similar happens again. I will...
[15:07:35] Analytics, SRE: wmf-auto-restart.py + lsof + /mnt/hdfs may need to be tuned - https://phabricator.wikimedia.org/T278371 (elukey) Open→Declined Let's revisit this if anything happens again, it seems a sporadic issue.
[15:09:23] Analytics, Analytics-Kanban: Data drifts between superset_production on an-coord1001 and db1108 - https://phabricator.wikimedia.org/T279440 (elukey) @Ottomata @razzi I think that we should do this sooner rather than later, do you want me to do it or do you prefer to do it during May?
[15:22:44] Analytics, Analytics-Kanban: Data drifts between superset_production on an-coord1001 and db1108 - https://phabricator.wikimedia.org/T279440 (Ottomata) @elukey no preference, but if you do it can you sync with Razzi so he learns how as well? TY!
[15:32:25] ottomata: In the end I preferred to use production data, the test cluster data is so small, it won't test all use cases...
[15:32:56] I executed the DRY-RUN in prod, and it just finished, I'm checking the logs, and will update the checksum if all OK
[15:37:18] ottomata: about configs for presto - if we wish to co-locate alluxio we should have some storage
[15:39:52] (CR) Mholloway: [C: +1] "Couple more questions from me inline, but I wouldn't be opposed to merging as-is."
(2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: Neil P. Quinn-WMF)
[16:02:39] Analytics, Data-release, Privacy Engineering, Research, Privacy: Apache Beam go prototype code for DP evaluation - https://phabricator.wikimedia.org/T280385 (TedTed) I love the demo, congrats for your work on this! > I have some lingering confusion on the role of delta in the Apache-beam imp...
[16:14:03] Analytics, Data-release, Privacy Engineering, Research, Privacy: Apache Beam go prototype code for DP evaluation - https://phabricator.wikimedia.org/T280385 (TedTed) Side-note inspired by a remark from my colleague [Mirac](https://twitter.com/miracvbasaran): given that the sensitivity is [fix...
[16:14:18] * elukey bbiab!
[16:44:04] Analytics, AbuseFilter, BetaFeatures, BlueSpice, and 44 others: Prepare User group methods for hard deprecation - https://phabricator.wikimedia.org/T275148 (Vlad.shapik)
[16:47:26] Analytics, AbuseFilter, BetaFeatures, BlueSpice, and 45 others: Prepare User group methods for hard deprecation - https://phabricator.wikimedia.org/T275148 (Vlad.shapik) I propose to find all usages of group related methods in an extension or a skin and replace it. It shouldn't be partly replaced...
[16:54:48] Analytics, Event-Platform, Privacy Engineering, Product-Analytics, Privacy: Capture rev_is_revert event data in a stream different than mediawiki.revision-create - https://phabricator.wikimedia.org/T280538 (nettrom_WMF) I'm just here to mention that "2. Making the mediawiki.revision-tags-chan...
[17:09:33] Analytics, Product-Analytics: Add timestamps of important revision events to mediawiki_history - https://phabricator.wikimedia.org/T266375 (Ottomata) @isaac FYI, there is a `event.mediawiki_page_restrictions_change` table in Hive.
[17:13:44] (CR) Neil P. Quinn-WMF: "> Patch Set 8: Code-Review+1" (2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: Neil P. Quinn-WMF)
[17:23:42] (CR) Ottomata: Create content_translation_event schema (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: Neil P. Quinn-WMF)
[17:30:26] Analytics, Analytics-EventLogging, Event-Platform, Product-Data-Infrastructure, and 3 others: Replace usages of Linker::link() and Linker::linkKnown() in extension EventLogging - https://phabricator.wikimedia.org/T279328 (Mholloway) This could probably use review by a PET Clinic Duty engineer fam...
[17:34:28] (CR) Ottomata: Create content_translation_event schema (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: Neil P. Quinn-WMF)
[17:53:22] ottomata: https://issues.apache.org/jira/browse/SPARK-25815 :O
[17:53:53] https://engineering.linkedin.com/blog/2020/open-sourcing-kube2hadoop
[17:55:36] /o\
[17:57:39] but it is exactly our use case
[17:57:43] on a smaller scale :D
[18:00:02] going afk for the weekend, o/
[18:23:09] ottomata: I checked the deletion script, and it seems it would drop the correct Hive partitions for all tables. However, there's one problem in the deletion of data directories: the tables that have a datacenter partition would not be dropped, I think the optional (/datacenter=blah)? is missing from the regex. Will add!
[18:23:27] (PS3) BrandonXLF: Use one_or_none to handle non-existent queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/682030 (https://phabricator.wikimedia.org/T280915)
[18:24:53] ottomata: oh, maybe we could do just .* instead of [^/]+ at the beginning of the path_format so that it can also match /datacenter=blah
[18:25:02] trying
[18:28:31] (CR) BrandonXLF: Use one_or_none to handle non-existent queries (4 comments) [analytics/quarry/web] - https://gerrit.wikimedia.org/r/682030 (https://phabricator.wikimedia.org/T280915) (owner: BrandonXLF)
[18:30:02] hmm, mforns i didn't add that to the test drop job? I thought i did!
[18:31:15] wait...
[18:31:57] ottomata: oh, yes, so sorryyyy.... I messed up
[18:34:23] ottomata: can we put datacenter=[a-z]+ instead of datacenter=.+ ?
[18:34:38] is it always going to be a-z?
[18:35:35] i think i had an issue with that but i can't remember why
[18:35:48] i mean it is a-z atm, not sure if it will always be
[18:35:51] probably will be
[18:36:13] i guess mforns, a regex that didn't know about datacenter might be nice
[18:36:21] something that skipped partitions until the first date partition
[18:37:02] Analytics, Data-release, Privacy Engineering, Research, Privacy: Apache Beam go prototype code for DP evaluation - https://phabricator.wikimedia.org/T280385 (Htriedman) @TedTed, thanks for explaining thresholding and why δ is necessary, even with Laplace noise. Really useful to know what's ha...
[18:38:05] ottomata: that was my first go-to, but couldn't it be dangerous, like if for some reason the path is /wmf/data/event/../../../wmf/data/wmf/webrequest/ ?
[18:39:34] hmm yeah maybe but, does that work in hdfs?
[18:39:47] hmm yeah i guess it does
[18:40:16] yes, just tried
[18:41:52] ottomata: I think the regex should be as specific as possible to what we want to delete, to avoid it matching unexpected things..
[18:43:15] maybe: [^/.]+/[^/.]+
[18:43:46] ok sounds good
[18:44:55] k
[18:47:10] ottomata: hm... or even: [^/]+/(datacenter=[^/]+/)?
[18:47:26] I think that last one is better
[18:47:32] ok!
[18:48:17] k :]
[18:51:55] Analytics, Event-Platform: Deploy schema repos to analytics cluster and use local uris for analytics jobs - https://phabricator.wikimedia.org/T280017 (Ottomata) Drats, I just tried this, but realized that the local cloned schema repos will have to be available to the hadoop workers, so either in HDFS or...
[19:01:20] Analytics, Analytics-Kanban, Event-Platform: WikimediaEventUtilities and produce_canary_events job should use api-ro.discovery.wmnet instead of meta.wikimedia.org to get stream config - https://phabricator.wikimedia.org/T274951 (Ottomata)
[19:02:54] Analytics, Event-Platform: Schema tests should validate examples - https://phabricator.wikimedia.org/T275143 (Ottomata) Open→Invalid Closing as invalid as this already happens for non legacy schemas
[19:03:19] Analytics, Event-Platform, Inuka-Team: InukaPageView Event Platform Migration - https://phabricator.wikimedia.org/T267344 (Ottomata) Hi @sbisson checking in, how goes?
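
To make the path_format discussion from [18:24:53]-[18:47:26] concrete, here is a small Python sketch (the drop script itself is Python, so the regex dialect matches) comparing the candidate prefixes quoted above against two illustrative partition layouts, one with and one without a datacenter= level. The table names and dates are invented for the example; the patterns are the ones from the conversation.

    import re

    # Illustrative partition directories relative to a table's base path.
    # One layout has the datacenter= level, the other does not.
    paths = [
        "navigationtiming/datacenter=eqiad/year=2021/month=4/day=30/hour=18",
        "mediawiki_api_request/year=2021/month=4/day=30/hour=18",
    ]

    # Candidate path_format prefixes discussed above. [^/.]+ also excludes
    # dots, which rules out '..' components like the ones in the
    # /wmf/data/event/../../../ example from [18:38:05]; the second
    # candidate matches one table directory plus an optional datacenter=
    # partition.
    candidates = {
        "two dot-free segments": r"[^/.]+/[^/.]+",
        "optional datacenter partition": r"[^/]+/(datacenter=[^/]+/)?",
    }

    for name, prefix in candidates.items():
        print(name)
        for p in paths:
            m = re.match(prefix, p)
            print("  ", p, "->", repr(m.group(0)) if m else "no match")

Running this shows why the last pattern was preferred ("I think that last one is better"): it anchors on exactly one table directory and then optionally consumes the datacenter= partition, matching both layouts without reaching past that point.
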
[19:18:20] (PS1) Ottomata: Remove requiredness of fields from mediawiki common schema fragments [schemas/event/primary] - https://gerrit.wikimedia.org/r/683980 (https://phabricator.wikimedia.org/T275674)
[19:18:53] Analytics, Event-Platform, Product-Data-Infrastructure, Patch-For-Review: MEP: Schema fragments shouldn't require fields - https://phabricator.wikimedia.org/T275674 (Ottomata) a: Ottomata
[19:19:02] (CR) jerkins-bot: [V: -1] Remove requiredness of fields from mediawiki common schema fragments [schemas/event/primary] - https://gerrit.wikimedia.org/r/683980 (https://phabricator.wikimedia.org/T275674) (owner: Ottomata)
[19:19:03] Analytics, Analytics-Kanban, Event-Platform, Product-Data-Infrastructure, Patch-For-Review: MEP: Schema fragments shouldn't require fields - https://phabricator.wikimedia.org/T275674 (Ottomata)
[19:19:20] Analytics, Analytics-Kanban, Event-Platform, Product-Data-Infrastructure, Patch-For-Review: MEP: Schema fragments shouldn't require fields - https://phabricator.wikimedia.org/T275674 (Ottomata) p: Triage→Medium
[19:24:08] Analytics, Analytics-Kanban: Replace Camus by Gobblin - https://phabricator.wikimedia.org/T271232 (Ottomata) @Milimetric lemme know when you are trying to figure out how to integrate Gobblin with EventStreamConfig. We'll likely want to do https://phabricator.wikimedia.org/T273901#6879350 to make the ing...
[19:42:08] Analytics, Event-Platform, Research: TranslationRecommendation* Schemas Event Platform Migration - https://phabricator.wikimedia.org/T271163 (Ottomata) From what I can tell, these events are fully migrated on the clients. Proceeding with the rest of the backend migration.
[19:51:59] Analytics, Event-Platform, Research, Patch-For-Review: TranslationRecommendation* Schemas Event Platform Migration - https://phabricator.wikimedia.org/T271163 (Ottomata)
[19:53:29] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Upgrade the Hadoop masters to Debian Buster - https://phabricator.wikimedia.org/T278423 (razzi) Alright, here's my plan @elukey, perhaps we can discuss this next week and if it looks good we can plan the maintenance. ### Prep Backup /srv/hadoop...
[19:56:50] Analytics: Stop Refining mediawiki_job events in Hive - https://phabricator.wikimedia.org/T281605 (Ottomata)
[20:14:42] Analytics, Event-Platform, Research: TranslationRecommendation* Schemas Event Platform Migration - https://phabricator.wikimedia.org/T271163 (Ottomata) Open→Resolved
[20:14:47] Analytics, Analytics-EventLogging, Analytics-Kanban, Better Use Of Data, and 5 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[20:47:27] Analytics-Radar, SRE, ops-eqiad: Try to move some new analytics worker nodes to different racks - https://phabricator.wikimedia.org/T276239 (wiki_willy) Hi @elukey - the rack space in A7 is pending on T280203. @Cmjohnson - you should be able to complete the move to A2 though - you just need to decom...
[21:53:11] nuria
[21:53:26] hi!