[00:18:42] (PS2) Razzi: Add eowikivoyage and trwikivoyage to sqoop list [analytics/refinery] - https://gerrit.wikimedia.org/r/680414 (https://phabricator.wikimedia.org/T279564)
[01:20:45] (CR) Razzi: "I confirmed that the databases are on labs (clouddb1021) and production (dbstore1003). Merging this to go with the train tomorrow." [analytics/refinery] - https://gerrit.wikimedia.org/r/680414 (https://phabricator.wikimedia.org/T279564) (owner: Razzi)
[01:20:52] (CR) Razzi: [V: +2 C: +2] Add eowikivoyage and trwikivoyage to sqoop list [analytics/refinery] - https://gerrit.wikimedia.org/r/680414 (https://phabricator.wikimedia.org/T279564) (owner: Razzi)
[02:04:46] PROBLEM - Check unit status of refinery-import-siteinfo-dumps on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit refinery-import-siteinfo-dumps https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[02:23:04] Hmm, I thought that I could merge my patch with the csv name change to grouped_wikis.csv, but this error ^ is happening
[02:23:17] ```Apr 27 02:00:01 an-launcher1002 kerberos-run-command[32191]: ValueError: Invalid projects-file /mnt/hdfs/wmf/refinery/current/stat
[02:23:17] Apr 27 02:00:01 an-launcher1002 kerberos-run-command[32191]: File doesn't exist
[02:23:17] ```
[02:23:38] The line that was cut off: `Invalid projects-file /mnt/hdfs/wmf/refinery/current/static_data/mediawiki/grouped_wikis/grouped_wikis.csv`
[02:24:00] So let me revert the puppet patch for now, and we can strategize how to deploy this tomorrow
[02:59:37] razzi: right, that makes sense, puppet is trying to run with the file you renamed, but HDFS doesn't have it yet, because we haven't deployed and done the hdfs-sync. So when I do the train tomorrow, the rename will be synced to hdfs and then you can re-apply your patch
[04:42:30] Analytics, Product-Analytics: Aggregate table not working after superset upgrade - https://phabricator.wikimedia.org/T280784 (cchen) The issue can be seen in [[ https://superset.wikimedia.org/superset/explore/?form_data=%7B%22viz_type%22%3A%22table%22%2C%22datasource%22%3A%22432__druid%22%2C%22slice_id%2...
[05:50:11] good morning!
[06:07:24] RECOVERY - Check unit status of refinery-import-siteinfo-dumps on an-launcher1002 is OK: OK: Status of the systemd unit refinery-import-siteinfo-dumps https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[06:13:28] Analytics-Clusters, Analytics-Kanban, DBA, Data-Services, and 2 others: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Marostegui) Anything else pending or can this be closed?
[06:14:17] Analytics-Clusters, Analytics-Kanban, DBA, Data-Services, and 2 others: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (elukey) Open→Resolved
[06:29:35] joal: o/ bonjour!
[06:29:42] if you are ok I'd add the labels
[07:01:00] Good morning :)
[07:01:10] Hi elukey - I was about to tell you you can go with labels :)
[07:04:07] joal: ack!
[07:04:28] joal: commands are https://phabricator.wikimedia.org/T277062#7032714
[07:05:08] looks good elukey - I trust you on hosts having GPUs :)
[07:05:13] addToClusterNodeLabels: java.io.IOException: Node-label-based scheduling is disabled. Please check yarn.node-labels.enabled
[07:05:21] * elukey plays sad_trombone.wav
[07:05:23] meh?
[07:05:26] :(
[07:05:31] Luca's fault of course
[07:05:38] going to add the option :D
[07:05:51] no no, our fault for sure - it can't be us when it works and you when it doesn't :)
[07:06:57] I think I missed it in the puppet code change, checking
[07:09:02] https://gerrit.wikimedia.org/r/c/operations/puppet/+/682889 :)
[07:09:42] going to roll restart our dear RMs
[07:11:22] !log restart yarn resource managers to pick up yarn label settings
[07:11:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:16:39] (CR) Thiemo Kreuz (WMDE): [C: +1] "Is there a way to lower the threshold that makes Git/Gerrit wrongly display these files as deleted and re-added?" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/682748 (https://phabricator.wikimedia.org/T193169) (owner: Awight)
[07:16:47] all right done
[07:17:47] (CR) Awight: "> Patch Set 1: Code-Review+1" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/682748 (https://phabricator.wikimedia.org/T193169) (owner: Awight)
[07:18:20] ack elukey - Thanks for the quick patch
[07:18:24] done!
[07:19:37] https://yarn.wikimedia.org/cluster/nodes/?node.label=GPU
[07:19:38] \o/
[07:20:48] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Review the Yarn Capacity scheduler and see if we can move to it - https://phabricator.wikimedia.org/T277062 (elukey) GPU label added! \o/
[07:32:05] joal: I sent an email to Miriam and Aiko, hopefully they will run some tensorflow tests soon :)
[07:33:15] Great :)
[08:33:31] !log run mysql_upgrade for analytics-meta on an-coord1002 (should be part of the upgrade process) - T278424
[08:33:34] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:33:34] T278424: Upgrade the Hadoop coordinators to Debian Buster - https://phabricator.wikimedia.org/T278424
[08:34:45] will do it later on for 1001
[08:45:43] Taking an early break before the meetings-afternoon
[10:32:50] * elukey lunch!
[11:43:02] (PS1) Hnowlan: Add grants and schema CQL [analytics/aqs] - https://gerrit.wikimedia.org/r/682933 (https://phabricator.wikimedia.org/T278701)
[11:45:14] (PS1) Hnowlan: Use Cassandra 3 syntax in schema [analytics/aqs] - https://gerrit.wikimedia.org/r/682934 (https://phabricator.wikimedia.org/T278701)
[11:47:29] Analytics, Analytics-Kanban, Cassandra: Dual loading from Hive into old and new AQS clusters - https://phabricator.wikimedia.org/T280155 (hnowlan) Open→Resolved
[11:47:31] Analytics, Cassandra: Cassandra3 migration for Analytics AQS - https://phabricator.wikimedia.org/T249755 (hnowlan)
[11:47:38] Analytics, Cassandra: Cassandra3 migration for Analytics AQS - https://phabricator.wikimedia.org/T249755 (hnowlan)
[11:47:40] Analytics, Analytics-Kanban, Cassandra: Dual loading from Hive into old and new AQS clusters - https://phabricator.wikimedia.org/T280155 (hnowlan) Resolved→Declined
[11:47:51] Analytics, Analytics-Kanban, Cassandra: Dual loading from Hive into old and new AQS clusters - https://phabricator.wikimedia.org/T280155 (hnowlan) Joal is handling this in T280649
[12:13:35] (CR) jerkins-bot: [V: -1] Add grants and schema CQL [analytics/aqs] - https://gerrit.wikimedia.org/r/682933 (https://phabricator.wikimedia.org/T278701) (owner: Hnowlan)
[12:36:10] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Review the Yarn Capacity scheduler and see if we can move to it - https://phabricator.wikimedia.org/T277062 (Ottomata) Very cool!
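For reference, the exchange above maps to a small set of YARN knobs and commands; a rough sketch follows (the exclusivity flag and the worker hostname are illustrative assumptions, not taken from the log — the exact commands used are linked at T277062#7032714):
```
# yarn-site.xml, shipped via the puppet patch above:
#   yarn.node-labels.enabled = true
#   yarn.node-labels.fs-store.root-dir = hdfs:///user/yarn/node-labels   (assumed path)

# then, after the ResourceManager roll-restart:
yarn rmadmin -addToClusterNodeLabels "GPU(exclusive=false)"
yarn rmadmin -replaceLabelsOnNode "an-worker1096.eqiad.wmnet=GPU"
yarn cluster --list-node-labels
```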
[13:15:03] ottomata: o/
[13:15:36] I was reading https://docs.hopsworks.ai/latest/integrations/spark/ for the feature store
[13:16:22] it seems that Hops offers its own Hive Metastore + Hudi, which Spark may be able to push dataframes to if needed
[13:16:42] (hops offers an api that makes this transparent, you are not aware of hive behind the scenes)
[13:17:04] I contacted the upstream devs to see if my understanding is correct, hopefully spark + hadoop will be able to be reused
[13:17:22] Hops is the only decent opensource solution that I found for the moment
[13:17:41] also joal --^
[13:18:13] ack elukey
[13:32:50] https://community.hopsworks.ai/t/feature-store-hardware-requirements/466 is what I asked upstream
[13:34:52] the other alternative could be to mix Airflow + Spark + Cassandra/Redis in house
[13:37:33] but it may become difficult to maintain if we want fancy stuff (feature dataset versioning etc..)
[13:37:45] or Airflow + Spark + Hive + Iceberg :D
[13:38:05] * elukey tries to nerd snipe Joseph
[13:39:09] * joal will end up writing the "WMFeatureStore"
[14:00:11] Here's one: "Generic identifier. If you think about it too much, your head will explode."
[14:10:17] hello allll
[14:12:09] hellOO
[14:14:31] milimetric: that refine_eventlogging_analytics probably needs a re-run, looks like a mw api lookup hiccup
[14:14:36] i can do it if you like :)
[14:14:40] elukey: iinnnnteresting
[14:15:03] ottomata: I got it, thx
[14:15:47] elukey: i think we need someone from ML in the data integration wg
[14:15:47] https://docs.google.com/document/d/1CpxSbL1RfCfnSnl2tFrMLoLFx_Dd2I4zRzlaM1qjcCw/edit#heading=h.5rxa9msq41um
[14:15:55] i don't see the feature store use case there
[14:16:12] the first meeting is this thursday
[14:17:00] milimetric: did you notice the To rerun flags it now gives you at the bottom???? :D
[14:17:03] pretty nICE EHHHHH!??!!?
[14:17:32] oh, no, I saw everyone saying something was nice last week but I didn't know what you were talking about (I still don't, I'm very slow)
[14:17:40] oh!
[14:18:00] awesome, so I don't need to read those pages of docs anymore, cool
[14:18:04] :)
[14:21:21] ottomata: definitely, I'd like to join but others from my team will probably want to as well
[14:21:35] elukey: let me know who to invite and i will invite them
[14:22:18] ottomata: you can invite me for sure, I'll ask the team as well and report back
[14:23:27] ottomata: I accidentally git push -f to refinery-source instead of pushing to gerrit, can you check it's ok and I didn't crush someone else's commit? (just using your local copy)
[14:23:33] I pulled right before so it should be ok
[14:23:52] also: that shouldn't be allowed, is it turned on to allow jenkins to do stuff?
[14:24:18] milimetric: git pull successful
[14:24:20] minor: close quote
[14:24:39] ok, cool, thx
[14:24:42] https://gerrit.wikimedia.org/r/admin/repos/analytics/refinery/source,access
[14:24:58] i think it's allowed so we can fix things
[14:25:49] ok, I turned off force pushing just to prevent this, we can always turn it on if we need
[14:32:32] joal: yt?
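A rough sketch of the Hopsworks/Spark integration elukey describes above, loosely following the hsfs quickstart docs; the host, project, API key, and feature names are placeholders, and whether our existing Spark + Hadoop can be reused this way is exactly the question posed upstream:
```python
import hsfs
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# Stand-in features computed upstream; schema is illustrative.
df = spark.createDataFrame([(1, 0.5)], ["page_id", "score"])

# Connect to the Hopsworks feature store (parameters are placeholders).
connection = hsfs.connection(
    host="hopsworks.example.org",
    project="analytics",
    api_key_value="...",
)
fs = connection.get_feature_store()

# Register the dataframe as a Hudi-backed feature group; the Hive
# Metastore + Hudi layer stays hidden behind this API, as noted above.
fg = fs.create_feature_group(
    name="pageview_features",
    version=1,
    primary_key=["page_id"],
    time_travel_format="HUDI",
)
fg.save(df)
```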
[14:32:41] milimetric: ok
[14:56:53] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Review the Yarn Capacity scheduler and see if we can move to it - https://phabricator.wikimedia.org/T277062 (elukey) Added https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration#Yarn_Labels
[14:57:29] a-team today the managers meeting is longer than usual so I won't be in standup, but will be in time for retro at :30
[14:58:20] a-team: Olja's coming to retro today, so perhaps we can swap it with staff? Or do others want to have retro as our first meeting?
[14:58:42] either is fine with me!
[14:59:01] milimetric: i need to pick a HW config for data governance capex :/
[15:00:29] ottomata: reading requirements for Atlas to give you some kind of semi-decent answer
[15:00:37] yeah doing the same
[15:00:42] not even sure what we need
[15:00:50] maybe we just need an elasticsearch cluster???
[15:00:54] and everything else in k8s?
[15:01:53] milimetric: swapping with staff sounds good to me
[15:03:21] milimetric: maybe i'll just ask for 4 elasticsearch style nodes.
[15:03:27] this is more about budget than procurement atm
[15:03:34] so hopefully that'll get us within range
[15:04:04] those nodes are like 24 procs, 256G ram, 2x2TB SSDs
[15:04:11] so not a lot of storage
[15:04:16] ottomata: yeah, it looks like two web servers would need to talk to a JanusGraph cluster, which in our world would probably use Cassandra as a backing store (I'm assuming we would much rather do that than set up HBase just for this)
[15:04:19] but atlas should be metadata so i assume that's fine
[15:04:26] aye
[15:04:33] hm so maybe we need cassandra?
[15:05:06] IIRC we'd need both, cassandra and ES
[15:05:09] yeah, our cluster is already pretty good, maybe we just expand it a bit more?
[15:05:22] ES is the indexer behind Cassandra, Luca's right
[15:05:38] ok, i'll guess and get 2ish cass nodes and 2ish es nodes
[15:06:11] were we talking about using Solr? That's not too different from ES
[15:06:42] yeah, that sounds good for budget, seems to me worst case is we'll find out it's not enough but we'll easily be able to set up a POC and ask for more later
[15:06:45] it looks like ES works for atlas too
[15:08:42] maybe we can just run the atlas http service in ganeti or k8s
[15:25:03] ottomata: but ES is just behind JanusGraph, we have to run that with Cassandra behind it
[15:25:25] Atlas -> Janus -> Cassandra -> ES/Solr is the stack as I understand it
[15:26:56] aye but...assuming janusgraph and atlas are stateless?
[15:27:01] all state in cass & ES?
[15:29:55] and zookeeper, hm, yeah maybe
[15:30:33] do yall like that pattern that we did with AQS and Druid initially where we colocated everything? And then slowly split it out? Or was that a pain and unnecessary and it's better properly split out from the start?
[15:32:55] milimetric: it is a nightmare for sres if the co-located daemons have different workloads, because finding problems or regressions is more difficult (my 2c)
[15:33:42] also, we didn't co-locate for aqs and druid no?
[15:33:50] except their services
[16:02:39] elukey: yeah, colocating their services is what I'm talking about, and all that got untangled later on. So that's great to know though, we want to show as much love to our SREs as possible
[16:15:02] (PS1) Ottomata: Rename and symlink sanitization eventlogging/whitelist.yaml [analytics/refinery] - https://gerrit.wikimedia.org/r/682986 (https://phabricator.wikimedia.org/T273789)
[16:19:53] (PS2) Ottomata: Rename and symlink sanitization eventlogging/whitelist.yaml [analytics/refinery] - https://gerrit.wikimedia.org/r/682986 (https://phabricator.wikimedia.org/T273789)
[16:20:44] mforns: quick review of https://gerrit.wikimedia.org/r/c/analytics/refinery/+/682986 so I can get it in the train?
[17:03:17] Analytics, CheckUser, Patch-For-Review, Platform Team Workboards (Clinic Duty Team), Schema-change: Schema changes for `cu_changes` and `cu_log` table - https://phabricator.wikimedia.org/T233004 (Izno)
[17:07:02] Analytics, CheckUser, Patch-For-Review, Platform Team Workboards (Clinic Duty Team), Schema-change: Schema changes for `cu_changes` and `cu_log` table - https://phabricator.wikimedia.org/T233004 (Izno)
[17:10:26] Analytics-Radar, SRE, Patch-For-Review, Services (watching), User-herron: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (ops-monitoring-bot) Script wmf-auto-reimage was launched by herron on cumin1001.eqiad.wmne...
[17:13:51] Analytics, CheckUser, Patch-For-Review, Platform Team Workboards (Clinic Duty Team), Schema-change: Schema changes for `cu_changes` and `cu_log` table - https://phabricator.wikimedia.org/T233004 (Izno)
[17:15:36] Analytics, Better Use Of Data, Event-Platform, Product-Data-Infrastructure, Readers-Web-Backlog (Kanbanana-FY-2020-21): VirtualPageView should use EventLogging api to send virtual page view events - https://phabricator.wikimedia.org/T279382 (ovasileva)
[17:16:38] Analytics, Product-Analytics, Pageviews-Anomaly: Too many views to Skathi (moon) on enwiki - https://phabricator.wikimedia.org/T280844 (ldelench_wmf)
[17:18:09] Analytics, Product-Analytics, Pageviews-Anomaly: Too many views to Skathi (moon) on enwiki - https://phabricator.wikimedia.org/T280844 (kzimmerman) Wanted to flag #analytics - could this kind of thing be captured as automated traffic?
[17:24:39] Analytics, CheckUser, Patch-For-Review, Platform Team Workboards (Clinic Duty Team), Schema-change: Schema changes for `cu_changes` and `cu_log` table - https://phabricator.wikimedia.org/T233004 (Izno)
[17:26:01] Analytics, CheckUser, Patch-For-Review, Platform Team Workboards (Clinic Duty Team), Schema-change: Schema changes for `cu_changes` and `cu_log` table - https://phabricator.wikimedia.org/T233004 (Izno)
[17:29:55] Analytics, CheckUser, Patch-For-Review, Platform Team Workboards (Clinic Duty Team), Schema-change: Schema changes for `cu_changes` and `cu_log` table - https://phabricator.wikimedia.org/T233004 (Izno)
[17:36:04] Analytics, Multi-Content-Revisions (Tech Debt): Adapt mediawiki history for MCR - https://phabricator.wikimedia.org/T238615 (Izno)
[17:36:07] Analytics-Radar, Data-Persistence (Consultation), Platform Engineering Roadmap Decision Making, Epic, and 3 others: Remove revision_comment_temp and revision_actor_temp - https://phabricator.wikimedia.org/T215466 (Izno)
[17:36:09] Analytics-Radar, Multi-Content-Revisions (Tech Debt), Platform Team Initiatives (MCR), Schema-change: Once MCR is deployed, drop the rev_text_id, rev_content_model, and rev_content_format fields from the revision table - https://phabricator.wikimedia.org/T184615 (Izno)
[17:39:44] ottomata: Atlas uses JanusGraph, which in turn uses BOTH a keyrange-storage (cassandra/hbase) and a search engine (ES or SolR)
[17:40:03] ottomata: Just mentioning after a comment you made earlier --^
[17:40:36] ottomata: last, JanusGraph uses the 2 backends and also needs computation power for the main servers, as it queries the backends and does stuff in the main server
[17:41:26] So in the end, machines for cassandra, machines for ES (or we reuse), machines for Janus, and machines for Atlas - What an easy tool to set up :)
[17:47:27] Analytics, Multi-Content-Revisions (Tech Debt): Adapt mediawiki history for MCR - https://phabricator.wikimedia.org/T238615 (Izno)
[17:47:35] (CR) Joal: [C: +1] "I have not checked the details, but it's great to have this here." [analytics/aqs] - https://gerrit.wikimedia.org/r/682933 (https://phabricator.wikimedia.org/T278701) (owner: Hnowlan)
[17:59:32] Analytics, WMCZ-Stats: Review request: New datasets for WMCZ published under analytics.wikimedia.org - https://phabricator.wikimedia.org/T279567 (JAllemandou) Hi @Urbanecm - Sorry for the late reply, I wanted to discuss with the team, and it happened yesterday. We wish to facilitate your use case, and t...
[18:09:50] Analytics-Radar, SRE, Patch-For-Review, Services (watching), User-herron: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['kafka-main1004.eqiad.wmnet'] ` a...
[18:10:43] Analytics-Radar, SRE, Patch-For-Review, Services (watching), User-herron: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (ops-monitoring-bot) Script wmf-auto-reimage was launched by herron on cumin1001.eqiad.wmne...
[18:32:43] Analytics, Analytics-Kanban: Crunch and delete many old dumps logs - https://phabricator.wikimedia.org/T280678 (WDoranWMF) @Milimetric Just following up here. The data we would like to know is downloads per dump over time, I think that might been the original ask for the retention. Not sure what else mig...
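To make joal's stack description above concrete: JanusGraph is configured with exactly those two backends, e.g. in a janusgraph.properties along these lines. The backend names are real JanusGraph config values, the hostnames are placeholders, and Atlas wraps the same settings under an atlas.graph.* prefix in its own config:
```
storage.backend=cql
storage.hostname=cassandra-host1,cassandra-host2
index.search.backend=elasticsearch
index.search.hostname=es-host1,es-host2
```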
[18:41:08] Gone for tonight team - bye :)
[18:46:56] Analytics-Radar, SRE, Patch-For-Review, Services (watching), User-herron: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['kafka-main1005.eqiad.wmnet'] ` a...
[18:49:22] Analytics-Radar, Data-Services, Developer-Advocacy (Apr-Jun 2021), cloud-services-team (Kanban): Mitigate breaking changes from the new Wiki Replicas architecture - https://phabricator.wikimedia.org/T280152 (Jhernandez)
[19:06:39] (PS3) Ottomata: Rename and symlink sanitization eventlogging/whitelist.yaml [analytics/refinery] - https://gerrit.wikimedia.org/r/682986 (https://phabricator.wikimedia.org/T273789)
[19:07:17] milimetric: lemme know when you're doing the train
[19:07:32] hmm actually i might try to get a quick event utilities fix in
[19:13:02] Analytics-Radar, SRE, Patch-For-Review, Services (watching), User-herron: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (ops-monitoring-bot) Script wmf-auto-reimage was launched by herron on cumin1001.eqiad.wmne...
[19:18:07] (PS1) Ottomata: Bump to eventutilities 1.0.6 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/683042
[19:18:39] (PS1) Ottomata: Add more tables to sanitize in event_sanitized_main_allowlist [analytics/refinery] - https://gerrit.wikimedia.org/r/683043 (https://phabricator.wikimedia.org/T273789)
[19:21:28] * razzi lunch
[19:22:51] (PS2) Ottomata: Add more tables to sanitize in event_sanitized_main_allowlist [analytics/refinery] - https://gerrit.wikimedia.org/r/683043 (https://phabricator.wikimedia.org/T273789)
[19:28:52] mforns: yt?
[19:29:38] i noticed that the drop-el-unsanitized-events job ignores WMDEBanner* tables
[19:29:39] https://phabricator.wikimedia.org/T209503#4991424
[19:30:46] i guess it shouldn't?
[19:30:57] should those be added to the allowlist?
[19:31:47] hi ottomata
[19:32:37] yes, it does ignore them, I don't remember the reason those are not deleted, looking at the task you pasted it seems it was temporary no?
[19:33:51] ottomata: let's ask stakeholders, maybe we don't even need to allow-list them
[19:35:24] ok i guess i'll make a task
[19:35:25] joal: do you know who could be a go-to person in WMDE to talk about data retention of the WMDEBanner* schemas?
[19:35:37] mforns: i'd look at schema pages for those tables
[19:35:38] https://meta.wikimedia.org/w/index.php?title=Special%3AAllPages&from=&to=&namespace=470
[19:35:41] and find whoever made them
[19:36:05] ottomata: I was asking, because they are very old, and probably the owners changes
[19:36:09] *changed
[19:36:11] ah
[19:37:02] the talk pages just say: WMDE
[19:37:20] https://meta.wikimedia.org/w/index.php?title=Schema:WMDEBannerEvents&action=history
[19:39:38] Analytics, WMDE-Analytics-Engineering: Drop old WMDEBanner events from Hive - https://phabricator.wikimedia.org/T281300 (Ottomata)
[19:43:37] mforns: https://phabricator.wikimedia.org/T281300
[19:44:27] ottomata: thanks :]
[19:44:55] mforns: q
[19:44:59] yep
[19:45:00] refinery-drop-older-than
[19:45:15] el legacy table partition format is different than new table partition format
[19:45:20] (new tables have the datacenter partition)
[19:45:26] aha
[19:45:42] let me remember
[19:45:44] if I have --tables='.*' and --path-format with datacenter=.+ in it
[19:45:55] will it only match new tables?
[19:46:05] i guess i need new jobs, both of which say --tables='.*'
[19:46:12] two jobs*
[19:46:21] but have the different path-formats?
[19:46:41] hm
[19:47:20] you could make the datacenter=blah optional in the regex I guess, and have just 1 job
[19:47:30] like:
[19:47:34] Analytics-Radar, SRE, Patch-For-Review, Services (watching), User-herron: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['kafka-main2004.codfw.wmnet'] ` a...
[19:47:46] oh hm
[19:47:48] right
[19:48:12] Analytics-Radar, SRE, Patch-For-Review, Services (watching), User-herron: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (ops-monitoring-bot) Script wmf-auto-reimage was launched by herron on cumin1001.eqiad.wmne...
[19:48:25] '[^/]+/(datacenter=[a-z]+/)?year=(?P<year>[0-9]+)(/month=(?P<month>[0-9]+)(/day=(?P<day>[0-9]+)(/hour=(?P<hour>[0-9]+))?)?)?'
[19:48:49] ottomata: ^
[19:49:27] right cool
[19:49:29] will try that thank you
[19:49:58] ok, that regex is a bit of an anti-pattern no?
[19:50:02] too complex
[19:51:12] mforns: i think in this case we have to use the regex, since we want to be sure to drop files, and not just rely on hive to know about them
[19:51:13] right?
[19:51:24] yea
[19:57:48] ottomata: you're working on a fix you want on the train? I can delay to tomorrow, I've not finished my patch yet either
[19:59:51] milimetric: all patches merged or listed in train doc
[19:59:54] train can go
[20:00:08] mforns: can you review this one real quick?
[20:00:08] https://gerrit.wikimedia.org/r/c/analytics/refinery/+/682986
[20:00:17] and this
[20:00:18] https://gerrit.wikimedia.org/r/c/analytics/refinery/+/683043
[20:00:18] ?
[20:00:30] lookin
[20:00:41] ok, cool, I'll just finish my stuff and deploy in an hour or so, cc razzi
[20:00:58] ack
[20:05:44] (CR) Mforns: [C: +1] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/682986 (https://phabricator.wikimedia.org/T273789) (owner: Ottomata)
[20:07:00] (CR) Mforns: [C: +1] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/683043 (https://phabricator.wikimedia.org/T273789) (owner: Ottomata)
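The optional datacenter group in mforns's regex above is the whole trick: one --path-format pattern covers both the legacy EventLogging layout and the new event-platform layout. A self-contained check (the table names are just examples):
```python
import re

PATH_FORMAT = (
    r'[^/]+/(datacenter=[a-z]+/)?year=(?P<year>[0-9]+)'
    r'(/month=(?P<month>[0-9]+)(/day=(?P<day>[0-9]+)(/hour=(?P<hour>[0-9]+))?)?)?'
)

paths = [
    # legacy EventLogging layout, no datacenter partition:
    'navigationtiming/year=2021/month=04/day=27/hour=02',
    # new event platform layout, with datacenter partition:
    'mediawiki_api_request/datacenter=eqiad/year=2021/month=04/day=27/hour=02',
]
for path in paths:
    match = re.match(PATH_FORMAT, path)
    # both print: {'year': '2021', 'month': '04', 'day': '27', 'hour': '02'}
    print(match.groupdict())
```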
[20:07:07] ty
[20:07:11] np
[20:07:15] hehe
[20:08:58] (PS13) Milimetric: Add daily referrers Hive table and Oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/655804 (https://phabricator.wikimedia.org/T270140) (owner: Bmansurov)
[20:17:29] Analytics-Radar, SRE, Patch-For-Review, Services (watching), User-herron: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['kafka-main2005.codfw.wmnet'] ` a...
[20:22:10] (PS14) Milimetric: Add daily referrers Hive table and Oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/655804 (https://phabricator.wikimedia.org/T270140) (owner: Bmansurov)
[20:39:35] hello joal, just a quick question regarding T279567, would it be possible for me to generate the dataset manually and publish it somewhere, so others can play with it, while we wait for Airflow? If that's okay with your team, I'm totally fine with waiting for Airflow to automate it :)
[20:39:36] T279567: Review request: New datasets for WMCZ published under analytics.wikimedia.org - https://phabricator.wikimedia.org/T279567
[20:46:43] Urbanecm: that should be fine
[20:47:03] https://wikitech.wikimedia.org/wiki/Analytics/Web_publication
[20:47:45] ottomata: okay, perfect. Thanks for all the help :). I guess I should put it under datasets/one-off/wmcz? Or would you suggest a better place?
[20:48:26] Urbanecm: that sounds fine to me, probably good to put a README in your wmcz folder with links to owners / tickets / docs, etc.
[20:48:43] ok, will do.
[20:53:24] (CR) Ottomata: [C: +2] Rename and symlink sanitization eventlogging/whitelist.yaml [analytics/refinery] - https://gerrit.wikimedia.org/r/682986 (https://phabricator.wikimedia.org/T273789) (owner: Ottomata)
[20:53:26] (CR) Ottomata: [V: +2 C: +2] Rename and symlink sanitization eventlogging/whitelist.yaml [analytics/refinery] - https://gerrit.wikimedia.org/r/682986 (https://phabricator.wikimedia.org/T273789) (owner: Ottomata)
[20:53:33] (CR) Ottomata: [V: +2 C: +2] Add more tables to sanitize in event_sanitized_main_allowlist [analytics/refinery] - https://gerrit.wikimedia.org/r/683043 (https://phabricator.wikimedia.org/T273789) (owner: Ottomata)
[20:57:33] (PS4) Kosta Harlan: [WIP] Create structuredtask/article/edit schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/681052 (https://phabricator.wikimedia.org/T278177)
[21:01:34] razzi: if you're around and have a minute, I have no idea why https://hue.wikimedia.org/hue/jobbrowser#!id=0002030-210426062240701-oozie-oozi-W is in an error state, everything looks ok to me
[21:01:55] in any case, I think that job is ready, I'm gonna merge and deploy it and we'll figure out later whether I need a follow-up patch
[21:02:13] (it's a new job so I just won't start it until it's good)
[21:05:18] (CR) Milimetric: "The next patches address the comments and do some cleanup, plus the archive step to generate the TSVs. The output of the test is in /tmp/" (6 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/655804 (https://phabricator.wikimedia.org/T270140) (owner: Bmansurov)
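For anyone following along, the Web_publication workflow linked above boils down to dropping files into a synced directory on a stat host; the host, filenames, and sync details below are assumptions — the wikitech page is authoritative:
```
# on a stat host, e.g. stat1007
mkdir -p /srv/published/datasets/one-off/wmcz
cp wmcz-dataset.tsv README.md /srv/published/datasets/one-off/wmcz/
# a periodic rsync then serves the files at
#   https://analytics.wikimedia.org/published/datasets/one-off/wmcz/
```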
[21:05:36] (CR) Milimetric: [V: +2 C: +2] Add daily referrers Hive table and Oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/655804 (https://phabricator.wikimedia.org/T270140) (owner: Bmansurov)
[21:07:22] (CR) Milimetric: [V: +2 C: +2] Add daily referrers Hive table and Oozie job (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/655804 (https://phabricator.wikimedia.org/T270140) (owner: Bmansurov)
[21:13:34] mmk, choo choo
[21:14:35] milimetric: just saw your message, let me see
[21:15:09] no worries razzi, it's not urgent, I just got short-circuited looking at it in the new hue interface, figured maybe I was crazy 'cause I can't find the error
[21:19:08] (PS1) Milimetric: Changelog entry for 0.1.7 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/683086
[21:19:18] (CR) Milimetric: [V: +2 C: +2] Changelog entry for 0.1.7 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/683086 (owner: Milimetric)
[21:20:21] ottomata: the new refinery-source version I'm building is 0.1.7, any oozie jobs to bump?
[21:24:48] milimetric: nopers
[21:25:07] it's a fix for produce canary events
[21:25:17] i'll bump the version in puppet after tomorrow
[21:25:19] after deploy
[21:28:09] milimetric: saw the patches coming through. changes look good to me. huge thanks! did you need/want me to check anything or give an official thumbs up?
[21:28:55] Analytics, WMDE-Analytics-Engineering, Wikidata, User-GoranSMilovanovic: WDCM_Sqoop_Clients.R fails from stat1004 (again) - https://phabricator.wikimedia.org/T281316 (GoranSMilovanovic)
[21:29:11] Analytics, WMDE-Analytics-Engineering, Wikidata, User-GoranSMilovanovic: WDCM_Sqoop_Clients.R fails from stat1004 (again) - https://phabricator.wikimedia.org/T281316 (GoranSMilovanovic) p:Triage→High
[22:12:34] isaacj: the data is on hdfs at /tmp/for_isaac/referrals-for-2021-03-04.tsv, if you could take a look that'd be great
[22:25:00] hm... the build failed due to what look like network errors fetching dependencies? It's a mess because there's a ton of style errors in there
[22:25:52] razzi: this may have to wait until tomorrow, if it's a jenkins issue we'll need some help, and either way I'm not at all sure what's going on so I don't think I can push it forward after hours. Time to practice some of that zen :)
[22:26:54] sounds good milimetric
[23:04:55] thanks milimetric -- i'll take a look tomorrow morning then