[07:32:01] goood morning [08:05:11] Hello [08:10:05] joal: your archiva change got merged to upstream, I created a different patch and trying to upload the new pkg in our repo [08:10:38] \o/ [08:10:52] Thanks a lot elukey for the push upstream [08:14:22] joal: sbassett also helped a lot! [08:14:33] now we should be ok, done done done [08:14:38] awesome [08:16:00] I had a chat with Rob last week and he was able to bootstrap some of the new hadoop worker nodes, he left some manual steps for me to do but I should be able to unblock the OS install in theory [08:16:05] will dedicate my day to it :D [08:16:20] elukey: let me know if I can help! [08:16:54] joal: I am also more convinced about your idea of having datanode + namenode on the same node, to have more space (extra +96T) [08:17:22] we really need every bit that we can get, I'll prep dedicated puppet roles for this new config [08:17:34] ack elukey - namenode will need some RAM, but it'll not be put under pressure normally [08:17:54] those nodes are 128G so plenty of space, especially since we'll not run any job [08:18:03] indeed - great [08:37:20] joal: bigtop upstream seems good in keeping alluxio, I didn't see oppositions [08:37:34] and they will support Bullseye too [08:37:48] This is great news elukey - Again, thanks a lot for caring for our relationships to upstream :) [08:38:04] so the upgrade to hadoop 3 might come sooner, when Moritz asks us to upgrade to Bullseye :D [08:38:13] :) [09:03:00] elukey: not being root, there are files I can't access on thorium :( [09:04:05] elukey: could you please run for me: 'sudo du --all /srv/backup > /tmp/thorium_backup_files.txt' ? [09:06:09] sure! [09:08:30] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1119.eqiad.wmnet'] ` The log can be fou... 
[09:09:10] elukey: from the ticket, the one thing I'm unsure about are links (I always have trouble) - Will check for file existence and size equality [09:09:50] joal: it is currently a weird setup, since we have a ton of hardlinks [09:10:06] I have read that elukey [09:10:12] * joal is afraid of links [09:10:17] the same files show up under /srv/backup and in their original place, so a little confusing [09:10:39] well those are hardlinks, so should be ok [09:10:57] ack [09:11:08] I had to rename some filenames with chars like []-[] etc.. since hdfs didn't like it, but mostly ErikZ's script stuff [09:11:17] thanks for checking :) [09:21:02] joal: the file is ready [09:22:01] elukey: ack thanks [09:22:13] Arf elukey - Thorium doesn't have access to hdfs does it? [09:23:05] nope, that was the issue :D [09:23:11] Ah [09:23:13] hm [09:23:24] Any hint on how to move that file to hdfs? [09:23:59] actually, possibly just a move to stat1008 would be enough [09:24:03] hm [09:24:05] joal: you can scp it no? [09:24:30] I can do that - ok will for that elukey - thanks [09:24:58] <3 [09:35:27] * elukey brb [09:55:45] elukey: There probably is something wrong in my way of checking, but I have big differences in number of lines between thorium and hdfs :S [09:58:13] joal: ah okok, can you give me the list somewhere so I can check? [09:58:47] elukey: on an-launcher1002 I generated /tmp/hdfs_thorium_backup_files.txt [09:58:52] 10Analytics, 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to Production Shell Access (analytics-privatedata-users) for Rmaung - https://phabricator.wikimedia.org/T266250 (10ayounsi) 05Open→03Resolved Access created, you should have received an email as well about your kerberos ac... 
[09:59:23] elukey: wc -l over the 2 files tells me there is 250k in one (the thorium one), 10k in hdfs - something is wrong :S [09:59:42] joal: I'll check in a bit thanks, I have probably done something wrong [09:59:52] elukey: It could be my generation script as well, please let me know [10:09:29] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1119.eqiad.wmnet'] ` Of which those **FAILED**: ` ['an-worker1119.eqiad.wmnet'] ` [10:49:27] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10elukey) I fixed the boot order on an-worker1119 but PXE doesn't really work, I noticed that all NICs show no link up status, maybe there is something not s... [11:05:21] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1120.eqiad.wmnet'] ` The log can be fou... [11:08:01] 10Analytics, 10Analytics-Kanban: Add client TCP source port to webrequest - https://phabricator.wikimedia.org/T271953 (10elukey) @ema quick question - is the client src port something that we could pass from ATS-TLS to Varnish frontend? Via HTTP header etc.. [11:08:49] 10Analytics, 10Analytics-Kanban: Add client TCP source port to webrequest - https://phabricator.wikimedia.org/T271953 (10elukey) >>! In T271953#6749428, @Ladsgroup wrote: >>>! In T271953#6748341, @elukey wrote: >> @Ladsgroup we should follow up with Traffic to make sure that ATS-TLS adds the client's source po... 
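(Editor's sketch on the thorium vs. hdfs list comparison above: raw `wc -l` counts can differ for reasons unrelated to missing data, for example `du --all` also prints one line per directory, so comparing the two listings path by path may be more telling. The Python below is only a minimal sketch under assumptions: the thorium file is taken to be `du --all` output with `<size><TAB><path>` per line, the hdfs file is assumed to hold one absolute path per line, which the log does not confirm, and the hdfs root used in the usage lines is a hypothetical placeholder.)

```python
import posixpath


def paths_from_du(lines, root):
    """Relative paths from `du --all` output ('<size>\\t<path>' per line)."""
    paths = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        _size, _tab, path = line.partition("\t")
        rel = posixpath.relpath(path, root)
        if rel != ".":  # skip the root directory entry itself
            paths.add(rel)
    return paths


def paths_from_listing(lines, root):
    """Relative paths from a plain listing (ASSUMED: one absolute path per line)."""
    paths = set()
    for line in lines:
        path = line.strip()
        if not path:
            continue
        rel = posixpath.relpath(path, root)
        if rel != ".":
            paths.add(rel)
    return paths


def compare(local, remote):
    """Return (only on the local side, only on the remote side), both sorted."""
    return sorted(local - remote), sorted(remote - local)


# Usage with tiny in-memory samples; the hdfs root is a hypothetical placeholder.
du_lines = ["4\t/srv/backup/a/x.txt\n", "8\t/srv/backup/a\n", "12\t/srv/backup\n"]
hdfs_lines = ["/hypothetical/hdfs/backup/a/x.txt\n", "/hypothetical/hdfs/backup/b.txt\n"]
only_local, only_remote = compare(
    paths_from_du(du_lines, "/srv/backup"),
    paths_from_listing(hdfs_lines, "/hypothetical/hdfs/backup"),
)
print(only_local, only_remote)
```

(In this sample the only local-side extra is the directory entry `a`, which illustrates how directory lines alone can inflate the `du --all` count. Sizes are deliberately not compared, since `du` reports disk usage in blocks rather than file length in bytes.)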
[11:09:23] the new workers are soooo slow to reboot [11:09:29] :( [11:09:37] an-worker1119 is probably not connected to the switch, skipping to 1120 [11:16:21] 1120 seems good, d-i works [11:23:38] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1121.eqiad.wmnet'] ` The log can be fou... [11:23:43] 1121 in progress [11:28:52] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1120.eqiad.wmnet'] ` and were **ALL** successful. [11:38:03] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1122.eqiad.wmnet'] ` The log can be fou... [11:38:09] 1122 in progress [11:38:28] joal: I'll check the diff later, busy morning between hosts and mediawiki sorry [11:38:31] :( [11:39:25] no problem elukey - I'm doing other things for now but I can help with that as needed - please tell me [11:44:01] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1121.eqiad.wmnet'] ` and were **ALL** successful. [11:49:15] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1123.eqiad.wmnet'] ` The log can be fou... 
[11:49:39] 1123 in progress [11:59:25] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1122.eqiad.wmnet'] ` and were **ALL** successful. [11:59:44] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1124.eqiad.wmnet'] ` The log can be fou... [11:59:49] 1124 in progress [12:00:38] Taking a break, back in a few [12:08:33] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1125.eqiad.wmnet'] ` The log can be fou... [12:08:51] 1125 in progress [12:10:47] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1123.eqiad.wmnet'] ` and were **ALL** successful. [12:13:17] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1126.eqiad.wmnet'] ` The log can be fou... [12:13:19] 1126 in progress [12:19:17] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1127.eqiad.wmnet'] ` The log can be fou... 
[12:19:20] 1127 in progress [12:20:11] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1124.eqiad.wmnet'] ` and were **ALL** successful. [12:29:56] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1125.eqiad.wmnet'] ` and were **ALL** successful. [12:34:40] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1126.eqiad.wmnet'] ` and were **ALL** successful. [12:40:00] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10elukey) [12:40:10] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1127.eqiad.wmnet'] ` and were **ALL** successful. [12:51:00] lunch break! 
will keep going later on [14:30:52] (03CR) 10Mforns: "> Patch Set 1: -Code-Review" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/656367 (https://phabricator.wikimedia.org/T271560) (owner: 10Joal) [14:40:20] 10Analytics, 10Add-Link, 10Growth-Structured-Tasks, 10Growth-Team (Current Sprint), 10Patch-For-Review: Add Link engineering: Pipeline for moving MySQL database(s) from stats1008 to production MySQL server - https://phabricator.wikimedia.org/T266826 (10kostajh) [14:40:31] 10Analytics, 10Add-Link, 10Growth-Structured-Tasks, 10Growth-Team (Current Sprint), 10Patch-For-Review: Add Link engineering: Pipeline for moving MySQL database(s) from stats1008 to production MySQL server - https://phabricator.wikimedia.org/T266826 (10kostajh) [14:40:35] (03PS1) 10Joal: [WIP] Update to spark-3 and scala-2.12 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/656897 [14:44:03] (03PS1) 10WMDE-Fisch: [WIP] Update schema with core bucket labels [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) [14:44:42] (03CR) 10jerkins-bot: [V: 04-1] [WIP] Update schema with core bucket labels [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: 10WMDE-Fisch) [14:59:47] (03CR) 10Mforns: [C: 03+1] "LGTM! Left a couple comments, just to be sure." (036 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/656373 (https://phabricator.wikimedia.org/T271560) (owner: 10Joal) [15:02:37] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1128.eqiad.wmnet'] ` The log can be fou... 
[15:07:20] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1130.eqiad.wmnet'] ` The log can be fou... [15:08:31] heya elukey :] [15:08:49] can you review this and merge if OK? https://gerrit.wikimedia.org/r/c/operations/puppet/+/655120 [15:09:16] it's the reactivation of the automatic deletion of Hive netflow data older than 90 days [15:14:08] 10Analytics, 10Analytics-Kanban, 10SRE, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10mforns) Moving this task to DONE. [15:15:02] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10Structured-Data-Backlog: SuggestedTagsAction Event Platform Migration - https://phabricator.wikimedia.org/T267351 (10mforns) [15:16:38] mforns: sure! I see that Fran also reviewed it, merging now [15:16:48] ok :] thanks! [15:17:06] it should run tonight, but it should be pretty safe [15:17:22] I tested it with the checksum [15:17:31] famous last words [15:17:34] ahahahah [15:17:41] yes don't say those words out loud [15:17:43] I knew you were thinking it [15:17:48] this is my golden rule [15:17:57] hehehe [15:18:15] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1130.eqiad.wmnet'] ` Of which those **FAILED**: ` ['an-worker1130.eqiad.wmnet'] ` [15:18:22] mforns: is it snowing where you live? [15:18:50] elukey: no no, it has been a (cold but) sunny day, actually, why? [15:19:31] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Set up automatic deletion/snitization for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (10mforns) Moving this to DONE. 
[15:20:33] mforns: I am following a youtube blog of DrBecky, she is an astro-physicist and she was trying to use the telescope in Palma but it was covered in snow (days ago though) [15:21:15] ah! maybe the telescope was in an observatory in the mountains, some of them have snow! [15:21:32] yes exactly! okok so it snowed on the mountains [15:21:57] Newton Telescope [15:23:56] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1130.eqiad.wmnet'] ` The log can be fou... [15:24:01] elukey: makes sense [15:24:03] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1128.eqiad.wmnet'] ` and were **ALL** successful. [15:29:49] mforns: also really nice blogpost, the cherry on top a looong and great work :) [15:30:20] elukey: the one about entropy anomalies? [15:30:30] yep! [15:31:15] yes, it looks great! nuria did pretty much all of the post [15:33:48] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10elukey) No link found as well for an-worker1131, skipping.. [15:45:40] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1132.eqiad.wmnet'] ` The log can be fou... 
[15:46:39] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1130.eqiad.wmnet'] ` and were **ALL** successful. [15:51:40] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1135.eqiad.wmnet'] ` The log can be fou... [15:57:16] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1136.eqiad.wmnet'] ` The log can be fou... [16:06:11] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1132.eqiad.wmnet'] ` and were **ALL** successful. [16:09:13] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1136.eqiad.wmnet'] ` Of which those **FAILED**: ` ['an-worker1136.eqiad.wmnet'] ` [16:09:17] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1137.eqiad.wmnet'] ` The log can be fou... 
[16:12:03] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1135.eqiad.wmnet'] ` and were **ALL** successful. [16:18:40] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1138.eqiad.wmnet'] ` The log can be fou... [16:27:28] Oops, completely missed the nonexistent standup :D [16:28:04] On the upside: I've made quite some strides on the ATS/Varnish front [16:30:49] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1138.eqiad.wmnet'] ` Of which those **FAILED**: ` ['an-worker1138.eqiad.wmnet'] ` [16:31:56] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1137.eqiad.wmnet'] ` and were **ALL** successful. [16:46:52] (03CR) 10Joal: "I got hooked into starting that... maven tests pass, but 3 test-classes are commented out." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/656897 (owner: 10Joal) [16:48:28] klausman: nice! [16:52:09] 10Analytics: Newpytyer python kernels - https://phabricator.wikimedia.org/T272313 (10fkaelin) [17:01:24] One thing I wasn't quite aware of in that context: only HTTPS requests make it to both Varnish and ATS [17:01:33] (in the current setup, that is) [17:06:59] yes exactly, the plain http ones are redirected [17:07:27] ah so ATS-TLS does the redirect for those, so you see them in atskafka right? 
[17:43:43] Correct, I only see TLS/SSL requests in both ATS and Varnish [17:44:02] There is still discrepancy, I am talking with ema about possible ways to figure out why that is [17:44:44] very nice work! [17:45:25] I've also had a bit of a shower thought when grocery shopping, about making this whole testing bit a bit more reproducible. But that I'll do tomorrow. [17:55:50] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10elukey) [17:59:05] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10elukey) [18:04:18] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1136.eqiad.wmnet'] ` The log can be fou... [18:10:21] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1138.eqiad.wmnet'] ` The log can be fou... [18:13:14] joal: we crossed the 50M files :O [18:16:12] mforns: also credit to joal for copyediting the post so, ya, team effort [18:16:28] yes! :] [18:16:34] nuria: hola!!! [18:24:41] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1136.eqiad.wmnet'] ` and were **ALL** successful. 
[18:29:22] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10elukey) [18:33:11] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1138.eqiad.wmnet'] ` and were **ALL** successful. [18:36:10] elukey: hey, rather easy review: https://gerrit.wikimedia.org/r/c/operations/puppet/+/656961 [18:36:49] Amir1: already merged :) [18:36:57] that was fast :D [18:37:04] I have another one for you have time as well [18:37:07] :P [18:37:15] sure! [18:38:07] https://gerrit.wikimedia.org/r/c/operations/puppet/+/656531 [18:38:11] Thanks! [18:39:01] we really should have a tool to find unused puppet modules. The repo is massive. It sounds really fun to do [18:39:42] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10elukey) @Cmjohnson for an-worker1119 and an-worker1131 I don't have any network link, could you please check if anything is missing from the cabling/config... [18:40:33] Amir1: I agree yes [18:43:33] Amir1: merged [18:43:54] joal: 16 nodes up, 2 with missing network link but good so far :) [18:44:02] \o/ [18:44:05] I'll double check racking locations tomorrow to be sure [18:44:18] awesome work elukey - Thanks a lot for that - I know it's not fun and tedious :( [18:44:37] Thanks! 
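(Editor's sketch on the 50M-files milestone mentioned at 18:13: one way to see where small files accumulate is to count them per directory prefix from an fsimage dump. The Python below is a minimal sketch, not the team's script. It assumes a tab-delimited dump with a header row containing `Path`, `FileSize` and `Permission` columns, as produced by the Hadoop Offline Image Viewer's Delimited processor as far as I know; the 1 MiB threshold and the prefix depth are hypothetical choices.)

```python
import csv
from collections import Counter

# Hypothetical cut-off for what counts as a "small" file.
SMALL_FILE_BYTES = 1024 * 1024


def read_delimited(lines):
    """Parse tab-delimited rows with a header line into dicts."""
    return csv.DictReader(lines, delimiter="\t")


def small_files_by_prefix(rows, depth=2, threshold=SMALL_FILE_BYTES):
    """Count files smaller than `threshold`, grouped by their first `depth` directories."""
    counts = Counter()
    for row in rows:
        # ASSUMPTION: directories carry a permission string starting with 'd'.
        if row["Permission"].startswith("d"):
            continue
        if int(row["FileSize"]) >= threshold:
            continue
        dirs = row["Path"].strip("/").split("/")[:-1]  # drop the file name
        counts["/" + "/".join(dirs[:depth])] += 1
    return counts


# Usage with an in-memory sample; the paths are made up for illustration.
sample = [
    "Path\tFileSize\tPermission\n",
    "/a/b/c.txt\t10\t-rw-r--r--\n",
    "/a/b\t0\tdrwxr-xr-x\n",
    "/a/big.bin\t5000000\t-rw-r--r--\n",
    "/x.txt\t1\t-rw-r--r--\n",
]
counts = small_files_by_prefix(read_delimited(sample), depth=2)
for prefix, n in counts.most_common():
    print(prefix, n)
```

(Prefixes with the highest counts would be the first candidates for compaction; on a dump of tens of millions of rows the same counting could be done as a distributed job rather than in a single process.)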
[18:45:28] elukey: we crossed 50M files in HDFS you mean I assume [18:45:35] this is a lot [18:45:46] We need compaction strategies [18:45:53] for small data [18:46:20] yes yes on hdfs :) [18:47:16] I really need to make this script to get fsimage metadata analyzed on hadoop [18:47:39] It's done as a manual one-off, I need to productionize it [18:48:00] it will also be interesting to see the namenode's behavior with the new 24 nodes, they will send block reports etc.. [18:48:04] plus 50+M files [18:48:08] and counting [18:48:08] indeed sir [18:48:12] very interesting [18:48:20] we might need to bump available heap [18:48:25] * elukey migrates to ozone [18:48:32] :) [18:48:34] yes definitely, the heap will need a bump [18:48:42] and we have a lot of RAM so I am happy [18:48:51] good [18:49:12] going afk for dinner, ttl folks! [18:49:41] elukey: I'd rather make us move to iceberg for events for now :( [18:49:42] :) [18:49:44] sorry [19:26:29] Gone for tonight - See you tomorrow team [22:20:51] (03CR) 10Addshore: [C: 03+2] "This is already a lovely step forward, so merging and then we have a new baseline to move forward from (maybe with Amirs comment)" [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649710 (owner: 10Lucas Werkmeister (WMDE)) [22:21:05] (03PS1) 10Addshore: Reduce duplicate code in lexeme statistics script [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/656912 [22:21:09] (03CR) 10Addshore: [C: 03+2] Reduce duplicate code in lexeme statistics script [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/656912 (owner: 10Addshore) [22:22:24] (03Merged) 10jenkins-bot: Reduce duplicate code in lexeme statistics script [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649710 (owner: 10Lucas Werkmeister (WMDE)) [22:22:32] (03Merged) 10jenkins-bot: Reduce duplicate code in lexeme statistics script [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/656912 (owner: 10Addshore) [23:42:15] (03CR) 10Awight: [C: 04-1] 
"The schema master is in `current.yaml`, please edit only that file, then let the git hooks materialize and link the generated files. See " [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: 10WMDE-Fisch) [23:50:50] (03CR) 10Awight: [C: 03+1] "Schema change itself looks good; it's just hard to see without the current.yaml diff." (031 comment) [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: 10WMDE-Fisch) [23:53:18] (03CR) 10Awight: [C: 04-1] "I'll rewrite this to use the new user_edit_count_bucket column. Don't bother making the query backwards-compatible, we'll just fast-forwa" [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/656210 (https://phabricator.wikimedia.org/T269986) (owner: 10Awight)