[08:32:50] hello people
[08:33:07] I am testing a change for kerberos on the decom cookbook, please ping me beforehand if you need to run it
[08:35:33] so you're testing in production... duly noted! :-P
[08:37:52] of course!
[09:42:09] [FYI] I'll be migrating the codfw private/public primary DNS records from the manual repo to the Netbox auto-generated ones in few minutes. Please hold on merging DNS patches for the next ~30m or so. Thanks!
[09:45:29] volans: you didn't post this message in the analytics chan as well, usual bias against analytics, I know
[09:45:49] :P
[09:45:53] * elukey runs
[09:46:28] lol, do you want me to spam on 20 chans? :-P
[09:46:40] only to the important ones
[09:47:10] but I see that you also skipped kormat's chan so I am happier now
[09:47:23] lol
[10:08:27] me too
[10:21:00] [FYI] you can resume normal operations on the dns repo. I'll keep monitoring for a while but all looks good so far.
[10:22:42] congrats!
[10:34:56] FYI, I'm upgrading seaborgium to Buster in ~5m
[10:47:35] is someone good at DNS? I have questions
[10:49:07] shoot i can take a stab
[10:49:32] mixing your metaphors there :)
[10:49:38] :D
[10:49:42] he's jbond after all
[10:49:49] 'strue :)
[10:50:03] shoot, we'll try to confuse you as much as we can
[11:44:12] i think XioNoX had second thoughts ;)
[11:45:36] <_joe_> kormat: XioNoX is wise
[11:46:25] haha, chatted with people by PM
[18:21:05] godog, I'm trying to use raid1-2dev.cfg and the debian installer is telling me 'No root file system is defined.'
[18:21:12] any thoughts how to proceed/debug?
[18:21:21] (context: https://gerrit.wikimedia.org/r/c/operations/puppet/+/643322 )
[18:29:36] hm, the answer was 'open a shell, vgremove everything, try again'
[21:14:47] anyone know why ldap is breaking? Something wrong with Ganeti?
[22:14:30] andrewbogott: i do!!!
[22:14:36] i had this issue whats the hostname?
[22:14:48] robh: I think it's fixed now, see discussion in _security
[22:14:51] take a look at an-tool1010*) echo partman/standard.cfg partman/raid1-2dev.cfg ;; \
[22:14:53] ahh, the * right?
[22:15:06] oh, im not in there somehow
[22:15:07] oh, wait you're talking about partman and not ldap :)
[22:15:27] I figured that out as well, although the solution wasn't ideal
[22:15:28] andrewbogott: indeed partman no root system issue
[22:15:43] since the updates to partman stuff it seems i had to hostname in * at the end
[22:15:46] it no longer assumes our fqdn
[22:15:48] I fixed it by dropping into a shell from the installer and wiping out existing lvm things
[22:15:54] I guess that recipe creates things but doesn't clean up beforehand
[22:15:58] ohh, that is totally different issue yep
[22:16:02] also something im used to seeing sometimes
[22:16:36] yeah, i just thought you may have been bit by the fact netboot.cfg has two hostname formats and one doesnt work any longer heh
[22:16:43] (the stuff without a trailing *)
[22:16:55] hm, does a literal hostname also not work?
[22:16:59] https://gerrit.wikimedia.org/r/c/operations/puppet/+/643322/1/modules/install_server/files/autoinstall/netboot.cfg
[22:17:02] not afaict
[22:17:17] 'cause there sure are a lot of those in there
[22:17:18] so your line 86 looks right
[22:17:21] it has * at the ends
[22:17:23] but 85?
[22:17:39] 85 wouldnt work i dont think
[22:17:46] if it did, then i dont understand the file formatting
[22:17:48] hm
[22:17:52] but daniel and i ran into this issue last week
[22:18:05] /something/ matched but maybe it fell through to a default or something
[22:18:09] well, i ran into it, he pointed out fix, it fixed it.
i was getting no root filesystem defined because without the trailing *
[22:18:15] it doesnt apply incoming host to anything in there
[22:18:29] there isnt a default, it should just fail afaik
[22:18:34] so huh
[22:19:41] So my understanding is on https://gerrit.wikimedia.org/r/c/operations/puppet/+/643322/1/modules/install_server/files/autoinstall/netboot.cfg#86 like 86 good, 85 wouldnt work, and say, 93 wouldnt work
[22:19:50] if things like line 85 don't work anymore then we should go through and add * to every entry in that file
[22:20:11] I thought about that but then i decided i wasnt comfortable doing a scope of change that large not fully understanding why that happened
[22:20:22] i assume old config had assumptions for FQDN and new config doesn't
[22:20:23] me neither :/
[22:20:40] so i just changed them as i ran into them and could confirm they work post change
[22:22:09] I would experiment and try to understand if this were anything but partman
[22:22:26] we need someone new who doesn't already hate partman to investigate
[22:23:01] like real new
[22:23:07] partman hate is the first hate to take root in SRE
[22:23:44] everything depends on it and even those who wrote the recipes in the past (myself included) don't seem to really be able to explain it so much as just muddle through ;D
[22:24:11] hehe
[22:30:52] I asked dcaro and he already hates it from his last job
[22:31:13] charter member of the partman haters club right here
[23:21:06] andrewbogott: try cloudcephmon2003-dev*)
[23:21:44] mutante: it's working already - the mystery is why, given that you and rob have already determined it needs a *
[23:22:02] dark magics
[23:22:44] it's more like why did it ever work without a *
[23:22:44] to me
[23:22:51] when seeing all the other lines without it
[23:23:11] yeah it didnt used to need * but my issue was fixed a week ago when i added *
[23:23:14] hate to say it but this is technically not even partman itself, just about bash scripts
[23:23:15] so no clue
wtf is up
[23:23:27] indeed netboot.cfg is pattern matching against a config
[23:23:31] not partmans fault
[23:23:36] yea, i don't know what would have changed
[23:23:39] is it matching fqdn or hostname though?
[23:23:45] but it only matches with * now
[23:23:45] I always assumed it matched hostname
[23:23:48] but partman ancillary so counts for hate
[23:24:00] i thought it was hostname but the * made me think fqdn
[23:24:14] this is all just from my experience last week though so very very very limited
[23:24:17] zero pattern
[23:37:32] it's comparing against whatever is returned by $(debconf-get netcfg/get_hostname) and since get_domain is separate that should be the short version (hostname -s)
[23:44:28] yeah i dont get why we had to add * last week
[23:44:33] but it fixed it so its confusing
[23:45:36] yes, and it doesn't look like it was a recent change in netboot.cfg itself
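
The question in the log comes down to how shell `case` patterns match the installer's hostname. Below is a minimal sketch, assuming netboot.cfg dispatches with a `case` statement on the value of `debconf-get netcfg/get_hostname` (as stated at 23:37:32). The function name `pick_recipe`, the host `examplehost1001`, the `example.org` domain and the `no-match` output are hypothetical and exist only for illustration. The hostname is passed as an argument instead of being read from debconf, so the snippet runs outside the installer.

```shell
#!/bin/sh
# Hypothetical helper (not from netboot.cfg): maps a hostname to partman
# recipe files using the same case-pattern style quoted in the log.
pick_recipe() {
    case "$1" in
        # Trailing * matches the short name plus anything after it,
        # including a domain suffix.
        an-tool1010*) echo "partman/standard.cfg partman/raid1-2dev.cfg" ;;
        # Literal pattern (hypothetical host): matches the exact string only.
        examplehost1001) echo "partman/standard.cfg" ;;
        # Illustration only: the log says the real file has no default entry.
        *) echo "no-match" ;;
    esac
}

pick_recipe an-tool1010                  # short name, wildcard pattern
pick_recipe an-tool1010.example.org      # FQDN, wildcard pattern
pick_recipe examplehost1001              # short name, literal pattern
pick_recipe examplehost1001.example.org  # FQDN, literal pattern
```

In this sketch the literal pattern stops matching as soon as the compared value carries anything beyond the short hostname, while the `*` pattern keeps matching. That would be consistent with the behaviour described in the log if the value were not always the short name, but the log does not confirm that this is the cause.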