[06:25:53] buster is finally released! \o/
[08:45:40] if anybody has time, I have a quick change for the disk space check in base monitoring: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/520989/
[08:45:53] basically excluding the hdfs fuse mountpoints
[08:46:20] since there is no point checking them (and with kerberos enabled, one needs to be authenticated to read from them)
[08:59:12] I guess there is some alternate monitoring method now, or planned?
[09:08:13] elukey: I am ok with it, but check with most people in foundation, as they are normally in charge of it
[09:08:38] also check if --exclude-type allows a comma separated syntax or something
[09:10:38] jynus: thanks :) afaics the nagios check's help says that multiple exclude paths options are allowed
[09:10:57] I am checking df to be sure
[09:11:04] df source code, man is useless
[09:11:51] df ?
[09:12:20] I am not following sorry
[09:12:56] yeah, yours is the correct way
[09:13:04] https://github.com/coreutils/coreutils/blob/master/src/df.c#L1675
[09:14:06] does /usr/lib/nagios/plugins/check_disk use df internally?
[09:15:17] I am guessing yes, --exclude-type is a df parameter
[09:16:03] because I checked https://github.com/nagios-plugins/nagios-plugins/blob/master/plugins/check_disk.c
[09:16:26] that mentioned an "exclude list"
[09:18:10] (so probably df and check_disk share the same code to check file systems known by the os?)
[09:19:43] it uses syscall statvfs
[09:19:55] ack makes sense
[09:19:59] which I am guessing df does too
[09:20:06] but it may handle parameters differently
[09:20:31] so good if you checked it (some executables overwrite the options with only the last one)
[09:20:37] that is why I mentioned it
[09:21:09] yep thanks for the review!
[09:38:17] does anyone know if conftool is configured/used/working in deployment-prep?
[11:54:28] do I remember correctly that this Monday's SRE meeting is goal only? No status updates etc?
[12:09:57] akosiaris: i believe so
[12:11:30] put status updates on the notes, not sure if we'll get to them in the meeting
[12:11:33] but still useful
[12:29:38] ok
[13:00:32] paravoid: but we have no doc I think
[13:35:36] I'm running into this for 'reprepro' on install1002, seen it before? https://phabricator.wikimedia.org/P8722 I guess there's cleanup needed
[13:50:59] ah yes, this happened before with the precise removal, the trusty repo definitions were dropped, so this needs a "reprepro clearvanished"
[13:51:10] I'll do that
[13:51:44] I guess we're ok deleting the trusty packages too
[13:52:22] this just deletes the db internal stuff
[13:52:28] not the packages under /srv
[13:53:16] ah! got it
[13:54:13] all done, thanks moritzm
[13:54:39] ack, thx
[14:45:02] jynus: re: https://gerrit.wikimedia.org/r/c/operations/puppet/+/519203/14/modules/profile/files/prometheus/mysqld_exporter_config.py#209 looks like at line 40 config_path is never added to the parser
[14:45:35] I don't understand
[14:46:10] I may be making a mistake
[14:46:21] just I still haven't seen it
[14:46:25] there's never a parser.add_argument for 'config_path' in get_options()
[14:46:32] or I am not understanding
[14:47:02] oh, I see
[14:47:04] I get it now
[14:47:27] I may have pushed the wrong branch, thanks
[14:47:39] will fix it and the other
[14:47:52] np! I wondered too if I was looking at the latest PS but yeah
[14:48:11] sorry, I was just looking at the wrong suggestion
[14:48:30] np! yeah I didn't comment in the right place either
[14:50:47] I will change the debug thing, but I consider it a minor issue
[14:51:06] more log styling than anything else
[14:51:21] I predict in most cases it will either catastrophically fail
[14:51:34] (e.g. network or db is down)
[14:51:42] or it will go through
[14:52:44] and I was afraid of too much logging, but as the logging master I will, if you support it, I will do it :-D
[14:56:59] heheh you mean the logging.exception calls?
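The config_path exchange at 14:45-14:47 comes down to how argparse behaves when an option is read but was never registered. The sketch below is only an illustration of that failure mode, not the code under review: the names get_options() and config_path come from the chat, while the --config-path and --debug flag spellings, the default path, and the get_options_missing() helper are assumptions made up for the example.

```python
import argparse


def get_options(argv=None):
    """Hypothetical parser that does register config_path."""
    parser = argparse.ArgumentParser(description="illustrative sketch only")
    # Registering the option is what makes args.config_path exist.
    # The flag name and default path here are assumptions.
    parser.add_argument(
        "--config-path",
        dest="config_path",
        default="/etc/example/exporter.cnf",
        help="path of the generated config file (hypothetical default)",
    )
    parser.add_argument("--debug", action="store_true")
    return parser.parse_args(argv)


def get_options_missing(argv=None):
    """Same parser without the add_argument call for config_path."""
    parser = argparse.ArgumentParser(description="illustrative sketch only")
    parser.add_argument("--debug", action="store_true")
    return parser.parse_args(argv)


# Explicit argv lists keep the demo independent of the real command line.
good = get_options(["--config-path", "/tmp/demo.cnf"])
print("registered:", good.config_path)

bad = get_options_missing([])
try:
    print(bad.config_path)
except AttributeError as exc:
    # The error only shows up where the attribute is first read,
    # which can be far from get_options() itself.
    print("not registered:", exc)
```

Because the missing attribute only fails at the point of use, a review comment pointing at the line that reads it (rather than at get_options()) is easy to misread, which matches the confusion in the chat.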
[14:58:31] question, is everything in our things ready for buster instances? like puppet, repos, etc
[14:59:57] chaomodus: yes, we have buster systems in production already
[15:00:11] oic cool thanks
[15:00:13] or you mean wmcs vms with 'instances' ?
[15:00:18] well that too but
[15:00:24] caveats for wmcs?
[15:00:46] i meant like installations in general
[15:02:39] not afaik
[15:02:41] (meeting)
[15:05:19] rog
[16:21:34] chaomodus: yea, m.oritz just told me the other day i can skip stretch now if i want to upgrade anything jessie
[16:22:01] and buster has been officially released for 2 days now.. so .. time
[16:22:10] yes
[16:23:19] chaomodus: the base system modules work but each individual service has not been tested. My gut feeling is that 80-90% will probably work without issue, but some services may need some tweaks
[16:25:51] best to fire up a wmcs instance with buster (if we have those) and apply the role there to check for errors.. like jessie->stretch usually meant it needed apache config changes if apache was involved
[16:29:28] cool thanks!
[16:29:46] idk if wmcs has a buster image yet, but i haven't spun anything up lately.
[16:33:02] chaomodus: for what currently runs buster in prod: https://puppetboard.wikimedia.org/fact/lsbdistcodename/buster
[16:42:13] yea, ideally we should first have the new distro in wmcs and later in prod instead of the other way around
[17:02:27] fwiw there's a buster prerelease image available on vmcs
[17:02:30] wwcs
[17:13:01] kjh~.
[17:13:49] supersafe password jbond42 :-P
[17:15:51] this is just misdirection, the actual password is jbon421234
[17:29:28] that's hard to remember
[17:29:36] make those all 1's and it will be a piece of cake
[17:34:34] 👍
[18:21:55] chaomodus: what do you want to run on buster? most of the changes in the respective roles will be limited to small changes and rebuilding wikimedia-specific packages and a lot will simply work out of the box
[18:22:18] the new netbox installations are meant to be buster
[18:22:39] i suspect i'll encounter some hardcoded package version issues
[18:22:41] but otherwise it should be easy enough.
[18:26:18] if it's just a dedicated Ganeti VM with Netbox it should be really straightforward, yes; upgrading role::netmon along with netbox will probably be a little more involved with things like Postgres updates etc.
[18:26:52] netbox vm and postgres vm for netbox, yah.
[18:27:06] netmon is getting netbox removed from it
[18:27:49] ack
[20:28:20] so there is a pre-release buster image available in labs but projects have to request access
[20:29:19] it isn't open to the world by default like the stretch image right now
[20:29:28] but: https://phabricator.wikimedia.org/T227474 :D
[20:32:26] (it's not a big deal or anything to get access, you just ask and.rew nicely)
[20:33:35] deployment-prep has had buster instances for a few months now, for running acme-chief (which depends on buster and is buster in prod)
[20:34:26] ahh i see
[20:34:34] the project i normally use apparently has access to it also.
[20:43:42] Krenair, chaomodus: see #wikimedia-cloud-admin from earlier in the day, andrewbogott will create a current buster image and make it publicly accessible once T227475 is resolved
[20:43:42] T227475: Use sssd by default in cloud-vps base images - https://phabricator.wikimedia.org/T227475
[20:44:54] 👍
[21:31:57] Krenair: moritzm: aha! good to know. very nice