[05:34:16] godog, ema: re T236482, right now we aren't able to tell between text and upload clusters on the trafficserver exporters, so we need to add the cluster metric, same as we do right now with nginx, https://gerrit.wikimedia.org/r/c/operations/puppet/+/548581
[05:34:17] T236482: Add ats-tls status and availability graphs to frontend-traffic - https://phabricator.wikimedia.org/T236482
[05:38:22] <_joe_> vgutierrez: doesn't that get added automatically by the prometheus server on all metrics?
[05:38:25] <_joe_> oh I see
[05:38:31] for nginx yes
[05:38:36] for trafficserver not yet :)
[05:38:47] that's basically my CR
[05:39:10] morning BTW :)
[05:40:34] <_joe_> yeah I suggested we add it on the central server, but you were not
[05:41:00] <_joe_> morning, I've been around for some time, I finished reading the email in my inbox, now I should go to the tasks
[08:59:03] vgutierrez: ack, thanks! will take a look shortly
[09:00:55] <3
[15:26:56] moritzm: any other comments or concerns re: https://gerrit.wikimedia.org/r/c/operations/puppet/+/547527 ?
[15:27:58] let me check
[15:40:07] hello! I'd like to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/547767 but ideally with someone who knows more puppet/icinga around in case it doesn't go as planned
[15:53:01] XioNoX: looking
[15:53:54] XioNoX: looks reasonable enough to me
[15:54:22] cdanis: yeah, I don't want to firefight it alone in case there is something unexpected
[15:54:37] especially as I don't know the insides of icinga
[15:54:46] I'll be around and I know volans is still working ;)
[15:55:40] * volans around
[15:55:52] cool, thx
[15:56:17] XioNoX: you can do that
[15:56:21] disable puppet on icinga1001
[15:56:30] run it on 2001 and run manually the icinga config check
[15:56:42] sudo icinga -v /etc/icinga/icinga.cfg
[15:56:59] great, thx
[15:58:24] disable-puppet --help show me the puppet agent help
[15:59:48] XioNoX: yeah it is a very dumb wrapper script
[15:59:54] I believe it just takes a single string argument
[16:00:03] disable-puppet "reason"
[16:00:06] disable-puppet 'Arzhel testing 547767'
[16:00:13] yeah I figured from wikitech
[16:00:35] "disable-puppet" disable it without reason?
[16:00:42] or does nothing?
[16:01:03] disables w/o reason
[16:01:03] anyway, it's disabled
[16:05:16] "Things look okay - No serious problems were detected during the pre-flight check"
[16:05:17] yay
[16:05:23] enabling puppet on 1001
[16:06:53] this is frustrating
[16:06:54] icinga1001:~$ sudo enable-puppet "CR547767 good on icinga2001"
[16:07:08] doesn't enable puppet
[16:07:45] I guess --force is always the solution
[16:08:00] XioNoX: you have to use the same message with which you disabled it
[16:08:12] I tried the same message with no luck
[16:08:12] that is done because if I disable with foo and you disable with bar
[16:08:17] you don't disable mine
[16:08:41] but don't let this trick you into thinking they are chained
[16:08:44] first one wins
[16:08:44] I think it's because I ran "disable-puppet" with no parameters to try to figure out the command line
[16:08:46] or: run-puppet-agent -q -e 'reason'
[16:09:00] which will only do so if the current reason for disabling is 'reason'
[16:10:40] so it registered my "no reasons" disable, and then enabling a "no reasons" disable requires a --force
[16:11:24] alright, no errors on icinga1001
[16:11:44] thanks for the help
[16:12:08] mgmt network should be much less noisy from now on
[16:12:37] \o/ thanks!
[16:13:32] XioNoX: btw, unrelated, I'm very likely going to use librenms and/or smokeping as test cases for rsync TLS wrapping
[16:15:21] cdanis: what does that mean in English?
[16:15:22] :)
[16:16:24] XioNoX: so I see both librenms and smokeping have rsync::server usages
[16:16:56] cdanis: note that if any effort should be spent on smokeping is to get rid of smokeping, eg https://phabricator.wikimedia.org/T169860
[16:17:02] haha
[16:17:04] ok fair enough
[16:17:51] https://gerrit.wikimedia.org/r/c/operations/puppet/+/547527 is the context
[16:18:00] I'd like to encrypt all rsync traffic within prod (we have a bunch)
[16:21:16] is there unencrypted rsync traffic?
[16:22:06] plenty of it
[16:22:18] one thing i noticed is that there's no like unified pattern for internal services having ssh keys to each other
[16:22:38] it should just be natural that the rsync module uses ssh y'know
[16:23:02] it very much does ot
[16:23:05] s/ot/not/
[16:23:15] yes i noticed that
[16:23:45] and thus i went 'it's public anyway, let's just use git via https'
[16:23:58] :)
[16:24:03] but also that's a major gap
[16:25:05] so my CR broke cloud puppet
[16:25:12] what's a default value for Hash[String, String] ?
[16:25:56] {}?
[16:25:59] {} surely
[16:26:13] thanks!
[16:29:53] cdanis, chaomodus: https://gerrit.wikimedia.org/r/c/operations/puppet/+/548781
[16:39:42] I don't know who did this, but it looks amazing!
[16:39:46] `Info: Applying configuration version '(a8b6527) Ayounsi - Icinga, fix mgmt_parents hiera call for cloud'`
[16:39:53] (i.e, the commit reference)
[16:41:41] arturo: I believe it's part of the prod system motd now as well
[16:42:13] https://gerrit.wikimedia.org/r/c/operations/puppet/+/547505 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/547506
[16:42:34] yeah, it's super cool
[16:42:48] so greetings to jbond42 :-)
[16:48:38] :)
[19:11:17] let's make a Phabricator button like we have for "File Decom Request" for more stuff, "File Onboarding Request", "File Access Request". people will get the pre-filled task from template instead of being told to copy/paste or not using any template
[19:11:54] looks into how to get one for onboarding after talking to Joal again about duplication of check lists in office wiki and phab template
[19:12:44] +1 +1 +!
[19:52:42] yes definitely
[21:44:01] one problem with phab checklists is the proliferation of systems for onboarding. Phab makes sense for stuff that is tracked and passed around, but a wiki page is much easier to tweak and re-template.
[21:44:23] plus spreadsheets, which I think are still used in some parts of onboarding
[21:44:31] We need an onboarding tool that is not a spreadsheet or phabricator
[21:44:43] something expensive saas
[21:44:44] plus atomizing the onboarding of employees vs volunteers and the different but overlapping needs
[21:45:10] more just workflow than onboarding specifically
[21:45:28] I wasn't atomized at all during my onboarding, is that a step we missed?
[21:46:06] please step into the atomizor
[21:47:27] if you noticed the atomizing then it failed
[21:47:52] just know that there is a version of you that got disintegrated
[21:48:31] was
[21:49:22] if you say so! seems a little old-fashioned
[21:49:33] one of these quarters we'll have to get onto a modern Continuous Disintegration framework
[21:52:43] you'll never get consensus on *which* continuous disintegration framework to use
[21:55:18] the latest, new-fangled disintegration techniques are not always the best. Our community looks at the long term. Otherwise, how do you know that the disintegration will hold up over years?
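
Note on the 16:06-16:10 exchange: the disable/enable behaviour described there (enable only works with the same message used to disable, the first disable wins, and a disable with no reason can only be lifted with --force) can be sketched as a toy model. This is only an illustration of what was said in the channel, not the actual wrapper scripts; the class `PuppetLock` and its methods are hypothetical names, and treating an empty reason as "never matches" is an assumption drawn from the 16:10:40 message.

```python
class PuppetLock:
    """Toy model of the disable/enable semantics described in the log.

    Hypothetical illustration only; it does not call or reproduce the
    real disable-puppet / enable-puppet scripts.
    """

    def __init__(self):
        # None means the agent is enabled; a string is the disable reason.
        self.reason = None

    @property
    def disabled(self):
        return self.reason is not None

    def disable(self, reason=""):
        # "first one wins": a later disable neither replaces nor stacks
        # on an existing one.
        if self.reason is None:
            self.reason = reason

    def enable(self, reason="", force=False):
        """Try to re-enable; return True if the agent ends up enabled."""
        if self.reason is None:
            return True
        # Assumption: an empty stored reason can never be matched, so it
        # needs force=True, as reported at 16:10:40.
        matches = self.reason != "" and reason == self.reason
        if force or matches:
            self.reason = None
            return True
        return False


if __name__ == "__main__":
    lock = PuppetLock()
    lock.disable()                          # accidental disable with no reason
    lock.disable("Arzhel testing 547767")   # ignored, first one wins
    print(lock.enable("Arzhel testing 547767"))  # message does not match
    print(lock.enable(force=True))               # force lifts the disable
```

In this model the second `disable` call is a no-op, which is why re-enabling with the "right" message still fails until `force=True` is used.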