[02:06:37] > Alias jessie matched 0 hosts
[02:06:42] \o/
[08:04:40] yay!
[08:05:42] that's awesome
[08:05:48] nice work to everyone involved :)
[08:20:37] cool :)
[09:11:41] hi all, i have just added a couple of aliases for testing sites locally from caching servers which may be useful to others: https://github.com/wikimedia/puppet/blob/production/modules/admin/files/home/jbond/.zshenv#L7-L16
[09:13:27] jbond42: FYI a more reliable way to get SITE is from /etc/wikimedia-cluster ;)
[09:18:17] volans: ahh thanks, i was thinking about adding something; good to know it's already there :)
[09:18:55] doesn't work in the k8s world fwiw (yet?) :)
[09:23:46] ack
[10:15:47] moritzm: I haven't merged your puppet change; mine can be done anytime, so feel free to do it once you're ready to merge yours :)
[10:20:04] ack, both done now :-)
[10:20:17] thanks!
[11:43:01] jbond42: any idea how to tell systemd::unit that it should not change the running status of a service?
[11:43:46] in this case it's starting a service on every puppet run
[11:52:50] kormat: is that unit doing oneshot things?
[11:52:57] kormat: from what i can see, if you define just a systemd::unit it shouldn't manage the service; where is the code?
[11:53:01] volans: nope
[11:53:06] jbond42: https://gerrit.wikimedia.org/r/c/operations/puppet/+/665324
[11:53:49] ahh, from feb (i had a lingering memory that this may have come up before :))
[11:54:17] it's Vintage
[11:56:03] kormat: i think that CR should dtrt
[11:58:15] jbond42: and yet..
[11:58:53] do you have more... e.g. an example of where/how it's failing?
[11:59:24] let me get a clear example
[12:00:17] i've just noticed a huge issue with how i was testing this. let me try again carefully
[12:00:30] ok. i'm just an idiot. please ignore all previous transmissions on this subject :)
[12:00:40] :) ack
[12:00:48] 🥀
[12:02:07] * volans takes notes :D
[12:03:03] * marostegui removes "on this subject" part
[12:03:37] * volans s/previous/previous and future/ :-P
[12:43:54] hey folks, is there an easy way for me to stop receiving emails from postmaster@wikimedia.org? I'm trying to reduce the amount of email I receive every day
[12:53:13] arturo: you will need to remove yourself from the root alias in /srv/private/modules/privateexim/files/wikimedia.org; however, this will stop you receiving all mail for root, which may be undesirable
[12:53:46] that is what I suspected, thanks for confirming
[13:00:42] add a filter to delete them automatically?
[13:38:56] reprepro is giving me `Error: packages database contains unused 'jessie-wikimedia|backports|amd64' database.` on apt1001
[13:39:06] is it safe to supply `--ignore=undefinedtarget`?
[13:41:21] https://phabricator.wikimedia.org/P15951
[13:45:55] kormat: I think that you can run reprepro clearvanished; somebody must have removed some component
[13:46:01] I guess moritzm earlier on
[13:46:25] (I already used clearvanished in the past)
[13:46:39] I think that jessie-wikimedia was removed from conf/distributions and as such reprepro might complain
[13:47:29] https://wikitech.wikimedia.org/wiki/Reprepro#Removing_a_component for reference; I'm not sure if a whole distro is treated like a component in reprepro terms
[13:52:48] clearvanished complains with:
[13:52:53] `There are still packages in 'jessie-wikimedia|backports|amd64', not removing (give --delete to do so)!`
[13:52:58] so i'm going to slowly back away.
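For reference, a minimal sketch of the reprepro cleanup being discussed here, assuming the standard repository layout on apt1001; the `--delete` flag is the one the prompt quoted above refers to:

```sh
# Remove package databases whose distribution no longer exists in
# conf/distributions. By default this refuses to drop databases that
# still contain packages (hence the warning quoted above).
reprepro clearvanished

# Force removal even when packages remain; only safe once the distro
# (here jessie-wikimedia) is confirmed gone for good.
reprepro --delete clearvanished
```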
[13:53:11] ahahahah
[13:53:51] as we don't have any more jessie hosts I think it's fine, but fine to wait for mor.itz's confirmation too ;)
[13:57:02] yeah, they're all gone, both in cloud VPS and prod
[14:32:35] ottomata: hey. i've just noticed an extra complication with https://gerrit.wikimedia.org/r/c/operations/puppet/+/689092 - mind if i just send a patch to fix?
[14:33:43] kormat: wrong andrew?
[14:33:48] you're looking for andrewbogott i think
[14:33:49] :)
[14:34:05] oh crap. right :)
[14:34:11] andrewbogott: hi! same question ^ ;)
[14:34:22] one day we will merge into a super andrewbogotto
[14:34:25] but that day has not yet come
[14:39:34] kormat: not at all, please do
[14:41:20] ok :) let's see if pcc shames me or not
[14:42:09] ah, different port per section?
[14:43:10] andrewbogott: yeah, so they can co-exist on the same IP
[14:43:38] jbond42: "Parameter 'port' of class 'profile::mariadb::ferm_wikitech' has no call to lookup". whaat? how's that a style violation?
[14:43:44] (https://integration.wikimedia.org/ci/job/operations-puppet-tests-buster-docker/25623/console)
[14:44:39] it wants profile::mariadb::ferm_wikitech to be a class and not a profile
[14:46:41] <_joe_> kormat: why is that a parameter?
[14:47:01] _joe_: it needs to be passed in
[14:47:10] only the containing profile knows the correct port
[14:47:39] <_joe_> so yes, my first question would be why is this a separate profile
[14:47:44] <_joe_> but lemme look at the code
[14:48:55] easiest is probably to just embed it rather than have it as a separate profile. Less DRY that way, but this is all a temporary hack anyway
[14:49:03] <_joe_> oh ok, got your error
[14:49:13] <_joe_> you wanted a define, and you created a class
[14:49:44] <_joe_> also:
[14:49:55] <_joe_> why not add this code directly inside the profile::mariadb::ferm define?
[14:50:17] because it's temporary, and specific to s6
[14:50:23] <_joe_> ok
[14:50:49] <_joe_> but we explicitly don't pass parameters to other profile classes (it's in the style guide)
[14:50:51] <_joe_> so
[14:50:59] <_joe_> you have the following avenues:
[14:51:03] kormat: before i check, do i still need to? (sounds like it may be sorted)
[14:51:26] <_joe_> 1 - make it a class mariadb::ferm_wikitech instead, and just pass it all the parameters from the profiles
[14:51:40] <_joe_> 2 - make it a define (although it's a bit of an abuse)
[14:51:58] <_joe_> 3 - add that if (with the same alert that it's a hack) in profile::mariadb::ferm
[14:52:08] <_joe_> you pass the section as the name anyways
[14:52:34] <_joe_> I would go with 3
[14:53:35] jbond42: i think no, just ran into an obscure corner of policy
[14:54:26] _joe_: 3) doesn't work, as profile::mariadb::ferm can't know if it's a multi-instance host or not, and therefore can't know the port
[14:54:30] 2) it is :P
[14:54:51] ack, yes, seems like you and joe are sorting it
[14:54:57] <_joe_> kormat: you explicitly pass the port to profile::mariadb::ferm
[14:55:42] <_joe_> profile::mariadb::ferm { $section: port => $port }
[14:56:01] <_joe_> anyways, I don't really care how you solve it
[14:56:48] ah, ok. new objection: profile::mariadb::ferm isn't used by core single-instance db hosts
[14:57:09] kormat: I'm about to vanish into a meeting, but thank you for your followup with all this
[14:57:41] andrewbogott: :)
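The define-based approach (option 2, which kormat ends up taking) would look roughly like this — a minimal sketch, not the actual change; the `ferm::service` body and resource title are assumptions for illustration:

```puppet
# Hypothetical define: the calling profile resolves the port (it knows
# whether the host is multi-instance) and passes it in explicitly, so no
# lookup() is needed here -- which is what the style guide wants.
define profile::mariadb::ferm_wikitech (
    Stdlib::Port $port,
) {
    ferm::service { "mariadb-wikitech-${title}":
        proto => 'tcp',
        port  => $port,
    }
}
```

The containing profile would then declare it with the section as the name, as _joe_ says:

```puppet
profile::mariadb::ferm_wikitech { $section:
    port => $port,
}
```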
[14:57:44] jbond42: new style error!
[14:57:47] > modules/profile/manifests/mariadb/ferm_wikitech.pp:7 wmf-style: Found lookup call in defined type 'profile::mariadb::ferm_wikitech' for 'profile::openstack::eqiad1::labweb_hosts'
[14:58:00] it's clearly been too long since i touched puppet code. everything hates me again.
[14:58:14] <_joe_> kormat: oh sorry, ofc you need to pass the parameters to the define
[14:58:22] <_joe_> you can't do lookups there
[14:58:51] can i override the style checker or something?
[14:59:09] redefining the parameters of 2 major mariadb profiles for this temporary hack is not something i want to do.
[14:59:14] you can
[14:59:29] ` # lint:ignore:wmf_styleguide`
[14:59:47] probably be extra-sure it is in fact temporary though :)
[15:00:12] cdanis: where does that go? at the end of the offending line?
[15:00:13] <_joe_> wikitech not being hosted on the main cluster? Not sure when that will happen tbh :)
[15:00:22] kormat: yes
[15:00:28] cdanis: ty <3
[15:00:45] <_joe_> probably once we move to k8s and we can build ad-hoc wikitech images that add php-ldap
[15:42:51] arturo: I see an sre.hosts.decommission cookbook that has been running for a while; is it possible it's just sitting there waiting for input? Sorry for the ping; I was looking to deploy related stuff and prefer to do so when there are no in-flight processes.
[15:48:07] volans: looking
[15:48:19] volans: yes, there was a prompt, sorry
[15:48:47] sorry, I was in a meeting
[15:49:02] no prob at all
[16:01:22] volans: {{done}}
[16:01:52] thanks!
[16:02:39] FYI, I converted cross-validate-accounts to a systemd unit, but I didn't notice that it exits with status 0 even when there are errors, so presently it doesn't send mail -- fixing that now
[16:17:16] I'm getting "Error: packages database contains unused 'jessie-wikimedia|backports|amd64' database" when importing on apt1001 - might it be related to T224549?
[16:17:16] T224549: Track remaining jessie systems in production - https://phabricator.wikimedia.org/T224549
[16:17:23] volans: mailed you https://gerrit.wikimedia.org/r/689951 to fix ^
[16:17:31] er, to fix my thing, not hnowlan's :)
[16:17:45] :D
[16:17:54] rzl: ack, in a meeting, will look in a bit, sorry
[16:18:13] no rush, just want to get it done today
[16:19:26] and hnowlan: kormat and moritzm were discussing it earlier, but I don't know what conclusion they came to
[16:19:36] ahh I see
[17:11:26] volans: cheers!
[17:14:25] yw :) all done
[17:14:46] ran it manually, and email received ✅
[17:14:53] great
[17:15:02] we should see an icinga alert for this too now
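The fix discussed here follows a common pattern: a timer-driven script must propagate a non-zero exit status when its checks find problems, otherwise systemd considers the run successful and no failure mail or alert fires. A minimal sketch of that pattern, not the actual cross-validate-accounts code; `run_checks()` is a hypothetical stand-in:

```python
import sys


def run_checks():
    """Hypothetical stand-in for the real validation logic; returns a
    list of human-readable problem descriptions, empty if all is well."""
    return []


def main() -> int:
    errors = run_checks()
    if errors:
        print("\n".join(errors), file=sys.stderr)
        # Non-zero exit: systemd marks the unit failed, which is what
        # lets the failure mail and the icinga alert mentioned above fire.
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
```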
[18:00:30] jynus: marostegui: does it seem right to you that "es*" databases are considered "core" in dashboards like this one? https://grafana.wikimedia.org/d/000000278/mysql-aggregated
[18:00:44] I think it'd be useful if there was an option in that menu for externalstore
[18:01:10] but not sure if that'd be undesirable for other reasons
[20:10:34] Krinkle: we need to unify groups; for now, just selecting es1-5 would work the same
[20:11:26] I would start by abandoning "core" (which is not very meaningful) and calling it "mw/mediawiki" or something
[20:11:56] jynus: yeah, but that's harder to explain and keep stable over time, I guess (and requires the viewer to know es = externalstore). E.g. linking to the dashboard's default configuration with a message that says "You can find out about db load here, including for parser cache and external store" does not result in someone being able to get to es* stats per se,
[20:12:01] without additional explanation or knowledge.
[20:13:51] in our documentation, we wrote: "Sometimes core servers either include or exclude es and/or pc, depending on the context."
[20:14:59] I would drop meaningless names and start having more specific names: "metadata", "content",
[20:15:01] etc.
[20:35:05] Puppet debugging question: is there a way to run lookup() from the command line and determine where a value was sourced? I'm trying to sort out some deployment-prep breakage and the chain of hieradata overrides is making me question my life choices.
[20:48:53] dpifke: https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/ may help you. that's a git repo which mirrors the hiera/puppet settings entered via Horizon
[20:49:14] https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/refs/heads/master/deployment-prep/
[20:51:04] Is there a way to output a "fully merged" version? i.e. with values from the various common directories, horizon, et al.?
[20:51:53] It's saying a key doesn't exist, but I plainly see it in hieradata/cloud/eqiad1/deployment-prep/common.yaml, so I think something is overriding it later down the line.
[20:52:48] Staring at individual files in what I think is the order of operations isn't helping, but that's probably me missing something. Was hoping for a better way to debug it.
[20:53:33] maybe with `puppet lookup --debug --explain --node ... ...`? I'm not sure if that has to be run from a puppetmaster directly or if a client can do it. I guess in deployment-prep it's easy enough to run on the puppetmaster
[20:53:58] Oh, that sounds like exactly what I was looking for! Trying it...
[20:54:15] https://puppet.com/blog/troubleshooting-hiera/
[20:54:46] I stopped trying to understand Puppet ages ago. Felt like that song from Desert Rose Band
[20:55:21] "One step forward and two steps back / Nobody gets too far like that"
[20:55:57] tabbycat: it's not so bad :) but it does take a bit of getting used to
[20:55:58] with beta cluster it's even harder, as IIRC we host configs both in operations/puppet AND Horizon
[20:57:24] yeah, there was a strong push ~5 years ago to put all the deployment-prep config into ops/puppet because an SRE was helping, but that kind of backfired in the long run without constant SRE supervision
[20:57:55] I'm getting some errors like `Undefined variable '::labsproject'` and `Undefined variable '::wmcs_deployment'` when I run that, but maybe that's pointing me to why it's not found. Digging a little deeper.
[20:59:00] `Error: Could not run: Lookup of key 'lookup_options' failed: cloudlib::httpyaml failed undefined method `fetch' for #` (this is on the puppetmaster)
[20:59:36] ^ Seems to imply it can't fetch the horizon data?
[21:05:05] jynus: looks like this config is quite deep in levels of indirection
[21:05:07] https://gerrit.wikimedia.org/g/operations/puppet/+/783b1a24ad4e0e5e94b2c8e20894ee29ee322566/modules/profile/manifests/prometheus/ops.pp#191
[21:05:14] https://github.com/wikimedia/puppet/blob/production/modules/profile/files/prometheus/mysqld_exporter_config.py#L67
[21:05:26] it comes from a database somewhere?
[21:14:12] Ah-ha. `puppet lookup` needs the `--compile` option.
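Putting the thread together, the working invocation looks something like this — a sketch with placeholder node and key names, run on the deployment-prep puppetmaster:

```sh
# --explain shows which hierarchy level supplied (or failed to supply)
# the value; --compile performs a full catalog compilation first, so node
# facts and variables like $::labsproject are defined -- the apparent
# cause of the "Undefined variable" errors above.
sudo puppet lookup profile::example::some_key \
    --compile --explain \
    --node deployment-foo.deployment-prep.eqiad1.wikimedia.cloud
```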