[08:21:50] Acme-chief, Patch-For-Review: CN + SNI list on config file doesn't match issued certificate on some scenarios - https://phabricator.wikimedia.org/T218418 (Vgutierrez) @Krenair I think that the proposed patch is even simpler. It ensures that the CN is always part of the SNI list
[08:27:44] Acme-chief: Handle ACME directory 429 rate limit responses properly - https://phabricator.wikimedia.org/T218538 (Vgutierrez) p:Triage→Normal
[09:14:53] Acme-chief: acme-chief calls unnecessarily to ACMEChief._push_live_certificates() on daemon start - https://phabricator.wikimedia.org/T218543 (Vgutierrez)
[09:18:02] ema: sorry for https://gerrit.wikimedia.org/r/#/c/operations/debs/superior-cache-analyzer/+/496781/ I forgot to deploy your config change :)
[09:18:12] it is running at https://integration.wikimedia.org/ci/job/acme-chief-tox-docker/43/
[09:18:27] uh?
[09:18:41] hashar: np, thanks!
[09:54:22] Acme-chief, Patch-For-Review: acme-chief calls unnecessarily to ACMEChief._push_live_certificates() on daemon start - https://phabricator.wikimedia.org/T218543 (Vgutierrez) p:Triage→High
[10:14:08] Acme-chief: acme-chief calls unnecessarily to ACMEChief._push_live_certificates() on daemon start - https://phabricator.wikimedia.org/T218543 (Vgutierrez) Open→Resolved
[13:25:23] vgutierrez: How do you feel about us merging https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/496858/ today?
[13:26:14] please check ema comments
[13:26:52] right now is a no go
[13:27:39] ah, ok, missed the review email. Will update.
[13:30:21] (thanks ema!)
[13:33:01] Traffic, Operations, Patch-For-Review: Partial cache_upload traffic switchover to ATS and switchback to Varnish - https://phabricator.wikimedia.org/T213263 (ema) >>! In T213263#5027366, @ema wrote: > > I have disabled puppet on cp2015 and manually set `proxy.config.cache.ram_cache.size` to 1G to tes...
[13:58:03] Traffic, Cloud-VPS, Operations, LDAP, and 2 others: Update openldap profile to use LE - https://phabricator.wikimedia.org/T218398 (Andrew) Open→Resolved a:Andrew This is working with acme now.
[14:16:59] puppet_svc => 'slapd',
[14:16:59] key_group => 'openldap',
[14:17:00] andrewbogott, ?
[14:17:28] is slapd running in the openldap group?
[14:17:47] slapd is the name of the openldap service
[14:18:10] huh
[14:18:18] I mean, openldap is the project and slapd is the actual service
[14:18:35] * Krenair for some reason thought they were separate projects making different ldap servers. must be confused with something else
[14:18:55] Krenair: https://gerrit.wikimedia.org/r/c/operations/software/acme-chief/+/494957 I've addressed what we discussed this morning, anything else missing?
[14:19:16] probably not but I can't really view it right this moment
[14:19:20] review
[14:19:22] *
[14:19:27] perhaps volans
[14:20:34] * volans will be in various meetings in the next few hours
[14:20:57] volans gave the +1 already
[14:22:27] ok
[15:02:50] so if you don't have more concerns with the CR I'd like to merge it
[15:36:44] Krenair: hmm I just realized that we cannot replace split() with os.path.split().. the behaviour is pretty different
[15:37:01] ah
[15:37:02] but we can use os.sep instead of hardcoding the separator
[15:37:07] what about the existing cases we moved?
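For reference, the difference between str.split() and os.path.split() raised at 15:36:44 can be sketched as below. This is only a minimal illustration, not code from acme-chief: the path and its components are made up for the example.

```python
import os

# Hypothetical path components, for illustration only (not from acme-chief).
parts = ["certs", "example", "live", "rsa.crt"]
path = os.sep.join(parts)

# str.split() with os.sep returns every component of the path,
# without hardcoding "/" as the separator.
components = path.split(os.sep)

# os.path.split() only separates the last component from the rest
# and always returns a (head, tail) pair.
head, tail = os.path.split(path)

print(components)
print(head, tail)
```

The two calls are therefore not interchangeable: code that indexes into the full component list needs str.split(os.sep), while os.path.split() only fits when the last path element is wanted.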
[15:37:17] I need to replace that as well
[15:37:20] ok
[15:37:27] am thinking I should get a new testing instance going
[15:37:38] so I'll use split() and os.sep()
[15:37:47] ok
[15:37:47] so I've been using two docker containers
[15:37:59] one to run the api and one as a puppet client
[15:38:29] is not ideal cause I have to disable the code that checks the client TLS certificate
[15:39:16] IIRC in the early stages I was using a little custom CA or something
[15:39:55] I do generate the certificates for using TLS though
[15:40:01] cause puppet needs https
[15:40:23] RUN python3 -c 'from acme_chief.x509 import *; import datetime; key = RSAPrivateKey(); key.generate(); key.save("/tmp/rsa.key"); cert = SelfSignedCertificate(key, "172.17.0.2", [], datetime.datetime.utcnow(), datetime.datetime.utcnow() + datetime.timedelta(seconds=3600)); cert.save("/tmp/rsa.crt");'
[15:40:23] CMD uwsgi_python3 --https-socket :8140,/tmp/rsa.crt,/tmp/rsa.key --callable app --wsgi-file /app/acme_chief/uwsgi.py
[15:40:47] that works pretty well but of course is hacky
[15:41:38] doing that I avoid setting up nginx as well
[15:42:30] if you don't have nginx do you have anything checking client certs?
[15:42:45] you should just be able to talk to acme-chief directly and set whatever authenticated DN you like?
[15:43:21] that's what I mentioned that in my tests I have to disable the code that checks the TLS certificate
[15:43:25] oh but this is with the puppet agent itself?
[15:43:28] s/what/why/
[15:43:32] yep
[15:43:36] I use the puppet agent as a client
[15:43:39] riiight
[15:43:40] darn
[15:43:41] okay
[15:43:56] to check that everything works as expected besides our tests
[15:43:59] yep
[15:45:17] ema: are you happy with https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/496858/ now?
(I'm anxious to move ahead with that because we have ongoing/intermittent problems with the current ldap setup)
[15:48:35] Krenair: everything replaced with str split() and os.sep, tests + puppet look happy here
[15:49:03] ack, in a couple of hours I should be able to recreate my test setup and +2 it hopefully
[15:49:21] great
[15:49:57] on the puppet side I'm using this
[15:50:03] https://www.irccloud.com/pastebin/DweoomOZ/
[15:51:28] Krenair: and in our puppetization should look like this: https://gerrit.wikimedia.org/r/c/operations/puppet/+/496148
[15:52:07] yep
[15:52:21] I take it the flavour thing is just temporary
[15:52:24] until we get it all migrated
[15:52:58] that's the idea yes
[15:53:03] cool
[19:47:44] vgutierrez, so I realised that the script I had somewhere to create accounts is probably in gerrit
[19:48:10] ACME directory accounts?
[19:48:24] yeah
[19:49:05] well.. is just hitting ACMEAccount.create() class method
[20:04:03] forgot I needed to write that designate integration script
[20:11:30] root@deployment-acme-chief01:~# /usr/local/bin/acme-chief-certs-sync
[20:11:30] Missing config file options, system misconfigured
[20:11:32] :/
[20:12:14] missing PASSIVE_FQDN or LIVE_CERTS_PATH according to the source
[20:12:34] hmm that changed as well.. CERTS_PATH instead of the older LIVE_CERTS_PATH
[20:12:49] we have CERTS_PATH instead of LIVE_CERTS_PATH
[20:12:51] yeah
[20:12:59] and PASSIVE_FQDN is empty because I have no passive host
[20:13:54] maybe I should make one. it doesn't look like our puppetisation can work without one
[20:42:58] So now it's upset that /var/lib/acme-chief/certs does not exist
[20:43:24] hmm the .deb package handles that
[20:43:40] it's got 0.9-1
[20:43:43] wait
[20:43:54] did you upload the new packages to both repos or just buster?
[20:43:59] just buster
[20:44:10] right well I'm stuck with stretch
[20:44:11] actually the latest change requires buster :)
[20:44:16] ...
oh and we just added that buster dependency too
[20:44:17] bah
[20:51:44] well I'm going to get dinner and see if I can sort this out.
[23:06:27] netops, Operations: eqiad - eqord Telia link down - IC-314533 - https://phabricator.wikimedia.org/T218307 (ayounsi) Telia did a loop test facing eqiad and our light levels didn't change. While Telia still don't receive light. The culprit seems to be an active element somewhere on the cr2-eqiad (in DC6)<-...
[23:17:16] "Please change the Link Net IP and the BGP peer IPs for both IPv4 & IPv6 as per the above new IPs.
[23:17:16] The reason for this change is due to a duplicate IP being already configured in another customer interface."
[23:17:19] good job...
[23:40:47] https://blog.cloudflare.com/monsters-in-the-middleboxes/
[23:50:36] so I have buster instances now
[23:50:47] now why is keyholder refusing to sign with the key I just made for it...
[23:59:52] root@deployment-acme-chief03:~# SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh-add -l
[23:59:52] 256 SHA256:pf9LDsLA+r2noYae+qmRv6uQL3kB6GGo/gU3BIqs8TA root@deployment-puppetmaster03 (ED25519)