[00:21:52] netops, Operations: Replace accepted-prefix-limit with prefix-limit - https://phabricator.wikimedia.org/T211730 (ayounsi) Confirmed that replacing accepted-prefix-limit with prefix-limit does NOT cause the peer to bounce.
[00:49:14] bblack: the codfw-eqsin link is down, telia outage. Traffic is going over the tunnel, going to depool eqsin
[00:53:38] done
[01:01:00] Traffic, netops, Operations: Outage on the primary codfw-eqsin link - https://phabricator.wikimedia.org/T219847 (ayounsi) p:Triage→Normal
[01:03:31] https://phabricator.wikimedia.org/T219847
[01:15:31] Traffic, netops, Operations: Outage on the primary codfw-eqsin link - https://phabricator.wikimedia.org/T219847 (ayounsi)
[08:36:01] weird
[08:36:07] $ host -t ns wicipediacymraeg.org
[08:36:07] Host wicipediacymraeg.org not found: 3(NXDOMAIN)
[08:36:37] but... willikins:dns vgutierrez$ whois wicipediacymraeg.org |grep "Name Server"
[08:36:37] Name Server: NS0.WIKIMEDIA.ORG
[08:36:37] Name Server: NS1.WIKIMEDIA.ORG
[08:36:37] Name Server: NS2.WIKIMEDIA.ORG
[08:38:46] and of course host -t ns wicipediacymraeg.org ns0.wikimedia.org works as expected
[08:41:04] uh..
[08:41:11] Domain Status: clientHold https://icann.org/epp#clientHold
[08:41:20] it looks like there is some issue with the domain
[08:47:21] Traffic, Operations: wicipediacymraeg.org is on clientHold - https://phabricator.wikimedia.org/T219856 (Vgutierrez)
[08:48:03] Traffic, Operations: wicipediacymraeg.org is on clientHold - https://phabricator.wikimedia.org/T219856 (Vgutierrez) p:Triage→Normal
[09:18:30] non-canonical redirect certs have been issued successfully \o/
[09:25:55] Acme-chief, Traffic, Operations, Goal: Deploy managed LetsEncrypt certs for all public use-cases - https://phabricator.wikimedia.org/T213705 (Vgutierrez) Open→Resolved The non-canonical certs have been issued successfully: `root@acmechief1001:~# for i in {1..4}; do openssl x509 -text -no...
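Note on the 08:36 to 08:41 exchange: a `clientHold` (or `serverHold`) EPP status means the registry does not publish the domain's delegation, which would explain why public resolution returned NXDOMAIN while asking ns0.wikimedia.org directly still worked. A minimal sketch of spotting that status in saved `whois` output follows. The function names and the `SAMPLE` text are hypothetical, made up for illustration, and the parser assumes the common `Domain Status: <code> <url>` line layout seen in the log.

```python
from typing import List


def domain_statuses(whois_text: str) -> List[str]:
    """Return the EPP status codes found in `Domain Status:` lines."""
    statuses = []
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "domain status":
            # The value looks like "clientHold https://icann.org/epp#clientHold";
            # the first token is the status code itself.
            tokens = value.split()
            if tokens:
                statuses.append(tokens[0])
    return statuses


def is_on_hold(whois_text: str) -> bool:
    """True if any status is a hold status (domain withheld from the DNS)."""
    return any(s in ("clientHold", "serverHold") for s in domain_statuses(whois_text))


# Hypothetical sample, modelled on the lines quoted in the log.
SAMPLE = """\
   Name Server: NS0.WIKIMEDIA.ORG
   Name Server: NS1.WIKIMEDIA.ORG
   Domain Status: clientHold https://icann.org/epp#clientHold
"""

if __name__ == "__main__":
    print(domain_statuses(SAMPLE), is_on_hold(SAMPLE))
```

In practice the text would come from running `whois` and capturing its output; the sketch deliberately stays offline so it can be run anywhere.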
[09:26:03] Acme-chief, Traffic, Operations, Goal: Deploy managed LetsEncrypt certs for all public use-cases - https://phabricator.wikimedia.org/T213705 (Vgutierrez)
[10:04:45] vgutierrez: here if you want to talk about wikiba.se
[10:05:43] :)
[10:06:24] so.. right now if you set wikiba.se in /etc/hosts or using curl --resolve to point to one of our text-lb you should get the wikiba.se hosted in WMF
[10:06:42] something like curl --resolve wikiba.se:443:91.198.174.192 https://wikiba.se
[10:06:56] currently the TLS certificate supports wikiba.se and www.wikiba.se
[10:07:15] but I didn't find any references to www.wikiba.se in the current puppetization
[10:07:19] vgutierrez: ooh. you already moveed it to prod?
[10:07:29] or you mean the one in wmflabs
[10:07:41] I didn't update the DNS records
[10:08:14] oh, right, because the puppet role was already on krypton
[10:08:27] wait.. looks
[10:08:46] currently varnish points to webserver_misc_static
[10:10:13] ack, i see. it's basically already done. i can take it from here and check with WMDE about the "www" part and actually switching
[10:10:25] yep, and it works for me
[10:11:32] the current setup in WMDE serves the same thing for www.wikiba.se and wikiba.se
[10:11:43] so ideally we should redirect one to the other
[10:12:07] ok, i'll add that
[10:12:56] after checking with WMDE and switching the DNS records to our text-lb we should enable HSTS as well
[10:13:12] so that means that all the traffic against *.wikiba.se will come via HTTPS
[10:13:22] please confirm that there is no issue with that
[10:15:33] ok, ack. yep, familiar with HSTS from adding it to misc services in the past.
we can start with a low max-age and then raise it
[10:15:37] will check with them
[10:19:25] Traffic, Operations, Wikidata, serviceops, and 3 others: [Task] move wikiba.se webhosting to wikimedia cluster - https://phabricator.wikimedia.org/T99531 (Dzahn)
[11:31:00] Traffic, Operations, Wikidata, serviceops, and 4 others: [Task] move wikiba.se webhosting to wikimedia cluster - https://phabricator.wikimedia.org/T99531 (Dzahn) wikiba.se can now be viewed in WMF production by editing the local `/etc/hosts` file with f.e. `91.198.174.192 wikiba.se` Open issues...
[11:56:04] mutante: I don't think we need to do HSTS in the misc server, etc
[11:56:32] mutante: but we do need to fix the "www" thing both in the misc-static setup, and in varnish/ATS-land
[12:02:03] bblack: ok, ack. right now suggesting to rewrite "www to naked" but still debating if that is the right direction vs. "naked to www"
[12:02:19] didnt we want HSTS on everything in the past though
[12:03:25] mutante: we do, but the "misc-static apache config" and "right now" are not the place/time for it :)
[12:04:14] ok
[12:04:15] we should fix the www-bit (which is currently broken at 2 layers: the misc-static one doesn't have the vhost at all, and varnish/ATS don't recognize the www hostname for routing to it either)
[12:04:50] and then get wmde to review they still like the content (maybe things have fallen out of sync between our git copy and what they're serving in their version or whatever in all this time?)
[12:05:07] ack, adding ServerAlias and another patch for varnish/ATS
[12:06:15] right, Amir will contact the WMDE product manager for wikiba.se and i can get that confirmed
[12:07:17] once we're at that point where everyone's happen with the content with hacked DNS....
we *can* switch the A-record stuff in DNS if they want to, at least temporarily
[12:07:54] but then we need to start talking about the domain ownership issues, policy, HSTS and 301->HTTPS, etc, which are all tricky and entwined, and they seem to be resistant in the past on moving the domain reg to us.
[12:10:50] *nod*. starting with the domain ownership question. will bring it up. maybe it will end up being a meeting. i can collect info and report back at first
[12:12:39] yeah, if they're resisting the reg transfer becaues they want "control" and ability to move it back away from WMF at some future point, then they might also not want us to be 301+HSTS-ing UAs, which might make it hard for them to flip it back to their previous HTTP-only hosting, for instance.
[12:13:03] fro a policy pov, we want the reg control and we want to make it a canonical site with all the HTTPS-enforcementm, etc
[12:14:34] yep, i figured we generally want "all or nothing" when it comes to domain ownership and not some mix
[12:14:35] at this point, nothing technical is holding us back from doing it all, the sticking point is the domain reg move and that they're happy with the new setup enough to let us do "permanent" things like HSTS where we can't easily switch back to their old hosting.
[12:14:58] ok
[12:15:26] whereas before these conversations were hypothetical because it has taken us a long time to figure out how to manage certs well :)
[12:15:49] right, unstalled now :)
[12:30:05] mutante: while we're talking, I stumbled on the ancient issue of wikivoyage-old.org the other day
[12:30:51] we own the domain fully, and we point the various hostnames and A-records at some ancient 3rd party site that no longer functions at all (doesn't even listen on 80/443)
[12:31:05] amended to the regex change
[12:31:16] wikivoyage.de, we own and point into our mediawiki stuff where it redirects to the real wikivoyage.org
[12:31:25] oh.. wikivoyage-old, heh, yea.
that used to be the German "Verein" running that
[12:31:37] it's also in email aliases
[12:31:43] afair
[12:32:03] and related there's this, which we're not involved with at all on a technical level (all 3rd party) but is mentioned in one ticket somewhere: http://wikivoyage-ev.org/wiki/Hauptseite
[12:32:07] i know the right guy to ask
[12:32:33] we could probably park wikivoyage-old, or something, being careful of mail keeping working, etc
[12:32:43] yea, that "ev" part means https://en.wikipedia.org/wiki/Registered_association_(Germany)
[12:32:44] or add it as another redirect like wikivoyage.de
[12:33:13] making a TODO to contact the guy(s)
[12:33:30] but right not it's just pointing at some unused/useless IP at hetzner waiting for someone to find a way to hijack it heh
[12:33:35] s/not/now/
[12:35:01] probably the simplest thing to do would be to make it a softlink to wikivoyage.org like wikivoyage.de is, if whomever is ok with it (but it's broken now, so it's not like it would be any worse)
[12:35:09] and maybe add it to redirects and make it a non-canonical later or whatever
[12:38:58] Domains, Traffic, Operations, serviceops: contact Wikivoyage e. V. and figure out status of wikivoyage-old.org / fix or park broken domain - https://phabricator.wikimedia.org/T219867 (Dzahn)
[12:39:00] assigned to me as new ticket
[12:39:25] mutante: maybe you want to check https://phabricator.wikimedia.org/T219856 as well
[12:40:17] vgutierrez: for that i think all i can do is forward to Chuck Rosloef at legal
[12:40:28] thx
[12:40:48] adds the 'domains' tag
[12:41:10] Domains, Traffic, Operations: wicipediacymraeg.org is on clientHold - https://phabricator.wikimedia.org/T219856 (Dzahn)
[12:41:12] i'll send a quick mail
[12:46:00] Domains, Traffic, Operations: wicipediacymraeg.org is on clientHold - https://phabricator.wikimedia.org/T219856 (Dzahn) Sent a mail about it to Chuck in legal who handles domain registrations.
[12:46:29] re: redactions..
i thought "it's public anyways"
[12:46:52] err I didn't redacted it
[12:46:57] that's the output from the wois
[12:47:02] *whois
[12:47:42] ah :)
[12:49:34] Chuck's Out-of-Office replied. on vacation until like in a week
[12:58:14] Domains, Traffic, Operations, serviceops: contact Wikivoyage e. V. and figure out status of wikivoyage-old.org / fix or park broken domain - https://phabricator.wikimedia.org/T219867 (Peachey88)
[13:12:07] Domains, Traffic, Operations, serviceops: contact Wikivoyage e. V. and figure out status of wikivoyage-old.org / fix or park broken domain - https://phabricator.wikimedia.org/T219867 (Dzahn) Sent email to Roland Unger (http://wikivoyage-ev.org/wiki/Kontakt)
[14:40:43] can i just self-merge this, adding more notes_urls like last time, just for a different class now: https://gerrit.wikimedia.org/r/c/operations/puppet/+/497512
[14:40:53] i guess if i ask it's not self-merge anymore though, heh
[14:46:14] mutante: looks good, but the commit message should say varnish instead of varnish/trafficserver
[14:46:34] it looks lke you're only changing varnish related stuff
[14:46:38] *like
[14:49:29] vgutierrez: oh, that's right. i guess it's a habit from changing hieradata for both for backends. fixed
[14:50:13] +1
[14:50:34] thx
[14:52:24] why is jenkins crying?
[14:55:39] vgutierrez: it only cried before a rebase.. "cant be merge ..
cross-dependencies"
[14:55:45] ack
[14:55:46] it just looked weird in IRC
[15:13:45] HTTPS, Traffic, Operations, Goal, Patch-For-Review: Create a secure redirect service for large count of non-canonical / junk domains - https://phabricator.wikimedia.org/T133548 (Vgutierrez) a:Vgutierrez
[15:35:17] repooling eqsin
[15:43:20] Traffic, netops, Operations: Outage on the primary codfw-eqsin link - https://phabricator.wikimedia.org/T219847 (ayounsi) Open→Resolved a:ayounsi Telia stabilized the situation, " Services should be stable at the moment, hands are off and we are working with the vendor to provide an RFO i...
[17:13:18] netops, Operations: Replace accepted-prefix-limit with prefix-limit - https://phabricator.wikimedia.org/T211730 (ayounsi) Open→Resolved All set, no down or bouncing peers, no mentions of `accepted-prefix-limit` in Rancid
[22:00:58] bblack
[22:01:01] https://ticket.wikimedia.org/otrs/index.pl?Action=AgentTicketZoom;TicketID=11035486
[22:01:20] SEC_ERROR_OCSP_INVALID_SIGNING_CERT
[22:02:39] IIRC you don't have OTRS access, can forward if needed
[22:06:25] Krenair: please
[22:07:22] sent
[22:08:52] this is just one person who happened to remember the right email address, haven't heard of any other reports
[22:09:25] yeah it's odd, I double-checked icinga alerts, there's nothing amiss with our OCSP according to our checks
[22:09:29] ok
[22:09:43] I might do some cumin checks manually in case anything with the recent wikiba.se stuff caused some kind of ocsp misconfig
[22:10:01] I just wanted to let you know there was a report really
[22:10:19] I remember there was an incident with this sort of error about OCSP before
[22:10:23] we've broken OCSP for Firefox (which is the only browser that aggressively reports on it) once before a long time ago, and the reports came pouring in rapidly on IRC heh
[22:11:21] if you want to look into whether there's some user-level problem, obviously could be some kind of proxy/malware
interference or whatever, or could also be that their machine's date is way off.
[22:14:40] yeah will do
[22:23:00] I manually checked all the ocsp data on all the text cluster machines via cumin, it all looks legit
[22:23:03] OCSP Response Status: successful (0x0)
[22:23:07] Cert Status: good
[22:23:11] good date ranges, good file timestamps, etc
[22:54:57] bblack, yep was user's clock. sorry for bothering you
[23:22:35] np!
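Note on the 22:00 to 22:54 exchange: certificates and OCSP responses carry validity windows that the client evaluates against its own clock, so data that looks fine server-side ("good date ranges") can still be rejected by a machine whose date is way off. The sketch below only illustrates that general idea. It is not Firefox's actual validation logic, and the function name, the dates, and the one-year skew are hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def within_validity(not_before: datetime, not_after: datetime, now: datetime,
                    tolerance: timedelta = timedelta(0)) -> bool:
    """True if `now` falls inside [not_before, not_after], widened by `tolerance`."""
    return not_before - tolerance <= now <= not_after + tolerance


# Hypothetical validity window, standing in for an OCSP response's
# thisUpdate/nextUpdate or a signing certificate's notBefore/notAfter.
this_update = datetime(2019, 4, 1, 0, 0, tzinfo=timezone.utc)
next_update = this_update + timedelta(days=7)

# A correct clock, a few hours after the response was produced.
server_now = this_update + timedelta(hours=22)
# A hypothetical client whose clock is a year behind.
skewed_client_now = server_now - timedelta(days=365)

if __name__ == "__main__":
    print("server view:", within_validity(this_update, next_update, server_now))
    print("skewed client view:", within_validity(this_update, next_update, skewed_client_now))
```

The same response passes with the correct clock and fails with the skewed one, which matches the outcome reported at 22:54 (the problem was the user's clock, not the served OCSP data).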