[10:10:57] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: Frack eqiad network upgrade: design, installation and configuration - https://phabricator.wikimedia.org/T377381#10249440 (cmooney) >>! In T377381#10246886, @Jgreen wrote: >> That's a bit of a shame in some ways but no...
[12:51:10] HTTPS, SRE, Traffic-Icebox, Upstream: Support ECH on Wikimedia servers - https://phabricator.wikimedia.org/T205378#10250056 (Diskdance) FWIW, Cloudflare has [[ https://github.com/net4people/bbs/issues/393 | enabled ECH by default ]].
[12:52:05] HTTPS, SRE, Traffic-Icebox, Upstream: Support ECH on Wikimedia servers - https://phabricator.wikimedia.org/T205378#10250057 (Diskdance)
[12:52:51] HTTPS, SRE, Traffic-Icebox, Upstream: Support ECH on Wikimedia servers - https://phabricator.wikimedia.org/T205378#10250058 (Diskdance)
[13:22:56] netops, Infrastructure-Foundations, Prod-Kubernetes, serviceops: WikiKube clusters close to exhausting Calico IPPool allocations - https://phabricator.wikimedia.org/T375845#10250154 (cmooney) >>! In T375845#10246786, @akosiaris wrote: > Good question. Let me add some data points. We currently use...
[14:52:22] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: Frack eqiad network upgrade: design, installation and configuration - https://phabricator.wikimedia.org/T377381#10250655 (Jgreen) There are 6 servers being replaced: {T369565} {T369947} {T369947} Plus 3 new servers:...
[15:24:25] Traffic, Maps, SRE, Patch-For-Review: Allow Wikimedia Maps usage on pediapress.com - https://phabricator.wikimedia.org/T375761#10250889 (ssingh) Hi: As an update, this is pending approval so we are working on that internally and will merge this once that is done. Thanks!
[15:53:01] sukhe: as discussed... the depool_threshold refactor https://gerrit.wikimedia.org/r/c/operations/puppet/+/1082238 /cc _joe_ --> we are switching depool_threshold from an opaque String to Float[0.0, 1.0]
[16:01:45] <_joe_> vgutierrez: it not being a float was inherited from 2012 puppet
[16:01:50] <_joe_> not a design decision :)
[16:05:51] that explains it :)
[16:08:20] Traffic, Maps, SRE, Patch-For-Review: Allow Wikimedia Maps usage on pediapress.com - https://phabricator.wikimedia.org/T375761#10251137 (MSantos) LGTM. Approved.
[17:14:55] netops, Traffic, Infrastructure-Foundations: Create Generalised blocking strategy - https://phabricator.wikimedia.org/T270618#10251482 (Dzahn)
[17:26:21] Traffic, SRE, Patch-For-Review: Disable acceptance of IPv6 router-advertisement on non-default LVS interface - https://phabricator.wikimedia.org/T358260#10251532 (ops-monitoring-bot) Host rebooted by cmooney@cumin1002 with reason: Reboot host to apply new sysctls
[18:10:15] Traffic, Maps, SRE, Patch-For-Review: Allow Wikimedia Maps usage on pediapress.com - https://phabricator.wikimedia.org/T375761#10251715 (ssingh) Open→Resolved a:ssingh Change has been rolled out. Please re-open this task if there are any issues. Thanks!
[18:33:18] hello traffic friends - I have returned to mess with more things: I have a fairly low risk ATS Lua change (https://gerrit.wikimedia.org/r/1072638) to deploy.
[18:33:18] chatting with v.gutierrez, it sounds like disabling puppet on A:cp-text and piloting on a single host before reenabling should be sufficient for a change like this.
[18:33:18] how should I best coordinate with you all to avoid conflicts? (e.g., I see https://gerrit.wikimedia.org/r/1072590 was recently merged, and presumably it requires a similar strategy)
[18:42:52] swfrench-wmf: yeah, we can do it after the above is merged
[18:43:37] brett: how is the above looking?
[18:44:09] sukhe: Fine as far as I can tell, but hard to test
[18:44:42] as long as varnish-frontend reloaded without any issues
[18:44:52] sukhe: sounds good. I'm not in any rush - just wanted to check with you folks to see what coordination might look like :)
[18:44:53] well, it's totally fine then
[18:45:47] brett: let us know when it has fully rolled out, we can try to test it with an old browser or downgrading the client ciphers
[18:46:16] okay
[18:58:51] all rolled out
[19:00:55] nice!
[19:02:20] trying
[19:10:02] yeah I doubt we will get anything given the 1% thing there, I am just trying to wing it :]
[19:10:49] https://en.wikipedia.org/sec-warning something I guess
[19:10:58] "something"
[19:12:20] brett: do we have anything from when we last rolled this out? or we just did it and left it at that
[19:13:03] I wasn't there when it was last rolled out but I believe we just did this and left it
[19:13:32] seems like what we did was 1% then 4% then 100% of all traffic that pertains to this
[19:13:37] yeah
[19:14:57] Last time I believe /sec-warning was a 404 if queried directly and was only shown if there was a resp.reason set
[19:15:14] Trying to think if it's possibly problematic to have it done this way
[19:15:29] in what way you mean
[19:15:58] I think it's fine for /sec-warning returning a 200
[19:16:20] I think so too but I guess my question is "why was it that way the time before?"
[19:17:17] if (req.url == "/sec-warning") {
[19:17:17] return (synth(200, "Browser Connection Security Warning"));
[19:17:27] this is for TLS1/1.1
[19:17:32] so I think we are OK with this
[19:18:05] brett: I guess we find out if we get a task on Phabricator. that would actually be nice.
[19:18:38] we could come up with some varnishncsa combo to track such a request but I am not even sure if it is worth it
[19:19:27] swfrench-wmf: ok I guess you can go ahead with https://gerrit.wikimedia.org/r/1072638
[19:19:30] let us know if we can help
[19:20:09] but yeah, disable puppet on A:cp-text and then merge on one and then all others (we usually do -b11 if that helps for a number)
[19:35:15] sukhe: apologies for the delay (somehow missed the mention). sounds good! just to confirm: your preference is to explicitly run the agent with on the rest of the fleet (paced with `-b11`) once my test host looks good, correct?
[19:35:44] (rather than just reenable and have it roll out via the timer)
[19:36:24] swfrench-wmf: yes, the former. I will share why we do that: so that you can see the output for yourself and are convinced it looks good vs Puppet letting it do its thing and then discovering it later after it applied the changes
[19:38:28] example of something I just ran:
[19:38:29] sudo cumin -b11 "A:cp-upload" 'run-puppet-agent --enable "merging CR 1078994"'
[19:38:46] sukhe: awesome, thanks for confirming
[19:50:27] applied cleanly on cp4040, trafficserver.service journal is clean, verified I get the expected result via curl
[20:00:11] nice
[20:00:45] I am sure you know but puppet reloads ATS for us so no need to do that
[20:01:39] sukhe: thanks! yeah, I was mainly checking the journal to confirm the reload indeed happened :)
[20:07:02] aaaaand we're done. thanks for the tips, sukhe!
[20:25:16] thanks! you did all the work :)
[22:20:15] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: Frack eqiad network upgrade: design, installation and configuration - https://phabricator.wikimedia.org/T377381#10252289 (Jclark-ctr) @cmooney Step 1: Firewall Installation & Cabling is complete Since we have racke...
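The rollout pattern discussed between 19:20 and 19:38 (disable Puppet on A:cp-text, apply the change on one pilot host, then run the agent on the remaining hosts paced with `-b11`) can be sketched roughly as below. This is only an illustration in Python, not how cumin or `run-puppet-agent` work internally. The names `batches`, `staged_rollout`, and the `apply` callback are hypothetical, and `-b11` is approximated here as "at most 11 hosts per group, and no further groups once any host in a group fails".

```python
from typing import Callable, Iterable, List


def batches(hosts: Iterable[str], size: int) -> List[List[str]]:
    """Split hosts into consecutive groups of at most `size` hosts."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    items = list(hosts)
    return [items[i:i + size] for i in range(0, len(items), size)]


def staged_rollout(
    hosts: Iterable[str],
    pilot: str,
    apply: Callable[[str], bool],
    batch_size: int = 11,
) -> List[str]:
    """Apply a change to the pilot host first, then to the rest in groups.

    `apply` is a hypothetical callback returning True on success.
    Returns the hosts where the change applied, in order. Nothing else is
    touched if the pilot fails, and no new group starts after a failure.
    """
    done: List[str] = []
    if not apply(pilot):
        return done
    done.append(pilot)

    rest = [h for h in hosts if h != pilot]
    for group in batches(rest, batch_size):
        # Every host in the current group is attempted before checking results.
        results = {host: apply(host) for host in group}
        done.extend(host for host, ok in results.items() if ok)
        if not all(results.values()):
            break
    return done
```

The point made at 19:36 still holds in this sketch: running the groups explicitly means the operator sees each result before the next group starts, rather than finding out after the timer has already applied the change everywhere.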
[22:21:01] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: Frack eqiad network upgrade: design, installation and configuration - https://phabricator.wikimedia.org/T377381#10252290 (Jclark-ctr)
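On the depool_threshold refactor mentioned at 15:53 (from an opaque String to `Float[0.0, 1.0]`): the value of a bounded numeric type is that malformed or out-of-range values are rejected up front instead of being passed along as text. The Python sketch below mimics that check for illustration only. `parse_threshold` is a hypothetical helper, not part of the Puppet change, and the sample inputs are invented.

```python
import math


def parse_threshold(raw: str) -> float:
    """Parse a threshold and require it to lie within [0.0, 1.0].

    Hypothetical helper: it mirrors the kind of range check a bounded
    float type provides, which a plain string cannot express.
    """
    value = float(raw)  # raises ValueError for non-numeric text such as "50%"
    if math.isnan(value) or not 0.0 <= value <= 1.0:
        raise ValueError(f"threshold must be within [0.0, 1.0], got {raw!r}")
    return value


if __name__ == "__main__":
    for sample in (".5", "0.50", "50%", "1.5"):
        try:
            print(sample, "->", parse_threshold(sample))
        except ValueError as err:
            print(sample, "-> rejected:", err)
```

As strings, ".5" and "0.50" differ and "50%" looks plausible. As a bounded float, the first two are the same number and the third is an error.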