[06:24:27] Traffic, MobileFrontend, Operations, TechCom-RFC, Readers-Web-Backlog (Tracking): Remove .m. subdomain, serve mobile and desktop variants through the same URL - https://phabricator.wikimedia.org/T214998 (daniel) Accepted as an RFC
[07:32:25] Krinkle: I was about to answer that of course the object would be left as-is, but then checked with VTC and it's the other way round https://phabricator.wikimedia.org/P8197
[07:36:07] see how the X-Banana header was updated, and X-Potato inserted
[07:41:05] Krinkle: +1 for the meeting, this or EU afternoon would work for me, or tomorrow
[07:41:28] s/or EU/EU/ :)
[10:50:02] Traffic, netops, Operations, Patch-For-Review, Performance-Team (Radar): Anycast (Auth)DNS - https://phabricator.wikimedia.org/T98006 (jbond) Thanks for the response In the last option the anycast prefix should get more then 50% of the traffic due to the SRTT algorithm mentioned by bblack bu...
[10:54:18] randomly found by looking into cp2002's varnishncsa: there's some UA:Go-http-client/1.1 spamming codfw, we've sent ~ 11K 429 per second at peak
[10:54:33] https://grafana.wikimedia.org/d/myRmf1Pik/varnish-aggregate-client-status-codes?orgId=1&panelId=2&fullscreen&from=now-1h&to=now&var-site=codfw&var-cache_type=All&var-status_type=1&var-status_type=2&var-status_type=3&var-status_type=4&var-status_type=5
[10:54:51] err, I meant:
[10:54:58] https://grafana.wikimedia.org/d/myRmf1Pik/varnish-aggregate-client-status-codes?orgId=1&panelId=2&fullscreen&from=now-1h&to=now&var-site=codfw&var-cache_type=All&var-status_type=4
[11:02:08] they've sent 23.4K HEAD requests per second because why not
[11:13:39] I'm currently using cp2002 for manual testing of ATS, easy with a ssh tunnel:
[11:13:42] sudo setcap 'cap_net_bind_service=+ep' /usr/bin/ssh
[11:13:53] ssh -L 443:localhost:443 cp2002.codfw.wmnet
[11:14:31] oh, and then I've added `127.0.0.1 upload.wikimedia.org` to /etc/hosts of course
[14:00:00] ema: wow, thanks for checking those potatoes. That's good to see confirmed and rather unexpected!
[14:00:35] This means headers like Backend-Timing are likely wrong. In that they represent the time taken to compute the cheap 304
[14:00:41] Not the original.
[14:00:52] To clients, as part of a 200 response
[14:01:50] I can meet in an hour or so, today or tomorrow
[14:06:24] Krinkle: tentative meeting invite sent
[15:10:58] netops, Operations: eqiad - eqord Telia link down - IC-314533 - https://phabricator.wikimedia.org/T218307 (Dzahn)
[15:12:50] netops, Operations: eqiad - eqord Telia link down - IC-314533 - https://phabricator.wikimedia.org/T218307 (Dzahn) affected circuit: https://netbox.wikimedia.org/circuits/circuits/31/
[15:21:39] vgutierrez or ema hi, im wondering if gerrit.wikimedia.org can be added to phabricator.wikimedia.org CSP policy please? See https://phabricator.wikimedia.org/T218308 for details.
[15:28:10] Traffic, Gerrit, Operations, Phabricator: No longer possible to make CORS requests from Phabricator to Gerrit - https://phabricator.wikimedia.org/T218308 (Paladox)
[15:30:08] paladox: I'm not sure that's within our scope (traffic scope)
[15:30:16] oh
[15:30:29] but let me ask around :)
[15:30:34] it was done varnish side i think
[15:31:24] sure, we can set it on the varnish side, but the actual content of the policy is more related to the security team than to us IMHO
[15:31:51] ema: Thanks, wfm. Be right there.
[15:32:27] paladox: I've pinged them, so I'll let you know something as soon as I get an answer
[15:32:33] thanks!
[15:36:01] Traffic, Gerrit, Operations, Phabricator, Security-Team: No longer possible to make CORS requests from Phabricator to Gerrit - https://phabricator.wikimedia.org/T218308 (chasemp)
[15:37:36] paladox: so as chasemp
[15:37:47] confirmed me, it shoul be handled by our sec team :)
[15:37:47] thanks :)
[15:37:55] ok
[15:38:03] no problem
[15:46:14] netops, Operations: eqiad - eqord Telia link down - IC-314533 - https://phabricator.wikimedia.org/T218307 (Dzahn) I sent an email to Telia with the circuit ID and time. They responded saying "Its up now ? There is no scheduled maintenance from our side. " and i said No, it's still down, our monitoring sh...
[16:15:20] Traffic, netops, Operations, Patch-For-Review: Offload pings to dedicated server - https://phabricator.wikimedia.org/T190090 (ayounsi) My theory so far, until we can get confirmation from JTAC (as I can't find any doc confirming it or not), is that the firewall action `next-ip` can only be applie...
[17:07:15] netops, Operations: Management routers: filter traffic from external to junos-host - https://phabricator.wikimedia.org/T218234 (ayounsi) Open→Resolved All patched. No need for this task to be private anymore.
[17:07:20] netops, Operations: Management routers: filter traffic from external to junos-host - https://phabricator.wikimedia.org/T218234 (ayounsi)
[17:10:37] vgutierrez: i wonder what link to use for "HTTPS Unified ECDSA" / "HTTPS Unified ESA" checks. i guess if these start to alert then we just want "file a ticket with traffic team" (if it's the warning they will expire in < 30 days) or if it's worse then UBN and we are in big trouble and a runbook wont help :p
[17:12:04] Traffic, Gerrit, Operations, Phabricator, Security-Team: No longer possible to make CORS requests from Phabricator to Gerrit - https://phabricator.wikimedia.org/T218308 (Jdlrobson) Did something change regarding https://gerrit-review.googlesource.com/Documentation/config-gerrit.html#site.allo...
[17:12:19] so far that's right
[17:12:51] Traffic, Gerrit, Operations, Phabricator, Security-Team: No longer possible to make CORS requests from Phabricator to Gerrit - https://phabricator.wikimedia.org/T218308 (Jdlrobson) (and to be clear I'm only interested in read only requests here)
[17:12:52] when the time comes and that certificate is handled by acme-chief then a runbook should be available
[17:14:13] aha :)
[17:17:21] Traffic, Gerrit, Operations, Phabricator, Security-Team: No longer possible to make CORS requests from Phabricator to Gerrit - https://phabricator.wikimedia.org/T218308 (Dzahn) I don't see allowOriginRegex in our Gerrit config at all. That should mean "By default, unset, denying all cross-ori...
[17:21:03] Traffic, Gerrit, Operations, Phabricator, Security-Team: No longer possible to make CORS requests from Phabricator to Gerrit - https://phabricator.wikimedia.org/T218308 (Dzahn) It should be the CSP on the Phabricator side.
[17:26:44] ema: https://phabricator.wikimedia.org/T105657
[17:30:03] Krinkle: we might not be sure about the Age header, but the ticket is ~ 3.5 years old! :)
[17:38:39] Krinkle: shall we try this tomorrow? https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/496497/
[17:39:50] ema: OK. I will re-run the original data capture first, though. To see if we still have the 5min spikes.
[17:40:07] I can do that tomorrow, but then let's do this experiment Monday?
[17:40:40] sounds good! Also please remember adding X-Cache to your data
[17:40:45] Traffic, MediaWiki-ResourceLoader, Operations, Performance-Team, Patch-For-Review: Expires header for load.php should be relative to request time instead of cache time - https://phabricator.wikimedia.org/T105657 (Krinkle) a:Krinkle
[17:40:46] Yep
[17:40:51] I'm running it now
[19:00:04] netops, Operations: Bird multihop BFD - https://phabricator.wikimedia.org/T209989 (ayounsi) Followed up on the mailing list: > Junos uses the BGP multihop TTL value for BFD as well, and assumes the other side's default TTL is 255. > So if I do: > `lang=diff > [edit protocols bgp group Anycast4 multihop]...
[19:01:11] ema: with x-cache https://gist.github.com/Krinkle/4593a5d76a474927e52b5b9fa19585a1#file-01-output-v1-run1-tsv
[19:01:38] I mean https://gist.github.com/Krinkle/4593a5d76a474927e52b5b9fa19585a1#file-01-output-v2-run1-tsv
[19:04:49] ema: Two things I see, 1) In the case where be-age keeps wrong but http-age resets, it looks like it's a varnish-be conditional cache hit. The backend hit counter dropped, so it was renewed, but by someone else. 2) this time, I also got a an example where both drop at 300. And that was one where all of them had hit=1, which means I was the first one.
[19:05:41] whenever I get a response that starts with cp-some-backend (1), it was upto 300 before that. When it's higher, it went up to 600.
[19:05:49] netops, Operations: Bird multihop BFD - https://phabricator.wikimedia.org/T209989 (ayounsi) Open→Resolved All done here!
[19:06:19] meh, not every time. nevermind.
[19:44:30] netops, Operations, monitoring, Patch-For-Review: Juniper monitoring - https://phabricator.wikimedia.org/T83992 (ayounsi)
[19:55:57] mutante: any updates from Telia?