[07:05:14] Traffic, Operations: backport ipvsadm>=1.30 to buster-wikimedia or buster-backports - https://phabricator.wikimedia.org/T263788 (MoritzMuehlenhoff) JFTR; 1:1.31-1+deb10u1 is not an ideal version for an internal backport; better use 1:1.31-1~deb10u1 or 1:1.31-0+deb10u1 for future backports: If there's no...
[08:11:17] Traffic, Cloud-Services, Operations: cloudweb2001-dev: add TLS termination - https://phabricator.wikimedia.org/T263829 (ema)
[08:12:18] Traffic, Continuous-Integration-Infrastructure, Operations: contint.wikimedia.org: add TLS termination - https://phabricator.wikimedia.org/T263830 (ema)
[08:15:25] Traffic, Operations, serviceops: puppetmaster[12]001: add TLS termination - https://phabricator.wikimedia.org/T263831 (ema)
[08:25:43] HTTPS, Traffic, Operations, codfw-rollout: HTTPS for internal service traffic - https://phabricator.wikimedia.org/T108580 (ema) >>! In T108580#6488253, @BBlack wrote: > Do we need to clean these up in some new subtasks Yup, tasks created! > and/or implement some check to prevent adding new http...
[08:26:20] Traffic, Cloud-Services, Operations: cloudweb2001-dev: add TLS termination - https://phabricator.wikimedia.org/T263829 (ema)
[08:26:23] HTTPS, Traffic, Operations, codfw-rollout: HTTPS for internal service traffic - https://phabricator.wikimedia.org/T108580 (ema)
[08:26:44] Traffic, Continuous-Integration-Infrastructure, Operations: contint.wikimedia.org: add TLS termination - https://phabricator.wikimedia.org/T263830 (ema)
[08:26:47] HTTPS, Traffic, Operations, codfw-rollout: HTTPS for internal service traffic - https://phabricator.wikimedia.org/T108580 (ema)
[08:27:21] Traffic, Operations, serviceops: puppetmaster[12]001: add TLS termination - https://phabricator.wikimedia.org/T263831 (ema)
[08:27:24] HTTPS, Traffic, Operations, codfw-rollout: HTTPS for internal service traffic - https://phabricator.wikimedia.org/T108580 (ema)
[12:33:34] Traffic, Operations: backport ipvsadm>=1.30 to buster-wikimedia or buster-backports - https://phabricator.wikimedia.org/T263788 (CDanis) Ah, sorry, I didn't think that hard about sort order. It's been many, many years now since I had an @debian.org email address.
[13:29:27] Traffic, Operations, Patch-For-Review: Discarded VCL files stuck in auto/busy state cause high number of backend probe requests - https://phabricator.wikimedia.org/T236754 (ema) The upgrade to Varnish 6 (T263557) seems to have fixed this, or at least I could not reproduce the problem by issuing vari...
[15:17:59] Traffic, CheckUser, Operations: Log source port for anonymous users and expose it for sysops/checkusers - https://phabricator.wikimedia.org/T181368 (eranroz) It is desirable when there are trolls using ISPs which use CGN (maybe other cases) - I think this is quite rare case - but when it is required...
[16:49:43] was there a plan or an idea to move port 80 traffic from varnishfe to ats-tls?
[16:49:49] (and is that still likely to happen?)
[16:56:13] that's probably on pause as we rethink tls termination
[16:56:49] but in general, yes, the plan has always been to eventually get port 80 off of v-fe, and into something simple that only knows how to emit 301/403 as appropriate (which might be the tls terminator, or some other simple daemon)
[16:57:24] I think we have yet to complete tracking down all of the edge cases to even make it possible to try the transition, though.
[16:58:12] my recent no-phab-ticket patch here: https://gerrit.wikimedia.org/r/c/operations/puppet/+/629156 is intended to help track down the remaining internal cases.
[16:58:23] I think, based on what I can see in webrequest, that they're rare now
[16:59:01] nod
[16:59:14] historically we've allowed WMF source IPs to fake-set XFP:https while reaching us on port 80
[16:59:40] yeah, with Envoy being widely-deployed internally, and often used as a 'local' TLSifying sidecar for calling other services
[16:59:45] that seems much less necessary to allow
[16:59:48] historically we also had a bunch of non-canonical and not-https-protected junk domains that landed on v-fe:80 too, but I think vg already moved those away with the ncredir work
[17:00:53] when I was looking at the data recently and made that XFP-logging patch, I also trolled through our puppet repo and found two extant cases from config:
[17:00:56] https://gerrit.wikimedia.org/r/c/operations/puppet/+/629096
[17:01:02] https://gerrit.wikimedia.org/r/c/operations/puppet/+/629095
[17:01:43] basically once we're reasonably sure the xfp-faking port 80 users are gone, we should be able to make it universally redirect/deny all traffic (from VCL), and then it's trivial to just move the port/traffic to another daemon
[17:03:54] seems the simple approach is unlikely to work in the 629096 case, so we may have to do something trickier there to convince python to validate the cert
[17:04:07] (does it have some equivalent to curl's --resolve workaround?)
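Editor's note on the --resolve question: one way to get a similar effect in Python without touching DNS is sketched below, using only the standard library rather than the requests library mentioned in the channel. The class name `ResolvedHTTPSConnection` and the `connect_to` parameter are hypothetical, chosen for illustration; the sketch assumes the goal is to open the TCP connection to a fixed address while keeping SNI, the Host header, and certificate validation tied to the logical hostname.

```python
import http.client
import socket
import ssl


class ResolvedHTTPSConnection(http.client.HTTPSConnection):
    """HTTPS connection that dials `connect_to` but validates the cert for `host`.

    Hypothetical helper: roughly the effect of `curl --resolve host:443:connect_to`.
    """

    def __init__(self, host, connect_to, port=443, timeout=10.0, context=None):
        # The default context verifies the chain and checks the hostname.
        self._tls_context = context or ssl.create_default_context()
        super().__init__(host, port=port, timeout=timeout, context=self._tls_context)
        self._connect_to = connect_to

    def connect(self):
        # TCP goes to the pinned address instead of whatever DNS says for `host`.
        raw = socket.create_connection((self._connect_to, self.port), self.timeout)
        # SNI and SAN matching both use the logical hostname, not the address.
        self.sock = self._tls_context.wrap_socket(raw, server_hostname=self.host)
```

Usage would look like `conn = ResolvedHTTPSConnection("foo.example", connect_to="192.0.2.1")` followed by `conn.request("GET", "/")`; `http.client` fills in `Host: foo.example` on its own, so the certificate is still checked against "foo.example" and nothing has to fall back to `verify=False`.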
[17:04:31] anyways, none of it's urgent at present
[17:05:28] netops, Operations, homer: Homer: Netbox driven switch interfaces - https://phabricator.wikimedia.org/T250429 (crusnov)
[17:07:18] verify=False is an easy workaround for python requests, not ideal ofc
[17:10:43] we can also subclass HTTPAdapter and override the cert_verify method
[17:11:04] probably a useful thing to have around tbh, we make a bunch of use of the requests library
[17:13:04] there should be (maybe is?) a flag to check the cert san against the host header rather than the hostname/ip being connected-to
[17:13:34] it's not an uncommon case, to say you want to connect to 192.0.2.1, but use "Host: Foo" and expect a SAN match for "Foo", for tooling like this
[17:14:51] or there's what I was calling the --resolve method we used to use a lot with "curl" - just software-inject fake DNS for the context of one request, and say that Foo resolves to 192.0.2.1.
[17:15:10] yeah, I use the --resolve method in curl a lot, it's surprising that it isn't more widespread in other tools
[17:23:55] okay, this is actually quite hard to override sensibly :)
[17:24:04] I think verify=False is the way to go
[17:59:09] Traffic, Operations, Platform Team Initiatives (API Gateway), Platform Team Sprints Board (Sprint 4), and 2 others: Client Developer has a cookie-free API call - https://phabricator.wikimedia.org/T258748 (WDoranWMF)
[18:02:02] bblack: do you happen to know where the magic numbers used in VRT_GetHdr/VRT_SetHdr calls come from?
[18:02:54] const struct gethdr_s hdr = { HDR_REQ, "\014X-Client-IP:" };
[18:03:00] oh
[18:03:15] oh my god it's the length in octal
[18:03:23] okay.
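Editor's note on the octal prefix: "X-Client-IP:" is 12 characters including the colon, and 12 is 014 in octal, which is where the `\014` in the pasted struct comes from. A minimal sketch that builds such a literal for any header name follows. The function name `gethdr_literal` is hypothetical, and the sketch assumes the prefix is a single byte holding the length of the name plus its trailing colon, as in the line pasted at 18:02:54.

```python
def gethdr_literal(header_name: str) -> str:
    """Return the C string literal for a length-prefixed header name."""
    # The stored field includes the trailing colon, e.g. "X-Client-IP:".
    field = header_name if header_name.endswith(":") else header_name + ":"
    length = len(field.encode("ascii"))
    if length > 255:
        raise ValueError("header name too long for a one-byte length prefix")
    # Always emit three octal digits so the C escape cannot swallow a
    # following character that happens to be an octal digit.
    return '"\\{:03o}{}"'.format(length, field)
```

For the header from the log, `gethdr_literal("X-Client-IP")` produces the text `"\014X-Client-IP:"`.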
[18:08:33] Traffic, Operations, serviceops, Patch-For-Review: Applayer services without TLS - https://phabricator.wikimedia.org/T210411 (hashar)
[18:08:37] Traffic, Continuous-Integration-Infrastructure, Operations: contint.wikimedia.org: add TLS termination - https://phabricator.wikimedia.org/T263830 (hashar)
[18:10:56] would like to merge a change to profile::cache::base that is just part of general code cleanup / puppet6-compat. it compiles as NOOP and has a review and i'm pretty confident but .. it's still cache::base and Friday. https://gerrit.wikimedia.org/r/c/operations/puppet/+/623662/8/modules/profile/manifests/cache/base.pp#10
[18:12:02] only thing that changes is $log_slow_request_threshold becomes an actual float instead of a string. it's used in varnishslowlog.systemd.erb
[18:18:12] cdanis: :)
[18:43:26] Traffic, Continuous-Integration-Infrastructure, Operations: contint.wikimedia.org: add TLS termination - https://phabricator.wikimedia.org/T263830 (Dzahn) This was basically all done: https://gerrit.wikimedia.org/r/c/operations/puppet/+/591000 tlsproxy::envoy: allow limiting firewall srange https://...
[20:38:20] Anyone about who might be able to help with some caching in front of https://releases.wikimedia.org ?
[20:43:18] Basically, I want to purge/invalidate the cache for a handful of files under https://releases.wikimedia.org/mediawiki/1.31/
[20:46:02] bblack: ^^
[20:46:03] sure Reedy
[20:46:05] ahah
[20:46:17] I was refraining from pinging you cdanis as you're /away ;P
[20:46:46] https://releases.wikimedia.org/mediawiki/1.31/mediawiki-1.31.9.patch.gz https://releases.wikimedia.org/mediawiki/1.31/mediawiki-1.31.9.patch.gz.sig https://releases.wikimedia.org/mediawiki/1.31/mediawiki-1.31.10.patch.gz and https://releases.wikimedia.org/mediawiki/1.31/mediawiki-1.31.10.patch.gz.sig
[20:47:00] you happened to catch me catching up on backlog after I got a bit tired of printf-debugging inline C code running inside VCL running inside Varnish running inside Vagrant ;)
[20:47:17] Reedy: did you try https://wikitech.wikimedia.org/wiki/Multicast_HTCP_purging#One-off_purge ?
[20:47:35] I didn't
[20:47:37] I think you should be able to do that
[20:47:40] Yeah, I can
[20:47:56] it's not 100% clear when that works for "non wiki" stuff
[20:48:12] yeah, fair
[20:48:20] I think it should work for the canonical form of any URL on the CDN
[20:48:47] Traffic, Operations, Wikimedia-General-or-Unknown, User-DannyS712: Pages whose title ends with semicolon (;) are intermittently inaccessible - https://phabricator.wikimedia.org/T238285 (Krinkle) Simple repro: ` krinkle@people1002$ echo -e "Hello world.\n" > 'foo;' krinkle@people1002$ cat foo\; H...
[20:48:54] last-modified: Fri, 25 Sep 2020 19:21:45 GMT
[20:49:01] that's the same as it was before..
[20:49:38] so interestingly it *did* purge
[20:49:48] < age: 5
[20:49:50] < x-cache: cp2027 miss, cp2037 hit/1
[20:49:54] but when I refetched just now
[20:49:56] < age: 0
[20:49:58] < x-cache: cp2027 miss, cp2037 miss
[20:50:46] (and now ofc it is cached again)
[21:00:07] Reedy: do you still need help? I haven't traced the ats-be director map to figure out what internal service backs releases.wm.o but my best guess rn is that there's caching behind varnish/ats
[21:01:06] if you can, before I send an email it'd be appreciated
[21:01:07] https://wikitech.wikimedia.org/wiki/Apache_Traffic_Server#Forcing_a_cache_miss_(similar_to_ban)
[21:01:16] so, a purge also purges ATS
[21:01:41] you only need the above when you want to hit something larger than a list of a handful of specific URLs
[21:02:04] (like the example shown that forces refetches for all of itwiki)
[21:02:59] hmm... did that purge actually fix it?
[21:03:12] I didn't compare file contents
[21:03:25] downloading the file gives the same md5sum as the source file on releases1002
[21:03:36] so maybe the last-modified is wrong but the contents are right?
[21:03:56] indeed
[21:04:14] that's unfortunate as it looks like the etag is also based on the mtime (or at least, I didn't see it change)
[21:05:13] Reedy: https://phabricator.wikimedia.org/P12801 ? looks right
[21:06:00] Reedy: did you mean to move the -fixups over the originals?
[21:11:01] nope
[21:11:19] the originals were moved to .old
[21:11:25] replacement versions were uploaded
[21:11:34] and -fixups was for an edge case
[21:14:53] https://phabricator.wikimedia.org/P12802
[21:15:26] Plus the third party reporter has confirmed it's right now
[21:15:43] I thought we were past the days that purgeList.php would purge random things.. apparently not!
[21:15:48] #nostalgia :)
[21:17:14] ok!
[21:17:16] so all set?
[21:21:55] yeah, wfm thanks :)
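Editor's note on reading the purge result: the two `x-cache` values pasted around 20:49 are what showed that the purge took effect (a hit on cp2037 before, a miss on both hops on the refetch). A small sketch for checking that programmatically is below. The `host status[/count]` layout is inferred only from those two pasted lines, so other status words or extra fields are an assumption this parser does not cover, and the names `CacheHop`, `parse_x_cache`, and `served_from_cache` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class CacheHop(NamedTuple):
    host: str             # cache host name, e.g. "cp2037"
    status: str           # e.g. "hit" or "miss"
    hits: Optional[int]   # number after the slash, if any


def parse_x_cache(value: str) -> List[CacheHop]:
    """Split an x-cache header value into one CacheHop per cache layer."""
    hops = []
    for part in value.split(","):
        fields = part.split()
        if len(fields) != 2:
            raise ValueError("unexpected x-cache entry: {!r}".format(part))
        host, verdict = fields
        status, _, count = verdict.partition("/")
        hops.append(CacheHop(host, status, int(count) if count else None))
    return hops


def served_from_cache(value: str) -> bool:
    """True if any hop reported a hit."""
    return any(hop.status == "hit" for hop in parse_x_cache(value))
```

With the values from the log, `served_from_cache("cp2027 miss, cp2037 hit/1")` is true and `served_from_cache("cp2027 miss, cp2037 miss")` is false, matching the before and after states of the purge.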