[05:07:16] Traffic, Operations, Patch-For-Review: ATS fails to log the used SSLCurve when the SSL session is being reused - https://phabricator.wikimedia.org/T234011 (Vgutierrez) After upgrading to 8.0.5-1wm9 cp5001 reports properly the EC used on reused sessions: ` - ReqHeader X-CP-TLS-Version: TLSv1.2...
[08:56:39] netops, Operations, ops-eqiad: (Need By: Sept 30) upgrade msw1-eqiad from EX4200 to EX4300 - https://phabricator.wikimedia.org/T225121 (faidon) Stalled→Open p:Normal→High What's the status of this? It seems like this migration is in some limbo state :) As far as I understand it: - Ol...
[09:18:11] Traffic, Operations: cp3032 and cp3040 occasional failed fetches - https://phabricator.wikimedia.org/T235736 (jijiki)
[10:35:40] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache text nodes - https://phabricator.wikimedia.org/T227432 (ops-monitoring-bot) Script wmf-auto-reimage was launched by ema on cumin1001.eqiad.wmnet for hosts: ` ['cp4027.ulsfo.wmnet'] ` The log can be found in `/var/log/wm...
[11:04:56] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache text nodes - https://phabricator.wikimedia.org/T227432 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['cp4027.ulsfo.wmnet'] ` Of which those **FAILED**: ` ['cp4027.ulsfo.wmnet'] `
[11:41:19] Traffic, Operations: Improve ATS prometheus metrics - https://phabricator.wikimedia.org/T231533 (Vgutierrez)
[11:41:22] Traffic, Operations: ATS fails to log the used SSLCurve when the SSL session is being reused - https://phabricator.wikimedia.org/T234011 (Vgutierrez) Open→Resolved a:Vgutierrez
[14:09:45] Traffic, Operations, Patch-For-Review: Provide an easy way of picking the traffic serving TLS certificate used by ATS - https://phabricator.wikimedia.org/T234803 (BBlack) Notes from IRC, etc: The current patch (merging shortly: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/541220/ ) gets us...
[14:15:11] Acme-chief, Traffic, Operations: Decide/document criteria needed to serve acme-chief LE issued unified certificate to end users - https://phabricator.wikimedia.org/T230687 (BBlack) >>! In T230687#5422646, @BBlack wrote: > @Vgutierrez may have some ideas about how to tackle these, but it's behind othe...
[14:15:24] Acme-chief, Traffic, Operations: Decide/document criteria needed to serve acme-chief LE issued unified certificate to end users - https://phabricator.wikimedia.org/T230687 (BBlack)
[14:15:27] Traffic, Operations, Patch-For-Review: Provide an easy way of picking the traffic serving TLS certificate used by ATS - https://phabricator.wikimedia.org/T234803 (BBlack)
[17:36:00] Traffic, Operations, Patch-For-Review: Renew Digicert Unified in 2019 - https://phabricator.wikimedia.org/T209515 (BBlack) Open→Resolved Digicert-2019 is now in live use at the `esams` edge and we have full normal redundancy (for now) among commercial cert vendors. Random status update on ot...
[18:12:08] Traffic, Operations, Phabricator, Release-Engineering-Team-TODO, and 2 others: Prepare Phame to support heavy traffic for a Tech Department blog - https://phabricator.wikimedia.org/T226044 (srodlund) Hey all -- I am currently seeking some answers to some basic infrastructure questions. Unfortunat...
[20:29:16] so I've created a problem for myself
[20:29:35] cp1075 has something it shouldn't cached for grafana-beta (probably because I made some requests before my hiera change had been deployed)
[20:42:56] https://phabricator.wikimedia.org/P9383
[20:43:00] I must have done something truly idiotic
[21:13:31] cdanis: need something wiped out?
[21:13:57] bblack: first, I am going to see what ATS does when I apply https://gerrit.wikimedia.org/r/544023
[21:14:24] I got clued into the config being wrong when I saw a reply for a cache miss that also had a bunch of appserver-returned headers :)
[21:17:35] is there an easy way to query a particular cp backend, as if you were varnish?
[21:18:19] just hit port 3128 directly from localhost
[21:18:35] oh okay, I was trying 8443 and that wasn't it
[21:19:53] oh and use XFP
[21:19:57] curl -v -H 'X-Forwarded-Proto: https' http://en.wikipedia.org:3128/wiki/Main_Page --resolve en.wikipedia.org:3128:127.0.0.1
[21:20:06] ^ like this, from cp1075 itself, to hit cp1075 varnish-be
[21:20:52] 443 / 8443 are the two different TLS terminators while we're in transition (nginx and ATS, depending on which is active on a given node for 443)
[21:21:09] makes sense
[21:21:32] and varnish-fe listens on 80, and all of 3120-3127
[21:21:36] and varnish-be listens on 3128
[21:22:10] (and ats-be also uses 3128 like varnish-be, when applicable)
[21:24:11] (the other missing piece: setting the Host: header manually, so it doesn't get Host: grafana-beta...:3128)
[21:24:57] is it expected that ATS serves 'miss' instead of 'pass'?
[21:25:13] I don't know :)
[21:25:27] fair enough :)
[21:25:48] I'm not sure that ATS is supposed to be used for 'misc' traffic like grafana yet anyways
[21:25:54] that may be one of the missing reasons why
[21:26:38] right now the ATS backends that are deployed "live" for cache_text are only used (by the frontends) for a limited subset (the actual mediawiki / restbase / etc traffic, not the 50-something other misc services)
[21:26:59] really? I'm seeing grafana-beta traffic get routed via cp1075
[21:27:25] that was the last I heard anyways
[21:27:32] re: miss/pass, maybe you need a never-cache entry?
[21:27:55] - primary_destination: dest_host value: grafana1001.eqiad.wmnet action: never-cache
[21:28:01] ^ there's one for grafana1001, but not 1002
[21:28:05] ah
[21:28:07] thanks!
[21:28:32] it's further down in the same hieradata/.../backend.yaml
[21:39:13] thanks again bblack! looks like no cache wipeouts needed
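The backend-query recipe from the discussion above (hit port 3128 on localhost, send X-Forwarded-Proto, and set the Host header manually so it does not carry the port) can be collected into one small helper. This is a minimal sketch, not an existing tool: the function name `cp_backend_curl_cmd` and its argument order are hypothetical, and it assumes the backend listens on 127.0.0.1:3128 as described in the chat. It only prints the curl command so it can be inspected before being run on a cp host.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a curl command that queries the
# local cache backend the way a frontend would.
# Assumptions taken from the chat: backend (varnish-be or ats-be) on port 3128,
# X-Forwarded-Proto must be set, and Host must be set without the port.
cp_backend_curl_cmd() {
    host="$1"            # site hostname, e.g. en.wikipedia.org
    path="${2:-/}"       # request path, defaults to /
    port="${3:-3128}"    # backend port, defaults to 3128
    printf "curl -v -H 'X-Forwarded-Proto: https' -H 'Host: %s' --resolve %s:%s:127.0.0.1 http://%s:%s%s\n" \
        "$host" "$host" "$port" "$host" "$port" "$path"
}

# Example: show the command for the Main_Page request quoted in the log.
cp_backend_curl_cmd en.wikipedia.org /wiki/Main_Page
```

The explicit `-H 'Host: ...'` is there because curl would otherwise send `Host: <name>:3128` for a URL with a non-default port, which is the "other missing piece" mentioned at 21:24:11.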