[07:56:04] Traffic, Operations: Monitor and plot TTFB as seen by Varnish frontends - https://phabricator.wikimedia.org/T240180 (ema)
[07:56:15] Traffic, Operations: Monitor and plot TTFB as seen by Varnish frontends - https://phabricator.wikimedia.org/T240180 (ema) p:Triage→Normal
[07:58:17] Traffic, Operations, Performance-Team: 15% response start regression starting around 2019-11-11 - https://phabricator.wikimedia.org/T238494 (ema) Analysis repeated right now capturing requests for 60s. The numbers don't look as bad. p75 TTFB in milliseconds: |**host**| **hit** | **miss** | **pass**...
[08:23:55] Traffic, Operations: Investigate trafficserver-tls crash on cp3064 - https://phabricator.wikimedia.org/T240183 (ema)
[08:23:58] Traffic, Operations: Investigate trafficserver-tls crash on cp3064 - https://phabricator.wikimedia.org/T240183 (ema) p:Triage→Normal
[08:38:03] Traffic, Operations, Performance-Team: 15% response start regression starting around 2019-11-11 - https://phabricator.wikimedia.org/T238494 (Gilles) We're still seeing an extra 100-150ms on the p75 TTFB reported by clients in Europe compared to before 11/11. Only 20ms of which can be attributed to TLS.
[14:07:28] Traffic, Operations, ops-esams: cp3053 is unreachable - https://phabricator.wikimedia.org/T239041 (ema) Open→Resolved The host has now been up with the new firmware with no issues for one week. Closing for now, we can re-open if needed.
[14:07:29] Traffic, Operations: servers freeze across the caching cluster - https://phabricator.wikimedia.org/T238305 (ema)
[14:34:48] Acme-chief, Traffic, Operations, Patch-For-Review: memory leak on keyholder-proxy on buster/python 3.7 - https://phabricator.wikimedia.org/T239386 (Volans) So far so good, leaving it open for another week or two to ensure the issue is totally fixed.
[19:16:25] Traffic, Operations, observability: Varnish traffic drop alert @ codfw is noisy / codfw incoming traffic is spikey - https://phabricator.wikimedia.org/T239039 (CDanis) Open→Resolved a:CDanis Looking at some data in grafana explore, this would have solved most cases of noise in the past fe...
[21:48:13] ema: for tomorrow, re: the cookie/vary-related lua, a few things: (1) the session|token match isn't the same as Varnish, and in some ways that might be important. Varnish does regex ([Ss]ession|Token)= whereas lua is looking for case-insensitive strings session or token anywhere, which is a broader set. particularly lacking the = is going to match some cookies it shouldn't...
[21:49:06] ema: (2) Vary is similar, in that we should be looking only for cookie, but it will match things like "Vary: Nocookiesplease"
[21:52:01] ema: (3) One way to restore the missing hits on the read side of this (session cookie detected, but the URI doesn't vary and should've been hittable) might be to move that check down to do_global_cache_lookup_complete() (after we've looked for a hit in cache), and then there, if we have both an incoming session cookie and the hit object has a stored Vary: Cookie header, we do <magic I can't fathom> to ignore the cache object we just found and treat this as pass.
[21:52:14] bah too long a line!
[22:26:50] Traffic, Operations: Clean up DNS server puppetization - https://phabricator.wikimedia.org/T240285 (BBlack)
[22:27:22] Traffic, netops, Operations, Patch-For-Review, Performance-Team (Radar): Anycast AuthDNS - https://phabricator.wikimedia.org/T98006 (BBlack)
[22:27:23] Traffic, Operations: Clean up DNS server puppetization - https://phabricator.wikimedia.org/T240285 (BBlack)
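
A rough illustration of points (1) and (2) from the 21:48-21:49 messages, written in Python rather than the Lua or VCL actually in use. It only contrasts the regex quoted above, ([Ss]ession|Token)=, with a case-insensitive substring search, and a per-field Vary comparison with a substring search. All function names and sample header values are hypothetical, and the substring functions are an approximation of the behaviour described in the log, not the real Lua code.

```python
import re

# The Varnish-side pattern as quoted in the log: ([Ss]ession|Token)=
VARNISH_COOKIE_RE = re.compile(r"([Ss]ession|Token)=")


def varnish_style_cookie_match(cookie_header: str) -> bool:
    """True if the header contains 'session=', 'Session=' or 'Token='."""
    return VARNISH_COOKIE_RE.search(cookie_header) is not None


def broad_cookie_match(cookie_header: str) -> bool:
    """Assumed approximation of the broader check described in the log:
    case-insensitive 'session' or 'token' anywhere, with no '=' required."""
    lowered = cookie_header.lower()
    return "session" in lowered or "token" in lowered


def vary_has_cookie(vary_header: str) -> bool:
    """Compare each comma-separated Vary field against 'cookie' exactly
    (case-insensitively), instead of searching for a substring."""
    return any(field.strip().lower() == "cookie"
               for field in vary_header.split(","))


def vary_substring_match(vary_header: str) -> bool:
    """Substring search; this is the kind of check that would also
    match a value such as 'Nocookiesplease'."""
    return "cookie" in vary_header.lower()


if __name__ == "__main__":
    # Hypothetical cookie headers, for illustration only.
    for cookie in ("enwikiSession=abc", "centralauth_Token=xyz", "notoken_banner=1"):
        print(cookie, varnish_style_cookie_match(cookie), broad_cookie_match(cookie))
    for vary in ("Accept-Encoding, Cookie", "Nocookiesplease"):
        print(vary, vary_has_cookie(vary), vary_substring_match(vary))
```

With these sample values, the two cookie checks disagree only on the third cookie, and the two Vary checks disagree only on "Nocookiesplease", which is the over-matching the messages warn about.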