[00:29:26] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: Q1:eqiad:frack network upgrade tracking task - https://phabricator.wikimedia.org/T371435#10381923 (Jclark-ctr) Open→Resolved
[04:26:40] FIRING: VarnishHighThreadCount: Varnish's thread count on cp5020:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5020 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[04:36:40] RESOLVED: VarnishHighThreadCount: Varnish's thread count on cp5020:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5020 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[09:05:06] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547#10382536 (cmooney) >>! In T344547#9301201, @cmooney wrote: > One other observation is that the MED setting does not optimize the outbound path where we are us...
[09:16:45] Traffic, Data-Engineering (Q2 2024 October 1st - December 31th), Patch-For-Review: Rollout haproxykafka on all hosts - https://phabricator.wikimedia.org/T378578#10382559 (Fabfur)
[10:32:50] Traffic, WMF-General-or-Unknown: Misleading error message when accessing an invalid URL at upload.wikimedia.org - https://phabricator.wikimedia.org/T381232#10382828 (A_smart_kitten) Adding to #Traffic per the note on the #Varnish tag
[10:51:54] netops, Infrastructure-Foundations, SRE: Export routes generated from ARP/ND in EVPN - https://phabricator.wikimedia.org/T329369#10382861 (cmooney) Huh so I've been looking at some of these old tasks while working on the Nokia testing. It's clear in the above the before / after are both the AFTE...
[12:25:29] Dear traffic, I will start retiring kafka-main1005 in eqiad, I promise, it will be the last one
[12:48:09] 👍
[16:30:28] dear traffic, I think this is done
[16:30:38] thanks!
[16:30:57] danke!
[18:04:29] Traffic, SRE: Upgrade pdns-recursor to 5.x on all prod DNS hosts (all C:dnsrecursor and so possibly WMCS) - https://phabricator.wikimedia.org/T381608 (ssingh) NEW
[18:04:34] Traffic, SRE: Upgrade pdns-recursor to 5.x on all prod DNS hosts (all C:dnsrecursor and so possibly WMCS) - https://phabricator.wikimedia.org/T381608#10384526 (ssingh) p:Triage→Low
[22:31:31] Traffic, DC-Ops, ops-esams, ops-magru, SRE: CPU temperature issues in cp hosts - https://phabricator.wikimedia.org/T373993#10385144 (BCornwall) Unfortunately, it appears that we're still having throttling issues in magru: ` brett@cumin2002:~$ sudo -i cumin 'A:cp' 'zgrep "Core temperature is...
[23:55:22] Traffic, DC-Ops, ops-esams, ops-magru, SRE: CPU temperature issues in cp hosts - https://phabricator.wikimedia.org/T373993#10385350 (BCornwall) Some observations: * [[ https://grafana.wikimedia.org/goto/_53fKoVHR?orgId=1 | magru has the highest average CPU temperature by site yet the lowest...