[02:22:58] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO:Switch refresh diagram - https://phabricator.wikimedia.org/T408511#11643245 (Papaul) Both switches are now running version 25.10.2. Still can not get the Cookbook sre.network.tls to pass on asw1-23-ulsfo.
[09:43:54] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11644063 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1003 for host cp2045.codfw.wmnet with OS trixie
[10:20:55] hello, I'm starting to set up a new VM on the internal network behind the CDN in https://phabricator.wikimedia.org/T317179
[10:34:47] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11644305 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1003 for host cp2045.codfw.wmnet with OS trixie executed with errors: - cp2045 (**FAIL**) - Removed from Puppet and PuppetDB if pr...
[10:35:18] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11644309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1003 for host cp2045.codfw.wmnet with OS trixie
[10:36:19] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11644318 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1003 for host cp2045.codfw.wmnet with OS trixie executed with errors: - cp2045 (**FAIL**) - Removed from Puppet and PuppetDB if pr...
[10:36:40] hi federico3, did you already start the step outlined in the wiki page?
[10:48:02] no, just deploying the VM
[11:13:29] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11644433 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1003 for host cp2045.codfw.wmnet with OS trixie
[13:53:46] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11645092 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1003 for host cp2045.codfw.wmnet with OS trixie executed with errors: - cp2045 (**FAIL**) - Removed from Puppet and PuppetDB if pr...
[14:29:46] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11645349 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1003 for host cp2045.codfw.wmnet with OS trixie
[14:34:33] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11645393 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1003 for host cp2045.codfw.wmnet with OS trixie executed with errors: - cp2045 (**FAIL**) - Removed from Puppet and PuppetDB if pr...
[14:37:03] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11645416 (SLyngshede-WMF) {F72316811} I'm getting an RAID error `mdadm: cannot open /dev/sdb2: No such file or directory` So disk configuration probably isn't right
[14:57:25] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026-02-13 - 2026-03-06), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11645599 (BTullis) Just a data point. We're still seeing an ever-increasing value for these open soc...
[15:44:55] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11646257 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1003 for host cp2045.codfw.wmnet with OS trixie
[15:47:10] Traffic, Data-Engineering, Data-Engineering-Icebox, Product Safety and Integrity, and 3 others: Include User-Agent Client Hints in WebRequest logs - https://phabricator.wikimedia.org/T337947#11646291 (Dreamy_Jazz)
[15:56:50] Traffic: Reimage cp20[43-58] to Trixie - https://phabricator.wikimedia.org/T418161#11646472 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1003 for host cp2045.codfw.wmnet with OS trixie executed with errors: - cp2045 (**FAIL**) - Removed from Puppet and PuppetDB if pr...
[16:00:09] FIRING: [2x] LVSHighCPU: The host lvs1020:9100 has at least its CPU 32 saturated - https://bit.ly/wmf-lvscpu - https://alerts.wikimedia.org/?q=alertname%3DLVSHighCPU
[16:05:09] RESOLVED: [2x] LVSHighCPU: The host lvs1020:9100 has at least its CPU 32 saturated - https://bit.ly/wmf-lvscpu - https://alerts.wikimedia.org/?q=alertname%3DLVSHighCPU
[16:34:27] Traffic, Data-Engineering, MW-Interfaces-Team, MediaWiki-Platform-Team (Radar), OKR-Work: haproxy: capture x-wmf-* headers in webrequest data set - https://phabricator.wikimedia.org/T417864#11646998 (Vgutierrez) >>! In T417864#11632385, @Clement_Goubert wrote: > In the merged task, I was prop...
[16:36:40] Traffic, Data-Persistence, MediaViewer, SRE-swift-storage, and 2 others: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11647019 (MatthewVernon) I spent quite a bit of time with codesearch last quarter trying to track down thumbnail size (ab)use, but...
[16:47:18] Traffic, Data-Persistence, MediaViewer, SRE-swift-storage, and 2 others: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11647087 (Tacsipacsi) >>! In T414805#11640558, @Ladsgroup wrote: > The point is that in order to be cached, it need to have a miss...
[17:38:36] Traffic: HAProxy fails to start due to overly-large cpu-map argument - https://phabricator.wikimedia.org/T418182#11647475 (BCornwall) Open→Resolved a:BCornwall
[17:38:43] FIRING: HaproxyKafkaNoMessages: Unexpected rate of produced HaproxyKafka messages by cp2043 - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaNoMessages - https://grafana.wikimedia.org/d/d3e4e37c-c1d9-47af-9aad-a08dae2b3fd5/haproxykafka?orgId=1&var-site=codfw&var-instance=cp2043 - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaNoMessages
[17:42:47] hm
[17:43:43] FIRING: HaproxyKafkaNoMessages: Unexpected rate of produced HaproxyKafka messages by cp2043 - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaNoMessages - https://grafana.wikimedia.org/d/d3e4e37c-c1d9-47af-9aad-a08dae2b3fd5/haproxykafka?orgId=1&var-site=codfw&var-instance=cp2043 - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaNoMessages
[17:46:08] I suspect this because I curl'd it - hopefully this will die down soon
[18:46:21] Traffic, SRE, Patch-For-Review: Offer AuthDNS service over IPv6 - https://phabricator.wikimedia.org/T81605#11647903 (ssingh) Open→Resolved a:ssingh With the rollout of ns[02] IPv6 glue records today, we have IPv6 support on all ns[0-2].wikimedia.org. There is some more work here: we h...
[18:52:06] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO:Switch refresh diagram - https://phabricator.wikimedia.org/T408511#11647948 (Papaul) a:Papaul→ayounsi
[21:22:28] Traffic, Fundraising-Backlog, Fundraising-Tech-Roadmap, MediaWiki-extensions-CentralNotice, SRE: Set expiry time for GeoIP cookies - https://phabricator.wikimedia.org/T122097#11648602 (AKanji-WMF)
[21:43:43] FIRING: HaproxyKafkaNoMessages: Unexpected rate of produced HaproxyKafka messages by cp2043 - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaNoMessages - https://grafana.wikimedia.org/d/d3e4e37c-c1d9-47af-9aad-a08dae2b3fd5/haproxykafka?orgId=1&var-site=codfw&var-instance=cp2043 - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaNoMessages