[04:09:38] netops, Infrastructure-Foundations, ServiceOps new, SRE: Eqiad: lsw1-c2-eqiad BGP maintenance/ Tuesday 17th at 9:30 CDT - https://phabricator.wikimedia.org/T420158 (Papaul) NEW
[04:16:11] netops, Infrastructure-Foundations, ServiceOps new, SRE: Eqiad: lsw1-c7-eqiad BGP maintenance/ Thursday 19th at 10:00 am CDT - https://phabricator.wikimedia.org/T420159 (Papaul) NEW
[07:25:12] netops, Infrastructure-Foundations, ServiceOps new, SRE: Eqiad: lsw1-c2-eqiad BGP maintenance/ Tuesday 17th at 9:30 CDT - https://phabricator.wikimedia.org/T420158#11712110 (ayounsi) ` an-master1003: skipping host (Make sure the redundant master is active.) an-worker1220: skipping host (no depool...
[07:44:13] netops, Infrastructure-Foundations, ServiceOps new, SRE: Eqiad: lsw1-c7-eqiad BGP maintenance/ Thursday 19th at 10:00 am CDT - https://phabricator.wikimedia.org/T420159#11712154 (ayounsi) ` alert1002: Couldn't get or parse depool Hiera key an-worker1151: skipping host (no depool needed) an-worker...
[07:57:36] netops, Infrastructure-Foundations, ServiceOps new, SRE: Eqiad: lsw1-c2-eqiad BGP maintenance/ Tuesday 17th at 9:30 CDT - https://phabricator.wikimedia.org/T420158#11712178 (MoritzMuehlenhoff)
[08:18:16] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: ATS: align ATS and Gerrit Apache timeouts to reenable connection re-use - https://phabricator.wikimedia.org/T417998#11712238 (ABran-WMF) Before merging [[ https://gerrit.wikimedia.org/r/c/operations/puppet/+/...
[08:28:43] Traffic, MW-1.46-notes (1.46.0-wmf.19; 2026-03-10), OKR-Work, Patch-For-Review, Test Kitchen (Experiment Platform Sprint 21): Test the impact of incremental increase in traffic for cache splitting experiments - https://phabricator.wikimedia.org/T407570#11712278 (Sfaci)
[08:29:51] Traffic, MW-1.46-notes (1.46.0-wmf.19; 2026-03-10), OKR-Work, Patch-For-Review, Test Kitchen (Experiment Platform Sprint 21): Test the impact of incremental increase in traffic for cache splitting experiments - https://phabricator.wikimedia.org/T407570#11712280 (Sfaci)
[08:36:29] netops, Infrastructure-Foundations, ServiceOps new, SRE: Eqiad: lsw1-c2-eqiad BGP maintenance/ Tuesday 17th at 9:30 CDT - https://phabricator.wikimedia.org/T420158#11712298 (taavi)
[08:36:59] netops, Infrastructure-Foundations, ServiceOps new, SRE: Eqiad: lsw1-c7-eqiad BGP maintenance/ Thursday 19th at 10:00 am CDT - https://phabricator.wikimedia.org/T420159#11712299 (taavi)
[08:45:02] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: ATS: align ATS and Gerrit Apache timeouts to reenable connection re-use - https://phabricator.wikimedia.org/T417998#11712311 (ABran-WMF) the httpd config update to align httpd on ATS has also been applied to...
[09:13:07] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: ATS: align ATS and Gerrit Apache timeouts to reenable connection re-use - https://phabricator.wikimedia.org/T417998#11712379 (ABran-WMF) The httpd config update has been applied to all hosts. The CDN config...
[09:13:33] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: ATS: align ATS and Gerrit Apache timeouts to reenable connection re-use - https://phabricator.wikimedia.org/T417998#11712381 (ABran-WMF)
[09:22:32] netops, Infrastructure-Foundations, ServiceOps new, SRE, Data-Platform-SRE (2026-03-06 - 2026-03-27): Eqiad: lsw1-c2-eqiad BGP maintenance/ Tuesday 17th at 9:30 CDT - https://phabricator.wikimedia.org/T420158#11712410 (Gehel)
[09:22:43] netops, Infrastructure-Foundations, ServiceOps new, SRE, Data-Platform-SRE (2026-03-06 - 2026-03-27): Eqiad: lsw1-c7-eqiad BGP maintenance/ Thursday 19th at 10:00 am CDT - https://phabricator.wikimedia.org/T420159#11712412 (Gehel)
[09:24:14] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: ATS: align ATS and Gerrit Apache timeouts to reenable connection re-use - https://phabricator.wikimedia.org/T417998#11712420 (ABran-WMF) In progress→Resolved Configs have been applied to the prima...
[09:49:07] netops, Infrastructure-Foundations, ServiceOps new, SRE, Data-Platform-SRE (2026-03-06 - 2026-03-27): Eqiad: lsw1-c2-eqiad BGP maintenance/ Tuesday 17th at 9:30 CDT - https://phabricator.wikimedia.org/T420158#11712621 (cmooney) Can we hold off on any work related to this? I am planning to dr...
[09:49:26] netops, Infrastructure-Foundations, ServiceOps new, SRE, Data-Platform-SRE (2026-03-06 - 2026-03-27): Eqiad: lsw1-c7-eqiad BGP maintenance/ Thursday 19th at 10:00 am CDT - https://phabricator.wikimedia.org/T420159#11712623 (cmooney) Can we hold off on any work related to this? I am planning...
[09:49:57] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: ATS: align ATS and Gerrit Apache timeouts to reenable connection re-use - https://phabricator.wikimedia.org/T417998#11712639 (ABran-WMF) Resolved→Open this change applied to the primary instance has c...
[09:53:04] netops, Infrastructure-Foundations, ServiceOps new, SRE: Nokia SR-Linux DHCP Relay Bug - https://phabricator.wikimedia.org/T411054#11712668 (cmooney)
[10:11:46] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: ATS: align ATS and Gerrit Apache timeouts to reenable connection re-use - https://phabricator.wikimedia.org/T417998#11712751 (ABran-WMF)
[10:32:52] Traffic, collaboration-services, Gerrit, Release-Engineering-Team: Gerrit: Debug connection re-use on Gerrit's httpd causing Gerrit interface to be very slow - https://phabricator.wikimedia.org/T420189 (ABran-WMF) NEW
[10:35:06] Traffic, Chinese-Sites: Images on the main pages of wmf private wikis are broken - https://phabricator.wikimedia.org/T419313#11712871 (Aklapper) Cannot reproduce a problem here. When I am logged out, the two mentioned pages load the URIs `//upload.wikimedia.org/wikipedia/commons/thumb/c/c9/Permission_log...
[10:57:10] Traffic, Chinese-Sites: Images on the main pages of wmf private wikis are broken - https://phabricator.wikimedia.org/T419313#11712932 (1F616EMO) Seems like the page fixed itself. Probably a cache problem then.
[11:04:09] FIRING: [2x] LVSHighCPU: The host lvs1016:9100 has at least its CPU 22 saturated - https://bit.ly/wmf-lvscpu - https://alerts.wikimedia.org/?q=alertname%3DLVSHighCPU
[11:09:13] RESOLVED: [2x] LVSHighCPU: The host lvs1016:9100 has at least its CPU 22 saturated - https://bit.ly/wmf-lvscpu - https://alerts.wikimedia.org/?q=alertname%3DLVSHighCPU
[11:50:04] Traffic, Chinese-Sites: Images on the main pages of wmf private wikis are broken - https://phabricator.wikimedia.org/T419313#11713163 (Aklapper) Open→Resolved
[12:05:42] Traffic, Data-Platform-SRE (2026-03-06 - 2026-03-27): Prevent HaproxykafkaNoMessages alerts from being generated due to standard maintenance operations - https://phabricator.wikimedia.org/T419829#11713213 (BTullis) Open→Resolved
[12:08:40] netops, Infrastructure-Foundations, ServiceOps new, SRE: Nokia SR-Linux DHCP Relay Bug - https://phabricator.wikimedia.org/T411054#11713232 (cmooney) >>! In T411054#11696665, @BTullis wrote: > Will all of the switches in rows C & D be getting this configuration change? Yes we need to fix it on a...
[13:36:23] Traffic, MW-1.46-notes (1.46.0-wmf.19; 2026-03-10), OKR-Work, Patch-For-Review, Test Kitchen (Experiment Platform Sprint 21): Test the impact of incremental increase in traffic for cache splitting experiments - https://phabricator.wikimedia.org/T407570#11713642 (ssingh) >>! In T407570#1170181...
[13:49:52] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: ATS: align ATS and Gerrit Apache timeouts to reenable connection re-use - https://phabricator.wikimedia.org/T417998#11713686 (ABran-WMF) Open→Resolved closing that task, we've aligned ATS and http...
[13:50:31] Traffic, collaboration-services, Gerrit, Release-Engineering-Team: Gerrit: Debug connection re-use on Gerrit's httpd causing Gerrit interface to be very slow - https://phabricator.wikimedia.org/T420189#11713692 (ABran-WMF) p:Triage→Medium
[13:51:00] FIRING: [6x] AnycastHealthcheckerRestarted: anycast-healthchecker service restarted on doh1002:9100 - https://wikitech.wikimedia.org/wiki/Anycast#Anycast_healthchecker_not_running - https://alerts.wikimedia.org/?q=alertname%3DAnycastHealthcheckerRestarted
[13:51:07] ^ reboots
[13:51:28] downtiming is not working across the fleet for some reason
[13:51:44] I am wondering if it is due to the number of the reboots underway
[13:56:00] RESOLVED: [6x] AnycastHealthcheckerRestarted: anycast-healthchecker service restarted on doh1002:9100 - https://wikitech.wikimedia.org/wiki/Anycast#Anycast_healthchecker_not_running - https://alerts.wikimedia.org/?q=alertname%3DAnycastHealthcheckerRestarted
[14:05:38] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: ULSFO: Update ULSFO LVS service IP's - https://phabricator.wikimedia.org/T418971#11713777 (ssingh) @Papaul / @ayounsi : Patches should be ready for this. Any preference for the day of (this) week for when we should do this?
[14:25:04] Traffic, SRE: Anycast ns[01].wikimedia.org for IPv4 - https://phabricator.wikimedia.org/T366193#11713908 (ssingh) >>! In T366193#11686108, @cmooney wrote: > @ssingh in terms of the IPv6 anycast plans what is the current situation? > > I notice some patches like [[ https://gerrit.wikimedia.org/r/c/operat...
[14:27:53] netops, Infrastructure-Foundations, ServiceOps new, SRE, Data-Platform-SRE (2026-03-06 - 2026-03-27): Eqiad: lsw1-c2-eqiad BGP maintenance/ Tuesday 17th at 9:30 CDT - https://phabricator.wikimedia.org/T420158#11713919 (cmooney) p:Triage→Low
[14:28:02] netops, Infrastructure-Foundations, ServiceOps new, SRE, Data-Platform-SRE (2026-03-06 - 2026-03-27): Eqiad: lsw1-c7-eqiad BGP maintenance/ Thursday 19th at 10:00 am CDT - https://phabricator.wikimedia.org/T420159#11713920 (cmooney) p:Triage→Low
[14:29:00] Traffic, collaboration-services, Gerrit, Release-Engineering-Team: Gerrit: Debug connection re-use on Gerrit's httpd causing Gerrit interface to be very slow - https://phabricator.wikimedia.org/T420189#11713922 (ABran-WMF) From https://logstash.wikimedia.org/goto/01f21e6cccb2c9c7ba4c45b422ac089b:...
[14:31:30] FIRING: [5x] AnycastHealthcheckerRestarted: anycast-healthchecker service restarted on doh4002:9100 - https://wikitech.wikimedia.org/wiki/Anycast#Anycast_healthchecker_not_running - https://alerts.wikimedia.org/?q=alertname%3DAnycastHealthcheckerRestarted
[14:35:02] Traffic, SRE: Startup failure for Bird on new durum hosts - https://phabricator.wikimedia.org/T419868#11713955 (ssingh) That's interesting, thanks for debugging. What is weird is that a restart of anycast-healthchecker then should have fixed this in theory?
[14:36:15] RESOLVED: [5x] AnycastHealthcheckerRestarted: anycast-healthchecker service restarted on doh4002:9100 - https://wikitech.wikimedia.org/wiki/Anycast#Anycast_healthchecker_not_running - https://alerts.wikimedia.org/?q=alertname%3DAnycastHealthcheckerRestarted
[14:57:21] Traffic, collaboration-services, Gerrit, Release-Engineering-Team: Gerrit: Debug connection re-use on Gerrit's httpd causing Gerrit interface to be very slow - https://phabricator.wikimedia.org/T420189#11714028 (ABran-WMF) the interaction between the CDN and Gerrit did not created a burst of 5xx...
[17:11:26] Traffic, Patch-For-Review: Wikimedia Commons: incorrect 429 responses for thumbnail errors - https://phabricator.wikimedia.org/T419663#11714766 (neriah) p:Triage→Medium a:neriah
[17:11:35] Traffic, Patch-For-Review: Wikimedia Commons: incorrect 429 responses for thumbnail errors - https://phabricator.wikimedia.org/T419663#11714768 (neriah) p:Medium→High
[17:20:00] FIRING: SystemdUnitFailed: wmf_auto_restart_purged.service on cp4037:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:36:47] Traffic, MW-1.46-notes (1.46.0-wmf.19; 2026-03-10), OKR-Work, Patch-For-Review, Test Kitchen (Experiment Platform Sprint 21): Test the impact of incremental increase in traffic for cache splitting experiments - https://phabricator.wikimedia.org/T407570#11714880 (Sfaci)
[17:39:40] Traffic: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11714893 (BCornwall)
[18:23:58] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: ULSFO: Update ULSFO LVS service IP's - https://phabricator.wikimedia.org/T418971#11715178 (Papaul) @ssingh thank you let me get back with you tomorrow. I have to double check some things in Netbox.
[18:26:09] Traffic: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715192 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cdobbins@cumin2002 for host cp6001.drmrs.wmnet with OS trixie
[18:27:07] Traffic: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715204 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cdobbins@cumin2002 for host cp6002.drmrs.wmnet with OS trixie
[19:12:27] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715412 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cdobbins@cumin2002 for host cp6001.drmrs.wmnet with OS trixie completed: - cp6001 (**PASS**) - Downtimed on Icinga/Alertma...
[19:13:19] Traffic, cloud-services-team, SRE Observability, Patch-For-Review: Move wikimediastatus.net 301 to ncredir - https://phabricator.wikimedia.org/T419887#11715415 (ssingh) Thanks for the task and the patch @colewhite. We will discuss this in Traffic and follow up here or on the CR itself.
[19:17:36] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715425 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cdobbins@cumin2002 for host cp6002.drmrs.wmnet with OS trixie completed: - cp6002 (**PASS**) - Downtimed on Icinga/Alertma...
[19:20:59] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715435 (BCornwall)
[20:15:21] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715651 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cdobbins@cumin2002 for host cp6003.drmrs.wmnet with OS trixie completed: - cp6003 (**PASS**) - Downtimed on Icinga/Alertma...
[20:20:29] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715666 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cdobbins@cumin2002 for host cp6004.drmrs.wmnet with OS trixie completed: - cp6004 (**PASS**) - Downtimed on Icinga/Alertma...
[20:21:24] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715667 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cdobbins@cumin2002 for host cp6005.drmrs.wmnet with OS trixie
[20:22:45] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715670 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cdobbins@cumin2002 for host cp6006.drmrs.wmnet with OS trixie
[20:30:00] FIRING: [3x] SystemdUnitFailed: wmf_auto_restart_purged.service on cp4037:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:33:40] FIRING: [3x] SystemdUnitFailed: wmf_auto_restart_purged.service on cp4037:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:34:23] RESOLVED: ErrorBudgetBurn: varnish-combined codfw - https://slo.wikimedia.org/?search=varnish-combined - https://alerts.wikimedia.org/?q=alertname%3DErrorBudgetBurn
[20:34:47] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715716 (BCornwall)
[20:35:45] ^looking at the purged issue
[20:37:40] Considering this was after a reboot and the error was "Service purged not present or not running", I'm assuming some ordering/race condition issue with the auto_restart stuff. I'm going to ignore as a one-off fluke and will look deeper into that silly service if it happens again
[20:37:54] (restarted wmf_auto_restart_purged.service)
[20:37:59] yep sounds good. thanks for looking
[20:38:40] RESOLVED: [3x] SystemdUnitFailed: wmf_auto_restart_purged.service on cp4037:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:45:23] FIRING: [2x] ErrorBudgetBurn: varnish-combined codfw - https://slo.wikimedia.org/?search=varnish-combined - https://alerts.wikimedia.org/?q=alertname%3DErrorBudgetBurn
[20:50:42] Traffic, decommission-hardware: Decommission codfw cp hosts cp2027-cp2040 - https://phabricator.wikimedia.org/T419753#11715796 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by brett@cumin2002 for hosts: `cp[2027-2040].codfw.wmnet` - cp2027.codfw.wmnet (**PASS**) - Downtimed host on Ici...
[20:59:55] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: ATS: align ATS and Gerrit Apache timeouts to reenable connection re-use - https://phabricator.wikimedia.org/T417998#11715846 (Dzahn) Cool! Since timeouts between ATS and httpd have been aligned; now woul...
[21:11:08] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715888 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cdobbins@cumin2002 for host cp6005.drmrs.wmnet with OS trixie completed: - cp6005 (**PASS**) - Downtimed on Icinga/Alertma...
[21:12:49] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715920 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cdobbins@cumin2002 for host cp6007.drmrs.wmnet with OS trixie
[21:15:27] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715931 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cdobbins@cumin2002 for host cp6006.drmrs.wmnet with OS trixie completed: - cp6006 (**PASS**) - Downtimed on Icinga/Alertma...
[21:17:34] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715948 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cdobbins@cumin2002 for host cp6008.drmrs.wmnet with OS trixie
[21:20:44] Traffic, MediaWiki-File-management, Patch-For-Review: Wikimedia Commons: incorrect 429 responses for thumbnail errors - https://phabricator.wikimedia.org/T419663#11715956 (neriah)
[21:25:34] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11715962 (CDobbins)
[21:31:06] Traffic, Data-Persistence, MediaViewer, SRE-swift-storage, and 5 others: MediaViewer (and the commons file page) should serve WebP originals not thumbnails of equivalent size - https://phabricator.wikimedia.org/T418745#11716017 (matmarex) Two of the patches from this task (and their backports...
[21:40:55] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11716063 (BCornwall)
[21:45:16] Traffic, MW-Interfaces-Team, ServiceOps new, Epic, and 3 others: Epic: Enforce API rate limits (WE5.1.3c) - https://phabricator.wikimedia.org/T412585#11716106 (matmarex)
[21:45:23] RESOLVED: ErrorBudgetBurn: varnish-combined codfw - https://slo.wikimedia.org/?search=varnish-combined - https://alerts.wikimedia.org/?q=alertname%3DErrorBudgetBurn
[21:58:22] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11716135 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cdobbins@cumin2002 for host cp6008.drmrs.wmnet with OS trixie executed with errors: - cp6008 (**FAIL**) - Downtimed on Ici...
[22:02:29] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11716161 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cdobbins@cumin2002 for host cp6008.drmrs.wmnet with OS trixie
[22:04:03] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11716166 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cdobbins@cumin2002 for host cp6007.drmrs.wmnet with OS trixie completed: - cp6007 (**PASS**) - Downtimed on Icinga/Alertma...
[22:30:14] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11716213 (Papaul) Removed interface et-0/0/1.1221 from both routers and cleanup all reference for sandbox1-ulsfo in Netbox ` - unit 1221 { - descr...
[22:33:53] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: ULSFO: Update ULSFO LVS service IP's - https://phabricator.wikimedia.org/T418971#11716224 (Papaul) @ssingh I double check all for the new prefix 198.35.26.224/27 in Netbox all looks good. You can make your changes any time. Plea...
[22:44:19] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: ULSFO: Update ULSFO LVS service IP's - https://phabricator.wikimedia.org/T418971#11716272 (Papaul)
[22:46:04] netops, Infrastructure-Foundations, ServiceOps new, SRE, Data-Platform-SRE (2026-03-06 - 2026-03-27): Eqiad: lsw1-c2-eqiad BGP maintenance/ Tuesday 17th at 9:30 CDT - https://phabricator.wikimedia.org/T420158#11716280 (colewhite)
[22:49:38] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11716293 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cdobbins@cumin2002 for host cp6008.drmrs.wmnet with OS trixie completed: - cp6008 (**PASS**) - Downtimed on Icinga/Alertma...
[22:52:24] netops, Infrastructure-Foundations, SRE: Update esams network pop diagrams - https://phabricator.wikimedia.org/T368084#11716296 (Papaul) Open→Resolved Both diagrams for esams are now up to date. Closing this.
[22:57:23] FIRING: ErrorBudgetBurn: varnish-combined codfw - https://slo.wikimedia.org/?search=varnish-combined - https://alerts.wikimedia.org/?q=alertname%3DErrorBudgetBurn
[23:35:36] Traffic, Patch-For-Review: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11716409 (BCornwall)