[00:28:12] Traffic, Hiddenparma, Patch-For-Review: Introduce known-client identity objects and integrate with requestctl - https://phabricator.wikimedia.org/T403220#11248244 (Scott_French)
[08:24:59] Traffic, SRE, MediaWiki-Platform-Team (Radar): Have CDN edge set the `X-Request-Id` header for incoming external requests - https://phabricator.wikimedia.org/T221976#11248799 (Vgutierrez) Do we know what the current behavior is for layers that set `X-Request-ID`? The usual approach for HAProxy is to...
[08:51:47] Traffic, Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Patch-For-Review: Disable LVS paging for WDQS - https://phabricator.wikimedia.org/T406141#11248916 (Gehel)
[09:07:42] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: cr2-eqiad: fan failure on left tray [Oct 2025] - https://phabricator.wikimedia.org/T406554 (cmooney) NEW p:Triage→High
[09:14:37] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, and 2 others: Remove lvs1018 L2 link to ssw1-e1-eqiad - https://phabricator.wikimedia.org/T405499#11249042 (cmooney) @BCornwall I'm hoping to make progress on this one, can you review the gerrit patch when you have a moment? In terms of how...
[09:48:06] Traffic, DC-Ops, ops-codfw, SRE, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11249202 (elukey) @Jhancock.wm tried again, then reset the idrac on 2056, re-run again but same error :( I've reset the IDRAC for cp2052 and I was able to up...
[10:31:46] Traffic, Infrastructure-Foundations, SRE, vm-requests: eqiad: 2 VM request for hCaptcha - https://phabricator.wikimedia.org/T406166#11249385 (cmooney) @ssingh FYI I ran the //sre.dns.netbox// cookbook just now as it alerted on being a diff, it removed the entries for hcaptcha1001. The VM doesn't...
[10:38:34] Traffic, Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Patch-For-Review: Disable LVS paging for WDQS - https://phabricator.wikimedia.org/T406141#11249404 (LSobanski) Now that the other endpoints were added, is there anything else that needs to happen before the patch is deployed?
[10:40:29] vgutierrez: Ok I think the best approach to the group routing for the APIs is actually to have a set matching domains to groups, https://gerrit.wikimedia.org/r/c/operations/puppet/+/1193903
[10:40:51] It's an ~900 element table
[11:03:16] 👀 🍿
[13:07:47] Traffic, Data-Platform-SRE (2025.09.26 - 2025.10.17), Essential-Work, Patch-For-Review: Disable LVS paging for WDQS - https://phabricator.wikimedia.org/T406141#11249920 (ssingh) >>! In T406141#11249404, @LSobanski wrote: > Now that the other endpoints were added, is there anything else that needs...
[13:09:55] Traffic, Infrastructure-Foundations, SRE, vm-requests: eqiad: 2 VM request for hCaptcha - https://phabricator.wikimedia.org/T406166#11249922 (ssingh) >>! In T406166#11249385, @cmooney wrote: > @ssingh FYI I ran the //sre.dns.netbox// cookbook just now as it alerted on being a diff, it removed the...
[13:25:25] Traffic, Infrastructure-Foundations, SRE, vm-requests: codfw: 2 VM request for hCaptcha - https://phabricator.wikimedia.org/T406167#11249998 (ssingh) Open→Resolved a:ssingh `hcaptcha200[1-2].wikimedia.org` are ready.
[13:33:59] Traffic, DC-Ops, ops-codfw, SRE, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11250019 (Jhancock.wm) @elukey fixed cp2050. opening a ticket for cp2056
[13:39:52] Traffic, SRE, MediaWiki-Platform-Team (Radar): Have CDN edge set the `X-Request-Id` header for incoming external requests - https://phabricator.wikimedia.org/T221976#11250057 (CDanis) >>! In T221976#11248799, @Vgutierrez wrote: > The usual approach for HAProxy is to generate a UUID, append it to the...
[13:47:35] netops, Traffic, Infrastructure-Foundations, SRE: lvs1020: reimage to move primary IP from private1-d-eqiad to private1-d7-eqiad vlan - https://phabricator.wikimedia.org/T405630#11250100 (cmooney) I'm actually not sure if this is going to be a possibility. Unfortunately the Nokia SR-Linux platfo...
[13:48:19] netops, Traffic, DC-Ops, Infrastructure-Foundations, and 2 others: lvs1019: reimage to move primary IP from private1-c-eqiad to private1-c7-eqiad vlan - https://phabricator.wikimedia.org/T405632#11250105 (cmooney) See T405630#11250099, I'm not sure this will be possible.
[14:17:51] vgutierrez: do you have opinions on the group routing approach? I know we don't like leaking mediawiki-config concepts into ATS, but that would only be there until API calls are all centralised through the rest-gateway and get removed afterwards
[14:19:38] it seems to make the test suite (on the whole) about .001 second slower on average which I'm not sure is significant
[14:47:12] Traffic, DC-Ops, ops-codfw, SRE, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11250491 (elukey) I was able to provision and upgrade idrac+bios on 2050, thanks!
[14:53:22] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: cr2-eqiad: fan failure on left tray [Oct 2025] - https://phabricator.wikimedia.org/T406554#11250521 (VRiley-WMF) Yes, it seems like there is an issue with the fan, it is showing the warning lights for the fan. Is it okay to proceed w...
[15:01:41] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw:frack:rack/install/configuration new switches in rack F5 - https://phabricator.wikimedia.org/T405618#11250579 (Papaul)
[15:02:03] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw:frack:rack/install/configuration new switches in rack F5 - https://phabricator.wikimedia.org/T405618#11250581 (Papaul)
[15:19:47] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11250715 (cmooney)
[15:36:50] Traffic, DC-Ops, ops-eqiad, SRE: eqiad row C/D Traffic host migrations - https://phabricator.wikimedia.org/T405623#11250820 (RobH) @BCornwall, I wanted to get your feedback on this as we start the migrations within a couple of weeks. With my understanding of the cp cluster, my proposal is to...
[15:42:06] Traffic, DC-Ops, ops-eqiad, SRE: eqiad row C/D Traffic host migrations - https://phabricator.wikimedia.org/T405623#11250832 (cmooney) >>! In T405623#11250820, @RobH wrote: > LVS: These two hosts are a bit more tricky as I'm guessing we need to fully depool an lvs host before we touch its network....
[15:46:42] Traffic, DC-Ops, ops-eqiad, SRE: eqiad row C/D Traffic host migrations - https://phabricator.wikimedia.org/T405623#11250846 (RobH)
[15:47:12] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11250847 (RobH)
[15:47:17] netops, Traffic, Infrastructure-Foundations, SRE: Eqiad row C/D switch refresh: LVS changes to support migration - https://phabricator.wikimedia.org/T405602#11250848 (RobH)
[15:47:47] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11250849 (RobH)
[15:47:49] netops, Traffic, Infrastructure-Foundations, SRE: Eqiad row C/D switch refresh: LVS changes to support migration - https://phabricator.wikimedia.org/T405602#11250850 (RobH)
[16:13:42] Traffic, Infrastructure-Foundations, SRE, vm-requests: codfw: 2 VM request for hCaptcha - https://phabricator.wikimedia.org/T406167#11251056 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by sukhe@cumin1003 for hosts: `hcaptcha1002.wikimedia.org` - hcaptcha1002.wikimedia.org (...
[17:37:46] brett: lvs1020 seems to be idle right now, so myself and Valerie were gonna proceed with the steps to move its link to row E/F to the other switch (T404959)
[17:37:47] T404959: Move lvs1020 link from ssw1-f1-eqiad to ssw1-e1-eqiad - https://phabricator.wikimedia.org/T404959
[17:37:54] that seem ok to you?
[17:38:21] topranks: Thanks for the heads up. I was just reviewing the messages
[17:38:31] sounds good to me
[17:39:20] yeah this one is only network-side changes so low-touch in terms of the host
[17:39:38] we'll kick off shortly and I'll ping you when done if there is anything you want to double check
[17:40:49] Thank you!
[19:33:50] Traffic: Create an alert for depooled cp hosts - https://phabricator.wikimedia.org/T406641 (CDobbins) NEW
[19:34:47] Traffic: Create an alert for depooled cp hosts - https://phabricator.wikimedia.org/T406641#11251929 (CDobbins)
[19:47:18] Traffic, Beta-Cluster-Infrastructure, Documentation: Create a runbook for troubleshooting the CDN in deployment-prep - https://phabricator.wikimedia.org/T390213#11251967 (bd808) https://cheatsheet.krishnaneupane.com/posts/haproxy has been helpful as more blocking has moved to haproxy. In our deployme...
[20:22:27] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Main Rollout] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11252202 (BCornwall)
[20:42:47] Traffic, Beta-Cluster-Infrastructure: Copy the Traffic team on alerts for deployment-cache* hosts - https://phabricator.wikimedia.org/T406650 (bd808) NEW
[20:44:00] Traffic, Beta-Cluster-Infrastructure: Copy the Traffic team on alerts for deployment-cache* hosts - https://phabricator.wikimedia.org/T406650#11252285 (bd808)
[20:50:03] netops, Traffic, DC-Ops, Infrastructure-Foundations, and 2 others: Move lvs1020 link from ssw1-f1-eqiad to ssw1-e1-eqiad - https://phabricator.wikimedia.org/T404959#11252319 (wiki_willy) a:cmooney→VRiley-WMF
[20:50:35] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: cr2-eqiad: fan failure on left tray [Oct 2025] - https://phabricator.wikimedia.org/T406554#11252323 (wiki_willy) a:cmooney→VRiley-WMF
[20:52:36] Traffic, Beta-Cluster-Infrastructure: Copy the Traffic team on alerts for deployment-cache* hosts - https://phabricator.wikimedia.org/T406650#11252325 (bd808) @Aklapper what do you think about using Herald for this sort of thing? It would need to be a global rule something like: {F66736932,size=full} I...
[20:57:19] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: cr2-eqiad: fan failure on left tray [Oct 2025] - https://phabricator.wikimedia.org/T406554#11252329 (cmooney) >>! In T406554#11250521, @VRiley-WMF wrote: > Yes, it seems like there is an issue with the fan, it is showing the warning...
[21:16:15] Traffic, Beta-Cluster-Infrastructure: Copy the Traffic team on alerts for deployment-cache* hosts - https://phabricator.wikimedia.org/T406650#11252371 (ssingh) @bd808 and I discussed this today and decided to split this up in two parts: - Starting immediately and once we get alerts about broken Puppet r...
[21:33:32] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: cr2-eqiad: fan failure on left tray [Oct 2025] - https://phabricator.wikimedia.org/T406554#11252463 (VRiley-WMF) Hey @cmooney I just checked the filter, and it looked clean. I also reseated the fans as well, however it still is showi...
[21:39:51] Traffic, Beta-Cluster-Infrastructure: Copy the Traffic team on alerts for deployment-cache* hosts - https://phabricator.wikimedia.org/T406650#11252502 (Aklapper) @bd808: I do not know how/where alertmanager/@wmcs-alerts currently sets the #Beta-Cluster-Infrastructure tag and task title for the Phab task...
[22:04:25] Traffic, Beta-Cluster-Infrastructure: Copy the Traffic team on alerts for deployment-cache* hosts - https://phabricator.wikimedia.org/T406650#11252612 (bd808) >>! In T406650#11252502, @Aklapper wrote: > @bd808: I do not know how/where alertmanager/@wmcs-alerts currently sets the #Beta-Cluster-Infrastruct...
[22:21:55] Traffic, MediaWiki-Platform-Team (Radar), Patch-For-Review, User-notice: [Main Rollout] Enable unified mobile routing on remaining wikis - https://phabricator.wikimedia.org/T403510#11252658 (Krinkle)
[23:08:27] Traffic, Phabricator, Release-Engineering-Team (Radar): Phabricator videos fail in Firefox ("Range" request gets 503 from Varnish) - https://phabricator.wikimedia.org/T397661#11252788 (MusikAnimal) >>! In T397661#11244371, @Krinkle wrote: > Yes, it remains consistently broken in Firefox on Mac and Wi...
[23:15:33] Traffic, Beta-Cluster-Infrastructure: Copy the Traffic team on alerts for deployment-cache* hosts - https://phabricator.wikimedia.org/T406650#11252792 (bd808) >>! In T406650#11252612, @bd808 wrote: >>>! In T406650#11252502, @Aklapper wrote: >> I don't know the skillset of @Maintenance_bot which could be...