[07:42:55] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: sre.k8s.wipe-cluser JSONDecodeError in kubectl_version() - https://phabricator.wikimedia.org/T406200 (JMeybohm) NEW
[07:43:36] jelto:^ did you fix the alternative entries manually yesterday? If so, mind adding that?
[07:45:07] I did not touch the alternatives manually
[07:48:59] hm, interesting. Maybe c.lem did. Because now 1.31 is the highest priority alternative on all ctrl nodes
[07:57:05] yes I'm also a bit surprised, I can confirm 1.31 is used on the eqiad control plane nodes. Maybe the apt-alternatives is not working immediately in already running shells/processes?
[07:57:05] yesterday dcausse mentioned he was also on 1.23 on the deploy host but after a logout/login 1.31 was used.
[07:57:16] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: kube-scheduler failed to start during sre.k8s.wipe-cluster - https://phabricator.wikimedia.org/T406201 (JMeybohm) NEW
[07:57:34] serviceops, collaboration-services, Prod-Kubernetes, Data-Platform-SRE (2025.09.26 - 2025.10.17), and 3 others: Update wikikube eqiad to kubernetes 1.31 - https://phabricator.wikimedia.org/T405703#11235920 (JMeybohm)
[09:32:40] jayme: I didn't either
[09:32:47] (touch alternatives)
[09:55:09] how odd
[09:56:18] claime: anything else that went sideways after I left?
[09:56:32] nope all good
[09:56:37] cool
[10:20:35] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: charlie wiped cluster redeployment use-case - https://phabricator.wikimedia.org/T406212 (Clement_Goubert) NEW
[10:20:39] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: charlie wiped cluster redeployment use-case - https://phabricator.wikimedia.org/T406213 (Clement_Goubert) NEW
[10:32:18] serviceops, collaboration-services, Prod-Kubernetes, Data-Platform-SRE (2025.09.26 - 2025.10.17), and 3 others: Update wikikube eqiad to kubernetes 1.31 - https://phabricator.wikimedia.org/T405703#11236421 (Clement_Goubert)
[10:55:50] serviceops, DC-Ops, ops-eqiad, SRE: eqiad row C/D Service Ops host migrations - https://phabricator.wikimedia.org/T405950#11236473 (Clement_Goubert) Open→In progress p:Triage→Medium a:Kappakayala→Clement_Goubert `wikikube-ctrl1001` is waiting for decom/derack and can proba...
[13:02:30] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: sre.k8s.wipe-cluser JSONDecodeError in kubectl_version() - https://phabricator.wikimedia.org/T406200#11236724 (JMeybohm) p:Triage→High
[13:02:34] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: kube-scheduler failed to start during sre.k8s.wipe-cluster - https://phabricator.wikimedia.org/T406201#11236725 (JMeybohm) p:Triage→Medium
[13:23:27] serviceops, Prod-Kubernetes, Kubernetes: kube-scheduler failed to start during sre.k8s.wipe-cluster - https://phabricator.wikimedia.org/T406201#11236831 (LSobanski)
[13:23:33] serviceops, Prod-Kubernetes, Kubernetes: charlie wiped cluster redeployment use-case - https://phabricator.wikimedia.org/T406212#11236832 (LSobanski)
[13:23:43] serviceops, Prod-Kubernetes, Kubernetes: charlie wiped cluster redeployment use-case - https://phabricator.wikimedia.org/T406213#11236837 (LSobanski)
[14:05:48] serviceops, MW-Interfaces-Team (MWI-Sprint-19 (2025-09-23 to 2025-10-07)), OKR-Work: Execute test plan for rest gateway rerouting for rest.php requests and report findings - https://phabricator.wikimedia.org/T405368#11237028 (HCoplin-WMF) @aaron can you please rerun the tests today to see if there is...
[14:31:21] serviceops, Prod-Kubernetes, Kubernetes: charlie wiped cluster redeployment use-case - https://phabricator.wikimedia.org/T406213#11237143 (Edgars2007) duplicate of https://phabricator.wikimedia.org/T406212 ?
[14:59:40] serviceops, collaboration-services, Prod-Kubernetes, Data-Platform-SRE (2025.09.26 - 2025.10.17), and 3 others: Update wikikube eqiad to kubernetes 1.31 - https://phabricator.wikimedia.org/T405703#11237249 (hashar)
[15:05:53] serviceops, Prod-Kubernetes, Kubernetes: charlie wiped cluster redeployment use-case - https://phabricator.wikimedia.org/T406213#11237277 (Clement_Goubert) Yeah phab double posted it for some reason
[15:06:07] serviceops, Prod-Kubernetes, Kubernetes: charlie wiped cluster redeployment use-case - https://phabricator.wikimedia.org/T406213#11237291 (Clement_Goubert) →Duplicate dup:T406212
[15:06:10] serviceops, Prod-Kubernetes, Kubernetes: charlie wiped cluster redeployment use-case - https://phabricator.wikimedia.org/T406212#11237293 (Clement_Goubert)
[15:12:35] serviceops, DC-Ops, ops-codfw, SRE, Patch-For-Review: Q1:rack/setup/install wikikube-ctrl2006 - https://phabricator.wikimedia.org/T400661#11237310 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1002 for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
[16:12:44] serviceops, collaboration-services, Prod-Kubernetes, Data-Platform-SRE (2025.09.26 - 2025.10.17), and 3 others: Update wikikube eqiad to kubernetes 1.31 - https://phabricator.wikimedia.org/T405703#11237591 (RLazarus) >>! In T405703#11234563, @RLazarus wrote: >>>! In T405703#11233419, @Clement...
[17:03:41] serviceops, Datacenter-Switchover, Patch-For-Review: 🚀 Southward Datacenter Switchover (Sept. 2025) - https://phabricator.wikimedia.org/T399891#11237856 (ops-monitoring-bot) jasmine@cumin1003 - Cookbook cookbooks.sre.discovery.datacenter pool all active/active services in eqiad: Repool services in Eq...
[17:26:02] serviceops, Datacenter-Switchover, Patch-For-Review: 🚀 Southward Datacenter Switchover (Sept. 2025) - https://phabricator.wikimedia.org/T399891#11237943 (ops-monitoring-bot) jasmine@cumin1003 - Cookbook cookbooks.sre.discovery.datacenter pool all active/active services in eqiad: Repool services in Eq...
[18:30:49] serviceops, Prod-Kubernetes, Kubernetes: charlie wiped cluster redeployment use-case - https://phabricator.wikimedia.org/T406212#11238224 (RLazarus) Thanks for this! I hadn't originally thought about using charlie this way. For my use case (applying the same diff to every service, like an Envoy upgra...
[18:36:41] serviceops, Datacenter-Switchover, Patch-For-Review: 🚀 Southward Datacenter Switchover (Sept. 2025) - https://phabricator.wikimedia.org/T399891#11238243 (Scott_French) Following up from discussion in IRC, shortly after MediaWiki (mw-api-ext, mw-web) RO traffic returned to eqiad around 17:10 UTC today...
[21:30:52] serviceops, DC-Ops, ops-eqiad, SRE: eqiad row C/D Service Ops host migrations - https://phabricator.wikimedia.org/T405950#11238805 (Scott_French) conf1009 is (1) a member of eqiad main-etcd cluster, so clients will attempt to issue writes to it, (2) the upstream source for etcd-mirror replication...