[01:45:04] toyofuku: hi, just seeing this! I think I missed you for the day, but I'll ping you privately to talk about it
[09:04:35] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10396089 (ops-monitoring-bot) depool host kubernetes[2011-2014].codfw.wmnet by jelto@cumin1002 with reason: Renaming nodes
[09:06:53] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10396094 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 depool for host kubernetes[2011-2014].codfw.wmnet...
[09:08:32] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10396100 (Jelto) ^ I used the wrong task ID. This depool was for codfw. We decided to pause eqiad migration until the four broken hosts are back (T381...
[09:09:01] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396107 (ops-monitoring-bot) depool host kubernetes[2011-2014].codfw.wmnet by jelto@cumin1002 with reason: Renaming nodes
[09:09:08] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396109 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 depool for host kubernetes[2011-2014].codfw.wmnet comple...
[09:20:31] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396138 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2011 to wikikube-worker2180 completed: - kubernet...
[09:26:22] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396168 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2012 to wikikube-worker2181 completed: - kubernet...
[09:32:10] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396205 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2013 to wikikube-worker2182 completed: - kubernetes2013 (**PASS**) -...
[09:37:31] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396252 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2014 to wikikube-worker2183 completed: - kubernetes2014 (**PASS**) -...
[09:40:01] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396255 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2180.codfw.wmnet with OS bookworm
[09:40:12] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396256 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2181.codfw.wmnet with OS bookworm
[09:40:22] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396259 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2182.codfw.wmnet with OS bookworm
[09:40:38] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396260 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2183.codfw.wmnet with OS bookworm
[09:52:23] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, SRE: wikikube-ctrl1002 and wikikube-ctrl1003: Switch network cable from port 2 to port 1 on the 10G NIC - https://phabricator.wikimedia.org/T379717#10396280 (JMeybohm) >>! In T379717#10395266, @VRiley-WMF wrote: > Can we proceed with swapping th...
[10:15:03] serviceops, LPL Hypothesis, Recommendation-API: Caching service request for recommendation api - https://phabricator.wikimedia.org/T381438#10396348 (akosiaris) Hi, Thanks for putting all this info together. Some answers inline > The two examples you gave are actually 2 incarnations of the same thi...
[10:26:35] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396403 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2180.codfw.wmnet with OS bookworm completed: - wikikube-worke...
[10:32:45] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396457 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2182.codfw.wmnet with OS bookworm executed with errors: - wik...
[10:33:34] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396460 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2182.codfw.wmnet with OS bookworm
[10:39:56] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396477 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2183.codfw.wmnet with OS bookworm completed: - wikikube-worke...
[11:12:01] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396561 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2181.codfw.wmnet with OS bookworm executed with errors: - wik...
[11:12:23] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396564 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2181.codfw.wmnet with OS bookworm
[11:15:52] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396570 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2182.codfw.wmnet with OS bookworm completed: - wikikube-worke...
[11:19:51] serviceops, Wikipedia-iOS-App-Backlog (iOS Release FY2024-25): Add new APNS server certificates to Trust Store - https://phabricator.wikimedia.org/T381808#10396582 (MSantos)
[11:31:42] serviceops, Observability-Logging, Patch-For-Review: Logs from containers sometimes not visible in logstash - https://phabricator.wikimedia.org/T357616#10396646 (JMeybohm) Open→Resolved I'll resolve this additional bandaid seems to work okay for now
[11:37:50] serviceops, [DEPRECATED] wdwb-tech, Data-Platform-SRE, Prod-Kubernetes, and 2 others: Write and adapt Runbooks and cookbooks related to the WDQS Streaming Updater and kubernetes - https://phabricator.wikimedia.org/T293063#10396668 (JMeybohm) Hey @dcausse, is this still relevant given we now use t...
[11:40:04] serviceops, Observability-Logging, Prod-Kubernetes, Kubernetes, Patch-For-Review: Enable audit logging for kube-apiserver - https://phabricator.wikimedia.org/T290020#10396670 (JMeybohm) @colewhite is there anything to do/any open questions for serviceops regarding this?
[11:53:48] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396689 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2181.codfw.wmnet with OS bookworm completed: - wikikube-worke...
[11:58:04] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396696 (ops-monitoring-bot) pool host wikikube-worker[2180-2183].codfw.wmnet by jelto@cumin1002 with reason: None
[11:58:05] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10396697 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 pool for host wikikube-worker[2180-2183].codfw.wmnet completed: - wikikube-wor...
[11:59:17] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T381967 (Jelto) NEW
[13:13:54] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397011 (ops-monitoring-bot) depool host kubernetes[2017,2021-2022,2024].codfw.wmnet by jelto@cumin1002 with reason: Renaming nodes
[13:18:16] serviceops, [DEPRECATED] wdwb-tech, Data-Platform-SRE, Prod-Kubernetes, and 2 others: Write and adapt Runbooks and cookbooks related to the WDQS Streaming Updater and kubernetes - https://phabricator.wikimedia.org/T293063#10397021 (dcausse) @JMeybohm I think we were not able to simulate what happ...
[13:18:55] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397022 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 depool for host kubernetes[2017,2021-2022,2024].codfw.wm...
[13:26:45] serviceops, MW-on-K8s, TimedMediaHandler, Patch-For-Review, Video: Videoscaler/Mercurius deployment - https://phabricator.wikimedia.org/T371701#10397034 (TheDJ) BTW. I think a blogpost on Mercurius and the problems it solves would be very interesting for Diff. Just a thought.
[13:26:49] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397035 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2017 to wikikube-worker2184 completed: - kubernet...
[13:33:43] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397063 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2021 to wikikube-worker2185 completed: - kubernetes2021 (**PASS**) -...
[13:41:44] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397081 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2022 to wikikube-worker2186 completed: - kubernetes2022 (**PASS**) -...
[13:54:37] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397122 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2024 to wikikube-worker2187 completed: - kubernetes2024 (**PASS**) -...
[14:00:41] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397132 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2184.codfw.wmnet with OS bookworm
[14:00:43] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397133 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2185.codfw.wmnet with OS bookworm
[14:00:45] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397134 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2186.codfw.wmnet with OS bookworm
[14:00:46] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397135 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2187.codfw.wmnet with OS bookworm
[14:45:53] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397311 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2186.codfw.wmnet with OS bookworm completed: - wikikube-worke...
[14:48:39] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397322 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2184.codfw.wmnet with OS bookworm completed: - wikikube-worke...
[14:51:56] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397334 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2187.codfw.wmnet with OS bookworm completed: - wikikube-worke...
[14:55:47] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397338 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2185.codfw.wmnet with OS bookworm completed: - wikikube-worke...
[15:03:03] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397381 (ops-monitoring-bot) pool host wikikube-worker[2184-2187].codfw.wmnet by jelto@cumin1002 with reason: None
[15:03:06] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10397382 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 pool for host wikikube-worker[2184-2187].codfw.wmnet completed: - wikikube-wor...
[15:03:28] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, and 2 others: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T381967#10397383 (Jelto)
[15:11:39] serviceops, ChangeProp, Content-Transform-Team, MediaWiki-Core-HTTP-Cache, and 3 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#10397412 (cscott) Note that this is decoupling the reparsing/ParserCa...
[15:27:17] serviceops, ChangeProp, Content-Transform-Team, MediaWiki-Core-HTTP-Cache, and 3 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#10397505 (cscott) This might be a reasonable pair of hypotheses for a...
[16:24:41] hey folks!
[16:25:03] I have some patches for tegola/kartotherian starting from https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1101897/
[16:25:36] some merciful souls already started reviewing (thanks!), lemme know if anybody has bandwidth for the rest
[16:25:52] I am almost ready to add kartotherian to wikikube :)
[16:52:01] serviceops: Decommission kubernetes20[07-14].codfw.wmnet - https://phabricator.wikimedia.org/T379788#10397993 (Clement_Goubert)
[16:55:27] serviceops: Decommission kubernetes20[07-14].codfw.wmnet - https://phabricator.wikimedia.org/T379788#10398004 (Clement_Goubert)
[16:59:04] jayme: jelto bad luck some of the hosts you've reimaged will be decommed. As you're taking up the vacant ids anyways, I'm not going to exclude them from the site.pp regex, thoughts?
[17:09:47] serviceops, Patch-For-Review: Decommission kubernetes20[07-14].codfw.wmnet - https://phabricator.wikimedia.org/T379788#10398061 (ops-monitoring-bot) depool host wikikube-worker[2047,2066,2085-2086,2180-2183].codfw.wmnet by cgoubert@cumin1002 with reason: decommission
[17:17:05] serviceops, Patch-For-Review: Decommission kubernetes20[07-14].codfw.wmnet - https://phabricator.wikimedia.org/T379788#10398085 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by cgoubert@cumin1002 depool for host wikikube-worker[2047,2066,2085-2086,2180-2183].codfw.wmnet complet...
[17:35:58] serviceops, Abstract Wikipedia team, function-evaluator: Have SRE provide a production-ready Rust image upstream - https://phabricator.wikimedia.org/T380807#10398209 (Jdforrester-WMF) p:Triage→Medium
[17:54:25] serviceops, Patch-For-Review: Decommission kubernetes20[07-14].codfw.wmnet - https://phabricator.wikimedia.org/T379788#10398331 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by cgoubert@cumin1002 for hosts: `wikikube-worker[2047,2066,2085-2086].codfw.wmnet` - wikikube-worker2047.codfw.wm...
[17:58:32] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, and 2 others: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T381967#10398377 (Jhancock.wm) Open→Resolved a:Jhancock.wm
[18:16:27] serviceops, Patch-For-Review: Decommission kubernetes20[07-14].codfw.wmnet - https://phabricator.wikimedia.org/T379788#10398429 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by cgoubert@cumin1002 for hosts: `wikikube-worker[2180-2183].codfw.wmnet` - wikikube-worker2180.codfw.wmnet (**PAS...
[18:27:15] jayme: jelto: last of the decoms are here https://phabricator.wikimedia.org/T375842 and will be handled in the coming days, don't reimage them
[18:42:15] serviceops, DC-Ops, ops-codfw: Decommission kubernetes20[07-14].codfw.wmnet - https://phabricator.wikimedia.org/T379788#10398556 (jasmine_) a:jasmine_→None
[19:13:07] serviceops, collaboration-services, DC-Ops, ops-eqiad, and 3 others: Relabel eqiad kubernetes nodes - https://phabricator.wikimedia.org/T381504#10398712 (VRiley-WMF) a:VRiley-WMF
[20:22:18] serviceops, collaboration-services, DC-Ops, ops-eqiad, and 3 others: Relabel eqiad kubernetes nodes - https://phabricator.wikimedia.org/T381504#10398927 (VRiley-WMF)
[20:24:19] serviceops, collaboration-services, DC-Ops, ops-eqiad, and 3 others: Relabel eqiad kubernetes nodes - https://phabricator.wikimedia.org/T381504#10398939 (VRiley-WMF)