[09:11:04] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to package user issues in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10111943 (JMeybohm) I don't see anything obvious in the diff of those two packages. The systems prior to yesterday seem to have installed...
[09:41:05] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10112109 (hnowlan)
[09:52:08] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10112179 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node was started by cgoubert@cumin1002 Renumb...
[09:53:04] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10112181 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host w...
[09:53:20] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10112182 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikik...
[09:53:21] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10112183 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by cgoubert@cumin1002 Renumberin...
[09:56:50] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10112192 (hnowlan) >>! In T373819#10111943, @JMeybohm wrote: > I don't see anything obvious in the diff of those two packages. > The systems p...
[09:58:52] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: cfssl-issuer: Generate Kubernetes Events - https://phabricator.wikimedia.org/T337928#10112193 (JMeybohm) Open→Resolved cfssl-issuer updated on all clusters
[10:04:47] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10112250 (JMeybohm) >>! In T373819#10112192, @hnowlan wrote: > > [...] - this happens well before we configure adduser.conf so the uid used i...
[10:20:12] serviceops, DC-Ops, decommission-hardware, ops-codfw: decommission mw226[1-2].codfw.wmnet mw22[68-77].codfw.wmnet - https://phabricator.wikimedia.org/T371262#10112273 (Clement_Goubert)
[10:21:35] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#10112276 (Clement_Goubert) Open→Resolved a:Clement_Goubert [x] The 5 nodes with an incorrect RAID config from {T358489} that haven't yet been reimaged [x] The codfw...
[10:30:12] I'm going to be moving the deployment servers to Puppet 7 later (after the UTC afternoon backport window); if anyone sees an issue with that, let me know
[10:54:24] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10112419 (Clement_Goubert) There was a point release yesterday which could explain why it changed cc @elukey
[11:15:45] moritzm: there is an issue
[11:16:02] they use the puppet CA for the rsync services that move the data between them
[11:16:16] I've tried to have mixed puppet 5 and puppet 7 and failed majestically
[11:16:34] and I've tried to have PKI-issued certs for the rsync service, failing there as well
[11:16:52] so I backtracked and stayed with puppet 5 to finish the refresh process
[11:17:10] what I think will work is moving the entire role to puppet 7 in one go, but I've never done that
[11:21:44] there was a migrate-role cookbook for that, not sure if it's still up to date; it was used during the initial migration phase
[11:26:58] oh, right. deployment::rsync uses rsync::quickdatacopy with the stunnel option, which will cause an issue when they use different puppet CAs
[11:27:35] we could bypass that by simply switching off stunnel for a few days until all deployment servers are on Puppet 7
[11:27:56] this was also done for some other role (don't remember which off the top of my head)
[11:34:10] serviceops, Deployments, Shellbox, Wikibase-Quality-Constraints, and 4 others: Burst of GuzzleHttp Exception for http://localhost:6025/call/constraint-regex-checker - https://phabricator.wikimedia.org/T371633#10112573 (Lucas_Werkmeister_WMDE) >>! In T371633#10078899, @Krinkle wrote: > I'm going t...
[11:39:14] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10112599 (Clement_Goubert)
[12:41:58] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10112747 (elukey) The new point release for Bullseye is https://www.debian.org/News/2024/2024083102, and indeed I updated the netinst image ye...
[12:50:19] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10112789 (MoritzMuehlenhoff) deb11u5 is from the point release, deb11u6 is from https://lists.debian.org/debian-lts-announce/2024/09/msg00001....
[12:54:10] moritzm: we thought about that, but originally discarded it, hoping we could get a proper PKI solution.
[12:54:37] we can still do it now that the PKI plan didn't work out as easily as we had hoped
[12:56:49] I made https://gerrit.wikimedia.org/r/c/operations/puppet/+/1070236 for that; we can add proper support for PKI in stunnel separately, I'd say. The less we rely on the puppet CA, the better
[13:03:16] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10112849 (JMeybohm) >>! In T373819#10112747, @elukey wrote: > The new point release for Bullseye is https://www.debian.org/News/2024/202408310...
[13:11:02] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10112874 (elukey) >>! In T373819#10112849, @JMeybohm wrote: >>>! In T373819#10112747, @elukey wrote: >> The new point release for Bullseye is...
[13:12:41] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10112885 (MoritzMuehlenhoff) I think this might be a bug in the latest systemd update for LTS: - systemd-timesyncd deb11u4 and deb11u5 have P...
[13:16:10] serviceops, Shellbox, Wikibase-Quality-Constraints, Wikidata, and 4 others: [SW] [WBQC] shellbox-constraints returning 500 on preg_match error - https://phabricator.wikimedia.org/T362084#10112896 (Lucas_Werkmeister_WMDE) > 1. It is no necessarily clear to us, whether there is something we (the wi...
[13:18:37] serviceops, Shellbox, Wikibase-Quality-Constraints, Wikidata, and 4 others: [SW] [WBQC] shellbox-constraints returning 500 on preg_match error - https://phabricator.wikimedia.org/T362084#10112915 (Lucas_Werkmeister_WMDE) Side note: it sounds like this PHP code isn’t quite correct: `lang=php,name...
[13:26:35] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10112936 (MoritzMuehlenhoff) >>! In T373819#10112874, @elukey wrote: > Sure, but if it is not d-i that installs it, then it is puppet (via `sy...
[13:41:56] serviceops, Shellbox, Wikibase-Quality-Constraints, Wikidata, and 4 others: [SW] [WBQC] shellbox-constraints returning 500 on preg_match error - https://phabricator.wikimedia.org/T362084#10113028 (Agabi10) >>! In T362084#10112915, @Lucas_Werkmeister_WMDE wrote: > I wonder if we can use a differen...
[14:10:44] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10113171 (MoritzMuehlenhoff) >>! In T373819#10112885, @MoritzMuehlenhoff wrote: > I'll reproduce this in an nspawn contained and report upstre...
[14:46:20] serviceops, Infrastructure-Foundations, netops, SRE: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10113353 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by hnowlan@cumin1002 from kubernetes2028 to wikikube-worker...
[14:50:27] serviceops, Infrastructure-Foundations, netops, SRE: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10113386 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by hnowlan@cumin1002 from kubernetes2055 to wikikube-worker...
[14:59:12] serviceops, Infrastructure-Foundations, netops, SRE: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10113412 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by hnowlan@cumin1002 from mw2423 to wikikube-worker2075 com...
[15:01:03] serviceops, Infrastructure-Foundations, netops, SRE: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10113418 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by hnowlan@cumin1002 from mw2422 to wikikube-worker2074 com...
[15:01:44] serviceops, Infrastructure-Foundations, netops, SRE: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10113419 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by hnowlan@cumin1002 for host wikikube-worker2072.codf...
[15:02:17] serviceops, Infrastructure-Foundations, netops, SRE: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10113422 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by hnowlan@cumin1002 for host wikikube-worker2073.codf...
[15:05:38] serviceops, MW-on-K8s, wikitech.wikimedia.org, Patch-For-Review: MVP: Privately serve wikitech via mwdebug1001 - https://phabricator.wikimedia.org/T371537#10113442 (jijiki)
[15:08:08] serviceops, Infrastructure-Foundations: Issues reimaging kubernetes workers due to user conflicts in systemd-timesyncd - https://phabricator.wikimedia.org/T373819#10113459 (MoritzMuehlenhoff) Tracking bug is https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1080418
[15:26:22] serviceops, All-and-every-Wikisource, Thumbor, Patch-For-Review: Elevated 429 responses from Thumbor on codfw starting 2024-08-14 00:00 UTC - https://phabricator.wikimedia.org/T372470#10113572 (Midleading) Almost no new PDF file thumbnails can be generated from codfw: https://commons.wikimedia.or...
[16:46:11] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114001 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by hnowlan@cumin1002 for host wikiku...
[16:47:14] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114007 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by hnowlan@cumin1002 for host wikiku...
[16:48:59] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, and 2 others: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T373591#10114003 (Jhancock.wm) Open→Resolved a:Jhancock.wm
[16:49:31] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114037 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by hnowlan@cumin1002 for host wi...
[16:49:34] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114038 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by hnowlan@cumin1002 for host wi...
[17:04:28] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114108 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw2402 to wi...
[17:05:35] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114119 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wik...
[17:06:13] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114129 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw2406 to wi...
[17:07:08] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114134 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wik...
[17:09:30] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114147 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw2407 to wi...
[17:19:42] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114188 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wik...
[17:31:14] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, and 2 others: Relabel eqiad kubernetes nodes - https://phabricator.wikimedia.org/T373696#10114261 (VRiley-WMF)
[17:32:57] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, and 2 others: Relabel eqiad kubernetes nodes - https://phabricator.wikimedia.org/T373696#10114244 (VRiley-WMF) Open→Resolved a:VRiley-WMF This is completed. Thank you!
[17:37:57] serviceops: Migrate etcd::tlsproxy Nginx certs and etcd itself to PKI - https://phabricator.wikimedia.org/T352245#10114343 (Scott_French) At a high level, we can split this into two phases: TLS proxy (nginx) and etcd. For simplicity, we can reuse the existing PKI support in `profile::etcd::v3` for the latte...
[17:48:35] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114421 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by hnowlan@cumin1002 for host wikiku...
[17:53:22] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114457 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by hnowlan@cumin1002 for host wikiku...
[17:54:51] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T373916 (hnowlan) NEW
[18:59:32] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114731 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikub...
[19:00:09] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114732 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikub...
[19:05:13] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114737 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikub...
[19:11:47] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, and 2 others: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T373916#10114752 (kamila)
[19:19:29] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114778 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by swfrench@cumin2002 from mw2312 to...
[19:21:28] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10114781 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by swfrench@cumin2002 for host w...
[20:07:24] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Update iDRAC on mw2260.codfw.wmnet - https://phabricator.wikimedia.org/T373934 (Scott_French) NEW
[20:25:03] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10115098 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by swfrench@cumin2002 for host wikik...
[21:38:26] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, and 2 others: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T373916#10115314 (Scott_French)