[06:34:46] serviceops, Patch-For-Review: kafka-main200[6789] and kafka-main2010 implementation tracking - https://phabricator.wikimedia.org/T363210#10139758 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=358ccffc-965f-4494-af81-ec1629049541) set by jayme@cumin1002 for 1 day, 0:00:00 on 2 host(s)...
[07:21:28] serviceops, Patch-For-Review: kafka-main200[6789] and kafka-main2010 implementation tracking - https://phabricator.wikimedia.org/T363210#10139798 (JMeybohm)
[07:22:07] serviceops, DC-Ops, decommission-hardware, ops-codfw: decommission kafka-main2004.codfw.wmnet - https://phabricator.wikimedia.org/T374594#10139799 (JMeybohm)
[08:37:34] gmodena: sorry for causing trouble - where was this visible?
[09:19:51] jayme no worries!
[09:21:02] gmodena: I've just deployed the thing as I replaced the next broker. But is there any alert or something for flinks not starting (as that is decoupled from the deployment due to how the operator works)?
[09:21:35] also interesting that each flink thing has a different argument to configure the broker list... 🙈
[09:25:28] serviceops, DC-Ops, decommission-hardware, ops-codfw, and 2 others: decommission kafka-main2004.codfw.wmnet - https://phabricator.wikimedia.org/T374594#10139958 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by jayme@cumin1002 for hosts: `kafka-main2004.codfw.wmnet` - kafka-main20...
[09:25:28] we got an alert for the failing app (https://w.wiki/BAfT), complaining about TaskManagers (= worker nodes) being down
[09:26:26] serviceops: kafka-main100[6789] and kafka-main1010 implementation tracking - https://phabricator.wikimedia.org/T363214#10139967 (JMeybohm)
[09:26:30] serviceops, DC-Ops, decommission-hardware, ops-codfw, and 2 others: decommission kafka-main2004.codfw.wmnet - https://phabricator.wikimedia.org/T374594#10139963 (JMeybohm) a:JMeybohm→None
[09:28:39] jayme we do have alertmanager rules for the single applications (decoupled by the operator). For this specific one: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/alerts/+/refs/heads/master/team-data-engineering/mw-page-content-change-enrich.yaml
[09:29:17] ah, cool. thanks
[09:29:19] sorry about the config confusion. There's some delta in conventions between java and python applications.
[09:41:41] jayme: fyi I'm re-deploying rdf-streaming-updater in codfw, it started to misbehave like yesterday
[09:42:23] dcausse: ok. Do you know what the issue is with this?
[09:43:23] jayme: I'd say it's confused by the re-use of the broker.id but haven't tracked that down to something specific
[09:44:17] dcausse: but isn't that something flink handles? meaning it should break all flinks or none of them?
[09:45:25] the rdf job is using a transactional producer to limit duplicated messages so I suppose that's where the diff is
[09:45:29] do you have the error at hand maybe? I think it might be worth looking into, because this is more like a standard operation and all other clients seem to be fine with it
[09:45:42] ah...hmm
[09:46:46] I'll file a task to gather more info and see if we can understand better
[09:54:01] <3
[10:08:34] serviceops, Patch-For-Review: Prepare PHP 8.1 service images for Shellbox - https://phabricator.wikimedia.org/T374502#10140086 (MoritzMuehlenhoff) We don't strictly need to rebuild ffmpeg for bullseye, though.
The current backport we run in production was made to cover the for the no longer supported ver...
[10:12:31] hnowlan: o/
[10:12:44] yoo
[10:13:00] if you have a min - thumbor in staging is set for poolcounter2005, is there anything that we can do to trigger some new conns to poolcounter?
[10:14:51] elukey: yeah it should be pretty simple, will give it a go in a few minutes
[10:17:55] super, I'll keep monitoring the conns to port 7531 on 2005
[10:30:25] (stepping afk for lunch, I'll read in a bit!)
[10:35:06] FYI, I'll switch deploy2002 to Puppet 7 later (stunnel for the rsync between 1003 and 2002 has been temporarily disabled to avoid P5-P7 cert interop issues)
[10:42:21] elukey: 2024-09-12 10:27:53,601 ???? thumbor:DEBUG [PoolCounter] Connecting to: poolcounter2005.codfw.wmnet 7531
[10:42:24] 2024-09-12 10:27:53,676 ???? thumbor:DEBUG [PoolCounter] Got data of 'b'LOCKED\n'' from poolcounter during ACQ4ME
[10:42:27] looks good to me
[12:18:35] hnowlan: thanks! <3
[12:18:43] shall I deploy to codfw?
[12:21:26] I think it is good, doing i
[12:21:28] *it
[12:45:12] serviceops, Patch-For-Review: Migrate poolcounter hosts to bookworm - https://phabricator.wikimedia.org/T332015#10140592 (elukey) Moved thumbor codfw to poolcounter2005, everything worked nicely. At this point I think that we can: * Create the missing 3 VMs (poolcounter2006, poolcounter1005 and poolcou...
[12:50:10] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140630 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by akosiaris@cumin1002 from mw2390 to wikikube-worker2107 completed: - mw2390 (**...
[12:55:14] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140645 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by akosiaris@cumin1002 from mw2394 to wikikube-worker2108 completed: - mw2394 (**...
[13:04:19] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140667 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node was started by akosiaris@cumin1002 Renumbering for host wikikube-worker2107.codfw.wm...
[13:04:38] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140668 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host wikikube-worker2107.codfw.wmnet with OS bull...
[13:06:15] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140673 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node was started by akosiaris@cumin1002 Renumbering for host wikikube-worker2108.codfw.wm...
[13:06:27] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140678 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by akosiaris@cumin1002 from mw2395 to wikikube-worker2109 completed: - mw2395 (**...
[13:06:31] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140680 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host wikikube-worker2108.codfw.wmnet with OS bull...
[13:09:46] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140703 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node was started by akosiaris@cumin1002 Renumbering for host wikikube-worker2109.codfw.wm...
[13:10:02] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140704 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host wikikube-worker2109.codfw.wmnet with OS bull...
[13:10:22] deploy2002 is on Puppet 7 now
[13:13:11] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140707 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by akosiaris@cumin1002 from mw2396 to wikikube-worker2110 completed: - mw2396 (**...
[13:23:02] jayme: do you have a link to the procedure you used? and possibly the timing of when the operation started/ended for one of the replacement nodes (trying to do some correlation in the app logs)
[13:24:56] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140757 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by akosiaris@cumin1002 from mw2397 to wikikube-worker2111 completed: - mw2397 (**...
[13:25:25] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140759 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node was started by akosiaris@cumin1002 Renumbering for host wikikube-worker2110.codfw.wm...
[13:25:41] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140760 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host wikikube-worker2110.codfw.wmnet with OS bull...
[13:26:30] elukey: thank you!
[13:26:43] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140764 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node was started by akosiaris@cumin1002 Renumbering for host wikikube-worker2111.codfw.wm...
[13:26:52] serviceops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140765 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host wikikube-worker2111.codfw.wmnet with OS bull...
[13:28:19] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140768 (Clement_Goubert)
[13:33:48] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140790 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by akosiaris@cumin1002 from mw2398 to wikikube-worker2112 completed: - mw2398 (**PASS**)...
[13:34:45] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140792 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node was started by akosiaris@cumin1002 Renumbering for host wikikube-worker2112.codfw.wmnet
[13:34:57] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140793 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host wikikube-worker2112.codfw.wmnet with OS bullseye
[13:44:45] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140828 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by akosiaris@cumin1002 from mw2399 to wikikube-worker2113 completed: - mw2399 (**PASS**)...
[13:46:25] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140831 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node was started by akosiaris@cumin1002 Renumbering for host wikikube-worker2113.codfw.wmnet
[13:46:44] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140833 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2113.codfw.wmnet completed...
[13:47:19] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140842 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node was started by akosiaris@cumin1002 Renumbering for host wikikube-worker2113.codfw.wmnet
[13:47:35] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140846 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2113.codfw.wmnet completed...
[13:47:43] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140850 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wikikube-worker2107.codfw.wmnet with OS bullseye complete...
[13:49:00] dcausse: should be in https://phabricator.wikimedia.org/T363210
[13:50:34] thx!
[13:54:36] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140871 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node was started by akosiaris@cumin1002 Renumbering for host wikikube-worker2113.codfw.wmnet
[13:54:39] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140872 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host wikikube-worker2113.codfw.wmnet with OS bullseye
[13:56:24] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140889 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2107.codfw.wmnet completed...
[13:56:58] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140893 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wikikube-worker2108.codfw.wmnet with OS bullseye complete...
[13:57:46] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes mw2390 and mw2394-mw2399 - https://phabricator.wikimedia.org/T374622 (akosiaris) NEW
[14:02:21] serviceops, MW-on-K8s, Release-Engineering-Team (Priority Backlog 📥): Provide an mwdebug functionality on kubernetes - https://phabricator.wikimedia.org/T276994#10140924 (jijiki)
[14:02:46] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140932 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wikikube-worker2109.codfw.wmnet with OS bullseye complete...
[14:02:53] serviceops, MW-on-K8s, Release-Engineering-Team (Priority Backlog 📥): Provide an mwdebug functionality on kubernetes - https://phabricator.wikimedia.org/T276994#10140934 (jijiki)
[14:11:00] serviceops, MW-on-K8s, Release-Engineering-Team (Priority Backlog 📥): Provide an mwdebug functionality on kubernetes - https://phabricator.wikimedia.org/T276994#10140956 (jijiki)
[14:11:06] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140959 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wikikube-worker2110.codfw.wmnet with OS bullseye complete...
[14:13:45] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140966 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2108.codfw.wmnet completed...
[14:17:20] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140972 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wikikube-worker2111.codfw.wmnet with OS bullseye complete...
[14:18:50] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140979 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2109.codfw.wmnet completed...
[14:23:41] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10140999 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wikikube-worker2112.codfw.wmnet with OS bullseye complete...
[14:27:51] serviceops, MediaWiki-Uploading, Regression, Wikimedia-production-error: Large file uploads broken via Special:Upload - https://phabricator.wikimedia.org/T374436#10141014 (hnowlan) p:Triage→Unbreak!
[14:30:42] serviceops, DC-Ops, decommission-hardware, ops-codfw, SRE: decommission kafka-main2004.codfw.wmnet - https://phabricator.wikimedia.org/T374594#10141048 (Jhancock.wm) a:Jhancock.wm
[14:30:59] serviceops, DC-Ops, decommission-hardware, ops-codfw, SRE: decommission kafka-main2004.codfw.wmnet - https://phabricator.wikimedia.org/T374594#10141054 (Jhancock.wm) Open→Resolved
[14:31:16] serviceops, DC-Ops, decommission-hardware, ops-codfw, SRE: decommission kafka-main2003.codfw.wmnet - https://phabricator.wikimedia.org/T374542#10141040 (Jhancock.wm) Open→Resolved a:Jhancock.wm
[14:33:25] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes: Degraded RAID on wikikube-worker2092 - https://phabricator.wikimedia.org/T374409#10141070 (Jhancock.wm) part arriving today. will update when swapped.
[14:43:49] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, and 2 others: Relabel codfw kubernetes nodes mw2390 and mw2394-mw2399 - https://phabricator.wikimedia.org/T374622#10141130 (Jhancock.wm) Open→Resolved a:Jhancock.wm
[14:47:59] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10141142 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wikikube-worker2113.codfw.wmnet with OS bullseye complete...
[15:30:24] serviceops, MoveComms-Support, Datacenter-Switchover: MoveComms support for Southward Datacenter Switchover (September 2024) - https://phabricator.wikimedia.org/T371130#10141322 (Trizek-WMF)
[15:34:17] topranks: k8s nodes depooled, and I think kafka-main2009 should be ok with a small blip cc jayme
[15:34:44] claime: AIUI it should, yes
[15:35:32] cluster is healthy at least, so I won't expect any trouble...although from what I learned the last couple of days there is not much to expect from kafka besides trouble
[15:40:31] thanks guys!
[15:48:10] serviceops, MoveComms-Support, Datacenter-Switchover: MoveComms support for Southward Datacenter Switchover (September 2024) - https://phabricator.wikimedia.org/T371130#10141416 (Trizek-WMF) Switching from 14:00 to 15:00 UTC required extra checks on the [[ https://meta.wikimedia.org/wiki/Special:Tran...
[15:56:28] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10141438 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2111.codfw.wmnet completed...
[15:56:32] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10141439 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2111.codfw.wmnet completed...
[15:58:46] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10141468 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2112.codfw.wmnet completed...
[15:58:47] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10141467 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2112.codfw.wmnet completed...
[16:00:05] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10141483 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2113.codfw.wmnet completed...
[16:01:04] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10141485 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2110.codfw.wmnet completed...
[16:01:06] serviceops, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10141486 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.renumber-node started by akosiaris@cumin1002 Renumbering for host wikikube-worker2110.codfw.wmnet completed...
[16:18:25] claime, jayme: all server moves done for the day
[16:18:26] thanks!
[16:18:32] awesome thanks
[17:42:37] o/ we're investigating a weird request that "seems" to have lasted for 2 hours: https://logstash.wikimedia.org/goto/f5dff72f669ce0fecad15aa8e8b46022 (start at 04:51 and end around 07:00) same req_id, same pod, is this something "expected"?
[17:44:13] context: T331127, where we tried to find the root cause of a missing event. I think the cause is just that the request went poorly but we were very surprised to see that it could last that long
[17:45:27] serviceops, CX-cxserver, RESTBase Sunsetting, LPL Essential (LPL Essential 2024 Jul-Sep): Switchover plan from RESTBase to REST Gateway for cxserver - https://phabricator.wikimedia.org/T372753#10142068 (MSantos) After further investigation, this is considered invalid and we will instead remove th...
[19:06:15] serviceops, Continuous-Integration-Infrastructure, MediaWiki-Platform-Team (Radar): Prepare WMF PHP 8.1 packages for Bullseye - https://phabricator.wikimedia.org/T372507#10142205 (Scott_French)
[19:09:03] serviceops, Continuous-Integration-Infrastructure, Patch-For-Review: Prepare PHP 8.1 production images - https://phabricator.wikimedia.org/T372602#10142203 (Scott_French) Open→Resolved At this point, I believe we're good to move forward with initial testing using these images, and there a...
[19:10:39] serviceops, Continuous-Integration-Infrastructure, MediaWiki-Platform-Team (Radar): Prepare WMF PHP 8.1 packages for Bullseye - https://phabricator.wikimedia.org/T372507#10142206 (Scott_French) Open→Resolved Many thanks for driving the CI updates, @Jdforrester-WMF. I believe this wraps ev...
[19:12:18] serviceops, Patch-For-Review: Build php-uuid package, and add to WMF production and CI - https://phabricator.wikimedia.org/T373752#10142213 (Jdforrester-WMF)
[21:59:29] serviceops, Datacenter-Switchover, Patch-For-Review: Pre-switchover cookbook testing - https://phabricator.wikimedia.org/T374047#10142643 (Scott_French)
[22:12:56] serviceops, Patch-For-Review: Prepare PHP 8.1 service images for Shellbox - https://phabricator.wikimedia.org/T374502#10142666 (Scott_French) Many thanks for clarifying the status of 4.3.x on bullseye, @MoritzMuehlenhoff - that's great news. Alright, per T374502#10140086 and additional discussion with...
[23:20:28] serviceops, CirrusSearch, MediaWiki-Platform-Team, Discovery-Search (Current work): PHP web requests running for multiple hours - https://phabricator.wikimedia.org/T374662#10142864 (bd808) My initial random guess at a proximal cause would be deferred updates blocking on insert locks at the db whe...