[00:10:28] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [00:10:50] FIRING: TfInfraTestDestroyFailed: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [00:14:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [00:15:28] RESOLVED: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [00:19:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [00:22:36] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [00:22:37] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [00:22:49] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [00:22:57] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [00:23:07] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [00:23:15] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node 
(exit_code=0) [00:23:28] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [00:23:40] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [00:29:05] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=99) [00:51:55] FIRING: MaxConntrack: Max conntrack at 84.14% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack [00:52:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-20 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses [00:56:55] RESOLVED: MaxConntrack: Max conntrack at 88.41% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack [01:38:55] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [01:39:06] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [01:39:37] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [01:39:48] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [01:52:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-20 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - 
https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses [01:52:18] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-20 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses [01:53:33] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-20 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses [01:57:18] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-20 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses [02:00:42] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [02:00:55] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [02:07:18] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Kubernetes worker tools-k8s-worker-nfs-20 has many processes stuck on IO (probably NFS) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - 
https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses [02:13:55] Toolforge (Toolforge iteration 14), Patch-For-Review: [k8s] Add node anti-affinity topologySpreadConstraints to infrastructure components where relevant - https://phabricator.wikimedia.org/T358203#10065728 (Raymond_Ndibe) This has been open for some time now so we should probably close it. To test it her... [02:34:28] Tools: Template transclusion count per page - https://phabricator.wikimedia.org/T372523 (Fgnievinski) NEW [03:02:28] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [03:02:37] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [03:02:48] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [03:02:59] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [03:03:08] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [03:03:16] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [03:03:30] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [03:03:31] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [03:52:49] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [03:53:43] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 29689 bytes in 3.496 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [04:18:22] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [04:18:23] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [04:18:38] !log andrew@cloudcumin1001 admin START - 
Cookbook wmcs.ceph.osd.undrain_node [04:18:39] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [04:18:54] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [04:18:55] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [04:19:25] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [04:19:26] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [04:19:47] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [04:21:16] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [04:21:17] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [04:22:04] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [04:22:21] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [04:25:03] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [04:25:51] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [04:26:50] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [06:23:40] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS: [ceph] Metrics started not responding during the drain - https://phabricator.wikimedia.org/T372528 (dcaro) NEW [06:24:33] !log dcaro@urcuchillay tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-20 [06:24:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [06:30:24] !log dcaro@urcuchillay tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-20 [06:30:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [06:56:21] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS: [ceph] 
Metrics started not responding during the drain - https://phabricator.wikimedia.org/T372528#10065958 (dcaro) This explains the issue: https://documentation.suse.com/ses/7.1/html/ses-all/monitoring-alerting.html#monitoring-stale-cache Turns out t... [07:07:22] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS: [ceph] Metrics started not responding during the drain - https://phabricator.wikimedia.org/T372528#10065962 (dcaro) Open→Resolved a:dcaro I've also set the scrape_interval to the same we have on prometheus side (300), and restarted the m... [07:09:23] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS: [ceph] Metrics started not responding during the drain - https://phabricator.wikimedia.org/T372528#10065972 (dcaro) Actually, changed the scrape_interval to 60s as that's what we have configured: ` root@prometheus1005:~# cat /srv/prometheus/cloud/pr... [07:10:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [07:51:15] Toolforge (Toolforge iteration 14), Patch-For-Review: [k8s] Add node anti-affinity topologySpreadConstraints to infrastructure components where relevant - https://phabricator.wikimedia.org/T358203#10066033 (dcaro) >>! In T358203#10065728, @Raymond_Ndibe wrote: > Any other way to have less number of deplo... 
[08:05:02] (update) dcaro: worker: add simple task and worker process [toolforge-repos/sample-complex-app-backend] - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-backend/-/merge_requests/1 (https://phabricator.wikimedia.org/T370321) [08:26:40] (approved) dcaro: [jobs-cli] remove _display_messages [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/62 (owner: raymond-ndibe) [08:35:25] !log dcaro@urcuchillay samplecomplexapp START - Cookbook wmcs.vps.create_project for trove-only project samplecomplexapp in eqiad1 (T370317) [08:35:27] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:35:27] T370317: [sct.backend] Create trove database - https://phabricator.wikimedia.org/T370317 [08:36:04] !log dcaro@urcuchillay samplecomplexapp END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for trove-only project samplecomplexapp in eqiad1 (T370317) [08:36:05] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:40:56] !log dcaro@urcuchillay samplecomplexapp START - Cookbook wmcs.vps.create_project for trove-only project samplecomplexapp in eqiad1 (T370317) [08:40:58] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:40:58] T370317: [sct.backend] Create trove database - https://phabricator.wikimedia.org/T370317 [08:40:59] !log dcaro@urcuchillay samplecomplexapp END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for trove-only project samplecomplexapp in eqiad1 (T370317) [08:40:59] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:41:23] !log dcaro@urcuchillay samplecomplexapp START - Cookbook wmcs.vps.create_project for trove-only project samplecomplexapp in eqiad1 (T370317) [08:41:24] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:41:42] !log dcaro@urcuchillay samplecomplexapp END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for trove-only project samplecomplexapp in eqiad1 (T370317) 
[08:41:43] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:41:45] !log dcaro@urcuchillay samplecomplexapp START - Cookbook wmcs.vps.create_project for trove-only project samplecomplexapp in eqiad1 (T370317) [08:41:45] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:43:22] !log dcaro@urcuchillay samplecomplexapp END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for trove-only project samplecomplexapp in eqiad1 (T370317) [08:43:22] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:43:25] !log dcaro@urcuchillay samplecomplexapp START - Cookbook wmcs.vps.create_project for trove-only project samplecomplexapp in eqiad1 (T370317) [08:43:25] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:43:56] !log dcaro@urcuchillay samplecomplexapp END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for trove-only project samplecomplexapp in eqiad1 (T370317) [08:43:56] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:45:09] !log dcaro@urcuchillay samplecomplexapp START - Cookbook wmcs.vps.create_project for trove-only project samplecomplexapp in eqiad1 (T370317) [08:45:09] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:45:24] !log dcaro@urcuchillay samplecomplexapp END (PASS) - Cookbook wmcs.vps.create_project (exit_code=0) for trove-only project samplecomplexapp in eqiad1 (T370317) [08:45:24] wmbot~dcaro@urcuchillay: Unknown project "samplecomplexapp" [08:47:04] (PS1) David Caro: create_project: fix the quota setting for the project [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062964 [08:50:45] (CR) CI reject: [V:-1] create_project: fix the quota setting for the project [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062964 (owner: David Caro) [08:50:58] (approved) dcaro: toolforge_deploy_mr: make all apt actions non-interactive [repos/cloud/toolforge/lima-kilo] - 
https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/182 [08:51:02] (merge) dcaro: toolforge_deploy_mr: make all apt actions non-interactive [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/182 [08:54:47] (PS2) David Caro: openstack: skip passing envvar for role commands [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062674 [08:54:47] (PS2) David Caro: create_project: fix the quota setting for the project [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062964 [08:56:12] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.wait_for_rebalance [08:56:16] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL [08:58:20] (approved) dcaro: [envvars-cli] remove display_messages [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/57 (owner: raymond-ndibe) [08:58:45] (update) dcaro: [toolforge-weld] move _display_message into toolforge weld [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/46 (owner: raymond-ndibe) [09:42:32] Tools: Flickr2 Commons is currently down - https://phabricator.wikimedia.org/T372451#10066263 (Jeff_G) See also https://commons.wikimedia.org/wiki/Commons:Village_pump#Flickr2Commons and unactioned issue 320 at https://bitbucket.org/magnusmanske/flickr2commons/issues/320/doesnt-respond . 
[09:52:25] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.wait_for_rebalance (exit_code=0) [09:52:29] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL [10:00:50] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.wait_for_rebalance [10:00:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL [10:03:54] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [10:10:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [10:22:02] VPS-project-devtools, GitLab: https://gitlab.devtools.wmcloud.org is being indexed by google (and scoring pretty high) - https://phabricator.wikimedia.org/T372538#10066439 (Bugreporter) Also, we need a notice in that tool to indicate this is only a test instance, and provide a link to official Wikimedia... 
[12:54:54] (open) dcaro: db: add database check and boileplate [toolforge-repos/sample-complex-app-backend] (add_worker) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-backend/-/merge_requests/2 (https://phabricator.wikimedia.org/T370317) [13:02:39] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [13:35:53] (update) dcaro: db: add database check and boileplate [toolforge-repos/sample-complex-app-backend] (add_worker) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-backend/-/merge_requests/2 (https://phabricator.wikimedia.org/T370317) [13:39:17] Cloud-VPS (Debian Buster Deprecation), Wikispore: Rebuild Wikispore Vagrant boxes on Bullseye or Bookworm - https://phabricator.wikimedia.org/T365934#10066871 (Andrew) Hello @tgr, were you able to make any progress with this? I'm going on sabbatical soon and (for arcane backend reasons, T364457) need to... [13:40:36] Cloud-VPS (Debian Buster Deprecation), linkwatcher: Cloud VPS "linkwatcher" project Buster deprecation - https://phabricator.wikimedia.org/T367536#10066881 (Andrew) Hi @Beetstra! Did you ever get your access issues sorted out? I need to rebuild the host that these VMs are on soon (see T364457 for probabl... 
[13:40:48] (update) dcaro: db: add database check and boileplate [toolforge-repos/sample-complex-app-backend] (add_worker) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-backend/-/merge_requests/2 (https://phabricator.wikimedia.org/T370317) [13:48:40] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "wikipathways" project Buster deprecation - https://phabricator.wikimedia.org/T367563#10066919 (Andrew) Emailed project admins today: ` Hello! You are receiving this email because you are listed as an admin of the 'wikipathways' project in cloud-vps. None... [13:53:09] Cloud-VPS (Debian Buster Deprecation): Cloud VPS "dumps" project Buster deprecation - https://phabricator.wikimedia.org/T367528#10066923 (Andrew) I'm going to delete these VMs next week. If you need to check them or rescue an data, now's the time :) [13:59:00] (update) dcaro: db: add database check and boileplate [toolforge-repos/sample-complex-app-backend] (add_worker) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-backend/-/merge_requests/2 (https://phabricator.wikimedia.org/T370317) [14:00:18] Cloud-VPS (Debian Buster Deprecation), Infrastructure-Foundations, Puppet CI: Cloud VPS "puppet-diffs" project Buster deprecation - https://phabricator.wikimedia.org/T367547#10066941 (Andrew) Hi folks! I haven't read all of this ticket but can I get an update about when/if you plan to remove the Bust... [14:02:21] (update) dcaro: db: add database check and boileplate [toolforge-repos/sample-complex-app-backend] (add_worker) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-backend/-/merge_requests/2 (https://phabricator.wikimedia.org/T370317) [14:08:21] cloud-services-team (FY2024/2025-Q1-Q2), Cloud-VPS: [network,D5] reboot cloudsw-d5 - https://phabricator.wikimedia.org/T371878#10066963 (Andrew) Quick summary: Cathal upgraded and rebooted the switch on Tuesday the 13th. That did not solve the flapping. 
Vriley then suggested that we do a physical powerd... [14:08:26] (update) dcaro: db: add database check and boileplate [toolforge-repos/sample-complex-app-backend] (add_worker) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-backend/-/merge_requests/2 (https://phabricator.wikimedia.org/T370317) [14:13:41] (PS1) Pwangai: Exempt Test Group Repositories [labs/tools/sonarqubebot] - https://gerrit.wikimedia.org/r/1063008 (https://phabricator.wikimedia.org/T372565) [14:13:58] (CR) CI reject: [V:-1] Exempt Test Group Repositories [labs/tools/sonarqubebot] - https://gerrit.wikimedia.org/r/1063008 (https://phabricator.wikimedia.org/T372565) (owner: Pwangai) [14:15:14] (PS2) Pwangai: Exempt Test Group Repositories [labs/tools/sonarqubebot] - https://gerrit.wikimedia.org/r/1063008 (https://phabricator.wikimedia.org/T372565) [14:15:32] (CR) CI reject: [V:-1] Exempt Test Group Repositories [labs/tools/sonarqubebot] - https://gerrit.wikimedia.org/r/1063008 (https://phabricator.wikimedia.org/T372565) (owner: Pwangai) [14:16:43] (open) dcaro: checks: add database check [toolforge-repos/sample-complex-app-frontend] (show_deployment_steps) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/4 (https://phabricator.wikimedia.org/T370317) [14:19:48] (update) dcaro: checks: add database check [toolforge-repos/sample-complex-app-frontend] (show_deployment_steps) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/4 (https://phabricator.wikimedia.org/T370317) [14:19:51] (update) dcaro: checks: add database check [toolforge-repos/sample-complex-app-frontend] (show_deployment_steps) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/4 (https://phabricator.wikimedia.org/T370317) [14:19:55] (update) dcaro: checks: add database check [toolforge-repos/sample-complex-app-frontend] 
(show_deployment_steps) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/4 (https://phabricator.wikimedia.org/T370317) [14:19:58] (update) dcaro: checks: add database check [toolforge-repos/sample-complex-app-frontend] (show_deployment_steps) - https://gitlab.wikimedia.org/toolforge-repos/sample-complex-app-frontend/-/merge_requests/4 (https://phabricator.wikimedia.org/T370317) [14:30:34] (CR) Andrew Bogott: [C:+1] create_project: fix the quota setting for the project [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062964 (owner: David Caro) [14:32:15] (PS3) Pwangai: Exempt Test Group Repositories [labs/tools/sonarqubebot] - https://gerrit.wikimedia.org/r/1063008 (https://phabricator.wikimedia.org/T372565) [14:44:55] (CR) David Caro: [C:+2] create_project: fix the quota setting for the project [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062964 (owner: David Caro) [14:44:59] (CR) David Caro: [C:+2] openstack: skip passing envvar for role commands [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062674 (owner: David Caro) [14:45:39] Toolforge (Toolforge iteration 14), Patch-For-Review: [k8s] Add node anti-affinity topologySpreadConstraints to infrastructure components where relevant - https://phabricator.wikimedia.org/T358203#10067120 (Raymond_Ndibe) >>! In T358203#10066033, @dcaro wrote: >>>! In T358203#10065728, @Raymond_Ndibe wro... 
[14:48:58] (Merged) jenkins-bot: openstack: skip passing envvar for role commands [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062674 (owner: David Caro) [14:48:58] (Merged) jenkins-bot: create_project: fix the quota setting for the project [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1062964 (owner: David Caro) [14:54:24] (update) raymond-ndibe: [builds-api] add topologySpreadConstraints to deployment [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/82 (https://phabricator.wikimedia.org/T358203) [15:17:45] FIRING: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [15:22:06] VPS-project-devtools, collaboration-services, GitLab, Release-Engineering-Team: https://gitlab.devtools.wmcloud.org is being indexed by google (and scoring pretty high) - https://phabricator.wikimedia.org/T372538#10067210 (brennen) > Have a robots.txt that makes it non-indexable Reasonable. I do... 
[15:22:12] VPS-project-devtools, collaboration-services, GitLab, Release-Engineering-Team: https://gitlab.devtools.wmcloud.org is being indexed by google (and scoring pretty high) - https://phabricator.wikimedia.org/T372538#10067212 (brennen) [15:22:45] RESOLVED: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_toolserver_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [15:31:58] (open) raymond-ndibe: [toolforge-deploy] DO_NOT_MERGE : increase builds-api replicas in local env [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/481 (https://phabricator.wikimedia.org/T358203) [15:40:52] (open) raymond-ndibe: Draft: [builds-api] DO_NOT_MERGE: schedule all pods on toolforge-worker [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/108 (https://phabricator.wikimedia.org/T358203) [15:41:05] (update) raymond-ndibe: Draft: [toolforge-deploy] DO_NOT_MERGE : increase builds-api replicas in local env [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/481 (https://phabricator.wikimedia.org/T358203) [15:42:23] (update) raymond-ndibe: [builds-api] add topologySpreadConstraints to deployment [repos/cloud/toolforge/builds-api] (node-selector-to-test-topology-spread) - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/82 (https://phabricator.wikimedia.org/T358203) [15:44:11] (update) raymond-ndibe: [builds-api] add topologySpreadConstraints to deployment [repos/cloud/toolforge/builds-api] (node-selector-to-test-topology-spread) - 
https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/82 (https://phabricator.wikimedia.org/T358203) [15:44:28] (update) raymond-ndibe: [builds-api] add topologySpreadConstraints to deployment [repos/cloud/toolforge/builds-api] (node-selector-to-test-topology-spread) - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/82 (https://phabricator.wikimedia.org/T358203) [15:53:09] FIRING: CephClusterInUnknown: #page Ceph cluster in eqiad is in unknown status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInUnknown - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInUnknown [15:53:14] cloud-services-team: CephClusterInUnknown - https://phabricator.wikimedia.org/T372573 (phaultfinder) NEW [15:54:42] (update) raymond-ndibe: DO_NOT_MERGE: testing _display_messages move to toolforge-weld [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/90 [16:02:41] VPS-project-devtools, collaboration-services, GitLab, Release-Engineering-Team: https://gitlab.devtools.wmcloud.org is being indexed by google (and scoring pretty high) - https://phabricator.wikimedia.org/T372538#10067315 (brennen) Here we go: https://docs.gitlab.com/omnibus/settings/nginx.html#... 
[16:05:34] (CR) Kosta Harlan: Exempt Test Group Repositories (4 comments) [labs/tools/sonarqubebot] - https://gerrit.wikimedia.org/r/1063008 (https://phabricator.wikimedia.org/T372565) (owner: Pwangai) [16:09:02] (approved) dcaro: [jobs-cli] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/63 (https://phabricator.wikimedia.org/T341066) (owner: raymond-ndibe) [16:09:06] (update) dcaro: [jobs-cli] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/63 (https://phabricator.wikimedia.org/T341066) (owner: raymond-ndibe) [16:25:06] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [16:26:00] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 29687 bytes in 4.383 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [16:27:00] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:27:23] !log andrew@cloudcumin1001 admin END (ERROR) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=97) [16:27:45] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:27:52] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [16:28:29] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:28:35] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [16:28:54] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:29:01] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [16:29:06] PROBLEM - Wikitech-static main 
page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [16:29:48] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:29:54] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [16:30:00] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 29687 bytes in 4.421 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [16:30:39] RESOLVED: CephClusterInUnknown: #page Ceph cluster in eqiad is in unknown status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInUnknown - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInUnknown [16:30:40] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:30:46] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [16:31:30] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:31:36] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [16:32:04] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:32:10] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [16:33:03] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.wait_for_rebalance (exit_code=99) [16:33:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL [16:34:10] Tool-bridgebot: Have wm-bridgebot bridge #wikipedia-it-sysop on Libera like it does on freenode - https://phabricator.wikimedia.org/T283357#10067415 (bd808) >>! In T283357#7114024, @bd808 wrote: > This bridge is now shutdown. 
:facepalm: This comment should have been applied to {T261954} rather than this... [16:36:49] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE, Patch-For-Review: Q4:rack/setup/install cloudcephosd10[35-38] - https://phabricator.wikimedia.org/T363344#10067421 (Andrew) @cmooney can we get cloudcephosd1036 set up now that the switch work is done? [16:38:20] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:38:26] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [16:39:24] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:39:30] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [16:39:43] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [16:39:49] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [16:40:54] (PS4) Pwangai: Exempt Test Group Repositories [labs/tools/sonarqubebot] - https://gerrit.wikimedia.org/r/1063008 (https://phabricator.wikimedia.org/T372565) [16:44:50] (CR) Pwangai: Exempt Test Group Repositories (3 comments) [labs/tools/sonarqubebot] - https://gerrit.wikimedia.org/r/1063008 (https://phabricator.wikimedia.org/T372565) (owner: Pwangai) [17:22:39] Cloud-VPS (Debian Buster Deprecation), Humaniki: Cloud VPS "wikidumpparse" project Buster deprecation - https://phabricator.wikimedia.org/T367561#10067519 (Maximilianklein) update for 2024-08-14 [x] create cinder volume. [x] move project code [x] move mysql-db files [x] create a new debian bookworm inst... 
[17:30:26] Toolforge, Bitu, Infrastructure-Foundations: Can't activate my new key using the idm.wikimedia.org (bitu) interface - https://phabricator.wikimedia.org/T372581 (Meno25) NEW [18:09:57] Cloud-VPS (Debian Buster Deprecation), Humaniki: Cloud VPS "wikidumpparse" project Buster deprecation - https://phabricator.wikimedia.org/T367561#10067602 (Maximilianklein) update for 2024-08-15 [x] create cinder volume. [x] move project code [x] move mysql-db files [x] create a new debian bookworm inst... [18:53:18] (update) raymond-ndibe: DO_NOT_MERGE: testing _display_messages move to toolforge-weld [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/90 [19:15:38] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [19:15:44] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [19:15:49] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [19:15:55] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [19:16:13] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [19:16:19] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [19:16:24] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [19:16:31] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [19:16:47] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [19:16:54] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [19:18:00] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [19:18:07] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [19:18:43] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node 
[19:18:51] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [19:19:36] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [19:19:43] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [19:20:46] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [19:20:52] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [19:21:35] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node [19:21:42] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0) [19:38:05] cloud-services-team (Hardware), DC-Ops, ops-codfw, SRE: Q1:rack/setup/install cloudlb2004-dev - https://phabricator.wikimedia.org/T370678#10067817 (Jhancock.wm) [20:19:41] FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [20:23:39] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [20:23:46] cloud-services-team (Hardware), DC-Ops, ops-codfw, SRE: Q1:rack/setup/install cloudlb2004-dev - https://phabricator.wikimedia.org/T370678#10067968 (Jhancock.wm) @aborrero when you have a moment, can you do this step for me please? thanks! 
Update the operations/puppet repo [20:28:39] RESOLVED: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [20:56:23] Toolforge (Toolforge iteration 14): something is wrong with pre-commit on builds-api - https://phabricator.wikimedia.org/T372601 (Raymond_Ndibe) NEW [21:08:18] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [21:10:10] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 29698 bytes in 0.435 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [22:10:18] cloud-services-team (Hardware), DC-Ops, ops-codfw, SRE: Q1:rack/setup/install cloudlb2004-dev - https://phabricator.wikimedia.org/T370678#10068225 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin2002 for host cloudlb2004-dev.codfw.wmnet with OS bookworm [23:30:29] cloud-services-team (Hardware), DC-Ops, ops-codfw, SRE: Q1:rack/setup/install cloudlb2004-dev - https://phabricator.wikimedia.org/T370678#10068319 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin2002 for host cloudlb2004-dev.codfw.wmnet with OS bookworm executed...