[00:01:28] FIRING: InstanceDown: Project tools instance tools-prometheus-8 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:06:24] FIRING: ToolforgeKubernetesNodeNotReady: Multiple Kubernetes nodes are not ready #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesNodeNotReady - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesNodeNotReady
[00:06:34] FIRING: EnvvarsAdmissionDown: EnvvarsAdmission is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/EnvvarsAdmissionDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DEnvvarsAdmissionDown
[00:09:45] andrew@cloudcumin1001 safe_reboot (PID 3795026) is awaiting input
[00:11:24] RESOLVED: ToolforgeKubernetesNodeNotReady: Multiple Kubernetes nodes are not ready #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesNodeNotReady - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesNodeNotReady
[00:11:28] RESOLVED: InstanceDown: Project tools instance tools-prometheus-8 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:11:34] RESOLVED: EnvvarsAdmissionDown: EnvvarsAdmission is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/EnvvarsAdmissionDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DEnvvarsAdmissionDown
[06:58:36] Toolforge (Toolforge iteration 21), good first task, Patch-For-Review: [components-api] add `GET` endpoint `/v1/tool//deployments/latest` - https://phabricator.wikimedia.org/T394990#10936431 (Godspowertechnical)
[06:59:48] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 21), Goal, Patch-For-Review: [infra] Commission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#10936432 (Godspowertechnical)
[07:51:45] cloud-services-team (FY2024/2025-Q3-Q4), Cloud-VPS: [trove] Disk full for DBapp instance in glamwikidashboard project - https://phabricator.wikimedia.org/T396724#10936460 (YochayCO) I've tried adding `archive_mode = off` via `https://horizon.wikimedia.org/project/database_configurations/{id}` as you sugg...
[07:55:07] Toolforge (Toolforge iteration 21), good first task, Patch-For-Review: [components-api] add `GET` endpoint `/v1/tool//deployments/latest` - https://phabricator.wikimedia.org/T394990#10936461 (JJMC89)
[07:55:28] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 21), Goal, Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#10936462 (JJMC89)
[10:31:04] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-77 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[10:46:04] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-77 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[11:05:00] FIRING: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[15:05:15] FIRING: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[16:15:00] RESOLVED: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[17:17:19] cloud-services-team, Toolforge: [jobs-api] logs internal datetime error - https://phabricator.wikimedia.org/T362521#10936676 (derenrich) I made a small patch to deal with this. I would push it to. see the attached patch.{F62424246} i'd make a pull request but i don't have rights. i
[17:56:01] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7
[18:22:19] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7
[18:42:23] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7
[18:50:51] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7
[19:06:49] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7
[19:26:06] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7
[19:35:56] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7
[19:45:15] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7
[20:01:40] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7
[20:25:30] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance cvn-app10 in project cvn - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[20:26:28] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7
[20:55:51] (update) galrach600: Separated html, css and js. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/7