[03:43:49] FIRING: PuppetZeroResources: Puppet has failed generate resources on testhost2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[04:00:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-19 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[06:25:18] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-19 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[06:26:48] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-19 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[07:00:10] Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Data-Engineering-Wikistats, and 2 others: Pageviews Analysis 3.0 (Vue + Codex) - https://phabricator.wikimedia.org/T378549#10617025 (Aklapper)
[07:07:01] Toolforge (Tools to be deleted), Tools, Projects-Cleanup: Archive/delete tool test-stats - https://phabricator.wikimedia.org/T310803#10617030 (Aklapper) Resolved→Open a:taavi→None Repo at https://gitlab.wikimedia.org/toolforge-repos/test-stats/ still exists thus boldly reopening
[07:35:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-19 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[07:43:49] FIRING: PuppetZeroResources: Puppet has failed generate resources on testhost2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[09:26:07] (update) dcaro: [jobs-cli] include container info in log [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/87 (https://phabricator.wikimedia.org/T388274) (owner: raymond-ndibe)
[09:40:58] (update) dcaro: [toolforge-weld] get all logs from all containers in all pods [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/77 (https://phabricator.wikimedia.org/T388274) (owner: raymond-ndibe)
[09:42:46] (update) dcaro: [jobs-cli] include container info in log [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/87 (https://phabricator.wikimedia.org/T388274) (owner: raymond-ndibe)
[09:46:40] (approved) dcaro: [jobs-cli] include container info in log [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/87 (https://phabricator.wikimedia.org/T388274) (owner: raymond-ndibe)
[09:46:41] (update) dcaro: [jobs-cli] include container info in log [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/87 (https://phabricator.wikimedia.org/T388274) (owner: raymond-ndibe)
[10:22:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-19 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[10:31:31] Striker, Bitu, Infrastructure-Foundations: Remove feature to connect SUL account to Striker (read from Bitu instead) - https://phabricator.wikimedia.org/T371595#10617703 (Arendpieter) Am I correct that the same feature exists in Phabricator as well? Thus, a user may connect their Wikimedia SUL accoun...
[10:41:30] (update) fnegri: wmcs-k8s-metrics: upgrade charts for K8s v1.29 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/681 (https://phabricator.wikimedia.org/T362868)
[10:51:44] FIRING: [2x] ProbeDown: Service virt.cloudgw.codfw1dev.wikimediacloud.org:0 has failed probes (icmp_virt_cloudgw_codfw1dev_wikimediacloud_org_from_eqiad_ip6) - https://wikitech.wikimedia.org/wiki/Network_monitoring#ProbeDown - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[10:51:54] cloud-services-team: ProbeDown - https://phabricator.wikimedia.org/T388379 (phaultfinder) NEW
[10:52:10] (update) fnegri: wmcs-k8s-metrics: upgrade charts for K8s v1.29 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/681 (https://phabricator.wikimedia.org/T362868)
[10:52:33] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-19 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[11:17:05] cloud-services-team, Toolforge: Refactor wmcs-k8s-metrics component - https://phabricator.wikimedia.org/T388382 (fnegri) NEW
[11:17:48] (update) fnegri: wmcs-k8s-metrics: upgrade charts for K8s v1.29 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/681 (https://phabricator.wikimedia.org/T362868)
[11:21:44] FIRING: [4x] ProbeDown: Service virt.cloudgw.codfw1dev.wikimediacloud.org:0 has failed probes (icmp_virt_cloudgw_codfw1dev_wikimediacloud_org_from_codfw_ip6) - https://wikitech.wikimedia.org/wiki/Network_monitoring#ProbeDown - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[11:21:49] cloud-services-team: ProbeDown - https://phabricator.wikimedia.org/T388379#10617936 (phaultfinder)
[12:10:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-19 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[12:11:04] cloud-services-team: ProbeDown - https://phabricator.wikimedia.org/T388379#10618210 (tappof)
[12:12:58] cloud-services-team: ProbeDown - https://phabricator.wikimedia.org/T388379#10618236 (tappof) irc logs: ` 11:15:30 tappof │ arturo: dcaro Just a heads-up in case of any unwanted alerts: I'm merging this patch: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1100819 11:15:59 arturo │ tappof: thanks...
[12:18:36] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/10
[12:45:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-19 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[12:47:32] (CR) Samtar: [C:+2] Restart EventStream if it is replaying old events [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/1125575 (https://phabricator.wikimedia.org/T388292) (owner: AntiCompositeNumber)
[12:48:04] (Merged) jenkins-bot: Restart EventStream if it is replaying old events [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/1125575 (https://phabricator.wikimedia.org/T388292) (owner: AntiCompositeNumber)
[12:48:51] Cloud-Services, cloud-services-team, Discovery-Search, Elasticsearch, SRE Observability: Cloudelastic alerts should route to data platform alerts, not wmcs - https://phabricator.wikimedia.org/T388270#10618446 (fgiunchedi) Thanks for reaching out @RKemper, you are indeed correct that icinga al...
[12:51:06] cloud-services-team, Toolforge: Refactor wmcs-k8s-metrics component - https://phabricator.wikimedia.org/T388382#10618453 (taavi) My choice would be to split it to several components to make it as close to other components as possible. The single component using multiple charts is a relic from times befor...
[12:51:44] FIRING: [4x] ProbeDown: Service virt.cloudgw.codfw1dev.wikimediacloud.org:0 has failed probes (icmp_virt_cloudgw_codfw1dev_wikimediacloud_org_from_codfw_ip6) - https://wikitech.wikimedia.org/wiki/Network_monitoring#ProbeDown - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[13:16:44] RESOLVED: [2x] ProbeDown: Service virt.cloudgw.codfw1dev.wikimediacloud.org:0 has failed probes (icmp_virt_cloudgw_codfw1dev_wikimediacloud_org_from_codfw_ip6) - https://wikitech.wikimedia.org/wiki/Network_monitoring#ProbeDown - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[13:39:10] cloud-services-team, Toolforge: Refactor wmcs-k8s-metrics component - https://phabricator.wikimedia.org/T388382#10618774 (dcaro) >>! In T388382#10618453, @taavi wrote: > My choice would be to split it to several components to make it as close to other components as possible. The single component using mu...
[13:53:34] cloud-services-team: ProbeDown - https://phabricator.wikimedia.org/T388379#10618814 (cmooney) I referenced the wrong task on this patch, but it should allow the pings to work https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1126035
[14:07:09] cloud-services-team (FY2024/2025-Q3-Q4), Cloud-VPS, Patch-For-Review: tofu-infra: refactor repo structure - https://phabricator.wikimedia.org/T375283#10618901 (aborrero) I'm finding some difficulties with representing the current resource data model within the intended refactor. I think I will be tr...
[14:08:21] (CR) Arturo Borrero Gonzalez: [C:+1] "LGTM." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1124377 (owner: David Caro)
[14:10:24] cloud-services-team: ProbeDown - https://phabricator.wikimedia.org/T388379#10618948 (aborrero)
[14:10:32] cloud-services-team, Cloud-VPS, IPv6: CloudVPS: IPv6 in eqiad1 - https://phabricator.wikimedia.org/T380174#10618947 (aborrero)
[14:27:50] (CR) David Caro: [C:+2] toolforge.component.deploy: fail if the tests failed [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1124377 (owner: David Caro)
[14:31:44] (Merged) jenkins-bot: toolforge.component.deploy: fail if the tests failed [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1124377 (owner: David Caro)
[14:36:25] cloud-services-team (FY2024/2025-Q3-Q4), Cloud-VPS, Patch-For-Review: tofu-infra: refactor repo structure - https://phabricator.wikimedia.org/T375283#10619058 (fnegri) +1 from me, I like this.
[14:45:41] (approved) dcaro: [builds-builder] create and use maintain-harbor robot account [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/66 (https://phabricator.wikimedia.org/T361698) (owner: raymond-ndibe)
[14:45:42] (update) dcaro: [builds-builder] create and use maintain-harbor robot account [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/66 (https://phabricator.wikimedia.org/T361698) (owner: raymond-ndibe)
[14:46:09] (approved) dcaro: [ maintain-harbor ] add job for managing harbor quotas [repos/cloud/toolforge/maintain-harbor] (refactor_config) - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/22 (https://phabricator.wikimedia.org/T352417) (owner: sstefanova)
[14:46:16] (update) dcaro: [ maintain-harbor ] add job for managing harbor quotas [repos/cloud/toolforge/maintain-harbor] (refactor_config) - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/22 (https://phabricator.wikimedia.org/T352417) (owner: sstefanova)
[14:51:17] cloud-services-team, Toolforge: [components-api] add one-off, scheduled and continuous jobs support to the yaml + api - https://phabricator.wikimedia.org/T362075#10619167 (So9q) I like where this is going :)
[15:24:50] Cloud-Services, cloud-services-team, Elasticsearch, SRE Observability, and 2 others: Cloudelastic alerts should route to data platform alerts, not wmcs - https://phabricator.wikimedia.org/T388270#10619372 (Gehel)
[15:27:18] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[15:30:35] cloud-services-team, Toolforge, Kubernetes: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237#10619418 (fnegri) p:Medium→High This will upgrade etcd from 3.3.25 to 3.4.23, which is inside the range currently [approved by Kubernetes](https://kuber...
[15:38:27] (update) raymond-ndibe: [ maintain-harbor ] add job for managing harbor quotas [repos/cloud/toolforge/maintain-harbor] (refactor_config) - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/22 (https://phabricator.wikimedia.org/T352417) (owner: sstefanova)
[15:38:38] (update) raymond-ndibe: [ maintain-harbor ] add job for managing harbor quotas [repos/cloud/toolforge/maintain-harbor] (refactor_config) - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/22 (https://phabricator.wikimedia.org/T352417) (owner: sstefanova)
[15:40:03] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[16:08:20] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[16:42:29] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[16:48:52] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[16:50:23] (open) dcaro: add build status [repos/cloud/toolforge/components-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/22
[16:51:18] (update) dcaro: add build status [repos/cloud/toolforge/components-cli] (add_tabulate) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/22
[16:52:14] (open) dcaro: show: add some indentation for the long status [repos/cloud/toolforge/components-cli] (add_build_status) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/23
[16:52:22] (update) dcaro: show: add some indentation for the long status [repos/cloud/toolforge/components-cli] (add_build_status) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/23
[17:02:31] (approved) raymond-ndibe: [toolforge-deploy] refactor maintain-harbor config [repos/cloud/toolforge/toolforge-deploy] (maintain_harbor_use_robot_account) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/689 (https://phabricator.wikimedia.org/T386953)
[17:02:34] (update) raymond-ndibe: [toolforge-deploy] refactor maintain-harbor config [repos/cloud/toolforge/toolforge-deploy] (maintain_harbor_use_robot_account) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/689 (https://phabricator.wikimedia.org/T386953)
[17:02:36] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[17:03:14] (update) raymond-ndibe: [toolforge-deploy] test maintain-harbor quota management [repos/cloud/toolforge/toolforge-deploy] (add_maintain_harbor_quota_config) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/690 (https://phabricator.wikimedia.org/T352417)
[17:06:57] (approved) raymond-ndibe: [builds-builder] create and use maintain-harbor robot account [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/66 (https://phabricator.wikimedia.org/T361698)
[17:07:01] (merge) raymond-ndibe: [builds-builder] create and use maintain-harbor robot account [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/66 (https://phabricator.wikimedia.org/T361698)
[17:09:02] (open) group_203_bot_4866fc124f4b41659f667468a6115cf3: builds-builder: bump to 0.0.127-20250310170709-62faa3b3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/701 (https://phabricator.wikimedia.org/T361698)
[17:09:27] (approved) raymond-ndibe: [toolforge-deploy] maintain-harbor use robot account [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/672 (https://phabricator.wikimedia.org/T361698)
[17:09:29] (update) raymond-ndibe: [toolforge-deploy] maintain-harbor use robot account [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/672 (https://phabricator.wikimedia.org/T361698)
[17:09:47] (merge) raymond-ndibe: [toolforge-deploy] maintain-harbor use robot account [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/672 (https://phabricator.wikimedia.org/T361698)
[17:09:51] (update) raymond-ndibe: [toolforge-deploy] refactor maintain-harbor config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/689 (https://phabricator.wikimedia.org/T386953)
[17:12:25] (approved) raymond-ndibe: [builds-api] use maintain-harbor robot account locally [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/119 (https://phabricator.wikimedia.org/T361698)
[17:12:43] (merge) raymond-ndibe: [builds-api] use maintain-harbor robot account locally [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/119 (https://phabricator.wikimedia.org/T361698)
[17:14:36] (approved) raymond-ndibe: [maintain-harbor] move image_retention_policy_template to config [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/44 (https://phabricator.wikimedia.org/T386953)
[17:14:39] (update) raymond-ndibe: [maintain-harbor] move image_retention_policy_template to config [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/44 (https://phabricator.wikimedia.org/T386953)
[17:14:52] (update) raymond-ndibe: [toolforge-deploy] refactor maintain-harbor config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/689 (https://phabricator.wikimedia.org/T386953)
[17:15:06] cloud-services-team, DC-Ops, Ganeti, Infrastructure-Foundations, and 2 others: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10620048 (fnegri) The temperature remains very close to the threshold, and the alert has been firing intermittently since...
[17:15:21] (merge) raymond-ndibe: [toolforge-deploy] refactor maintain-harbor config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/689 (https://phabricator.wikimedia.org/T386953)
[17:15:22] (update) raymond-ndibe: [toolforge-deploy] add maintain-harbor quota config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/674 (https://phabricator.wikimedia.org/T352417)
[17:15:41] (merge) raymond-ndibe: [maintain-harbor] move image_retention_policy_template to config [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/44 (https://phabricator.wikimedia.org/T386953)
[17:15:43] (update) raymond-ndibe: [ maintain-harbor ] add job for managing harbor quotas [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/22 (https://phabricator.wikimedia.org/T352417) (owner: sstefanova)
[17:16:27] (update) raymond-ndibe: [toolforge-deploy] add maintain-harbor quota config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/674 (https://phabricator.wikimedia.org/T352417)
[17:16:45] (update) raymond-ndibe: [toolforge-deploy] add maintain-harbor quota config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/674 (https://phabricator.wikimedia.org/T352417)
[17:17:18] (approved) raymond-ndibe: [toolforge-deploy] add maintain-harbor quota config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/674 (https://phabricator.wikimedia.org/T352417)
[17:17:33] (merge) raymond-ndibe: [toolforge-deploy] add maintain-harbor quota config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/674 (https://phabricator.wikimedia.org/T352417)
[17:17:33] (update) raymond-ndibe: [toolforge-deploy] test maintain-harbor quota management [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/690 (https://phabricator.wikimedia.org/T352417)
[17:18:06] (open) group_203_bot_4866fc124f4b41659f667468a6115cf3: maintain-harbor: bump to 0.0.20-20250310171549-ae483a76 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/702 (https://phabricator.wikimedia.org/T386953)
[17:18:25] (approved) raymond-ndibe: [ maintain-harbor ] add job for managing harbor quotas [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/22 (https://phabricator.wikimedia.org/T352417) (owner: sstefanova)
[17:19:09] (merge) raymond-ndibe: [ maintain-harbor ] add job for managing harbor quotas [repos/cloud/toolforge/maintain-harbor] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor/-/merge_requests/22 (https://phabricator.wikimedia.org/T352417) (owner: sstefanova)
[17:19:44] (open) group_203_bot_4866fc124f4b41659f667468a6115cf3: builds-api: bump to 0.0.180-20250310171252-3a5bd08b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/703 (https://phabricator.wikimedia.org/T361698)
[17:21:35] (update) group_203_bot_4866fc124f4b41659f667468a6115cf3: maintain-harbor: bump to 0.0.20-20250310171549-ae483a76 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/702 (https://phabricator.wikimedia.org/T386953)
[17:21:38] (update) group_203_bot_4866fc124f4b41659f667468a6115cf3: maintain-harbor: bump to 0.0.20-20250310171549-ae483a76 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/702 (https://phabricator.wikimedia.org/T352417 https://phabricator.wikimedia.org/T386953)
[17:24:11] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[17:24:12] !log raymond-ndibe@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[17:27:36] !log raymond-ndibe@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component jobs-api
[17:28:56] !log raymond-ndibe@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-builder
[17:32:43] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[17:36:20] !log raymond-ndibe@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder
[17:36:44] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component builds-builder
[17:37:46] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[17:40:53] cloud-services-team, Toolforge: [docs] Create a tutorial on how to deploy a Node.js app using Build Service - https://phabricator.wikimedia.org/T353313#10620172 (fnegri) Open→Resolved a:fnegri @So9q kindly created a tutorial at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Building_c...
[17:44:12] !log raymond-ndibe@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder
[17:45:42] (open) dcaro: deployment: use just the timestamp + a bit of random for the id [repos/cloud/toolforge/components-api] (build_on_deploy) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/58
[17:47:18] (update) dcaro: deployment: use just the timestamp + a bit of random for the id [repos/cloud/toolforge/components-api] (build_on_deploy) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/58
[17:47:25] (update) dcaro: deployment: use just the timestamp + a bit of random for the id [repos/cloud/toolforge/components-api] (build_on_deploy) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/58
[18:04:05] cloud-services-team, DC-Ops, Ganeti, Infrastructure-Foundations, and 2 others: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10620286 (VRiley-WMF) Understood, I'm currently investigating this
[18:08:09] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[18:08:23] (update) dcaro: deployment: use just the timestamp + a bit of random for the id [repos/cloud/toolforge/components-api] (build_on_deploy) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/58
[18:31:11] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[18:32:02] (update) dcaro: deployment: use just the timestamp + a bit of random for the id [repos/cloud/toolforge/components-api] (build_on_deploy) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/58
[18:36:24] (update) raymond-ndibe: builds-builder: bump to 0.0.127-20250310170709-62faa3b3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/701 (https://phabricator.wikimedia.org/T361698) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[18:36:24] (approved) raymond-ndibe: builds-builder: bump to 0.0.127-20250310170709-62faa3b3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/701 (https://phabricator.wikimedia.org/T361698) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[18:37:02] (update) dcaro: deploy_task: build the source based components [repos/cloud/toolforge/components-api] (add_deployment_status_updates) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/57
[18:37:11] (merge) raymond-ndibe: builds-builder: bump to 0.0.127-20250310170709-62faa3b3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/701 (https://phabricator.wikimedia.org/T361698) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[18:37:13] !log raymond-ndibe@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[18:37:19] (approved) raymond-ndibe: builds-api: bump to 0.0.180-20250310171252-3a5bd08b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/703 (https://phabricator.wikimedia.org/T361698) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[18:37:28] (update) raymond-ndibe: builds-api: bump to 0.0.180-20250310171252-3a5bd08b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/703 (https://phabricator.wikimedia.org/T361698) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[18:37:31] (update) dcaro: deployment: use just the timestamp + a bit of random for the id [repos/cloud/toolforge/components-api] (build_on_deploy) - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/58
[18:39:22] !log raymond-ndibe@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component builds-api
[18:41:06] !log raymond-ndibe@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor
[18:48:46] !log raymond-ndibe@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-harbor
[18:49:05] (update) raymond-ndibe: maintain-harbor: bump to 0.0.20-20250310171549-ae483a76 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/702 (https://phabricator.wikimedia.org/T352417 https://phabricator.wikimedia.org/T386953) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[18:49:24] !log raymond-ndibe@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[18:52:26] cloud-services-team, DC-Ops, Ganeti, Infrastructure-Foundations, and 2 others: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10620517 (VRiley-WMF) Is there a timeframe for us to take this server down?
[18:56:42] !log raymond-ndibe@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-api
[18:57:08] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[19:07:04] !log raymond-ndibe@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-api
[19:07:20] (update) raymond-ndibe: builds-api: bump to 0.0.180-20250310171252-3a5bd08b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/703 (https://phabricator.wikimedia.org/T361698) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[19:07:26] (merge) raymond-ndibe: builds-api: bump to 0.0.180-20250310171252-3a5bd08b [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/703 (https://phabricator.wikimedia.org/T361698) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[19:08:01] (update) raymond-ndibe: maintain-harbor: bump to 0.0.20-20250310171549-ae483a76 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/702 (https://phabricator.wikimedia.org/T352417 https://phabricator.wikimedia.org/T386953) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[19:17:28] cloud-services-team, DC-Ops, Ganeti, Infrastructure-Foundations, and 2 others: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10620665 (fnegri) I think this one is tricky to depool, there are some notes at https://wikitech.wikimedia.org/wiki/Dumps/...
[19:18:46] cloud-services-team, DC-Ops, Ganeti, Infrastructure-Foundations, and 2 others: Temperature Inlet Temp issue on clouddumps1001:9290 - https://phabricator.wikimedia.org/T383723#10620679 (fnegri) Routing the traffic to the other host would also clarify if the high temperature is somehow related to t...
[19:42:17] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor
[19:50:02] !log raymond-ndibe@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component maintain-harbor
[19:51:29] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor
[19:55:19] !log raymond-ndibe@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component maintain-harbor
[19:56:20] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor
[19:59:54] !log raymond-ndibe@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component maintain-harbor
[20:05:06] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[20:05:10] !log raymond-ndibe@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component builds-api
[20:05:22] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[20:09:53] !log raymond-ndibe@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component builds-api
[20:20:21] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor
[20:28:45] !log raymond-ndibe@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-harbor
[20:30:10] (approved) raymond-ndibe: maintain-harbor: bump to 0.0.20-20250310171549-ae483a76 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/702 (https://phabricator.wikimedia.org/T352417 https://phabricator.wikimedia.org/T386953) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[20:30:11] (update) raymond-ndibe: maintain-harbor: bump to 0.0.20-20250310171549-ae483a76 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/702 (https://phabricator.wikimedia.org/T352417 https://phabricator.wikimedia.org/T386953) (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[20:30:42] (update) raymond-ndibe: [toolforge-deploy] test maintain-harbor quota management [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/690 (https://phabricator.wikimedia.org/T352417)
[20:31:58] !log raymond-ndibe@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor
[20:39:51] !log raymond-ndibe@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-harbor
[20:40:23] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor
[20:49:18] !log raymond-ndibe@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-harbor
[20:49:41] (approved) raymond-ndibe: [toolforge-deploy] test maintain-harbor quota management [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/690 (https://phabricator.wikimedia.org/T352417)
[20:49:45] (merge) raymond-ndibe: [toolforge-deploy] test maintain-harbor quota management [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/690 (https://phabricator.wikimedia.org/T352417)
[22:01:47] VPS-project-Codesearch, collaboration-services: Graduate codesearch to production - https://phabricator.wikimedia.org/T268199#10621366 (Dzahn) a:Dzahn
[22:45:26] Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Replace or remove deployment-echostore02.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T361383#10621478 (bd808) Resolved→Open This instance apparently never got a replacement. The `echostore.svc.deplo...
[23:19:57] Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Replace or remove deployment-echostore02.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T361383#10621529 (bd808) @thcipriani dug around and found some config from older instances that might get somebody starte...