[00:00:33] Tool-refill: Merge github.com/refill-ng/refill-labsconf repo into github.com/CurbSafeCharmer/refill repo - https://phabricator.wikimedia.org/T340503#11791794 (Novem_Linguae) https://github.com/refill-ng/refill-labsconf appears to be some sort of repo to hold config files for some kind of continuous deploymen...
[00:01:28] Tool-refill: Pick a main branch, then stop using other branches - https://phabricator.wikimedia.org/T340504#11791796 (Novem_Linguae) Open→Resolved a:Novem_Linguae We killed off the labs-stable branch and its associated front end code a few years ago. Everything is on master now (and the Vue r...
[00:11:04] Tool-refill: delete 33,000 unnecessary front end files from production - https://phabricator.wikimedia.org/T422441 (Novem_Linguae) NEW
[00:12:40] Tool-refill: delete unnecessary back end files from production - https://phabricator.wikimedia.org/T422442 (Novem_Linguae) NEW
[00:30:55] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[01:03:14] Tool-refill: get back end working in localhost using Docker, and write work instruction - https://phabricator.wikimedia.org/T422451 (Novem_Linguae) NEW
[01:04:42] Tool-refill: in the GitHub repo, reorganize the /backend/ files so that the directory is named /refill-api.toolforge.org/, and so that all subfolders match what is deployed - https://phabricator.wikimedia.org/T422436#11791987 (Novem_Linguae)
[01:06:09] Tool-refill: figure out how to deploy the back end - https://phabricator.wikimedia.org/T422439#11791991 (Novem_Linguae)
[02:03:53] tools-platform-team: clis: only create tag on merge of the release patch - https://phabricator.wikimedia.org/T422452 (Raymond_Ndibe) NEW
[02:12:43] tools-platform-team: logs-api fails with cryptic error if query range is too far in the past e.g. --since 1000d - https://phabricator.wikimedia.org/T422453 (Raymond_Ndibe) NEW
[02:15:25] tools-platform-team: logs-api: handle exception raised when query range exceeds max_query_length - https://phabricator.wikimedia.org/T422454 (Raymond_Ndibe) NEW
[02:18:58] (update) raymond-ndibe: loki.py: handle query time range error in do_follow [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/17 (https://phabricator.wikimedia.org/T422454)
[02:19:19] (update) raymond-ndibe: loki.py: handle query time range error in do_follow [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/17 (https://phabricator.wikimedia.org/T422454)
[02:19:55] (update) raymond-ndibe: loki.py: handle query time range error in do_follow [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/17 (https://phabricator.wikimedia.org/T422454)
[02:20:02] (update) raymond-ndibe: loki.py: handle query time range error in do_follow [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/17 (https://phabricator.wikimedia.org/T422454)
[02:26:39] cloud-services-team, Toolforge: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path - https://phabricator.wikimedia.org/T422224#11792087 (Hawkeye7) @bd808 Bryan, that would tie in with T412653: migrate to Heroku buildpack for dotnet 10. However, I need...
[03:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[03:10:17] FIRING: [2x] PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[03:24:54] cloud-services-team, Toolforge: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path - https://phabricator.wikimedia.org/T422224#11792182 (Hawkeye7) For the present, I am following @dcaro's recommendation and building with `--use-deprecated-versions`....
[03:35:17] RESOLVED: [2x] PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[03:39:19] Tool-refill: get back end working in localhost using Docker, and write work instruction - https://phabricator.wikimedia.org/T422451#11792193 (Novem_Linguae)
[04:31:03] Tool-refill: figure out how to deploy the front end - https://phabricator.wikimedia.org/T422370#11792229 (Novem_Linguae)
[04:31:42] Tool-refill: figure out how to deploy the back end - https://phabricator.wikimedia.org/T422439#11792230 (Novem_Linguae)
[05:07:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[05:37:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[06:33:28] Toolforge (Quota-requests): Elasticsearch credential request for techactivity - https://phabricator.wikimedia.org/T422462 (Addshore) NEW
[07:08:57] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Move all openstack rabbitmq queues to quorum - https://phabricator.wikimedia.org/T421054#11792439 (fgiunchedi) FS utilization has stabilized as segments are reclaimed {F75217753}
[07:13:07] Toolforge, tools-platform-team: clis: only create tag on merge of the release patch - https://phabricator.wikimedia.org/T422452#11792444 (taavi)
[07:13:37] Toolforge, tools-platform-team: logs-api fails with cryptic error if query range is too far in the past e.g. --since 1000d - https://phabricator.wikimedia.org/T422453#11792445 (taavi)
[07:13:42] Toolforge, tools-platform-team: logs-api: handle exception raised when query range exceeds max_query_length - https://phabricator.wikimedia.org/T422454#11792446 (taavi)
[07:14:01] (approved) filippo: istio-gateway: Reduce Istio metric cardinality [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1206 (https://phabricator.wikimedia.org/T421386) (owner: taavi)
[07:14:56] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component istio-gateway
[07:15:06] !log taavi@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component istio-gateway
[07:16:02] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component istio-gateway
[07:25:28] Data-Services, tools-infrastructure-team, Data-Platform-SRE, Datasets-General-or-Unknown, Patch-For-Review: Dumps access log analytics should support multiple active hosts - https://phabricator.wikimedia.org/T422042#11792466 (taavi) p:Triage→High I have [[ https://wikimedia.slack.com/...
[07:30:35] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component istio-gateway
[07:30:48] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component istio-gateway
[07:34:34] (update) isaranto: Draft: Add daily thread mode for project watch notifications [toolforge-repos/phabricator-slack-bot] - https://gitlab.wikimedia.org/toolforge-repos/phabricator-slack-bot/-/merge_requests/2
[07:46:50] Tool-refill: figure out how to deploy the front end - https://phabricator.wikimedia.org/T422370#11792511 (Novem_Linguae)
[07:49:38] Data-Services, tools-infrastructure-team, Datasets-General-or-Unknown, Traffic: Migrate clouddumps https/rsync interfaces behind LVS - https://phabricator.wikimedia.org/T422040#11792514 (taavi) p:Triage→Medium
[07:49:54] Data-Services, tools-infrastructure-team, Data-Platform-SRE, Datasets-General-or-Unknown, Patch-For-Review: Dumps access log analytics should support multiple active hosts - https://phabricator.wikimedia.org/T422042#11792515 (taavi) a:taavi
[07:50:34] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component istio-gateway
[07:50:55] RESOLVED: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[07:51:13] (merge) taavi: istio-gateway: Reduce Istio metric cardinality [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1206 (https://phabricator.wikimedia.org/T421386)
[07:52:25] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[07:52:44] FIRING: IstioGatewayPodMisplaced: istio-gateway pod misplaced - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/IstioGatewayPodMisplaced - https://prometheus-alerts.wmcloud.org/?q=alertname%3DIstioGatewayPodMisplaced
[07:57:44] RESOLVED: IstioGatewayPodMisplaced: istio-gateway pod misplaced - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/IstioGatewayPodMisplaced - https://prometheus-alerts.wmcloud.org/?q=alertname%3DIstioGatewayPodMisplaced
[08:02:22] FIRING: HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s1 backend clouddb1017.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[08:07:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s1 backend clouddb1017.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[08:08:42] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[08:08:44] (update) dcaro: builds-api: bump to 0.0.211-20260402143212-0de993af [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1199 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[08:09:28] Tool-refill: in the GitHub repo, reorganize the /backend/ files so that the directory is named /refill-api.toolforge.org/, and so that all subfolders match what is deployed - https://phabricator.wikimedia.org/T422436#11792555 (Novem_Linguae)
[08:10:52] cloud-services-team, Data-Services, Data-Engineering, Data-Engineering-Radar, DBA: Re-run maintainviews on all clouddb* and an-redacteddb1001.eqiad.wmnet - https://phabricator.wikimedia.org/T422459#11792557 (taavi)
[08:13:47] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-api
[08:30:45] Tool-refill: figure out if we're still using internationalization/localization - https://phabricator.wikimedia.org/T422440#11792625 (Novem_Linguae) I asked AI and it thinks that... 1) there is no way to change languages in the UI, and 2) you can set the cookie `TsIntuition_userlang` to something such as `es...
[08:31:19] Data-Services, tools-infrastructure-team, Data-Platform-SRE, Datasets-General-or-Unknown: Dumps access log analytics should support multiple active hosts - https://phabricator.wikimedia.org/T422042#11792630 (taavi) Open→Resolved
[08:31:31] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[08:32:49] Data-Services, tools-infrastructure-team, Datasets-General-or-Unknown, Traffic: Migrate clouddumps https/rsync interfaces behind LVS - https://phabricator.wikimedia.org/T422040#11792643 (taavi) a:taavi Per Traffic this should be a high-traffic2 service. I have allocated a VIP, namely ` dumps-...
[08:37:11] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-api
[08:37:20] (approved) dcaro: builds-api: bump to 0.0.211-20260402143212-0de993af [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1199 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[08:37:28] (merge) dcaro: builds-api: bump to 0.0.211-20260402143212-0de993af [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1199 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[08:37:40] (update) dcaro: registry-admission: bump to 0.0.74-20260402143720-ac017c44 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1198 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[08:37:45] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component registry-admission
[08:41:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[08:43:30] cloud-services-team, Toolforge: Buildservice for Rust fails due to fagiani/apt and builder stack "24" mismatch - https://phabricator.wikimedia.org/T422384#11792712 (dcaro) That would be the future-proof way to go yep (url changed https://wikitech.wikimedia.org/wiki/Help:Toolforge/Building_container_image...
[08:46:13] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component registry-admission
[08:47:38] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component registry-admission
[08:53:05] cloud-services-team, Toolforge: Buildservice for Rust fails due to fagiani/apt and builder stack "24" mismatch - https://phabricator.wikimedia.org/T422384#11792774 (magnusmanske) I have added the `project.toml` file to both my [[ https://github.com/magnusmanske/mixnmatch_rs/commit/6c80593b43e45a6ac11beff...
[08:57:15] cloud-services-team, Toolforge: Buildservice for Rust fails due to fagiani/apt and builder stack "24" mismatch - https://phabricator.wikimedia.org/T422384#11792781 (dcaro) >>! In T422384#11792774, @magnusmanske wrote: > I have added the `project.toml` file to both my [[ https://github.com/magnusmanske/mi...
[08:57:25] RESOLVED: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[08:57:49] !log dcaro@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component registry-admission
[08:58:52] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component registry-admission
[09:08:41] FIRING: CloudVPSDesignateLeaks: Detected 8 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:08:55] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[09:09:24] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component registry-admission
[09:11:17] FIRING: [2x] PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[09:20:00] (approved) dcaro: registry-admission: bump to 0.0.74-20260402143720-ac017c44 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1198 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[09:20:04] (merge) dcaro: registry-admission: bump to 0.0.74-20260402143720-ac017c44 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1198 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[09:20:24] (update) dcaro: ingress-admission: bump to 0.0.86-20260402143548-e87f6446 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1197 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[09:20:55] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component ingress-admission
[09:24:53] Tool-refill: Preview/Save button doesn't work - https://phabricator.wikimedia.org/T422399#11792881 (Novem_Linguae) Open→Resolved a:Novem_Linguae This bug is related to the lines `let wdiff = new WikEdDiff();` or `this.diff = wdiff.diff(this.origWikicode, this.markedWikicode);` throwing an err...
[09:28:52] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component ingress-admission
[09:29:55] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component ingress-admission
[09:36:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[09:37:39] Tool-refill: figure out how to deploy the front end - https://phabricator.wikimedia.org/T422370#11792941 (Novem_Linguae)
[09:38:16] Tool-refill: figure out how to deploy the back end - https://phabricator.wikimedia.org/T422439#11792942 (Novem_Linguae)
[09:39:10] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component ingress-admission
[09:41:09] (approved) dcaro: ingress-admission: bump to 0.0.86-20260402143548-e87f6446 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1197 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[09:41:13] (merge) dcaro: ingress-admission: bump to 0.0.86-20260402143548-e87f6446 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1197 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[09:42:06] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component envvars-api
[09:42:11] (update) dcaro: envvars-api: bump to 0.0.83-20260402143135-c3f2dcc8 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1196 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[09:42:56] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component envvars-api
[09:43:04] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component envvars-api
[09:44:21] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component envvars-api
[09:44:51] cloud-services-team, Toolforge, tools-platform-team: Buildservice for Rust fails due to fagiani/apt and builder stack "24" mismatch - https://phabricator.wikimedia.org/T422384#11792970 (dcaro) a:dcaro
[09:45:05] cloud-services-team, Toolforge, tools-platform-team: Buildservice for Rust fails due to fagiani/apt and builder stack "24" mismatch - https://phabricator.wikimedia.org/T422384#11792973 (dcaro) p:Triage→High
[09:45:40] cloud-services-team, Toolforge, tools-platform-team: [builds-builder] incompatibility of fagiani/apt and builder stack "24" - https://phabricator.wikimedia.org/T422384#11792975 (dcaro)
[09:46:18] (approved) dcaro: envvars-api: bump to 0.0.83-20260402143135-c3f2dcc8 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1196 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[09:46:25] (merge) dcaro: envvars-api: bump to 0.0.83-20260402143135-c3f2dcc8 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1196 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[09:46:42] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-builder
[09:46:49] (update) dcaro: builds-builder: bump to 0.0.144-20260402143052-dcd3d2fa [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1194 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[09:50:50] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder
[09:51:06] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component builds-builder
[09:56:02] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder
[10:03:23] cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure: Puppet fail to create volume group for ephemeral disk space when it is sda (instead of sdb) - https://phabricator.wikimedia.org/T422258#11793023 (hashar) I have created some new virtual machines this morning, most had the ephemera...
[10:12:53] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Move trove DB instances to rabbitmq transient quorum queues - https://phabricator.wikimedia.org/T421857#11793058 (fgiunchedi)
[10:47:29] (approved) dcaro: builds-builder: bump to 0.0.144-20260402143052-dcd3d2fa [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1194 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[10:47:32] (merge) dcaro: builds-builder: bump to 0.0.144-20260402143052-dcd3d2fa [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1194 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[11:01:31] (update) fnegri: Add --diff-mode and remove --dry-run [repos/cloud/wikireplicas-utils] - https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/10
[11:04:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[11:13:08] Tool-wmf-openapi-linter, MW-Interfaces-Team: Improve linting - requestBody and response examples - https://phabricator.wikimedia.org/T421375#11793263 (KBach)
[11:13:57] Tool-wmf-openapi-linter, MW-Interfaces-Team: Add tests to ensure consistency between OAD example and OpenAPI linter - https://phabricator.wikimedia.org/T419576#11793266 (KBach)
[11:34:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[11:47:10] cloud-services-team, Toolforge, tools-platform-team: [builds-builder] incompatibility of fagiani/apt and builder stack "24" - https://phabricator.wikimedia.org/T422384#11793379 (Sean.hoyland) I had a similar build error with a python tool that used an Aptfile to get noto font packages for a plotting...
[11:54:04] cloud-services-team, Data-Services, Data-Engineering, Data-Engineering-Radar, DBA: Re-run maintainviews on all clouddb* and an-redacteddb1001.eqiad.wmnet - https://phabricator.wikimedia.org/T422459#11793398 (Marostegui)
[11:55:31] (approved) dcaro: build: Upgrade Poetry dependencies [repos/cloud/toolforge/toolforge-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-cli/-/merge_requests/63 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[11:55:39] (merge) dcaro: build: Upgrade Poetry dependencies [repos/cloud/toolforge/toolforge-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-cli/-/merge_requests/63 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[12:02:48] cloud-services-team, Data-Services, Data-Engineering, Data-Engineering-Radar, DBA: Re-run maintainviews on all clouddb* and an-redacteddb1001.eqiad.wmnet - https://phabricator.wikimedia.org/T422459#11793411 (Marostegui)
[12:06:38] cloud-services-team, Data-Services: fiwiki_p.imagelinks view is broken on Toolforge replica (ERROR 1356), while enwiki_p works - https://phabricator.wikimedia.org/T422206#11793418 (Zabe) Open→Resolved fiwiki should work again, see T422459.
[12:10:47] (approved) dcaro: build: Upgrade Poetry dependencies [repos/cloud/toolforge/jobs-emailer] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-emailer/-/merge_requests/44 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[12:10:51] (merge) dcaro: build: Upgrade Poetry dependencies [repos/cloud/toolforge/jobs-emailer] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-emailer/-/merge_requests/44 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[12:13:07] (update) group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: jobs-emailer: bump to 0.0.80-20260407121103-a5d11051 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1195
[12:13:13] (update) group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: jobs-emailer: bump to 0.0.80-20260407121103-a5d11051 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1195
[12:16:16] (open) dcaro: inject_buildpacks: update fagiani for ubuntu 24 [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/86
[12:16:33] (update) dcaro: inject_buildpacks: update fagiani for ubuntu 24 [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/86
[12:25:45] cloud-services-team, Data-Services, Data-Persistence, DBA, and 3 others: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11793484 (Marostegui) In progress→Resolved I think this is fixed. The new patched version (10.11.16u3) has bee...
[12:28:48] (update) dcaro: inject_buildpacks: update fagiani for ubuntu 24 [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/86
[12:29:40] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-emailer
[12:32:00] cloud-services-team, Toolforge, tools-platform-team: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path - https://phabricator.wikimedia.org/T422224#11793547 (dcaro) a:dcaro
[12:42:42] (update) dcaro: inject_buildpacks: update fagiani for ubuntu 24 [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/86
[12:44:17] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-emailer
[12:48:18] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component jobs-emailer
[13:03:19] (approved) dcaro: build: Upgrade Poetry dependencies [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/281 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[13:03:23] (merge) dcaro: build: Upgrade Poetry dependencies [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/281 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[13:06:11] cloud-services-team, Toolforge, tools-platform-team: `toolforge jobs logs` misplaces my logs - https://phabricator.wikimedia.org/T421929#11793700 (dcaro) >>! In T421929#11787825, @Soda wrote: > Got it, will make a note to not use the `-f` from now on (until `-f` is fixed) This is already deployed, y...
[13:06:29] (update) group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: jobs-api: bump to 0.0.481-20260407130337-156f92d1 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1207
[13:06:35] (open) group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: jobs-api: bump to 0.0.481-20260407130337-156f92d1 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1207
[13:06:39] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-emailer
[13:07:14] cloud-services-team, Toolforge: [builds-builder,apt] migrate from apt buildpack to Heroku's .deb packages buildpack - https://phabricator.wikimedia.org/T387141#11793708 (dcaro) >>! In T387141#11791590, @bd808 wrote: > {T422384} is reporting the bug I filed as {T394466} which was merged into this task but...
[13:07:28] (approved) dcaro: jobs-emailer: bump to 0.0.80-20260407121103-a5d11051 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1195 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[13:07:31] (merge) dcaro: jobs-emailer: bump to 0.0.80-20260407121103-a5d11051 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1195 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[13:07:43] (update) dcaro: jobs-api: bump to 0.0.481-20260407130337-156f92d1 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1207 (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[13:07:43] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[13:08:56] FIRING: CloudVPSDesignateLeaks: Detected 12 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:09:22] (approved) fnegri: inject_buildpacks: update fagiani for ubuntu 24 [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/86 (owner: dcaro)
[13:11:00] cloud-services-team, Toolforge, Patch-For-Review: Toolforge Prometheus instance is unstable - https://phabricator.wikimedia.org/T422287#11793724 (taavi)
[13:14:19] (open) dcaro: Procfile: update path to new builder path [toolforge-repos/milhistbot-autoreport] - https://gitlab.wikimedia.org/toolforge-repos/milhistbot-autoreport/-/merge_requests/1
[13:14:59] cloud-services-team, Toolforge, tools-platform-team: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path - https://phabricator.wikimedia.org/T422224#11793740 (dcaro) @Hawkeye7 as @bd808 mentions, the new path for the procfile is `bin/publish/AutoRe...
[13:15:39] cloud-services-team, Toolforge, tools-platform-team: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path - https://phabricator.wikimedia.org/T422224#11793743 (dcaro) p:Triage→Medium
[13:17:16] cloud-services-team, Toolforge, tools-platform-team: [builds-builder] incompatibility of fagiani/apt and builder stack "24" - https://phabricator.wikimedia.org/T422384#11793748 (dcaro) Got a fix for this, waiting for reviews, should be merged soon-ish: https://gitlab.wikimedia.org/repos/cloud/toolfor...
[13:19:12] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api [13:19:13] (03merge) 10dcaro: inject_buildpacks: update fagiani for ubuntu 24 [repos/cloud/toolforge/builds-builder] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/86 [13:19:19] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component jobs-api [13:20:37] (03update) 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: builds-builder: bump to 0.0.145-20260407131926-db92b04c [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1208 [13:20:43] (03open) 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: builds-builder: bump to 0.0.145-20260407131926-db92b04c [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1208 [13:33:16] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api [13:45:08] !log filippo@cloudcumin1001 kafka-infrastructure START - Cookbook wmcs.openstack.cloudvirt.vm_console [13:45:11] !log filippo@cloudcumin1001 kafka-infrastructure END (FAIL) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=99) [13:45:17] !log filippo@cloudcumin1001 kafka-infrastructure START - Cookbook wmcs.openstack.cloudvirt.vm_console [13:45:19] !log filippo@cloudcumin1001 kafka-infrastructure END (FAIL) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=99) [13:47:34] (03approved) 10dcaro: jobs-api: bump to 0.0.481-20260407130337-156f92d1 [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1207 (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [13:47:43] (03merge) 10dcaro: jobs-api: bump to 0.0.481-20260407130337-156f92d1 
[repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1207 (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [13:47:51] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-builder [13:47:53] (03update) 10dcaro: builds-builder: bump to 0.0.145-20260407131926-db92b04c [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1208 (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [13:48:51] (03approved) 10dcaro: build: Upgrade Poetry dependencies [repos/cloud/toolforge/api-gateway] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/90 (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [13:48:54] (03merge) 10dcaro: build: Upgrade Poetry dependencies [repos/cloud/toolforge/api-gateway] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/90 (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [13:51:07] !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node [13:51:47] (03update) 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: api-gateway: bump to 0.0.91-20260407134903-c79d5988 [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1209 [13:51:49] (03open) 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: api-gateway: bump to 0.0.91-20260407134903-c79d5988 [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1209 [13:51:58] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder [13:59:53] !log andrew@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [14:32:00] !log 
andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node [14:43:24] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [14:46:17] (03update) 10vriaa: feat: add fixed CSS support to generated banner code [toolforge-repos/centralnotice-banner-editor] - 10https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/49 (https://phabricator.wikimedia.org/T420950) [14:46:30] 06cloud-services-team, 10Cloud-VPS: Cloud init and unattended upgrades while bootstrapping Trixie VMs - https://phabricator.wikimedia.org/T422509 (10elukey) 03NEW [14:46:38] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T361237) [14:46:42] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [14:58:54] 10Cloud-VPS (Quota-requests), 10Browser Test Platform, 10Continuous-Integration-Infrastructure, 10Continuous-Integration-Config, 07Jenkins: New flavor for the integration project with more vCPU and ephemeral disk space - https://phabricator.wikimedia.org/T421242#11794434 (10Andrew) note to self: the... 
[15:00:02] 06cloud-services-team, 10Cloud-VPS: Handle project IDs with dash in cloud cookbooks / openstack API - https://phabricator.wikimedia.org/T422515 (10fgiunchedi) 03NEW [15:00:22] 10Tool-wmf-openapi-linter, 06MW-Interfaces-Team (MWI-Sprint-31 (2026-04-07 to 2026-04-21)): Fix linter issues discovered during implementation of the OAD example - https://phabricator.wikimedia.org/T414974#11794460 (10HCoplin-WMF) [15:06:27] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=99) [15:16:28] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node [15:18:41] RESOLVED: CloudVPSDesignateLeaks: Detected 12 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [15:25:01] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component builds-builder [15:25:04] 06cloud-services-team, 10Cloud-VPS: Handle project IDs with dash in cloud cookbooks / openstack API - https://phabricator.wikimedia.org/T422515#11794599 (10fgiunchedi) For context / more info ` 15:03 who has two thumbs and keeps stubbing them into openstack project names with dashes? 15:09 ... 
[15:29:49] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-builder [15:30:45] PROBLEM - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - string OK not found on http://checker.tools.wmflabs.org:80/etcd/k8s - 488 bytes in 0.450 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [15:31:01] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [15:33:13] 06cloud-services-team, 10Data-Services, 06Data-Engineering, 06Data-Engineering-Radar, 06DBA: Re-run maintainviews on all clouddb* and an-redacteddb1001.eqiad.wmnet - https://phabricator.wikimedia.org/T422459#11794638 (10amastilovic) Can confirm that this resolved our Sqoop issue - thank you @Marostegui ! [15:33:39] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node [15:35:35] (03approved) 10dcaro: builds-builder: bump to 0.0.145-20260407131926-db92b04c [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1208 (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [15:35:40] (03merge) 10dcaro: builds-builder: bump to 0.0.145-20260407131926-db92b04c [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1208 (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [15:37:27] 06cloud-services-team, 10Toolforge, 06tools-platform-team: [builds-builder] incompatibility of fagiani/apt and builder stack "24" - https://phabricator.wikimedia.org/T422384#11794670 (10dcaro) 05Open→03Resolved >>! In T422384#11793745, @dcaro wrote: > Got a fix for this, waiting for reviews, should b... 
[15:38:26] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 26), 06tools-platform-team, 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11794674 (10dcaro) [15:38:58] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 26), 06tools-platform-team, 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11794683 (10dcaro) [15:48:06] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [15:48:33] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge (Toolforge iteration 26), 06tools-platform-team, 13Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#11794760 (10dcaro) [15:49:43] 06cloud-services-team, 10Toolforge, 06tools-platform-team: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path - https://phabricator.wikimedia.org/T422224#11794773 (10dcaro) 05Open→03Resolved I'll close this for now, but feel free to reopen if you... 
[15:51:02] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.etcd.add_node_to_cluster [15:52:07] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.k8s.etcd.add_node_to_cluster (exit_code=99) [15:57:34] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.etcd.add_node_to_cluster [16:00:45] RECOVERY - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 0.492 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [16:01:17] (03open) 10dcaro: toolforge_deploy: also unregister packages when restoring [repos/cloud/toolforge/lima-kilo] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/318 [16:02:33] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.etcd.add_node_to_cluster (exit_code=0) [16:02:48] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T361237) [16:02:52] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [16:03:22] (03approved) 10dcaro: build: Upgrade Poetry dependencies [repos/cloud/toolforge/envvars-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/106 (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [16:03:31] (03merge) 10dcaro: build: Upgrade Poetry dependencies [repos/cloud/toolforge/envvars-cli] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/106 (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [16:04:40] (03open) 10dcaro: toolforge_get_versions: fix reading package version [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1210 [16:05:05] (03update) 10dcaro: toolforge_get_versions: fix reading package version [repos/cloud/toolforge/toolforge-deploy] - 
10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1210 [16:06:33] 10Toolforge, 06tools-platform-team: logs-api: handle exception raised when query range exceeds max_query_length - https://phabricator.wikimedia.org/T422454#11794877 (10dcaro) [16:06:55] 10Toolforge, 06tools-platform-team: logs-api: handle exception raised when query range exceeds max_query_length - https://phabricator.wikimedia.org/T422454#11794878 (10fnegri) 05Open→03In progress p:05Triage→03Medium [16:06:59] 06cloud-services-team, 10Toolforge, 06tools-platform-team: Running dotnet job fails on Toolforge because "24" builder stack changed the compiled binary output path - https://phabricator.wikimedia.org/T422224#11794880 (10bd808) >>! In T422224#11791440, @bd808 wrote: >>>! In T422224#11787862, @Hawkeye7 wro... [16:07:51] 10Toolforge (Toolforge iteration 26), 06tools-platform-team: [global] First of the month automatic dependency upgrade - https://phabricator.wikimedia.org/T422191#11794886 (10fnegri) 05Open→03In progress [16:09:32] FIRING: PuppetAgentFailure: Puppet agent failure detected on instance tools-k8s-etcd-29 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [16:13:32] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tools-k8s-etcd-30 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [16:13:42] (03update) 10dcaro: toolforge_deploy: also unregister packages when restoring [repos/cloud/toolforge/lima-kilo] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/318 [16:20:33] (03update) 10dcaro: toolforge_get_versions: fix reading package version [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1210 [16:21:57] 06cloud-services-team, 10Data-Services: [wikireplicas] Upgrade clouddbs to 10.11.16 - 
https://phabricator.wikimedia.org/T422527 (10fnegri) 03NEW [16:22:32] 06cloud-services-team, 10Data-Services, 06tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11794997 (10fnegri) p:05Triage→03Low [16:24:32] RESOLVED: PuppetAgentFailure: Puppet agent failure detected on instance tools-k8s-etcd-29 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure [16:24:39] 06cloud-services-team, 10Data-Services, 06Data-Persistence, 06DBA, and 3 others: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11795036 (10fnegri) @marostegui thank you very much for chasing and fixing this issue! I see that 4 out of 12 clouddb h... [16:28:23] 06cloud-services-team, 10Data-Services, 06tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11795061 (10fnegri) [16:28:32] RESOLVED: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tools-k8s-etcd-30 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [16:28:57] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=0) [16:30:45] PROBLEM - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - string OK not found on http://checker.tools.wmflabs.org:80/etcd/k8s - 177 bytes in 0.005 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [16:31:14] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [16:31:18] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [16:41:48] 06cloud-services-team, 10Cloud-VPS: Handle project IDs with dash in cloud cookbooks / openstack API - https://phabricator.wikimedia.org/T422515#11795213 
(10fnegri) Maybe we can change this task to "vm_console cookbook should require project_name instead of project_id"? Something like https://gerrit.wikimedia.o... [16:42:59] (03approved) 10fnegri: toolforge_deploy: also unregister packages when restoring [repos/cloud/toolforge/lima-kilo] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/318 (owner: 10dcaro) [16:43:38] (03approved) 10fnegri: toolforge_get_versions: fix reading package version [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1210 (owner: 10dcaro) [16:48:23] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [16:50:45] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [16:50:48] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [16:52:15] (03merge) 10dcaro: toolforge_get_versions: fix reading package version [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1210 [16:52:21] (03merge) 10dcaro: toolforge_deploy: also unregister packages when restoring [repos/cloud/toolforge/lima-kilo] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/318 [16:52:50] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [16:57:21] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [16:57:27] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [16:57:54] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [16:59:01] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [16:59:17] !log andrew@cloudcumin1001 
tools END (ERROR) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=97) [16:59:23] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [16:59:56] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [17:01:23] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [17:01:56] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [17:02:44] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [17:02:49] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [17:03:17] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [17:04:13] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [17:07:37] 10VPS-project-Phabricator, 06collaboration-services, 10Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11795507 (10Pppery) 05In progress→03Resolved [17:12:07] 10VPS-project-Phabricator, 06collaboration-services, 10Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11795547 (10A_smart_kitten) Thanks @pppery -- I am happy declaring this to be fixed for now :) Since the fix was deploy... 
[17:18:43] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [17:20:31] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [17:20:35] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [17:20:45] RECOVERY - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 0.505 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [17:27:08] 06cloud-services-team, 10Toolforge: Connection with `k8s.tools.eqiad1.wikimedia.cloud` hits SSL error - https://phabricator.wikimedia.org/T422538#11795644 (10Nokib_Sarkar) [17:31:57] 10VPS-project-Phabricator, 06collaboration-services, 10Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11795679 (10Dzahn) Thank you very much @A_smart_kitten I edited that slightly to phrase it in the past tense. Otherwi... 
[17:34:46] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [17:41:48] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [17:41:54] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [17:53:29] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [17:53:41] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [17:53:46] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [17:54:26] 06cloud-services-team, 10Toolforge: Connection with `k8s.tools.eqiad1.wikimedia.cloud` hits SSL error - https://phabricator.wikimedia.org/T422538#11795774 (10Nokib_Sarkar) Also got ` tools.campwiz-backend-beta@tools-bastion-15:~$ toolforge components config create toolforge.yaml Warning: You are using a beta... 
[17:55:45] PROBLEM - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - string OK not found on http://checker.tools.wmflabs.org:80/etcd/k8s - 488 bytes in 0.019 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [17:58:55] RESOLVED: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity [18:06:29] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [18:07:48] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T361237) [18:07:52] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [18:25:55] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity [18:30:41] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=0) [18:30:45] RECOVERY - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 0.515 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [18:31:06] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [18:31:10] T361237: [infra] Upgrade Toolforge K8s etcd nodes to 
Bookworm - https://phabricator.wikimedia.org/T361237 [18:38:41] FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [18:43:38] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [18:45:45] PROBLEM - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - string OK not found on http://checker.tools.wmflabs.org:80/etcd/k8s - 488 bytes in 0.016 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [18:45:55] RESOLVED: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity [18:47:30] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [18:47:34] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [18:52:10] 10Tool-globalcontribution, 06Community-Tech, 10CopyPatrol, 10XTools: Investigate why tools do not stay logged in for the duration of the session cookie - https://phabricator.wikimedia.org/T224382#11796011 (10MusikAnimal) [18:58:50] 10Tool-globalcontribution, 06Community-Tech, 10CopyPatrol, 10XTools, 07patch-welcome: Investigate why tools do not stay logged in for the duration of the session cookie - https://phabricator.wikimedia.org/T224382#11796026 (10MusikAnimal) [18:59:36] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook 
wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [19:00:22] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [19:00:26] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [19:03:08] 10Tool-wmf-openapi-linter, 06MW-Interfaces-Team (MWI-Sprint-31 (2026-04-07 to 2026-04-21)): Fix linter issues discovered during implementation of the OAD example - https://phabricator.wikimedia.org/T414974#11796044 (10Atieno) a:03Atieno [19:05:31] 06cloud-services-team, 10decommission-hardware, 13Patch-For-Review: decommission cloudcephmon2004-dev - https://phabricator.wikimedia.org/T422437#11796046 (10Andrew) decom script is failing: ` DRY-RUN: Issuing read for key /spicerack/locks/cookbooks/sre.dns.netbox with args {'timeout': 60}... [19:09:42] !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99) [19:09:53] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T361237) [19:09:57] T361237: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237 [19:20:05] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0) [19:41:55] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity [19:50:45] RECOVERY - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 0.327 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker [20:32:48] FIRING: PuppetZeroResources: Puppet has failed generate resources 
on cloudcephmon2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources [20:40:41] 06cloud-services-team, 10Toolforge (Toolforge iteration 26), 07Kubernetes: [infra] Upgrade Toolforge K8s etcd nodes to Bookworm - https://phabricator.wikimedia.org/T361237#11796509 (10Andrew) 05Open→03Resolved [20:48:39] (03merge) 10hawkeye7: Procfile: update path to new builder path [toolforge-repos/milhistbot-autoreport] - 10https://gitlab.wikimedia.org/toolforge-repos/milhistbot-autoreport/-/merge_requests/1 (owner: 10dcaro) [20:50:13] (03update) 10danyya: Draft: Migrate to SQLite [toolforge-repos/humaniki] - 10https://gitlab.wikimedia.org/toolforge-repos/humaniki/-/merge_requests/3 [21:07:26] 10VPS-project-Phabricator: The test Phabricator instance doesn't seem to be successfully sending emails to @wikimedia.org addresses - https://phabricator.wikimedia.org/T422559 (10A_smart_kitten) 03NEW [21:07:47] 10VPS-project-Phabricator, 06collaboration-services: The test Phabricator instance doesn't seem to be successfully sending emails to @wikimedia.org addresses - https://phabricator.wikimedia.org/T422559#11796634 (10A_smart_kitten) I am tagging #collaboration-services for your information only, feel free to tria... [21:08:34] 10VPS-project-Phabricator, 06collaboration-services, 10Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11796637 (10A_smart_kitten) >>! In T388022#11795547, @A_smart_kitten wrote: > I am unsure why @dzahn may not have recei... [21:21:37] 10VPS-project-Phabricator, 06collaboration-services: The test Phabricator instance doesn't seem to be successfully sending emails to @wikimedia.org addresses - https://phabricator.wikimedia.org/T422559#11796709 (10taavi) Seems like `mx-in*.wikimedia.org` do not like these emails for whatever reason: ` 2026-04-... 
[21:44:20] 06cloud-services-team, 10Toolforge, 13Patch-For-Review: [builds-api] expose supported versions - https://phabricator.wikimedia.org/T422046#11796811 (10DamianZaremba) >>! In T422046#11778804, @dcaro wrote: > Pack will not be able to reproduce the build currently, you'll need also the runner and the whole envi... [21:44:59] 10Tool-refill: figure out how to deploy the front end - https://phabricator.wikimedia.org/T422370#11796813 (10Novem_Linguae) [21:46:43] 10Tool-refill: figure out how to deploy the front end - https://phabricator.wikimedia.org/T422370#11796818 (10Novem_Linguae) [22:08:40] !log tools.cluebotng-review Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24106708812 (https://github.com/cluebotng/component-configs/commits/b85f56b6997ccf41cc8ea32f33a61809b68b9bc5) [22:08:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-review/SAL [22:29:22] (03PS1) 10Cwhite: add beta-logs pki key [labs/private] - 10https://gerrit.wikimedia.org/r/1268683 (https://phabricator.wikimedia.org/T350516) [22:38:56] FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [22:42:43] 10VPS-project-Phabricator, 06collaboration-services: The test Phabricator instance doesn't seem to be successfully sending emails to @wikimedia.org addresses - https://phabricator.wikimedia.org/T422559#11796987 (10jeremyb) >>! In T422559#11796709, @taavi wrote: > Seems like `mx-in*.wikimedia.org` do not like t... 
[23:18:41] RESOLVED: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:23:24] Toolforge, tools-platform-team: logs-api fails with cryptic error if query range is too far in the past e.g. --since 1000d - https://phabricator.wikimedia.org/T422453#11797066 (Raymond_Ndibe) Most promising paths: 1. Adding `max_query_length: 336h` to loki configuration. This will always raise the `query...
[23:29:36] (open) raymond-ndibe: common.yaml: set max_query_length [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1211 (https://phabricator.wikimedia.org/T422453)
[23:29:42] (update) raymond-ndibe: common.yaml: set max_query_length [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1211 (https://phabricator.wikimedia.org/T422453)
[23:30:26] (update) raymond-ndibe: loki.py: handle query time range error in do_follow [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/17 (https://phabricator.wikimedia.org/T422454)
[23:30:34] (update) raymond-ndibe: loki.py: handle query time range error in do_follow [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/17 (https://phabricator.wikimedia.org/T422454)
[23:30:34] (approved) raymond-ndibe: loki.py: handle query time range error in do_follow [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/17 (https://phabricator.wikimedia.org/T422454)
[23:30:43] (merge) raymond-ndibe: loki.py: handle query time range error in do_follow [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/17 (https://phabricator.wikimedia.org/T422454)
[23:33:52] (update) group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: logs-api: bump to 0.0.18-20260407233052-4b8ebdca [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1212 (https://phabricator.wikimedia.org/T422454)
[23:33:56] (open) group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: logs-api: bump to 0.0.18-20260407233052-4b8ebdca [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1212 (https://phabricator.wikimedia.org/T422454)
[23:36:59] (update) raymond-ndibe: istio-gateway: allow customizing the resources [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1189 (owner: dcaro)
[23:41:29] !log raymond-ndibe@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component logs-api
[23:50:11] !log raymond-ndibe@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component logs-api
[23:57:33] (approved) raymond-ndibe: istio-gateway: allow customizing the resources [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1189 (owner: dcaro)
[23:58:28] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component logs-api