[00:08:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:21:42] !log tools.cluebotng-review Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23775778726 (https://github.com/cluebotng/component-configs/commits/47f8e20c39e29b952f6dbbd04917970802ce1a0b)
[01:21:45] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-review/SAL
[02:19:37] FIRING: HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s3 backend clouddb1022.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[03:00:59] FIRING: MaintainDBUsersStuck: Maintain-dbusers is stuck - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersStuck - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersStuck
[03:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[03:35:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[04:08:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:19:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s3 backend clouddb1022.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[05:20:44] RESOLVED: MaintainDBUsersStuck: Maintain-dbusers is stuck - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersStuck - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersStuck
[05:21:56] cloud-services-team, Data-Services, Data-Persistence, DBA: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11770390 (Marostegui) I think the culprit query has been found (and it is pretty crazy): {F74611061} ` root@clouddb1022:/srv/sqldata.s3#...
[05:27:16] (open) r4356th: Support Croatian file options [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/13
[05:33:45] (update) r4356th: Support Croatian file options [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/13
[05:34:55] (update) r4356th: Support Croatian file options [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/13
[05:35:17] (merge) r4356th: Support Croatian file options [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/13
[06:07:04] !log tools.cluebotng-review Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23782957628 (https://github.com/cluebotng/component-configs/commits/7888bbd75773dc064d78ad2ee8949f1540eab0fd)
[06:07:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-review/SAL
[06:59:22] FIRING: HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s1 backend clouddb1013.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[07:04:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s1 backend clouddb1013.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[07:21:50] FIRING: NeutronAgentDown: Neutron neutron-l3-agent on cloudnet1006 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[07:26:50] RESOLVED: NeutronAgentDown: Neutron neutron-l3-agent on cloudnet1006 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[07:33:22] FIRING: HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s2 backend clouddb1014.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[07:38:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s2 backend clouddb1014.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[08:06:57] cloud-services-team, Data-Services, DBA: Downgrade clouddb* hosts to 10.11.13 - https://phabricator.wikimedia.org/T421826#11770803 (Marostegui)
[08:08:11] cloud-services-team (FY2025/2026-Q3-Q4), tools-infrastructure-team: wmcs.openstack.restart_openstack attempts to restart services on decom cloudcontrol1005 - https://phabricator.wikimedia.org/T421832 (fgiunchedi) NEW
[08:08:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[08:23:08] cloud-services-team, Data-Services, DBA: Downgrade clouddb* hosts to 10.11.13 - https://phabricator.wikimedia.org/T421826#11770892 (Marostegui)
[08:24:11] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11770896 (Kwametech) Hello @Nokib_Sarkar @Tiven2240 My name is Edmond, a student of Kwame Nkrumah University of Science and Technology and a member of WIKITECH KNUST. I'm intereste...
[08:27:22] FIRING: HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s4 backend clouddb1019.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[08:28:49] cloud-services-team, Data-Services, DBA: Downgrade clouddb* hosts to 10.11.13 - https://phabricator.wikimedia.org/T421826#11770904 (Marostegui)
[08:32:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s4 backend clouddb1019.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[08:33:14] cloud-services-team, Cloud-VPS: rabbitmq: missing heartbeats issue - https://phabricator.wikimedia.org/T347017#11770922 (fgiunchedi) Open→Invalid No longer observed
[08:35:16] cloud-services-team, Cloud-VPS: rabbitmq: missing heartbeats issue - https://phabricator.wikimedia.org/T347017#11770932 (dcaro) \o/
[08:41:18] (CR) Nikerabbit: [C:+2] Localisation updates from https://translatewiki.net. [labs/tools/intuition] - https://gerrit.wikimedia.org/r/1264592 (owner: L10n-bot)
[08:41:44] (CR) Nikerabbit: [V:+2] Localisation updates from https://translatewiki.net. [labs/tools/wikiinfo] - https://gerrit.wikimedia.org/r/1264616 (owner: L10n-bot)
[08:42:31] !log filippo@cloudcumin1001 trove START - Cookbook wmcs.openstack.cloudvirt.vm_console
[08:43:18] cloud-services-team, Data-Services, DBA: Downgrade clouddb* hosts to 10.11.13 - https://phabricator.wikimedia.org/T421826#11770969 (Marostegui)
[08:43:52] FIRING: [2x] HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s4 backend clouddb1019.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[08:45:06] !log filippo@cloudcumin1001 trove END (PASS) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=0)
[08:48:52] FIRING: [2x] HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s3 backend clouddb1023.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[08:52:13] cloud-services-team, Data-Services, DBA: Downgrade clouddb* hosts to 10.11.13 - https://phabricator.wikimedia.org/T421826#11771158 (Marostegui)
[08:53:52] RESOLVED: [2x] HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s3 backend clouddb1023.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[08:58:27] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS, Patch-For-Review: Move all openstack rabbitmq queues to quorum - https://phabricator.wikimedia.org/T421054#11771226 (dcaro) > Note that despite the graph we have ~300G free on the VG, plus the vg0/srv LV is not really used and we can reclaim its spa...
[09:05:07] FIRING: [2x] HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s3 backend clouddb1023.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[09:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[09:05:19] cloud-services-team, Data-Services, DBA: Downgrade clouddb* hosts to 10.11.13 - https://phabricator.wikimedia.org/T421826#11771266 (Marostegui)
[09:06:52] RESOLVED: [2x] HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s3 backend clouddb1023.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[09:11:19] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS, tools-infrastructure-team: wmcs.openstack.restart_openstack attempts to restart services on decom cloudcontrol1005 - https://phabricator.wikimedia.org/T421832#11771335 (taavi)
[09:12:57] FIRING: HarborComponentDown: No data about Harbor components found. #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborComponentDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborComponentDown
[09:14:14] cloud-services-team, Data-Services, DBA: Downgrade clouddb* hosts to 10.11.13 - https://phabricator.wikimedia.org/T421826#11771381 (Marostegui)
[09:15:07] cloud-services-team, Data-Services, DBA: Downgrade clouddb* hosts to 10.11.13 - https://phabricator.wikimedia.org/T421826#11771402 (Marostegui) Open→Resolved I've compiled 10.11.16 with the patch provided by MariaDB into wmf-mariadb1011_10.11.16+deb12u2_amd64.deb and I've installed it on...
[09:27:50] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11771479 (Ykchoudhary110) Hi, I have submitted a GSoC 2026 proposal for the CampWiz NxT Redesign project. I am currently exploring the codebase and understanding the architecture....
[09:35:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[09:38:59] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Move trove DB instances to rabbitmq transient quorum queues - https://phabricator.wikimedia.org/T421857 (fgiunchedi) NEW
[09:55:27] RESOLVED: HarborComponentDown: No data about Harbor components found. #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborComponentDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborComponentDown
[09:56:57] FIRING: HarborDown: Harbor is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborDown
[10:56:57] RESOLVED: HarborDown: Harbor is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborDown
[11:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[11:35:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[11:37:01] cloud-services-team, Toolforge: [infra,o11y] Alert on Prometheus instability / unexpected restarts - https://phabricator.wikimedia.org/T421416#11772066 (dcaro) Last restart logs: ` Mar 31 09:06:33 tools-prometheus-9 prometheus@tools[1745172]: ts=2026-03-31T09:06:33.210Z caller=db.go:1619 level=info compo...
[12:08:56] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[12:12:33] !log raymond-ndibe@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component logs-api
[12:20:22] FIRING: HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s3 backend clouddb1023.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[12:21:11] !log raymond-ndibe@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component logs-api
[12:21:30] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component logs-api
[12:31:14] !log raymond-ndibe@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component logs-api
[12:33:43] (update) raymond-ndibe: logs-api: bump to 0.0.16-20260330220252-cd8acfa2 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1187 (https://phabricator.wikimedia.org/T400917) (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[12:33:48] (approved) raymond-ndibe: logs-api: bump to 0.0.16-20260330220252-cd8acfa2 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1187 (https://phabricator.wikimedia.org/T400917) (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[12:34:03] (merge) raymond-ndibe: logs-api: bump to 0.0.16-20260330220252-cd8acfa2 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1187 (https://phabricator.wikimedia.org/T400917) (owner: group_203_bot_f4d95069bb2675e4ce1fff090c1c1620)
[12:34:23] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11772337 (Aryash_Jain) Hi, I'm Aryash Jain, a GSoC 2026 contributor applicant. I've submitted a proposal for this project and will be completing microtask T415408 shortly. Looking f...
[12:44:58] (update) raymond-ndibe: values.yaml: replace job image variants with webservice image variants [repos/cloud/toolforge/image-config] - https://gitlab.wikimedia.org/repos/cloud/toolforge/image-config/-/merge_requests/18 (https://phabricator.wikimedia.org/T415322)
[12:47:50] Tools, Privacy: Tool "wd-vectordb" loads assets from Google gstatic.com - https://phabricator.wikimedia.org/T421889 (valerio.bozzolan) NEW
[12:51:28] Tools, Privacy: Tool "wd-vectordb" loads assets from Google gstatic.com - https://phabricator.wikimedia.org/T421889#11772505 (valerio.bozzolan)
[13:00:25] (open) fnegri: Update lima version check [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/316
[13:00:53] (update) fnegri: Update lima version check [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/316
[13:01:31] (update) fnegri: Update lima version check [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/316
[13:02:22] (approved) dcaro: Update lima version check [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/316 (owner: fnegri)
[13:10:22] (update) raymond-ndibe: values.yaml: replace job image variants with webservice image variants [repos/cloud/toolforge/image-config] - https://gitlab.wikimedia.org/repos/cloud/toolforge/image-config/-/merge_requests/18 (https://phabricator.wikimedia.org/T415322)
[13:10:26] Tools, Wikidata, Privacy: Tool "wd-vectordb" loads assets from Google gstatic.com - https://phabricator.wikimedia.org/T421889#11772603 (LucasWerkmeister)
[13:11:58] (merge) fnegri: Update lima version check [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/316
[13:13:29] (update) fnegri: Update lima version check [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/316
[13:18:00] PROBLEM - mysqld processes on clouddb1023 is CRITICAL: PROCS CRITICAL: 1 process with command name mysqld https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting
[13:18:57] Toolforge-standards-committee, Tools, Wikidata, Privacy: Tool "wd-vectordb" loads assets from Google gstatic.com - https://phabricator.wikimedia.org/T421889#11772653 (taavi)
[13:39:48] FIRING: PuppetFailure: Puppet has failed on clouddb1023:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[13:41:17] cloud-services-team, Horizon: Openstack services should use standard HTTPS port - https://phabricator.wikimedia.org/T377055#11772838 (Andrew)
[13:41:52] (open) vriaa: refactor: remove unused style ids from templates [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/44
[13:42:26] cloud-services-team, Cloud-VPS: Openstack services should use standard HTTPS port - https://phabricator.wikimedia.org/T377055#11772850 (taavi)
[13:42:37] cloud-services-team, Cloud-VPS: Openstack services should use standard HTTPS port - https://phabricator.wikimedia.org/T377055#11772851 (taavi)
[13:44:42] (merge) vriaa: refactor: remove unused style ids from templates [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/44
[13:50:00] RECOVERY - mysqld processes on clouddb1023 is OK: PROCS OK: 2 processes with command name mysqld https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting
[13:52:24] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11772922 (Only-Vikas) Hi @Nokib_Sarkar and @Tiven2240, Just a quick update: I have officially submitted my final proposal for the CampWiz NxT Redesign to the GSoC portal! The final...
[13:55:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service wikireplica-db-analytics-s3 backend clouddb1023.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[14:03:42] (open) vriaa: refactor: remove fixed CSS from template starter [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/45
[14:03:57] (merge) vriaa: refactor: remove fixed CSS from template starter [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/45
[14:19:48] RESOLVED: PuppetFailure: Puppet has failed on clouddb1023:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[14:33:47] (merge) dcaro: start: add --use-deprecated-versions flag [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/129
[14:35:08] (close) dcaro: DO NOT MERGE: build: use pipefail to not shadow command errors [repos/cloud/toolforge/buildpacks/apt-buildpack] - https://gitlab.wikimedia.org/repos/cloud/toolforge/buildpacks/apt-buildpack/-/merge_requests/1 (https://phabricator.wikimedia.org/T348746)
[14:36:30] (close) dcaro: Draft: DONOTMERGE: always auth as tf-test [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/72
[14:36:37] (close) dcaro: logs: handle case where date can't be parsed [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/81 (https://phabricator.wikimedia.org/T362521)
[14:36:50] (close) dcaro: Draft: api: add idp authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/69
[14:37:05] Toolforge-standards-committee, Tools, Wikidata, Privacy: Tool "wd-vectordb" loads assets from Google gstatic.com - https://phabricator.wikimedia.org/T421889#11773227 (valerio.bozzolan) "Sub-task" - need of an issue tracker for this tool: https://www.wikidata.org/wiki/Wikidata_talk:Vector_Databas...
[14:37:14] (close) dcaro: bump_version: copy from jobs-api [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/67
[14:37:23] (close) dcaro: Draft: investigate authentication [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/22 (https://phabricator.wikimedia.org/T363983)
[14:38:52] (close) dcaro: openapi: Allow lowercase ASCII letters too [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/60 (https://phabricator.wikimedia.org/T374780)
[14:38:58] (update) dcaro: builds-api: update buildpacks to 24_0.21.5 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1175
[14:39:44] (close) dcaro: inject_buildpacks: skip injecting aptfile if builtin exists [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/83
[14:39:46] cloud-services-team, Cloud-VPS: openstack: keystone may be failing to add users to the bastion project in Keystone and/or LDAP - https://phabricator.wikimedia.org/T379550#11773246 (Andrew) >>! In T379550#11298228, @Andrew wrote: > I just encountered a variation on this: new user dpogorzelski was in the b...
[14:40:57] cloud-services-team, Cloud-VPS: Keystone logs no longer appearing in logstash - https://phabricator.wikimedia.org/T421911 (Andrew) NEW
[14:48:55] (Abandoned) Mainframe98: Use git archive to exclude dot files [labs/tools/extdist] - https://gerrit.wikimedia.org/r/598697 (https://phabricator.wikimedia.org/T68985) (owner: Mainframe98)
[14:51:27] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Move all openstack rabbitmq queues to quorum - https://phabricator.wikimedia.org/T421054#11773322 (fgiunchedi) >>! In T421054#11771226, @dcaro wrote: >> Note that despite the graph we have ~300G free on the VG, plus the vg0/srv LV is not really used and...
[15:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[15:07:44] Tool-campwiz-nxt, Front-end-Standards-Group, Google-Summer-of-Code (2026), JavaScript: GSoC 2026 Proposal CampWiz NxT Frontend Migration from Nextjs to Reactjs - https://phabricator.wikimedia.org/T421918 (Yogesh_U_G) NEW
[15:10:16] (close) fnegri: Store and return valid ISO 8601 timestamps [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/157 (https://phabricator.wikimedia.org/T421369)
[15:11:13] Tool-campwiz-nxt, Front-end-Standards-Group, Google-Summer-of-Code (2026), JavaScript: GSoC 2026 Proposal CampWiz NxT Frontend Migration from Nextjs to Reactjs - https://phabricator.wikimedia.org/T421918#11773480 (Yogesh_U_G) Hi, I have submitted my GSoC 2026 proposal for CampWiz NxT frontend mig...
[15:13:28] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-cli
[15:13:31] !log dcaro@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component builds-cli
[15:15:13] Tool-campwiz-nxt, Front-end-Standards-Group, Google-Summer-of-Code (2026), JavaScript: CampWiz NxT Frontend Migration from Next.js to React.js - https://phabricator.wikimedia.org/T421918#11773511 (Yogesh_U_G)
[15:15:32] Tool-campwiz-nxt, Front-end-Standards-Group, Google-Summer-of-Code (2026), JavaScript: CampWiz NxT Frontend Migration from Next.js to React.js - https://phabricator.wikimedia.org/T421918#11773516 (Yogesh_U_G) Additionally, I would be happy to start working on small frontend issues or microtasks t...
[15:22:52] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11773574 (Aklapper)
[15:24:46] (update) raymond-ndibe: values.yaml: replace job image variants with webservice image variants [repos/cloud/toolforge/image-config] - https://gitlab.wikimedia.org/repos/cloud/toolforge/image-config/-/merge_requests/18 (https://phabricator.wikimedia.org/T415322)
[15:25:55] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11773642 (Aklapper)
[15:32:52] cloud-services-team, Cloud-VPS: rabbitmq: missing heartbeats issue - https://phabricator.wikimedia.org/T347017#11773697 (fgiunchedi) Invalid→Open Unfortunately I was too hasty: we do still see this, though it is unclear to me whether what the impact (if any) is ` root@cloudrabbit1002:/var/log/ra...
[15:35:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[15:36:32] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS, tools-infrastructure-team: wmcs.openstack.restart_openstack attempts to restart services on decom cloudcontrol1005 - https://phabricator.wikimedia.org/T421832#11773722 (Andrew) Open→Resolved a:Andrew I have not used it before, but 'd...
[15:42:44] cloud-services-team, Toolforge: `toolforge jobs logs` misplaces my logs - https://phabricator.wikimedia.org/T421929 (Soda) NEW
[16:08:56] FIRING: CloudVPSDesignateLeaks: Detected 3 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:22:36] cloud-services-team, Toolforge: `toolforge jobs logs` misplaces my logs - https://phabricator.wikimedia.org/T421929#11774111 (Soda) If it helps, `crawljob` is a celery container that health-checks URLs, I can invoke it from the web UI of the tool and it appears that the job works fine.
[16:24:24] (update) fnegri: Add summary with counts [repos/cloud/wikireplicas-utils] - https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/11
[16:25:31] cloud-services-team, Toolforge: `toolforge jobs logs` misplaces my logs - https://phabricator.wikimedia.org/T421929#11774133 (bd808) `lang=shell-session tools.link-dispenser@tools-bastion-14:~$ toolforge jobs show crawljob +---------------+-----------------------------------------------------------------...
[16:44:47] cloud-services-team, Toolforge, Privacy Engineering: Add Content-Security-Policy header enforcing 3rd party web interaction restrictions to proxy responses - https://phabricator.wikimedia.org/T130748#11774330 (valerio.bozzolan) Is it possible to evaluate a gradual introduction of this `Content-Securi...
[17:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[17:10:29] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11774435 (Riddhi)
[17:35:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[17:37:33] (approved) andrew: Add summary with counts [repos/cloud/wikireplicas-utils] - https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/11 (owner: fnegri)
[17:49:53] (open) dcaro: d/changelog: bump to 0.0.25 [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/130 (https://phabricator.wikimedia.org/T380127 https://phabricator.wikimedia.org/T408783)
[17:51:18] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-cli
[17:53:20] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-cli
[17:55:37] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11774695 (LikhithSappa) Hi @Nokib_Sarkar, I have submitted my interview request via email. Looking forward to your feedback. Thank you!
[17:56:39] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component builds-cli
[17:58:42] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-cli
[18:01:29] (approved) dcaro: d/changelog: bump to 0.0.25 [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/130 (https://phabricator.wikimedia.org/T380127 https://phabricator.wikimedia.org/T408783)
[18:01:39] (merge) dcaro: d/changelog: bump to 0.0.25 [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/130 (https://phabricator.wikimedia.org/T380127 https://phabricator.wikimedia.org/T408783)
[18:02:01] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[18:02:02] !log dcaro@cloudcumin1001 tools END (ERROR) - Cookbook wmcs.toolforge.component.deploy (exit_code=97) for component builds-api
[18:02:11] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[18:06:13] !log dcaro@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component builds-api
[18:07:03] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[18:10:23] !log dcaro@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component builds-api
[18:19:34] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-api
[18:24:15] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-api
[18:26:41] cloud-services-team, DC-Ops, ops-codfw, SRE: Power Supply - Status - issue on cloudbackup2003:9290 - https://phabricator.wikimedia.org/T420948#11774886 (Jhancock.wm) Open→Resolved a:Jhancock.wm
[18:45:56] cloud-services-team, Toolforge: `toolforge jobs logs` misplaces my logs - https://phabricator.wikimedia.org/T421929#11774930 (taavi) >>! In T421929#11774133, @bd808 wrote: > There are logs going to stdout/stderr in the container and being seen by Kubernetes, but it looks like they are not being collected...
[18:50:12] (update) dcaro: builds-api: update buildpacks to 24_0.21.5 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1175
[19:04:29] (open) dcaro: build: added use-deprecated-version config [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/158
[19:20:17] FIRING: JobUnavailable: Reduced availability for job blackbox_http in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[19:25:17] RESOLVED: JobUnavailable: Reduced availability for job blackbox_http in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[20:08:57] FIRING: CloudVPSDesignateLeaks: Detected 3 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:40:48] FIRING: PuppetDisabled: Puppet disabled on cloudlb2003-dev:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=wmcs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[21:07:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[21:37:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[22:00:48] FIRING: [2x] PuppetDisabled: Puppet disabled on cloudlb2003-dev:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=wmcs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[22:10:48] FIRING: [3x] PuppetDisabled: Puppet disabled on cloudlb2002-dev:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=wmcs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[22:30:48] FIRING: [3x] PuppetDisabled: Puppet disabled on cloudlb2002-dev:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=wmcs&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled