[00:00:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:01:20] RESOLVED: [2x] PrometheusK8sCertExpirySoon: Prometheus k8s certificate is about to expire - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/PrometheusK8sCertExpirySoon - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusK8sCertExpirySoon
[00:02:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:07:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:16:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:21:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:05:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:15:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:41:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:46:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:22:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:32:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:34:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:39:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:40:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:55:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[05:23:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[05:48:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[06:09:46] Tool-wmf-openapi-linter, MW-Interfaces-Team (MWI-Sprint-34 (2026-05-19 to 2026-06-02)), OKR-Work: Fix issue line highlighting in the linter - https://phabricator.wikimedia.org/T425935#11947291 (KineticPelagic) a:KineticPelagic
[07:22:01] cloud-services-team, collaboration-services, Release-Engineering-Team, GitLab (CI & Job Runners), Patch-For-Review: gitlab workers ulimit nofiles 1073741816 slows down fakeroot - https://phabricator.wikimedia.org/T426827#11947353 (fgiunchedi)
[07:47:55] (update) filippo: Use deb.d.o mirror [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/86 (https://phabricator.wikimedia.org/T423596)
[07:48:07] (update) filippo: bandaid high RLIMIT_NOFILE in gitlab runners [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/85 (https://phabricator.wikimedia.org/T426827)
[08:37:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[08:42:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[09:03:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[09:08:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[09:21:07] cloud-services-team, Toolforge, Wikimania-Hackathon-2026: Export a dataset of licenses of Toolforge tools (Toolforge Licenses Catalogue) - https://phabricator.wikimedia.org/T427037 (valerio.bozzolan) NEW
[09:22:53] cloud-services-team, Toolforge, Wikimania-Hackathon-2026: Export a dataset of licenses of Toolforge tools (Toolforge Licenses Catalogue) - https://phabricator.wikimedia.org/T427037#11947520 (valerio.bozzolan)
[09:23:40] cloud-services-team, Toolforge, Software-Licensing: Expand the Toolforge definition of "free license" to include FSF-approved and DFSG-compatible licenses - https://phabricator.wikimedia.org/T152581#11947522 (valerio.bozzolan) >>! In T152581#11945754, @bd808 wrote: >> → So, in all cases, we need answ...
[09:31:35] cloud-services-team, Toolforge, Wikimania-Hackathon-2026: Export a dataset of licenses of Toolforge tools (Toolforge Licenses Catalogue) - https://phabricator.wikimedia.org/T427037#11947538 (Super_nabla) Also refer to https://www.wikidata.org/wiki/Wikidata:List_of_Wikimedia_tools_with_Wikidata_item L...
[09:46:26] cloud-services-team, Striker, CAS-SSO, Patch-For-Review: Use IDP for authentication in Striker - https://phabricator.wikimedia.org/T359554#11947564 (Arendpieter) @taavi I see that no one is interested in reviewing my second pull request, so I’m thinking of abandoning it.
[09:46:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[09:51:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[10:16:10] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11947737 (ops-monitoring-bot) Upgrading clouddb1017.eqiad.wmnet
[10:44:23] (open) ademola: links to create items in no-results, results, and works-list pages [toolforge-repos/paulina] - https://gitlab.wikimedia.org/toolforge-repos/paulina/-/merge_requests/187
[10:49:16] (update) ademola: links to create items in no-results, results, and works-list pages [toolforge-repos/paulina] - https://gitlab.wikimedia.org/toolforge-repos/paulina/-/merge_requests/187
[10:50:42] Cloud-VPS (Quota-requests), WMIT-Infrastructure: Quota increase request for project osmit - https://phabricator.wikimedia.org/T426790#11947830 (Danysan1) As for Overpass, for which this request was made, it's one of the most used tools to extract geographic data from OpenStreetMap and it helps many Wiki...
[10:55:24] (close) ademola: links to create items in no-results, results, and works-list pages [toolforge-repos/paulina] - https://gitlab.wikimedia.org/toolforge-repos/paulina/-/merge_requests/187
[11:01:39] (approved) fnegri: Use deb.d.o mirror [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/86 (https://phabricator.wikimedia.org/T423596) (owner: filippo)
[11:04:54] (open) ademola: links to create items in no-results, results, and works-list pages [toolforge-repos/paulina] - https://gitlab.wikimedia.org/toolforge-repos/paulina/-/merge_requests/188
[11:21:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[11:26:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:16:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:21:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:36:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:37:26] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shut down - https://phabricator.wikimedia.org/T427060 (fnegri) NEW
[12:41:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:41:44] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shut down - https://phabricator.wikimedia.org/T427060#11948164 (Marostegui) There's not much you can do at this stage if the daemon is not accepting connections anymore - your only option is to sigkill it. MariaDB...
[12:43:40] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shut down - https://phabricator.wikimedia.org/T427060#11948165 (fnegri) Yes it's not the first time this happens, I created a task to have a papertrail and see if we can find ways to prevent it. I can also try wai...
[12:56:27] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shut down - https://phabricator.wikimedia.org/T427060#11948233 (fnegri) @fgiunchedi attempted a SIGABRT that resulted in: ` May 22 12:53:07 clouddb1017 mysqld[1825533]: 260522 12:53:07 [ERROR] /opt/wmf-mariadb1011...
[12:57:53] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shut down - https://phabricator.wikimedia.org/T427060#11948236 (Marostegui) It is hard to know if and when it may end. If you are not in a rush, you can always leave it for a few more hours, but who knows. The shut...
[13:00:14] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shut down - https://phabricator.wikimedia.org/T427060#11948242 (fnegri) Rebooted the host and restarted mariadb, this is the startup log: ` May 22 12:58:50 clouddb1017 systemd[1]: Starting mariadb@s1.service - mar...
[13:02:01] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shutdown - https://phabricator.wikimedia.org/T427060#11948245 (fnegri)
[13:02:18] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shutdown - https://phabricator.wikimedia.org/T427060#11948248 (Marostegui) The recovery looks good to me. We should keep an eye in case this starts crashing, as T427060#11948232 doesn't look great.
[13:08:34] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shutdown - https://phabricator.wikimedia.org/T427060#11948268 (fnegri) p:Triage→Low There's a threaddump in `/root/shutdown-threaddump` if you want to have a look, it was taken before the SIGABRT. I'll lea...
[13:14:10] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948281 (fnegri) The cookbook for clouddb1017 took longer than expected because of {T427060}. It eventually completed, but failed on the last step because...
[13:16:08] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948285 (ops-monitoring-bot) Upgrading clouddb[1018,1020,1022-1025].eqiad.wmnet
[13:17:27] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shutdown - https://phabricator.wikimedia.org/T427060#11948287 (Marostegui) Unless this crashes again I don't think we should spend much more time on it.
[13:18:40] Data-Services, tools-platform-team, Data-Persistence: clouddb1017 getting stuck during shutdown - https://phabricator.wikimedia.org/T427060#11948288 (fnegri) Agreed, I was thinking of testing another shutdown in one or two weeks from now, and check if it gets stuck again. If it doesn't, I will mark i...
[13:25:35] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948309 (ops-monitoring-bot) Upgrading clouddb1018.eqiad.wmnet
[13:33:49] cloud-services-team, Data-Services, tools-platform-team, Data-Persistence, Patch-For-Review: Extend sre.mysql.upgrade to work with multiinstance hosts - https://phabricator.wikimedia.org/T420203#11948356 (fnegri) I did run the cookbook with test-cookbook on 3 more hosts and it worked fine (cl...
[13:46:38] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948404 (ops-monitoring-bot) Upgrading clouddb1018.eqiad.wmnet
[13:49:36] cloud-services-team, decommission-hardware: decommission cloudnet200[78]-dev.codfw.wmnet - https://phabricator.wikimedia.org/T427071 (Andrew) NEW
[13:49:52] cloud-services-team, decommission-hardware: decommission cloudnet200[78]-dev.codfw.wmnet - https://phabricator.wikimedia.org/T427071#11948423 (Andrew)
[13:50:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:52:31] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948425 (ops-monitoring-bot) Upgrade of clouddb1018.eqiad.wmnet completed
[13:55:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:59:32] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948431 (ops-monitoring-bot) Upgrading clouddb1020.eqiad.wmnet
[14:05:09] Cloud-VPS (Quota-requests), WMIT-Infrastructure: Quota increase request for project osmit - https://phabricator.wikimedia.org/T426790#11948440 (fnegri) +1
[14:06:20] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948441 (ops-monitoring-bot) Upgrade of clouddb1020.eqiad.wmnet completed
[14:06:26] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948442 (ops-monitoring-bot) Upgrading clouddb1022.eqiad.wmnet
[14:13:59] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948471 (ops-monitoring-bot) Upgrade of clouddb1022.eqiad.wmnet completed
[14:14:05] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948472 (ops-monitoring-bot) Upgrading clouddb1023.eqiad.wmnet
[14:15:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:20:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:21:22] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948487 (ops-monitoring-bot) Upgrade of clouddb1023.eqiad.wmnet completed
[14:21:25] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948488 (ops-monitoring-bot) Upgrading clouddb1024.eqiad.wmnet
[14:23:18] cloud-services-team, decommission-hardware, Patch-For-Review: decommission cloudnet200[78]-dev.codfw.wmnet - https://phabricator.wikimedia.org/T427071#11948491 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by andrew@cumin2002 for hosts: `cloudnet2007-dev.codfw.wmnet` - cloudnet2007-d...
[14:24:22] FIRING: [2x] HAProxyWikiReplicaSectionUnavailable: Wiki replica section x4 has no available servers on cloudlb1001:9900 - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyWikiReplicaSectionUnavailable
[14:27:21] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948501 (ops-monitoring-bot) Upgrade of clouddb1024.eqiad.wmnet completed
[14:27:26] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948502 (ops-monitoring-bot) Upgrading clouddb1025.eqiad.wmnet
[14:29:22] RESOLVED: [2x] HAProxyWikiReplicaSectionUnavailable: Wiki replica section x4 has no available servers on cloudlb1001:9900 - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyWikiReplicaSectionUnavailable
[14:33:48] !log andrew@cloudcumin1001 osmit START - Cookbook wmcs.openstack.quota_increase by 310 gigabytes
[14:33:51] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948521 (ops-monitoring-bot) Upgrade of clouddb1025.eqiad.wmnet completed
[14:33:55] !log andrew@cloudcumin1001 osmit END (PASS) - Cookbook wmcs.openstack.quota_increase (exit_code=0) by 310 gigabytes
[14:34:13] Cloud-VPS (Quota-requests), WMIT-Infrastructure: Quota increase request for project osmit - https://phabricator.wikimedia.org/T426790#11948523 (Andrew) Open→Resolved a:Andrew all set
[14:34:33] cloud-services-team, decommission-hardware: decommission cloudnet200[78]-dev.codfw.wmnet - https://phabricator.wikimedia.org/T427071#11948527 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by andrew@cumin2002 for hosts: `cloudnet2008-dev.codfw.wmnet` - cloudnet2008-dev.codfw.wmnet (**PASS...
[14:35:42] cloud-services-team, DC-Ops, decommission-hardware, ops-codfw: decommission cloudnet200[78]-dev.codfw.wmnet - https://phabricator.wikimedia.org/T427071#11948531 (Andrew) a:Andrew→None
[14:36:53] cloud-services-team (Hardware), Cloud-VPS: wmcs codfw hardware changes proposal - https://phabricator.wikimedia.org/T377568#11948536 (Andrew)
[14:36:56] cloud-services-team (Hardware), Cloud-VPS: wmcs codfw hardware changes proposal - https://phabricator.wikimedia.org/T377568#11948540 (Andrew) Open→Resolved
[14:38:13] Tool-wikimedia-attribution: Technical implementation: Add "User-Agent policy and rate limits" section - https://phabricator.wikimedia.org/T425976#11948546 (Sarai-WMF) Update: A [[ https://docs.google.com/document/d/1J8O8dw9O3WAelo9YDe4TLoRJ0zH82HmNbzE4vIr_Ivc/edit?tab=t.pb9b3ykfy5v9#heading=h.4a9m30bszxbf |...
[14:38:51] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948547 (fnegri)
[14:39:25] cloud-services-team, Openstack-Magnum: debian packaging for magnum-cluster-api - https://phabricator.wikimedia.org/T426431#11948561 (Andrew) gentle reminder that I'm still hoping for advice re "appropriate for inclusion on apt.wikimedia.org"
[14:40:12] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948563 (fnegri) All hosts have been upgraded and rebooted. `lang=shell-session fnegri@cumin1003:~$ sudo cumin 'clouddb*' 'dpkg -l wmf-mariadb1011' 11 hos...
[14:40:38] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11948565 (fnegri) In progress→Resolved
[14:40:47] cloud-services-team, Cloud-VPS: upgrade WMCS ceph nodes to Debian Trixie - https://phabricator.wikimedia.org/T413726#11948569 (Andrew) a:Andrew→None
[14:40:48] Tool-wikimedia-attribution: Explore analytics options for the Wikimedia Attribution Framework site - https://phabricator.wikimedia.org/T426738#11948570 (Sarai-WMF)
[14:42:18] cloud-services-team, Cloud-VPS, Patch-For-Review: experiment with moving rabbitmq behind haproxy - https://phabricator.wikimedia.org/T420937#11948590 (Andrew) @godog, does this interest you at all, or are you happy with the current behavior of haproxy? I'm happy with it for now but mostly because ha...
[14:43:51] Tool-wikimedia-attribution: Explore analytics options for the Wikimedia Attribution Framework site - https://phabricator.wikimedia.org/T426738#11948596 (Sarai-WMF) Status update: The Attribution framework site was added to Wikimedia's Matomo instance. Analytics data is starting to flow in and is reflected in...
[14:44:07] cloud-services-team, Cloud-VPS: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217#11948601 (Andrew) Open→Resolved This is done apart from Ceph hosts, which are tracked elsewhere.
[14:44:55] cloud-services-team, Data-Services, tools-platform-team, Data-Persistence, Patch-For-Review: Extend sre.mysql.upgrade to work with multiinstance hosts - https://phabricator.wikimedia.org/T420203#11948607 (fnegri) > there's something wrong with the looping logic I fixed this in a separate pa...
[14:45:22] cloud-services-team, Cloud-VPS: openstack magnum (or heat) resource leak - https://phabricator.wikimedia.org/T392031#11948612 (Andrew) Open→Invalid I'm no longer sure there's actually a leak here. In any case this will matter much less as we adopt the new cluster-api driver which is T393782
[15:07:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:09:22] Tool-wikimedia-attribution: [WE5.3.1c] Publish attribution guidelines for the Media and publications scenario - https://phabricator.wikimedia.org/T426160#11948670 (Sarai-WMF) Status update: A first draft of the [[ https://docs.google.com/document/d/1J8O8dw9O3WAelo9YDe4TLoRJ0zH82HmNbzE4vIr_Ivc/edit?tab=t.wzn2...
[15:12:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:21:02] cloud-services-team, Cloud-VPS (Quota-requests): Quota increase request for project trove - https://phabricator.wikimedia.org/T427076 (Andrew) NEW
[15:21:14] !log andrew@cloudcumin1001 trove START - Cookbook wmcs.openstack.quota_increase by 20 cores
[15:21:22] !log andrew@cloudcumin1001 trove END (PASS) - Cookbook wmcs.openstack.quota_increase (exit_code=0) by 20 cores
[15:21:36] cloud-services-team, Cloud-VPS (Quota-requests): Quota increase request for project trove - https://phabricator.wikimedia.org/T427076#11948725 (Andrew) Open→Resolved a:Andrew I'm self-approving this but creating a task for visibility.
[15:42:52] cloud-services-team, Toolforge, Wikimania-Hackathon-2026: Export a dataset of licenses of Toolforge tools (Toolforge Licenses Catalogue) - https://phabricator.wikimedia.org/T427037#11948808 (valerio.bozzolan) >>! In T427037#11947538, @Super_nabla wrote: > Also refer to https://www.wikidata.org/wiki/W...
[15:47:47] cloud-services-team, Toolforge, Wikimania-Hackathon-2026: Export a dataset of licenses of Toolforge tools (Toolforge Licenses Catalogue) - https://phabricator.wikimedia.org/T427037#11948839 (valerio.bozzolan) I mean, blindly reading this, a Toolforge tool is not automatically in scope (?) https://ww...
[15:52:49] wikitech.wikimedia.org: Wikitech static is down with a 502 HTTP error (Bad Gateway) - https://phabricator.wikimedia.org/T427081 (Koavf) NEW
[15:54:26] wikitech.wikimedia.org: Wikitech static is down with a 502 HTTP error (Bad Gateway) - https://phabricator.wikimedia.org/T427081#11948926 (Koavf) For what it's worth, https://www.wikimediastatus.net/ says "All Systems Operational". I don't know of any other tracker to see if a WMF site is down. If I'm missing...
[16:12:23] wikitech.wikimedia.org: Wikitech static is down with a 502 HTTP error (Bad Gateway) - https://phabricator.wikimedia.org/T427081#11948994 (Lucas_Werkmeister_WMDE) >>! In T427081#11948926, @Koavf wrote: > For what it's worth, https://www.wikimediastatus.net/ says "All Systems Operational". That part is not su...
[16:15:32] cloud-services-team, DC-Ops, decommission-hardware, ops-codfw, SRE: decommission cloudnet200[78]-dev.codfw.wmnet - https://phabricator.wikimedia.org/T427071#11948995 (Jhancock.wm) a:Jhancock.wm
[16:22:54] cloud-services-team, DC-Ops, decommission-hardware, ops-codfw, SRE: decommission cloudnet200[78]-dev.codfw.wmnet - https://phabricator.wikimedia.org/T427071#11949047 (Jhancock.wm) Open→Resolved
[16:40:36] FIRING: PuppetCertificateAboutToExpire: Puppet CA certificate Puppet CA: metricsinfra-puppetmaster-1.metricsinfra.eqiad1.wikimedia.cloud is about to expire in 22d 23h 58m 10s - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetCertificateAboutToExpire - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetCertificateAboutToExpire
[16:50:40] (open) vriaa: feat: add appearance settings with light/dark/auto mode [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/66 (https://phabricator.wikimedia.org/T421746)
[16:52:37] (open) vriaa: feat: add appearance settings with light/dark/auto mode [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/67 (https://phabricator.wikimedia.org/T421746)
[16:53:59] (update) vriaa: fix: replace hardcoded colors and -fixed tokens with mode-aware Codex tokens [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/67 (https://phabricator.wikimedia.org/T421746)
[16:55:17] (open) vriaa: feat: add one-value editing for side-based properties [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/68 (https://phabricator.wikimedia.org/T421063)
[17:24:54] Cloud-VPS, tools-infrastructure-team: Consider allowing cumin access to all Cloud VPS VMs - https://phabricator.wikimedia.org/T422801#11949237 (Andrew) I experimented with this and ran into an extra wrinkle, which is that nova-injected keys are typically not root keys. For instance, on ubuntu hosts the k...
[17:50:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[18:05:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown