[00:10:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:12:06] cloud-services-team, Toolforge: Account disabled - https://phabricator.wikimedia.org/T426544#11929103 (Reedy) You've not given enough information for people to really be able to do any investigation. What account username? What wiki(s)? Was your bot account definitely logged in when it attempted that s...
[00:15:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:51:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:06:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:13:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:28:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:34:32] FIRING: TargetDown: Job app is unreachable in project quarry instance quarry.wmcloud.org:443 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTargetDown
[04:34:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:34:43] FIRING: QuarryDown: Quarry application is unreachable - https://prometheus-alerts.wmcloud.org/?q=alertname%3DQuarryDown
[04:39:32] RESOLVED: TargetDown: Job app is unreachable in project quarry instance quarry.wmcloud.org:443 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTargetDown
[04:39:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:39:43] RESOLVED: QuarryDown: Quarry application is unreachable - https://prometheus-alerts.wmcloud.org/?q=alertname%3DQuarryDown
[06:45:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[06:50:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[07:09:34] FIRING: DiskSpace: Disk space cloudcumin1001:9100:/ 4.678% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcumin1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[07:17:38] cloud-services-team, Toolforge: Tools may not allow non-interactive commands via 'become' due to dotfile configuration - https://phabricator.wikimedia.org/T426378#11929379 (fgiunchedi) Thank you @bd808 for the fix and digging up T186108, definitely not a new problem!
[07:29:34] RESOLVED: DiskSpace: Disk space cloudcumin1001:9100:/ 4.675% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcumin1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[07:51:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[07:56:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[08:02:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[08:07:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[08:22:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[08:25:21] (open) komla: reorganize directories [toolforge-repos/komla-apps] - https://gitlab.wikimedia.org/toolforge-repos/komla-apps/-/merge_requests/2
[08:25:23] (approved) komla: reorganize directories [toolforge-repos/komla-apps] - https://gitlab.wikimedia.org/toolforge-repos/komla-apps/-/merge_requests/2
[08:25:26] (merge) komla: reorganize directories [toolforge-repos/komla-apps] - https://gitlab.wikimedia.org/toolforge-repos/komla-apps/-/merge_requests/2
[08:25:42] (update) raymond-ndibe: tests: fix wrong test data [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/290 (https://phabricator.wikimedia.org/T423880)
[08:26:41] (update) raymond-ndibe: core.py: skip one-off jobs when updating storage [repos/cloud/toolforge/jobs-api] (fix_data_handling_tests) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/288 (https://phabricator.wikimedia.org/T423544)
[08:27:24] (update) raymond-ndibe: loki.py: fix logs ordering [repos/cloud/toolforge/logs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/18 (https://phabricator.wikimedia.org/T401552)
[08:28:07] (update) raymond-ndibe: cli.py: stop resolving dynamic fields in the cli [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/151 (https://phabricator.wikimedia.org/T423828)
[08:29:31] (update) raymond-ndibe: core.py: fix jobs loading messaging ux issue [repos/cloud/toolforge/jobs-api] (fix_jobs_load_bug) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/291 (https://phabricator.wikimedia.org/T423880 https://phabricator.wikimedia.org/T423891)
[08:30:44] (update) raymond-ndibe: core.py: fix jobs loading messaging ux issue [repos/cloud/toolforge/jobs-api] (fix_jobs_load_bug) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/291 (https://phabricator.wikimedia.org/T423891)
[08:32:14] (update) raymond-ndibe: replace job images with web images [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/263 (https://phabricator.wikimedia.org/T415322)
[08:33:32] (update) raymond-ndibe: fix jobs load bug [repos/cloud/toolforge/jobs-api] (fix_oneoff_storage_error) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/287 (https://phabricator.wikimedia.org/T423544)
[08:35:02] (update) raymond-ndibe: jobs-api: test for proper handling of the diff variations of the --image argument [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1113 (https://phabricator.wikimedia.org/T414978 https://phabricator.wikimedia.org/T415322)
[08:35:14] (update) raymond-ndibe: jobs-api: use webservice image variants in one-off job tests [repos/cloud/toolforge/toolforge-deploy] (test_for_image_argument_handling_in_jobs) - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1115 (https://phabricator.wikimedia.org/T415322)
[08:47:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[08:57:05] cloud-services-team, Toolforge: [toolsbeta] probe flapping on ipv6 only - https://phabricator.wikimedia.org/T426584 (fnegri) NEW
[09:06:57] !log fnegri@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component builds-api (T423417)
[09:11:09] !log fnegri@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-api (T423417)
[09:23:54] Toolforge, tools-platform-team, Patch-For-Review: Define update process for Toolforge builder/runner images - https://phabricator.wikimedia.org/T424362#11930081 (fnegri) Side-note: Heroku [automatically restarts](https://devcenter.heroku.com/changelog-items/3641) all containers every 24 hours, so any...
[09:30:05] Tool-wikimedia-attribution, MediaWiki-REST-API, MW-Interfaces-Team, Patch-For-Review: Attribution API: Include wprov parameters in response URLs - https://phabricator.wikimedia.org/T425576#11930103 (Sarai-WMF) >>! In T425576#11922017, @pmiazga wrote: > @Sarai-WMF @HCoplin-WMF I added also `wprov`...
[09:52:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[09:54:48] Toolforge, tools-platform-team, Patch-For-Review: heroku builder and runner 24_0.21.8 rejects harbor ip host - https://phabricator.wikimedia.org/T426016#11930231 (fgiunchedi) Verified on lima-kilo on Linux, nuked the VM when `./start-devenv.sh` asked and ran the verification commands ` ......
[09:57:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[09:59:31] Toolforge, tools-platform-team, Patch-For-Review: heroku builder and runner 24_0.21.8 rejects harbor ip host - https://phabricator.wikimedia.org/T426016#11930237 (fnegri) In progress→Resolved > Verified on lima-kilo on Linux, nuked the VM when ./start-devenv.sh asked and ran the verificat...
[10:01:37] cloud-services-team, Cloud-VPS, Lingua-Libre: Configure vanity domain for lingualibre - https://phabricator.wikimedia.org/T419525#11930245 (Michael_Barbereau_WMFr) Hello @Volans and @dcaro, After three weeks of testing, everything looks good. THX Therefore, we can migrate our production environment...
[10:05:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[10:09:04] !log fnegri@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component builds-api (T423417)
[10:10:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[10:13:08] cloud-services-team, Cloud-VPS: acme-chief http-01 backend requests failing on ipv6 - https://phabricator.wikimedia.org/T426594 (taavi) NEW
[10:16:07] !log fnegri@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component builds-api (T423417)
[10:16:48] (merge) fnegri: builds-api: update heroku builder and runner images [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1225 (https://phabricator.wikimedia.org/T423417 https://phabricator.wikimedia.org/T424362)
[10:18:29] cloud-services-team, Cloud-VPS, Lingua-Libre: Configure vanity domain for lingualibre - https://phabricator.wikimedia.org/T419525#11930324 (taavi) Open→Resolved a:taavi `lingualibre.org` is now available for you to use in the proxy as well.
[10:21:19] cloud-services-team, Cloud-VPS: acme-chief http-01 backend requests failing on ipv6 - https://phabricator.wikimedia.org/T426594#11930343 (taavi) This is a security group issue. In particular, the `default` security group in the project includes the following two v4-only rules instead of the usual allow a...
[10:45:34] (open) taavi: eqiad1: project-proxy: Add rules for acme-chief http-01 validation [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/312 (https://phabricator.wikimedia.org/T426594)
[10:45:54] !log taavi@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/312
[10:46:03] !log taavi@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/312
[10:46:57] !log taavi@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/312
[10:47:02] (update) taavi: eqiad1: project-proxy: Add rules for acme-chief http-01 validation [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/312 (https://phabricator.wikimedia.org/T426594)
[10:47:18] !log taavi@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/312
[10:52:21] (update) taavi: eqiad1: project-proxy: Add rules for acme-chief http-01 validation [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/312 (https://phabricator.wikimedia.org/T426594)
[11:58:45] (approved) volans: eqiad1: project-proxy: Add rules for acme-chief http-01 validation [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/312 (https://phabricator.wikimedia.org/T426594) (owner: taavi)
[11:59:09] (merge) taavi: eqiad1: project-proxy: Add rules for acme-chief http-01 validation [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/312 (https://phabricator.wikimedia.org/T426594)
[11:59:09] !log taavi@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[11:59:58] !log taavi@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch
[12:01:06] !log taavi@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/313
[12:01:13] (open) taavi: eqiad1: Remove stale imports [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/313
[12:01:26] !log taavi@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/313
[12:01:43] (merge) taavi: eqiad1: Remove stale imports [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/313
[12:13:57] (update) vriaa: feat: add Codex color token support to all color inputs [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/64 (https://phabricator.wikimedia.org/T420941 https://phabricator.wikimedia.org/T426315)
[12:15:41] (merge) vriaa: feat: add Codex color token support to all color inputs [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/64 (https://phabricator.wikimedia.org/T420941 https://phabricator.wikimedia.org/T426315)
[12:37:57] Toolforge, tools-platform-team, Patch-For-Review: maintain-dbusers should gracefully handle missing kubeconfig - https://phabricator.wikimedia.org/T424207#11930871 (Raymond_Ndibe) a:Raymond_Ndibe
[12:38:12] Toolforge, tools-platform-team, Patch-For-Review: maintain-dbusers should not pass sensitive data in command line parameters - https://phabricator.wikimedia.org/T424209#11930873 (Raymond_Ndibe) a:Raymond_Ndibe
[12:38:31] Toolforge, tools-platform-team, Patch-For-Review: webservice start should give a better error message when a conflicting job exists - https://phabricator.wikimedia.org/T423005#11930875 (Raymond_Ndibe) a:Raymond_Ndibe
[12:42:59] cloud-services-team, Cloud-VPS: acme-chief http-01 backend requests failing on ipv6 - https://phabricator.wikimedia.org/T426594#11930897 (taavi) Open→Resolved a:taavi
[12:53:37] (update) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/40
[13:20:24] cloud-services-team, Toolforge: Account disabled - https://phabricator.wikimedia.org/T426544#11931074 (taavi) This is the standard [[ https://www.mediawiki.org/wiki/Manual:$wgSoftBlockRanges | software-configured ]] soft block on WMCS ranges that's been in place for a while. @Hawkeye7 what makes you thi...
[13:49:08] Tool-wikimedia-attribution, MediaWiki-REST-API, MW-Interfaces-Team (MWI-Sprint-33 (2026-05-05 to 2026-05-19)), Patch-For-Review: Attribution API: Include wprov parameters in response URLs - https://phabricator.wikimedia.org/T425576#11931291 (pmiazga)
[13:49:09] Toolforge, tools-platform-team, Patch-For-Review: Define update process for Toolforge builder/runner images - https://phabricator.wikimedia.org/T424362#11931293 (fnegri) > I would like to test this assumption with this patch that updates our config to use the latest available version of heroku:24, bo...
[13:53:02] (open) taavi: istio-gateway: Make gateway node binding a hard rule [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1259 (https://phabricator.wikimedia.org/T426321)
[13:53:06] (update) taavi: istio-gateway: Make gateway node binding a hard rule [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1259 (https://phabricator.wikimedia.org/T426321)
[13:53:22] cloud-services-team, Toolforge, Patch-For-Review: [istio-gateway] Deploying the component can cause an outage - https://phabricator.wikimedia.org/T426321#11931302 (taavi) a:taavi
[13:53:24] (update) taavi: istio-gateway: Make gateway node binding a hard rule [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1259 (https://phabricator.wikimedia.org/T426321)
[14:06:01] Tool-wikimedia-attribution, MediaWiki-REST-API, MW-Interfaces-Team (MWI-Sprint-33 (2026-05-05 to 2026-05-19)), Patch-For-Review: Attribution API: Include wprov parameters in response URLs - https://phabricator.wikimedia.org/T425576#11931376 (HCoplin-WMF) The schema description is my preferred app...
[14:19:28] Tool-wikimedia-attribution, MediaWiki-REST-API, MW-Interfaces-Team (MWI-Sprint-33 (2026-05-05 to 2026-05-19)), Patch-For-Review: Attribution API: Include wprov parameters in response URLs - https://phabricator.wikimedia.org/T425576#11931442 (pmiazga) Open→In progress a:pmiazga
[14:40:21] VPS-project-Codesearch, MediaWiki-Platform-Team: Include "Mobile Apps" in "MediaWiki & services at WMF" preset for Codesearch - https://phabricator.wikimedia.org/T426627 (Krinkle) NEW
[14:41:58] VPS-project-Codesearch, MediaWiki-Platform-Team (Kanban Board): Include "Mobile Apps" in "MediaWiki & services at WMF" preset for Codesearch - https://phabricator.wikimedia.org/T426627#11931583 (JTweed-WMF)
[15:12:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:17:38] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:18:54] cloud-services-team (Hardware), Cloud-VPS: wmcs codfw hardware changes proposal - https://phabricator.wikimedia.org/T377568#11931810 (Andrew)
[15:20:53] cloud-services-team (Hardware), Cloud-VPS: wmcs codfw hardware changes proposal - https://phabricator.wikimedia.org/T377568#11931845 (Andrew) After some discussion today, I propose that we just switch of a decom 200[78].
[15:29:38] VPS-project-Codesearch, MediaWiki-Platform-Team (Kanban Board): Include "Mobile Apps" in "MediaWiki & services at WMF" preset for Codesearch - https://phabricator.wikimedia.org/T426627#11931930 (A_smart_kitten) Cross-referencing: - {T335407} - {T406500} ...as the tasks for which the 'Mobile Apps' secti...
[15:33:31] cloud-services-team, Openstack-Magnum, Patch-For-Review: Investigate new Magnum drivers - https://phabricator.wikimedia.org/T393782#11931952 (Andrew) "The new CAPI driver and the old Heat driver are compatible and can both be active on the same deployment, and the decision of which driver is used for...
[15:35:04] Cloud-VPS (Project-requests): Request creation of wiki-polis-backend VPS project - https://phabricator.wikimedia.org/T425892#11931956 (Andrew) Hello @Effeietsanders -- we're trying to figure out if this is something that could run on toolforge instead of having its own cloud-vps project. The need for postgre...
[15:37:08] cloud-services-team, Cloud-VPS, Lingua-Libre: Configure vanity domain for lingualibre - https://phabricator.wikimedia.org/T419525#11931964 (Yug) We are actively deploying lingua libre new code to its new address https://lingualibre.org (wmcloud.org) hosted. Thank @taavi for this.
[15:47:19] (close) raymond-ndibe: Draft: core: normalize job images [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/245 (owner: damian)
[15:48:17] (update) raymond-ndibe: istio-system: Disable X-Envoy-Peer-Metadata headers [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1235 (https://phabricator.wikimedia.org/T392356) (owner: taavi)
[16:28:51] Cloud-VPS (Project-requests): Request creation of wiki-polis-backend VPS project - https://phabricator.wikimedia.org/T425892#11932376 (Effeietsanders) I would expect that we would build the containers ourselves from the source. I am not aware of ready-made images of Particiapi and Polis. PostgreSQL may be av...
[16:40:32] FIRING: PuppetCertificateAboutToExpire: Puppet CA certificate Puppet CA: metricsinfra-puppetmaster-1.metricsinfra.eqiad1.wikimedia.cloud is about to expire in 26d 23h 58m 10s - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetCertificateAboutToExpire - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetCertificateAboutToExpire
[17:36:13] (approved) fnegri: istio-gateway: Make gateway node binding a hard rule [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1259 (https://phabricator.wikimedia.org/T426321) (owner: taavi)
[19:03:41] Cloud-VPS, tools-infrastructure-team: Web proxy DNS A record failure when creating development-metrics.wmcloud.org - https://phabricator.wikimedia.org/T426675 (thcipriani) NEW
[19:04:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[19:07:39] Cloud-VPS, tools-infrastructure-team: Web proxy DNS A record failure when creating development-metrics.wmcloud.org - https://phabricator.wikimedia.org/T426675#11933027 (taavi) Unable to reproduce: `lang=shell-session taavi@runko:~ $ host development-metrics.wmcloud.org 8.8.8.8 Using domain server: Name:...
[19:13:35] Cloud-VPS, tools-infrastructure-team: Web proxy DNS A record failure when creating development-metrics.wmcloud.org - https://phabricator.wikimedia.org/T426675#11933051 (thcipriani) I did leave the second web proxy in place so that it would stick. Just removed that. But `development-metrics.wmcloud.org` i...
[19:29:38] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[19:33:34] RESOLVED: DiskSpace: Disk space cloudbackup1004:9100:/srv 6.982% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[19:43:38] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[19:48:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[20:25:19] cloud-services-team, Cloud-VPS: [upstream] [openstack] Fix capi-helm magnum driver to support more template options - https://phabricator.wikimedia.org/T402232#11933400 (Andrew) Open→Invalid we've abandoned that template in favor of magnum-cluster-api
[20:33:49] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.roll_reboot_osds (T426563)
[20:33:49] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.ceph.roll_reboot_osds (exit_code=99) (T426563)
[20:34:31] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.roll_reboot_osds (T426563)
[20:34:31] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.ceph.roll_reboot_osds (exit_code=99) (T426563)
[20:35:06] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.roll_reboot_osds (T426563)
[20:36:59] (PS1) Andrew Bogott: inventory: replace cloudcephmon2004-dev with cloudcephmon2007-dev [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1288982
[20:37:42] PROBLEM - Host cloudcephosd1016 is DOWN: PING CRITICAL - Packet loss = 100%
[20:39:11] RECOVERY - Host cloudcephosd1016 is UP: PING OK - Packet loss = 0%, RTA = 0.30 ms
[20:40:33] (CR) CI reject: [V:-1] inventory: replace cloudcephmon2004-dev with cloudcephmon2007-dev [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1288982 (owner: Andrew Bogott)
[20:42:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[20:42:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[20:47:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[21:54:19] Cloud-VPS, tools-infrastructure-team: Web proxy DNS A record failure when creating development-metrics.wmcloud.org - https://phabricator.wikimedia.org/T426675#11933668 (thcipriani) Open→Invalid Well. After I removed the second web proxy a bit ago, the first web proxy is still resolving. Nothi...
[22:48:13] cloud-services-team, Openstack-Magnum: Investigate new Magnum drivers - https://phabricator.wikimedia.org/T393782#11933819 (bd808) >>! In T393782#11931952, @Andrew wrote: > "The new CAPI driver and the old Heat driver are compatible and can both be active on the same deployment" That should make testing...
[22:48:58] Tool-wmf-openapi-linter, MW-Interfaces-Team (MWI-Sprint-34 (2026-05-19 to 2026-06-02)), OKR-Work: Improve linting - enum descriptions - https://phabricator.wikimedia.org/T424210#11933826 (HCoplin-WMF)
[23:13:38] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[23:18:13] !log tools.cluebotng-monitoring Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26065893095 (https://github.com/cluebotng/component-configs/commits/0bf6e1847388fd90f2c03acdc1b80d388cb437e7)
[23:18:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-monitoring/SAL
[23:18:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[23:33:58] !log tools.cluebotng-monitoring Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26066535621 (https://github.com/cluebotng/component-configs/commits/2b0b0bbd4e0a727a468343823d4343fe7a082ef9)
[23:34:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-monitoring/SAL
[23:39:10] !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.ceph.roll_reboot_osds (exit_code=99) (T426563)