[00:03:15] !log raymond-ndibe@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component logs-api
[00:06:20] FIRING: [2x] PrometheusK8sCertExpirySoon: Prometheus k8s certificate is about to expire - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/PrometheusK8sCertExpirySoon - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusK8sCertExpirySoon
[00:12:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:16:52] (update) raymond-ndibe: logs-api: bump to 0.0.24-20260520233418-6d096df8 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1262 (https://phabricator.wikimedia.org/T401552) (owner: group_203_bot_3c0afd0d9fd9529f3b7bc7e69a4a3bce)
[00:16:56] (approved) raymond-ndibe: logs-api: bump to 0.0.24-20260520233418-6d096df8 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1262 (https://phabricator.wikimedia.org/T401552) (owner: group_203_bot_3c0afd0d9fd9529f3b7bc7e69a4a3bce)
[00:17:01] (merge) raymond-ndibe: logs-api: bump to 0.0.24-20260520233418-6d096df8 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1262 (https://phabricator.wikimedia.org/T401552) (owner: group_203_bot_3c0afd0d9fd9529f3b7bc7e69a4a3bce)
[00:17:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:21:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:36:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:50:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:55:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:10:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:25:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:38:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:58:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:04:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[04:19:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[05:16:51] (open) countcount: Run tests on merge requests as well if the variables are set [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[05:21:21] (update) countcount: Run tests on merge requests as well if the variables are set [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[05:41:52] (update) countcount: Run tests on merge requests as well if the variables are set [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[05:45:30] (update) countcount: Run tests on merge requests as well if the variables are set [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[05:48:38] (update) countcount: Run tests on merge requests as well if the variables are set [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[05:56:21] (update) countcount: Run tests on merge requests as well if the variables are set [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[06:05:33] (update) countcount: Run tests on merge requests as well if the variables are set [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[06:11:20] (update) countcount: Run tests on merge requests as well if the variables are set [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[06:55:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[07:10:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[07:51:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[07:56:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[08:16:19] Tool-wmf-openapi-linter, MW-Interfaces-Team, OKR-Work: Fix schema property example validation - https://phabricator.wikimedia.org/T426836#11943791 (KBach)
[08:18:08] (update) countcount: Run tests on merge requests as well if the variables are set [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[08:26:45] (update) countcount: Run tests on merge requests as well if the variables are set [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[08:30:06] (update) countcount: Improve CI: run tests on MRs, prevent duplicate pipelines, add Cargo caching [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[09:04:28] (merge) countcount: Improve CI: run tests on MRs, prevent duplicate pipelines, add Cargo caching [toolforge-repos/multiuserinfo] - https://gitlab.wikimedia.org/toolforge-repos/multiuserinfo/-/merge_requests/3
[09:06:01] cloud-services-team, collaboration-services, Release-Engineering-Team, GitLab (CI & Job Runners): webservice-cli package deb gitlab CI job went from 9 minutes to 27 minutes - https://phabricator.wikimedia.org/T426827#11943992 (LSobanski)
[09:09:26] (merge) filippo: Lower LimitRange for cpu/memory requests [repos/cloud/toolforge/maintain-kubeusers] - https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/-/merge_requests/88 (https://phabricator.wikimedia.org/T420565)
[09:09:50] !log filippo@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component maintain-kubeusers
[09:09:55] !log filippo@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component maintain-kubeusers
[09:10:24] Tool-wmf-openapi-linter, MW-Interfaces-Team, OKR-Work: Fix linter rule duplicates - https://phabricator.wikimedia.org/T426825#11943997 (KBach) Open→In progress p:Low→Medium
[09:13:45] !log filippo@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component maintain-kubeusers
[09:13:49] (update) group_203_bot_3c0afd0d9fd9529f3b7bc7e69a4a3bce: maintain-kubeusers: bump to 0.0.196-20260521090937-6e415674 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1263 (https://phabricator.wikimedia.org/T420565)
[09:13:51] (open) group_203_bot_3c0afd0d9fd9529f3b7bc7e69a4a3bce: maintain-kubeusers: bump to 0.0.196-20260521090937-6e415674 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1263 (https://phabricator.wikimedia.org/T420565)
[09:29:05] !log filippo@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component maintain-kubeusers
[09:33:17] !log filippo@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component maintain-kubeusers
[09:48:12] !log filippo@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-kubeusers
[09:48:38] !log filippo@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component maintain-kubeusers
[10:10:44] !log filippo@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-kubeusers
[10:13:04] (merge) filippo: maintain-kubeusers: bump to 0.0.196-20260521090937-6e415674 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1263 (https://phabricator.wikimedia.org/T420565) (owner: group_203_bot_3c0afd0d9fd9529f3b7bc7e69a4a3bce)
[10:25:46] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11944159 (fnegri) Open→In progress
[10:35:25] cloud-services-team, collaboration-services, Release-Engineering-Team, GitLab (CI & Job Runners): webservice-cli package deb gitlab CI job went from 9 minutes to 27 minutes - https://phabricator.wikimedia.org/T426827#11944204 (fgiunchedi) The problem seems to be `fakeroot` and a huge `ulimit -n`,...
[10:39:35] cloud-services-team, collaboration-services, Release-Engineering-Team, GitLab (CI & Job Runners): webservice-cli package deb gitlab CI job went from 9 minutes to 27 minutes - https://phabricator.wikimedia.org/T426827#11944222 (fgiunchedi) To test this theory I changed `webservice-cli` gitlab-ci t...
[10:47:47] (open) filippo: bandaid high RLIMIT_NOFILE in gitlab runners [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/85 (https://phabricator.wikimedia.org/T426827)
[10:48:33] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-gateway-1, tools-k8s-gateway-2, tools-k8s-gateway-3
[10:50:21] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-gateway-1, tools-k8s-gateway-2, tools-k8s-gateway-3
[10:51:14] FIRING: [8x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAProxy server tools-k8s-gateway-1.tools.eqiad1.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDown
[10:51:48] FIRING: [3x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_api_svc_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[10:54:08] cloud-services-team, Toolforge: wmcs.toolforge.k8s.reboot needs to be slower with Gateway nodes - https://phabricator.wikimedia.org/T426948 (taavi) NEW
[10:54:18] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-control-9
[10:55:06] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-control-9
[10:55:39] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-control-8
[10:56:14] RESOLVED: [18x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAProxy server tools-k8s-gateway-1.tools.eqiad1.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDown
[10:56:19] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-control-8
[10:56:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[10:56:48] RESOLVED: [3x] ProbeDown: Service tools-k8s-haproxy-7:443 has failed probes (http_api_svc_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[10:57:10] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-control-7
[10:57:56] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-control-7
[11:06:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[11:56:37] !log taavi@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudlb.safe_reboot on hosts matched by 'D{cloudlb2002-dev.codfw.wmnet}'
[11:59:53] !log taavi@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudlb.safe_reboot (exit_code=0) on hosts matched by 'D{cloudlb2002-dev.codfw.wmnet}'
[12:01:05] FIRING: [2x] HostBGPDown: BGP session for cloudlb2002-dev (172.20.5.3) is down - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DHostBGPDown
[12:04:13] cloud-services-team, Cloud-VPS, Patch-For-Review: anycast-healthchecker fails to start on boot - https://phabricator.wikimedia.org/T426837#11944491 (taavi) Open→Resolved
[12:05:06] Tool-wmf-openapi-linter, MW-Interfaces-Team, OKR-Work, Patch-For-Review: Fix linter rule duplicates - https://phabricator.wikimedia.org/T426825#11944494 (KBach)
[12:06:05] RESOLVED: [2x] HostBGPDown: BGP session for cloudlb2002-dev (172.20.5.3) is down - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DHostBGPDown
[12:10:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:15:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[12:18:25] Tool-wmf-openapi-linter, MW-Interfaces-Team (MWI-Sprint-34 (2026-05-19 to 2026-06-02)), OKR-Work, Patch-For-Review: Fix linter rule duplicates - https://phabricator.wikimedia.org/T426825#11944545 (KBach)
[12:19:12] (update) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/40
[12:19:17] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/36
[12:53:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[13:08:17] FIRING: [2x] PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[13:12:18] (open) filippo: Use deb.d.o mirror [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/86 (https://phabricator.wikimedia.org/T423596)
[13:23:17] FIRING: [2x] PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[13:28:39] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.roll_reboot_mons (T426563)
[13:31:42] PROBLEM - Host cloudcephmon1004 is DOWN: PING CRITICAL - Packet loss = 100%
[13:32:07] Toolforge, tools-platform-team: [logs-api,jobs-cli] `toolforge jobs logs` has inconsistent ordering - https://phabricator.wikimedia.org/T401552#11944945 (Raymond_Ndibe) Open→Resolved
[13:33:30] RECOVERY - Host cloudcephmon1004 is UP: PING OK - Packet loss = 0%, RTA = 0.35 ms
[13:37:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[13:38:17] RESOLVED: [2x] PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[13:41:08] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.roll_reboot_mons (exit_code=0) (T426563)
[13:47:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[14:14:33] Toolforge, tools-platform-team, Patch-For-Review: maintain-dbusers should gracefully handle missing kubeconfig - https://phabricator.wikimedia.org/T424207#11945147 (Raymond_Ndibe) Open→Resolved
[14:14:37] Toolforge, tools-platform-team, Patch-For-Review: maintain-dbusers should not pass sensitive data in command line parameters - https://phabricator.wikimedia.org/T424209#11945149 (Raymond_Ndibe) Open→Resolved
[14:16:42] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.roll_reboot_mons
[14:17:04] !log andrew@cloudcumin1001 admin END (ERROR) - Cookbook wmcs.ceph.roll_reboot_mons (exit_code=97)
[14:17:21] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.roll_reboot_mons
[14:17:44] Tool-wmf-openapi-linter, MW-Interfaces-Team, OKR-Work: Fix schema property example validation - https://phabricator.wikimedia.org/T426836#11945155 (AGhirelli-WMF) Note for estimation: The fix requires inverting the logic in `schemaExampleUtils.js`: `findNestedExamples` currently passes if any propert...
[14:28:11] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.roll_reboot_mons (exit_code=0)
[14:29:05] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.roll_reboot_mons
[14:39:55] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.roll_reboot_mons (exit_code=0)
[15:02:45] cloud-services-team, Toolforge, Software-Licensing: Expand the Toolforge definition of "free license" to include FSF-approved and DFSG-compatible licenses - https://phabricator.wikimedia.org/T152581#11945389 (valerio.bozzolan) >>! In T152581#2859136, @Legoktm wrote: > I think the main ones in that li...
[15:12:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:17:27] Tool-wmf-openapi-linter, MW-Interfaces-Team, OKR-Work: Fix schema property example validation - https://phabricator.wikimedia.org/T426836#11945468 (HCoplin-WMF)
[15:59:37] Cloud-VPS (Project-requests): Request creation of wiki-polis-backend VPS project - https://phabricator.wikimedia.org/T425892#11945646 (Andrew) a:Andrew
[16:02:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[16:05:33] Cloud-VPS (Project-requests): Request creation of wiki-polis-backend VPS project - https://phabricator.wikimedia.org/T425892#11945668 (Andrew) After much discussion today we've agreed that cloud-vps is probably the best place for this. I'd nevertheless like to hear about what kind of setup you wind up with s...
[16:06:50] Cloud-VPS (Project-requests): Request creation of wiki-polis-backend VPS project - https://phabricator.wikimedia.org/T425892#11945677 (Andrew) Oh, um... for historical reasons having dashes in the project name is bad luck, is it OK if we do wikipolisbackend instead?
[16:09:46] cloud-services-team, Data-Services, tools-platform-team, Data-Persistence, Patch-For-Review: Extend sre.mysql.upgrade to work with multiinstance hosts - https://phabricator.wikimedia.org/T420203#11945680 (fnegri) a:fnegri
[16:11:38] Toolforge, tools-platform-team: [jobs-cli] support publishing continuous job to the internet - https://phabricator.wikimedia.org/T423410#11945694 (aputhin) Open→In progress
[16:11:45] Toolforge, tools-platform-team: [jobs-api] support exposing continuous jobs to the internet - https://phabricator.wikimedia.org/T423408#11945697 (aputhin) Open→In progress
[16:12:52] !log andrew@cloudcumin1001 wiki-polis-backend START - Cookbook wmcs.vps.create_project for project wiki-polis-backend in eqiad1
[16:12:54] andrew@cloudcumin1001: Unknown project "wiki-polis-backend"
[16:13:28] !log andrew@cloudcumin1001 wiki-polis-backend END (FAIL) - Cookbook wmcs.vps.create_project (exit_code=99) for project wiki-polis-backend in eqiad1
[16:13:28] andrew@cloudcumin1001: Unknown project "wiki-polis-backend"
[16:15:01] !log andrew@cloudcumin1001 wiki-polis-backend START - Cookbook wmcs.vps.create_project for project wiki-polis-backend in eqiad1
[16:15:01] andrew@cloudcumin1001: Unknown project "wiki-polis-backend"
[16:15:44] (open) group_199_bot_f98be072172e323ae6d1441939d3e461: projects: added project wiki-polis-backend [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/314
[16:18:45] (merge) andrew: projects: added project wiki-polis-backend [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/314 (owner: group_199_bot_f98be072172e323ae6d1441939d3e461)
[16:19:00] andrew@cloudcumin1001 create_project (PID 337593) is awaiting input
[16:21:19] !log andrew@cloudcumin1001 wiki-polis-backend END (PASS) - Cookbook wmcs.vps.create_project (exit_code=0) for project wiki-polis-backend in eqiad1
[16:28:02] Cloud-VPS (Project-requests): Request creation of wiki-polis-backend VPS project - https://phabricator.wikimedia.org/T425892#11945730 (Andrew) Open→Resolved Project created -- you should be able to create a postgres db with the 'databases' tab in Horizon, but be warned that a lot of the actual DB...
[16:33:25] (PS1) Hashar: Use `npm run test` as a CI entrypoint [labs/tools/wdaudiolex-fe] - https://gerrit.wikimedia.org/r/1290831 (https://phabricator.wikimedia.org/T426366)
[16:33:32] (CR) CI reject: [V:-1] Use `npm run test` as a CI entrypoint [labs/tools/wdaudiolex-fe] - https://gerrit.wikimedia.org/r/1290831 (https://phabricator.wikimedia.org/T426366) (owner: Hashar)
[16:36:10] (CR) Hashar: "recheck" [labs/tools/wdaudiolex-fe] - https://gerrit.wikimedia.org/r/1290831 (https://phabricator.wikimedia.org/T426366) (owner: Hashar)
[16:37:43] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11945746 (ops-monitoring-bot) Upgrading clouddb1014.eqiad.wmnet
[16:40:35] FIRING: PuppetCertificateAboutToExpire: Puppet CA certificate Puppet CA: metricsinfra-puppetmaster-1.metricsinfra.eqiad1.wikimedia.cloud is about to expire in 23d 23h 58m 10s - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetCertificateAboutToExpire - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetCertificateAboutToExpire
[16:44:41] (CR) Hashar: "Zuul fails to process because it does not support `@` in the branch `@feat/pagination`. The branch must be renamed. See T426366#11945751" [labs/tools/wdaudiolex-fe] - https://gerrit.wikimedia.org/r/1290831 (https://phabricator.wikimedia.org/T426366) (owner: Hashar)
[16:45:15] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11945753 (ops-monitoring-bot) Upgrade of clouddb1014.eqiad.wmnet completed
[16:48:05] cloud-services-team, Toolforge, Software-Licensing: Expand the Toolforge definition of "free license" to include FSF-approved and DFSG-compatible licenses - https://phabricator.wikimedia.org/T152581#11945754 (bd808) >>! In T152581#11945389, @valerio.bozzolan wrote: > → So, in all cases, we need answe...
[16:59:01] VPS-project-Phabricator, collaboration-services, Infrastructure-Foundations, Mail: @wikimedia.org email addresses don't seem to be receiving emails sent by the test Phabricator instance - https://phabricator.wikimedia.org/T422559#11945766 (jhathaway) @Dzahn and @A_smart_kitten, good points, I thi...
[16:59:46] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11945768 (fnegri)
[17:02:39] tool-wdlocator: No data from wikidata is loaded - https://phabricator.wikimedia.org/T426903#11945771 (Aklapper) a:Strubbl→None
[17:04:26] cloud-services-team, Data-Services, tools-platform-team, Data-Persistence, Patch-For-Review: Extend sre.mysql.upgrade to work with multiinstance hosts - https://phabricator.wikimedia.org/T420203#11945774 (fnegri) Open→In progress I gave a shot at adapting the cookbook to support cloud...
[17:11:37] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26241143565 (https://github.com/cluebotng/component-configs/commits/76187b1d77a1454297932d833d668fafc6490c10)
[17:11:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[17:13:59] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11945796 (ops-monitoring-bot) Upgrading clouddb1016.eqiad.wmnet
[17:17:57] Cloud-VPS (Quota-requests), WMIT-Infrastructure: Quota increase request for project osmit - https://phabricator.wikimedia.org/T426790#11945798 (bd808) WMCS has an ancient and largely locally undocumented tie to OpenStreetMap projects. The connection comes from a [[https://meta.wikimedia.org/wiki/OpenStre...
[17:23:05] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11945816 (ops-monitoring-bot) Upgrade of clouddb1016.eqiad.wmnet completed [17:24:05] cloud-services-team, Data-Services, tools-platform-team: [wikireplicas] Upgrade clouddbs to 10.11.16 - https://phabricator.wikimedia.org/T422527#11945819 (fnegri) [17:24:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [17:28:57] !log andrew@cloudcumin1001 testlabs START - Cookbook wmcs.vps.instance.force_reboot vm util-abogott (cluster codfw1dev, project testlabs) [17:29:01] !log andrew@cloudcumin1001 testlabs END (PASS) - Cookbook wmcs.vps.instance.force_reboot (exit_code=0) vm util-abogott (cluster codfw1dev, project testlabs) [17:30:47] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.roll_reboot_mons [17:41:13] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.roll_reboot_mons (exit_code=0) [17:49:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [18:06:22] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.roll_reboot_mons [18:17:32] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.roll_reboot_mons (exit_code=0) [18:21:15] Cloud-VPS (Project-requests): Request creation of wiki-polis-backend VPS project - 
https://phabricator.wikimedia.org/T425892#11946055 (Effeietsanders) Thanks a lot! And glad that there's one less thing to worry about re: back luck! [18:24:08] Cloud-Services, collaboration-services, Infrastructure-Foundations, Mail: @wikimedia.org email addresses don't seem to be receiving emails sent by the test Phabricator instance - https://phabricator.wikimedia.org/T422559#11946062 (Dzahn) The #Cloud-Services project tag is not intended to have any... [18:26:12] cloud-services-team, VPS-project-Phabricator, collaboration-services, Infrastructure-Foundations, Mail: @wikimedia.org email addresses don't seem to be receiving emails sent by the test Phabricator instance - https://phabricator.wikimedia.org/T422559#11946070 (JJMC89) [18:27:48] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.roll_reboot_mons [18:38:33] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.roll_reboot_mons (exit_code=0) [18:58:20] (PS3) Andrew Bogott: inventory: replace cloudcephmon2004-dev with cloudcephmon2007-dev [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1288982 [19:01:50] (CR) CI reject: [V:-1] inventory: replace cloudcephmon2004-dev with cloudcephmon2007-dev [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1288982 (owner: Andrew Bogott) [19:04:14] (update) ejegg: Add basic undo functionality [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/61 (https://phabricator.wikimedia.org/T421061) [19:10:39] (update) ejegg: Add basic undo functionality [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/61 (https://phabricator.wikimedia.org/T421061) [19:15:44] FIRING: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - 
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors [19:16:36] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/40 (owner: l10n-bot) [19:16:41] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/40 (owner: l10n-bot) [19:17:55] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/36 (owner: l10n-bot) [19:17:57] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/36 (owner: l10n-bot) [19:20:44] RESOLVED: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors [19:48:04] Cloud-VPS (Quota-requests), WMIT-Infrastructure: Quota increase request for project osmit - https://phabricator.wikimedia.org/T426790#11946277 (valerio.bozzolan) (thanks @bd808 for recalling these nice words about OSM - appreciated) The short answer is: Wikimedia Italia is the official OSM chapter in It... 
[20:04:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [20:09:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [20:37:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [20:39:21] !log tools.cluebotng Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26251627543 (https://github.com/cluebotng/component-configs/commits/96f9184e66a6e4b35a49f02940a213125945b056) [20:39:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng/SAL [20:40:00] !log tools.cluebotng-review Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26251627549 (https://github.com/cluebotng/component-configs/commits/96f9184e66a6e4b35a49f02940a213125945b056) [20:40:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-review/SAL [20:42:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - 
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [20:46:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [20:51:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [21:41:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [21:44:32] !log tools.cluebotng-monitoring Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26254787328 (https://github.com/cluebotng/component-configs/commits/80d9748516b3523cecfc76bd85d662fe91dde0c4) [21:44:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-monitoring/SAL [21:46:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - 
https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [21:47:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [21:48:38] !log tools.cluebotng-monitoring Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26254813158 (https://github.com/cluebotng/component-configs/commits/ebf46d18f22eee769feeeffd6f4603ec4f320bbb) [21:48:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-monitoring/SAL [21:52:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [21:53:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [22:08:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - 
https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [22:33:32] !log tools.cluebotng-monitoring Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26256861316 (https://github.com/cluebotng/component-configs/commits/e4c68154f62c8846320103abb0e97af63b20f9b1) [22:33:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-monitoring/SAL [22:37:19] !log tools.cluebotng-monitoring Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26256956868 (https://github.com/cluebotng/component-configs/commits/95f311a49172258bfd4d1c00d74a918a7aceab68) [22:37:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-monitoring/SAL [23:21:34] (PS4) Andrew Bogott: inventory: replace cloudcephmon2004-dev with cloudcephmon2007-dev [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1288982 [23:27:00] (CR) Andrew Bogott: [C:+2] inventory: replace cloudcephmon2004-dev with cloudcephmon2007-dev [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1288982 (owner: Andrew Bogott) [23:30:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [23:30:41] (Merged) jenkins-bot: inventory: replace cloudcephmon2004-dev with cloudcephmon2007-dev [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1288982 (owner: Andrew Bogott) [23:35:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - 
https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [23:44:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [23:49:37] RESOLVED: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [23:50:37] FIRING: ProbeDown: Service toolsbeta-test-k8s-haproxy-7:443 has failed probes (http_admin_beta_toolforge_org_ip6) - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/k8s-haproxy - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown