[01:35:42] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1058.eqiad.wmnet}' (T419948)
[01:38:30] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:38:39] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0)
[01:38:43] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:38:52] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0)
[01:38:56] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:39:05] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0)
[01:39:10] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:39:19] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0)
[01:39:26] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:39:35] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0)
[01:39:49] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:39:58] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0)
[01:40:15] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:40:17] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=99)
[01:40:23] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:40:32] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0)
[01:40:36] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:40:44] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0)
[01:40:48] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:40:56] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0)
[01:41:04] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.unset_maintenance
[01:41:12] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.unset_maintenance (exit_code=0)
[01:44:04] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=99) on hosts matched by 'D{cloudvirt1058.eqiad.wmnet}' (T419948)
[01:45:31] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1049.eqiad.wmnet}' (T419948)
[01:46:48] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=99) on hosts matched by 'D{cloudvirt1049.eqiad.wmnet}' (T419948)
[01:47:56] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,nova
[01:54:02] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,nova
[01:54:32] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1049.eqiad.wmnet}' (T419948)
[02:06:36] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1049.eqiad.wmnet}' (T419948)
[02:06:37] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1048.eqiad.wmnet}' (T419948)
[02:30:07] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1048.eqiad.wmnet}' (T419948)
[02:30:08] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1047.eqiad.wmnet}' (T419948)
[02:55:59] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1047.eqiad.wmnet}' (T419948)
[02:56:01] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1046.eqiad.wmnet}' (T419948)
[02:56:49] <jinxer-wm>	 FIRING: NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudvirt1047 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[03:01:49] <jinxer-wm>	 RESOLVED: NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudvirt1047 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[03:16:59] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1046.eqiad.wmnet}' (T419948)
[03:17:00] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1045.eqiad.wmnet}' (T419948)
[03:42:25] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1045.eqiad.wmnet}' (T419948)
[03:42:26] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1044.eqiad.wmnet}' (T419948)
[03:54:16] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: 429 Too Many Requests hit despite throttling to 100 req/sec - https://phabricator.wikimedia.org/T219857#11712004 (Hawkeye7) Massviews was working on 7 December 2023. I am sure it was working in 2024.
[04:09:20] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1044.eqiad.wmnet}' (T419948)
[04:09:21] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1043.eqiad.wmnet}' (T419948)
[04:31:32] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1043.eqiad.wmnet}' (T419948)
[04:31:33] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1042.eqiad.wmnet}' (T419948)
[04:32:49] <jinxer-wm>	 FIRING: NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudvirt1043 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[04:37:49] <jinxer-wm>	 RESOLVED: NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudvirt1043 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[04:52:17] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1042.eqiad.wmnet}' (T419948)
[04:52:18] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1041.eqiad.wmnet}' (T419948)
[04:52:49] <jinxer-wm>	 FIRING: NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudvirt1042 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[04:57:49] <jinxer-wm>	 RESOLVED: NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudvirt1042 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[05:14:19] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1041.eqiad.wmnet}' (T419948)
[05:14:20] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1040.eqiad.wmnet}' (T419948)
[05:36:18] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=0) on hosts matched by 'D{cloudvirt1040.eqiad.wmnet}' (T419948)
[05:36:49] <jinxer-wm>	 FIRING: NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudvirt1040 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[05:41:49] <jinxer-wm>	 RESOLVED: NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudvirt1040 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[08:11:22] <wikibugs>	 (open) r4356th: Preserve newlines [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/3
[08:20:52] <wikibugs>	 (update) r4356th: Preserve newlines [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/3
[08:22:05] <wikibugs>	 (update) r4356th: Preserve whitespace around newlines [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/3
[08:22:51] <wikibugs>	 (update) r4356th: Preserve whitespace around newlines [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/3
[08:23:33] <wikibugs>	 (merge) r4356th: Preserve whitespace around newlines [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/3
[08:48:32] <wmcs-alerts>	 FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project cloudinfra   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure
[08:53:12] <wikibugs>	 cloud-services-team, Cloud-VPS: cloudcumin not able to communicate with openstack.eqiad1.wikimediacloud.org:25000 anymore - https://phabricator.wikimedia.org/T419996#11712337 (fgiunchedi) @taavi mentioned that https://gerrit.wikimedia.org/r/c/operations/homer/public/+/970275 might have broken this commun...
[09:03:32] <wmcs-alerts>	 RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project cloudinfra   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure
[09:06:32] <wmcs-alerts>	 FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project cvn   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure
[09:06:32] <wmcs-alerts>	 FIRING: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project metricsinfra   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure
[09:18:49] <wikibugs>	 Tool-delintbot: Fix cases of tags not being closed correctly - https://phabricator.wikimedia.org/T417483#11712398 (Kavaljeet_Singh) I have cloned the repository and explored the codebase. It looks like page text processing is handled through the performreplacements() function in str_replacements.py which is...
[09:21:32] <wmcs-alerts>	 RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project cvn   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure
[09:21:32] <wmcs-alerts>	 RESOLVED: WidespreadPuppetAgentFailure: Widespread puppet agent failures in project metricsinfra   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadPuppetAgentFailure
[09:45:40] <wikibugs>	 cloud-services-team, Data-Services, Data-Persistence: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177 (fnegri) NEW
[09:46:33] <wikibugs>	 cloud-services-team, Data-Services, Data-Persistence: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11712598 (fnegri)
[09:53:34] <wikibugs>	 cloud-services-team, Data-Services, Data-Persistence: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11712670 (fnegri) The automatic restart did not restart replication, so the host is currently lagging behind.  @taavi depooled the host today, so...
[09:57:07] <wikibugs>	 cloud-services-team (FY2025/2026-Q3-Q4), Data-Services, Data-Persistence: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11712706 (fnegri) p:Triage→High a:fnegri
[09:58:19] <wikibugs>	 cloud-services-team (FY2025/2026-Q3-Q4), Data-Services, Data-Persistence: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11712711 (FCeratto-WMF) For context, the upgrade included also: ` 2026-03-13 09:46:26 status installed linux-image-6.1.0-44-a...
[10:05:00] <wikibugs>	 (open) r4356th: Only fix closing tags if the tag itself is known [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/4
[10:08:11] <wikibugs>	 (update) r4356th: Only fix closing tags if the tag itself is known [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/4
[10:12:42] <wikibugs>	 (update) r4356th: Only fix closing tags if the tag itself is known [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/4
[10:13:15] <wikibugs>	 (merge) r4356th: Only fix closing tags if the tag itself is known [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/4
[10:49:36] <wikibugs>	 (open) r4356th: Strip the opening quote if it does not have a corresponding end quote in an attribute's value [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/5
[10:49:52] <wikibugs>	 (update) r4356th: Strip the opening quote if it does not have a corresponding end quote in an attribute's value [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/5
[10:50:02] <wikibugs>	 (merge) r4356th: Strip the opening quote if it does not have a corresponding end quote in an attribute's value [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/5
[10:51:31] <wikibugs>	 Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11712914 (Only-Vikas) Hi @Nokib_Sarkar and @Tiven2240!  Now that the contribution period has officially opened, I am excited to share my progress on the CampWiz NxT Redesign. I have...
[11:10:50] <wikibugs>	 cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Permanently set 'noout' for cloudceph - https://phabricator.wikimedia.org/T419877#11713011 (Volans) I brought this up to the ceph working group and the only comments I got were from Ben, who was a bit surprised about our strategy and wanted to know more...
[11:23:37] <wikibugs>	 Tool-delintbot: Fix cases of tags not being closed correctly - https://phabricator.wikimedia.org/T417483#11713039 (Redmin) Oh, sorry, I have moved the code for lint replacements to https://gitlab.wikimedia.org/toolforge-repos/delintbot in the meantime. The code should go in `delinter.py` (either inside the `...
[11:29:28] <wikibugs>	 cloud-services-team, Data-Services: [wikireplicas] Add an option to cookbooks to specify which hosts should be targeted - https://phabricator.wikimedia.org/T393387#11713091 (fnegri) Related: {T273199}
[11:30:10] <wikibugs>	 cloud-services-team, Data-Services: Give the wmcs.wikireplicas.add_wiki cookbook a way to exclude a host - https://phabricator.wikimedia.org/T273199#11713095 (fnegri) →Duplicate dup:T393387
[11:30:12] <wikibugs>	 cloud-services-team, Data-Services: [wikireplicas] Add an option to cookbooks to specify which hosts should be targeted - https://phabricator.wikimedia.org/T393387#11713097 (fnegri)
[11:34:53] <wikibugs>	 Tool-delintbot: Fix multi-colon-escape errors for soft redirect template users - https://phabricator.wikimedia.org/T420197 (Redmin) NEW
[11:35:07] <wikibugs>	 Tool-delintbot: Fix multi-colon-escape errors for soft redirect template users - https://phabricator.wikimedia.org/T420197#11713122 (Redmin) p:Triage→High a:Redmin
[11:44:38] <wikibugs>	 cloud-services-team (FY2025/2026-Q3-Q4), Data-Services, Data-Persistence: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11713149 (Ladsgroup) >>! In T420177#11712669, @fnegri wrote: > The automatic restart did not restart replication, so the host...
[11:49:04] <wikibugs>	 (open) r4356th: Fix multi-colon-escape errors for soft redirect template users [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/6 (https://phabricator.wikimedia.org/T420197)
[11:49:14] <wikibugs>	 (update) r4356th: Fix multi-colon-escape errors for soft redirect template users [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/6 (https://phabricator.wikimedia.org/T420197)
[11:50:02] <wikibugs>	 (merge) r4356th: Fix multi-colon-escape errors for soft redirect template users [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/6 (https://phabricator.wikimedia.org/T420197)
[12:04:52] <wikibugs>	 Tool-delintbot, Patch-For-Review: Fix multi-colon-escape errors for soft redirect template users - https://phabricator.wikimedia.org/T420197#11713209 (Redmin) Open→Resolved Done with that MR, https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/commit/8d6c47c4eae0ba13cbfc9dfc518a94bd4e72a...
[12:08:49] <wikibugs>	 Tool-delintbot, Patch-For-Review: Fix multi-colon-escape errors for soft redirect template users - https://phabricator.wikimedia.org/T420197#11713233 (Redmin) Resolved→Open The query needs to be changed so the `lint_template = ''` check is not added for lint category 11 (multi-colon-escape).
[12:25:53] <wikibugs>	 cloud-services-team, Data-Services, Data-Persistence: Extend sre.mysql.upgrade to work with multiinstance hosts - https://phabricator.wikimedia.org/T420203 (fnegri) NEW
[12:26:14] <wikibugs>	 cloud-services-team, Data-Services, Data-Persistence: Extend sre.mysql.upgrade to work with multiinstance hosts - https://phabricator.wikimedia.org/T420203#11713319 (fnegri)
[12:26:33] <wikibugs>	 (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/62
[12:26:33] <wikibugs>	 (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/34
[12:28:04] <wikibugs>	 (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net. [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1253474 (owner: L10n-bot)
[12:30:24] <wikibugs>	 Tool-delintbot, Patch-For-Review: Fix multi-colon-escape errors for soft redirect template users - https://phabricator.wikimedia.org/T420197#11713323 (Redmin) Open→Resolved Done with https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/commit/50ba71eaea146499c89e41460e82c5732398d685.
[12:32:38] <wikibugs>	 cloud-services-team (FY2025/2026-Q3-Q4), Data-Services, Data-Persistence: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11713333 (fnegri) > I think the default behavior is not to start replication. Just issue "start slave" and it should be fineT...
[12:33:40] <wikibugs>	 cloud-services-team (FY2025/2026-Q3-Q4), Data-Services, Data-Persistence: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11713345 (fnegri) Open→In progress
[13:06:27] <icinga-wm>	 PROBLEM - mysqld processes on an-redacteddb1001 is CRITICAL: PROCS CRITICAL: 9 processes with command name mysqld https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting
[13:07:53] <wikibugs>	 cloud-services-team (FY2025/2026-Q3-Q4), Data-Services, Data-Engineering, Data-Platform-SRE (2026-03-06 - 2026-03-27): Drop support for cl_to, cl_collation and il_to from wikireplicas - https://phabricator.wikimedia.org/T417492#11713534 (BTullis) >>! In T417492#11703318, @fnegri wrote: > I ran th...
[13:08:53] <wikibugs>	 cloud-services-team (FY2025/2026-Q3-Q4), Data-Services, Data-Engineering, Data-Platform-SRE (2026-03-06 - 2026-03-27): Drop support for cl_to, cl_collation and il_to from wikireplicas - https://phabricator.wikimedia.org/T417492#11713539 (fnegri) In progress→Resolved a:fnegri
[13:10:02] <wikibugs>	 cloud-services-team (FY2025/2026-Q3-Q4), Data-Services, Data-Engineering, Data-Platform-SRE (2026-03-06 - 2026-03-27): Drop support for cl_to, cl_collation and il_to from wikireplicas - https://phabricator.wikimedia.org/T417492#11713550 (fnegri) a:fnegri→Zabe
[13:19:10] <wikibugs>	 (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/62 (owner: l10n-bot)
[13:19:13] <wikibugs>	 (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/62 (owner: l10n-bot)
[13:23:42] <wikibugs>	 (update) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/34 (owner: l10n-bot)
[13:25:26] <wikibugs>	 (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/34 (owner: l10n-bot)
[13:25:29] <wikibugs>	 (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/34 (owner: l10n-bot)
[13:37:27] <icinga-wm>	 RECOVERY - mysqld processes on an-redacteddb1001 is OK: PROCS OK: 8 processes with command name mysqld https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting
[13:45:37] <wikibugs>	 cloud-services-team, Cloud-VPS: Reimage cloudgw hosts to Trixie - https://phabricator.wikimedia.org/T401899#11713673 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin2002 for host cloudgw1003.eqiad.wmnet with OS trixie
[13:52:16] <wikibugs>	 cloud-services-team, Cloud-VPS: Deprecate and remove 'bastion-restricted' hosts - https://phabricator.wikimedia.org/T420213 (Andrew) NEW
[13:55:28] <wikibugs>	 (open) r4356th: Preserve content inside code tags [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/7
[13:55:56] <wikibugs>	 (merge) r4356th: Preserve content inside code tags [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/7
[14:00:42] <wikibugs>	 cloud-services-team, Cloud-VPS: cloudcumin not able to communicate with openstack.eqiad1.wikimediacloud.org:25000 anymore - https://phabricator.wikimedia.org/T419996#11713763 (taavi) Since that firewall change is "correct" in terms of the administrative policy we want to do, and the cloudcumin hosts live...
[14:17:56] <wikibugs>	 cloud-services-team, Cloud-VPS: cloudcumin not able to communicate with openstack.eqiad1.wikimediacloud.org:25000 anymore - https://phabricator.wikimedia.org/T419996#11713865 (fgiunchedi) I agree cloudcumin talking via prod http proxy like any other client is the right fix here. @Volans what do you think...
[14:29:05] <wikibugs>	 cloud-services-team, Cloud-VPS: cloudcumin not able to communicate with openstack.eqiad1.wikimediacloud.org:25000 anymore - https://phabricator.wikimedia.org/T419996#11713923 (Volans) Conceptually that could work for me, but I fear that we might need to patch cumin for that. Given that keystoneauth1 uses...
[14:34:33] <wikibugs>	 cloud-services-team, Cloud-VPS: Reimage cloudgw hosts to Trixie - https://phabricator.wikimedia.org/T401899#11713954 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin2002 for host cloudgw1003.eqiad.wmnet with OS trixie completed: - cloudgw1003 (**PASS**)   - Downtimed on I...
[14:45:52] <wikibugs>	 cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: Increased openstack latency and rabbitmq rolling restarts on certificate update - https://phabricator.wikimedia.org/T418444#11713984 (fgiunchedi) Confirmed that rabbitmq reloads certs without a restart:  ` cloudrabbit1001:~$ sudo systemctl status rabbit...
[15:21:35] <icinga-wm>	 PROBLEM - mysqld processes on clouddb1022 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting
[15:24:05] <icinga-wm>	 PROBLEM - Host clouddb1022 is DOWN: PING CRITICAL - Packet loss = 100%
[15:25:43] <icinga-wm>	 RECOVERY - Host clouddb1022 is UP: PING OK - Packet loss = 0%, RTA = 0.37 ms
[15:26:35] <icinga-wm>	 PROBLEM - mysqld processes on clouddb1022 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting
[15:29:35] <icinga-wm>	 RECOVERY - mysqld processes on clouddb1022 is OK: PROCS OK: 2 processes with command name mysqld https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting
[15:40:06] <wikibugs>	 VPS-project-Phabricator, collaboration-services: phabricator.wmcloud.org account verification request: Pppery - https://phabricator.wikimedia.org/T420149#11714255 (ABran-WMF) Open→In progress a:Dzahn
[15:40:18] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: 429 Too Many Requests hit despite throttling to 100 req/sec - https://phabricator.wikimedia.org/T219857#11714261 (daniel)  >>! In T219857#11704747, @MusikAnimal wrote: > It looks like the rate limiting policy might have cha...
[15:57:32] <icinga-wm>	 PROBLEM - Host clouddb1024 is DOWN: PING CRITICAL - Packet loss = 100%
[15:58:42] <icinga-wm>	 RECOVERY - Host clouddb1024 is UP: PING OK - Packet loss = 0%, RTA = 0.42 ms
[16:00:34] <icinga-wm>	 PROBLEM - mysqld processes on clouddb1024 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting
[16:03:34] <icinga-wm>	 RECOVERY - mysqld processes on clouddb1024 is OK: PROCS OK: 1 process with command name mysqld https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting
[16:03:43] <wikibugs>	 (PS1) Btullis: Add dummy analytics-wikidata keytabs [labs/private] - https://gerrit.wikimedia.org/r/1253550 (https://phabricator.wikimedia.org/T404073)
[16:03:48] <wikibugs>	 VPS-project-Phabricator, collaboration-services: phabricator.wmcloud.org account verification request: Pppery - https://phabricator.wikimedia.org/T420149#11714396 (Dzahn) Hello @Pppery  I tried to verify you but the email address that is associated in LDAP with the user called pppery does not exist in th...
[16:03:57] <wikibugs>	 (CR) Btullis: [V:+2 C:+2] Add dummy analytics-wikidata keytabs [labs/private] - https://gerrit.wikimedia.org/r/1253550 (https://phabricator.wikimedia.org/T404073) (owner: Btullis)
[16:05:02] <wikibugs>	 cloud-services-team, Cloud-VPS (Quota-requests): Add floating IP and vanity domain for azwikimedia project - https://phabricator.wikimedia.org/T419582#11714405 (fnegri) @Nemoralis your plan looks fine.  For the PTR record, can you please create a sub-task? We should be able to configure it for you.  Re:...
[16:38:40] <wikibugs>	 Toolforge (Toolforge iteration 26): [harbor,tools] Harbor object usage in S3 is steadily increasing - https://phabricator.wikimedia.org/T418528#11714589 (Raymond_Ndibe) I dug deeper into this. https://github.com/goharbor/harbor/issues/22111 is one of our problems, but is not the major one. Below are other...
[16:44:02] <wikibugs>	 VPS-project-Phabricator, collaboration-services: phabricator.wmcloud.org account verification request: Pppery - https://phabricator.wikimedia.org/T420149#11714616 (Pppery) I apparently used perry@olum.org on the test instance and mapreader@olum.org on LDAP. I created the account a while ago, don't rememb...
[16:55:34] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudservices1005.eqiad.wmnet' (T406516)
[16:55:40] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[16:55:55] <wikibugs>	 (open) r4356th: Correctly preserve nested nowiki, code, syntaxhighlight tags [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/8
[17:01:18] <wikibugs>	 (update) r4356th: Correctly preserve nested nowiki, code, syntaxhighlight tags [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/8
[17:02:02] <wikibugs>	 (merge) r4356th: Correctly preserve nested nowiki, code, syntaxhighlight tags [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/8
[17:02:43] <wikibugs>	 (update) r4356th: Correctly preserve nested nowiki, code, syntaxhighlight tags [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/8
[17:03:15] <wikibugs>	 (update) r4356th: Correctly preserve nested nowiki, code, syntaxhighlight tags [toolforge-repos/delintbot] - https://gitlab.wikimedia.org/toolforge-repos/delintbot/-/merge_requests/8
[17:03:17] <jinxer-wm>	 FIRING: [2x] JobUnavailable: Reduced availability for job pdns in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[17:03:36] <icinga-wm>	 PROBLEM - Host cloudservices1005 is DOWN: PING CRITICAL - Packet loss = 100%
[17:04:20] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudservices1005.eqiad.wmnet' (T406516)
[17:04:25] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[17:04:29] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudservices1006.eqiad.wmnet' (T406516)
[17:05:09] <wikibugs>	 (open) bd808: ci: Update pre-commit dependencies and fix new lint errors [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/68
[17:05:35] <wikibugs>	 VPS-project-Phabricator, collaboration-services: phabricator.wmcloud.org account verification request: Pppery - https://phabricator.wikimedia.org/T420149#11714728 (Dzahn) Gotcha!  Done.   ` dzahn@phabricator-bullseye:/srv/phab/phabricator/bin$ sudo ./auth verify perry@olum.org Done.  `
[17:06:05] <jinxer-wm>	 FIRING: [2x] HostBGPDown: BGP session for cloudservices1005 (172.20.2.4) is down - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status  - https://alerts.wikimedia.org/?q=alertname%3DHostBGPDown
[17:06:06] <icinga-wm>	 RECOVERY - Host cloudservices1005 is UP: PING OK - Packet loss = 0%, RTA = 0.40 ms
[17:06:14] <icinga-wm>	 PROBLEM - Check DNS auth via TCP of login.toolforge.org on server ns0.openstack.eqiad1.wikimediacloud.org on cloudservices1005 is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[17:06:14] <icinga-wm>	 PROBLEM - Check DNS auth via TCP of tools-puppetserver-01.tools.eqiad1.wikimedia.cloud on server ns0.openstack.eqiad1.wikimediacloud.org on cloudservices1005 is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[17:06:14] <icinga-wm>	 PROBLEM - Check DNS auth via UDP of www.wmcloud.org on server ns0.openstack.eqiad1.wikimediacloud.org on cloudservices1005 is CRITICAL: CRITICAL - Plugin timed out while executing system call https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[17:07:04] <icinga-wm>	 RECOVERY - Check DNS auth via TCP of login.toolforge.org on server ns0.openstack.eqiad1.wikimediacloud.org on cloudservices1005 is OK: DNS OK - 0.026 seconds response time (login.toolforge.org. 3600 IN CNAME bastion.toolforge.org.) https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[17:07:04] <icinga-wm>	 RECOVERY - Check DNS auth via TCP of tools-puppetserver-01.tools.eqiad1.wikimedia.cloud on server ns0.openstack.eqiad1.wikimediacloud.org on cloudservices1005 is OK: DNS OK - 0.027 seconds response time (tools-puppetserver-01.tools.eqiad1.wikimedia.cloud. 60 IN A 172.16.3.13) https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[17:07:04] <icinga-wm>	 RECOVERY - Check DNS auth via UDP of www.wmcloud.org on server ns0.openstack.eqiad1.wikimediacloud.org on cloudservices1005 is OK: DNS OK - 0.027 seconds response time (www.wmcloud.org. 3600 IN CNAME wmcloud.org.) https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting
[17:08:17] <jinxer-wm>	 RESOLVED: [2x] JobUnavailable: Reduced availability for job pdns in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[17:08:47] <wikibugs>	 VPS-project-Phabricator, collaboration-services: phabricator.wmcloud.org account verification request: Pppery - https://phabricator.wikimedia.org/T420149#11714735 (Dzahn) In progress→Resolved
[17:09:35] <wikibugs>	 (approved) jforrester: ci: Update pre-commit dependencies and fix new lint errors [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/68 (owner: bd808)
[17:09:59] <wikibugs>	 (merge) jforrester: ci: Update pre-commit dependencies and fix new lint errors [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/68 (owner: bd808)
[17:10:13] <wikibugs>	 (update) jforrester: channels: Remove wikimedia-collaboration [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/67 (owner: taavi)
[17:11:05] <jinxer-wm>	 RESOLVED: [2x] HostBGPDown: BGP session for cloudservices1005 (172.20.2.4) is down - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status  - https://alerts.wikimedia.org/?q=alertname%3DHostBGPDown
[17:11:52] <wikibugs>	 cloud-services-team, Cloud-VPS, Patch-For-Review: cloudcumin not able to communicate with openstack.eqiad1.wikimediacloud.org:25000 anymore - https://phabricator.wikimedia.org/T419996#11714769 (fgiunchedi) We discussed this in the team meeting today: to restore functionality I have https://gerrit.wik...
[17:13:08] <wikibugs>	 (approved) jforrester: channels: Remove wikimedia-collaboration [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/67 (owner: taavi)
[17:13:11] <wikibugs>	 (merge) jforrester: channels: Remove wikimedia-collaboration [toolforge-repos/wikibugs2] - https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/67 (owner: taavi)
[17:13:17] <jinxer-wm>	 FIRING: [2x] JobUnavailable: Reduced availability for job pdns in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[17:13:21] <wikibugs>	 VPS-project-Phabricator, collaboration-services, Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11714778 (Dzahn)
[17:14:39] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudservices1006.eqiad.wmnet' (T406516)
[17:14:46] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[17:17:35] <jinxer-wm>	 FIRING: [4x] HostBGPDown: BGP session for cloudservices1005 (172.20.2.4) is down - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status  - https://alerts.wikimedia.org/?q=alertname%3DHostBGPDown
[17:18:17] <jinxer-wm>	 RESOLVED: [2x] JobUnavailable: Reduced availability for job pdns in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[17:20:18] <wikibugs>	 Tool-phab-ban, Phabricator: Temporary ban feature from phab-ban to quickly response to Phabricator vandalism - https://phabricator.wikimedia.org/T420136#11714804 (bd808) Declined→Invalid
[17:22:35] <jinxer-wm>	 RESOLVED: [4x] HostBGPDown: BGP session for cloudservices1005 (172.20.2.4) is down - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status  - https://alerts.wikimedia.org/?q=alertname%3DHostBGPDown
[17:24:10] <wikibugs>	 Tool-inteGraality: Function "<http://wikiba.se/ontology#isSomeValue>" is currently not supported by QLever. - https://phabricator.wikimedia.org/T420247 (JeanFred) NEW
[17:24:23] <wikibugs>	 VPS-project-Codesearch, VerySmallGLAM, Wikibase Suite Team: Indexing of wikibase related repos - https://phabricator.wikimedia.org/T420067#11714828 (Dzahn) It would be great if WMDE could prioritize T374926 before we add Github repos to our search.
[17:40:17] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudweb.set_maintenance (T406516)
[17:40:24] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[17:42:20] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.cloudweb.set_maintenance (exit_code=99) (T406516)
[17:42:53] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudcontrol1006.eqiad.wmnet' (T406516)
[17:48:10] <jinxer-wm>	 FIRING: GaleraClusterSizeMismatch: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[17:48:22] <jinxer-wm>	 FIRING: HAProxyBackendUnavailable: HAProxy service mysql backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[17:53:10] <jinxer-wm>	 RESOLVED: GaleraClusterSizeMismatch: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[17:53:22] <jinxer-wm>	 RESOLVED: HAProxyBackendUnavailable: HAProxy service mysql backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[17:59:52] <jinxer-wm>	 FIRING: [13x] HAProxyBackendUnavailable: HAProxy service glance-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:00:10] <jinxer-wm>	 FIRING: [2x] GaleraClusterSizeMismatch: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[18:01:30] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudcontrol1006.eqiad.wmnet' (T406516)
[18:01:36] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[18:04:52] <jinxer-wm>	 FIRING: [15x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:05:10] <jinxer-wm>	 RESOLVED: [2x] GaleraClusterSizeMismatch: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[18:05:35] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudcontrol1007.eqiad.wmnet' (T406516)
[18:09:52] <jinxer-wm>	 RESOLVED: [15x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:10:40] <jinxer-wm>	 FIRING: [2x] GaleraClusterSizeMismatch: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[18:10:52] <jinxer-wm>	 FIRING: [4x] HAProxyBackendUnavailable: HAProxy service heat-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:11:07] <jinxer-wm>	 FIRING: [5x] HAProxyBackendUnavailable: HAProxy service designate-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:11:22] <jinxer-wm>	 FIRING: [13x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:15:25] <jinxer-wm>	 RESOLVED: [2x] GaleraClusterSizeMismatch: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[18:15:52] <jinxer-wm>	 RESOLVED: [16x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:17:41] <wikibugs>	 VPS-project-Phabricator, collaboration-services, Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11715141 (Aklapper) @Dzahn: Hmmm how does this affect the production instance? Or what did you have in mind by adding the...
[18:22:17] <jinxer-wm>	 FIRING: JobUnavailable: Reduced availability for job maintain_dbusers_eqiad in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[18:23:22] <jinxer-wm>	 FIRING: [15x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:25:27] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudcontrol1007.eqiad.wmnet' (T406516)
[18:25:34] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[18:26:37] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudcontrol1011.eqiad.wmnet' (T406516)
[18:27:17] <jinxer-wm>	 FIRING: [2x] JobUnavailable: Reduced availability for job maintain_dbusers_eqiad in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[18:28:22] <jinxer-wm>	 FIRING: [15x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:32:17] <jinxer-wm>	 RESOLVED: [2x] JobUnavailable: Reduced availability for job maintain_dbusers_eqiad in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[18:33:22] <jinxer-wm>	 RESOLVED: [16x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:39:58] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=99) on host 'cloudcontrol1011.eqiad.wmnet' (T406516)
[18:40:04] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[18:41:53] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudcontrol1011.eqiad.wmnet' (T406516)
[18:49:25] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: 429 Too Many Requests hit despite making requests synchronously - https://phabricator.wikimedia.org/T219857#11715341 (MusikAnimal)
[18:50:10] <jinxer-wm>	 FIRING: GaleraClusterSizeMismatch: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[18:51:52] <jinxer-wm>	 FIRING: [15x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1011.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:51:55] <jinxer-wm>	 FIRING: [2x] GaleraClusterSizeMismatch: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[18:52:02] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudcontrol1011.eqiad.wmnet' (T406516)
[18:52:09] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[18:53:55] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: massviews hits 429 Too Many Requests despite making requests synchronously - https://phabricator.wikimedia.org/T219857#11715367 (daniel)
[18:55:10] <jinxer-wm>	 RESOLVED: [2x] GaleraClusterSizeMismatch: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[18:56:27] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: massviews hits 429 Too Many Requests despite making requests synchronously - https://phabricator.wikimedia.org/T219857#11715374 (daniel) @MusikAnimal how many requests does this tool need to make to provide a useful respons...
[18:56:52] <jinxer-wm>	 RESOLVED: [15x] HAProxyBackendUnavailable: HAProxy service cinder-api_backend backend cloudcontrol1011.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[18:59:38] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudnet1006.eqiad.wmnet' (T406516)
[18:59:45] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[19:02:17] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: massviews hits 429 Too Many Requests despite making requests synchronously - https://phabricator.wikimedia.org/T219857#11715391 (MusikAnimal) >>! In T219857#11715374, @daniel wrote: > @MusikAnimal how many requests does thi...
[19:03:13] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: massviews hits 429 Too Many Requests despite making requests synchronously - https://phabricator.wikimedia.org/T219857#11715405 (MusikAnimal) And heck, for Massviews specifically, maybe it's not too much to ask for users to...
[19:06:28] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: massviews hits 429 Too Many Requests despite making requests synchronously - https://phabricator.wikimedia.org/T219857#11715406 (Hawkeye7) Can we roll back this 429 change?  I know I only use it once a year or so, but I rea...
[19:08:24] <icinga-wm>	 PROBLEM - Host cloudnet1006 is DOWN: PING CRITICAL - Packet loss = 100%
[19:10:27] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudnet1006.eqiad.wmnet' (T406516)
[19:10:34] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[19:11:06] <icinga-wm>	 RECOVERY - Host cloudnet1006 is UP: PING OK - Packet loss = 0%, RTA = 0.32 ms
[19:13:19] <wikibugs>	 cloud-services-team, SRE Observability, Traffic, Patch-For-Review: Move wikimediastatus.net 301 to ncredir - https://phabricator.wikimedia.org/T419887#11715415 (ssingh) Thanks for the task and the patch @colewhite. We will discuss this in Traffic and follow up here or on the CR itself.
[19:13:49] <jinxer-wm>	 FIRING: [4x] NeutronAgentDown: Neutron neutron-metadata-agent on cloudnet1006 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[19:15:44] <wikibugs>	 VPS-project-Phabricator, collaboration-services, Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11715419 (Dzahn) @Aklapper The last question above was about a configuration change. Configuration changes affect all ins...
[19:15:56] <jinxer-wm>	 FIRING: [4x] SystemdUnitDown: The service unit neutron-dhcp-agent.service is in failed status on host cloudnet1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudnet1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[19:18:14] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudnet1005.eqiad.wmnet' (T406516)
[19:18:20] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[19:20:56] <jinxer-wm>	 RESOLVED: [4x] SystemdUnitDown: The service unit neutron-dhcp-agent.service is in failed status on host cloudnet1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudnet1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[19:23:49] <jinxer-wm>	 RESOLVED: [4x] NeutronAgentDown: Neutron neutron-metadata-agent on cloudnet1006 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[19:42:33] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.safe_reboot on hosts matched by 'D{cloudvirt1075.eqiad.wmnet}' (T419948)
[19:43:33] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.cloudvirt.safe_reboot (exit_code=99) on hosts matched by 'D{cloudvirt1075.eqiad.wmnet}' (T419948)
[19:43:46] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1074.eqiad.wmnet' (T406516)
[19:48:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[19:51:24] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1074.eqiad.wmnet' (T406516)
[19:51:31] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[19:51:44] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1073.eqiad.wmnet' (T406516)
[19:57:25] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: massviews hits 429 Too Many Requests despite making requests synchronously - https://phabricator.wikimedia.org/T219857#11715558 (MusikAnimal) >>! In T219857#11715391, @MusikAnimal wrote: > … we could simply let the server a...
[19:59:07] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1073.eqiad.wmnet' (T406516)
[19:59:13] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[19:59:20] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1072.eqiad.wmnet' (T406516)
[20:00:22] <jinxer-wm>	 FIRING: [2x] HAProxyBackendUnavailable: HAProxy service designate-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[20:02:17] <jinxer-wm>	 FIRING: JobUnavailable: Reduced availability for job openstack in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[20:03:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[20:05:22] <jinxer-wm>	 RESOLVED: [5x] HAProxyBackendUnavailable: HAProxy service designate-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[20:07:04] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1072.eqiad.wmnet' (T406516)
[20:07:11] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[20:07:17] <jinxer-wm>	 RESOLVED: JobUnavailable: Reduced availability for job openstack in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[20:07:19] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1071.eqiad.wmnet' (T406516)
[20:14:21] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1071.eqiad.wmnet' (T406516)
[20:14:28] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[20:14:31] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1070.eqiad.wmnet' (T406516)
[20:16:48] <wikibugs>	 Tool-phab-ban: Consider enabling permanent sessions for clients with poor session scoped cookie handling - https://phabricator.wikimedia.org/T420147#11715657 (bd808)
[20:21:41] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1070.eqiad.wmnet' (T406516)
[20:21:47] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[20:22:36] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1060.eqiad.wmnet' (T406516)
[20:30:11] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1060.eqiad.wmnet' (T406516)
[20:30:12] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1061.eqiad.wmnet' (T406516)
[20:30:16] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[20:31:26] <wikibugs>	 Tool-phab-ban: Consider enabling permanent sessions for clients with poor session scoped cookie handling - https://phabricator.wikimedia.org/T420147#11715697 (bd808) p:Triage→Low I'm pretty sure the behavior described in the use case is a broken user-agent or a user-agent that believes it has been as...
[20:37:10] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1061.eqiad.wmnet' (T406516)
[20:37:11] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1062.eqiad.wmnet' (T406516)
[20:37:16] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[20:44:23] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1062.eqiad.wmnet' (T406516)
[20:44:24] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1063.eqiad.wmnet' (T406516)
[20:44:30] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[20:51:04] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1063.eqiad.wmnet' (T406516)
[20:51:05] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1064.eqiad.wmnet' (T406516)
[20:51:11] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[20:58:11] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1064.eqiad.wmnet' (T406516)
[20:58:12] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1065.eqiad.wmnet' (T406516)
[20:58:19] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[21:05:15] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1065.eqiad.wmnet' (T406516)
[21:05:16] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1066.eqiad.wmnet' (T406516)
[21:05:27] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[21:12:22] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1066.eqiad.wmnet' (T406516)
[21:12:23] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1067.eqiad.wmnet' (T406516)
[21:12:31] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[21:19:22] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1067.eqiad.wmnet' (T406516)
[21:19:23] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1068.eqiad.wmnet' (T406516)
[21:19:28] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[21:20:37] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: massviews hits 429 Too Many Requests despite making requests synchronously - https://phabricator.wikimedia.org/T219857#11715954 (daniel) >>! In T219857#11715405, @MusikAnimal wrote: > And heck, for Massviews specifically, m...
[21:26:15] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1068.eqiad.wmnet' (T406516)
[21:26:16] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1069.eqiad.wmnet' (T406516)
[21:26:24] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[21:28:18] <wikibugs>	 Tool-Pageviews, Data-Engineering, Data-Engineering-Icebox, Pageviews-API: massviews hits 429 Too Many Requests despite making requests synchronously - https://phabricator.wikimedia.org/T219857#11715985 (MusikAnimal) Thanks so much for the help!  >> And heck, for Massviews specifically, maybe it's...
[21:33:37] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1069.eqiad.wmnet' (T406516)
[21:33:44] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[21:58:31] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1050.eqiad.wmnet' (T406516)
[21:58:37] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[22:02:12] <wikibugs>	 cloud-services-team, DC-Ops, ops-codfw, SRE: cloudcephmon2007-dev service implementation - https://phabricator.wikimedia.org/T420282 (Andrew) NEW
[22:05:31] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1050.eqiad.wmnet' (T406516)
[22:05:33] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1051.eqiad.wmnet' (T406516)
[22:05:38] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[22:06:22] <jinxer-wm>	 FIRING: [3x] HAProxyBackendUnavailable: HAProxy service designate-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[22:07:22] <jinxer-wm>	 FIRING: [2x] HAProxyServiceUnavailable: HAProxy service designate-api_backend has no available backends on cloudlb1001:9900 - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyServiceUnavailable
[22:11:22] <jinxer-wm>	 RESOLVED: [5x] HAProxyBackendUnavailable: HAProxy service designate-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[22:12:22] <jinxer-wm>	 RESOLVED: [2x] HAProxyServiceUnavailable: HAProxy service designate-api_backend has no available backends on cloudlb1001:9900 - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyServiceUnavailable
[22:12:22] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1051.eqiad.wmnet' (T406516)
[22:12:23] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1052.eqiad.wmnet' (T406516)
[22:12:28] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[22:18:56] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1052.eqiad.wmnet' (T406516)
[22:18:57] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1053.eqiad.wmnet' (T406516)
[22:19:03] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[22:25:43] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1053.eqiad.wmnet' (T406516)
[22:25:44] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1054.eqiad.wmnet' (T406516)
[22:25:50] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[22:32:43] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1054.eqiad.wmnet' (T406516)
[22:32:44] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1055.eqiad.wmnet' (T406516)
[22:32:49] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[22:38:41] <jinxer-wm>	 FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:39:33] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1055.eqiad.wmnet' (T406516)
[22:39:34] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1056.eqiad.wmnet' (T406516)
[22:39:40] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[22:46:29] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1056.eqiad.wmnet' (T406516)
[22:46:30] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1057.eqiad.wmnet' (T406516)
[22:46:35] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[22:48:41] <jinxer-wm>	 RESOLVED: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:53:37] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1057.eqiad.wmnet' (T406516)
[22:53:38] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1058.eqiad.wmnet' (T406516)
[22:53:44] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[23:00:31] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1058.eqiad.wmnet' (T406516)
[23:00:32] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1059.eqiad.wmnet' (T406516)
[23:00:38] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516
[23:07:39] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1059.eqiad.wmnet' (T406516)
[23:07:45] <stashbot>	 T406516: Upgrade openstack to version 'Flamingo' - https://phabricator.wikimedia.org/T406516