[00:38:25] FIRING: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[00:43:25] RESOLVED: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:04:25] FIRING: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:09:25] RESOLVED: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:25:48] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo: ULSFO: Decommision old switches (asw2-22/23-ulsfo) - https://phabricator.wikimedia.org/T427246 (Papaul) NEW
[04:34:16] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: Decommision old switches (asw2-22/23-ulsfo) - https://phabricator.wikimedia.org/T427246#11953658 (Papaul)
[04:47:42] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: Decommision old switches (asw2-22/23-ulsfo) - https://phabricator.wikimedia.org/T427246#11953661 (Papaul)
[04:49:42] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: Decommision old switches (asw2-22/23-ulsfo) - https://phabricator.wikimedia.org/T427246#11953662 (Papaul) Open→Resolved Both switches are now set to offline. The only step left is for onsite to remove all the cable...
[04:51:51] netops, Infrastructure-Foundations, SRE: InboundInterfaceErrors alerts firing for Nokia switches on v25.10.1 - https://phabricator.wikimedia.org/T412733#11953665 (Papaul) Email back from Nokia team ` The target release is still being considered. I’ll let you know once we have more information. `
[06:36:25] FIRING: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:41:25] RESOLVED: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:02:35] FIRING: DiskSpace: Disk space krb1002:9100:/ 1.639% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[07:07:35] RESOLVED: DiskSpace: Disk space krb1002:9100:/ 1.321% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[07:49:11] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations: bird bfd session with 172.20.1.1 down - Bad packet from 172.20.1.1 - unknown session id - https://phabricator.wikimedia.org/T427202#11953849 (cmooney) Yeah not really sure what happened there @fgiunchedi, a sync issue with the se...
[07:57:35] FIRING: DiskSpace: Disk space krb1002:9100:/ 2.084% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[08:03:15] moritzm: I'm seeing a 50GB /var/log folder, mostly due to large krb5kdc.log files
[08:05:24] it's the presto logs, the current kdc.log became so large that logrotate failed to compress the .1 file
[08:05:54] I'll also make a patch to reduce the retention, we don't need to keep 672 days of KDC logs?
[08:06:31] but the root cause will only really be fixed with https://phabricator.wikimedia.org/T358196 or something else which avoids the excessive logging in Presto
[08:07:25] FIRING: SystemdUnitFailed: prometheus-ethtool-exporter.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:07:35] RESOLVED: DiskSpace: Disk space krb1002:9100:/ 0% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[08:32:24] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations: bird bfd session with 172.20.1.1 down - Bad packet from 172.20.1.1 - unknown session id - https://phabricator.wikimedia.org/T427202#11953989 (fgiunchedi) Thank you for the detailed explanation @cmooney, definitely TIL things abou...
[08:32:25] RESOLVED: SystemdUnitFailed: prometheus-ethtool-exporter.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:59:33] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations: bird bfd session with 172.20.1.1 down - Bad packet from 172.20.1.1 - unknown session id - https://phabricator.wikimedia.org/T427202#11954086 (cmooney) >>! In T427202#11953989, @fgiunchedi wrote: > Thank you for the detailed expla...
[08:59:35] FIRING: DiskSpace: Disk space krb1002:9100:/ 2.009% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[09:04:34] RESOLVED: DiskSpace: Disk space krb1002:9100:/ 0.494% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[09:14:58] Mail, collaboration-services, Infrastructure-Foundations, SRE, and 2 others: Replace Spamassassin with Rspam for VRTS on Postfix - https://phabricator.wikimedia.org/T402260#11954191 (ABran-WMF) Following up on yesterday's merge, I created a [[ https://grafana.wikimedia.org/goto/efn7pi5lmtj40e?org...
[09:56:35] FIRING: DiskSpace: Disk space krb1002:9100:/ 2.056% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[10:06:25] FIRING: SystemdUnitFailed: prometheus-ethtool-exporter.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:06:35] RESOLVED: DiskSpace: Disk space krb1002:9100:/ 0% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[10:36:25] RESOLVED: SystemdUnitFailed: prometheus-ethtool-exporter.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:55:35] FIRING: DiskSpace: Disk space krb1002:9100:/ 2.34% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[11:05:34] RESOLVED: DiskSpace: Disk space krb1002:9100:/ 0% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[11:06:25] FIRING: SystemdUnitFailed: prometheus-ethtool-exporter.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:22:38] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11954730 (FCeratto-WMF) `es2042` and `es2041` in section `es4` have been switched: `es2041` is now a replica and can be depooled
[11:36:25] RESOLVED: SystemdUnitFailed: prometheus-ethtool-exporter.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:56:44] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11954962 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=52314e9c-92e4-4ce8-aff3-713ec1b15d3f) set by jynus@cumin1003 for 6:00...
[11:57:35] FIRING: DiskSpace: Disk space krb1002:9100:/ 1.815% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[12:06:25] FIRING: SystemdUnitFailed: prometheus-ethtool-exporter.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:07:35] RESOLVED: DiskSpace: Disk space krb1002:9100:/ 0% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=krb1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[12:09:38] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955004 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=88812e40-edf3-45b2-b6f9-ae1f746a9dee) set by fabfur@cumin1003 for 2:0...
[12:36:25] RESOLVED: SystemdUnitFailed: prometheus-ethtool-exporter.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:40:55] FIRING: [2x] SystemdUnitFailed: prometheus-ethtool-exporter.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:28:12] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955234 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=0ff82d6d-6a46-4d3b-b727-57ef8402c512) set by ayounsi@cumin1003 for 2:...
[13:29:30] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955244 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=c14853fb-268e-4348-b4c0-d1f48c81fb76) set by ayounsi@cumin1003 for 2:...
[13:31:42] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955251 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by ayounsi@cumin1003 depool for host wikikube-ctrl2003.codfw.w...
[13:34:34] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955252 (ops-monitoring-bot) Completed depooling of db2196 by ayounsi@cumin1003: switch maintenance
[13:35:21] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955254 (ops-monitoring-bot) Completed depooling of db2221 by ayounsi@cumin1003: switch maintenance
[13:35:59] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955255 (ops-monitoring-bot) Completed depooling of db2222 by ayounsi@cumin1003: switch maintenance
[13:36:31] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955256 (ops-monitoring-bot) Completed depooling of db2223 by ayounsi@cumin1003: switch maintenance
[13:55:55] FIRING: [2x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:08:32] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955386 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by ayounsi@cumin1003 pool for host wikikube-ctrl2003.codfw.wmn...
[14:10:55] FIRING: [2x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:15:00] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955409 (ops-monitoring-bot) Starting pool of db2223 by ayounsi@cumin1003: switch maintenance
[14:17:32] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo: ULSFO: Unrack old switches (asw2-22/23-ulsfo) - https://phabricator.wikimedia.org/T427283 (Papaul) NEW
[14:18:19] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: Decommision old switches (asw2-22/23-ulsfo) - https://phabricator.wikimedia.org/T427246#11955437 (Papaul)
[14:29:53] moritzm: you can repool ganeti2029/2030
[14:31:09] ok
[14:49:58] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955618 (ops-monitoring-bot) Starting pool of db2221 by fceratto@cumin1003: Rack maintenance completed
[14:51:18] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955626 (ops-monitoring-bot) Starting pool of db2222 by fceratto@cumin1003: Rack maintenance completed
[14:57:07] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955679 (ops-monitoring-bot) Starting pool of db2196 by fceratto@cumin1003: Rack maintenance completed
[15:00:25] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955702 (ops-monitoring-bot) Completed pooling of db2223 by ayounsi@cumin1003: switch maintenance
[15:05:06] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955724 (ops-monitoring-bot) Completed pooling of db2221 by fceratto@cumin1003: Rack maintenance completed
[15:06:32] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955732 (ops-monitoring-bot) Completed pooling of db2222 by fceratto@cumin1003: Rack maintenance completed
[15:12:15] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955744 (ops-monitoring-bot) Completed pooling of db2196 by fceratto@cumin1003: Rack maintenance completed
[15:12:45] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955745 (FCeratto-WMF) db2196, db2221 and db2222 have silences removed and are fully pooled-in
[15:41:39] netops, Infrastructure-Foundations, ServiceOps new, ServiceOps-Upgrades-Hardware: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11955910 (ayounsi) Open→Resolved Switch upgraded ! Thanks all for the help, next one is going to be easier :)
[15:48:42] SRE-tools, Ceph, cloud-services-team, Cloud-VPS, Infrastructure-Foundations: Enhacements to wmcs.ceph.roll_reboot_osds - https://phabricator.wikimedia.org/T427295 (Andrew) NEW
[16:40:04] netops, Infrastructure-Foundations, ServiceOps new: codfw: rack A3 maintenance - https://phabricator.wikimedia.org/T427301 (ayounsi) NEW p:Triage→Medium
[16:40:55] FIRING: [2x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:43:29] netops, Infrastructure-Foundations, ServiceOps new: codfw: rack A3 maintenance - https://phabricator.wikimedia.org/T427301#11956307 (ayounsi)
[16:44:07] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw: pod AB switches upgrade (2026) - https://phabricator.wikimedia.org/T426197#11956308 (ayounsi)
[16:44:24] netops, Infrastructure-Foundations, ServiceOps new: codfw: rack A3 maintenance - https://phabricator.wikimedia.org/T427301#11956311 (ayounsi)
[16:44:26] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw: pod AB switches upgrade (2026) - https://phabricator.wikimedia.org/T426197#11956312 (ayounsi)
[16:45:55] FIRING: [2x] SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:40:55] RESOLVED: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:35:27] SRE-tools, Ceph, cloud-services-team, Cloud-VPS, and 2 others: Enhacements to wmcs.ceph.roll_reboot_osds - https://phabricator.wikimedia.org/T427295#11956936 (Andrew) Part 1 would involve a fair bit of refactoring since we currently use 'ceph node' calls to enumerate osd nodes rather than cumin.
[19:46:55] Heyo, I'm trying to reimage a host (durum5003) and notice that the cookbook is no longer asking for the management password - the cookbook also is not rebooting the host into PXE and just reboots the host
[21:09:09] oh, right, durum is a vm. nevermind that
[21:35:54] SRE-tools, Ceph, cloud-services-team, Cloud-VPS, and 2 others: Enhancements to wmcs.ceph.roll_reboot_osds - https://phabricator.wikimedia.org/T427295#11957363 (Aklapper)