[00:39:13] <wikibugs>	 (PS1) TrainBranchBot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/912400
[00:39:17] <wikibugs>	 (CR) TrainBranchBot: [C: +2] Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/912400 (owner: TrainBranchBot)
[00:56:47] <wikibugs>	 (Merged) jenkins-bot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/912400 (owner: TrainBranchBot)
[00:57:00] <jinxer-wm>	 (PowerSupply) firing: (2) Power Supply - PS Redundancy - issue on an-worker1147:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=an-worker1147 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupply
[01:06:49] <wikibugs>	 SRE, DBA: db1132 index for table pagetriage_page is corrupt - https://phabricator.wikimedia.org/T335632 (jcrespo) I've left the full error at /home/jynus/corrupt_index on both hosts (although it should be also on the error log): ` db1106# mariadb -t -e "SHOW REPLICA STATUS\G" > /home/jynus/corrupt_index...
[02:01:33] <icinga-wm>	 RECOVERY - Check systemd state on mwlog1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[02:09:33] <jinxer-wm>	 (JobUnavailable) firing: Reduced availability for job sidekiq in ops@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[02:24:33] <jinxer-wm>	 (JobUnavailable) resolved: Reduced availability for job sidekiq in ops@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[02:28:55] <jinxer-wm>	 (FNMNotReported) firing: FastNetMon metrics not reported - https://wikitech.wikimedia.org/wiki/Fastnetmon - https://w.wiki/8oU - https://alerts.wikimedia.org/?q=alertname%3DFNMNotReported
[02:40:16] <jinxer-wm>	 (MediaWikiLatencyExceeded) firing: Average latency high: eqiad parsoid GET/200 - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad%20prometheus/ops&var-cluster=parsoid&var-method=GET - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
[02:45:16] <jinxer-wm>	 (MediaWikiLatencyExceeded) resolved: Average latency high: eqiad parsoid GET/200 - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad%20prometheus/ops&var-cluster=parsoid&var-method=GET - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
[03:00:03] <icinga-wm>	 RECOVERY - Check systemd state on build2001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[03:08:13] <jinxer-wm>	 (DiskSpace) firing: Disk space an-airflow1001:9100:/ 5.28% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-airflow1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[03:13:07] <icinga-wm>	 PROBLEM - Check systemd state on build2001 is CRITICAL: CRITICAL - degraded: The following units failed: docker-reporter-base-images.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[04:20:05] <wikibugs>	 SRE, DBA: db1132 index for table pagetriage_page is corrupt - https://phabricator.wikimedia.org/T335632 (Marostegui) Don't do anything. I want to report this upstream
[04:57:00] <jinxer-wm>	 (PowerSupply) firing: (2) Power Supply - PS Redundancy - issue on an-worker1147:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=an-worker1147 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupply
[05:39:16] <jinxer-wm>	 (MediaWikiLatencyExceeded) firing: Average latency high: eqiad parsoid GET/200 - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad%20prometheus/ops&var-cluster=parsoid&var-method=GET - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
[05:44:16] <jinxer-wm>	 (MediaWikiLatencyExceeded) resolved: Average latency high: eqiad parsoid GET/200 - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad%20prometheus/ops&var-cluster=parsoid&var-method=GET - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
[05:49:46] <jinxer-wm>	 (MediaWikiLatencyExceeded) firing: Average latency high: eqiad parsoid GET/200 - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad%20prometheus/ops&var-cluster=parsoid&var-method=GET - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
[05:54:46] <jinxer-wm>	 (MediaWikiLatencyExceeded) resolved: Average latency high: eqiad parsoid GET/200 - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad%20prometheus/ops&var-cluster=parsoid&var-method=GET - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
[05:57:35] <icinga-wm>	 PROBLEM - Host ores1002 is DOWN: PING CRITICAL - Packet loss = 100%
[06:28:55] <jinxer-wm>	 (FNMNotReported) firing: FastNetMon metrics not reported - https://wikitech.wikimedia.org/wiki/Fastnetmon - https://w.wiki/8oU - https://alerts.wikimedia.org/?q=alertname%3DFNMNotReported
[07:00:06] <jouncebot>	 Deploy window No deploys all day! See Deployments/Emergencies if things are broken. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20230430T0700)
[07:08:13] <jinxer-wm>	 (DiskSpace) firing: Disk space an-airflow1001:9100:/ 5.237% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-airflow1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[07:42:03] <icinga-wm>	 PROBLEM - Router interfaces on cr3-ulsfo is CRITICAL: CRITICAL: host 198.35.26.192, interfaces up: 69, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[07:50:09] <icinga-wm>	 RECOVERY - Router interfaces on cr3-ulsfo is OK: OK: host 198.35.26.192, interfaces up: 70, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[08:06:19] <elukey>	 !log powercycle ores1002 (mgmt console tty not usable, host frozen)
[08:06:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:08:47] <icinga-wm>	 RECOVERY - Host ores1002 is UP: PING OK - Packet loss = 0%, RTA = 0.28 ms
[08:57:00] <jinxer-wm>	 (PowerSupply) firing: (2) Power Supply - PS Redundancy - issue on an-worker1147:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=an-worker1147 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupply
[09:56:25] <icinga-wm>	 PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (bad URL) is CRITICAL: Test bad URL returned the unexpected status 503 (expecting: 404) https://wikitech.wikimedia.org/wiki/Citoid
[09:58:03] <icinga-wm>	 RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Citoid
[10:28:55] <jinxer-wm>	 (FNMNotReported) firing: FastNetMon metrics not reported - https://wikitech.wikimedia.org/wiki/Fastnetmon - https://w.wiki/8oU - https://alerts.wikimedia.org/?q=alertname%3DFNMNotReported
[11:07:54] <wikibugs>	 (PS1) Majavah: openstack: envscript: do not set a default for clouds_file [puppet] - https://gerrit.wikimedia.org/r/913660 (https://phabricator.wikimedia.org/T330759)
[11:07:56] <wikibugs>	 (PS1) Majavah: openstack: envscript: update default port [puppet] - https://gerrit.wikimedia.org/r/913661
[11:08:13] <jinxer-wm>	 (DiskSpace) firing: Disk space an-airflow1001:9100:/ 5.143% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-airflow1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[11:09:57] <wikibugs>	 (CR) Majavah: [V: +1] "PCC SUCCESS (): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/40966/console" [puppet] - https://gerrit.wikimedia.org/r/913661 (owner: Majavah)
[11:21:10] <wikibugs>	 SRE, Infrastructure-Foundations, Traffic, Performance-Team (Radar): Set cookie in Varnish to start a probe - https://phabricator.wikimedia.org/T335637 (JameelKaisar)
[11:41:54] <wikibugs>	 (PS1) Ladsgroup: db1132: Disable notification [puppet] - https://gerrit.wikimedia.org/r/913662 (https://phabricator.wikimedia.org/T335632)
[11:42:36] <wikibugs>	 SRE, DBA, Patch-For-Review: db1132 index for table pagetriage_page is corrupt - https://phabricator.wikimedia.org/T335632 (Ladsgroup) a:Ladsgroup→Marostegui >>! In T335632#8815044, @Marostegui wrote: > Don't do anything. I want to report this upstream  Awesome. Less work for me :P
[12:06:04] <wikibugs>	 (PS1) Majavah: P:toolforge::prometheus: enable mod_rewrite [puppet] - https://gerrit.wikimedia.org/r/913664
[12:08:31] <wikibugs>	 (CR) Majavah: [V: +1] "PCC SUCCESS (): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/40967/console" [puppet] - https://gerrit.wikimedia.org/r/913664 (owner: Majavah)
[12:16:15] <icinga-wm>	 PROBLEM - Cxserver LVS codfw on cxserver.svc.codfw.wmnet is CRITICAL: /v2/translate/{from}/{to}/{provider} (Machine translate an HTML fragment using TestClient, adapt the links to target language wiki.) timed out before a response was received https://wikitech.wikimedia.org/wiki/CX
[12:17:47] <icinga-wm>	 RECOVERY - Cxserver LVS codfw on cxserver.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/CX
[12:57:00] <jinxer-wm>	 (PowerSupply) firing: (2) Power Supply - PS Redundancy - issue on an-worker1147:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=an-worker1147 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupply
[13:26:55] <icinga-wm>	 PROBLEM - Host db2184 is DOWN: PING CRITICAL - Packet loss = 100%
[13:37:53] <icinga-wm>	 PROBLEM - Cxserver LVS codfw on cxserver.svc.codfw.wmnet is CRITICAL: /v2/translate/{from}/{to}/{provider} (Machine translate an HTML fragment using TestClient, adapt the links to target language wiki.) timed out before a response was received https://wikitech.wikimedia.org/wiki/CX
[13:41:03] <icinga-wm>	 RECOVERY - Cxserver LVS codfw on cxserver.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/CX
[14:13:03] <wikibugs>	 ops-codfw, DBA, Data-Persistence-Backup: db2184 down - https://phabricator.wikimedia.org/T335640 (Marostegui) @Papaul can we order a DIMM?
[14:16:14] <wikibugs>	 (PS1) Marostegui: db2184: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/913672 (https://phabricator.wikimedia.org/T335640)
[14:17:25] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2184.codfw.wmnet with reason: Host down T335640
[14:17:30] <stashbot>	 T335640: db2184 down - https://phabricator.wikimedia.org/T335640
[14:17:35] <wikibugs>	 (CR) Marostegui: [C: +2] db2184: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/913672 (https://phabricator.wikimedia.org/T335640) (owner: Marostegui)
[14:17:48] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2184.codfw.wmnet with reason: Host down T335640
[14:19:00] <icinga-wm>	 ACKNOWLEDGEMENT - SSH on db2184 is CRITICAL: CRITICAL - Socket timeout after 10 seconds Marostegui https://phabricator.wikimedia.org/T335640 https://wikitech.wikimedia.org/wiki/SSH/monitoring
[14:19:00] <icinga-wm>	 ACKNOWLEDGEMENT - Host db2184 is DOWN: PING CRITICAL - Packet loss = 100% Marostegui https://phabricator.wikimedia.org/T335640
[14:28:55] <jinxer-wm>	 (FNMNotReported) firing: FastNetMon metrics not reported - https://wikitech.wikimedia.org/wiki/Fastnetmon - https://w.wiki/8oU - https://alerts.wikimedia.org/?q=alertname%3DFNMNotReported
[15:08:13] <jinxer-wm>	 (DiskSpace) firing: Disk space an-airflow1001:9100:/ 5.05% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-airflow1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[16:57:00] <jinxer-wm>	 (PowerSupply) firing: (2) Power Supply - PS Redundancy - issue on an-worker1147:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=an-worker1147 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupply
[17:46:13] <wikibugs>	 SRE, ops-codfw, DBA, Data-Persistence-Backup: db2184 down - https://phabricator.wikimedia.org/T335640 (Papaul) a:Jhancock.wm
[17:46:59] <wikibugs>	 SRE, ops-codfw, DBA, Data-Persistence-Backup: db2184 down - https://phabricator.wikimedia.org/T335640 (Papaul) @Jhancock.wm can you order a DIMM using your Dell Tech direct account?  Thanks
[18:16:20] <wikibugs>	 (PS1) Raymond Ndibe: webservice: prefix wmcs-project where neccessary [software/tools-webservice] - https://gerrit.wikimedia.org/r/913681 (https://phabricator.wikimedia.org/T334657)
[18:28:55] <jinxer-wm>	 (FNMNotReported) firing: FastNetMon metrics not reported - https://wikitech.wikimedia.org/wiki/Fastnetmon - https://w.wiki/8oU - https://alerts.wikimedia.org/?q=alertname%3DFNMNotReported
[19:08:13] <jinxer-wm>	 (DiskSpace) firing: Disk space an-airflow1001:9100:/ 4.957% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-airflow1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[19:14:19] <icinga-wm>	 PROBLEM - mailman list info on lists1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
[19:14:49] <icinga-wm>	 PROBLEM - mailman archives on lists1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
[19:24:17] <icinga-wm>	 PROBLEM - mailman list info ssl expiry on lists1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
[19:25:47] <icinga-wm>	 RECOVERY - mailman list info ssl expiry on lists1001 is OK: OK - Certificate lists.wikimedia.org will expire on Tue 20 Jun 2023 04:41:39 AM GMT +0000. https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
[19:25:49] <icinga-wm>	 RECOVERY - mailman list info on lists1001 is OK: HTTP OK: HTTP/1.1 200 OK - 8572 bytes in 8.190 second response time https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
[19:26:11] <icinga-wm>	 RECOVERY - mailman archives on lists1001 is OK: HTTP OK: HTTP/1.1 200 OK - 49851 bytes in 0.064 second response time https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
[20:19:03] <icinga-wm>	 RECOVERY - Check systemd state on alert1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:57:00] <jinxer-wm>	 (PowerSupply) firing: (2) Power Supply - PS Redundancy - issue on an-worker1147:9290 - https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook#Power_Supply_Failures - https://grafana.wikimedia.org/d/ZA1I-IB4z/ipmi-sensor-state?orgId=1&var-Sensor=Power%20Supply&var-server=an-worker1147 - https://alerts.wikimedia.org/?q=alertname%3DPowerSupply
[21:25:29] <icinga-wm>	 PROBLEM - SSH on stat1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/SSH/monitoring
[21:31:01] <wikibugs>	 (PS1) Andrew Bogott: mwopenstackclients3.py: add the ability to load auth creds from clouds.yaml [puppet] - https://gerrit.wikimedia.org/r/913684
[21:38:31] <icinga-wm>	 RECOVERY - SSH on stat1004 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) https://wikitech.wikimedia.org/wiki/SSH/monitoring
[22:28:55] <jinxer-wm>	 (FNMNotReported) firing: FastNetMon metrics not reported - https://wikitech.wikimedia.org/wiki/Fastnetmon - https://w.wiki/8oU - https://alerts.wikimedia.org/?q=alertname%3DFNMNotReported
[22:53:35] <rzl>	 if you just got paged it's yesterday's incident, I really thought I resolved it but apparently not
[22:53:39] <rzl>	 nothing new needs doing
[23:08:13] <jinxer-wm>	 (DiskSpace) firing: Disk space an-airflow1001:9100:/ 4.863% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-airflow1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[23:32:57] <icinga-wm>	 PROBLEM - Disk space on centrallog1002 is CRITICAL: DISK CRITICAL - free space: /srv 54436 MB (3% inode=99%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/d/000000377/host-overview?var-server=centrallog1002&var-datasource=eqiad+prometheus/ops
[23:35:13] <jinxer-wm>	 (SystemdUnitFailed) firing: puppet-agent-timer.service Failed on apifeatureusage2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:37:03] <icinga-wm>	 PROBLEM - Check systemd state on apifeatureusage2001 is CRITICAL: CRITICAL - degraded: The following units failed: puppet-agent-timer.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[23:55:07] <icinga-wm>	 RECOVERY - Check systemd state on apifeatureusage2001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state