[00:02:40] FIRING: [3x] SystemdUnitFailed: etcd-backup.service on aux-k8s-etcd2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[00:10:06] FIRING: MediaWikiLoginFailures: Elevated MediaWiki centrallogin failures (centralauth_error_nologinattempt) - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook - https://grafana.wikimedia.org/d/000000438/mediawiki-exceptions-alerts?viewPanel=3 - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLoginFailures
[00:10:39] PROBLEM - MariaDB Replica Lag: s1 on db2141 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 623.67 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica
[00:15:06] RESOLVED: MediaWikiLoginFailures: Elevated MediaWiki centrallogin failures (centralauth_error_nologinattempt) - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook - https://grafana.wikimedia.org/d/000000438/mediawiki-exceptions-alerts?viewPanel=3 - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLoginFailures
[00:30:32] !log bking@cumin2002 END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs1026.eqiad.wmnet
[00:31:45] !log bking@cumin2002 END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs1025.eqiad.wmnet
[00:32:57] FIRING: [2x] SystemdUnitCrashLoop: prometheus-blazegraph-exporter-wdqs-categories.service crashloop on wdqs1025:9100 - TODO - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitCrashLoop
[00:36:57] PROBLEM - PyBal backends health check on lvs5005 is CRITICAL: PYBAL CRITICAL - CRITICAL - uploadlb6_443: Servers cp5030.eqsin.wmnet, cp5029.eqsin.wmnet are marked down but pooled https://wikitech.wikimedia.org/wiki/PyBal
[00:37:57] RESOLVED: [2x] SystemdUnitCrashLoop: prometheus-blazegraph-exporter-wdqs-categories.service crashloop on wdqs1025:9100 - TODO - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitCrashLoop
[00:38:41] (PS1) TrainBranchBot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1118198
[00:38:41] (CR) TrainBranchBot: [C:+2] Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1118198 (owner: TrainBranchBot)
[00:38:55] RECOVERY - PyBal backends health check on lvs5005 is OK: PYBAL OK - All pools are healthy https://wikitech.wikimedia.org/wiki/PyBal
[00:44:32] FIRING: Wikidata Reliability Metrics - Median loading time alert: - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
[00:49:06] (Merged) jenkins-bot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1118198 (owner: TrainBranchBot)
[00:51:03] PROBLEM - BGP status on cr2-magru is CRITICAL: BGP CRITICAL - No response from remote host 195.200.68.129 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[00:52:40] FIRING: SystemdUnitFailed: systemd-timedated.service on testreduce1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:03:19] uh
[01:03:39] ci errors including "16:04:00 npm warn tar TAR_ENTRY_ERROR ENOSPC: no space left on device, write" :D
[01:03:49] that's worrying
[01:04:32] RESOLVED: Wikidata Reliability Metrics - Median loading time alert: - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
[01:09:21] (PS1) TrainBranchBot: Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1118199
[01:09:21] (CR) TrainBranchBot: [C:+2] Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1118199 (owner: TrainBranchBot)
[01:12:40] RESOLVED: SystemdUnitFailed: systemd-timedated.service on testreduce1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:24:24] it seems to have survived on a recheck, so i guess it's ok :D
[01:29:37] (Merged) jenkins-bot: Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1118199 (owner: TrainBranchBot)
[01:35:57] RECOVERY - OSPF status on cr1-magru is OK: OSPFv2: 2/2 UP : OSPFv3: 2/2 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[01:36:47] RECOVERY - OSPF status on cr2-eqiad is OK: OSPFv2: 7/7 UP : OSPFv3: 7/7 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
[01:36:47] RECOVERY - BFD status on cr2-eqiad is OK: UP: 25 AdminDown: 0 Down: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BFD_status
[01:46:29] PROBLEM - Disk space on releases1003 is CRITICAL: DISK CRITICAL - /srv/docker/overlay2/878f95eb56672ad71595a0148413866b2f73b5fefdea4e96990f2782031f6f5d/merged is not accessible: Permission denied https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/d/000000377/host-overview?var-server=releases1003&var-datasource=eqiad+prometheus/ops
[02:05:39] RECOVERY - MariaDB Replica Lag: s1 on db2141 is OK: OK slave_sql_lag Replication lag: 0.32 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica
[02:06:29] RECOVERY - Disk space on releases1003 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/d/000000377/host-overview?var-server=releases1003&var-datasource=eqiad+prometheus/ops
[02:36:42] FIRING: JobUnavailable: Reduced availability for job sidekiq in ops@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[02:54:54] !log marostegui@cumin1002 dbctl commit (dc=all): 'Repooling after maintenance db2219 (T384592)', diff saved to https://phabricator.wikimedia.org/P73412 and previous config saved to /var/cache/conftool/dbconfig/20250208-025453-marostegui.json
[02:54:57] T384592: Add normalization columns to categorylinks table - https://phabricator.wikimedia.org/T384592
[03:01:42] RESOLVED: JobUnavailable: Reduced availability for job sidekiq in ops@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[03:10:01] !log marostegui@cumin1002 dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P73413 and previous config saved to /var/cache/conftool/dbconfig/20250208-031000-marostegui.json
[03:25:08] !log marostegui@cumin1002 dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P73414 and previous config saved to /var/cache/conftool/dbconfig/20250208-032508-marostegui.json
[03:40:15] !log marostegui@cumin1002 dbctl commit (dc=all): 'Repooling after maintenance db2219 (T384592)', diff saved to https://phabricator.wikimedia.org/P73415 and previous config saved to /var/cache/conftool/dbconfig/20250208-034015-marostegui.json
[03:40:19] T384592: Add normalization columns to categorylinks table - https://phabricator.wikimedia.org/T384592
[03:40:31] !log marostegui@cumin1002 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2236.codfw.wmnet with reason: Maintenance
[03:40:39] !log marostegui@cumin1002 dbctl commit (dc=all): 'Depooling db2236 (T384592)', diff saved to https://phabricator.wikimedia.org/P73416 and previous config saved to /var/cache/conftool/dbconfig/20250208-034038-marostegui.json
[03:44:15] FIRING: [4x] SystemdUnitFailed: etcd-backup.service on aux-k8s-etcd2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:49:32] FIRING: Wikidata Reliability Metrics - Median loading time alert: - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
[05:09:32] RESOLVED: Wikidata Reliability Metrics - Median loading time alert: - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
[07:34:51] RECOVERY - Categories update lag on wdqs2021 is OK: OK - Categories lag: 6 days, 11:24:37.451246 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Categories_update_lag
[07:47:41] FIRING: [4x] SystemdUnitFailed: etcd-backup.service on aux-k8s-etcd2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:32:01] !log marostegui@cumin1002 dbctl commit (dc=all): 'Repooling after maintenance db2236 (T384592)', diff saved to https://phabricator.wikimedia.org/P73417 and previous config saved to /var/cache/conftool/dbconfig/20250208-083201-marostegui.json
[08:32:04] T384592: Add normalization columns to categorylinks table - https://phabricator.wikimedia.org/T384592
[08:47:08] !log marostegui@cumin1002 dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P73418 and previous config saved to /var/cache/conftool/dbconfig/20250208-084707-marostegui.json
[08:54:32] FIRING: Wikidata Reliability Metrics - Median loading time alert: - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
[09:02:14] !log marostegui@cumin1002 dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P73419 and previous config saved to /var/cache/conftool/dbconfig/20250208-090214-marostegui.json
[09:14:32] RESOLVED: Wikidata Reliability Metrics - Median loading time alert: - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
[09:17:23] !log marostegui@cumin1002 dbctl commit (dc=all): 'Repooling after maintenance db2236 (T384592)', diff saved to https://phabricator.wikimedia.org/P73420 and previous config saved to /var/cache/conftool/dbconfig/20250208-091721-marostegui.json
[09:17:30] T384592: Add normalization columns to categorylinks table - https://phabricator.wikimedia.org/T384592
[09:17:39] !log marostegui@cumin1002 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2237.codfw.wmnet with reason: Maintenance
[09:17:46] !log marostegui@cumin1002 dbctl commit (dc=all): 'Depooling db2237 (T384592)', diff saved to https://phabricator.wikimedia.org/P73421 and previous config saved to /var/cache/conftool/dbconfig/20250208-091745-marostegui.json
[09:20:48] PROBLEM - Router interfaces on cr2-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 45, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[09:20:57] PROBLEM - Router interfaces on cr3-ulsfo is CRITICAL: CRITICAL: host 198.35.26.192, interfaces up: 69, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[09:42:41] FIRING: [4x] SystemdUnitFailed: etcd-backup.service on aux-k8s-etcd2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:17:51] PROBLEM - Router interfaces on cr2-eqiad is CRITICAL: CRITICAL: host 208.80.154.197, interfaces up: 207, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[16:18:51] RECOVERY - Router interfaces on cr2-eqiad is OK: OK: host 208.80.154.197, interfaces up: 208, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[16:32:53] (PS1) Ori: Remove temporary '-k8s' suffix from ArcLamp pipeline [mediawiki-config] - https://gerrit.wikimedia.org/r/1118208
[16:33:27] (PS2) Ori: Remove temporary '-k8s' suffix from ArcLamp pipeline [mediawiki-config] - https://gerrit.wikimedia.org/r/1118208
[16:35:31] (PS3) Ori: Remove temporary '-k8s' suffix from ArcLamp pipeline [mediawiki-config] - https://gerrit.wikimedia.org/r/1118208
[16:40:49] (PS1) Ori: Turn down the Kubernetes-specific ArcLamp listeners [puppet] - https://gerrit.wikimedia.org/r/1118209
[16:42:32] (PS2) Ori: Turn down the Kubernetes-specific ArcLamp listeners [puppet] - https://gerrit.wikimedia.org/r/1118209
[17:04:32] FIRING: Wikidata Reliability Metrics - Median loading time alert: - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
[17:05:29] (PS1) Ssingh: admin: temporarily remove keys for btullis [puppet] - https://gerrit.wikimedia.org/r/1118210
[17:06:04] (CR) CI reject: [V:-1] admin: temporarily remove keys for btullis [puppet] - https://gerrit.wikimedia.org/r/1118210 (owner: Ssingh)
[17:06:34] (PS2) Ssingh: admin: temporarily remove keys for btullis [puppet] - https://gerrit.wikimedia.org/r/1118210
[17:07:34] (CR) Fabfur: [C:+1] admin: temporarily remove keys for btullis [puppet] - https://gerrit.wikimedia.org/r/1118210 (owner: Ssingh)
[17:07:42] (CR) Ssingh: [C:+2] admin: temporarily remove keys for btullis [puppet] - https://gerrit.wikimedia.org/r/1118210 (owner: Ssingh)
[17:24:32] RESOLVED: Wikidata Reliability Metrics - Median loading time alert: - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
[17:24:44] ops-eqiad, SRE, DC-Ops: PDU sensor over limit - https://phabricator.wikimedia.org/T383383#10533748 (phaultfinder)
[17:42:41] FIRING: [3x] SystemdUnitFailed: etcd-backup.service on aux-k8s-etcd2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:30:30] (PS1) CDanis: haproxy: Allow empty ring defs as a placeholder [puppet] - https://gerrit.wikimedia.org/r/1118211
[18:30:34] (CR) CDanis: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/1118211 (owner: CDanis)
[18:42:48] FIRING: PuppetFailure: Puppet has failed on build2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[19:00:30] (PS2) CDanis: haproxy: Allow empty ring defs as a placeholder [puppet] - https://gerrit.wikimedia.org/r/1118211
[19:00:32] (CR) CDanis: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/1118211 (owner: CDanis)
[19:04:52] (PS3) CDanis: haproxy: Allow empty ring defs as a placeholder [puppet] - https://gerrit.wikimedia.org/r/1118211
[19:04:54] (CR) CDanis: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/1118211 (owner: CDanis)
[19:06:46] (PS4) CDanis: haproxy: Allow empty ring defs as a placeholder [puppet] - https://gerrit.wikimedia.org/r/1118211
[19:07:11] (PS5) CDanis: haproxy: Allow empty ring defs as a placeholder [puppet] - https://gerrit.wikimedia.org/r/1118211
[19:07:12] (CR) CDanis: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/1118211 (owner: CDanis)
[19:09:06] (PS6) CDanis: haproxy: Allow empty ring defs as a placeholder [puppet] - https://gerrit.wikimedia.org/r/1118211
[19:09:10] (CR) CDanis: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/1118211 (owner: CDanis)
[19:11:05] (PS7) CDanis: haproxy: Allow empty ring defs as a placeholder [puppet] - https://gerrit.wikimedia.org/r/1118211
[19:11:09] (CR) CDanis: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/1118211 (owner: CDanis)
[19:14:04] (PS8) CDanis: haproxy: Allow empty ring defs as a placeholder [puppet] - https://gerrit.wikimedia.org/r/1118211
[19:14:33] (CR) CDanis: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/1118211 (owner: CDanis)
[19:16:05] (PS9) CDanis: haproxy: Allow empty ring defs as a placeholder [puppet] - https://gerrit.wikimedia.org/r/1118211
[19:16:08] (CR) CDanis: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/1118211 (owner: CDanis)
[19:18:09] (PS10) CDanis: haproxy: Allow empty ring defs as a placeholder [puppet] - https://gerrit.wikimedia.org/r/1118211
[19:36:13] !log marostegui@cumin1002 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2240.codfw.wmnet with reason: Maintenance
[19:36:20] !log marostegui@cumin1002 dbctl commit (dc=all): 'Depooling db2240 (T384592)', diff saved to https://phabricator.wikimedia.org/P73426 and previous config saved to /var/cache/conftool/dbconfig/20250208-193620-marostegui.json
[19:36:23] T384592: Add normalization columns to categorylinks table - https://phabricator.wikimedia.org/T384592
[21:09:32] FIRING: Wikidata Reliability Metrics - Median loading time alert: - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
[21:29:32] RESOLVED: Wikidata Reliability Metrics - Median loading time alert: - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
[21:42:41] FIRING: [3x] SystemdUnitFailed: etcd-backup.service on aux-k8s-etcd2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[21:45:47] RECOVERY - Router interfaces on cr2-eqord is OK: OK: host 208.80.154.198, interfaces up: 46, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[21:45:57] RECOVERY - Router interfaces on cr3-ulsfo is OK: OK: host 198.35.26.192, interfaces up: 70, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
[22:42:48] FIRING: PuppetFailure: Puppet has failed on build2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[23:08:55] PROBLEM - BFD status on cr2-magru is CRITICAL: Down: 1 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BFD_status
[23:08:57] PROBLEM - BFD status on cr2-eqdfw is CRITICAL: Down: 1 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BFD_status
[23:09:55] RECOVERY - BFD status on cr2-magru is OK: UP: 4 AdminDown: 0 Down: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BFD_status
[23:09:57] RECOVERY - BFD status on cr2-eqdfw is OK: UP: 16 AdminDown: 0 Down: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BFD_status
[23:24:38] ops-eqiad, SRE, DC-Ops: PDU sensor over limit - https://phabricator.wikimedia.org/T383383#10533873 (phaultfinder)
[23:47:41] FIRING: SystemdUnitFailed: systemd-timedated.service on testreduce1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:49:16] RESOLVED: SystemdUnitFailed: systemd-timedated.service on testreduce1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed