[00:10:58] (Abandoned) Umherirrender: Changing end date for Board election notification [extensions/WikimediaMessages] (wmf/1.45.0-wmf.22) - https://gerrit.wikimedia.org/r/1195197 (https://phabricator.wikimedia.org/T392232) (owner: Reedy)
[00:16:01] (CR) Arlolra: [C:-1] Deploy Parsoid Read Views to 14 wikis (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/1203142 (https://phabricator.wikimedia.org/T409593) (owner: Arlolra)
[00:39:01] (PS1) TrainBranchBot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1203183
[00:39:01] (CR) TrainBranchBot: [C:+2] Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1203183 (owner: TrainBranchBot)
[00:39:44] FIRING: [2x] RipeAtlasAnchorUnreachable: ipv6 ping to eqiad RIPE Atlas anchor: failures over threshold for measurement 96503802 - https://wikitech.wikimedia.org/wiki/Network_monitoring#Atlas_alerts - https://grafana.wikimedia.org/d/K1qm1j-Wz/ripe-atlas?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DRipeAtlasAnchorUnreachable
[00:44:44] RESOLVED: [2x] RipeAtlasAnchorUnreachable: ipv6 ping to eqiad RIPE Atlas anchor: failures over threshold for measurement 96503802 - https://wikitech.wikimedia.org/wiki/Network_monitoring#Atlas_alerts - https://grafana.wikimedia.org/d/K1qm1j-Wz/ripe-atlas?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DRipeAtlasAnchorUnreachable
[00:49:45] !log ryankemper@cumin1002 END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin1002 - T390860
[00:49:49] T390860: Elasticsearch dependency upgrade in spicerack - https://phabricator.wikimedia.org/T390860
[00:54:26] (Merged) jenkins-bot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1203183 (owner: TrainBranchBot)
[01:00:50] !log mwpresync@deploy2002 Started scap build-images: Publishing wmf/next image
[01:08:40] (PS1) TrainBranchBot: Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1203189
[01:08:40] (CR) TrainBranchBot: [C:+2] Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1203189 (owner: TrainBranchBot)
[01:09:05] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate default-staging-certificate.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire
[01:14:59] !log mwpresync@deploy2002 Finished scap build-images: Publishing wmf/next image (duration: 14m 08s)
[01:17:56] (PS8) Scott French: P:cache::varnish::frontend: render known-client rate limit VCL [puppet] - https://gerrit.wikimedia.org/r/1198182 (https://phabricator.wikimedia.org/T403220)
[01:24:30] (CR) Scott French: "FYI, I've made some minor changes here to reflect the introduction of a "custom limit enabled" flag in the known-client object itself." [puppet] - https://gerrit.wikimedia.org/r/1198182 (https://phabricator.wikimedia.org/T403220) (owner: Scott French)
[01:26:55] PROBLEM - Check unit status of sync-puppet-volatile on puppetserver1003 is CRITICAL: CRITICAL: Status of the systemd unit sync-puppet-volatile https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
[01:29:07] (PS1) Aaron Schulz: Change RESTbase => REST in wgRestSandboxSpecs names [mediawiki-config] - https://gerrit.wikimedia.org/r/1203191
[01:30:25] FIRING: SystemdUnitFailed: sync-puppet-volatile.service on puppetserver1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:32:27] (Merged) jenkins-bot: Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1203189 (owner: TrainBranchBot)
[01:33:21] FIRING: [5x] SystemdUnitFailed: docker-reporter-kubernetes-dse_eqiad-images.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:38:21] FIRING: [5x] SystemdUnitFailed: docker-reporter-kubernetes-dse_eqiad-images.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:45:25] RESOLVED: SystemdUnitFailed: sync-puppet-volatile.service on puppetserver1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:46:55] RECOVERY - Check unit status of sync-puppet-volatile on puppetserver1003 is OK: OK: Status of the systemd unit sync-puppet-volatile https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
[02:25:47] (PS1) RLazarus: api-gateway: Envoy config updates for 1.32 [deployment-charts] - https://gerrit.wikimedia.org/r/1203194 (https://phabricator.wikimedia.org/T409510)
[02:31:46] (PS1) RLazarus: envoyproxy: Envoy config updates for 1.32 [puppet] - https://gerrit.wikimedia.org/r/1203195 (https://phabricator.wikimedia.org/T409510)
[02:34:24] (CR) CI reject: [V:-1] envoyproxy: Envoy config updates for 1.32 [puppet] - https://gerrit.wikimedia.org/r/1203195 (https://phabricator.wikimedia.org/T409510) (owner: RLazarus)
[02:48:25] (PS2) RLazarus: envoyproxy: Envoy config updates for 1.32 [puppet] - https://gerrit.wikimedia.org/r/1203195 (https://phabricator.wikimedia.org/T409510)
[03:14:06] FIRING: CertAlmostExpired: Certificate for service data-gateway-staging:30443 is about to expire - https://wikitech.wikimedia.org/wiki/TLS/Runbook#data-gateway-staging:30443 - TODO - https://alerts.wikimedia.org/?q=alertname%3DCertAlmostExpired
[03:27:06] FIRING: [2x] SwitchCoreInterfaceDown: Switch core interface down - ssw1-d1-eqiad:ethernet-1/14 (Core: lsw1-d6-eqiad:ethernet-1/56 {#B00369}) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Switch_interface_down - https://alerts.wikimedia.org/?q=alertname%3DSwitchCoreInterfaceDown
[03:27:54] FIRING: [2x] CoreBGPDown: Core BGP session down between ssw1-d1-eqiad and lsw1-d6-eqiad (10.64.128.29) - group ibgp_evpn - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DCoreBGPDown
[05:03:25] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:08:22] FIRING: [2x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[05:09:06] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate default-staging-certificate.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire
[05:33:22] RESOLVED: [2x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[05:39:05] FIRING: [5x] SystemdUnitFailed: docker-reporter-kubernetes-dse_eqiad-images.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:14:06] FIRING: CertAlmostExpired: Certificate for service data-gateway-staging:30443 is about to expire - https://wikitech.wikimedia.org/wiki/TLS/Runbook#data-gateway-staging:30443 - TODO - https://alerts.wikimedia.org/?q=alertname%3DCertAlmostExpired
[07:27:06] FIRING: [2x] SwitchCoreInterfaceDown: Switch core interface down - ssw1-d1-eqiad:ethernet-1/14 (Core: lsw1-d6-eqiad:ethernet-1/56 {#B00369}) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Switch_interface_down - https://alerts.wikimedia.org/?q=alertname%3DSwitchCoreInterfaceDown
[07:27:54] FIRING: [2x] CoreBGPDown: Core BGP session down between ssw1-d1-eqiad and lsw1-d6-eqiad (10.64.128.29) - group ibgp_evpn - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DCoreBGPDown
[09:03:40] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:09:06] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate default-staging-certificate.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire
[09:39:05] FIRING: [5x] SystemdUnitFailed: docker-reporter-kubernetes-dse_eqiad-images.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:51:33] FIRING: KubernetesAPILatency: High Kubernetes API latency (LIST certificaterequests) on k8s-mlstaging@codfw - https://wikitech.wikimedia.org/wiki/Kubernetes - https://grafana.wikimedia.org/d/ddNd-sLnk/kubernetes-api-details?var-site=codfw&var-cluster=k8s-mlstaging&var-latency_percentile=0.95&var-verb=LIST - https://alerts.wikimedia.org/?q=alertname%3DKubernetesAPILatency
[09:56:33] RESOLVED: KubernetesAPILatency: High Kubernetes API latency (LIST certificaterequests) on k8s-mlstaging@codfw - https://wikitech.wikimedia.org/wiki/Kubernetes - https://grafana.wikimedia.org/d/ddNd-sLnk/kubernetes-api-details?var-site=codfw&var-cluster=k8s-mlstaging&var-latency_percentile=0.95&var-verb=LIST - https://alerts.wikimedia.org/?q=alertname%3DKubernetesAPILatency
[11:03:25] RESOLVED: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:14:06] FIRING: CertAlmostExpired: Certificate for service data-gateway-staging:30443 is about to expire - https://wikitech.wikimedia.org/wiki/TLS/Runbook#data-gateway-staging:30443 - TODO - https://alerts.wikimedia.org/?q=alertname%3DCertAlmostExpired
[11:27:06] FIRING: [2x] SwitchCoreInterfaceDown: Switch core interface down - ssw1-d1-eqiad:ethernet-1/14 (Core: lsw1-d6-eqiad:ethernet-1/56 {#B00369}) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Switch_interface_down - https://alerts.wikimedia.org/?q=alertname%3DSwitchCoreInterfaceDown
[11:27:54] FIRING: [2x] CoreBGPDown: Core BGP session down between ssw1-d1-eqiad and lsw1-d6-eqiad (10.64.128.29) - group ibgp_evpn - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DCoreBGPDown
[11:45:45] ops-codfw, SRE, Data-Persistence, DC-Ops: Q2:rack/setup/install db2249 - https://phabricator.wikimedia.org/T407991#11355898 (Marostegui)
[13:09:06] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate default-staging-certificate.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire
[13:39:06] FIRING: [5x] SystemdUnitFailed: docker-reporter-kubernetes-dse_eqiad-images.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:34:45] FIRING: CirrusStreamingUpdaterUnknownErrors: CirrusSearch consumer-search@codfw is failing write requests because of unknown errors - https://wikitech.wikimedia.org/wiki/Search#Streaming_Updater - https://grafana.wikimedia.org/d/jKqki4MSk/cirrus-streaming-updater - https://alerts.wikimedia.org/?q=alertname%3DCirrusStreamingUpdaterUnknownErrors
[14:44:45] RESOLVED: CirrusStreamingUpdaterUnknownErrors: CirrusSearch consumer-search@codfw is failing write requests because of unknown errors - https://wikitech.wikimedia.org/wiki/Search#Streaming_Updater - https://grafana.wikimedia.org/d/jKqki4MSk/cirrus-streaming-updater - https://alerts.wikimedia.org/?q=alertname%3DCirrusStreamingUpdaterUnknownErrors
[15:08:22] FIRING: [2x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[15:14:06] FIRING: CertAlmostExpired: Certificate for service data-gateway-staging:30443 is about to expire - https://wikitech.wikimedia.org/wiki/TLS/Runbook#data-gateway-staging:30443 - TODO - https://alerts.wikimedia.org/?q=alertname%3DCertAlmostExpired
[15:27:06] FIRING: [2x] SwitchCoreInterfaceDown: Switch core interface down - ssw1-d1-eqiad:ethernet-1/14 (Core: lsw1-d6-eqiad:ethernet-1/56 {#B00369}) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Switch_interface_down - https://alerts.wikimedia.org/?q=alertname%3DSwitchCoreInterfaceDown
[15:27:54] FIRING: [2x] CoreBGPDown: Core BGP session down between ssw1-d1-eqiad and lsw1-d6-eqiad (10.64.128.29) - group ibgp_evpn - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DCoreBGPDown
[15:33:22] RESOLVED: [2x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[16:33:26] (CR) Daniel Kinzler: "I think claime is pointing to the same issue I mentioned in my comment - the descriptor key is now inconsistent with the header name." [deployment-charts] - https://gerrit.wikimedia.org/r/1201736 (https://phabricator.wikimedia.org/T409155) (owner: Pmiazga)
[17:09:06] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate default-staging-certificate.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire
[17:26:17] PROBLEM - MegaRAID on db1171 is CRITICAL: CRITICAL: 1 failed LD(s) (Degraded) https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[17:26:18] ACKNOWLEDGEMENT - MegaRAID on db1171 is CRITICAL: CRITICAL: 1 failed LD(s) (Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T409638 https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[17:26:26] ops-eqiad, SRE, DBA, DC-Ops: Degraded RAID on db1171 - https://phabricator.wikimedia.org/T409638 (ops-monitoring-bot) NEW
[17:36:25] ops-eqiad, SRE, Data-Persistence, Data-Persistence-Backup, and 2 others: Degraded RAID on db1171 - https://phabricator.wikimedia.org/T409638#11356125 (Marostegui) @jcrespo this is a backup source. I would assume you want the disk replaced with a spare, but just tagging you to confirm.
[17:39:06] FIRING: [5x] SystemdUnitFailed: docker-reporter-kubernetes-dse_eqiad-images.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:51:45] FIRING: CirrusStreamingUpdaterUnknownErrors: CirrusSearch consumer-search@eqiad is failing write requests because of unknown errors - https://wikitech.wikimedia.org/wiki/Search#Streaming_Updater - https://grafana.wikimedia.org/d/jKqki4MSk/cirrus-streaming-updater - https://alerts.wikimedia.org/?q=alertname%3DCirrusStreamingUpdaterUnknownErrors
[17:56:45] FIRING: [2x] CirrusStreamingUpdaterUnknownErrors: CirrusSearch consumer-cloudelastic@eqiad is failing write requests because of unknown errors - https://wikitech.wikimedia.org/wiki/Search#Streaming_Updater - https://grafana.wikimedia.org/d/jKqki4MSk/cirrus-streaming-updater - https://alerts.wikimedia.org/?q=alertname%3DCirrusStreamingUpdaterUnknownErrors
[18:01:45] RESOLVED: [2x] CirrusStreamingUpdaterUnknownErrors: CirrusSearch consumer-cloudelastic@eqiad is failing write requests because of unknown errors - https://wikitech.wikimedia.org/wiki/Search#Streaming_Updater - https://grafana.wikimedia.org/d/jKqki4MSk/cirrus-streaming-updater - https://alerts.wikimedia.org/?q=alertname%3DCirrusStreamingUpdaterUnknownErrors
[19:14:06] FIRING: CertAlmostExpired: Certificate for service data-gateway-staging:30443 is about to expire - https://wikitech.wikimedia.org/wiki/TLS/Runbook#data-gateway-staging:30443 - TODO - https://alerts.wikimedia.org/?q=alertname%3DCertAlmostExpired
[19:27:06] FIRING: [2x] SwitchCoreInterfaceDown: Switch core interface down - ssw1-d1-eqiad:ethernet-1/14 (Core: lsw1-d6-eqiad:ethernet-1/56 {#B00369}) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Switch_interface_down - https://alerts.wikimedia.org/?q=alertname%3DSwitchCoreInterfaceDown
[19:27:55] FIRING: [2x] CoreBGPDown: Core BGP session down between ssw1-d1-eqiad and lsw1-d6-eqiad (10.64.128.29) - group ibgp_evpn - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DCoreBGPDown
[21:09:06] FIRING: [2x] PuppetCertificateAboutToExpire: Puppet CA certificate default-staging-certificate.wmnet is about to expire - https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate - TODO - https://alerts.wikimedia.org/?q=alertname%3DPuppetCertificateAboutToExpire
[21:39:06] FIRING: [5x] SystemdUnitFailed: docker-reporter-kubernetes-dse_eqiad-images.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:42:09] PROBLEM - mailman list info ssl expiry on lists1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
[22:47:59] RECOVERY - mailman list info ssl expiry on lists1004 is OK: OK - Certificate lists.wikimedia.org will expire on Tue 03 Feb 2026 07:30:03 PM GMT +0000. https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
[23:14:06] FIRING: CertAlmostExpired: Certificate for service data-gateway-staging:30443 is about to expire - https://wikitech.wikimedia.org/wiki/TLS/Runbook#data-gateway-staging:30443 - TODO - https://alerts.wikimedia.org/?q=alertname%3DCertAlmostExpired
[23:23:46] (PS1) Superpes15: [ptwiki] Add new abusefilter usergroup [mediawiki-config] - https://gerrit.wikimedia.org/r/1203238 (https://phabricator.wikimedia.org/T409578)
[23:24:32] (PS2) Superpes15: [ptwiki] Add new abusefilter usergroup [mediawiki-config] - https://gerrit.wikimedia.org/r/1203238 (https://phabricator.wikimedia.org/T409578)
[23:26:14] (PS3) Superpes15: [ptwiki] Add new abusefilter usergroup [mediawiki-config] - https://gerrit.wikimedia.org/r/1203238 (https://phabricator.wikimedia.org/T409578)
[23:27:06] FIRING: [2x] SwitchCoreInterfaceDown: Switch core interface down - ssw1-d1-eqiad:ethernet-1/14 (Core: lsw1-d6-eqiad:ethernet-1/56 {#B00369}) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Switch_interface_down - https://alerts.wikimedia.org/?q=alertname%3DSwitchCoreInterfaceDown
[23:27:55] FIRING: [2x] CoreBGPDown: Core BGP session down between ssw1-d1-eqiad and lsw1-d6-eqiad (10.64.128.29) - group ibgp_evpn - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DCoreBGPDown