[08:39:39] netops, Traffic, Infrastructure-Foundations: BGP settings for liberica - https://phabricator.wikimedia.org/T379164#10558605 (Vgutierrez) Open→Resolved
[08:41:33] Traffic: backport gobgp 3.33 from trixie - https://phabricator.wikimedia.org/T386687 (Vgutierrez) NEW
[08:41:55] Traffic: backport gobgp 3.33 from trixie - https://phabricator.wikimedia.org/T386687#10558625 (Vgutierrez) p:Triage→Medium
[17:19:40] Traffic, SRE: Define an event stream and schema for haproxy_requestctl analytics pipeline ingestion - https://phabricator.wikimedia.org/T383392#10560361 (Ottomata) @Fabfur {T383914} has been deployed, so it should be possible to remove the `meta.domain` field added in [[ https://gitlab.wikimedia.org/...
[19:08:30] FIRING: [6x] HAProxyRestarted: HAProxy server restarted on cp5018:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[19:10:49] uhm
[19:11:25] FIRING: [2x] SystemdUnitCrashLoop: varnishmtail@internal.service crashloop on cp3068:9100 - TODO - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitCrashLoop
[19:12:40] going to check all cp hosts
[19:13:29] FIRING: [12x] HAProxyRestarted: HAProxy server restarted on cp3066:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[19:13:38] FIRING: LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:80 @ cp5017 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[19:16:25] RESOLVED: [2x] SystemdUnitCrashLoop: varnishmtail@internal.service crashloop on cp3068:9100 - TODO - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitCrashLoop
[19:18:30] RESOLVED: [12x] HAProxyRestarted: HAProxy server restarted on cp3066:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[19:18:38] RESOLVED: LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:80 @ cp5017 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[19:19:25] FIRING: SystemdUnitFailed: wmf_auto_restart_varnish-frontend-hospital.service on cp5020:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:34:03] everything should be recovered
[20:34:31] Traffic: update haproxy to version 2.8.14 - https://phabricator.wikimedia.org/T386751 (Vgutierrez) NEW
[20:34:40] Traffic: update haproxy to version 2.8.14 - https://phabricator.wikimedia.org/T386751#10561234 (Vgutierrez) p:Triage→High
[20:52:33] Traffic: tune haproxykafka message_buffer config value - https://phabricator.wikimedia.org/T386753 (Vgutierrez) NEW
[20:52:39] Traffic: tune haproxykafka message_buffer config value - https://phabricator.wikimedia.org/T386753#10561279 (Vgutierrez) p:Triage→High
[21:20:41] Traffic, Upstream: HAProxy 2.8.13 crashes after backend server goes away - https://phabricator.wikimedia.org/T386756 (Vgutierrez) NEW
[22:53:49] netops, Infrastructure-Foundations: cr2-esams:interface ae1 present under protocol ospf but not configure - https://phabricator.wikimedia.org/T386766 (Papaul) NEW
[23:19:25] FIRING: SystemdUnitFailed: wmf_auto_restart_varnish-frontend-hospital.service on cp5020:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:47:40] looking at cp5020
[23:54:25] RESOLVED: SystemdUnitFailed: wmf_auto_restart_varnish-frontend-hospital.service on cp5020:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed