[03:24:10] FIRING: SystemdUnitFailed: wmf_auto_restart_nic-saturation-exporter.service on lvs3008:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:30:32] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: gerrit: Add Envoy in Gerrit's stack - https://phabricator.wikimedia.org/T420909#11756910 (ABran-WMF) >>! In T420909#11756114, @hashar wrote: > ` > lang=yaml > profile::tlsproxy::envoy::upstream_tls: true > pr...
[07:24:10] FIRING: SystemdUnitFailed: wmf_auto_restart_nic-saturation-exporter.service on lvs3008:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:40:42] Traffic, collaboration-services, Gerrit, Release-Engineering-Team: gerrit: Add Envoy in Gerrit's stack - https://phabricator.wikimedia.org/T420909#11756973 (hashar) @ABran-WMF I think I have mixed up `downstream` and `upstream. [[ https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/in...
[08:13:06] Traffic, ServiceOps new, Product Safety and Integrity (Sprint Forsythia (Mar 23 - Apr 10))), WE4.2 Bot detection (WE4.2 hCaptcha editing trial): hCaptcha: Stop using urldownloader for health checks of the secure-api.js file - https://phabricator.wikimedia.org/T421464 (kostajh) NEW
[08:13:53] Traffic, ServiceOps new, Product Safety and Integrity (Sprint Forsythia (Mar 23 - Apr 10))), WE4.2 Bot detection (WE4.2 hCaptcha editing trial): hCaptcha: Stop using urldownloader for health checks of the secure-api.js file - https://phabricator.wikimedia.org/T421464#11757023 (kostajh) > If you c...
[08:15:12] Traffic, ServiceOps new, Product Safety and Integrity (Sprint Forsythia (Mar 23 - Apr 10))), WE4.2 Bot detection (WE4.2 hCaptcha editing trial): hCaptcha: Stop using urldownloader for health checks of the secure-api.js file - https://phabricator.wikimedia.org/T421464#11757024 (kostajh)
[08:16:49] Traffic, ServiceOps new, Product Safety and Integrity (Sprint Forsythia (Mar 23 - Apr 10))), WE4.2 Bot detection (WE4.2 hCaptcha editing trial): hCaptcha: Stop using urldownloader for health checks of the secure-api.js file - https://phabricator.wikimedia.org/T421464#11757025 (kostajh)
[09:42:50] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: gerrit: Add Envoy in Gerrit's stack - https://phabricator.wikimedia.org/T420909#11757279 (ABran-WMF) we tweaked several knobs on httpd and Envoy and still have the same underlying issue, I think aligning Jet...
[09:43:55] RESOLVED: SystemdUnitFailed: wmf_auto_restart_nic-saturation-exporter.service on lvs3008:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:57:06] Traffic, Liberica, Prod-Kubernetes, Data-Platform-SRE (2026-03-27 - 2026-04-17), Kubernetes: Migrate DSE k8s apiserver and services to IPIP - https://phabricator.wikimedia.org/T420437#11757456 (Gehel)
[15:33:19] hello traffic - I have patch [0] that hardens the config checks in some of the ATS lua test suites. I would normally say "it's tests" and merge on a Friday, *but* these are special, as they're run by puppet on-host upon config changes.
[15:33:19] anyway, happy to hold off until Monday if that's preferable. thoughts / opinions welcome.
[15:33:19] [0] https://gerrit.wikimedia.org/r/1262152
[16:19:09] swfrench-wmf: if nobody else has a stronger opinion, I'd say it can wait given Friday
[16:19:41] that was poor wording and maybe unclear heh
[16:19:53] bblack: sounds good, that's my default as well :)
[16:19:56] "if nobody elser has a stronger opinion, my vote is to wait for Monday"
[16:20:59] figured I'd ask, in the event I'm being excessively cautious (which I often am). in any case, thanks!
[16:27:30] yeah... we already have basic checks in place so a blatant syntax error can't break ATS again
[16:42:58] indeed, yeah - plus, it's exceedingly unlikely that we would make changes to the .lua.conf files over the weekend anyway.