[00:05:25] FIRING: SystemdUnitFailed: prometheus-node-textfile-prometheus-check-discovery-certificate-expiry.service on pki1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:05:40] FIRING: SystemdUnitFailed: prometheus-node-textfile-prometheus-check-discovery-certificate-expiry.service on pki1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:05:40] FIRING: SystemdUnitFailed: prometheus-node-textfile-prometheus-check-discovery-certificate-expiry.service on pki1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:46:27] CAS-SSO, cloud-services-team, Striker, Patch-For-Review: Use IDP for authentication in Striker - https://phabricator.wikimedia.org/T359554#11947564 (Arendpieter) @taavi I see that no one is interested in reviewing my second pull request, so I’m thinking of abandoning it.
[10:57:11] netops, DC-Ops, Infrastructure-Foundations, SRE: Decom eqord POP - https://phabricator.wikimedia.org/T427050 (cmooney) NEW p:Triage→Medium
[10:57:40] netops, DC-Ops, Infrastructure-Foundations, SRE: Decom eqord POP - https://phabricator.wikimedia.org/T427050#11947853 (cmooney)
[10:58:59] netops, DC-Ops, Infrastructure-Foundations, SRE: Decom eqord POP - https://phabricator.wikimedia.org/T427050#11947860 (cmooney)
[11:39:31] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11947995 (cmooney)
[11:40:47] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11947997 (cmooney) Everything is more-or-less done here. The eqsin link is still operational, though traffic is flowing via the switches due to OSPF cost. We can leave that one in place for now a...
[11:48:31] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11948010 (cmooney)
[11:48:34] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11948012 (cmooney)
[11:48:40] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11948014 (cmooney)
[11:48:46] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11948016 (cmooney)
[12:05:40] FIRING: SystemdUnitFailed: prometheus-node-textfile-prometheus-check-discovery-certificate-expiry.service on pki1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:21:02] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11948291 (cmooney) p:High→Low
[15:39:15] SRE-tools, Infrastructure-Foundations, SRE, Traffic: Abstract LVS restart using cookbook - https://phabricator.wikimedia.org/T334166#11948797 (ssingh) Open→Resolved a:ssingh LVS in core sites will be superseded by Liberica so we are unlikely to spend any time on this. I am taking...
[15:58:40] netops, Infrastructure-Foundations, SRE, Traffic: lvs1020: reimage to move primary IP from private1-d-eqiad to private1-d7-eqiad vlan - https://phabricator.wikimedia.org/T405630#11948942 (ssingh) @cmooney: We plan to move to Liberica in Q1 or Q2 of APP2026. Do you think we should still consider w...
[16:04:37] netops, Infrastructure-Foundations, SRE, Traffic: lvs1020: reimage to move primary IP from private1-d-eqiad to private1-d7-eqiad vlan - https://phabricator.wikimedia.org/T405630#11948966 (cmooney) >>! In T405630#11948942, @ssingh wrote: > @cmooney: We plan to move to Liberica in Q1 or Q2 of FY202...
[16:05:40] FIRING: SystemdUnitFailed: prometheus-node-textfile-prometheus-check-discovery-certificate-expiry.service on pki1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:05:41] netops, Infrastructure-Foundations, SRE, Traffic: lvs1020: reimage to move primary IP from private1-d-eqiad to private1-d7-eqiad vlan - https://phabricator.wikimedia.org/T405630#11948969 (ssingh) Thanks for the update and the explanation, @cmooney!
[18:10:46] netops, Infrastructure-Foundations, SRE: Nokia SR-Linux - wonky routing with IPv6 RAs and EVPN Anycast GW - https://phabricator.wikimedia.org/T420706#11949332 (cmooney) Nokia have told us they are going to fix this and the patch is scheduled for release 26.7.1 which should be out late July/August.