[00:03:25] FIRING: [25x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[00:08:25] FIRING: [50x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:25:23] FIRING: [2x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in -8d 11h 30m 34s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[02:02:08] netops, Infrastructure-Foundations, Traffic: POPs LVS : remove public vlan trunking - https://phabricator.wikimedia.org/T367732#11910691 (Papaul) Open→Resolved @ayounsi we can close this task.
[02:03:52] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, SRE: EQSIN:Switch refresh diagram and wiring - https://phabricator.wikimedia.org/T423724#11910693 (Papaul)
[02:06:51] netops, DC-Ops, Infrastructure-Foundations, ops-eqsin, SRE: EQSIN:Switch refresh diagram and wiring - https://phabricator.wikimedia.org/T423724#11910694 (Papaul)
[04:08:40] FIRING: [49x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:25:23] FIRING: [2x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in -8d 15h 30m 34s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[06:17:58] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11910890 (ayounsi) OK, that sounds good to me! Thanks
[08:08:41] FIRING: [49x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:25:23] FIRING: [2x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in -8d 19h 30m 34s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[09:48:22] SRE-tools, Infrastructure-Foundations, Patch-For-Review: Cookbook for rack depool - https://phabricator.wikimedia.org/T327300#11911540 (ayounsi)
[10:50:57] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11911725 (cmooney) Agh hit a bit of a hiccup with this (really should have anticipated). Take drmrs for example: ` cmooney@asw1-b12-drmrs> show configuration interfaces et-0/0/48 description "Cor...
[12:08:41] FIRING: [49x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:25:23] FIRING: [2x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in -8d 23h 30m 34s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[13:25:56] FIRING: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[13:30:56] RESOLVED: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[14:21:22] XioNoX: topranks: for the meeting with Google regarding backups, whose availability should I set that up around? one of you or both of you?
[14:21:36] this is for the netops discussion around that
[14:21:39] (also hi)
[14:22:31] sukhe: hello, if possible both
[14:22:42] XioNoX: ok thank you, will add both of you
[14:39:00] thanks <3
[16:08:41] FIRING: [49x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:25:28] FIRING: [2x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in -9d 3h 30m 34s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[17:42:12] Mail, Infrastructure-Foundations, vrts: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105 (Xaosflux) NEW
[17:42:25] Mail, Infrastructure-Foundations, vrts: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11914144 (Xaosflux) Example VRT Ticket: https://ticket.wikimedia.org/otrs/index.pl?Action=AgentTicketZoom;TicketID=17919480
[17:44:07] Mail, Infrastructure-Foundations, vrts: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11914162 (Xaosflux)
[17:44:23] Mail, Infrastructure-Foundations, vrts: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11914165 (Xaosflux)
[17:48:39] netops, Infrastructure-Foundations: ulsfo: upgrade routers (2026) - https://phabricator.wikimedia.org/T416562#11914183 (Papaul) @ssingh and team now that we are done with the switch refresh and everything is stable in ulsfo and after we connect the missing link between cr3 and asw1-23 We will like to sch...
[17:50:58] netops, Infrastructure-Foundations: ulsfo: upgrade routers (2026) - https://phabricator.wikimedia.org/T416562#11914192 (ssingh) >>! In T416562#11914182, @Papaul wrote: > @ssingh and team now that we are done with the switch refresh and everything is stable in ulsfo and after we connect the missing link b...
[17:53:31] netops, Infrastructure-Foundations: ulsfo: upgrade routers (2026) - https://phabricator.wikimedia.org/T416562#11914213 (Papaul) @ssingh i think it will be best to depool the site since this will be my first time doing the draining process I will like to be on the safe side.
[17:54:08] netops, Infrastructure-Foundations: ulsfo: upgrade routers (2026) - https://phabricator.wikimedia.org/T416562#11914214 (ssingh) >>! In T416562#11914213, @Papaul wrote: > @ssingh i think it will be best to depool the site since this will be my first time doing the draining process I will like to be on the...
[18:05:45] Mail, Infrastructure-Foundations, vrts: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11914272 (Xaosflux) Ticket suggests this is also preventing things like registration, emailauth, password recovery.
[18:08:51] Mail, Infrastructure-Foundations, SRE: Wiki email not delivered to GMail - https://phabricator.wikimedia.org/T243937#11914311 (Xaosflux) Open→Resolved a:Xaosflux I'm going to mark this closed as there are no recent issues, I've personally tested multiple email workflows successful to...
[18:09:45] Mail, Infrastructure-Foundations: Two fr.wp users have not received email notifications (to GMail) for account confirmation - https://phabricator.wikimedia.org/T130367#11914316 (Xaosflux) Open→Resolved a:Xaosflux Similar to T243937 , closing as no longer able to reproduce. If there is sti...
[18:54:18] netops, Infrastructure-Foundations: ulsfo: upgrade routers (2026) - https://phabricator.wikimedia.org/T416562#11914466 (ssingh) Confirmed with @Papaul that this is actually meant for May 20, same time.
[20:08:41] FIRING: [49x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[21:25:23] FIRING: [2x] PKICertificateExpiry: Intermediate certificate in the trust chain for discovery expires in -9d 7h 30m 34s - https://wikitech.wikimedia.org/wiki/PKI/CA_Operations - TODO - https://alerts.wikimedia.org/?q=alertname%3DPKICertificateExpiry
[21:43:26] FIRING: [50x] SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:50:18] Mail, Infrastructure-Foundations, vrts: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11915208 (Reedy) p:Triage→High
[22:54:52] Mail, Infrastructure-Foundations, vrts: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11915213 (Quiddity) I wonder if this is similar/related to the old problems we had c.2013–2017? //Cf//. This and subtasks {T58414}