[03:13:42] FIRING: [43x] SystemdUnitFailed: user@11984.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:38:26] ^ the build2002 is some leftover from running the OpenJDK test suite, I've cleaned that up now
[06:43:27] FIRING: [43x] SystemdUnitFailed: user@11984.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:09:34] FIRING: DiskSpace: Disk space config-master2001:9100:/ 3.226% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=config-master2001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[07:24:35] RESOLVED: DiskSpace: Disk space config-master2001:9100:/ 3.136% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=config-master2001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[07:36:04] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: Install new line card in cr2-eqiad slot 0, move card from slot 1 to cr1-eqiad slot 0 and configure - https://phabricator.wikimedia.org/T426343#11929433 (ayounsi) We have 2 new linecards coming, one for each router, so afaik we don't...
[07:49:16] netops, Infrastructure-Foundations: POPs - free up 2xQSFP ports - https://phabricator.wikimedia.org/T424611#11929465 (ayounsi) My suggestions here would be to open the decom tasks and leave it to DCops on how they want to tackle them. For esams that finally cleared the high power alarm, and we're going...
[09:25:10] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-04-24 - 2026-05-15): Create depool hiera keys for cirrussearch hosts - https://phabricator.wikimedia.org/T426228#11930088 (ayounsi) Thanks a lot for that !
[09:33:12] netops, Infrastructure-Foundations, ServiceOps new: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11930126 (ayounsi)
[10:12:22] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: Install new line card in cr2-eqiad slot 0, move card from slot 1 to cr1-eqiad slot 0 and configure - https://phabricator.wikimedia.org/T426343#11930298 (cmooney) >>! In T426343#11929433, @ayounsi wrote: > We have 2 new linecards comi...
[10:24:03] Mail, Infrastructure-Foundations, Product Safety and Integrity, Trust-and-Safety, vrts: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11930346 (Xaosflux) Another yahoo user reporting emailauth problems in https://ticket.wikimedia.org/otrs/index.pl?Action=AgentTicketZo...
[10:28:31] Mail, Infrastructure-Foundations, Product Safety and Integrity, SRE, and 2 others: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11930359 (kostajh)
[10:43:42] FIRING: [42x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:03:50] netops, Infrastructure-Foundations, SRE: InboundInterfaceErrors alerts firing for Nokia switches on v25.10.1 - https://phabricator.wikimedia.org/T412733#11930466 (ayounsi) @Papaul did they provide a ETA for the fix? If not is it possible to ask them for any update ?
[11:14:51] netops, Infrastructure-Foundations, ServiceOps new: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11930504 (ayounsi)
[12:58:33] netops, Infrastructure-Foundations, SRE: InboundInterfaceErrors alerts firing for Nokia switches on v25.10.1 - https://phabricator.wikimedia.org/T412733#11930998 (Papaul) @ayounsi no ETA was given to me but yes i can can follow up with them.
[13:41:02] netops, Infrastructure-Foundations, ServiceOps new: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11931215 (ayounsi)
[13:42:31] jhathaway: o/ did you manage to find anything about why https://gerrit.wikimedia.org/r/c/operations/puppet/+/1266205 is needed? i would like to get us unblocked and that patch seems like the least worst option to do it with at the moment
[13:48:15] netops, Infrastructure-Foundations, ServiceOps new: codfw: rack A2 maintenance - https://phabricator.wikimedia.org/T426199#11931288 (ayounsi)
[14:43:42] FIRING: [42x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:19:40] taavi: that is my fault, I'll have something for you today
[15:46:26] SRE-tools, Infrastructure-Foundations, SRE Observability: sre.kafka.roll-restart-reboot-brokers: command-config is not a recognized option - https://phabricator.wikimedia.org/T426639 (herron) NEW
[16:54:46] Mail, Infrastructure-Foundations, Product Safety and Integrity, SRE, and 3 others: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11932496 (Ponor) For the record, a few more complaints on enwiki: [[https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#c-Obi2can...
[17:24:09] Mail, Infrastructure-Foundations, Product Safety and Integrity, SRE, and 3 others: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11932687 (jhathaway) All mail to yahoo.com is currently being deferred, the current message is: ` 421 4.7.0 [TSS04] Messages from 208.80.154.5...
[18:46:09] FIRING: [42x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:44:18] Mail, Infrastructure-Foundations, Product Safety and Integrity, SRE, and 2 others: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11933806 (jhathaway) Someone from Yahoo was kind enough to reach out to me directly and modify the IP reputation, so emails are flowing again!
[22:46:09] FIRING: [42x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:46:10] FIRING: [42x] SystemdUnitFailed: cfssl-ocsprefresh-Wikimedia_Internal_Root_CA.service on pki1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:25:43] Mail, Infrastructure-Foundations, Product Safety and Integrity, SRE, and 2 others: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11933904 (Xaosflux) Thank you, I've sent a few emails, and some emailauth users have reported back that they are now getting their codes.
[23:25:54] Mail, Infrastructure-Foundations, Product Safety and Integrity, SRE, and 2 others: yahoo rejecting our emails - https://phabricator.wikimedia.org/T426105#11933906 (Xaosflux) a:jhathaway