[01:48:40] FIRING: SystemdUnitFailed: wmf_auto_restart_kerberos_rsync.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:11:41] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[03:14:40] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[04:11:41] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[05:48:40] FIRING: SystemdUnitFailed: wmf_auto_restart_kerberos_rsync.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:52:09] netops, Infrastructure-Foundations: asw1-b12-drmrs stopped reporting metrics - https://phabricator.wikimedia.org/T413181#11581852 (ayounsi) We're currently troubleshooting why we can't see troubleshooting logs. But it can maybe be the root cause for the metrics issues. **TL;DR; we should upgrade to 23.4...
[06:56:39] netops, Infrastructure-Foundations: drmrs: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416441 (ayounsi) NEW
[06:59:29] netops, Infrastructure-Foundations: magru: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416442 (ayounsi) NEW
[07:04:23] netops, Cloud-Services, Infrastructure-Foundations: codfw: Upgrade cloudsw1-b1-codfw (2026) - https://phabricator.wikimedia.org/T416443 (ayounsi) NEW The #Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/profile/832/...
[07:06:25] netops, Infrastructure-Foundations: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444 (ayounsi) NEW
[07:06:39] netops, Infrastructure-Foundations: drmrs: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416441#11581979 (ayounsi)
[07:06:40] netops, Infrastructure-Foundations: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444#11581980 (ayounsi)
[07:07:32] netops, Infrastructure-Foundations: asw1-b12-drmrs stopped reporting metrics - https://phabricator.wikimedia.org/T413181#11581996 (ayounsi)
[07:07:34] netops, Infrastructure-Foundations: drmrs: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416441#11581997 (ayounsi)
[07:08:04] netops, Infrastructure-Foundations: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444#11581999 (ayounsi)
[07:08:06] netops, Infrastructure-Foundations: magru: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416442#11582000 (ayounsi)
[07:08:08] netops, Infrastructure-Foundations: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444#11582001 (ayounsi)
[07:09:36] netops, Infrastructure-Foundations, tools-infrastructure-team: codfw: Upgrade cloudsw1-b1-codfw (2026) - https://phabricator.wikimedia.org/T416443#11582002 (ayounsi)
[07:09:47] netops, Infrastructure-Foundations, tools-infrastructure-team: codfw: Upgrade cloudsw1-b1-codfw (2026) - https://phabricator.wikimedia.org/T416443#11582003 (ayounsi)
[07:09:49] netops, Infrastructure-Foundations, Observability-Logging: ~5k/logs/sec from netdev - https://phabricator.wikimedia.org/T412143#11582004 (ayounsi)
[07:14:40] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[07:49:58] ^ fixing the wmf_auto_restart_kerberos_rsync.service alert
[08:03:25] RESOLVED: SystemdUnitFailed: wmf_auto_restart_kerberos_rsync.service on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:02:35] netops, Infrastructure-Foundations: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450 (ayounsi) NEW
[09:40:55] netops, Infrastructure-Foundations: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444#11582323 (ayounsi)
[09:40:56] netops, Infrastructure-Foundations: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11582322 (ayounsi)
[09:41:26] netops, Infrastructure-Foundations: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11582324 (ayounsi)
[09:41:28] netops, Infrastructure-Foundations, Observability-Logging: ~5k/logs/sec from netdev - https://phabricator.wikimedia.org/T412143#11582325 (ayounsi)
[10:56:43] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.23 - 2026.02.13), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11582569 (Gehel) With the various investigations that have happened around Airflow, do we now have a...
[11:14:40] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[11:42:51] SRE-tools, Infrastructure-Foundations, ServiceOps new, SRE, Datacenter-Switchover: Support locking cookbooks run except for switchover related cookbooks - https://phabricator.wikimedia.org/T330997#11582768 (Blake)
[11:43:40] netops, Cloud-VPS, Infrastructure-Foundations, tools-infrastructure-team: codfw: Upgrade cloudsw1-b1-codfw (2026) - https://phabricator.wikimedia.org/T416443#11582777 (taavi)
[12:44:59] SRE-tools, Infrastructure-Foundations, ServiceOps new, Spicerack, and 2 others: Expose hosts from MysqlLegacyRemoteHosts in spicerack - https://phabricator.wikimedia.org/T328911#11582986 (Blake)
[13:54:06] is there anyone who might have time for a meeting in the next few days to discuss how locking works in spicerack? i'm trying to understand what all will be required to implement a fix for https://phabricator.wikimedia.org/T330997
[14:11:30] SRE-tools, Infrastructure-Foundations, ServiceOps new, SRE, Datacenter-Switchover: Support locking cookbooks run except for switchover related cookbooks - https://phabricator.wikimedia.org/T330997#11583373 (Blake) The way I'm considering going about this would be to create a switchover lock o...
[14:56:28] bjensen: volans is your best bet
[15:03:23] cdanis: thanks!
[15:11:36] netops, Infrastructure-Foundations, SRE: Update network SSH keys to ssh-ed25519 - https://phabricator.wikimedia.org/T336769#11583801 (Aklapper) @BBlack: Another ping
[15:14:40] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[19:14:40] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[19:25:19] netops, Infrastructure-Foundations: access request - read-only access to pfw's for Avishua Stein (astein) - https://phabricator.wikimedia.org/T413826#11584880 (AStein-WMF) i regenerated my public key- here it is: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPKOcE4nDmVZiJBqTCCEIEfmJn9YLf1Sb/h4l2rQf6Di astein@...
[19:49:41] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[19:54:11] netops, Infrastructure-Foundations: access request - read-only access to pfw's for Avishua Stein (astein) - https://phabricator.wikimedia.org/T413826#11584949 (cmooney) Resolved→Open Re-opening to deal with the change request
[20:20:41] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[20:49:41] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[21:20:41] RESOLVED: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[23:14:40] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag