[01:09:25] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[01:45:36] SRE-tools, Infrastructure-Foundations, SRE, tox-wikimedia, Patch-Needs-Improvement: Introduce Python code formatters usage - https://phabricator.wikimedia.org/T211750#11373145 (Pppery)
[01:48:42] Puppet, cloud-services-team, Cloud-VPS, Data-Persistence, and 2 others: haproxy::site doesn't work as expected on the first puppet run - https://phabricator.wikimedia.org/T321684#11373151 (Pppery)
[01:51:08] SRE-tools, Infrastructure-Foundations, Spicerack, Patch-Needs-Improvement: switchdc SAL log entries are getting cut off because long lines are being split over IRC - https://phabricator.wikimedia.org/T285709#11373161 (Pppery)
[01:59:25] netops, Infrastructure-Foundations, SRE, Patch-Needs-Improvement: test_matching_vlan() function crashing in Netbox network report - https://phabricator.wikimedia.org/T339133#11373180 (Pppery)
[05:09:40] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:09:25] RESOLVED: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:27:58] netops, Infrastructure-Foundations, SRE, Traffic: Map video and other large files to 'low-priority' network Qos queue - https://phabricator.wikimedia.org/T410133 (cmooney) NEW p:Triage→Low
[12:05:36] netbox, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw:cr* router power not balance on all 4 PEM's - https://phabricator.wikimedia.org/T401937#11374105 (cmooney) Ok so it is clear Juniper were correct. Pem 2 and 3 from //cr2-codfw// had 58v output before they were moved. Now th...
[12:14:29] netops, Infrastructure-Foundations, SRE, Traffic, Patch-For-Review: No free IPs on public1-ulsfo vlan (Nov 2025) - https://phabricator.wikimedia.org/T410047#11374122 (cmooney) @ssingh I made a patch and can kick off the changes in Netbox and on the routers next week for this. However I wonde...
[13:15:58] CAS-SSO, Infrastructure-Foundations: CAS stops responsing after some period of time - https://phabricator.wikimedia.org/T410139 (SLyngshede-WMF) NEW
[13:16:07] CAS-SSO, Infrastructure-Foundations: CAS stops responsing after some period of time - https://phabricator.wikimedia.org/T410139#11374246 (SLyngshede-WMF) p:Triage→Medium
[13:58:52] CAS-SSO, Infrastructure-Foundations: CAS stops responsing after some period of time - https://phabricator.wikimedia.org/T410139#11374322 (Arendpieter) Is this happening on both https://idp.wikimedia.org and https://idp-test.wikimedia.org ? I’ll try to reproduce the issue locally using Docker Compose.
[14:01:42] CAS-SSO, cloud-services-team, Striker, Patch-For-Review: Use IDP for authentication in Striker - https://phabricator.wikimedia.org/T359554#11374327 (taavi) >>! In T359554#11372282, @Arendpieter wrote: > @taavi do I need to do something else for https://gerrit.wikimedia.org/r/c/labs/striker/+/1189...
[14:31:09] CAS-SSO, Infrastructure-Foundations: CAS stops responsing after some period of time - https://phabricator.wikimedia.org/T410139#11374405 (MoritzMuehlenhoff) We've only seen it on idp.w.o, but -test also receives several orders of a magnitude less connections. Plus, it's restarted far more often anyway fo...
[18:52:44] netops, Infrastructure-Foundations, SRE: Audit and verify all cloudcephosd have their primary interface tagged and access to cloud-storage vlan - https://phabricator.wikimedia.org/T409690#11375257 (cmooney) btw I haven't forgotten about this I'll get to it next week
[19:42:05] CAS-SSO, cloud-services-team, Cloud-VPS, Infrastructure-Foundations: sso failure in codfw1dev (labtesthorizon.wikimedia.org) - https://phabricator.wikimedia.org/T409328#11375366 (taavi) It seems like CAS issues the redirect when a request has the `x-forwarded-proto` header present.
[20:19:30] CAS-SSO, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, Patch-For-Review: sso failure in codfw1dev (labtesthorizon.wikimedia.org) - https://phabricator.wikimedia.org/T409328#11375488 (taavi) a:MoritzMuehlenhoff→taavi