[10:30:49] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11649996 (ayounsi) For asw1-23-ulsfo gNMI/TLS issue I've opened Nokia support case 05482268. --- ` We're currently provisioning two new switches. The first...
[12:29:25] FIRING: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:34:25] RESOLVED: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:33:44] netops, Commons, Infrastructure-Foundations, SRE, and 2 others: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11651446 (AlexisJazz)
[16:37:31] netops, Commons, Infrastructure-Foundations, SRE, and 2 others: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11651469 (AlexisJazz)
[16:40:12] netops, Commons, Infrastructure-Foundations, SRE, and 2 others: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11651485 (Jdforrester-WMF) p:Triage→Unbreak!
[16:40:58] netops, Commons, Infrastructure-Foundations, SRE, and 2 others: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11651487 (JaydenKieran) Can confirm has been affecting en.wikipedia.org and mediawiki.org too, though seems more stabl...
[16:42:56] netops, Infrastructure-Foundations, SRE, Traffic, Wikimedia-production-error: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11651504 (AlexisJazz)
[16:52:15] netops, Infrastructure-Foundations, SRE, Traffic, Wikimedia-production-error: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11651544 (Nemoralis) https://www.wikimediastatus.net/incidents/dgdcls8b0ybt
[16:53:30] netops, Infrastructure-Foundations, SRE, Traffic, Wikimedia-production-error: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11651549 (AlexisJazz) There was also a 5 minute spike in 50x errors at 14:15. Also between 15:30 and...
[16:54:25] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:56:58] netops, Infrastructure-Foundations, SRE, Traffic, and 2 others: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11651556 (AlexisJazz)
[16:58:12] netops, Infrastructure-Foundations, SRE, Traffic, Wikimedia-Incident: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11651560 (AlexisJazz)
[17:05:58] Hi. I've got two hosts that are stubbornly refusing to do PXE during reimaging, despite my best efforts to upgrade/downgrade firmware. Any help would be much appreciated. https://phabricator.wikimedia.org/T418398
[17:42:16] netops, Infrastructure-Foundations, SRE: Nokia SR-Linux DHCP Relay Bug - https://phabricator.wikimedia.org/T411054#11651815 (BTullis) @ayounsi directed me to this ticket after reading: {T418398} I believe that this is also preventing the reimaging of: * `dse-k8s-worker1026` on `lsw1-c2-eqiad` * `dse...
[18:49:21] Puppet, collaboration-services, Gerrit, Patch-For-Review: Gerrit git replication should not break when Puppet changes its config - https://phabricator.wikimedia.org/T416929#11652066 (Dzahn) > The short fix is to disable configuration autoreloading in the replication plugin. This config change ha...
[19:32:33] netops, Infrastructure-Foundations, SRE, Traffic, Wikimedia-Incident: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11652162 (ssingh) This should now be resolved but leaving to the task author to mark this as "Resolved". We...
[20:46:19] netops, Infrastructure-Foundations, SRE, Traffic, Wikimedia-Incident: 503 Service Unavailable No server is available to handle this request. - https://phabricator.wikimedia.org/T418392#11652313 (Aklapper)
[20:54:40] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[21:04:25] RESOLVED: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed