[00:31:05] FIRING: [3x] PuppetConstantChange: Puppet performing a change on every puppet run on cumin1002:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[04:31:05] FIRING: [3x] PuppetConstantChange: Puppet performing a change on every puppet run on cumin1002:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[05:38:49] netops, Infrastructure-Foundations: FPC1 Failure on cr1-esams - take 2 - https://phabricator.wikimedia.org/T403360#11137741 (ayounsi) Open→Resolved a:ayounsi Good news, it's still up.
[08:24:26] FIRING: [4x] SystemdUnitFailed: squid-logrotate.service on install2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:31:05] FIRING: [3x] PuppetConstantChange: Puppet performing a change on every puppet run on cumin1002:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[10:34:30] FIRING: [5x] SystemdUnitFailed: squid-logrotate.service on install2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:44:26] FIRING: [5x] SystemdUnitFailed: squid-logrotate.service on install2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:31:05] FIRING: [3x] PuppetConstantChange: Puppet performing a change on every puppet run on cumin1002:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[13:04:38] netops, Infrastructure-Foundations, SRE, Traffic: Remove static routes for LVS VIPs from core routers - https://phabricator.wikimedia.org/T300877#11138880 (ssingh) >>! In T300877#11130890, @ayounsi wrote: >> the idea is that static routes should help save us in that situation > > That would only...
[14:01:14] Puppet, MW-on-K8s, Observability-Alerting: Clean up "git repo needs merge" checks - https://phabricator.wikimedia.org/T370530#11139269 (tappof)
[14:03:21] netbox, Infrastructure-Foundations, Observability-Alerting: Port netbox reports checks to Prometheus/Alertmanager - https://phabricator.wikimedia.org/T374823#11139292 (tappof)
[14:28:26] netops, Infrastructure-Foundations, SRE, Traffic: Remove static routes for LVS VIPs from core routers - https://phabricator.wikimedia.org/T300877#11139484 (ayounsi)
[14:44:47] FIRING: [4x] SystemdUnitFailed: squid-logrotate.service on install2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:46:05] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[15:22:22] moritzm: magru RIPE Atlas anchor is back online on routed ganeti: https://atlas.ripe.net/probes/7508/overview
[15:22:50] wow nice work !
[15:23:02] glad that wasn't a blocker for us on routed ganeti :)
[15:28:23] very nice!
[15:46:05] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[16:17:08] netops, Ganeti, Infrastructure-Foundations: magru: move sandbox vlan to routed Ganeti - https://phabricator.wikimedia.org/T402372#11140100 (ayounsi) magru Anchor is back online. It did require some remote help from the RIPE team, especially to configure v6. v4 worked out of the box, even without the...
[16:22:35] netops, Infrastructure-Foundations, SRE, Traffic: Remove static routes for LVS VIPs from core routers - https://phabricator.wikimedia.org/T300877#11140127 (ssingh) Thanks for taking care of this @ayounsi! We will update this task when we are ready to remove the `eqiad` ones.
[16:31:05] FIRING: [3x] PuppetConstantChange: Puppet performing a change on every puppet run on cumin1002:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[18:44:41] FIRING: [4x] SystemdUnitFailed: squid-logrotate.service on install2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:08:33] Mail, Infrastructure-Foundations, MediaWiki-Email, MediaWiki-extensions-EmailAuth, and 4 others: Could not send confirmation email: Unknown error in PHP's mail() function. - https://phabricator.wikimedia.org/T383047#11140847 (Tgr) >>! In T383047#11087096, @Tgr wrote: > I wouldn't worry much about...
[20:30:50] FIRING: [3x] PuppetConstantChange: Puppet performing a change on every puppet run on cumin1002:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[20:31:20] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: Eqiad: new structured cabling needed between cages to eqiad 2025/6 switch refresh - https://phabricator.wikimedia.org/T402432#11141430 (wiki_willy) a:Jclark-ctr
[22:44:41] FIRING: [4x] SystemdUnitFailed: squid-logrotate.service on install2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed