[08:49:33] netops, Infrastructure-Foundations, cloud-services-team (FY2024/2025-Q1-Q2): cloud: edge network suffers downtime if one cloudsw is down - https://phabricator.wikimedia.org/T375259#10163529 (aborrero)
[08:50:25] netops, Infrastructure-Foundations, cloud-services-team (FY2024/2025-Q1-Q2): cloud: edge network suffers downtime if one cloudsw is down - https://phabricator.wikimedia.org/T375259#10163533 (aborrero) p:Triage→Medium I'm tagging #netops in the hope they can help us figure out what happens her...
[09:03:03] SRE-tools, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 4 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#10163581 (MoritzMuehlenhoff)
[09:11:10] netops, Infrastructure-Foundations, SRE: netbox: create IPv6 entries for Cloud VPS - https://phabricator.wikimedia.org/T374712#10163608 (cmooney) I've added the network containers as discussed on the previous task in Netbox now: https://netbox.wikimedia.org/ipam/prefixes/?within_include=2a02%3Aec80%...
[09:23:28] CFSSL-PKI, netops, Infrastructure-Foundations: sre.network.tls cookbook - CFSSL error: bad request - https://phabricator.wikimedia.org/T375179#10163644 (ayounsi) a:ayounsi Thanks for having a look ! You're right, looks like the `preserve` directory isn't preserved across upgrades. So the workarou...
[09:47:54] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add cloudsw to gnmic interface stats collection - https://phabricator.wikimedia.org/T365012#10163710 (ayounsi) Open→Resolved Certs regenerated, so we're good for the next 12 months. Hopefully we will setup automation by then :)
[10:10:04] CFSSL-PKI, netops, Infrastructure-Foundations, Patch-For-Review: sre.network.tls cookbook - CFSSL error: bad request - https://phabricator.wikimedia.org/T375179#10163746 (ayounsi) Patch above tested and works fine. ` INFO:__main__:================================================== INFO:__main__:E...
[12:35:06] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: Migrate codfw servers in rows C & D from legacy ASW to LSW - https://phabricator.wikimedia.org/T370630#10163923 (MoritzMuehlenhoff) Ganeti row C and D have been rebalanced.
[13:13:04] SRE-tools, Infrastructure-Foundations, SRE: debmonitor-client: Warning printed with su from buster - https://phabricator.wikimedia.org/T216832#10164013 (hashar) a:hashar
[13:21:43] SRE-tools, Infrastructure-Foundations, SRE: debmonitor-client: Warning printed with su from buster - https://phabricator.wikimedia.org/T216832#10164040 (MoritzMuehlenhoff) I've uploaded updated debs with the patch, will be rolled out next week.
[14:12:27] SRE-tools, Infrastructure-Foundations, Spicerack: Support listing pooled / active authdns hosts (rather than all) - https://phabricator.wikimedia.org/T375014#10164116 (ssingh) Thanks @Scott_French for documenting this by filing this task! I had a brief chat about this with @volans yesterday on IRC....
[14:55:14] SRE-tools, Infrastructure-Foundations, SRE: debmonitor-client: Warning printed with su from buster - https://phabricator.wikimedia.org/T216832#10164221 (hashar) I have upgraded the client on contint1002.wikimedia.org. The warning no more appears and I can see it upgraded in the Debmonitor web interf...
[17:31:25] FIRING: SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:36:25] RESOLVED: SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed