[01:33:36] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11381926 (Papaul)
[02:11:24] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11381965 (Papaul)
[05:32:47] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11382104 (Papaul)
[06:18:26] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11382134 (Papaul)
[06:28:38] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11382156 (Papaul) @cmooney @ayouns I update the task with all the IPV4 and IPV6 addresses for the links, irb's and loopbacks. Please review and let me know i...
[08:43:01] FIRING: NTPNoSynced: NTP not synced on tcp-proxy1002:9100 - https://wikitech.wikimedia.org/wiki/NTP - TODO - https://alerts.wikimedia.org/?q=alertname%3DNTPNoSynced
[08:50:32] netops, Infrastructure-Foundations, Traffic: POPs LVS : remove public vlan trunking - https://phabricator.wikimedia.org/T367732#11382492 (ayounsi) a:ssingh @ssingh started working on this with https://gerrit.wikimedia.org/r/1206424 in {T410047} boldly assigning the task to him :)
[08:53:01] RESOLVED: NTPNoSynced: NTP not synced on tcp-proxy1002:9100 - https://wikitech.wikimedia.org/wiki/NTP - TODO - https://alerts.wikimedia.org/?q=alertname%3DNTPNoSynced
[08:53:25] netops, Infrastructure-Foundations, SRE, Traffic, Patch-For-Review: No free IPs on public1-ulsfo vlan (Nov 2025) - https://phabricator.wikimedia.org/T410047#11382501 (ayounsi) See also {T367732}
[09:45:41] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[10:08:47] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Cloudcephosd: migrate to single network uplink - https://phabricator.wikimedia.org/T399180#11382907 (fgiunchedi)
[10:12:53] netops, Infrastructure-Foundations, SRE: Audit and verify all cloudcephosd have their primary interface tagged and access to cloud-storage vlan - https://phabricator.wikimedia.org/T409690#11382933 (fgiunchedi) Something else I forgot: I'm assuming codfw also is applicable in this case? i.e. these hos...
[10:13:23] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Cloudcephosd: migrate to single network uplink - https://phabricator.wikimedia.org/T399180#11382935 (fgiunchedi)
[10:45:41] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[11:11:34] SRE-tools, Infrastructure-Foundations, SRE: Add an option to the reimage cookbook to also update firmware - https://phabricator.wikimedia.org/T410384 (MoritzMuehlenhoff) NEW
[11:34:50] SRE-tools, Infrastructure-Foundations, SRE: Add an option to the reimage cookbook to also update firmware - https://phabricator.wikimedia.org/T410384#11383237 (Marostegui) The host can be rebooted if needed as many times as needed - it is out of the load balancer and mariadb is stopped.
[13:48:12] netops, Infrastructure-Foundations, Toolforge, tools-infrastructure-team: Plan networking for Toolforge-on-Metal experiment - https://phabricator.wikimedia.org/T407140#11383672 (cmooney) Ok thanks @fgiunchedi for the info. I think that seems doable. As per the sub-task about a VRF I think that...
[14:27:27] moritzm: thanks for trying the reimage!
and yeah, both VMs in magru are failing
[14:27:43] I even verified that they are on a different node than the installserver (well at least one is)
[14:27:59] but beyond that I have not found the time to dig deeper (or where to look beyond what I already know to look at) I guess
[14:55:52] sukhe: I'll look into it, leave that to me
[14:57:57] moritzm: I also can have a look if you're busy
[14:59:11] <3
[14:59:50] it's fine, I'll take care of it, I had been looking into the DB partman stuff the past days for https://phabricator.wikimedia.org/T408777 anyway
[15:00:01] I will decomm the existing ones so that you have a clean run
[15:00:06] sukhe: thx
[15:00:19] XioNoX: maybe moving to UEFI also resolves the swap issue we'd seen
[15:44:25] netops, Infrastructure-Foundations, SRE, Traffic: Cleaning up Puppet and Netbox VLAN sub-ints on edge sites - https://phabricator.wikimedia.org/T410411 (ssingh) NEW
[15:46:25] FIRING: SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:48:18] I cleaned up hcaptcha-proxy700[12]
[15:49:09] netops, Infrastructure-Foundations, SRE, Traffic, Patch-For-Review: Cleaning up Puppet and Netbox VLAN sub-ints on edge sites - https://phabricator.wikimedia.org/T410411#11384212 (ssingh) p:Triage→Low
[15:51:25] RESOLVED: SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:45:44] netops, Infrastructure-Foundations, SRE: Servers exposing incorrect LLDP info - https://phabricator.wikimedia.org/T250367#11384628 (Papaul) I took a look at xe-1/0/8 as you mentioned it was cp5002 and i saw dns5004 and just to realized that this task has been open since 2020 5 years ago so now on por...
[18:50:09] netops, Infrastructure-Foundations, Traffic: POPs LVS : remove public vlan trunking - https://phabricator.wikimedia.org/T367732#11385383 (ssingh) Related: T410411.
[18:51:24] netops, Infrastructure-Foundations, Traffic: POPs LVS : remove public vlan trunking - https://phabricator.wikimedia.org/T367732#11385400 (ssingh) >>! In T367732#11382492, @ayounsi wrote: > @ssingh started working on this with https://gerrit.wikimedia.org/r/1206424 in {T410047} boldly assigning the t...
[19:31:56] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11385630 (RobH) Day 6 Update: * 31 hosts moved today, 77 hosts remain * got directions from Clement on how to move wikikube hosts effectively, moved half...
[19:54:05] netops, Infrastructure-Foundations, SRE: lsw1-d6-eqiad outage Nov 18 2025 - https://phabricator.wikimedia.org/T410455 (cmooney) NEW p:Triage→Low
[19:57:04] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, Traffic: ulsfo switch refresh - https://phabricator.wikimedia.org/T410456 (RobH) NEW p:Triage→Medium
[19:57:40] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, Traffic: ulsfo switch refresh - https://phabricator.wikimedia.org/T410456#11385835 (RobH) Open→Invalid dupe of T410456
[19:58:24] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, Traffic: ulsfo switch refresh - https://phabricator.wikimedia.org/T410456#11385838 (RobH) Invalid→Resolved dupe of T408510
[19:59:35] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: switch refresh - https://phabricator.wikimedia.org/T408510#11385845 (RobH)
[20:02:10] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: switch refresh - https://phabricator.wikimedia.org/T408510#11385856 (RobH)
[20:03:11] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: ULSFO: switch refresh - https://phabricator.wikimedia.org/T408510#11385859 (ssingh)
[20:03:37] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, Traffic: ulsfo switch refresh - https://phabricator.wikimedia.org/T410456#11385866 (Aklapper) →Duplicate dup:T408510
[20:03:40] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: ULSFO: switch refresh - https://phabricator.wikimedia.org/T408510#11385868 (Aklapper)
[20:04:59] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, and 2 others: ULSFO: switch refresh - https://phabricator.wikimedia.org/T408510#11385870 (RobH)
[20:05:06] netops, Infrastructure-Foundations, SRE: lsw1-d6-eqiad outage Nov 18 2025 - https://phabricator.wikimedia.org/T410455#11385874 (cmooney)
[20:09:34] netops, Infrastructure-Foundations, SRE: lsw1-d6-eqiad outage Nov 18 2025 - https://phabricator.wikimedia.org/T410455#11385892 (cmooney)
[20:11:59] netops, Infrastructure-Foundations, SRE: lsw1-d6-eqiad outage Nov 18 2025 - https://phabricator.wikimedia.org/T410455#11385900 (cmooney)
[20:13:35] netops, Infrastructure-Foundations, SRE: lsw1-d6-eqiad outage Nov 18 2025 - https://phabricator.wikimedia.org/T410455#11385909 (cmooney)
[20:35:33] netops, Infrastructure-Foundations, SRE: lsw1-d6-eqiad outage Nov 18 2025 - https://phabricator.wikimedia.org/T410455#11385997 (cmooney)
[23:27:16] netops, Infrastructure-Foundations, SRE: lsw1-d6-eqiad outage Nov 18 2025 - https://phabricator.wikimedia.org/T410455#11386635 (cmooney) To try to verify what happened here I tried to make the same change in netbox-next, (with [[ https://netbox-next.wikimedia.org/dcim/devices/6359/ | this ]] being th...