[03:30:56] FIRING: MaxConntrack: Max conntrack at 85.26% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[03:35:55] RESOLVED: MaxConntrack: Max conntrack at 84.19% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[04:05:42] Puppet, Infrastructure-Foundations: Improve the user experience adding new nodes to puppet - https://phabricator.wikimedia.org/T389932#11267431 (Joe) >>! In T389932#10701585, @jhathaway wrote: > In proposing possible solutions, I would love to understand a bit more why our `site.pp` uses complex regexes....
[04:09:57] Puppet, Infrastructure-Foundations: Improve the user experience adding new nodes to puppet - https://phabricator.wikimedia.org/T389932#11267432 (Joe) >>! In T389932#10701560, @jhathaway wrote: >>>! In T389932#10697436, @Joe wrote: >>>>! In T389932#10694961, @jhathaway wrote: >>> One issue with using just...
[10:45:51] netops, Infrastructure-Foundations, SRE: drmrs: cr1-drmrs <-> asw1-b13-drmrs link down [Oct 2025] - https://phabricator.wikimedia.org/T407107 (cmooney) NEW p:Triage→High
[11:29:52] netops, Infrastructure-Foundations, SRE: drmrs: cr1-drmrs <-> asw1-b13-drmrs link down [Oct 2025] - https://phabricator.wikimedia.org/T407107#11268170 (cmooney) Remote hands request id CS3321949
[13:11:14] netops, Infrastructure-Foundations, SRE: drmrs: cr1-drmrs <-> asw1-b13-drmrs link down [Oct 2025] - https://phabricator.wikimedia.org/T407107#11268439 (cmooney) They checked the fibres and reseated but no change. ` We have reseated the fiber and SFPs on both sides. However, QSFP port 2 on the cr1-drm...
[14:39:59] netops, Infrastructure-Foundations, SRE: drmrs: cr1-drmrs <-> asw1-b13-drmrs link down [Oct 2025] - https://phabricator.wikimedia.org/T407107#11268789 (cmooney) The QSFP replacement in cr1-drmrs did the trick: ` Oct 13 13:49:58 asw1-b13-drmrs mib2d[9179]: SNMP_TRAP_LINK_UP: ifIndex 574, ifAdminStatu...
[14:41:17] netops, Infrastructure-Foundations, SRE: drmrs: cr1-drmrs <-> asw1-b13-drmrs link down [Oct 2025] - https://phabricator.wikimedia.org/T407107#11268792 (cmooney) p:High→Low
[15:48:44] netops, Infrastructure-Foundations, Toolforge, tools-infrastructure-team: Plan networking for Toolforge-on-Metal experiment - https://phabricator.wikimedia.org/T407140 (taavi) NEW
[15:49:14] netops, Infrastructure-Foundations, Toolforge, tools-infrastructure-team: Plan networking for Toolforge-on-Metal experiment - https://phabricator.wikimedia.org/T407140#11269090 (taavi) /cc @Andrew @cmooney
[16:31:10] netops, Infrastructure-Foundations, Toolforge, tools-infrastructure-team: Plan networking for Toolforge-on-Metal experiment - https://phabricator.wikimedia.org/T407140#11269245 (cmooney) Thanks for the task @taavi ! Option 1 is nice and neat, but as you say there is a lot of work to do in terms...
[22:56:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag