[02:42:57] netops, Traffic, Infrastructure-Foundations: esams/magru: 185.71.138.0/24 (wikidough) prefix not advertized - https://phabricator.wikimedia.org/T420342#11730865 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=c4f208ac-d300-44af-a51e-696d5c084e32) set by sukhe@cumin1003 for 1 day...
[02:43:19] netops, Traffic, Infrastructure-Foundations: esams/magru: 185.71.138.0/24 (wikidough) prefix not advertized - https://phabricator.wikimedia.org/T420342#11730866 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=2fd1d9bc-b373-41d2-8079-0dddb67c1c7a) set by sukhe@cumin1003 for 1 day...
[03:07:48] FIRING: PuppetZeroResources: Puppet has failed generate resources on ncredir3005:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[03:19:48] FIRING: PuppetZeroResources: Puppet has failed generate resources on hcaptcha-proxy3002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[07:08:03] FIRING: PuppetZeroResources: Puppet has failed generate resources on ncredir3005:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[07:20:03] FIRING: PuppetZeroResources: Puppet has failed generate resources on hcaptcha-proxy3002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[07:32:36] hello, is there a status page for wikimedia DNS?
it seems to be unresponsive this morning
[08:24:18] RESOLVED: PuppetZeroResources: Puppet has failed generate resources on hcaptcha-proxy3002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[09:15:01] netops, Infrastructure-Foundations, SRE: Nokia SR-Linux - wonky routing with IPv6 RAs and EVPN Anycast GW - https://phabricator.wikimedia.org/T420706 (cmooney) NEW p:Triage→High
[09:15:27] netops, Infrastructure-Foundations, SRE: Eqiad C/D refresh: move legacy switch uplinks to Nokias and migrate Vlan GWs - https://phabricator.wikimedia.org/T405562#11731460 (cmooney)
[09:15:28] netops, Infrastructure-Foundations, SRE: Nokia SR-Linux - wonky routing with IPv6 RAs and EVPN Anycast GW - https://phabricator.wikimedia.org/T420706#11731459 (cmooney)
[09:16:43] netops, Infrastructure-Foundations, SRE: Eqiad: move row-wide vlan gateways to Nokia switches - https://phabricator.wikimedia.org/T416872#11731464 (cmooney) Unfortunately we hit another blocker with this so we will have to review the way forward. See T420706.
[09:42:37] netops, Infrastructure-Foundations, SRE: Nokia SR-Linux - wonky routing with IPv6 RAs and EVPN Anycast GW - https://phabricator.wikimedia.org/T420706#11731593 (cmooney) Ticket 05547487 opened with Nokia.
[09:44:07] SimonSapin: only seen your message this morning
[09:44:14] or now ever
[09:44:35] we had a bit of a glitch at our Amsterdam location with Wikimedia DNS, but it should be stable now; we will keep an eye on it
[09:45:23] I'm not sure if there is a statuspage for it, perhaps someone from traffic can comment
[11:25:54] [[:phab:T108827]]
[11:25:55] T108827: Investigate TCP Fast Open for tlsproxy - https://phabricator.wikimedia.org/T108827
[11:27:06] bblack: Hi, I saw you created this task and TFO was enabled in 2016, but can you check this?
[[:phab:T415454]]
[11:27:15] T415454: TCP FastOpen not working since at least December 2025 - https://phabricator.wikimedia.org/T415454
[12:01:43] FIRING: HaproxyKafkaNoMessages: Unexpected rate of produced HaproxyKafka messages by cp2041 - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaNoMessages - https://grafana.wikimedia.org/d/d3e4e37c-c1d9-47af-9aad-a08dae2b3fd5/haproxykafka?orgId=1&var-site=codfw&var-instance=cp2041 - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaNoMessages
[12:03:10] ^^ me, I'll silence this
[12:06:43] RESOLVED: [2x] HaproxyKafkaNoMessages: Unexpected rate of produced HaproxyKafka messages by cp2041 - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaNoMessages - https://grafana.wikimedia.org/d/d3e4e37c-c1d9-47af-9aad-a08dae2b3fd5/haproxykafka?orgId=1&var-site=codfw&var-instance=cp2041 - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaNoMessages
[13:09:19] Traffic, SRE: TCP FastOpen not working since at least December 2025 - https://phabricator.wikimedia.org/T415454#11732131 (BBlack) TFO is still configured in our TLS terminators. We'll have to investigate to figure out what has gone wrong here. Possibly this is being stripped by our loadbalancers.
[13:16:04] Traffic, Liberica, Prod-Kubernetes, Kubernetes, ServiceOps new (Next quarter): Add missing wikikube workers to conftool-data - https://phabricator.wikimedia.org/T420729 (JMeybohm) NEW
[14:47:20] Traffic, SRE: TCP FastOpen not working since at least December 2025 - https://phabricator.wikimedia.org/T415454#11732482 (BBlack) @ssingh figured out where we went wrong. At the TLS terminator level, it is enabled, but at the OS level (Linux sysctl settings), it is not. We did have it enabled at that l...
[15:07:43] Traffic: Synchronize and rotate TCP Fastopen keys for various use-cases - https://phabricator.wikimedia.org/T355446#11732574 (ssingh) a:SLyngshede-WMF
[15:22:43] FIRING: HaproxyKafkaNoMessages: Unexpected rate of produced HaproxyKafka messages by cp2041 - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaNoMessages - https://grafana.wikimedia.org/d/d3e4e37c-c1d9-47af-9aad-a08dae2b3fd5/haproxykafka?orgId=1&var-site=codfw&var-instance=cp2041 - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaNoMessages
[15:22:56] that's fab
[15:29:50] mmm I've silenced it
[15:29:58] let me check again
[15:30:46] bah, silenced again
[15:37:49] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: Gerrit: Debug connection re-use on Gerrit's httpd causing Gerrit interface to be very slow - https://phabricator.wikimedia.org/T420189#11732690 (hashar) //@ABran-WMF and I further exchanged about on Thursday...
[15:38:17] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: Gerrit: Debug connection re-use on Gerrit's httpd causing Gerrit interface to be very slow - https://phabricator.wikimedia.org/T420189#11732691 (hashar) ### Config troubleshooting The event mpm was designed...
[15:42:33] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, Patch-For-Review: Gerrit: Debug connection re-use on Gerrit's httpd causing Gerrit interface to be very slow - https://phabricator.wikimedia.org/T420189#11732733 (hashar) //I think that concludes this task (debugging), and we...
[15:59:39] Traffic: Synchronize and rotate TCP Fastopen keys for various use-cases - https://phabricator.wikimedia.org/T355446#11732813 (ssingh) Before we get to the other stuff, step 0 is to actually enable TFO on the cp servers. For that, we will need to set/include `profile::tcp_fast_open`. And then we can tackle th...
[18:11:51] Traffic, cloud-services-team, Data-Services, Datasets-General-or-Unknown, Patch-For-Review: Move dumps.wikimedia.org HTTP service behind CDN edge - https://phabricator.wikimedia.org/T306550#11733520 (HCoplin-WMF) I can hopefully answer some of those questions: 1. **How large are the largest...
[19:16:03] Traffic: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11733729 (CDobbins)
[20:50:22] Traffic: lvs2013 NIC fails to set queue length / fails to initialize with ipip-multiqueue-optimizer - https://phabricator.wikimedia.org/T420789 (BCornwall) NEW
[20:51:43] Traffic: lvs2013 NIC fails to set queue length / fails to initialize with ipip-multiqueue-optimizer - https://phabricator.wikimedia.org/T420789#11733971 (BCornwall)
[21:02:41] Traffic: lvs2013 NIC fails to set queue length / fails to initialize with ipip-multiqueue-optimizer - https://phabricator.wikimedia.org/T420789#11734022 (BCornwall) Open→In progress p:Triage→High
[21:58:02] Traffic: Upgrade Traffic hosts to trixie - https://phabricator.wikimedia.org/T401832#11734134 (CDobbins)