[07:50:28] Traffic, ServiceOps new, ServiceOps-Services-Oids, Patch-For-Review, and 2 others: hCaptcha: Stop using urldownloader for health checks of the secure-api.js file - https://phabricator.wikimedia.org/T421464#11818006 (OKryva-WMF)
[07:52:33] Traffic, ConfirmEdit (CAPTCHA extension), 2026-user-javascript-incident, ContentSecurityPolicy, and 2 others: [hCaptcha] CORS error on jawiki/enwiki Special:CreateAccount (fails to load secure-api.js), but works on mediawikiwiki - https://phabricator.wikimedia.org/T423039#11818033 (OKryva-WMF)
[08:33:21] netops, Infrastructure-Foundations, observability, Prod-Kubernetes, and 5 others: Increase visibility of kubernetes network status - https://phabricator.wikimedia.org/T356877#11818201 (ayounsi) I gave a quick review on the CR, but you should at least use https://wikitech.wikimedia.org/wiki/Networ...
[11:57:35] netops, Infrastructure-Foundations, Patch-For-Review: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11818885 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=eea64f46-a692-4b84-9bd8-3c707fabc91e) set by ayounsi@cumin1003 for 1:00:00 on 3 host(s)...
[12:14:40] FIRING: [8x] VarnishHighThreadCount: Varnish's thread count on cp6009:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[12:26:48] FIRING: PuppetFailure: Puppet has failed on lvs1019:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[12:51:48] RESOLVED: PuppetFailure: Puppet has failed on lvs1019:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[12:52:48] FIRING: PuppetZeroResources: Puppet has failed generate resources on hcaptcha-proxy7001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[12:53:48] FIRING: PuppetFailure: Puppet has failed on dns1004:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[12:59:40] FIRING: [16x] VarnishHighThreadCount: Varnish's thread count on cp6009:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[13:00:00] netops, Infrastructure-Foundations, Patch-For-Review: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11819484 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=969c7620-d5fd-476a-bc01-620cf3273f2e) set by ayounsi@cumin1003 for 0:30:00 on 3 host(s)...
[13:10:07] netops, Infrastructure-Foundations, Patch-For-Review: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11819529 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=8f6b7eae-ec0e-422c-9012-4219cdb0af72) set by ayounsi@cumin1003 for 1:00:00 on 3 host(s)...
[13:17:48] RESOLVED: PuppetZeroResources: Puppet has failed generate resources on hcaptcha-proxy7001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[13:18:48] RESOLVED: PuppetFailure: Puppet has failed on dns1004:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[13:19:40] RESOLVED: [8x] VarnishHighThreadCount: Varnish's thread count on cp6009:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[13:33:50] netops, Infrastructure-Foundations, Patch-For-Review: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11819625 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=318be034-a1ac-42f8-a346-f076fb030f09) set by ayounsi@cumin1003 for 1:00:00 on 13 host(s...
[13:35:47] netops, Infrastructure-Foundations, Patch-For-Review: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11819637 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=7aa6ddc4-76ea-4040-9c97-e3ee3ff2982d) set by ayounsi@cumin1003 for 1:00:00 on 3 host(s)...
[13:54:37] netops, Infrastructure-Foundations, Patch-For-Review: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11819680 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=60febde6-0fbd-4034-aa7f-25aa1b23e19d) set by ayounsi@cumin1003 for 1:00:00 on 12 host(s...
[13:55:53] netops, Infrastructure-Foundations, Patch-For-Review: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11819682 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=d98c0310-2bee-4b0c-9c4f-43b3ad5852ad) set by ayounsi@cumin1003 for 1:00:00 on 3 host(s)...
[13:56:55] netops, Infrastructure-Foundations, Patch-For-Review: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11819688 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=3ae4af28-cae2-4928-bd05-37949f6ccde3) set by ayounsi@cumin1003 for 1:00:00 on 8 host(s)...
[14:07:27] FIRING: SLOMetricAbsent: haproxy-combined - https://slo.wikimedia.org/?search=haproxy-combined - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[14:10:14] FIRING: SLOMetricAbsent: varnish-combined esams - https://slo.wikimedia.org/?search=varnish-combined - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[14:10:31] FIRING: SLOMetricAbsent: varnish-combined esams - https://slo.wikimedia.org/?search=varnish-combined - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[14:12:27] FIRING: [2x] SLOMetricAbsent: haproxy-combined - https://slo.wikimedia.org/?search=haproxy-combined - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[14:15:14] RESOLVED: SLOMetricAbsent: varnish-combined esams - https://slo.wikimedia.org/?search=varnish-combined - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[14:15:31] RESOLVED: SLOMetricAbsent: varnish-combined esams - https://slo.wikimedia.org/?search=varnish-combined - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[14:17:27] RESOLVED: [2x] SLOMetricAbsent: haproxy-combined - https://slo.wikimedia.org/?search=haproxy-combined - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[14:22:14] netops, Infrastructure-Foundations, Patch-For-Review: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11819834 (ayounsi) Open→Resolved All done.
[14:22:32] netops, Infrastructure-Foundations, SRE: cr1-esams failed upgrade - https://phabricator.wikimedia.org/T422525#11819840 (ayounsi) Open→Resolved a:ayounsi Upgraded to 23.4R2-S8 and all is well.
[14:23:45] netops, Infrastructure-Foundations, Patch-For-Review: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11819847 (ayounsi)
[15:46:48] Traffic, DC-Ops, ops-eqiad, SRE: Revert lvs1017 Mellanox NIC to Broadcom - https://phabricator.wikimedia.org/T421421#11820425 (BCornwall) We've decided to move forward with this task. Would dcops be willing to handle the NIC revert in lvs1017?
[17:19:56] Traffic, Data-Persistence, MediaViewer, SRE-swift-storage, and 6 others: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only - https://phabricator.wikimedia.org/T414805#11820959 (TheDJ)
[19:19:13] Traffic, Wikimedia-Hackathon-2026: Wikimedia Hackathon 2026: Wikimedia's Production DNS Infrastructure and GeoDNS User Routing - https://phabricator.wikimedia.org/T423331 (ssingh) NEW
[19:19:44] Traffic, Wikimedia-Hackathon-2026: Wikimedia Hackathon 2026: Wikimedia's Production DNS Infrastructure and GeoDNS User Routing - https://phabricator.wikimedia.org/T423331#11821445 (ssingh)
[19:23:53] Traffic: Containerize ncmonitor - https://phabricator.wikimedia.org/T408617#11821452 (BCornwall) In progress→Stalled
[19:24:58] Traffic, DBA, Infrastructure-Foundations: Move orchestrator (dborch) to private ipaddrs + CDN - https://phabricator.wikimedia.org/T317179#11821454 (ssingh) We discussed this in Traffic and have a few questions on why we are planning to do this, outside of the public IP thing. We also want to discuss...
[21:45:44] Traffic, collaboration-services, Gerrit, Release-Engineering-Team, and 2 others: gerrit: Adapt timeouts to avoid 502 errors in CI jobs - https://phabricator.wikimedia.org/T421827#11822004 (A_smart_kitten)