[01:53:37] Traffic, Data-Engineering: Request for a new request dataset for caching research - https://phabricator.wikimedia.org/T401331#11581666 (Ahoelzl) @yazhuoz can you help us with the requirements for the data set? The old request from [[ https://phabricator.wikimedia.org/T225538 | 2019 ]] has some informati...
[05:52:09] netops, Infrastructure-Foundations: asw1-b12-drmrs stopped reporting metrics - https://phabricator.wikimedia.org/T413181#11581852 (ayounsi) We're currently troubleshooting why we can't see troubleshooting logs. But it can maybe be the root cause for the metrics issues. **TL;DR; we should upgrade to 23.4...
[06:30:53] Traffic, Maps, SRE, affects-Kiwix-and-openZIM: On using Wikimedia Maps to build Kiwix Openstreetmap ZIMs - https://phabricator.wikimedia.org/T416374#11581910 (Bugreporter) This does not need to add a whitelist. Instead you need to set a proper referer when fetching tiles.
[06:56:39] netops, Infrastructure-Foundations: drmrs: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416441 (ayounsi) NEW
[06:59:29] netops, Infrastructure-Foundations: magru: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416442 (ayounsi) NEW
[07:04:23] netops, Cloud-Services, Infrastructure-Foundations: codfw: Upgrade cloudsw1-b1-codfw (2026) - https://phabricator.wikimedia.org/T416443 (ayounsi) NEW The #Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/profile/832/...
[07:06:25] netops, Infrastructure-Foundations: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444 (ayounsi) NEW
[07:06:39] netops, Infrastructure-Foundations: drmrs: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416441#11581979 (ayounsi)
[07:06:40] netops, Infrastructure-Foundations: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444#11581980 (ayounsi)
[07:07:32] netops, Infrastructure-Foundations: asw1-b12-drmrs stopped reporting metrics - https://phabricator.wikimedia.org/T413181#11581996 (ayounsi)
[07:07:34] netops, Infrastructure-Foundations: drmrs: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416441#11581997 (ayounsi)
[07:08:04] netops, Infrastructure-Foundations: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444#11581999 (ayounsi)
[07:08:06] netops, Infrastructure-Foundations: magru: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416442#11582000 (ayounsi)
[07:08:08] netops, Infrastructure-Foundations: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444#11582001 (ayounsi)
[07:09:36] netops, Infrastructure-Foundations, tools-infrastructure-team: codfw: Upgrade cloudsw1-b1-codfw (2026) - https://phabricator.wikimedia.org/T416443#11582002 (ayounsi)
[07:09:47] netops, Infrastructure-Foundations, tools-infrastructure-team: codfw: Upgrade cloudsw1-b1-codfw (2026) - https://phabricator.wikimedia.org/T416443#11582003 (ayounsi)
[07:09:49] netops, Infrastructure-Foundations, Observability-Logging: ~5k/logs/sec from netdev - https://phabricator.wikimedia.org/T412143#11582004 (ayounsi)
[09:02:35] netops, Infrastructure-Foundations: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450 (ayounsi) NEW
[09:40:55] netops, Infrastructure-Foundations: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444#11582323 (ayounsi)
[09:40:56] netops, Infrastructure-Foundations: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11582322 (ayounsi)
[09:41:26] netops, Infrastructure-Foundations: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11582324 (ayounsi)
[09:41:28] netops, Infrastructure-Foundations, Observability-Logging: ~5k/logs/sec from netdev - https://phabricator.wikimedia.org/T412143#11582325 (ayounsi)
[10:56:43] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.23 - 2026.02.13), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11582569 (Gehel) With the various investigations that have happened around Airflow, do we now have a...
[11:43:40] netops, Cloud-VPS, Infrastructure-Foundations, tools-infrastructure-team: codfw: Upgrade cloudsw1-b1-codfw (2026) - https://phabricator.wikimedia.org/T416443#11582777 (taavi)
[12:32:41] Traffic, SRE, Patch-For-Review: Offer AuthDNS service over IPv6 - https://phabricator.wikimedia.org/T81605#11582903 (cmooney) RIPEstat looks good in terms of visibility of the new ns2 Anycast prefix: {F71670832 width=400}
[12:56:51] FIRING: FermMSS: Unexpected MSS value on 10.2.2.27:80 @ ms-fe1021 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=eqiad&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[13:01:51] RESOLVED: FermMSS: Unexpected MSS value on 10.2.2.27:80 @ ms-fe1021 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=eqiad&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[13:02:51] FIRING: FermMSS: Unexpected MSS value on 10.2.2.27:80 @ ms-fe1021 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=eqiad&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[13:07:06] RESOLVED: FermMSS: Unexpected MSS value on 10.2.2.27:80 @ ms-fe1021 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=eqiad&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[13:22:51] FIRING: FermMSS: Unexpected MSS value on 10.2.2.27:80 @ ms-fe1024 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=eqiad&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[13:27:51] RESOLVED: FermMSS: Unexpected MSS value on 10.2.2.27:80 @ ms-fe1024 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=eqiad&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[13:52:44] uh... I missed that alert
[13:53:25] I'll submit a commit soon to avoid getting paged when MSS is 0
[13:54:05] s/paged/alerts/
[14:32:55] Traffic, Data-Engineering: Request for a new request dataset for caching research - https://phabricator.wikimedia.org/T401331#11583485 (yazhuoz) @Ahoelzl Thanks for getting back to me! Here are the detailed requirements for the new CDN caching dataset. **Data fields:** We would like to retain the prev...
[14:51:51] FIRING: FermMSS: Unexpected MSS value on 10.2.2.27:80 @ ms-fe1023 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=eqiad&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[14:56:51] RESOLVED: FermMSS: Unexpected MSS value on 10.2.2.27:80 @ ms-fe1023 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=eqiad&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[15:11:36] netops, Infrastructure-Foundations, SRE: Update network SSH keys to ssh-ed25519 - https://phabricator.wikimedia.org/T336769#11583801 (Aklapper) @BBlack: Another ping
[16:03:24] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11584077 (RobH) After having Jin check, this system has a failure of "The system board 5V SW PG voltage is outside of range." on the front LCD he plugged into it. The warranty expired in October 20...
[16:54:49] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11584257 (RobH)
[16:55:46] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11584262 (RobH)
[16:58:56] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11584264 (RobH)
[17:22:52] Cache hit ratio of upload (frontend + backend) in normal times has increased from 93% to 95% in the past six months \o/
[17:22:55] https://usercontent.irccloud-cdn.com/file/jSCEUU5d/image.png
[17:23:56] nice :D
[17:28:34] December of hell is also clear in the graph
[17:39:04] :)
[18:13:53] Traffic, AutoWikiBrowser: 429 "too many requests" while requesting "what transcludes page" for many templates in AWB - https://phabricator.wikimedia.org/T414214#11584540 (DavidBrooks) Already mentioned to @Reedy: the current official release //should// quietly retry any HTTP error after a delay that...
[18:30:17] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11584635 (wiki_willy) Hey @RobH - did Jin say what kind of initial troubleshooting he did? Like did he do a power drain, reseat certain parts, etc? I think we can go ahead and purchase parts to se...
[18:34:24] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11584652 (RobH) He did full troubleshooting with photos with me in a google chat, it included the following: * confirming the power ports on the PDU towers were outputting power * confirming the pow...
[18:36:18] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11584657 (RobH) We cannot purchase replacement parts via the Dell website I linked in, which is the Dell site linked when you try to open a case without warranty support. The alternative is to try...
[18:41:06] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11584686 (wiki_willy) Sounds good @RobH, that plan works for me as well. Do you know if Jin has access to any of these parts by any chance? If he is able to get a hold of them, he could just add t...
[18:42:16] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11584697 (RobH) I'll ask, also going to ask in dc ops meeting if anyone has a spare r450 they can crack open to check for the part # of the power distribution board and the mainboard.
[19:25:19] netops, Infrastructure-Foundations: access request - read-only access to pfw's for Avishua Stein (astein) - https://phabricator.wikimedia.org/T413826#11584880 (AStein-WMF) i regenerated my public key- here it is: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPKOcE4nDmVZiJBqTCCEIEfmJn9YLf1Sb/h4l2rQf6Di astein@...
[19:54:11] netops, Infrastructure-Foundations: access request - read-only access to pfw's for Avishua Stein (astein) - https://phabricator.wikimedia.org/T413826#11584949 (cmooney) Resolved→Open Re-opening to deal with the change request
[21:27:49] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11585160 (RobH) Ok, Jenn checked inside the R450 and it is indeed a stand alone power distro board. {F71676229} {F71676231} John might have two hosts abandoned from T342455 (he is checking) and...
[22:21:39] Traffic, Security-Team, WMF-General-or-Unknown, ContentSecurityPolicy, Patch-For-Review: Add restrictive CSP to upload.wikimedia.org - https://phabricator.wikimedia.org/T117618#11585426 (sbassett) @TheDJ - Thanks for all of those test cases! These make sense to me, especially the first two....
[22:30:21] Traffic, Security-Team, WMF-General-or-Unknown, ContentSecurityPolicy, Patch-For-Review: Add restrictive CSP to upload.wikimedia.org - https://phabricator.wikimedia.org/T117618#11585450 (Bawolff) > So that definitely has some upload.w.o resource calls, but when I check various console output...