[02:33:20] Traffic, MediaWiki-Engineering, serviceops, Upstream, Wikimedia-production-error: 503 error when edit large size pages on PHP 8.1 - https://phabricator.wikimedia.org/T385395#10589637 (Scott_French) As of ~ 15:40 UTC (Thursday), the traffic migration has returned to the state we rolled back fr...
[03:30:09] FIRING: LVSHighRX: Excessive RX traffic on lvs2013:9100 (eno12399np0) - https://bit.ly/wmf-lvsrx - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs2013 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighRX
[03:35:09] RESOLVED: LVSHighRX: Excessive RX traffic on lvs2013:9100 (eno12399np0) - https://bit.ly/wmf-lvsrx - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs2013 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighRX
[04:51:03] Traffic, collaboration-services, MinT, LPL Essential (LPL Essential 2025 Feb-Mar), Patch-For-Review: MinT: Fails to download models/files from peopleweb.discovery.wmnet - https://phabricator.wikimedia.org/T383750#10589736 (KartikMistry) In progress→Resolved Closing as https://phab...
[10:38:22] Wikimedia-Apache-configuration, serviceops, SRE, Wikimedia-Portals, and 2 others: www.wikipedia.org: prefilling the search box with the "search" URL parameter does not work - https://phabricator.wikimedia.org/T318285#10590297 (elukey) Resolved→Open Hi folks! I am really sorry to ruin the...
[10:54:57] Wikimedia-Apache-configuration, SRE, Wikimedia-Portals, Sustainability (Incident Followup), Wikimedia-production-error: Wikipedia central page (https://www.wikipedia.org) fails to load with Too Many Redirects error - https://phabricator.wikimedia.org/T387549#10590328 (jcrespo) Open→0...
[11:21:52] topranks, XioNoX it looks like lvs5004 has some connectivity issue
[11:22:10] PROBLEM - Host lvs5004 is DOWN: PING CRITICAL - Packet loss = 100%
[11:23:03] or did we suffer any kind of comms issue between alert hosts and eqsin?
[11:24:04] uh.. https://grafana.wikimedia.org/goto/M8vzjNtNR?orgId=1
[11:24:21] * topranks looking
[11:25:13] hmm it seems like lvs5004 didn't lose connectivity within eqsin at the very least
[11:26:27] connectivity looks ok yeah
[11:26:44] but usage has gone to zero it seems ??
[11:27:07] https://grafana.wikimedia.org/goto/prL4jHtNg
[11:27:27] appears to be coming back now
[11:27:37] topranks: see SAL, I depooled the host just in case
[11:27:42] ah ok
[11:27:49] and I've repooled after seeing that everything was OK
[11:28:52] ok yep
[11:29:16] not unlikely there was some brief issue somewhere on the network path that caused some dropped packets
[11:30:14] definitely a little bump in latency at 11:18 maybe something on the Arelion network became congested
[11:30:22] I think we can probably ignore unless it happens again
[11:30:31] seems ok now anyway
[11:30:41] https://www.irccloud.com/pastebin/cEm9F6rQ/
[11:31:40] thanks for doublechecking topranks :D
[11:33:22] np!
[11:35:30] https://grafana.wikimedia.org/goto/ZhFECNpHg?orgId=1
[11:35:39] text@eqsin is definitely getting some ICMP love
[11:41:31] seems to all be coming from two Malaysian ASNs
[11:41:47] https://w.wiki/DErv
[11:43:17] TM TECHNOLOGY SERVICES (AS4788) and TIME dotCom (AS9930)
[13:26:45] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2025.03.01 - 2025.03.21): Add QoS markings to profile Hadoop/HDFS analytics traffic - https://phabricator.wikimedia.org/T381389#10590752 (Gehel)
[13:37:34] Wikimedia-Apache-configuration, serviceops, SRE, Wikimedia-Portals, and 2 others: www.wikipedia.org: prefilling the search box with the "search" URL parameter does not work - https://phabricator.wikimedia.org/T318285#10590993 (Gehel)
[15:26:57] Traffic, Continuous-Integration-Config, Patch-For-Review: Migrate docker-registry.wikimedia.org/releng/operations-dnslint from Buster to Bookworm - https://phabricator.wikimedia.org/T371001#10591350 (hashar) Open→Resolved a:hashar I have rebuild the image with Bookworm and @ssingh sen...
[15:38:16] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: Install and cable Nokia test devices and test servers in codfw - https://phabricator.wikimedia.org/T385217#10591428 (cmooney) Just want to confirm all the links are in place and working (the only ones I have not tested are the 100G t...
[16:09:43] Wikimedia-Apache-configuration, SRE, Wikimedia-Portals, Wikimedia-production-error: Wikipedia central page (https://www.wikipedia.org) fails to load with Too Many Redirects error - https://phabricator.wikimedia.org/T387549#10591538 (jcrespo)
[16:10:29] Wikimedia-Apache-configuration, serviceops, SRE, Wikimedia-Portals, and 3 others: www.wikipedia.org: prefilling the search box with the "search" URL parameter does not work - https://phabricator.wikimedia.org/T318285#10591539 (jcrespo)
[17:22:52] Traffic, collaboration-services, MinT, LPL Essential (LPL Essential 2025 Feb-Mar), Patch-For-Review: MinT: Fails to download models/files from peopleweb.discovery.wmnet - https://phabricator.wikimedia.org/T383750#10591779 (Dzahn) Just to clarify, large files are still downloaded from peop...
[18:12:16] Traffic, collaboration-services, Phabricator, SRE, Patch-For-Review: Phabricator should cache tasks for a few minutes for logged-out users - https://phabricator.wikimedia.org/T274228#10592025 (Dzahn) In this context T240297 also seems relevant. Specifically comments like T240297#5749688 and T...
[20:32:09] FIRING: LVSHighRX: Excessive RX traffic on lvs2013:9100 (eno12399np0) - https://bit.ly/wmf-lvsrx - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs2013 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighRX
[20:42:09] RESOLVED: LVSHighRX: Excessive RX traffic on lvs2013:9100 (eno12399np0) - https://bit.ly/wmf-lvsrx - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs2013 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighRX
[22:25:09] FIRING: LVSHighRX: Excessive RX traffic on lvs2013:9100 (eno12399np0) - https://bit.ly/wmf-lvsrx - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs2013 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighRX
[22:45:09] RESOLVED: LVSHighRX: Excessive RX traffic on lvs2013:9100 (eno12399np0) - https://bit.ly/wmf-lvsrx - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs2013 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighRX