[00:12:43] Traffic, Librarization, MediaWiki-extensions-CentralNotice, Operations, and 3 others: Split GeoIP into a new component - https://phabricator.wikimedia.org/T102848 (Krinkle)
[08:48:34] TIL, we have a user agent policy - https://meta.wikimedia.org/wiki/User-Agent_policy
[08:50:06] yup
[08:50:38] it's what we use to justify hard rate limits on stuff like python-requests hammering some service
[09:49:32] Traffic, Operations: Investigate esams text varnish backend fetch failures - https://phabricator.wikimedia.org/T226375 (ema)
[09:49:40] Traffic, Operations: Investigate esams text varnish backend fetch failures - https://phabricator.wikimedia.org/T226375 (ema) p:Triage→Normal
[10:02:24] gilles: all caches are running a fixed kernel for the SACK DoSes now so T225998 is good to proceed, shall we start with one DC initially by re-enabling SACKs on half the caches in there?
[10:02:24] T225998: Study performance impact of disabling TCP selective acknowledgments - https://phabricator.wikimedia.org/T225998
[10:14:13] moritzm: would be nice, to get some extra certainty that it was the cause of the performance regression
[10:16:07] Traffic, Operations: Investigate esams text varnish backend fetch failures - https://phabricator.wikimedia.org/T226375 (Gilles) I believe the cause is {T226373} The increase is simply proportional to the thumbnail miss increase due to extra WebP requests.
[10:18:45] gilles, ema: ack, I'll re-enable SACKs on half of eqsin's upload and text caches in a bit
[10:19:21] moritzm: +1 thanks!
[12:52:46] netops, Operations, ops-codfw: Setup new msw1-codfw - https://phabricator.wikimedia.org/T224250 (Papaul) @ayounsi let me know when this week you have time for us to replace the old msw. Thanks.
[13:03:27] Traffic, Operations: Investigate esams text varnish backend fetch failures - https://phabricator.wikimedia.org/T226375 (ema)
[13:16:56] netops, Operations: Telia IC-307235 reported down from the eqiad side - https://phabricator.wikimedia.org/T226394 (CDanis)
[13:20:46] Traffic, Operations: Investigate esams text varnish backend fetch failures - https://phabricator.wikimedia.org/T226375 (ema)
[13:27:24] Traffic, Operations, Patch-For-Review: Investigate esams text varnish backend fetch failures - https://phabricator.wikimedia.org/T226375 (ema)
[13:31:48] netops, Operations: Telia IC-307235 reported down from the eqiad side - https://phabricator.wikimedia.org/T226394 (Volans) ` volans@re0.cr1-eqiad> show interfaces diagnostics optics xe-4/2/0 Physical interface: xe-4/2/0 Laser bias current : 39.156 mA Laser output power...
[13:32:36] netops, Operations: Telia IC-307235 reported down from the eqiad side - https://phabricator.wikimedia.org/T226394 (Volans) ` volans@re0.cr1-codfw> show interfaces diagnostics optics xe-5/2/1 Physical interface: xe-5/2/1 Laser bias current : 40.898 mA Laser output power...
[13:36:06] Traffic, Operations, Patch-For-Review: Investigate esams text varnish backend fetch failures - https://phabricator.wikimedia.org/T226375 (ema)
[13:41:03] netops, Operations: Telia IC-307235 reported down from the eqiad side - https://phabricator.wikimedia.org/T226394 (CDanis) Telia reports a 'major outage' and is tracking status of our circuit in case 00993514
[14:01:29] Traffic, Operations, Patch-For-Review: Investigate esams text varnish backend fetch failures - https://phabricator.wikimedia.org/T226375 (ema)
[14:26:45] netops, Operations: Remove access to network gear for Casey Dentinger - https://phabricator.wikimedia.org/T226405 (MoritzMuehlenhoff)
[14:33:20] netops, Operations: Telia IC-307235 reported down from the eqiad side - https://phabricator.wikimedia.org/T226394 (jijiki) p:Triage→Unbreak!
[14:33:51] netops, Operations: Telia IC-307235 reported down from the eqiad side - https://phabricator.wikimedia.org/T226394 (jijiki) Triaged as UBN! even though it is not something we can control
[14:34:34] Traffic, Commons, MediaWiki-File-management, Multimedia, and 2 others: Image thumbnail (cache?) broken on English Wikipedia, e.g. Information.svg, when viewing non-default resolution (e.g. 241px) - https://phabricator.wikimedia.org/T226271 (jijiki) p:Triage→Normal
[14:42:09] netops, DC-Ops, Operations, ops-codfw: Replace cr[1-2].codfw fan filters - https://phabricator.wikimedia.org/T226407 (Papaul)
[14:42:21] netops, DC-Ops, Operations, ops-codfw: Replace cr[1-2].codfw fan filters - https://phabricator.wikimedia.org/T226407 (Papaul) p:Triage→Normal
[15:31:24] netops, DC-Ops, Operations, ops-codfw: Replace cr[1-2].codfw fan filters - https://phabricator.wikimedia.org/T226407 (ayounsi) Feel free to do it anytime.
Doc is on https://www.juniper.net/documentation/en_US/release-independent/junos/topics/topic-map/mx480-maintain-cooling-system.html
[15:31:33] netops, Operations: Telia IC-307235 reported down from the eqiad side - https://phabricator.wikimedia.org/T226394 (CDanis) p:Unbreak!→High it's just one (not-often-used) link down, not a site down; UBN is unnecessary IMO
[15:32:34] netops, Operations: Telia IC-307235 reported down from the eqiad side - https://phabricator.wikimedia.org/T226394 (ayounsi) a:ayounsi
[15:53:13] netops, DC-Ops, Operations, ops-codfw: Replace cr[1-2].codfw fan filters - https://phabricator.wikimedia.org/T226407 (Papaul) Open→Resolved filter replaced on both routers
[15:59:24] netops, Operations, ops-codfw: update RE-S-X6-64G-S in cr[12]-codfw - https://phabricator.wikimedia.org/T226422 (RobH) p:Triage→Normal
[15:59:38] netops, Operations, ops-codfw: update RE-S-X6-64G-S in cr[12]-codfw - https://phabricator.wikimedia.org/T226422 (RobH)
[16:00:29] netops, Operations, ops-eqiad: update RE-S-X6-64G-S in cr[12]-eqiad - https://phabricator.wikimedia.org/T226424 (RobH) p:Triage→Normal
[16:00:41] netops, Operations, ops-eqiad: update RE-S-X6-64G-S in cr[12]-eqiad - https://phabricator.wikimedia.org/T226424 (RobH)
[16:32:04] Traffic, Operations, Performance-Team, Performance: Sometimes pages load slowly for European users (due to some factor outside of Wikimedia cluster) - https://phabricator.wikimedia.org/T226048 (ema) We believe that Varnish fetch failures might be related to this issue, investigation is ongoing T2...
[17:01:52] netops, Operations: Remove access to network gear for Casey Dentinger - https://phabricator.wikimedia.org/T226405 (ayounsi) Open→Resolved User (set as read-only user) removed from all network devices.
[18:15:26] Traffic, Community-Relations, Operations, Performance-Team, Performance: Sometimes pages load slowly for European users (due to some factor outside of Wikimedia cluster) - https://phabricator.wikimedia.org/T226048 (Krinkle)
[18:18:04] Traffic, Community-Relations, Operations, Performance-Team, Performance: Sometimes pages load slowly for European users (due to some factor outside of Wikimedia cluster) - https://phabricator.wikimedia.org/T226048 (Krinkle) @Community-Relations Just a heads-up in case you've heard anything ar...
[20:00:07] Traffic, Community-Relations, Operations, Performance-Team, Performance: Sometimes pages load slowly for European users (due to some factor outside of Wikimedia cluster) - https://phabricator.wikimedia.org/T226048 (Krinkle)
[20:00:19] Traffic, Community-Relations, Operations, Performance-Team, Performance: Sometimes pages load slowly for European users (due to some factor outside of Wikimedia cluster) - https://phabricator.wikimedia.org/T226048 (Krinkle)
[20:01:11] Traffic, Community-Relations, Operations, Performance, Performance-Team (Radar): Sometimes pages load slowly for European users (due to some factor outside of Wikimedia cluster) - https://phabricator.wikimedia.org/T226048 (Gilles)
[20:01:31] Traffic, Operations, Performance-Team, Performance: Study performance impact of disabling TCP selective acknowledgments - https://phabricator.wikimedia.org/T225998 (Gilles) a:Gilles
[20:02:17] Traffic, Community-Relations, Operations, Performance, Performance-Team (Radar): Sometimes pages load slowly for European users (due to some factor outside of Wikimedia cluster) - https://phabricator.wikimedia.org/T226048 (Krinkle) (The task is titled "European users", but more precisely it a...
[20:56:20] Traffic, Community-Relations, Operations, Performance, and 2 others: Sometimes pages load slowly for European users (due to some factor outside of Wikimedia cluster) - https://phabricator.wikimedia.org/T226048 (Legoktm)