[06:10:17] netops, Operations: Telia IC-307235 reported down from the eqiad side - https://phabricator.wikimedia.org/T226394 (ayounsi) Open→Resolved > Tha faulty card replaced and at 2019-06-25 05:41 UTC the circuit recovered and running at the moment , please check and let us know if you have any issue . >...
[07:04:59] netops, Operations, ops-codfw: update RE-S-X6-64G-S in cr[12]-codfw - https://phabricator.wikimedia.org/T226422 (ayounsi)
[07:57:28] cd
[07:57:33] right
[08:03:54] ema: is this a new CS-themed navigation system? In 200 meters, cd right :-P
[08:04:12] ahah yes
[08:04:21] I'm moving around the city
[08:04:43] lol
[08:07:47] Traffic, Community-Relations, Operations, Performance, and 2 others: Sometimes pages load slowly for users routed to the Amsterdam data center (due to some factor outside of Wikimedia cluster) - https://phabricator.wikimedia.org/T226048 (ema)
[08:14:24] netops, Operations, ops-eqiad: update RE-S-X6-64G-S in cr[12]-eqiad - https://phabricator.wikimedia.org/T226424 (ayounsi)
[08:18:41] Traffic, Operations, Performance-Team, Performance: Study performance impact of disabling TCP selective acknowledgments - https://phabricator.wikimedia.org/T225998 (MoritzMuehlenhoff) Breakdown of servers and their config eqsin: Enabled: cp5001-cp5003, cp5007-cp5009 Disabled: cp5004-cp5006, cp50...
[08:26:34] Traffic, Operations, Performance-Team, Performance: Study performance impact of disabling TCP selective acknowledgments - https://phabricator.wikimedia.org/T225998 (Gilles) The results are in, looking at loadEventEnd. | status | median | p90 | p95 | sample size | | SACK enabled | 1282 | 4079 | 6...
[08:28:10] Traffic, Operations, Performance-Team, Performance: Study performance impact of disabling TCP selective acknowledgments - https://phabricator.wikimedia.org/T225998 (Gilles) Hive queries used, for reference: P8650
[08:46:23] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes in eqsin - https://phabricator.wikimedia.org/T226477 (ema)
[08:46:28] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes in eqsin - https://phabricator.wikimedia.org/T226477 (ema) p:Triage→Normal
[09:00:56] gilles: thanks for the performance data, very interesting
[09:00:57] Traffic, DBA, Operations, Patch-For-Review: Framework to transfer files over the LAN - https://phabricator.wikimedia.org/T156462 (jcrespo) transfer.py was modified to add hot mysql backup taking and compression/decompression handling for provisioning. It is still a bit of a clunky mess, and it w...
[09:14:26] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache upload nodes in eqsin - https://phabricator.wikimedia.org/T226477 (ops-monitoring-bot) Script wmf-auto-reimage was launched by ema on cumin1001.eqiad.wmnet for hosts: ` ['cp5001.eqsin.wmnet'] ` The log can be found in `...
[10:17:25] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache upload nodes in eqsin - https://phabricator.wikimedia.org/T226477 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['cp5001.eqsin.wmnet'] ` and were **ALL** successful.
[11:03:01] Traffic, Operations, Performance-Team: Send peering requests to AS with the worst TTFB - https://phabricator.wikimedia.org/T219486 (Gilles) Now that the AS report is collecting more data, I've manually compiled a list of AS we could directly peer with (and don't yet), having checked that we have at l...
[11:48:12] Traffic, Commons, MediaWiki-File-management, Multimedia, and 2 others: Thumbnail rendering of complex SVG file leads to Error 500 or Error 429 instead of Error 408 - https://phabricator.wikimedia.org/T226318 (ema)
[11:48:51] Traffic, Commons, MediaWiki-File-management, Multimedia, and 2 others: Thumbnail rendering of complex SVG file leads to Error 500 or Error 429 instead of Error 408 - https://phabricator.wikimedia.org/T226318 (ArielGlenn) kibana entries for the https://upload.wikimedia.org/wikipedia/commons/thumb/...
[12:30:06] Traffic, Commons, MediaWiki-File-management, Multimedia, and 2 others: Thumbnail rendering of complex SVG file leads to Error 500 or Error 429 instead of Error 408 - https://phabricator.wikimedia.org/T226318 (ema) Note that thumbor is occasionally returning 500 for that object. Hitting ATS to ski...
[12:53:06] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache upload nodes in eqsin - https://phabricator.wikimedia.org/T226477 (ops-monitoring-bot) Script wmf-auto-reimage was launched by ema on cumin1001.eqiad.wmnet for hosts: ` ['cp5002.eqsin.wmnet'] ` The log can be found in `...
[13:32:42] Traffic, Commons, MediaWiki-File-management, Multimedia, and 2 others: Thumbnail rendering of complex SVG file leads to Error 500 or Error 429 instead of Error 408 - https://phabricator.wikimedia.org/T226318 (Gilles) A lot of files fail to render for various reasons, and end up as 429s because we...
[13:57:34] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes in eqsin - https://phabricator.wikimedia.org/T226477 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['cp5002.eqsin.wmnet'] ` and were **ALL** successful.
[15:01:52] Traffic, Operations, Performance-Team: Send peering requests to AS with the worst TTFB - https://phabricator.wikimedia.org/T219486 (ayounsi) Current process is to lookup their email contact in PeeringDB and manually send them an email, CCing our peering@ email.
[15:19:08] Traffic, Operations, Core Platform Team Backlog (Designing), MW-1.34-notes (1.34.0-wmf.6; 2019-05-21), and 6 others: Harmonise the identification of requests across our stack - https://phabricator.wikimedia.org/T201409 (mobrovac)
[15:23:16] netops, Operations: RPKI Validation - https://phabricator.wikimedia.org/T220669 (ayounsi) New proposal after talking to @faidon. The one above focuses on dropping invalids. The one below adds visibility on Invalid (until we drop them) as well as unknown and valids. First push the following to cr4-ulsfo...
[15:34:53] netops, Operations: RPKI Validation - https://phabricator.wikimedia.org/T220669 (JobSnijders) > To be researched: I'm not sure yet if bringing the validator closer to the routers (eg. in POPs) brings significant improvements. If so and once T96852 is solved, we could bring them closer to the POP routers....
[15:43:57] netops, Operations: RPKI Validation - https://phabricator.wikimedia.org/T220669 (JobSnijders) >>! In T220669#5282795, @ayounsi wrote: > Once we're ready to drop invalids everywhere, move the above term to the `BGP_community_actions` policy. I've reviewed the proposed configuration and this looks good t...
[16:58:46] Traffic, Operations, ops-ulsfo: replace ulsfo aging servers - https://phabricator.wikimedia.org/T164327 (RobH)
[17:00:23] Traffic, Operations, ops-ulsfo: replace ulsfo aging servers - https://phabricator.wikimedia.org/T164327 (RobH)
[17:00:28] Traffic, Operations, decommission, ops-ulsfo: decom cp40(09|1[078]) - https://phabricator.wikimedia.org/T178815 (RobH) Stalled→Resolved
[17:56:27] Traffic, DC-Ops, Operations: poll power data for redeployment of esams/knams - https://phabricator.wikimedia.org/T225720 (RobH)
[18:02:39] Traffic, DC-Ops, Operations: poll power data for redeployment of esams/knams - https://phabricator.wikimedia.org/T225720 (RobH) Ok, using the data I've summarized from the above outputs, we have the following rough power draws at peak hours: === power figures for 2019-07-20 === @robh pulled the dat...
[18:42:40] Traffic, Operations, Wikidata, Wikidata-Query-Service, and 2 others: Reduce / remove the aggessive cache busting behaviour of wdqs-updater - https://phabricator.wikimedia.org/T217897 (Addshore) Will this change also get rolled out to 3rd parties using the updater? / Is it in a certain release?
[19:12:13] Traffic, Operations, Wikidata, Wikidata-Query-Service, and 2 others: Reduce / remove the aggessive cache busting behaviour of wdqs-updater - https://phabricator.wikimedia.org/T217897 (Smalyshev) No release yet, but if you check out Updater or WDQS build, you get the same behavior.
[20:12:39] Traffic, Operations, Performance-Team, Performance: Study performance impact of disabling TCP selective acknowledgments - https://phabricator.wikimedia.org/T225998 (Gilles) It does, when your IP gets hashed to a specific Varnish frontend, you get
[20:19:39] Traffic, Operations, Performance-Team, Performance: Study performance impact of disabling TCP selective acknowledgments - https://phabricator.wikimedia.org/T225998 (Gilles) Remember that x-cache headers are read from right to left. Trying this out right now with a clear cache and a clear local st...
[20:21:23] Traffic, Operations, Performance-Team: Send peering requests to AS with the worst TTFB - https://phabricator.wikimedia.org/T219486 (Gilles) So I should do that for that list? Are you ok with me requesting peering from all of these AS? Is there an existing email template?
[20:23:38] Traffic, DC-Ops, Operations: poll power data for redeployment of esams/knams - https://phabricator.wikimedia.org/T225720 (RobH) Summarizing IRC discussion between @bblack and @robh: The R440 CP systems will pull 250W each in our estimates (pulled from live data at peak in eqiad) The R440 lvs/misc/ga...
[22:12:39] Traffic, Operations, Performance-Team: Send peering requests to AS with the worst TTFB - https://phabricator.wikimedia.org/T219486 (faidon) >>! In T219486#5284083, @Gilles wrote: > So I should do that for that list? Are you ok with me requesting peering from all of these AS? > > Is there an existing...
[23:01:59] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes - https://phabricator.wikimedia.org/T226589 (Jdforrester-WMF)
[23:02:31] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes in eqsin - https://phabricator.wikimedia.org/T226477 (Jdforrester-WMF)
[23:02:35] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache upload nodes in esams - https://phabricator.wikimedia.org/T222937 (Jdforrester-WMF)
[23:02:37] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes - https://phabricator.wikimedia.org/T226589 (Jdforrester-WMF)
[23:02:40] Traffic, Operations, Goal, Patch-For-Review: Replace Varnish backends with ATS on cache upload nodes in ulsfo - https://phabricator.wikimedia.org/T219967 (Jdforrester-WMF)