[00:01:50] Traffic, MobileFrontend, Operations, TechCom-RFC, Readers-Web-Backlog (Tracking): Remove .m. subdomain, serve mobile and desktop variants through the same URL - https://phabricator.wikimedia.org/T214998 (Koavf) Sounds like a bug.
[08:02:30] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache upload nodes in codfw - https://phabricator.wikimedia.org/T226637 (ops-monitoring-bot) Script wmf-auto-reimage was launched by ema on cumin2001.codfw.wmnet for hosts: ` ['cp2018.codfw.wmnet'] ` The log can be found in `...
[08:23:13] Traffic, Operations, Goal, Patch-For-Review, User-fgiunchedi: Deprecate python varnish cachestats - https://phabricator.wikimedia.org/T184942 (fgiunchedi)
[08:40:41] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes in codfw - https://phabricator.wikimedia.org/T226637 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['cp2018.codfw.wmnet'] ` and were **ALL** successful.
[08:59:52] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache upload nodes in codfw - https://phabricator.wikimedia.org/T226637 (ops-monitoring-bot) Script wmf-auto-reimage was launched by ema on cumin2001.codfw.wmnet for hosts: ` ['cp2020.codfw.wmnet'] ` The log can be found in `...
[09:37:39] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes in codfw - https://phabricator.wikimedia.org/T226637 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['cp2020.codfw.wmnet'] ` and were **ALL** successful.
[11:44:39] Traffic, Operations, ops-esams: cp3037 is currently unreachable - https://phabricator.wikimedia.org/T222041 (ema) >>! In T222041#5295444, @Joe wrote: > Can someone start the decommission process? this host shows up in things like debdeploy runs or cumin runs and that's distracting.
+1
[11:59:42] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache upload nodes in codfw - https://phabricator.wikimedia.org/T226637 (ops-monitoring-bot) Script wmf-auto-reimage was launched by ema on cumin2001.codfw.wmnet for hosts: ` ['cp2022.codfw.wmnet'] ` The log can be found in `...
[12:55:19] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes in codfw - https://phabricator.wikimedia.org/T226637 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['cp2022.codfw.wmnet'] ` and were **ALL** successful.
[12:56:16] Traffic, Analytics, Operations: Size of headers processed by varnish? - https://phabricator.wikimedia.org/T198152 (ema) >>! In T198152#4324161, @ema wrote: > Both [[ https://varnish-cache.org/docs/5.1/reference/varnishd.html#http-req-hdr-len | varnish ]] and [[http://nginx.org/en/docs/http/ngx_http_c...
[13:33:11] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache upload nodes in codfw - https://phabricator.wikimedia.org/T226637 (ops-monitoring-bot) Script wmf-auto-reimage was launched by ema on cumin2001.codfw.wmnet for hosts: ` ['cp2024.codfw.wmnet'] ` The log can be found in `...
[14:14:12] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes in codfw - https://phabricator.wikimedia.org/T226637 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['cp2024.codfw.wmnet'] ` and were **ALL** successful.
[14:19:41] Traffic, Operations, Performance-Team, TechCom-RFC, and 4 others: Serve Main Page of WMF wikis from a consistent URL - https://phabricator.wikimedia.org/T120085 (Krinkle) @Ladsgroup "[Under discussion](https://www.mediawiki.org/wiki/Requests_for_comment/Process#Review_process)" here merely means...
[14:37:08] Traffic, MobileFrontend, Operations, TechCom-RFC, Readers-Web-Backlog (Tracking): Remove .m.
subdomain, serve mobile and desktop variants through the same URL - https://phabricator.wikimedia.org/T214998 (Krinkle) >>! In T214998#5298484, @DIKW_Pyramid wrote: > If I use desktop browser - how I...
[14:56:59] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache upload nodes in codfw - https://phabricator.wikimedia.org/T226637 (ops-monitoring-bot) Script wmf-auto-reimage was launched by ema on cumin2001.codfw.wmnet for hosts: ` ['cp2025.codfw.wmnet'] ` The log can be found in `...
[15:02:20] elukey: I can't see anything wrong with that, need to look at something else but will try to take another look later
[15:04:47] heya, i have confused myself about some varnish cache settings.
[15:05:19] https://github.com/wikimedia/puppet/blob/production/hieradata/role/common/cache/text.yaml#L91-L94
[15:05:40] be_opts
[15:05:54] means that the max_connections: 25 is for the varnish-backend instance -> eventstreams instance, yes?
[15:06:01] and, every cache node has a varnish-backend instance?
[15:06:50] I know I've asked this before, but have forgotten: does the varnish-backend instance in non core DCs (e.g. ulsfo) connect directly to core DC application? or does it connect to a core DC varnish?
[15:07:05] ema: ^?
[15:08:07] ottomata: it's 25 per cache node, and it's only the core DCs that count
[15:08:41] assuming it's active/active (I haven't looked lately), all the clients in eqsin+ulsfo+codfw would get 25x the count of text backends in codfw, and all the clients in esams+eqiad would get 25x the count of text backends in eqiad
[15:08:51] ottomata: you can see how a request is routed through the CDN by looking at the X-Cache response header: https://wikitech.wikimedia.org/wiki/Varnish#X-Cache
[15:08:58] (under normal conditions, assuming an even spread of clients across backends, and assuming we haven't failed out a DC)
[15:09:32] ok. great.
so the be_opt here is specifically for the varnish backend -> application
[15:09:39] as for live text backend counts in the core: there's 10 in codfw and 8 in eqiad
[15:09:45] ottomata: yes
[15:09:54] and, remote DCs hit core DC varnishes, which is why the limit wouldn't matter there
[15:10:01] right
[15:10:02] great great.
[15:10:03] ok thank you.
[15:11:23] in general it won't work very perfectly in all scenarios, to use the varnish be_opts as your primary means of limiting
[15:11:39] for most cases, we set the varnish limit well above normal, just as an upper sanity limit
[15:12:32] there will be scenarios where e.g. we depool the codfw front edge temporarily for operational reasons, and suddenly your global total across all app-facing varnish backends will drop from 450 to 200 (18x25 -> 8x25)
[15:12:47] err "depool the codfw back edge", I guess, but still
[15:20:46] wryeah
[15:21:38] q. would it be possible to route the same x-client IP to the same app backend?
[15:21:44] or is that what we turned off in https://gerrit.wikimedia.org/r/c/operations/puppet/+/439911
[15:21:52] no, right? the hashing before was on URL?
[15:22:08] oh, no
[15:22:10] that won't matter
[15:22:16] the app backend is the LVS service.
[15:22:18] hm ok, nevermind
[15:30:08] netops, Operations, User-fgiunchedi: Add centrallog1001 to syslog servers in network ACLs - https://phabricator.wikimedia.org/T226813 (ayounsi) Open→Resolved a:ayounsi Done. Only needed in analytics and old labs filters.
[15:39:36] Traffic, Operations: Replace Varnish backends with ATS on cache upload nodes in codfw - https://phabricator.wikimedia.org/T226637 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['cp2025.codfw.wmnet'] ` and were **ALL** successful.
[16:08:51] netops, Analytics, Operations, ops-eqiad: Move cloudvirtan* hardware out of CloudVPS back into production Analytics VLAN. - https://phabricator.wikimedia.org/T225128 (Ottomata) @Cmjohnson bump!
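(Editorial sketch of the connection-limit arithmetic from the 15:08 to 15:12 exchange above. The figures, 25 connections per backend with 10 text backends in codfw and 8 in eqiad, are taken from the chat; the function and variable names are hypothetical and do not come from the puppet repository.)

```python
# Hypothetical illustration only: names are invented for this sketch.
# Per-backend limit quoted in the chat (be_opts max_connections).
MAX_CONNECTIONS_PER_BACKEND = 25

# Live text backend counts in the core DCs, as quoted at 15:09:39.
TEXT_BACKENDS = {"codfw": 10, "eqiad": 8}


def global_connection_ceiling(pooled_dcs, backends=TEXT_BACKENDS,
                              per_backend=MAX_CONNECTIONS_PER_BACKEND):
    """Upper bound on connections from app-facing varnish backends
    to the application, summed over the core DCs that are pooled."""
    return sum(backends[dc] for dc in pooled_dcs) * per_backend


# Both core DCs pooled: 18 backends x 25 connections.
print(global_connection_ceiling(["codfw", "eqiad"]))  # 450
# codfw depooled: only the 8 eqiad backends remain.
print(global_connection_ceiling(["eqiad"]))  # 200
```

This matches the "450 to 200 (18x25 -> 8x25)" drop described at 15:12:32, which is why the varnish limit is treated as an upper sanity bound and not as the primary means of limiting.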
[16:09:13] netops, Analytics, Operations, ops-eqiad: Move cloudvirtan* hardware out of CloudVPS back into production Analytics VLAN. - https://phabricator.wikimedia.org/T225128 (Ottomata) a:Cmjohnson Feel free to reassign
[16:58:43] Traffic, Operations, Performance-Team, TechCom-RFC, and 4 others: Serve Main Page of WMF wikis from a consistent URL - https://phabricator.wikimedia.org/T120085 (Ladsgroup) Thanks. The naming confused me and sorry for the mess. IMO, there's three different questions that we should answer: - Sho...
[17:07:12] Traffic, Operations, Performance-Team, TechCom-RFC, and 4 others: Serve Main Page of WMF wikis from a consistent URL - https://phabricator.wikimedia.org/T120085 (jcrespo)
[17:20:45] Traffic, MobileFrontend, Operations, TechCom-RFC, Readers-Web-Backlog (Tracking): Remove .m. subdomain, serve mobile and desktop variants through the same URL - https://phabricator.wikimedia.org/T214998 (DIKW_Pyramid) >>! In T214998#5300227, @Krinkle wrote: >>>! In T214998#5298484, @DIKW_Pyra...
[17:38:38] Traffic, MobileFrontend, Operations, TechCom-RFC, Readers-Web-Backlog (Tracking): Remove .m. subdomain, serve mobile and desktop variants through the same URL - https://phabricator.wikimedia.org/T214998 (Krinkle) >>>! In T214998#5300227, @Krinkle wrote: >>>>! In T214998#5298484, @DIKW_Pyramid...
[18:00:11] Traffic, MediaWiki-extensions-CentralAuth, Operations, Performance-Team (Radar), and 2 others: Consistent HTTP 503 Varnish Error on some urls for some logged-in users (CentralAuth Set-Cookie storm) - https://phabricator.wikimedia.org/T226840 (BBlack) @Anomie / @Legoktm - Can you take a look at th...
[18:03:08] Traffic, Operations, Performance-Team, TechCom-RFC, and 4 others: Serve Main Page of WMF wikis from a consistent URL - https://phabricator.wikimedia.org/T120085 (Krinkle) >>! In T120085#5300982, @Ladsgroup wrote: > [..]
there's three different questions that we should answer: > - Should we defin...
[18:12:04] Traffic, Operations, Performance-Team, TechCom-RFC, and 4 others: Serve Main Page of WMF wikis from a consistent URL - https://phabricator.wikimedia.org/T120085 (Ladsgroup) >>! In T120085#5301256, @Krinkle wrote: > This is not required for the current RFC. MediaWiki supports the required function...
[18:16:09] netops, Analytics, Analytics-Kanban, Operations, ops-eqiad: Move cloudvirtan* hardware out of CloudVPS back into production Analytics VLAN. - https://phabricator.wikimedia.org/T225128 (Ottomata)
[18:19:18] hello traffic team, I'm still having this issue where I upload a new picture on Wikitech, but some (most, including the default one) still show the old one, eg. https://wikitech.wikimedia.org/wiki/Network_design#/media/File:Wikimedia_network_overview.png
[18:20:27] for example this size is the good one: https://upload.wikimedia.org/wikipedia/labs/thumb/5/5f/Wikimedia_network_overview.png/640px-Wikimedia_network_overview.png
[19:05:22] does wikitech even participate in the normal PURGE flow? it may not
[19:05:41] (in which case stale cache entries being updated is just going to be random based on evictions from hotter objects and the passage of time)
[19:06:14] seeing as it's "special" and not part of the prod mediawiki clusters, I would guess it may not purge at all
[19:06:48] (or, maybe it does naive purging of basic page edits, but not the jobqueue-based stuff, and maybe that affects things like purging thumbnail variants of an original image)
[19:07:16] I really don't know, someone would have to do some digging!
[19:11:21] not really an issue for me, but I guess something to at least be aware of
[19:28:51] Traffic, MediaWiki-extensions-CentralAuth, Operations, Performance-Team (Radar), and 2 others: Consistent HTTP 503 Varnish Error on some urls for some logged-in users (CentralAuth Set-Cookie storm) - https://phabricator.wikimedia.org/T226840 (Tgr) The cookie is output in `CentralAuthSessionProvid...
[20:00:33] XioNoX: did you try the normal purge flow? I manually purged both the wiki page and the File: page for the image
[20:00:40] no idea if it fixed it for you ofc
[20:02:17] thanks! but nop, not fixed
[20:16:28] hm, strange. I had luck with the usual ?action=purge workflow on some template changes I just made
[20:16:33] (on wikitech)
[21:23:43] netops, Analytics, Operations, hardware-requests, and 2 others: Upgrade kafka-jumbo100[1-6] to 10G NICs (if possible) - https://phabricator.wikimedia.org/T220700 (wiki_willy) @elukey - just wanted to follow up on this...@RobH will dig around for some quotes and recommendations
[21:23:46] netops, Analytics, Operations, hardware-requests, and 2 others: Upgrade kafka-jumbo100[1-6] to 10G NICs (if possible) - https://phabricator.wikimedia.org/T220700 (RobH) So these are all in warranty until 2020-05-31, so we will want to add in 10G NICs that are covered by Dell's system warranty. I...
[21:29:47] netops, Analytics, Operations, hardware-requests, and 2 others: Upgrade kafka-jumbo100[1-6] to 10G NICs (if possible) - https://phabricator.wikimedia.org/T220700 (RobH)
[21:42:18] Traffic, MediaWiki-extensions-CentralAuth, Operations, Performance-Team (Radar), and 2 others: Consistent HTTP 503 Varnish Error on some urls for some logged-in users (CentralAuth Set-Cookie storm) - https://phabricator.wikimedia.org/T226840 (Tgr) Annoyingly, adding an `X-Wikimedia-Debug` header...
[22:13:25] netops, Analytics, Analytics-Kanban, Operations, ops-eqiad: Move cloudvirtan* hardware out of CloudVPS back into production Analytics VLAN. - https://phabricator.wikimedia.org/T225128 (wiki_willy) @Ottomata - can you reach out to Chris on IRC and schedule a time with him on this one? Sounds...
[22:36:54] netops, Operations, ops-eqiad: (Need By: Sept 30) update RE-S-X6-64G-S in cr[12]-eqiad - https://phabricator.wikimedia.org/T226424 (wiki_willy)
[22:37:29] netops, Operations, ops-eqiad: (Need By: Sept 30) upgrade msw1-eqiad from EX4200 to EX4300 - https://phabricator.wikimedia.org/T225121 (wiki_willy)