[04:13:12] Traffic, Operations: ATS-BE Lua mitigations for cacheable responses w/ Set-Cookie seemingly not working - https://phabricator.wikimedia.org/T264378 (Tgr) That endpoint are from the [[https://www.mediawiki.org/wiki/API:REST_API/Reference#History|history API]], not Parsoid. Also, I think it is only used by...
[08:16:53] Traffic, Operations: ATS-BE Lua mitigations for cacheable responses w/ Set-Cookie seemingly not working - https://phabricator.wikimedia.org/T264378 (ema) >>! In T264378#6520026, @Tgr wrote: > That endpoint are from the [[https://www.mediawiki.org/wiki/API:REST_API/Reference#History|history API]], not Par...
[08:24:55] Traffic, Analytics-Clusters, Operations: varnishkafka 1.1.0 CPU usage increase - https://phabricator.wikimedia.org/T264074 (elukey) I found https://github.com/varnishcache/varnish-cache/issues/2788 that might be what's happening. The fix is https://github.com/varnishcache/varnish-cache/commit/ed1696e...
[08:34:22] Traffic, Analytics-Clusters, Operations: varnishkafka 1.1.0 CPU usage increase - https://phabricator.wikimedia.org/T264074 (ema) >>! In T264074#6520380, @elukey wrote: > I found https://github.com/varnishcache/varnish-cache/issues/2788 that might be what's happening. The fix is https://github.com/var...
[09:46:50] in https://wikitech.wikimedia.org/wiki/News/Cloud_VPS_2020_Purge#traffic at least diffscan is used
[09:47:04] should I mark the whole project as "in use" ?
[09:47:14] yes
[09:47:51] done, thx
[10:42:01] Traffic, Operations, Technical-blog-posts: Blog post series: the evolution of Wikimedia's Content Delivery Network - https://phabricator.wikimedia.org/T264729 (ema)
[10:44:22] that's interesting
[10:47:31] solene: thank you! :)
[12:44:24] Traffic, Operations: ATS-BE Lua mitigations for cacheable responses w/ Set-Cookie seemingly not working - https://phabricator.wikimedia.org/T264378 (CDanis) >>!
In T264378#6520360, @ema wrote: > @cdanis only noticed the logstash entries mentioned here while looking for clues, I don't think he meant to su...
[13:09:13] Traffic, Operations, Performance-Team (Radar): Elevated latency starting 2020-09-28 - https://phabricator.wikimedia.org/T264398 (Gilles) Seems like the cache is warm now and the host is faster than its peer: {F32375693, size=full}
[13:28:19] ema: bblack: hey, I know you both have been pretty busy with stuff, but was hoping to get some CRs on my various patches soonish. special mention to https://gerrit.wikimedia.org/r/c/operations/puppet/+/631848 which is itself a KR
[13:30:56] cdanis: hello there!
[13:31:11] cdanis: usual boring question: is there a task?
[13:31:37] I don't believe so, no, but I can file one
[13:33:48] cdanis: thanks!
[13:34:04] I'm wondering if we should return a more specific error message
[13:34:33] I thought about that too, wound up feeling unsure but leaning towards not
[13:34:40] per our usual non-policy on such events
[13:35:04] :D
[13:35:09] people behind relatively large NATs are probably gonna get rate limited with 20 rps, aren't they
[13:35:19] they sure are, I'm not sure what's to be done about that
[13:35:46] in my mind we only want to use this switch when things are bad enough that the site would be broken for people anyway
[13:36:00] (and in that case it isn't clear to me that 20rps is a hard enough limit; it all depends)
[13:36:03] right
[13:40:45] cdanis: so I see that, when I added 'public_clouds_shutdown', for some reason that present me does not remember I've decided to make it text-only
[13:41:05] yeah I'm guessing because it had only been an issue there in the past
[13:41:08] see cluster_fe_ratelimit in text-frontend.inc.vcl.erb
[13:41:29] we probably want to move public_clouds_shutdown to the cluster-indep code too
[13:41:32] ok!
[13:41:37] I can write a followup to do that
[13:42:02] I think it makes sense, it's just that the stuff beyond upload hasn't been hit in the same way, and doesn't have quite the same hard concurrency limits
[13:42:26] yup
[13:44:19] cdanis: shall we maybe move the attack_mode code after the node-fetch exception (ie after line 655 in your patch)?
[13:44:47] the node-fetch stuff is stricter in terms of rps so logically it seems reasonable to have it first
[13:45:39] but let me use gerrit for all these excellent points I'm raising
[13:51:41] yeah that's reasonable
[14:12:38] Traffic, Maps, Operations, Product-Infrastructure-Team-Backlog, Epic: Support maps serving for affiliate sites via an allow list - https://phabricator.wikimedia.org/T261694 (CDanis)
[14:12:42] Traffic, Maps, Operations, Wiki-Loves-Monuments (2020): wikimedia.pl returns a HTTP 429 error (let it access varnish maps_domains) - https://phabricator.wikimedia.org/T261506 (CDanis)
[14:13:03] Traffic, Maps, Operations, Wiki-Loves-Monuments (2020): maps.wikilovesmonuments.org returns a HTTP 429 error (let it access varnish maps_domains) - https://phabricator.wikimedia.org/T260520 (CDanis)
[14:13:08] Traffic, Maps, Operations, Product-Infrastructure-Team-Backlog, Epic: Support maps serving for affiliate sites via an allow list - https://phabricator.wikimedia.org/T261694 (CDanis)
[15:07:59] Traffic, Operations, Patch-For-Review: Wikidough: Upgrade to dnsdist 1.5.0 - https://phabricator.wikimedia.org/T263789 (ssingh)
[15:09:30] Traffic, Operations, Patch-For-Review: Deploy Wikidough: Experimental DNS-over-HTTPS (DoH) public resolver - https://phabricator.wikimedia.org/T252132 (ssingh)
[15:09:51] Traffic, Operations, Patch-For-Review: Wikidough: Upgrade to dnsdist 1.5.0 - https://phabricator.wikimedia.org/T263789 (ssingh) Open→Resolved ` sukhe@malmok:~$ /usr/bin/dnsdist --version dnsdist 1.5.0 (Lua 5.2.4) ` Completed upgrade to dnsdist
1.5.0, marking as closed.
[15:28:50] Traffic, Operations: Create a second text-lb IP address for test purposes - https://phabricator.wikimedia.org/T237492 (ayounsi) a:BBlack Might also be fine to remove it now that the testing have been done.
[15:35:38] Traffic, Operations, Technical-blog-posts: Blog post series: the evolution of Wikimedia's Content Delivery Network - https://phabricator.wikimedia.org/T264729 (srodlund) @ema Great! The doc just came through. Looking forward to reading and editing this!
[15:55:42] Traffic, Analytics-Clusters, Operations: varnishkafka 1.1.0 CPU usage increase - https://phabricator.wikimedia.org/T264074 (ema) >>! In T264074#6520414, @ema wrote: > Definitely, please feel free to go ahead if you have the time. Implicit but it's probably better to state it clearly: if you do have...
[15:57:31] Traffic, Maps, Operations, Product-Infrastructure-Team-Backlog, Epic: Support maps serving for affiliate sites via an allow list - https://phabricator.wikimedia.org/T261694 (Urbanecm) Hello, Wikimedia Czech Republic uses maps.wikimedia.org in our new website, which you can preview at https://...
[16:13:02] Traffic, Operations, observability, Patch-For-Review, User-fgiunchedi: Aggregated metrics for ats-tls <-> clients ttfb percentiles - https://phabricator.wikimedia.org/T263536 (fgiunchedi) With 50 percentile added I'm considering this closed! As a demo/playground I've started https://grafana....
[16:13:07] Traffic, Operations, observability, Patch-For-Review, User-fgiunchedi: Aggregated metrics for ats-tls <-> clients ttfb percentiles - https://phabricator.wikimedia.org/T263536 (fgiunchedi) Open→Resolved
[16:44:04] Traffic, Maps, Operations, Product-Infrastructure-Team-Backlog, Epic: Support maps serving for affiliate sites via an allow list - https://phabricator.wikimedia.org/T261694 (bd808) As a #toolforge and #cloud-vps administrator I would like to request that `*.tooforge.org`, `*.wmcloud.org`, and...
[16:45:59] Traffic, Maps, Operations, Product-Infrastructure-Team-Backlog, Epic: Support maps serving for affiliate sites via an allow list - https://phabricator.wikimedia.org/T261694 (CDanis) >>! In T261694#6522265, @bd808 wrote: > As a #toolforge and #cloud-vps administrator I would like to request th...
[17:31:47] Wikimedia-Apache-configuration, Operations, Patch-For-Review: Redirect 2030.wikimedia.org to the new movement strategy portal - https://phabricator.wikimedia.org/T202498 (sguebo_WMF) Hello @Dzahn, can the 2030.wikimedia.org subdomain redirect to the new url: https://meta.wikimedia.org/wiki/Wikimedia_...
[17:55:45] cdanis: could this be related to recent rate limit changes?
[17:55:45] You sent too many messages to this pad so it disconnected you.
[17:55:49] (etherpad)
[17:55:53] no
[17:56:13] I don't believe there's been anything deployed on the traffic side that's not a disabled-by-default contingency plan?
[17:56:29] wasn't there a new etherpad release last week though?
[17:56:50] didn't follow, maybe
[18:20:11] yes, Etherpad was upgraded from 1.8.4 to 1.8.6. full changelog: https://github.com/ether/etherpad-lite/blob/develop/CHANGELOG.md
[18:25:28] maybe I had some bad JS, I reopened in another tab, let's see if it happens again, I had also a JS error in that tab after that
[18:35:47] it's related to caching. try to empty cache and do a hard reload.
I sent a mail about that because we ran into it as well
[18:41:40] yeah I had it as well earlier
[18:41:47] it kept ratelimiting my typing :P
[18:42:23] "IMPORTANT SECURITY: Rate limit Commits when env=production"
[18:42:43] which is fine, maybe, so long as every character typed isn't a commit
[18:43:08] perhaps they tried to batch up commits on the client side to compensate, but there's a narrow window of "just wrong" typing rate
[18:47:03] it also makes some merges of those data structures harder ;)
[18:47:58] "every character is a commit" reminds me of Google Wave
[18:48:56] sukhe: you probably have not yet looked at etherpad DB schema ;)
[18:49:36] sukhe: it's much the same concept
[18:49:47] CRDTs and such
[18:50:57] ah!
[19:01:42] Traffic, Maps, Operations, Product-Infrastructure-Team-Backlog, Epic: Support maps serving for affiliate sites via an allow list - https://phabricator.wikimedia.org/T261694 (Multichill) @CDanis based on the webserver logs we should know what domains give the most hits. Can you share a list of...
[19:01:59] bblack: FYI as I noticed after posting that you were not subscribed: https://phabricator.wikimedia.org/T264273#6522738
[19:11:58] Traffic, Maps, Operations, Product-Infrastructure-Team-Backlog, Epic: Support maps serving for affiliate sites via an allow list - https://phabricator.wikimedia.org/T261694 (CDanis) >>! In T261694#6522739, @Multichill wrote: > @CDanis based on the webserver logs we should know what domains gi...
[20:16:19] Wikimedia-Apache-configuration, Operations: Update 2030.wikimedia.org redirect to new URI - https://phabricator.wikimedia.org/T264797 (Reedy)
[20:18:41] Wikimedia-Apache-configuration, Operations, Patch-For-Review, User-RhinosF1: Update 2030.wikimedia.org redirect to new URI - https://phabricator.wikimedia.org/T264797 (RhinosF1) a:RhinosF1→None Was gonna do a patch but I'll review what exists
[20:18:46] Wikimedia-Apache-configuration, Operations, Patch-For-Review, User-RhinosF1: Update 2030.wikimedia.org redirect to new URI - https://phabricator.wikimedia.org/T264797 (RhinosF1) a:RhinosF1
[20:19:30] Wikimedia-Apache-configuration, Operations, Patch-For-Review: Update 2030.wikimedia.org redirect to new URI - https://phabricator.wikimedia.org/T264797 (RhinosF1)
[20:28:47] Traffic, Varnish, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, and 3 others: Special:HideBanners is not really cacheable - https://phabricator.wikimedia.org/T256447 (Tgr)
[20:38:27] netops, Analytics, Analytics-Kanban, Operations: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (mforns) Hi all! I believe we can use a Refine transform function to add the requested fields (except for BGP communities IIUC) at refine time. Pl...
[20:40:48] netops, Analytics, Analytics-Kanban, Operations: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (Nuria) @mforns: it could also be a second job run after the refined one (similar to how we do virtual-pageviews) as we probably do not want to cre...