[07:57:18] _joe_, bblack: So w.wiki is still redirecting to wikipedia.org :/
[08:20:34] <_joe_> ema: around? I might need some vcl-fu
[08:22:13] _joe_: yep
[08:23:23] <_joe_> ema: is cp1008 up?
[08:23:36] <_joe_> so tl;dr is what legoktm said
[08:23:47] cp1008 is up
[08:24:04] <_joe_> and I think the culprit is if (req.url ~ "^/$")
[08:24:07] <_joe_> sorry
[08:24:15] <_joe_> if (req.url == "/")
[08:24:24] <_joe_> in text-backend misspass_mangle
[08:24:52] <_joe_> it's ok for me to play with cp1008's vcl?
[08:24:59] absolutely
[08:28:16] <_joe_> ok found the solution
[08:28:24] <_joe_> PEBKAC, as usual
[08:31:43] _joe_: tell me more :)
[08:32:20] <_joe_> ema: req.url == "/" vs req.url ~"^/$"
[08:32:47] oh that was it
[08:36:14] <_joe_> uhm
[08:36:26] <_joe_> I am not sure that's the case, actually
[08:39:39] <_joe_> ema: is restarting vernish going to wipe the cache on cp1008?
[08:39:44] <_joe_> *varnish
[08:40:10] _joe_: restarting the frontend would wipe the cache, yes. Why do you want to do that?
[08:40:35] <_joe_> ema: because I am not sure if the issue legoktm was seeing was due to caching of the url or not
[08:41:05] go for it then
[08:41:17] km@km-tp ~> curl -I "https://w.wiki/" | grep Location
[08:41:17] Location: https://www.wikipedia.org/
[08:43:09] _joe_: sudo rgrep UrlShortener /etc/varnish does not return any match on cp3042
[08:43:48] <_joe_> because it's just on the misspass_mangle
[08:44:02] <_joe_> which is only on direct caches (i.e. eqiad)
[08:45:00] <_joe_> so I'd first try to purge w.wiki/
[08:45:04] <_joe_> from the varnishes
[08:45:14] <_joe_> because I feel it's a caching problem
[08:45:16] have you tried on cp1008 already?
[08:45:32] <_joe_> yes
[08:45:44] <_joe_> cp1008 works well with the current vcl too
[08:45:55] OK
[08:46:05] note that:
[08:46:06] curl -I https://w.wiki/x
[08:46:20] should this redirect somewhere?
[08:46:56] or what is the w.wiki/$code mentioned in the ticket?
[08:47:05] because I get a 404 instead
[08:47:19] <_joe_> ema: just https://w.wiki/ should not redirect to wikipedia.org
[08:47:28] <_joe_> w.wiki/something is a redirecting url
[08:47:37] <_joe_> so 404 is ok if it's not been registered
[08:48:15] alright!
[08:49:04] <_joe_> I am going to ban the url specifically on one server
[08:50:40] ema: https://w.wiki/2 is a working short url
[08:55:03] legoktm: but that also sends me to https://www.wikimedia.org/
[08:56:27] <_joe_> ok we found another issue :)
[08:56:30] ema: confusingly, that is the expected target for that short url :)
[09:02:19] ema: https://gerrit.wikimedia.org/r/295632 is what _joe_ and I came up with
[09:15:24] legoktm: looks good! I'm trying to come up with a VTC test
[09:23:46] legoktm: VTC test added to https://gerrit.wikimedia.org/r/#/c/295632/
[09:26:01] woot, thanks :D
[09:26:28] and it even passes!
[12:12:02] 07HTTPS, 10Traffic, 06Operations, 10Wiki-Loves-Monuments: configure https for www.wikilovesmonuments.org - https://phabricator.wikimedia.org/T118388#2401704 (10Dzahn) Yay! Thank you!
[12:47:36] ?
[12:48:08] so was it caching issues?
[12:52:35] <_joe_> bblack: yes
[12:52:50] <_joe_> bblack: we had a redirect w.wiki => www.wikipedia.org
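(A minimal sketch of how such a cached redirect could be banned from one cache host, assuming Varnish 4's varnishadm CLI and the instance naming seen later in this log, where the frontend runs as -n frontend and the backend uses the default instance name; the exact ban expression is an assumption, not the command that was actually run. As discussed just below, the per-site order matters: origin-most backends first, frontends last.)

    # Ban the cached object for w.wiki/ on one cache host (sketch):
    # req.* ban conditions are evaluated lazily at the next lookup.
    varnishadm ban "req.http.host == w.wiki && req.url == /"                # backend instance
    varnishadm -n frontend ban "req.http.host == w.wiki && req.url == /"   # frontend instance
    # Verify from the outside afterwards:
    curl -sI "https://w.wiki/" | grep Location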
[12:52:51] so did you ban it everywhere already?
[12:52:54] <_joe_> yes
[12:52:57] ok
[12:53:12] <_joe_> I did eqiad => codfw => esams => ulsfo
[12:53:23] <_joe_> actually, swap esams and codfw there
[12:53:40] <_joe_> I think I respected the chain of dependencies
[12:58:39] 10Traffic, 06Operations: Support TLS chacha20-poly1305 AEAD ciphers - https://phabricator.wikimedia.org/T131908#2401926 (10BBlack) [[ https://tools.ietf.org/html/rfc7905 | RFC 7905 ]] is published! Now we just need a released version of openssl 1.1.x :) We could test a build of openssl's master branch on cp1...
[13:08:36] yeah
[13:09:20] for cache_text it's currently: eqiad -> [esams, codfw -> [ ulsfo ]]
[13:09:39] as long as eqiad is first, and ulsfo is after codfw, it works
[13:10:15] (and frontend at any given site after its own backends, but usually I just do all the global frontends at once after finishing with the backends)
[13:16:07] 10Traffic, 06Operations: Support TLS chacha20-poly1305 AEAD ciphers - https://phabricator.wikimedia.org/T131908#2401975 (10MoritzMuehlenhoff) 1.1.0~pre5 is in Debian experimental. It has quite some API changes, though. https://wiki.openssl.org/index.php/1.1_API_Changes In a rebuild of the Debian archive over...
[13:29:05] 10Traffic, 06Operations: Support TLS chacha20-poly1305 AEAD ciphers - https://phabricator.wikimedia.org/T131908#2402015 (10BBlack) Yeah it's going to be a big transition. I've seen openssl-1.1.x-related patches in nginx master though (which is basically what we're running), so I'm crossing fingers that nginx...
[13:49:25] 10Traffic, 06Operations, 13Patch-For-Review: Investigate TCP Fast Open for tlsproxy - https://phabricator.wikimedia.org/T108827#2402049 (10ema) The initial portion of the 3WHS can be used to check whether a remote TCP server supports TFO. For example, with [[https://github.com/secdev/scapy/ | scapy]]: ``` f...
[14:18:27] 10Traffic, 06Analytics-Kanban, 06Operations, 13Patch-For-Review: Verify why varnishkafka stats and webrequest logs count differs - https://phabricator.wikimedia.org/T136314#2402115 (10elukey) Finally got the root cause of the VSL timeouts after a chat with Varnish devs. The Varnish workers use a buffer to...
[14:29:41] question for the vk timeout issue - would it be worth discussing https://www.varnish-cache.org/docs/4.0/reference/varnishd.html#vsl-buffer ?
[14:30:21] as far as I can see we use the default 4k and vsl_reclen=2048
[14:43:47] elukey: yeah I brought that up some time ago, raising both that and the overall shmlog size to avoid various kinds of overflow scenarios
[14:44:25] we should still have solutions that handle those overflows appropriately, though, so I think we should find solutions that work at today's tuning values + heavy traffic (e.g. PURGE), and then raise it (after which it will be hard to reproduce)
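(For reference, a sketch of how the VSL parameters in question could be inspected, and of where the larger values floated below would go; this assumes Varnish 4 runtime parameters vsl_buffer, vsl_reclen and vsl_space, and the sizes are only the ones being discussed, not agreed values.)

    # Inspect current VSL-related parameters on a running instance:
    varnishadm param.show vsl_buffer
    varnishadm param.show vsl_reclen
    varnishadm param.show vsl_space
    # Hypothetical larger values (64K buffer, ~16K records, bigger shm log);
    # these would normally go into the varnishd startup options, e.g.:
    #   varnishd ... -p vsl_buffer=64k -p vsl_reclen=16k -p vsl_space=1G
    # vsl_space is allocated under /var/lib/varnish, so raising it a lot also
    # means growing the tmpfs mounted there.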
[14:46:28] if you search channel logs, that part of the past conversation starts about here:
[14:46:31] 2016-06-07 15:53:16 ema: perhaps it's simply that our shm log size is too small for the rate
[14:46:47] ah I completely lost that conversation
[14:46:57] the conclusion part being, basically:
[14:46:59] 2016-06-07 16:07:27 IMHO in v4 we should waste memory on getting these right
[14:47:03] 2016-06-07 16:08:23 set vsl_buffer to something like maybe 64K, and vsl_reclen at ~16K (perhaps 16K - 12, in case that affects efficiency)?
[14:47:05] 2016-06-07 16:09:20 and then bump vsl_space a bunch so it's never an issue
[14:47:08] 2016-06-07 16:09:48 if we bump it a whole lot, we'll have to raise the size of the tmpfs we mount on /var/lib/varnish too
[14:47:11] 2016-06-07 16:10:05 in general we have the RAM for it though
[14:47:14] 2016-06-07 16:10:31 our worst remaining caches have ~96G RAM
[14:48:34] and also:
[14:48:35] 2016-06-07 16:23:05 at the same time, varnishapi should handle overflows and I'm not sure it does that now :)
[14:48:38] 2016-06-07 16:23:23 then we know for sure that overflow isn't an issue in functional or perf terms, and the only real cost is the wasted 1-2GB of memory, and whatever we waste for 64K vsl_buffer * n_threads
[14:48:43] 2016-06-07 16:23:41 varnishapi should handle overflow, too. fix that first while it's easier to reproduce :)
[14:49:13] the last bit being: let's not fix this until we're sure our clients handle these cases well, because once we bump the sizes it's going to be hard to reproduce, but will still be a rare corner-case.
[14:49:32] makes sense
[14:50:37] one thing that I want to try is to have two or more varnishlog with different -T parameters running on the same host, filtering for VSL:Timeout
[14:50:58] to see what really changes in the current misc scenario
[14:51:05] but it is not of course a definitive solution
[14:51:40] my fear is that vk will suffer a lot when we'll migrate upload and text
[15:49:14] I have a tmux session on cp300[89] with varnishlog -g request -q 'VSL:timeout' -n frontend and varnishlog -g request -q 'VSL:timeout' -n frontend -T 600
[15:49:42] fyi :)
[15:51:00] ok
[16:24:54] bblack: I'd go ahead with a brief TFO test in esams if it's OK with you
[16:25:18] just setting fastopen=100 and seeing what I get back for a couple of minutes
[16:27:44] ema: yeah I think it's fine. if we assume not much adoption, 100 won't limit us really, and is low enough for protection.
[16:27:52] btw did you see netstat has the info on it?
[16:28:18] bblack: yes! But I got confused by the UI
[16:28:26] counters set to 0 are not returned
[16:28:36] there's a CLI hack to make it easier
[16:28:46] grep '^TcpExt:' /proc/net/netstat | cut -d ' ' -f 87-92 | column -t
[16:28:55] (from https://bradleyf.id.au/nix/shaving-your-rtt-wth-tfo/ )
[16:28:59] I ended up doing exactly that
[16:29:39] but with 90-95
[16:29:47] grep '^TcpExt:' /proc/net/netstat | cut -d ' ' -f 90-95 | column -t
[16:30:20] yeah 90-95 seems correct for TFO stuff
[16:30:48] yeah probably more stuff was added since that article
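(Since the TcpExt field offsets in /proc/net/netstat shift as kernels add counters, a name-based extraction is less fragile than the fixed cut -d ' ' -f 90-95 above; a minimal sketch, relying only on the fact that the first TcpExt: line holds the counter names and the second the values.)

    awk '/^TcpExt:/ {
           if (!n) { split($0, h); n = 1 }
           else    { for (i = 2; i <= NF; i++)
                       if (h[i] ~ /^TCPFastOpen/) print h[i], $i }
         }' /proc/net/netstat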
[17:19:11] really hard to test this stuff! Once your kernel decides it's not going to send TFO to a certain host for whatever reason there is no way to make it change its mind
[17:22:46] except for rebooting the machine, which sucks :)
[17:22:55] but hey, we have VMs
[17:23:16] :)
[17:23:42] 19:17:08.163693 IP 192.168.1.27.60130 > 91.198.174.192.443: Flags [S], seq 3505098562, win 29200, options [mss 1460,sackOK,TS val 4294910684 ecr 0,nop,wscale 7,exp-tfo cookiereq], length 0
[17:23:46] 19:17:08.201287 IP 91.198.174.192.443 > 192.168.1.27.60130: Flags [S.], seq 1964587431, ack 3505098563, win 28960, options [mss 1432,sackOK,TS val 494198732 ecr 4294910684,nop,wscale 9,exp-tfo cookie 2d274b0e3d500cf2], length 0
[17:23:50] maybe that's why we don't see much traffic in testing: maybe there are real-world clients that will do it by-default, but they won't even try when we turn it on until they've all rebooted or lost some long-term negative cache entry
[17:25:10] right
[17:25:42] especially if the SYN+TFO gets dropped by some middleboxes and a SYN retransmission happens
[17:26:02] in that case the client caches the fact that TFO should not be used
[17:26:34] but I couldn't find any way to extract this info
[17:26:55] yeah there's no way to really know, except leave it on for a few weeks and see if the numbers start ramping in or not
[17:28:10] then we face a second problem, of course: if we initially set fastopen=100 to be reflection-conservative, and we find over the first week or two that a bunch of Edge or Safari clients will eventually learn to use it with us, they might quickly hit the 100 limit and cause tfo-disable on our end, and then re-disable themselves when the protection kicks in :)
[17:28:47] but I think that's a more easy going protection in that case
[17:29:08] as in, the cookie request gets ignored and the 3WHS continues as normal
[17:29:42] oh, is that not what happens when we have it fully disabled-by-default on our end? we actually drop or reject SYN+TFO instead?
[17:30:24] yes, fully disabled-by-default means that the TFO part gets ignored by the server
[17:30:31] unless somebody drops the SYN
[17:30:49] in that case the SYN gets re-transmitted without TFO
[17:31:02] and the decision cached
[17:31:14] well ignoring middlebox problems, though: you're saying the default state we've been in, we ignore TFO and complete 3WHS?
[17:31:26] that's my understanding of the RFC
[17:31:55] so what's the real difference between default-tfo-disable and getting disabled by the fastopen= limit?
[17:32:17] I think none, from the handshake point of view
[17:32:27] either way they're only going to cache it if a middlebox drops SYN+TFO completely
[17:32:55] sucks for them if 1/1000000 random packet loss causes them to cache TFO disable pointlessly
[17:33:20] yup
[17:33:41] but you apparently ran into it earlier: "Once your kernel decides it's not going to send TFO to a certain host for whatever reason there is no way to make it change its mind"....
[17:33:57] was that from middlebox interference, or just because we have it disabled-by-default on most caches currently?
[17:34:23] I was thinking about that exactly. Perhaps also the default-tfo-disabled part gets cached then
[17:34:38] *or* I could have sent funky SYNs with scapy
[17:34:42] if so, then the =100 limit getting tripped will cause caching of disabled too
[17:35:23] the tcpdump trace pasted before is a real tfo client, not a crafted SYN
[17:35:43] the crafted SYN thingie works fine with google servers, but perhaps doesn't with LVS?
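(A rough sketch of the client side of such a test without rebooting the VM each time. The sysctl value and the counter check are standard; --tcp-fastopen needs a recent-enough curl build, and whether clearing the per-destination tcp_metrics entry actually resets the kernel's cached "no TFO to this host" decision is an assumption worth verifying; 91.198.174.192 is just the address from the trace above.)

    # Client side: bit 0 of tcp_fastopen enables outgoing TFO.
    sudo sysctl -w net.ipv4.tcp_fastopen=1
    # First request should ask for a cookie, the second should carry data in the SYN.
    curl -sI --tcp-fastopen "https://w.wiki/" >/dev/null
    curl -sI --tcp-fastopen "https://w.wiki/" >/dev/null
    grep '^TcpExt:' /proc/net/netstat | cut -d ' ' -f 90-95 | column -t
    # Possibly an alternative to rebooting (assumption: the negative TFO
    # decision is stored in the tcp_metrics cache):
    ip tcp_metrics show 91.198.174.192
    sudo ip tcp_metrics delete 91.198.174.192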
[17:36:04] I've seen at least one trace where a retransmission happened, the first time with TFO and second without [17:44:52] yes indeed! the second SYN gets retransmitted (just tried rebooting the VM) [17:45:22] so getting the cookie works fine, whereas successive SYNs+cookie get dropped somewhere apparently [17:46:13] wireshark labels the second SYN as "continuation data", that's why I've missed it in the capture a while ago [17:47:47] but who's doing that? only way to find out is to reboot the VM and sniff on the load balancer I guess [17:53:47] yeah [17:53:56] I'd bet on LVS :) [17:54:27] it's tricky, LVS doesn't do "real" TCP states, so it's not like just turning on TFO there would fix it, either. probably needs ipvs kernel patches to allow SYN+TFO to flow? [17:55:05] if LVS was proxying both directions maybe it could do the TFO, but with DR it would never see the set-cookie, basically. [17:55:07] I thought it would simply forward the packet as it is to the real server? [17:55:36] yeah but it does some sanity-checks. I doubt it would trip on the extension flag, but it probably trips on the actual accompanying the SYN? [17:55:46] heh [17:55:51] I meant to say: [17:55:58] "on the actual data accompanying the SYN" [17:56:30] TCP allows data in SYN in theory [17:56:48] yeah but until TFO I don't think it was useful in common practice, right? [17:57:06] also without extensions, with the limitation that the data cannot be delivered to the application till 3WHS completiom [17:57:11] so yeah, partially useful [17:57:41] s/partially/not very/ :) [17:58:52] I guess whether you sent that first data with SYN or with the ACK response to SYNACK, either way it's delivering to the app at the same time. doing it early in that case just causes extra memory buffer needs on the server, and probably is worse-performing on average since there's been no chance to start working on RTT, MSS cwnd, etc... [17:59:33] right, all problems that TFO tries to fix [17:59:56] the no_metrics_save:0 experiment today, even if it doesn't make much diff for normal connections, will probably be substantially more-helpful with TFO clients. [17:59:58] by allowing data delivery before the end of the handshake and caching stuff like MSS for example [18:00:35] I'm afraid it might be time for me to find out how LVS actually works :) [18:00:54] because a quick look at the trace leaves me puzzled, there must be some evil black magic there [18:01:17] it's full of magical ponies that someone genetically engineered 20 years ago and didn't write down anything while they were working [18:01:35] \o/ [18:01:36] most of the docs are completely-missing or so horribly outdated that they give you bad ideas [18:03:31] gut feeling: either it doesn't like data in SYN or it doesn't like certain TCP options for whatever funny reason [18:10:22] wait [18:10:24] it's not LVS [18:10:36] it must be something between my flat and esams [18:10:45] heh [18:10:52] from my vps everything works [18:11:17] so either your ISP hates progress, or your home router does? :) [18:11:34] I would certainly not be surprised if both were true [18:12:26] interesting example though of why they've put so much care in ensuring reasonable fallbacks [18:12:42] yeah [18:13:07] I'm gonna start calling middleboxes meddleboxes [18:13:20] but: why does everything work with google servers? 
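(On the "sniff on the load balancer" idea above: one way to narrow down where the SYN+data gets eaten is to run the same capture on the client, the LVS box and the realserver and compare; a rough sketch, with the interface name as an assumption. As in the trace pasted earlier, the TFO option shows up as exp-tfo in the options list, and a non-zero length at the end of the line means data was carried in the SYN.)

    # Capture pure SYNs (SYN set, ACK clear) towards port 443:
    sudo tcpdump -ni eth0 'dst port 443 and tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn'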
[18:13:20] but: why does everything work with google servers?
[18:13:49] "Let's enforce arbitrary sanity-rules based on whatever we usually see on the wire, rather than the RFCs, and let's bake that into device firmware that's widely-deployed and almost never updated!"
[18:13:49] perhaps other types of meddleboxes?
[18:14:34] who knows re: google. maybe their implementation has better magic to work around it.
[18:14:56] or maybe they paid large sums of money to home router vendors and ISPs to whitelist Google's networks from all kinds of "protections" :)
[18:17:10] the fact that the internet still kind of works pretty much fine will never cease to amaze me
[18:17:48] every time we get money out of an atm successfully we should be extremely happy
[18:19:07] https://packetpushers.net/tcp-fast-curious-a-look-at-tcp-fast-open/
[18:20:20] apparently F5 just released a software update to allow TFO in late 2015, too: https://devcentral.f5.com/articles/faster-tcp-f5-now-supports-tcp-fast-open
[18:20:51] http://www.winbeta.org/news/microsoft-edge-receives-tcp-fast-open-support-enhanced-security
[18:21:01] ^ says the Edge support requires setting a non-default flag currently
[18:21:14] "It is currently only accessible in the about:flags settings for Windows Insider Preview builds 14352 and higher, and will be used for EdgeHTML 14.14361 and higher."
[18:21:49] I donno what that really means in microsoft-speak. maybe it means the flag will be on by default in Edge release versions >= whatever contains EdgeHTML 14.14361?
[18:22:19] I think I've read something different
[18:22:37] the winbeta article's like a week old
[18:22:52] https://developer.microsoft.com/en-us/microsoft-edge/platform/changelog/desktop/14352/
[18:23:05] > Experimental support for TCP Fast Open is enabled by default in this build, but the about:flags setting does not function correctly.
[18:24:19] the public release is way behind that number, though
[18:24:25] only for preview/developers/whatever
[18:25:34] I wonder if it works with wine :)
[18:26:43] ewwww
[18:28:51] would be also good to be able successful TFO connections on the server side, TCPFastOpenActive only gets incremented on the client
[18:29:03] *to be able to see
[18:30:47] anyhow, this preliminary test is encouraging: TFO seems to work fine
[18:31:19] bblack: should we go back to no fastopen in esams or keep it like this for a while?
[18:38:45] I would keep it in esams (or even in all, the risk profile isn't that much different), but keep an eye on stats via salt or something and take note if any significant TFO shows up organically.
[18:38:55] because if it does, we probably have to bump the =100 before we hit it
[18:39:25] ema: ^
[18:40:14] I wouldn't really know how to monitor that except for sniffing/systemtap
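(Short of systemtap, the "via salt" idea above could look something like the sketch below: polling the counters across the cache hosts with the same pipeline used earlier. The 'cp*' target is an assumption, and the fixed field offsets only stay valid across a fleet running the same kernel.)

    # TCPFastOpenPassive counts successful inbound TFO connections;
    # TCPFastOpenListenOverflow would signal that fastopen=N needs a bump.
    sudo salt 'cp*' cmd.run \
        "grep '^TcpExt:' /proc/net/netstat | cut -d ' ' -f 90-95 | column -t"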
[18:41:43] on the server side there is TCPFastOpenCookieReqd but what does it mean? Cookies requested by the machine as a client or requests it got as a server?
[18:42:19] TCPFastOpenActive TCPFastOpenActiveFail TCPFastOpenPassive TCPFastOpenPassiveFail TCPFastOpenListenOverflow TCPFastOpenCookieReqd
[18:42:20] oh, RTFBlogpost
[18:42:22] 0 0 0 0 0 23
[18:42:53] I think tcpfastopenpassive (vs fail) will show us client connections to us using it
[18:43:05] cookiereqd probably is cookie requests (that do nothing if disabled)
[18:43:16] yes they're all explained in https://bradleyf.id.au/nix/shaving-your-rtt-wth-tfo/
[18:43:21] listenoverflow is probably an indicator that fastopen=N needs a bump
[18:45:03] yes
[18:45:09] TCPFastOpenListenOverflow - number of inbound SYN packets that will have TFO disabled because the socket has exceeded the max queue length.
[18:45:27] so if we start seeing that it might be time to increase fastopen=N
[18:46:29] there are already quite a few TCPFastOpenPassive in esams (successful inbound TFO)
[18:48:13] not many, but some :)
[18:49:00] ok, I'll leave it as it is for now and enable fastopen everywhere tomorrow while keeping an eye on TCPFastOpenListenOverflow
[18:50:24] cool :)
[18:50:53] but given how slowly the numbers of successful inbound are increasing I doubt we'll have issues with n=150
[18:51:07] unless chrome decides to enable it by default overnight :)
[22:15:01] 10Traffic, 06Operations: Backend naming in VCL needs to use fqdn+port - https://phabricator.wikimedia.org/T138546#2403529 (10BBlack)
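(For completeness, a sketch of where the fastopen=N knob discussed above would typically live, assuming it is the nginx listen parameter on the tlsproxy side plus the matching kernel setting; the exact config layout and the value 150 are taken from the discussion, not from the deployed configuration.)

    # Kernel: bit 1 enables server-side TFO (3 = client and server).
    sudo sysctl -w net.ipv4.tcp_fastopen=3
    # nginx: fastopen=N caps the per-listener queue of pending TFO requests,
    # i.e. the queue whose overflow shows up as TCPFastOpenListenOverflow:
    #   listen 443 ssl fastopen=150;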