[08:51:15] 10Wikimedia-Apache-configuration, 10Wikimedia-Site-requests, 06Wikisource, 07Community-consensus-needed, 13Patch-For-Review: Move oldwikisource on www.wikisource.org to mul.wikisource.org - https://phabricator.wikimedia.org/T64717#2250775 (10Danny_B)
[11:40:23] 10Wikimedia-Apache-configuration: create pk.wikimedia.org and redirect to http://loveforkarachi.org/wp/ - https://phabricator.wikimedia.org/T56780#2251093 (10Saqib) site is now at wikimediapakistan.org please redirect if possible.
[12:07:09] 10Wikimedia-Apache-configuration: create pk.wikimedia.org and redirect to http://loveforkarachi.org/wp/ - https://phabricator.wikimedia.org/T56780#2251111 (10Dereckson) a:03Dereckson Thanks for the update. This domain wikimediapakistan.org has been registered by Wikimedia CH, and @Saqib is group contact for...
[12:31:29] 10Wikimedia-Apache-configuration, 13Patch-For-Review: create pk.wikimedia.org and redirect to http://loveforkarachi.org/wp/ - https://phabricator.wikimedia.org/T56780#2251167 (10Dereckson) 05stalled>03Open p:05Low>03Normal
[12:50:21] 07HTTPS, 10Traffic, 06Operations: Getting ssl_error_inappropriate_fallback_alert very rarely - https://phabricator.wikimedia.org/T108579#1523931 (10fgiunchedi) @DaBPunkt are you still getting the same sporadic error?
[13:20:52] 07HTTPS, 10Traffic, 10MediaWiki-extensions-CodeReview, 06Operations: Provide HTTPS links in CodeReview emails - https://phabricator.wikimedia.org/T31008#2251262 (10fgiunchedi) indeed this looks closable/declined to me, @brion @Krinkle @siebrand ?
[13:28:42] 07HTTPS, 10Traffic, 07Tracking: SSL related (tracking) - https://phabricator.wikimedia.org/T29946#2251284 (10yuvipanda)
[13:28:44] 07HTTPS, 10Traffic, 10MediaWiki-extensions-CodeReview, 06Operations: Provide HTTPS links in CodeReview emails - https://phabricator.wikimedia.org/T31008#2251282 (10yuvipanda) 05Open>03declined Let's do it!
[14:49:18] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Verify maps caching - https://phabricator.wikimedia.org/T133988#2251410 (10Yurik)
[14:52:24] bblack: morning :) anything against me filing a phab task to get new DNS ns hosts in eqiad/codfw/esams and link it in https://phabricator.wikimedia.org/T101525 ?
[14:54:13] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Verify maps caching - https://phabricator.wikimedia.org/T133988#2251425 (10Yurik)
[15:18:48] elukey: maybe better to wait a few days before we start acting on it
[15:19:12] elukey: I've been having some new thoughts about the whole situation, but I haven't really had time to sort them all out
[15:21:12] bblack: ack, I'll wait then :)
[15:21:24] just wanted to start paperwork :)
[15:21:37] 10Traffic, 06Operations: Investigate better DNS cache/lookup solutions - https://phabricator.wikimedia.org/T104442#2251455 (10BBlack)
[15:22:03] elukey: so, I think there's a lot of overlap and re-thinking to do, on all of the inter-related things in:
[15:22:11] https://phabricator.wikimedia.org/T101525
[15:22:15] https://phabricator.wikimedia.org/T96852
[15:22:19] https://phabricator.wikimedia.org/T104442
[15:22:46] spewing out some random incomplete thoughts on the topic:
[15:23:33] 1. ntp + recdns + authdns are all similar in nature: they're critical infra, the protocols have built-in support for multi-server resiliency and all that, they tend to be lightweight on resources
[15:23:47] they're also similar in that they all need public IPs for various different reasons
[15:24:11] they're also similar in that ideally we deploy all of those similarly at every DC, cache DC or primary
[15:24:57] and really, not just for these but for almost all servers we deploy in our "public" networks with public wikimedia.org hostnames: they could/should be on internal networks with the primary hostname/interface in .wmnet, and some other solution should exist to get them the limited public access they need
[15:25:35] maybe also looped into that part of the thinking is linux network namespaces too, as in this ticket: https://phabricator.wikimedia.org/T114979
[15:27:15] for a lot of purposes, LVS-DR is the right answer for the public access. We can put public IPs on LVS that map to internal IPs of service machines, we do this already for many things, notably the cache clusters
[15:28:01] it also helps reinforce the idea that public service IPs should be distinct from hosts. It should never be the case that service hostname "foo" is just a CNAME (or ip-aliased) to server ununpentium or whatever.
[15:28:48] except in the rare circumstances that some service critically needs to be independent from all other infra because we need it in outages (like monitoring)
[15:29:07] but even then, we could assign a second floating service IP for the part that's supposed to face the world
[15:29:45] * elukey nods
[15:29:50] unfortunately inbound traffic (LVS-DR) is only half the equation
[15:30:15] the reason the recdns boxes have public IPs isn't for inbound (after all, they're only offering recdns service to our own hosts), it's because they need outbound access to query other nameservers in the world
[15:30:41] with authdns it's the opposite: they have no need to contact the rest of the world, but the rest of the world needs to contact them
[15:32:32] NTP is a mixed bag, but I don't think we intend to offer public service there. they just need to peer/serve internally, and be able to initiate outbound flows to public sources
[15:33:03] like other stratum servers
[15:34:05] we already place ntp+recdns on the same hosts, a pair of public hosts, in each of eqiad, codfw, and esams.
[15:34:20] (just not ulsfo, which is kind of a hole in even our current state of affairs)
[15:35:44] it seems like there should be a way to arrive at a standardized solution for all sites (including further future cache DCs) that involves 2-3 machines with primary hostnames in private network/DNS, with a handful of public IPs mapped on through LVS-DR and/or via vlan-tagging (as in: the host has a virtual ethernet connection to the public network as non-primary)
[15:36:42] and have those 2-3 machines all share a blender role in puppet that has them handling ntp+recdns+authdns, with the ntp and recdns service IPs on the private network (but able to do their outbound queries on the public), and authdns service IPs on the public network (but only that authdns service there, the "host" itself is on the private network)
[15:37:44] and if we make the recdns part locally-resilient (and that includes in the face of stupid resolver issues as in https://phabricator.wikimedia.org/T104442 ), we can stop having resolv.conf fall back across WAN links when a packet gets dropped
[15:38:24] and in the common case, most of our recdns cache misses would resolve to authdns over the loopback to the authdns server on the same physical box, since they all have virtuals of all the authdns service IPs on lo anyways (or whatever in the new setup)
[15:38:41] (because almost certainly, most of recdns lookups are for our own authdns)
[15:40:38] the net result should be 3 total machines for a very resilient service in 1 DC covering recdns+ntp+authdns (which is currently 3 machines anyways: 1x authdns-only and 2x recdns+ntp)
[15:41:00] and it will perform better, it will be more secure, it will be more failure-resistant, our config gets better, etc
[15:41:37] if we stick with the way we're architecting these things today and push forward on LVS-ing authdns, we're going to make those second machines public hosts, which doesn't align with this "plan"
[15:41:48] but the "plan" is still half-baked, I'm not even sure how all the details work out
[15:49:14] I really don't know if it even needs to be 3 machines, maybe it can be 2x per DC
[15:49:25] the drivers for the excess machine are basically:
[15:50:08] 1. for local ntp servers peering with each other, 3 is better than 2 for tie-breaking when something goes wrong. but then again, each of our local NTP server pools in a DC is going to peer in some way with remote DCs' ntp server pools too.
[15:51:35] I mean it seems like if an NTP box goes haywire due to machine-local problems, the tie-breaker will be remote peers at other DCs + our upstreams' inputs. and if the DC is isolated from the network, that's a rare event that's independent. do we really need to offer perfect NTP in the face of both a local NTP server with a screwed up HW clock and network-isolation at the same time? is it worth it
[15:51:41] ?
[15:52:46] 2. for recdns: in this sort of plan, all of eqiad only has configuration to use eqiad resolver IPs. If we have 2x recdns server IPs in eqiad and 2x boxes, when 1 fails it's kind of a big deal, because we have no resiliency left. Things are still working fine, but now you're one random failure away from chaos.
[15:53:41] 3. for authdns the picture is less-bleak because even if we lost all 2x authdns boxes in a DC, globally things will be ok since we have other authdns servers. and that picture improves further once we do anycast for authdns.
[15:55:09] and really if recdns's lack of a global failover solution is the only good driver for 3x boxes instead of 2x boxes.... let's explore that a bit
[15:55:27] I got your point and it makes sense, will require careful and strategic planning (maybe with sort of deadlines to keep things moving, like it was a goal)
[15:56:01] the reason we don't want resolv.conf pointing across the WAN is it has perf impact on failover due to a single packet dropped, etc. that part kind of makes sense
[15:56:27] but what we could do, is (similarly to authdns plans) anycast our internal recdns service IPs too
[15:56:37] well "anycast" within our own networks, in 10/8 space.
[15:56:45] yep yep sure
[15:57:09] they'd still be LVS'd, but even if we lost both boxes, pybal would drop the route and the routers would send the recdns traffic over the WAN to the next still-working recdns
[15:57:29] better off failing across the WAN that way than with glibc resolv.conf failover logic
[15:58:29] having recdns anycasted internally also solves bootstrapping problems bringing up a fresh new datacenter. the first host you bootstrap doesn't need resolv.conf hacks to work until recdns is working there.
[15:59:29] elukey: yeah I donno if this is in strategic goal territory. it's not like this is something we're totally failing at today.
[15:59:40] ahh just checked /etc/resolv.conf for cp3030, falls back to dns-rec-lb.eqiad.wikimedia.org.
[15:59:59] it's just not ideal how we do things today, and we could do better. and if we're going to wade into changing anything about any of these 3 services, we should at least align on a path to a really good end-point
[16:00:02] (sorry 3010)
[16:02:17] so yeah, I kinda like saying it's just 2 boxes, and both rec+auth -dns are anycasted
[16:02:30] (with recdns svc IPs in private networks)
[16:02:53] and in both rec+auth -dns cases, we do the LVS plan with split traffic and such
[16:03:12] I mentioned "like it was a goal" since it is a lot of work and it seemed to me something like the Varnish 4 migration, namely we are not failing atm but it will be great to have new features unblocked (and better support) when the migration is completed
[16:03:47] e.g. for recdns, there's the anycasted service IPs 10.A.B.C + 10.D.E.F. Those both go to LVS. LVS has both local servers as backends for both, but one of the IPs is 90/10 on serverA/B, and the other is 10/90 on serverA/B
[16:04:13] so both service IPs should keep working if a box fails, and the load is even, and we have constant confirmation both are working for both IPs, etc
[16:04:51] authdns can be similar, we'd have 2x global authdns IPs, and in each DC it would do much like the above to the 2x authdns real servers
[16:06:12] to me the distinction is this is mostly a design problem. once we arrive at the right design (which may require some experimentation!), the actual implementation should be fairly straightforward and quick.
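As an illustration of the split described at 16:03:47 (two anycasted recdns service IPs, each weighted 90/10 the opposite way across the same two local backends), here is roughly how it could look if expressed as raw IPVS/LVS-DR rules. This is only a sketch: the real setup would be driven by pybal/puppet rather than hand-run ipvsadm, and the 10.3.0.x service IPs and 10.64.0.x backend addresses are made-up placeholders.

```
# Hypothetical sketch only -- addresses are placeholders, and in practice
# pybal would program these rules rather than raw ipvsadm.

# First anycasted recdns service IP: weighted 90/10 toward serverA.
ipvsadm -A -u 10.3.0.1:53 -s wrr
ipvsadm -a -u 10.3.0.1:53 -r 10.64.0.11:53 -g -w 90   # serverA (direct routing)
ipvsadm -a -u 10.3.0.1:53 -r 10.64.0.12:53 -g -w 10   # serverB

# Second anycasted recdns service IP: mirrored 10/90 toward serverB.
ipvsadm -A -u 10.3.0.2:53 -s wrr
ipvsadm -a -u 10.3.0.2:53 -r 10.64.0.11:53 -g -w 10   # serverA
ipvsadm -a -u 10.3.0.2:53 -r 10.64.0.12:53 -g -w 90   # serverB
```

Under this scheme either service IP keeps answering if one box dies (the scheduler simply shifts to the surviving backend), the aggregate load stays roughly even, and both IPs continuously exercise both backends, which are the properties described in the 16:04:13 message.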
[16:06:55] and design is something you can background work when you sleep and such :)
[16:08:13] another thing, we could split the (any|rec)dns service IPs onto different LVS classes too
[16:08:36] put the first IP in high-traffic1 and second in high-traffic2, so that even the temporary blip from a single LVS machine stopping doesn't impact both IPs
[16:11:10] anyways, a few more nights to sleep on it, a few more random thoughts, it will eventually coalesce into something I can put in a ticket and get further criticism on at least
[16:11:54] right now there's a lot of fuzzy bits in the picture in my head of all of this
[16:12:02] I'll wait for updates then, and re-read all the conversation to get a better picture and some follow-up questions. Thanks a lot for the explanation :)
[16:13:58] design input and criticism welcome at any point by the way!
[16:14:21] that's how we make good plans, we criticize them harshly and remold them until they can sustain the attack :)
[16:40:43] 10Traffic, 06Operations, 06Performance-Team, 13Patch-For-Review: Support HTTP/2 - https://phabricator.wikimedia.org/T96848#2251633 (10BBlack) = 24H Results: | Set | H/1 | Both | SPDY | H/2 | |--|--|--|--|--| | **All** | 54.75% | 33.17% | 7.07% | 5.01% | | **Text** | 57.58% | 28.72% | 6.99% | 6.72% | | **Up...
[16:42:25] definitely :)
[16:53:31] 10Traffic, 06Operations, 06Performance-Team, 13Patch-For-Review: Support HTTP/2 - https://phabricator.wikimedia.org/T96848#2251659 (10BBlack) While the 24H data is much better quality (not so subject to daily regional highs and lows), the overall picture is still basically the same. There's a lot of inter...
[16:57:52] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Verify maps caching - https://phabricator.wikimedia.org/T133988#2251667 (10Gehel) An active measurement on 800 requests give me a 99%-ile = 180 ms. Not amazing, but not incredibly slow either. If I understand our metrics correctly (https://g...
[17:02:04] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Verify maps caching - https://phabricator.wikimedia.org/T133988#2251670 (10Gehel) a:03Gehel
[17:16:34] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Verify maps caching - https://phabricator.wikimedia.org/T133988#2251738 (10BBlack) There are a number of misunderstandings in this ticket, so let me step through them a bit, and then we can get back to the basics and ask whatever fundamental...
[17:17:06] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Verify maps caching - https://phabricator.wikimedia.org/T133988#2251739 (10BBlack) p:05Triage>03Normal
[17:17:25] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Questions about map tile cache performance - https://phabricator.wikimedia.org/T133988#2251410 (10BBlack)
[17:23:37] bblack: thanks for taking the time to put context on this caching request...
[17:25:28] the cache infrastructure is a big complex thing that's in front of everything and that hardly anyone understands, so there's this org-wide tendency to blame it first for anything that doesn't have any other "obvious" explanation, sometimes to an extreme degree :)
[17:25:57] It is the unknown, and therefore problems with unknown solutions must lie there.
[17:26:59] not that problems don't sometimes exist there, but still :P
[17:28:41] in the case of the maps caches, the varnish config is the simplest one we have. what complexities exist in it are shared with the text and upload clusters and thus are pretty well battle-hardened.
[17:29:04] and it's the only one on varnish4, which I guess you could perceive as both a risk and a benefit, but I'm pretty sure in practice it's mostly just a benefit.
[17:30:16] bblack: It sounded unlikely to me that Varnish was the issue in this context, but the exercise of proving that is a good way to start to understand how all that works...
[17:30:24] yeah
[17:30:42] In the end the premise (it feels slow) is not much to start from...
[17:31:10] there is a lot of magic in the traffic layer, for better or worse.
[17:31:29] That goes back to my love/hate relationship with magic...
[17:31:55] if we ever get outbound HTTPS for varnish working, it will enable plans to add much more magic
[17:32:09] there's some magic optimizations we could totally engineer now that are blocked on that
[17:33:14] (to only attempt what seem to be true misses (ones that at least might return cacheable objects) through all the layers, but have the frontend cache jump directly to the application on pass-only traffic that can't be cached anyways)
[17:34:14] even when it's only dynamically-known that a URL is pass-only (based on response headers), we cache that for 10 minutes, so the walk through the layers to reconfirm would be only once every 10 minutes
[17:35:25] but we lack HTTPS between the cache and the application layers. we do have ipsec between caches at different DCs.
[17:35:44] so to protect cross-dc traffic, all the cross-dc hops have to be through caches, with the final step to the application always inside a single DC.
[17:42:09] I even understood most of what you said!
[17:43:15] Once understood, it does not even sound all that much like magic. We were doing something similar at job^1: HTTPS termination with nginx, varnish behind it, and nginx actually routing directly to the app server for uncacheable traffic
[17:43:27] or at least for some uncacheable traffic.
[17:44:04] yeah what we're doing here in general isn't all that novel I don't think. it's the obvious thing you'd do in our situation. there's just not that many places in our situation.
[17:55:27] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Questions about map tile cache performance - https://phabricator.wikimedia.org/T133988#2251792 (10Yurik) @bblack, awesome explanation, thank you, this clarifies so much! I am observing a very slow load time on the landline connection, while...
[18:12:38] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Questions about map tile cache performance - https://phabricator.wikimedia.org/T133988#2251848 (10BBlack) There's still a lot of missing detail here. What browser/os/version is this? How do I reproduce the same page load? What else is at t...
[18:16:04] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Questions about map tile cache performance - https://phabricator.wikimedia.org/T133988#2251861 (10BBlack) Also, note this line in your original landline ping results: ``` 6. RT.TC2.AMS.NL.retn.net 3.2% 250 126.4 129.4 96.3 2965...
[18:30:24] 10Wikimedia-Apache-configuration, 13Patch-For-Review: create pk.wikimedia.org and redirect to wikimediapakistan.org - https://phabricator.wikimedia.org/T56780#2251880 (10Dereckson)
[19:07:22] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Questions about map tile cache performance - https://phabricator.wikimedia.org/T133988#2251965 (10Yurik) Chrome 50.0.2661.86 (Official Build) (64-bit) on Ubuntu 16.04. I used https://maps.wikimedia.org/#9/50.7060/-100.3725 for both tests, on...
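As a side note on the 17:33–17:34 messages above: "remembering that a URL is pass-only for 10 minutes" is Varnish's hit-for-pass mechanism. A minimal Varnish 4 sketch of the general technique could look like the following; the backend address and the specific header checks are illustrative assumptions, not the production VCL.

```
vcl 4.0;

# Placeholder backend so the sketch is loadable on its own.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # If the backend response says this object cannot be cached, store
    # that decision itself for 10 minutes ("hit-for-pass") so later
    # requests for the same URL skip request coalescing and go straight
    # to pass instead of re-confirming through all the cache layers.
    if (beresp.http.Set-Cookie ||
        beresp.http.Cache-Control ~ "(private|no-cache|no-store)") {
        set beresp.ttl = 600s;
        set beresp.uncacheable = true;
        return (deliver);
    }
}
```

The optimization bblack alludes to builds on this: once a frontend knows a URL is hit-for-pass, it could in principle go directly to the application instead of traversing every cache tier, but that is blocked on having TLS (rather than only inter-cache ipsec) on the cache-to-application hops.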
[19:20:36] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Questions about map tile cache performance - https://phabricator.wikimedia.org/T133988#2251990 (10BBlack) Refreshing after a long pause has to re-establish a connection. If you're comparing to google, then trace your pings to whatever edge I...
[19:32:23] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Questions about map tile cache performance - https://phabricator.wikimedia.org/T133988#2251998 (10Yurik) 05Open>03Invalid @bblack, thanks for looking into this. Google's servers are also ipv6, and their ping response is around 10.5, which...
[19:49:52] 10Traffic, 07Varnish, 06Discovery, 10Kartotherian, and 2 others: Questions about map tile cache performance - https://phabricator.wikimedia.org/T133988#2252027 (10BBlack) 05Invalid>03Resolved Yes, at 10ms that probably means the gmaps endpoint you're hitting is inside of Russia, which is completely dif...