[14:30:42] we just had a burst of memcache requests to mc1034, up to saturation of RX bandwidth
[14:30:53] mcrouter TKOs seem to be all from appservers
[14:31:32] two things happened:
[14:31:43] 1) RX bandwidth saturation for mc1034
[14:31:44] https://grafana.wikimedia.org/d/000000316/memcache?panelId=59&fullscreen&orgId=1&from=1587737709843&to=1587738572343
[14:31:54] 2) almost TX bandwidth saturation for mc1019
[14:32:05] https://grafana.wikimedia.org/d/000000316/memcache?panelId=56&fullscreen&orgId=1&from=1587737709843&to=1587738572343
[14:33:11] the problematic slab seems to be mc1034:180
[14:33:30] drum roll... key size 700K :D
[14:33:31] https://grafana.wikimedia.org/d/000000317/memcache-slabs?orgId=1&var-datasource=eqiad%20prometheus%2Fops&var-cluster=memcached&var-instance=mc1034&var-slab=180
[14:33:35] jumbo key
[14:34:59] there are 3 keys in the slab, saved on mc1034 in /home/elukey/slab_180
[14:37:27] mc1019 was hammered by get requests for slab 164, ~330K
[14:37:32] https://grafana.wikimedia.org/d/000000317/memcache-slabs?orgId=1&from=1587737730527&to=1587738694219&var-datasource=eqiad%20prometheus%2Fops&var-cluster=memcached&var-instance=mc1019&var-slab=164
[14:38:25] saved that slab's content as well, but there are a lot more keys
[14:42:54] elukey: something that's been on the back of my mind has been probabilistic sampling of memcached gets / top-k hottest keys tracking
[14:46:47] cdanis: in theory there are some patches from Aaron that should give us some metrics from MediaWiki, I hope those will give us more insight
[14:47:40] and I see two problems in general:
[14:47:46] 1) identification of hot keys
[14:48:00] 2) following up with whatever creates the bursts
[14:48:23] the latter is still a big problem, even if we come up with metrics :(
[14:48:40] yeah...
[14:48:53] i just feel it's hard to know what to look at for sure, right now
[14:49:12] aside from that, local memcached might help
[14:49:19] agreed
[14:50:14] even if local memcached (which is still a ways out AIUI) does help, it would be good to know _why_ -- like, are some flows in mediawiki just requesting the same key N times in one request?
[14:50:21] (i don't know if anyone knows that for sure)
[14:50:57] oh, also elukey, i made some edits to how the nic bw saturation panels are defined, i hope they make sense to you
[14:51:23] there's some help text behind the 🛈
[14:51:40] super, thanks a lot
[14:52:43] ah snap, then I completely misunderstood the metric
[14:53:15] so the 80% for mc1019 was the % of time that it worked at >90% bw saturation
[14:53:35] then it is worse than I described, both shards with bw saturated :D
[14:53:43] thanks for the clarification!
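A minimal sketch of the probabilistic sampling / top-k hot key tracking cdanis floats at 14:42, assuming something can observe the stream of memcached get keys (e.g. a tap near mcrouter); the class name, parameters, and pruning strategy are all illustrative, not an existing Wikimedia tool:

```python
import random
from collections import Counter


class HotKeyTracker:
    """Approximate top-k tracking of memcached get keys via sampling.

    Each observed get is counted with probability `sample_rate`; the
    table is pruned back to the current heavy hitters whenever it grows
    past `max_entries`, which bounds memory at the cost of accuracy on
    the long tail (the hottest keys survive pruning).
    """

    def __init__(self, sample_rate=0.01, k=50, max_entries=5000):
        self.sample_rate = sample_rate
        self.k = k
        self.max_entries = max_entries
        self.counts = Counter()

    def observe_get(self, key):
        if random.random() < self.sample_rate:
            self.counts[key] += 1
            if len(self.counts) > self.max_entries:
                # keep only the current top-k to bound memory
                self.counts = Counter(dict(self.counts.most_common(self.k)))

    def top_k(self):
        # scale sampled counts back up to estimated real request counts
        return [(key, count / self.sample_rate)
                for key, count in self.counts.most_common(self.k)]
```

This only addresses problem 1) from the discussion (identifying hot keys); it says nothing about 2), finding whatever code path creates the bursts.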
[14:54:55] yeah, it's a ratio of seconds per second, so it comes out unitless
[14:55:06] not the most intuitive to understand
[14:55:24] any sustained nonzero value there is pretty bad
[14:55:30] ahahha yes yes
[14:55:38] that could be a good description
[14:55:42] :D :D :D
[14:56:09] you described it more politely
[14:56:18] in the panel
[14:56:37] another thing I was wondering about is https://netbox.wikimedia.org/search/?q=mc10&obj_type=
[14:57:29] now if we pick A6, the top-of-rack switch should have 1G links to the mc10xx hosts and 10G links to other switches (don't know if it is a leaf or a spine)
[14:58:05] one of the things to do when we have the gutter pool ready is to spread the hosts across multiple racks
[14:59:05] (A6's switch should be a leaf IIUC from netbox)
[16:08:59] elukey: A6 has 2*40G links to the spines
[16:11:07] elukey: not sure if it would help with uplink saturation, but have you looked into jumbo frames? It might remove some overhead with large data transfers
[16:26:45] that's a good thought, it would probably give us like 5% more capacity on the larger values
[16:26:47] XioNoX: ah nice, I thought 10G, 40G looks very nice :)
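A back-of-the-envelope check of the "~5% more capacity" estimate for jumbo frames, assuming plain TCP/IPv4 with no options and standard Ethernet framing overhead (adjust the header sizes if the real traffic uses TCP options or VLAN tags):

```python
# Goodput as a fraction of bytes on the wire, per frame.
ETH_OVERHEAD = 7 + 1 + 14 + 4 + 12   # preamble, SFD, Ethernet header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20             # IPv4 + TCP, no options


def efficiency(mtu):
    payload = mtu - IP_TCP_HEADERS
    wire_bytes = mtu + ETH_OVERHEAD
    return payload / wire_bytes


std = efficiency(1500)    # ~0.949
jumbo = efficiency(9000)  # ~0.991
print(f"MTU 1500: {std:.1%}  MTU 9000: {jumbo:.1%}  gain: {jumbo / std - 1:.1%}")
```

This comes out to roughly a 4-5% goodput gain on a saturated link, consistent with the estimate at 16:26:45, and it only helps values large enough to span multiple standard frames.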