[07:42:57] arturo: I have pushed: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/534577/ which should be transparent for everyone, but please let me know if you see something (especially Quarry related)
[07:46:01] arturo: I have run some queries from Quarry and it worked, but let me know if you see or notice something
[08:48:44] ack
[09:02:42] <_joe_> I am looking at the puppet CI image
[09:03:04] <_joe_> just rebuilding it for buster makes it 1.12 GB
[09:03:21] <_joe_> of which 585 MB are tox and bundler caches, to speed up execution
[09:03:44] <_joe_> I'm asking myself if it wouldn't make more sense to publish such caches somewhere instead
[09:03:52] <_joe_> and make the script download them
[09:04:11] <_joe_> like having a periodic job that recreates them every N hours
[09:05:02] <_joe_> at the same time, this means that every time you want to run a CI job you need to re-download ~350 MB of data, or to rebuild your tox/bundler caches from scratch
[11:43:51] _joe_: can you package the pre-populated cache as part of the image?
[11:56:53] <_joe_> cdanis: that's what we're doing now
[11:57:06] ohh sorry, I misunderstood
[15:55:37] https://lore.kernel.org/wireguard/CAHmME9qOpDeraWo5rM31EWQW574KEduRBTL-+0A2ZyqBNDeYkg@mail.gmail.com/T/#u 🎉
[15:55:48] [ANNOUNCE] WireGuard 1.0.0 for Linux 5.6 Released
[15:55:56] 🎉
[15:56:15] yeah! great news
[15:56:30] I use it every day and love it. I wish it had TCP support, but hey
[15:56:35] has anyone played around with Tailscale at all?
[15:59:50] I have seen discussions about it on HN. it does solve the key management problem in a clean way
[16:01:38] the latest linux upload to Debian unstable also backports the WireGuard kernel support from 5.6 to 5.5: https://packages.qa.debian.org/l/linux/news/20200330T081008Z.html
[16:01:59] "Add WireGuard driver and required crypto changes from 5.6-rc7 and cryptodev-2.6, thanks to Jason A. Donenfeld (Closes: #953569)"
[16:47:18] Heads-up: I'm switching the CI job for operations/puppet over to buster. Looks OK in testing, but if anything breaks, shout and I can trivially revert.
[16:54:36] interesting, I was checking memcached metrics and noticed this trend in the get hit ratio
[16:54:39] https://grafana.wikimedia.org/d/000000316/memcache?orgId=1&from=now-90d&to=now&fullscreen&panelId=37
[16:55:00] that seems to line up with
[16:55:01] https://grafana.wikimedia.org/d/000000316/memcache?orgId=1&from=now-90d&to=now&fullscreen&panelId=42
[16:55:55] on some shards the hits dropped
[16:56:13] (I am checking mc1029 now)
[16:58:52] on a separate note, there are some TKO events registered for wtp nodes
[16:58:53] https://grafana.wikimedia.org/d/000000549/mcrouter?orgId=1&from=now-2d&to=now&fullscreen&panelId=9
[16:59:56] the winner seems to be mc1027
[17:01:24] brb and then I'll open a task
[17:01:56] elukey: please subscribe me as well
[17:05:59] ack!
[17:07:58] the slab is 154, ~200KB
[17:29:14] ok so tx bandwidth saturation on mc1027
[17:29:14] 17:13:01 981662.1
[17:29:14] 17:13:02 981972.2
[17:29:15] 17:13:03 981631.8
[17:29:32] (those are Kbps)
[17:30:12] so now the problem is finding the key that does it :D
[17:59:28] I had meant to add even half-reasonably-done saturation monitoring for NIC rx/tx on memcache hosts this quarter, but it didn't happen
[17:59:43] (as opposed to the heinous shell one-liner I wrote under duress one of the last times this happened)
[18:03:41] cdanis: how dare you not do another thing on top of the 1000 that you already follow :D
[18:04:52] jokes aside, it is super fine, might be good in the future to alert on (even for other hosts!), but we are also missing mcrouter alerts etc.
[18:06:23] of course the minute I start running tcpdump on mc1027 coincides with the end of the bursts
[18:06:34] * elukey cries in a corner
[18:08:47] * volans reminds everyone to be careful before adding high-frequency fleet-wide checks
[18:15:55] volans: this would have been implemented by a small python daemon running on each machine, either running an http server to export to prom, or writing to a textfile for node_exporter
[18:16:14] and then one check_prometheus ;)
[18:31:46] I know, it was more to spread awareness :D
[19:14:19] <_joe_> elukey: I think in those situations it's ok to use memkeys tbh
[20:18:49] cdanis: that'd be relatively easy to implement...
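For context on the numbers pasted at 17:29: ~982,000 Kbps is roughly 0.98 Gbps, which is consistent with a saturated 1 GbE uplink on mc1027. The per-host daemon cdanis describes at 18:15 could look roughly like the sketch below: sample the NIC byte counters from /sys, derive a tx/rx saturation ratio against the reported link speed, and publish it through node_exporter's textfile collector. This is only an illustrative sketch under stated assumptions, not an existing WMF implementation; the interface name, textfile directory, and metric names are all made up for the example, and the prometheus_client http-server variant mentioned in the log would work the same way, just exposing the gauges over HTTP instead of writing a file.

    #!/usr/bin/env python3
    """Hypothetical sketch of a small per-host NIC saturation collector.

    Samples tx/rx byte counters, computes bytes/s over the link speed, and
    writes the result as gauges for node_exporter's textfile collector.
    """
    import os
    import time

    IFACE = 'eth0'                               # assumption: primary NIC name
    TEXTFILE_DIR = '/var/lib/prometheus/node.d'  # assumption: node_exporter textfile dir
    INTERVAL = 5                                 # seconds between samples


    def read_bytes(iface, direction):
        # Kernel-provided cumulative counters; direction is 'tx' or 'rx'.
        with open(f'/sys/class/net/{iface}/statistics/{direction}_bytes') as f:
            return int(f.read())


    def link_speed_bytes(iface):
        # /sys reports link speed in Mbit/s (may be -1 on virtual interfaces).
        with open(f'/sys/class/net/{iface}/speed') as f:
            return int(f.read()) * 1_000_000 / 8


    def write_metrics(tx_rate, rx_rate, speed):
        # Write to a temp file and rename so node_exporter never reads a partial file.
        tmp = os.path.join(TEXTFILE_DIR, '.nic_saturation.prom.tmp')
        final = os.path.join(TEXTFILE_DIR, 'nic_saturation.prom')
        with open(tmp, 'w') as f:
            f.write(f'nic_tx_saturation_ratio{{device="{IFACE}"}} {tx_rate / speed}\n')
            f.write(f'nic_rx_saturation_ratio{{device="{IFACE}"}} {rx_rate / speed}\n')
        os.rename(tmp, final)


    def main():
        prev_tx, prev_rx = read_bytes(IFACE, 'tx'), read_bytes(IFACE, 'rx')
        speed = link_speed_bytes(IFACE)
        while True:
            time.sleep(INTERVAL)
            tx, rx = read_bytes(IFACE, 'tx'), read_bytes(IFACE, 'rx')
            write_metrics((tx - prev_tx) / INTERVAL, (rx - prev_rx) / INTERVAL, speed)
            prev_tx, prev_rx = tx, rx


    if __name__ == '__main__':
        main()

The "one check_prometheus" follow-up would then just be a threshold on the resulting gauge. Note also that newer node_exporter releases already expose per-device counters such as node_network_transmit_bytes_total, so a pure PromQL rate() over the built-in metrics, compared against link speed, is another way to get the same alert without any custom daemon.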