[10:12:55] <volans>	 jynus: about db1071, there were spikes during the last 2 days around 8pm but the strange part is that if I do -rpc AND -api they almost disappear, but db1071 is not in the API role. Does this means that there are API calls going to db1071 too?
[10:13:57] <jynus>	 yes
[10:14:56] <jynus>	 I only think it is a job-connection issue
[10:15:20] <jynus>	 the others are synthoms only
[10:15:49] <jynus>	 I will check db1070 anyway
[10:16:52] <jynus>	 to be fair, lots of concurrent connections there, 200/server, up to peaks of 400
[10:17:02] <jynus>	 in other services it is around 50-60
[10:17:13] <jynus>	 that is again probably due to the missing db1058
[10:17:25] <jynus>	 we need the new servers there ASAP
[10:18:24] <volans>	 and stats to grafana so in this case we could sum connections for all hosts to see if something changed
[10:19:14] <jynus>	 we can check it already on the masters (or tendril)
[10:19:26] <jynus>	 check how s2 and s5 masters are overloaded
[10:19:34] <jynus>	 that usually means job queue
[10:19:43] <jynus>	 and there is indeed high job activity
[10:21:18] <jynus>	 check those spikes: https://grafana-admin.wikimedia.org/dashboard/db/job-queue-health
[10:22:38] <volans>	 yep
[10:25:07] <jynus>	 volans, you may need to deploy https://gerrit.wikimedia.org/r/#/c/291696/
[10:26:08] <volans>	 jynus: yes, I just don't know how... I asked you in -operations once merged ;)
[10:30:42] <jynus>	 oh, if I hadn't told you already is because I would do what you are going to do- check where it is installed on puppet and rebase that
[10:31:00] <volans>	 mw1152
[10:31:11] <jynus>	 I learned once and unlearned it immediately
[10:31:20] <volans>	 I was alrady there, ah so a manual rebase :)
[10:31:31] <volans>	 I thought of some magical automation :-P
[10:39:29] <jynus>	 I wouldn't be surprised that even puppet does that
[10:42:38] <volans>	 I tried a puppet noop run, it didn't ;)
[10:42:50] <volans>	 I'm updating the docs