[14:47:41] there is some issue with the slave delay on dbstore1001
[15:06:28] jynus: re: https://gerrit.wikimedia.org/r/#/c/285208/
[15:06:48] as the script is currently coded, the check_procs trips on every single invocation of it, making it fairly useless (and noisy)
[15:07:07] I won't attempt fixing el_sync.sh again, but if you're not going to work on it, I'll just remove the icinga check
[15:10:22] ok
[15:11:39] I am just saying that if you merge a change, checking that running it doesn't get into an infinite loop and fill the filesystem is a good idea :-)
[15:12:28] I am the first one to make mistakes, and that script is inherited from Sean; I had no time to work on it
[15:13:40] I support volan*'s call of reverting it
[15:14:17] never disagreed on that
[15:14:54] and yes, that script is horrible :-P
[15:14:55] you should have deployed it instead, probably; let's do that next time :)
[15:15:28] but at least now it is horrible in puppet, and not in a screen session
[15:15:48] (which is how I found it)
[15:18:21] spike of connection errors on db1044
[15:57:48] for some reason, the delayed replication stops the s1 thread on dbstore1001
[15:58:13] however, the only difference between shards is that db1052 is now on mariadb 10
[15:59:52] wait, but seconds behind master is 79522, which is 22 hours
[16:00:10] so stopping replication is the right thing to do
[16:01:23] however, s1 tz says '2016-04-25T10:42:00.549080', which means it is behind by more than 24 hours
[16:02:11] ah, it could be the time db1052 was down
[16:02:16] so not an actual issue
[16:02:40] just that s1 is 24 hours behind ITS master, not THE master
[16:02:55] so both seconds behind master AND pt-heartbeat were ok
[16:03:08] the alert will go off soon, I will ack it
[16:03:26] this is like a really deep issue
[16:03:50] happily it needs no actionables, just wait
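
For context on the 15:57 line: a slave-delay tool such as pt-slave-delay (an assumption here, the log does not name the tool used on dbstore1001) keeps a slave behind by stopping and restarting its SQL thread, which matches the "delayed replication stops the s1 thread" behaviour described. A minimal sketch of that logic on the delayed slave; the 24-hour target is illustrative, not the actual dbstore1001 setting:

    -- Sketch of what a slave-delay tool does on the delayed slave; the
    -- 24h (86400 s) target is an assumption, not the real configuration.
    SHOW SLAVE STATUS\G          -- read Seconds_Behind_Master

    -- If the slave is less than 86400 s behind its immediate master,
    -- pause the SQL thread so it drifts back toward the target delay:
    STOP SLAVE SQL_THREAD;

    -- Once enough wall-clock time has passed, resume applying events:
    START SLAVE SQL_THREAD;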
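The resolution at 16:02 hinges on what each lag measure is relative to: Seconds_Behind_Master is computed against the slave's immediate master (here an intermediate master), while pt-heartbeat writes timestamped rows on the top-level master, so reading them on the slave gives end-to-end lag. A hedged sketch of both checks; the heartbeat.heartbeat database/table names and the server_id value are assumptions, not the actual WMF setup:

    -- On dbstore1001: lag relative to its *immediate* master only.
    SHOW SLAVE STATUS\G          -- Seconds_Behind_Master = 79522 here, i.e. ~22 hours

    -- Lag relative to the *top-level* master, via the row pt-heartbeat updates there.
    -- Database/table names and server_id are illustrative; whether ts is UTC
    -- depends on how pt-heartbeat was started.
    SELECT TIMESTAMPDIFF(SECOND, MAX(ts), UTC_TIMESTAMP()) AS lag_seconds
      FROM heartbeat.heartbeat
     WHERE server_id = 1234;

In practice pt-heartbeat's own --check/--monitor modes report the same figure, which is why both Seconds_Behind_Master and pt-heartbeat could legitimately look "ok" at the same time.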