[13:08:43] jynus: Think you forgot to ack the icinga alert re es1015 (or schedule downtime)
[13:09:41] yes, see ops-
[13:10:07] Wasn't on IRC at that time, but got it on my phone
[13:10:13] SAL has it, though
[13:10:44] do not worry, everything is fine. I have to do 20 manual steps to just restart mysql
[13:11:04] it is easy for me to miss one
[13:11:11] Yeah, I see
[13:11:27] in a perfect world systemd would also take care of telling icinga
[13:11:31] including a couple of mediawiki commits
[13:11:32] I guess
[13:11:48] in a perfect world I would have a load balancer
[13:12:04] and checks would first check if the server is pooled
[13:12:19] but that requires rearchitecture
[13:13:02] true
[13:13:19] even getting an exhaustive list of servers in use by MW in some way sounds quite hard
[13:13:36] that would fix both the checks and the commits
[13:14:01] but do not worry, the ticket is there
[13:14:22] https://phabricator.wikimedia.org/T119626
[13:14:59] also needed for other services to query the database reliably
[13:17:27] Sounds interesting
[13:18:17] oh, very, if dba team had the time
[13:18:20] all of us
[13:18:46] :-)
[13:28:28] es1015 downtime has finished
[13:28:32] will return later
[13:29:06] it will, or you will? :P
[13:29:25] Both, eventually
[13:29:26] hopefully me
[13:29:29] yes
[19:59:01] I've logged it, but just in case
[20:10:45] db1019 is down
[20:10:48] *sorry
[20:10:53] I meant es1019
[20:11:13] it should not cause any impact on production, as we have enough capacity