[02:27:15] hi SREs - VRTS is down - T411452
[02:27:16] T411452: disk full at VRTS host? - https://phabricator.wikimedia.org/T411452
[02:28:51] VRTS broken, no space on device, who can help?
[10:02:50] hey folks, I am installing spicerack on cumin1003, I see some cookbooks running
[10:02:59] it shouldn't really affect them, I'll wait a bit just to be sure
[10:03:13] the new version is a minor one that adds extra dhcp support
[10:38:29] all old cookbooks, pinged people and updated spicerack
[10:54:24] cloudcumin seems still on the old spicerack, though: https://debmonitor.wikimedia.org/packages/spicerack
[10:58:10] moritzm: that usually was self-managed by wmcs, I used to just notify them of new releases, I guess I can notify myself nowadays :-P
[11:00:06] ok :-)
[11:20:21] ah TIL that we have to notify WMCS!
[11:20:36] volans: hey new spicerack release! :P
[11:20:42] * elukey runs away
[11:20:54] ahahahah
[12:29:34] lol
[13:03:45] headsup, I'm going to rebalance kafka-main-codfw - T407185 . I've put a silence in place in alert manager to avoid paging
[13:03:46] T407185: Fix Kafka replicas skew - https://phabricator.wikimedia.org/T407185
[13:04:21] !log running rebalancing of kafka-main-codfw with throttle of 30MB/s - T407185
[13:04:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:25:02] hi folks - FYI, some time around 16:00 UTC today, there will be a brief (seconds) disruption for clients communicating with etcd on conf1008.
[15:25:02] * read and write operations will continue to succeed against the 2 other nodes
[15:25:03] * clients (e.g., conftool, spicerack) may see transient errors communicating with conf1008 specifically (though these may be retried transparently in most cases)
[17:17:06] hi folks - the etcd-related work described above completed around 16:10. due to scheduling constraints, I might perform the same maintenance on the two other conf* hosts in eqiad during the upcoming mediawiki infra window starting at 18:00 UTC. I'll keep you posted.
[17:37:25] I should have mentioned, but the kafka main codfw rebalancing is done, which concludes that kafka cluster rebalancing task!
[17:55:54] following up, I will be moving ahead with the remaining two conf* hosts, likely starting in ~ 10-15 minutes
[18:45:11] following up (again), all disruptive work has completed as of 18:23 UTC. no further disruptions planned :)
[18:45:27] \o/