[10:38:41] hi all, have been rebooting the swift backends and I have noticed that some servers constantly have the HP RAID check in UNKNOWN (Service Check Timed Out) status, is this a known issue?
[10:38:45] e.g. https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=ms-be2035
[10:39:00] godog: ^
[10:40:32] jbond42: yeah, that's a known issue, the check takes longer than our default timeout, let me find the task
[10:41:33] https://phabricator.wikimedia.org/T210723
[10:41:44] cheers moritzm
[10:41:48] jbond42: mostly yes
[10:41:53] due to high I/O
[10:44:13] indeed
[10:58:50] wow it's already at 90 seconds, anyway thanks, as it's a none issue I will continue on with the ms-be2* reboots
[10:59:06] s/none/known/
[11:01:11] jbond42: FYI ms-be2043 had a disk replaced yesterday so you'll see higher load there while it rebuilds
[11:02:05] thanks, I can leave that one until the disk is rebuilt
[11:03:17] sure, should be a couple of days tops
[11:03:32] ack
[14:17:00] jbond42: I noticed ms-be2040 came back with disks reordered, immediately that's harmless and makes puppet fail, I've rebooted ms-be2040 again
[14:18:40] godog: how did you notice? I'll add it to my checklist
[14:20:32] jbond42: the puppet fail on icinga tipped me off
[14:21:32] ack, I'll make sure to do a puppet run when they come up
[14:22:42] it took two reboots )o)
[14:26:58] jbond42: we run puppet @reboot
[14:27:30] spicerack has https://doc.wikimedia.org/spicerack/master/api/spicerack.puppet.html#spicerack.puppet.PuppetHosts.wait_since
[14:27:33] fwiw
[14:31:46] ack thanks
[15:31:58] Will a new VM with a matching role in site.pp be puppetized automatically or do I have to do something?
[15:39:18] XioNoX: a VM in CloudVPS? that needs a role specification in horizon. Not the case of ganeti
[15:39:35] I mean Ganeti VM
[15:45:25] XioNoX: for a ganeti VM once installed it's pretty much similar to a baremetal host
[15:45:39] so will be picked up, yes
[15:46:55] moritzm: do I need to use the reimage cookbook?
[15:48:35] the reimage script doesn't support Ganeti instances currently
[15:48:47] see https://wikitech.wikimedia.org/wiki/Ganeti#Reinstall_/_Reimage_a_VM
[15:50:03] ok, thx
[15:50:47] so far ssh'ing to rpki1001.eqiad.wmnet prompts for a password, I guess I haven't waited long enough?
[15:51:51] you can run sudo install-console rpki1001.eqiad.wmnet from puppetmaster1001, that will get you a shell that will work before the other SSH keys are set up
[15:52:24] or wait until puppet has done its magic
[16:01:56] ok, thanks!
[17:34:29] so you have to use install-console to run puppet and then sign the puppet cert on the master
[17:34:37] and then run puppet again on the host
[17:34:46] after that you should be able to SSH normally
[17:37:51] yep, all done and up now
[17:42:35] https://code.fb.com/security/service-encryption/
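
Note on the 14:21-14:27 exchange (checking that a puppet run happened after a reboot): the idea can be sketched as a simple polling loop. This is only a minimal illustration under stated assumptions, not spicerack's implementation of PuppetHosts.wait_since. The names wait_for_run_since and get_last_run are hypothetical, and how the last-run timestamp is obtained from a host is left to the caller.

```python
import time
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


def wait_for_run_since(
    start: datetime,
    get_last_run: Callable[[], Optional[datetime]],
    timeout: float = 300.0,
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> datetime:
    """Poll until get_last_run() reports a run newer than `start`.

    `get_last_run` is a hypothetical callable supplied by the caller; it
    returns the timestamp of the last successful run, or None if unknown.
    Raises TimeoutError if no newer run is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        last_run = get_last_run()
        # Only a run that finished after the reboot counts.
        if last_run is not None and last_run > start:
            return last_run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no run newer than {start.isoformat()} seen")
        sleep(interval)
```

A caller would record the time just before rebooting and pass it as start, so that a stale run from before the reboot is never mistaken for the post-reboot one.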