[15:19:42] Hi! As the logs for today do not exist: Could someone put the login status in the channel topic? :-)
[15:54:28] Dafu. I think gluster is dead again.
[15:55:40] * Coren rips out gluster out of tools- tomorrow.
[15:57:16] bots-gs seems to be working. What's the difference in setup, or is it just fate which gluster volume barfs?
[15:58:34] From what I've seen, the biggest issue is the automounting; as long as the volume /remains/ mounted it tends to keep running unless it goes globally boom.
[16:00:44] Ryan_Lane: If you see this ping: mount fails on gluster volumes. Restart bricks?
[16:00:53] So tools- needs more use?! :-)
[16:01:32] Heh. Tools needs a reliable filesystem. Right now, a roomful of scribes with quills and parchment transcribing from block dumps would be more reliable (and probably faster)
[16:02:00] The good news is that Ops is aware and are looking into a good way to replace with NFS
[16:03:59] Yeah, that's definitely it (automount woes). The primary exec node is working fine since it has running bots to keep the mounts active.
[16:05:48] Hmmm. A filesystem that fails when it is *not* stressed is certainly ... interesting :-).
[16:06:23] Well, inactivity breaks it and -- apparently -- heavy use breaks it too. It's a goldilocks filesystem -- activity has to be "just right" :-)
[16:06:56] * Coren notes that the bots still running is a good news.
[16:10:11] Do we monitor fs availability at http://icinga.wmflabs.org/ already?
[16:12:42] Not that I know of, and it's be tricky to monitor since checking (can you mount the filesystem) is activity that would probably keep that aspect running. :-/
[16:12:47] it'd*
[16:13:51] Well, for me that would fall under "mission accomplished" :-).
[18:04:39] Coren: all bricks?
[18:04:47] it fails from all instances?
[18:06:58] hm
[18:07:15] it looks like it's working from bastion to me
[18:07:24] this is likely limited to some projects
[18:10:42] seems it's broken on tools-login
[18:17:57] Coren: seems this was indeed on the client side
[18:18:10] restarting the glusterd processes on the servers didn't help
[18:18:29] but, killing the gluster processes on the client, then restarting autofs worked
[18:19:02] as far as I can tell it was only affecting your project
[18:19:11] I checked others and they were working fine
[18:19:29] sigh
[18:44:16] Damianz: :D
[19:15:02] Ryan_Lane is clearly too happy
[22:47:04] Ryan_Lane: Odd; because from what I saw it was the actual mounting that failed; and last time you had to restart the bricks to fix it.
[22:47:33] Ryan_Lane: Not that I am much of a gluster expert.