[00:00:40] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [00:07:40] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [00:07:42] SMTP on turnera is CRITICAL: Connection refused [00:07:42] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [00:07:50] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [00:12:30] MySQL slave on z-dat-s4-a is OK: Uptime: 1226099 Threads: 1 Questions: 61142759 Slow queries: 15 Opens: 93 Flush tables: 1 Open tables: 72 Queries per second avg: 49.867 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1491 [00:16:00] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:16:10] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:16:40] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [00:16:40] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). [00:16:40] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [00:16:50] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [00:16:50] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [00:16:51] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [00:17:20] MySQL slave on z-dat-s3-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 7344 [00:17:40] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [00:17:50] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [00:17:51] / on thyme is UNKNOWN: NRPE: Unable to read output [00:17:51] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [00:17:51] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). 
[00:17:51] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [00:17:51] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [00:18:09] wikidata replag on z-dat-s5-b is CRITICAL: (Service Check Timed Out) [00:18:19] MySQL slave on z-dat-s6-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 7504 [00:18:39] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [00:18:40] RAID on thyme is UNKNOWN: NRPE: Unable to read output [00:18:40] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 608817.000000 [00:18:49] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [00:23:51] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs [00:23:54] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 61042 MB (10% inode=99%): [00:26:05] SGE is down: "can't set additional group id (uid=0, euid=0): Cannot allocate memory" [00:47:13] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [00:47:14] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [00:47:14] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [00:47:14] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:47:14] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [00:47:24] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:47:24] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:47:24] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:47:24] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:47:34] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:47:34] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[00:47:34] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:51:14] toolserver.org HTTP on ortelius is WARNING: HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.146 second response time [00:52:03] /home on hemlock is CRITICAL: DISK CRITICAL - /home is not accessible: No such file or directory [00:52:24] Sun Grid Engine execd on nightshade is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [00:52:44] Sun Grid Engine execd on yarrow is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [00:55:17] I am having problems logging in to willow - it looks like it does not recognize my ssh key [00:55:23] is there any known issue right now? [01:03:04] MySQL slave on z-dat-s6-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3302 [01:05:13] MySQL slave on z-dat-s3-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3366 [01:06:04] MySQL slave on z-dat-s6-a is OK: Uptime: 1059310 Threads: 5 Questions: 860830849 Slow queries: 38848 Opens: 1388187 Flush tables: 1 Open tables: 3279 Queries per second avg: 812.633 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [01:06:14] s4 replag on z-dat-s5-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2179.000000 [01:08:14] MySQL slave on z-dat-s3-a is OK: Uptime: 1059272 Threads: 3 Questions: 1217075971 Slow queries: 36302 Opens: 6658135 Flush tables: 1 Open tables: 16384 Queries per second avg: 1148.973 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [01:10:14] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2169.000000 [01:11:41] I also get "SSI error: recursion exceeded" when I try to load webpages from toolserver - so I take that as a sign things are broken [01:20:05] carl-cbm: You're right, it looks as if at least the LDAP server is down. 
[01:21:57] thanks [01:30:14] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3619.000000 [01:34:14] s4 replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3609.000000 [01:41:14] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3073.000000 [01:42:14] s4 replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1746.000000 [01:46:14] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [01:46:14] SSH on mayapple is CRITICAL: Server answer: [01:46:14] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [01:46:14] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [01:46:14] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2939940.000000 [01:46:15] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 963419.000000 [01:46:15] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 766531.000000 [01:46:23] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [01:46:23] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [01:46:24] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [01:46:24] RAID on thyme is UNKNOWN: NRPE: Unable to read output [01:46:34] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:46:34] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:46:34] wikidata replag on rosemary is CRITICAL: 
QUERY CRITICAL: SELECT ts_rc_age() returned 1938044.000000 [01:46:34] SMTP on turnera is CRITICAL: Connection refused [01:46:43] wikidata replag on z-dat-s5-b is CRITICAL: (Service Check Timed Out) [01:46:43] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:46:43] / on thyme is UNKNOWN: NRPE: Unable to read output [01:46:43] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1575521.000000 [01:46:44] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:46:44] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:46:44] / on turnera is CRITICAL: NRPE: Command check_root not defined [01:46:45] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [01:46:45] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [01:46:53] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 154872 [01:46:53] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:46:54] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:46:54] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [01:46:54] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:46:54] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [01:46:54] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:46:55] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:47:03] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [01:47:04] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [01:47:04] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). 
[01:47:04] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [01:47:04] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 238254 MB (4% inode=63%): [01:47:14] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [01:47:14] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [01:47:14] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [01:47:14] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:47:14] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [01:47:14] Load avg. on turnera is CRITICAL: NRPE: Command check_load not defined [01:47:14] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). [01:47:15] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [01:47:15] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [01:47:24] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:47:24] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:47:24] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:47:24] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 154748.000000 [01:47:24] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [01:47:24] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [01:47:25] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1837264.000000 [01:47:25] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [01:47:26] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:47:34] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:47:34] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[01:47:34] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:47:34] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [01:47:34] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:47:35] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:47:35] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [01:47:36] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:47:36] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:47:37] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [01:47:54] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:47:54] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [01:47:54] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [01:47:54] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [01:47:54] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [01:51:14] toolserver.org HTTP on ortelius is WARNING: HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.013 second response time [01:52:03] /home on hemlock is CRITICAL: DISK CRITICAL - /home is not accessible: No such file or directory [01:52:23] Sun Grid Engine execd on nightshade is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [01:52:44] Sun Grid Engine execd on yarrow is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [01:54:04] SSI error: recursion exceeded [01:54:05] SSI error: recursion exceededSSI error: recursion exceededv [01:54:05] SSI error: recursion exceeded [01:54:05] v [01:54:18] oops [01:54:32] XChat momentarily froze [01:54:53] What's it doing? 
[01:54:59] things are broken [01:55:39] Also tool interface is returning: [01:55:40] ERROR 1045 (28000): Access denied for user 'cyberpow'@'damiana-bge0.esi.toolserver.org' (using password: NO) [01:56:00] When attempting to access mysql [01:56:20] What's wrong with it? [01:57:04] it looks like several of the underlying services are down, like ldap (authentication) and the shared filesystems [01:57:34] Well it's a good thing I never closed my willow interface [01:57:45] I can still manage my bot. [01:57:50] fortunately. [01:58:31] although, adminstats will fail if it remains broken until 4:30 UTC [01:59:39] carl-cbm, I think toolserver has hit the point where it's simply better to just let it die. [02:00:51] as of the moment there is no replacement, although perhaps labs will get there [02:01:08] FYI; currently dumping data towards the replicas. [02:01:16] It seems the cost to restore back to a decently working state with hardware and everything is greater than having labs devote a chunk of it for former toolserver users and setting up a toolserver replicated environment including replication. [02:01:38] What broke it? [02:02:13] Cyberpower678: it is not at all clear that labs is going to implement a working replacement for the key database features. But for things that just need replicas, labs may work. [02:02:28] The problem is not cost - it's that WMF is going to cut off the replication to toolserver, so there is no good reason to sink money into it [02:03:18] Aha. Which is why I've got replication-dependent programs ready to be deployed on labs. [02:03:51] What's going to happen to toolserver though? [02:04:36] I have no idea what is going to happen to the actual hardware, but some time in the next couple years I expect it will cease to exist in its current form [02:05:06] Is toolserver a separate building devoted to servers? 
[02:06:43] it's separate physical servers in a data center, which I believe is also used by the WMF [02:07:34] I wonder if labs will use the hardware to expand its own capabilities. [02:08:38] No, WMDE have plans to decommission the hardware and give it to other nonprofits that can use it, AFAIK. It's up to the board. [02:08:44] I think that much of the toolserver hardware is owned by the Wikimedia Germany organization rather than by WMF, and some of the toolserver hardware should be getting old. One theoretical advantage of labs is that it can use WMF hardware [02:09:17] It's old hardware; all of it together is less than a single cluster node in the labs. [02:09:43] Ouch. [02:09:47] That is old. [02:10:12] It's like an old man dying because his organs can no longer keep up. [02:10:16] I think the last server was bought 2 years ago, but most are 5-6 years old. [02:10:28] But don't quote me on this; that's from memory. [02:11:18] But Coren said, "the last server was bought 2 years ago, but most are 5-6 years old." And he said I could quote him on this. :p [02:14:29] I believe many of the more recent issues have to do with the fact that key components are running solaris, but the current admins are not very knowledgeable about solaris. This is because the previous admin was a solaris expert and configured everything before they left [02:14:58] this is not surprising when a significant project like toolserver is run on a shoestring budget [02:16:32] the admins have done a remarkable job keeping it up as well as possible, and I was surprised by all the recent problems [02:16:54] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [02:16:55] Coren: what is the status of replication on labs? Is there an ETA for it? [02:17:45] carl-cbm: We're on track for replication to work at the Amsterdam hackathon, for most things. 
Central Auth might not make it in time, but would be days behind. [02:18:24] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:18:25] I'm actually working on that now. :-) [02:18:33] Coren: that's good, I can start testing some things once there is replication. Is there a plan yet to handle user databases? [02:18:44] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:18:44] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:18:54] Sun Grid Engine execd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:19:04] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:19:04] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:19:08] carl-cbm: They are supported out of the box, both on a local DB (for stuff that doesn't need cross joins) and on the shards themselves. [02:19:13] Sun Grid Engine execd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:19:23] APT on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:19:33] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe10-esams.mgmt. [02:19:50] Coren: cross joins are what I mean [02:19:53] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [02:19:54] Sun Grid Engine execd on yarrow is UNKNOWN: Error with qhost: error: commlib error: got select error (Connection refused) [02:19:54] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [02:19:55] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[02:20:04] /home on hemlock is OK: DISK OK - free space: /home 12873 MB (25% inode=80%): [02:20:09] if you don't need cross joins you could keep the user database on your own machine :) [02:20:14] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.238 second response time [02:20:24] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [02:20:24] Sun Grid Engine execd on nightshade is UNKNOWN: Error with qhost: error: commlib error: got select error (Connection refused) [02:20:44] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [02:20:56] Yeah, that's supported. The only real caveat that may require some adaptation of tools is that commons (and wikidata for that matter) aren't replicated on multiple shards. We're going to provide federated tables for those, but they need to be used carefully to be efficient. [02:21:23] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok [02:21:54] That's great. I will pass it on to the WP 1.0 people - I hope they are already thinking about how to move things to labs [02:21:56] Both the WMF and the WMDE teams will provide technical help to adapt tools to maintainers who request it, though. [02:22:56] carl-cbm: If you have tools that work without replicas, you're welcome to move them now to get used to the environment. Many did so already, so that they don't hit the traffic jam once we open the floodgates. :-) [02:23:15] * Coren mixes metaphors LIKE A BOSS! 
[02:30:13] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3776.000000 [02:32:14] s4 replag on z-dat-s5-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3502.000000 [02:41:14] s4 replag on z-dat-s5-b is OK: QUERY OK: SELECT ts_rc_age() returned 1720.000000 [02:46:14] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [02:46:14] SSH on mayapple is CRITICAL: Server answer: [02:46:14] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [02:46:14] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [02:46:15] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2943540.000000 [02:46:15] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 768970.000000 [02:46:24] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [02:46:24] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [02:46:24] RAID on thyme is UNKNOWN: NRPE: Unable to read output [02:46:34] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 966390.000000 [02:46:34] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:46:34] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:46:34] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1938556.000000 [02:46:34] SMTP on turnera is CRITICAL: Connection refused [02:46:44] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:46:44] 
/ on thyme is UNKNOWN: NRPE: Unable to read output [02:46:44] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1577529.000000 [02:46:44] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:46:44] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:46:45] / on turnera is CRITICAL: NRPE: Command check_root not defined [02:46:45] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [02:46:46] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [02:46:54] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 144013 [02:46:54] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:46:54] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:46:54] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [02:46:54] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:46:55] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:46:55] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:47:04] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [02:47:04] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [02:47:04] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [02:47:04] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [02:47:04] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 238226 MB (4% inode=63%): [02:47:14] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [02:47:14] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:47:14] Load avg. 
on thyme is UNKNOWN: NRPE: Unable to read output [02:47:14] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [02:47:14] Load avg. on turnera is CRITICAL: NRPE: Command check_load not defined [02:47:15] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). [02:47:15] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [02:47:16] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [02:47:23] wikidata replag on z-dat-s5-b is CRITICAL: (Service Check Timed Out) [02:47:24] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:47:24] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 143973.000000 [02:47:24] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [02:47:24] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:47:24] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:47:25] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1840864.000000 [02:47:25] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [02:47:26] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:47:33] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:47:33] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:47:34] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [02:47:34] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:47:34] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:47:34] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[02:47:35] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:47:35] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:47:53] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:47:54] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [02:47:54] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [02:47:54] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [02:47:54] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [03:18:43] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:19:54] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [03:20:24] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [03:20:33] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [03:20:34] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [03:36:53] JFTR: Login seems to work again. 
[03:46:14] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [03:46:14] SSH on mayapple is CRITICAL: Server answer: [03:46:14] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [03:46:14] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [03:46:14] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 768547.000000 [03:46:15] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2947140.000000 [03:46:24] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [03:46:24] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [03:46:24] RAID on thyme is UNKNOWN: NRPE: Unable to read output [03:46:33] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:46:35] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:46:35] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1937461.000000 [03:46:35] SMTP on turnera is CRITICAL: Connection refused [03:46:43] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:46:44] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 967687.000000 [03:46:44] / on thyme is UNKNOWN: NRPE: Unable to read output [03:46:44] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1576964.000000 [03:46:44] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:46:44] SMTP on z-dat-s1-b is CRITICAL: 
CRITICAL - Socket timeout after 10 seconds [03:46:45] / on turnera is CRITICAL: NRPE: Command check_root not defined [03:46:45] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [03:46:46] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [03:46:53] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:46:53] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:46:53] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 140039 [03:46:54] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [03:46:54] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:46:54] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:46:55] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:47:03] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [03:47:04] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [03:47:04] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [03:47:04] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [03:47:05] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 237068 MB (4% inode=63%): [03:47:13] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [03:47:14] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:47:15] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [03:47:15] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [03:47:15] Load avg. on turnera is CRITICAL: NRPE: Command check_load not defined [03:47:15] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). 
[03:47:15] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [03:47:15] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [03:47:24] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:47:25] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 140016.000000 [03:47:25] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:47:25] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:47:26] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [03:47:26] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1844464.000000 [03:47:26] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [03:47:27] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:47:34] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:47:35] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [03:47:35] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:47:35] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:47:35] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:47:35] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:47:36] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:47:36] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:47:54] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[03:47:54] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [03:47:54] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [03:47:54] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [03:47:54] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [03:48:04] wikidata replag on z-dat-s5-b is CRITICAL: (Service Check Timed Out) [04:18:44] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:19:53] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [04:20:23] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [04:20:34] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [04:20:34] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [04:39:33] Free Memory on damiana is CRITICAL: CRITICAL - 4.9% (408160 kB) free! [04:41:34] Free Memory on damiana is WARNING: WARNING - 6.0% (504252 kB) free! [04:43:34] Free Memory on damiana is CRITICAL: CRITICAL - 4.4% (372052 kB) free! 
[04:46:13] SSH on mayapple is CRITICAL: Server answer: [04:46:14] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [04:46:14] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [04:46:15] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [04:46:15] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 768326.000000 [04:46:23] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [04:46:24] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [04:46:24] RAID on thyme is UNKNOWN: NRPE: Unable to read output [04:46:33] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:46:44] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 967639.000000 [04:46:44] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:46:54] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 135382 [04:46:54] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [04:46:54] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:46:54] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:46:54] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:46:55] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[04:46:55] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:47:04] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [04:47:04] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [04:47:04] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [04:47:04] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [04:47:04] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 236907 MB (4% inode=63%): [04:47:14] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [04:47:14] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2950800.000000 [04:47:24] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [04:47:24] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 135354.000000 [04:47:24] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:47:24] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:47:24] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:47:34] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:47:34] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[04:47:34] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1935171.000000 [04:47:34] SMTP on turnera is CRITICAL: Connection refused [04:47:44] / on thyme is UNKNOWN: NRPE: Unable to read output [04:47:44] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1575651.000000 [04:47:44] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:47:44] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:47:44] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:47:45] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:47:45] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [04:47:46] / on turnera is CRITICAL: NRPE: Command check_root not defined [04:47:46] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [04:47:53] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [04:47:53] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:47:54] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [04:47:54] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [04:47:54] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [04:47:54] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [04:48:13] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:48:14] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [04:48:15] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [04:48:15] Load avg. on turnera is CRITICAL: NRPE: Command check_load not defined [04:48:15] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). 
[04:48:15] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [04:48:24] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [04:48:25] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1848124.000000 [04:48:25] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:48:25] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [04:48:34] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:48:34] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [04:48:34] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:48:34] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:48:34] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:48:44] wikidata replag on z-dat-s5-b is CRITICAL: (Service Check Timed Out) [04:56:04] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 613301.000000 [04:57:34] Free Memory on damiana is WARNING: WARNING - 5.1% (429732 kB) free! [05:05:57] Load avg. on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:06:06] /tmp on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:06:10] APT on z-dat-s1-b is CRITICAL: (Service Check Timed Out) [05:06:10] APT on yucca is CRITICAL: (Service Check Timed Out) [05:06:11] RAID on thyme is CRITICAL: (Service Check Timed Out) [05:36:01] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [05:36:01] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [05:36:01] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [05:36:01] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[05:36:01] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [05:36:11] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:36:11] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:36:11] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:36:11] wikidata replag on z-dat-s5-b is CRITICAL: (Service Check Timed Out) [05:36:11] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:36:21] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:36:21] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:36:21] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:40:43] [[Special:Log/newusers]] create 10 * Edwin50 * (New user account) [05:40:51] /home on hemlock is CRITICAL: DISK CRITICAL - /home is not accessible: No such file or directory [05:41:11] Sun Grid Engine execd on nightshade is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [05:41:31] Sun Grid Engine execd on yarrow is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [05:41:31] toolserver.org HTTP on ortelius is WARNING: HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.010 second response time [05:42:10] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs [05:43:40] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [05:45:01] s4 replag on z-dat-s5-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2187.000000 [05:45:11] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2173.000000 [05:51:53] NFS seemed to be freaking out earlier. [05:51:54] On willow. [05:52:15] And now SSH isn't working. 
[05:52:18] :-((((((((( [06:09:01] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3627.000000 [06:09:10] s4 replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3613.000000 [06:09:11] aliasd on nightshade is CRITICAL: Connection refused [06:12:11] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3243.000000 [06:15:10] s4 replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1271.000000 [06:35:01] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [06:35:01] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 773739.000000 [06:35:01] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [06:35:01] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). [06:35:11] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [06:35:11] RAID on thyme is UNKNOWN: NRPE: Unable to read output [06:35:20] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2957281.000000 [06:35:21] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1939603.000000 [06:35:21] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:35:21] SMTP on turnera is CRITICAL: Connection refused [06:35:30] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:35:30] / on thyme is UNKNOWN: NRPE: Unable to read output [06:35:31] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [06:35:31] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1580097.000000 [06:35:31] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:35:31] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[06:35:31] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:35:40] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 119678 [06:35:40] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [06:35:40] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [06:35:40] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:35:41] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [06:35:50] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [06:35:51] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [06:35:51] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [06:35:51] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [06:35:51] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 236873 MB (4% inode=63%): [06:36:00] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [06:36:01] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [06:36:01] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [06:36:01] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[06:36:01] Load avg. on turnera is CRITICAL: NRPE: Command check_load not defined [06:36:01] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [06:36:01] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [06:36:02] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 972855.000000 [06:36:11] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:36:11] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:36:11] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:36:11] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 119532.000000 [06:36:11] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [06:36:11] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [06:36:11] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [06:36:12] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [06:36:12] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1854591.000000 [06:36:13] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:36:21] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:36:21] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [06:36:21] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:36:21] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[06:36:21] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:36:22] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:36:22] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:36:23] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [06:36:23] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:36:24] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:36:24] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [06:36:31] / on turnera is CRITICAL: NRPE: Command check_root not defined [06:36:31] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [06:36:31] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [06:36:41] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:36:41] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:36:41] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:36:41] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[06:36:41] SSH on mayapple is CRITICAL: Server answer: [06:36:42] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [06:36:42] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [06:36:43] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [06:36:43] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [06:36:51] wikidata replag on z-dat-s5-b is CRITICAL: (Service Check Timed Out) [06:37:01] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [06:40:50] /home on hemlock is CRITICAL: DISK CRITICAL - /home is not accessible: No such file or directory [06:41:11] Sun Grid Engine execd on nightshade is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [06:41:31] Sun Grid Engine execd on yarrow is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [06:41:31] toolserver.org HTTP on ortelius is WARNING: HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.008 second response time [06:43:41] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [07:05:41] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [07:09:01] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7227.000000 [07:09:11] aliasd on nightshade is CRITICAL: Connection refused [07:16:41] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:16:50] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [07:16:51] /home on hemlock is OK: DISK OK - free space: /home 12907 MB (25% inode=80%): [07:17:00] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [07:17:30] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.015 second response time [07:17:31] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:28:31] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [07:29:10] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok [07:29:30] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok [07:31:00] s4 replag on z-dat-s5-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3532.000000 [07:35:00] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [07:35:01] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 776247.000000 [07:35:01] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). [07:35:11] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [07:35:11] RAID on thyme is UNKNOWN: NRPE: Unable to read output [07:35:21] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2960881.000000 [07:35:21] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1941500.000000 [07:35:21] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:35:21] SMTP on turnera is CRITICAL: Connection refused [07:35:31] / on thyme is UNKNOWN: NRPE: Unable to read output [07:35:31] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [07:35:31] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1581780.000000 [07:35:31] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:35:31] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:35:32] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:35:32] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:35:41] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [07:35:41] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [07:35:41] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 104164 [07:35:41] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:35:41] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [07:35:51] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [07:35:51] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [07:35:51] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [07:35:51] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 235489 MB (4% inode=63%): [07:35:51] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [07:36:01] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [07:36:01] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [07:36:01] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [07:36:01] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:36:01] Load avg. on turnera is CRITICAL: NRPE: Command check_load not defined [07:36:02] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [07:36:02] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 975325.000000 [07:36:11] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:36:11] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:36:11] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[07:36:11] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 104052.000000 [07:36:11] MySQL slave on z-dat-s1-b is CRITICAL: (Return code of 139 is out of bounds) [07:36:12] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [07:36:12] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [07:36:13] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [07:36:13] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1858190.000000 [07:36:14] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:36:20] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:36:20] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [07:36:20] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:36:20] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:36:21] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:36:21] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:36:21] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:36:22] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:36:22] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[07:36:30] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [07:36:30] / on turnera is CRITICAL: NRPE: Command check_root not defined [07:36:30] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [07:36:40] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:36:41] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:36:41] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:36:41] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:36:41] SSH on mayapple is CRITICAL: Server answer: [07:36:41] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [07:36:41] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [07:36:42] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [07:36:42] MySQL slave on z-dat-s5-b is CRITICAL: Cant connect to MySQL server on z-dat-s5-b (146) [07:37:03] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [07:37:03] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s5-b (146) [07:37:21] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s5-b (146) [07:37:21] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[07:40:00] s4 replag on z-dat-s5-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3072.000000 [07:42:00] MySQL slave on z-dat-s2-b is OK: Uptime: 325 Threads: 1 Questions: 3 Slow queries: 0 Opens: 15 Flush tables: 1 Open tables: 8 Queries per second avg: 0.9 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [07:43:01] MySQL on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [07:43:40] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [07:49:00] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 63554 [07:49:01] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3612.000000 [08:05:40] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [08:09:10] aliasd on nightshade is CRITICAL: Connection refused [08:15:40] /sql on z-dat-s1-b is WARNING: DISK WARNING - free space: /sql 81989 MB (8% inode=99%): [08:17:01] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [08:17:20] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [08:17:31] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:20:01] [[Special:Log/newusers]] create 10 * Joseph * (New user account) [08:31:01] s4 replag on z-dat-s5-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3516.000000 [08:35:01] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [08:35:01] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 776464.000000 [08:35:01] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). 
[08:35:11] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [08:35:11] RAID on thyme is UNKNOWN: NRPE: Unable to read output [08:35:20] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2964481.000000 [08:35:20] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:35:20] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1940056.000000 [08:35:21] SMTP on turnera is CRITICAL: Connection refused [08:35:30] / on thyme is UNKNOWN: NRPE: Unable to read output [08:35:30] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [08:35:30] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1583715.000000 [08:35:31] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:35:31] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:35:31] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:35:31] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:35:40] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [08:35:41] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [08:35:41] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 95608 [08:35:41] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:35:41] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [08:35:50] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [08:35:51] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [08:35:51] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). 
[08:35:51] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 235367 MB (4% inode=63%): [08:35:51] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [08:36:01] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [08:36:01] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [08:36:01] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [08:36:01] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:36:01] Load avg. on turnera is CRITICAL: NRPE: Command check_load not defined [08:36:01] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 974850.000000 [08:36:11] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:36:11] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:36:11] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:36:11] MySQL slave on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [08:36:11] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 95528.000000 [08:36:12] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [08:36:12] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [08:36:13] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [08:36:13] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1861790.000000 [08:36:14] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:36:21] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:36:21] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [08:36:21] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:36:21] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:36:21] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:36:22] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:36:22] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:36:23] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:36:23] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:36:31] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [08:36:31] / on turnera is CRITICAL: NRPE: Command check_root not defined [08:36:31] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [08:36:41] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:36:41] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:36:41] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:36:41] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:36:41] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [08:36:42] SSH on mayapple is CRITICAL: Server answer: [08:36:42] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [08:36:43] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [08:36:43] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 57296 [08:37:01] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [08:37:20] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 57276.000000 [08:37:20] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [08:43:01] MySQL on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [08:43:41] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [08:48:00] s4 replag on z-dat-s5-b is OK: QUERY OK: SELECT ts_rc_age() returned 1547.000000 [08:49:01] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 52247 [09:05:40] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [09:09:10] aliasd on nightshade is CRITICAL: Connection refused [09:14:40] [[Special:Log/newusers]] create 10 * Seb hor * (New user account) [09:17:00] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [09:17:21] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [09:17:31] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:34:20] Free Memory on damiana is WARNING: WARNING - 5.6% (466072 kB) free! [09:35:01] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [09:35:01] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). 
[09:35:02] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 777030.000000 [09:35:11] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [09:35:11] RAID on thyme is UNKNOWN: NRPE: Unable to read output [09:35:21] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2968081.000000 [09:35:21] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1937443.000000 [09:35:21] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:35:21] SMTP on turnera is CRITICAL: Connection refused [09:35:31] / on thyme is UNKNOWN: NRPE: Unable to read output [09:35:31] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [09:35:31] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1582850.000000 [09:35:31] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:35:31] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:35:32] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:35:32] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:35:41] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [09:35:41] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [09:35:41] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:35:41] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 87451 [09:35:41] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [09:35:50] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [09:35:51] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). 
[09:35:51] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [09:35:51] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [09:35:51] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 235300 MB (4% inode=63%): [09:36:00] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [09:36:00] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [09:36:00] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:36:01] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [09:36:01] Load avg. on turnera is CRITICAL: NRPE: Command check_load not defined [09:36:01] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 973930.000000 [09:36:10] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:36:10] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:36:10] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:36:10] MySQL slave on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [09:36:10] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 87328.000000 [09:36:11] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [09:36:11] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [09:36:12] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [09:36:12] s4 replag on rosemary is CRITICAL: 
QUERY CRITICAL: SELECT ts_rc_age() returned 1865390.000000 [09:36:13] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:36:20] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:36:21] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [09:36:21] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:36:21] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:36:21] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:36:21] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:36:22] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:36:22] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:36:22] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:36:30] / on turnera is CRITICAL: NRPE: Command check_root not defined [09:36:31] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [09:36:31] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [09:36:41] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:36:41] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:36:41] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:36:41] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:36:41] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [09:36:41] SSH on mayapple is CRITICAL: Server answer: [09:36:41] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [09:36:42] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [09:36:42] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 48925 [09:37:01] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [09:37:21] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 48920.000000 [09:37:21] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [09:42:21] Free Memory on damiana is CRITICAL: CRITICAL - 4.9% (412956 kB) free! [09:43:00] MySQL on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [09:48:41] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [09:49:01] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 42810 [10:03:11] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs [10:03:41] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 60346 MB (9% inode=99%): [10:05:40] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [10:09:11] aliasd on nightshade is CRITICAL: Connection refused [10:48:45] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [10:48:45] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [10:48:45] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). 
[10:48:45] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 235248 MB (4% inode=63%): [10:48:45] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [10:48:52] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [10:48:52] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [10:48:52] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:48:52] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [10:49:02] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:49:02] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:49:02] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:49:02] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:49:12] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:49:12] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:49:12] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:52:52] Load avg. on nightshade is WARNING: WARNING - load average: 2.12, 7.52, 17.41 [10:52:52] Sun Grid Engine execd on yarrow is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [10:53:42] /home on hemlock is CRITICAL: DISK CRITICAL - /home is not accessible: No such file or directory [10:53:52] s4 replag on z-dat-s5-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2510.000000 [10:54:02] Sun Grid Engine execd on nightshade is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [10:54:22] toolserver.org HTTP on ortelius is WARNING: HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.019 second response time [10:54:42] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [10:55:52] Load avg. 
on nightshade is OK: OK - load average: 2.10, 5.11, 14.77 [11:12:52] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3650.000000 [11:47:52] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [11:47:52] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 783002.000000 [11:47:52] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [11:47:52] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). [11:47:52] / on turnera is CRITICAL: NRPE: Command check_root not defined [11:47:52] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [11:47:53] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1941459.000000 [11:47:53] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [11:47:54] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [11:47:54] Load avg. 
on turnera is CRITICAL: NRPE: Command check_load not defined [11:47:55] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [11:47:55] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 979042.000000 [11:47:56] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [11:47:56] MySQL on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [11:48:02] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 68585.000000 [11:48:02] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [11:48:02] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [11:48:02] RAID on thyme is UNKNOWN: NRPE: Unable to read output [11:48:12] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2976050.000000 [11:48:12] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:48:12] SMTP on turnera is CRITICAL: Connection refused [11:48:12] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:48:12] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [11:48:13] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:48:13] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[11:48:14] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:48:14] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:48:22] / on thyme is UNKNOWN: NRPE: Unable to read output [11:48:22] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [11:48:22] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:48:22] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1586721.000000 [11:48:22] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:48:23] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:48:23] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:48:24] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [11:48:31] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 68472 [11:48:31] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:48:32] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:48:32] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [11:48:32] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [11:48:32] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [11:48:32] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:48:33] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:48:33] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:48:34] SSH on mayapple is CRITICAL: Server answer: [11:48:41] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [11:48:41] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). 
[11:48:42] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [11:48:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 235248 MB (4% inode=63%): [11:48:51] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [11:48:51] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [11:48:52] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [11:48:52] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:48:52] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [11:49:01] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:49:01] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:49:01] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:49:01] MySQL slave on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [11:49:01] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 5810.000000 [11:49:02] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [11:49:02] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [11:49:03] aliasd on nightshade is CRITICAL: Connection refused [11:49:03] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1873361.000000 [11:49:04] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:49:11] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:49:12] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:49:12] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[11:49:32] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [11:49:32] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [11:49:32] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [11:49:32] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [11:52:51] Sun Grid Engine execd on yarrow is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [11:53:42] /home on hemlock is CRITICAL: DISK CRITICAL - /home is not accessible: No such file or directory [11:54:01] Sun Grid Engine execd on nightshade is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [11:54:21] toolserver.org HTTP on ortelius is WARNING: HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.009 second response time [11:54:42] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [12:08:28] Damn you ts. [12:12:52] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7250.000000 [12:18:31] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [12:36:52] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [12:47:52] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [12:47:52] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 786602.000000 [12:47:52] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [12:47:52] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). 
[12:47:52] / on turnera is CRITICAL: NRPE: Command check_root not defined [12:47:53] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [12:47:53] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1945059.000000 [12:47:54] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [12:47:54] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [12:47:55] Load avg. on turnera is CRITICAL: NRPE: Command check_load not defined [12:47:55] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [12:47:56] MySQL on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [12:47:56] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [12:47:57] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 982642.000000 [12:48:01] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 57147.000000 [12:48:01] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [12:48:02] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [12:48:02] RAID on thyme is UNKNOWN: NRPE: Unable to read output [12:48:11] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2979650.000000 [12:48:11] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:48:12] SMTP on turnera is CRITICAL: Connection refused [12:48:12] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds 
[12:48:12] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [12:48:12] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:48:12] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:48:13] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:48:13] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:48:21] / on thyme is UNKNOWN: NRPE: Unable to read output [12:48:21] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [12:48:21] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1590321.000000 [12:48:22] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:48:22] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:48:22] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:48:22] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:48:23] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [12:48:31] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:48:31] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 57048 [12:48:32] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:48:32] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [12:48:32] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [12:48:32] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [12:48:32] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:48:33] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:48:33] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[12:48:34] SSH on mayapple is CRITICAL: Server answer: [12:48:41] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [12:48:42] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [12:48:42] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [12:48:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 235248 MB (4% inode=63%): [12:48:52] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [12:48:52] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [12:48:52] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [12:48:52] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:48:52] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [12:49:02] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:49:02] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:49:02] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:49:02] MySQL slave on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [12:49:02] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 9410.000000 [12:49:03] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [12:49:03] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [12:49:04] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1876962.000000 [12:49:04] aliasd on nightshade is CRITICAL: Connection refused [12:49:05] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:49:12] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:49:12] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[12:49:12] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:49:32] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [12:49:32] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [12:49:32] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [12:49:32] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [12:52:52] Sun Grid Engine execd on yarrow is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [12:53:41] /home on hemlock is CRITICAL: DISK CRITICAL - /home is not accessible: No such file or directory [12:54:02] Sun Grid Engine execd on nightshade is UNKNOWN: Cannot execute /sge/GE/bin/linux-x64/qhost [12:54:22] toolserver.org HTTP on ortelius is WARNING: HTTP WARNING: HTTP/1.1 404 Not found - 161 bytes in 0.011 second response time [12:54:42] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [13:12:51] s4 replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10850.000000 [13:18:32] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [13:25:40] Anyone know what is causing "SSI error: recursion exceeded" on toolserver.org URLs? [13:28:23] there's probably a SSI statement loading to itself [13:36:15] toolserver is having issues [13:36:30] russblau: NFS failure [13:36:53] Betacommand: "toolserver is having issues" is a statement that is always true! [13:36:57] Toolserver's been having issues. 
[13:47:52] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [13:47:52] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 790202.000000 [13:47:52] Sun Grid Engine execd on willow is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [13:47:52] APT on yucca is WARNING: APT WARNING: 40 packages available for upgrade (0 critical updates). [13:47:52] / on turnera is CRITICAL: NRPE: Command check_root not defined [13:47:52] ts-array5 on turnera is CRITICAL: NRPE: Command check_multipath not defined [13:47:52] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1948659.000000 [13:48:02] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 41502.000000 [13:48:02] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [13:48:02] NTP on turnera is CRITICAL: NTP CRITICAL: No response from NTP server [13:48:02] RAID on thyme is UNKNOWN: NRPE: Unable to read output [13:48:12] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2983251.000000 [13:48:12] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:48:12] SMTP on turnera is CRITICAL: Connection refused [13:48:12] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [13:48:12] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:48:13] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:48:13] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[13:48:14] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:48:14] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:48:22] / on thyme is UNKNOWN: NRPE: Unable to read output [13:48:22] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [13:48:22] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1593921.000000 [13:48:22] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:48:22] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:48:23] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:48:23] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:48:24] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [13:48:32] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 41350 [13:48:32] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:48:32] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:48:32] /tmp on turnera is CRITICAL: NRPE: Command check_tmp not defined [13:48:32] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [13:48:33] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [13:48:33] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:48:34] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:48:34] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:48:42] Environment IPMI on turnera is CRITICAL: NRPE: Command check_ipmi not defined [13:48:42] APT on sage is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [13:48:42] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). 
[13:48:42] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 235248 MB (4% inode=63%): [13:48:52] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [13:48:52] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [13:48:52] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [13:48:52] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:48:52] Sun Grid Engine execd on ortelius is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [13:48:53] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [13:48:53] Load avg. on turnera is CRITICAL: NRPE: Command check_load not defined [13:48:54] Sun Grid Engine execd on wolfsbane is UNKNOWN: Cannot execute /sge/GE/bin/sol-amd64/qstat [13:48:54] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [13:48:55] MySQL on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [13:48:55] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [13:48:56] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 986302.000000 [13:49:01] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:49:01] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[13:49:01] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:49:02] MySQL slave on z-dat-s1-b is CRITICAL: Cant connect to MySQL server on z-dat-s1-b (146) [13:49:02] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 13010.000000 [13:49:02] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [13:49:02] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [13:49:03] aliasd on nightshade is CRITICAL: Connection refused [13:49:03] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1880562.000000 [13:49:32] SSH on mayapple is CRITICAL: Server answer: [13:49:32] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [13:49:32] DiskSuite on turnera is CRITICAL: NRPE: Command check_disksuite not defined [13:49:32] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [13:49:32] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [13:51:52] NTP on adenia is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [15:32:10] hello all [16:10:37] hello DaB. [16:45:32] 2013/05/18 16:34 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [16:45:32] 2013/05/18 15:58 CRIT adenia FMA (Return code of 127 is out of bounds - plugin may be missing) [16:45:32] 2013/05/18 16:03 CRIT adenia MySQL Access denied for user 'tsnagios7643'@'turnera-bge0.esi.toolserver.org' (using password: NO) [16:45:32] 2013/05/18 16:03 CRIT adenia NTP NTP CRITICAL: Server not synchronized, Offset unknown [16:45:32] 2013/05/18 15:58 CRIT amaranth FMA (Return code of 127 is out of bounds - plugin may be missing) [16:45:40] 2013/05/18 16:45 CRIT mayapple / Timeout while attempting connection [16:49:38] Is the name change of tsnag permanent? 
[16:59:36] 2013/05/18 16:34 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [16:59:36] 2013/05/18 15:58 CRIT adenia FMA (Return code of 127 is out of bounds - plugin may be missing) [16:59:36] 2013/05/18 16:03 CRIT adenia MySQL Access denied for user 'tsnagios7643'@'turnera-bge0.esi.toolserver.org' (using password: NO) [16:59:36] 2013/05/18 16:03 CRIT adenia NTP NTP CRITICAL: Server not synchronized, Offset unknown [16:59:44] 2013/05/18 16:59 CRIT mayapple Sensors CHECK_NRPE: Socket timeout after 30 seconds. [17:00:20] scfc_de: no, just for testing [17:00:36] DaBPunkt: Okay. [17:00:44] 2013/05/18 16:59 CRIT mayapple / CHECK_NRPE: Socket timeout after 30 seconds. [17:00:44] 2013/05/18 16:59 ?? mayapple APT CHECK_NRPE: Error receiving data from daemon. [17:00:44] 2013/05/18 16:59 ?? mayapple Environment IPMI CHECK_NRPE: Error receiving data from daemon. [17:00:44] 2013/05/18 16:59 CRIT mayapple SRaid CHECK_NRPE: Socket timeout after 30 seconds. [17:00:44] 2013/05/18 17:00 ?? mayapple Sensors CHECK_NRPE: Error receiving data from daemon. [17:00:45] 2013/05/18 16:59 CRIT mayapple Sun Grid Engine execd CHECK_NRPE: Socket timeout after 30 seconds. [17:01:45] 2013/05/18 16:56 CRIT damiana Free Memory CRITICAL - 1.7% (146104 kB) free! [17:01:45] 2013/05/18 17:00 ?? mayapple / CHECK_NRPE: Error receiving data from daemon. [17:01:45] 2013/05/18 17:00 CRIT mayapple APT CHECK_NRPE: Socket timeout after 30 seconds. [17:01:45] 2013/05/18 17:00 CRIT mayapple Environment IPMI CHECK_NRPE: Socket timeout after 30 seconds. [17:01:45] 2013/05/18 17:00 ?? mayapple SRaid CHECK_NRPE: Error receiving data from daemon. [17:01:46] 2013/05/18 17:00 ?? mayapple Sun Grid Engine execd CHECK_NRPE: Error receiving data from daemon. 
[17:03:26] 2013/05/18 16:34 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [17:03:26] 2013/05/18 15:58 CRIT adenia FMA (Return code of 127 is out of bounds - plugin may be missing) [17:03:26] 2013/05/18 16:03 CRIT adenia MySQL Access denied for user 'tsnagios7643'@'turnera-bge0.esi.toolserver.org' (using password: NO) [17:03:26] 2013/05/18 16:03 CRIT adenia NTP NTP CRITICAL: Server not synchronized, Offset unknown [17:03:26] 2013/05/18 15:58 CRIT amaranth FMA (Return code of 127 is out of bounds - plugin may be missing) [17:04:55] 2013/05/18 16:34 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [17:04:55] 2013/05/18 15:58 CRIT adenia FMA (Return code of 127 is out of bounds - plugin may be missing) [17:04:55] 2013/05/18 16:03 CRIT adenia MySQL Access denied for user 'tsnagios7643'@'turnera-bge0.esi.toolserver.org' (using password: NO) [17:04:55] 2013/05/18 16:03 CRIT adenia NTP NTP CRITICAL: Server not synchronized, Offset unknown [17:04:55] 2013/05/18 15:58 CRIT amaranth FMA (Return code of 127 is out of bounds - plugin may be missing) [17:06:03] 2013/05/18 17:05 ?? mayapple / CHECK_NRPE: Error receiving data from daemon. [17:06:03] 2013/05/18 17:05 ?? mayapple APT CHECK_NRPE: Error receiving data from daemon. [17:06:03] 2013/05/18 17:04 CRIT mayapple SRaid CHECK_NRPE: Socket timeout after 30 seconds. [17:06:03] 2013/05/18 17:05 ?? mayapple Sun Grid Engine execd CHECK_NRPE: Error receiving data from daemon. [17:07:03] 2013/05/18 17:06 WARN damiana Free Memory WARNING - 6.5% (545600 kB) free! [17:07:03] 2013/05/18 17:06 CRIT mayapple Sun Grid Engine execd CHECK_NRPE: Socket timeout after 30 seconds. 
[17:07:42] 2013/05/18 16:34 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [17:07:42] 2013/05/18 15:58 CRIT adenia FMA (Return code of 127 is out of bounds - plugin may be missing) [17:07:42] 2013/05/18 16:03 CRIT adenia MySQL Access denied for user 'tsnagios7643'@'turnera-bge0.esi.toolserver.org' (using password: NO) [17:07:42] 2013/05/18 16:03 CRIT adenia NTP NTP CRITICAL: Server not synchronized, Offset unknown [17:07:42] 2013/05/18 15:58 CRIT amaranth FMA (Return code of 127 is out of bounds - plugin may be missing) [17:08:53] 2013/05/18 16:34 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [17:08:53] 2013/05/18 15:58 CRIT adenia FMA (Return code of 127 is out of bounds - plugin may be missing) [17:08:53] 2013/05/18 16:03 CRIT adenia MySQL Access denied for user 'tsnagios7643'@'turnera-bge0.esi.toolserver.org' (using password: NO) [17:08:53] 2013/05/18 16:03 CRIT adenia NTP NTP CRITICAL: Server not synchronized, Offset unknown [17:08:53] 2013/05/18 15:58 CRIT amaranth FMA (Return code of 127 is out of bounds - plugin may be missing) [17:13:56] 2013/05/18 16:34 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [17:13:56] 2013/05/18 15:58 CRIT adenia FMA (Return code of 127 is out of bounds - plugin may be missing) [17:13:56] 2013/05/18 16:03 CRIT adenia MySQL Access denied for user 'tsnagios7643'@'turnera-bge0.esi.toolserver.org' (using password: NO) [17:13:56] 2013/05/18 16:03 CRIT adenia NTP NTP CRITICAL: Server not synchronized, Offset unknown [17:13:56] 2013/05/18 15:58 CRIT amaranth FMA (Return code of 127 is out of bounds - plugin may be missing) [17:14:04] 2013/05/18 17:10 CRIT mayapple SRaid CHECK_NRPE: Socket timeout after 30 seconds. [17:15:04] 2013/05/18 17:14 CRIT mayapple /tmp CHECK_NRPE: Socket timeout after 30 seconds. [17:15:04] 2013/05/18 17:14 CRIT mayapple APT CHECK_NRPE: Socket timeout after 30 seconds. [17:15:04] 2013/05/18 17:14 ?? mayapple SRaid CHECK_NRPE: Error receiving data from daemon. [17:15:04] 2013/05/18 17:14 ?? 
mayapple Sensors CHECK_NRPE: Error receiving data from daemon. [17:15:04] 2013/05/18 17:14 CRIT mayapple aliasd CHECK_NRPE: Socket timeout after 30 seconds. [17:16:04] 2013/05/18 17:15 ?? mayapple Environment IPMI CHECK_NRPE: Error receiving data from daemon. [17:17:05] 2013/05/18 17:16 ?? mayapple / CHECK_NRPE: Error receiving data from daemon. [17:17:05] 2013/05/18 17:15 ?? mayapple /tmp CHECK_NRPE: Error receiving data from daemon. [17:17:05] 2013/05/18 17:16 ?? mayapple APT CHECK_NRPE: Error receiving data from daemon. [17:17:05] 2013/05/18 17:15 CRIT mayapple Environment IPMI CHECK_NRPE: Socket timeout after 30 seconds. [17:17:05] 2013/05/18 17:16 ?? mayapple Load avg. CHECK_NRPE: Error receiving data from daemon. [17:17:05] 2013/05/18 17:16 CRIT mayapple SRaid CHECK_NRPE: Socket timeout after 30 seconds. [17:17:05] 2013/05/18 17:16 CRIT mayapple Sensors CHECK_NRPE: Socket timeout after 30 seconds. [17:17:06] 2013/05/18 17:16 ?? mayapple Sun Grid Engine execd CHECK_NRPE: Error receiving data from daemon. [17:17:06] 2013/05/18 17:16 ?? mayapple aliasd CHECK_NRPE: Error receiving data from daemon. [17:18:05] 2013/05/18 17:17 CRIT mayapple / CHECK_NRPE: Socket timeout after 30 seconds. [17:18:05] 2013/05/18 17:17 ?? mayapple SRaid CHECK_NRPE: Error receiving data from daemon. [17:18:05] 2013/05/18 17:16 CRIT mayapple Sun Grid Engine execd CHECK_NRPE: Socket timeout after 30 seconds. [17:18:05] 2013/05/18 17:17 CRIT mayapple aliasd CHECK_NRPE: Socket timeout after 30 seconds. [17:19:06] 2013/05/18 17:18 ?? mayapple / CHECK_NRPE: Error receiving data from daemon. [17:19:06] 2013/05/18 17:18 CRIT mayapple APT CHECK_NRPE: Socket timeout after 30 seconds. [17:19:06] 2013/05/18 17:18 ?? mayapple Environment IPMI CHECK_NRPE: Error receiving data from daemon. [17:19:06] 2013/05/18 17:18 CRIT mayapple Load avg. Timeout while attempting connection [17:19:06] 2013/05/18 17:18 CRIT mayapple SRaid CHECK_NRPE: Socket timeout after 30 seconds. 
[17:20:07] 2013/05/18 17:19 CRIT mayapple / CHECK_NRPE: Socket timeout after 30 seconds. [17:20:07] 2013/05/18 17:19 ?? mayapple APT CHECK_NRPE: Error receiving data from daemon. [17:20:07] 2013/05/18 17:19 CRIT mayapple Environment IPMI CHECK_NRPE: Socket timeout after 30 seconds. [17:20:07] 2013/05/18 17:19 ?? mayapple SRaid CHECK_NRPE: Error receiving data from daemon. [17:20:07] 2013/05/18 17:19 ?? mayapple Sun Grid Engine execd CHECK_NRPE: Error receiving data from daemon. [17:20:44] 2013/05/18 16:34 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [17:20:44] 2013/05/18 15:58 CRIT adenia FMA (Return code of 127 is out of bounds - plugin may be missing) [17:20:44] 2013/05/18 16:03 CRIT adenia MySQL Access denied for user 'tsnagios7643'@'turnera-bge0.esi.toolserver.org' (using password: NO) [17:20:44] 2013/05/18 16:03 CRIT adenia NTP NTP CRITICAL: Server not synchronized, Offset unknown [17:20:44] 2013/05/18 15:58 CRIT amaranth FMA (Return code of 127 is out of bounds - plugin may be missing) [17:20:52] 2013/05/18 17:20 ?? mayapple / CHECK_NRPE: Error receiving data from daemon. [17:20:52] 2013/05/18 17:20 CRIT mayapple APT CHECK_NRPE: Socket timeout after 30 seconds. [17:20:52] 2013/05/18 17:20 ?? mayapple Load avg. CHECK_NRPE: Error receiving data from daemon. [17:20:52] 2013/05/18 17:20 CRIT mayapple SRaid CHECK_NRPE: Socket timeout after 30 seconds. [17:21:52] 2013/05/18 17:20 CRIT mayapple /tmp Timeout while attempting connection [17:21:52] 2013/05/18 17:21 CRIT mayapple Load avg. CHECK_NRPE: Socket timeout after 30 seconds. [17:21:52] 2013/05/18 17:20 ?? mayapple Sun Grid Engine execd CHECK_NRPE: Error receiving data from daemon. [17:22:52] 2013/05/18 17:22 ?? mayapple APT CHECK_NRPE: Error receiving data from daemon. [17:22:52] 2013/05/18 17:22 ?? mayapple SRaid CHECK_NRPE: Error receiving data from daemon. [17:22:52] 2013/05/18 17:21 CRIT mayapple Sun Grid Engine execd CHECK_NRPE: Socket timeout after 30 seconds. [17:23:52] 2013/05/18 17:22 ?? 
mayapple Sun Grid Engine execd CHECK_NRPE: Error receiving data from daemon. [17:23:52] 2013/05/18 17:22 CRIT mayapple aliasd CHECK_NRPE: Socket timeout after 30 seconds. [17:24:52] 2013/05/18 17:23 CRIT mayapple / Timeout while attempting connection [17:24:52] 2013/05/18 17:23 CRIT mayapple SRaid Timeout while attempting connection [17:25:52] 2013/05/18 17:24 ?? mayapple / CHECK_NRPE: Error receiving data from daemon. [17:25:52] 2013/05/18 17:25 CRIT mayapple APT CHECK_NRPE: Socket timeout after 30 seconds. [17:25:52] 2013/05/18 17:25 ?? mayapple SRaid CHECK_NRPE: Error receiving data from daemon. [17:26:52] 2013/05/18 17:26 ?? mayapple APT CHECK_NRPE: Error receiving data from daemon. [17:28:52] hi [17:28:52] 2013/05/18 17:27 CRIT mayapple / Timeout while attempting connection [17:28:52] 2013/05/18 17:28 CRIT mayapple APT CHECK_NRPE: Socket timeout after 30 seconds. [17:28:52] 2013/05/18 17:27 CRIT mayapple SRaid CHECK_NRPE: Socket timeout after 30 seconds. [17:28:52] 2013/05/18 17:27 ?? mayapple aliasd CHECK_NRPE: Error receiving data from daemon. [17:29:53] 2013/05/18 17:29 ?? mayapple APT CHECK_NRPE: Error receiving data from daemon. [17:29:53] 2013/05/18 17:28 ?? mayapple SRaid CHECK_NRPE: Error receiving data from daemon. [17:30:52] 2013/05/18 17:30 ?? mayapple / CHECK_NRPE: Error receiving data from daemon. [17:30:52] 2013/05/18 17:30 CRIT mayapple APT CHECK_NRPE: Socket timeout after 30 seconds. [17:30:52] 2013/05/18 17:30 CRIT mayapple SRaid CHECK_NRPE: Socket timeout after 30 seconds. [17:30:52] 2013/05/18 17:29 CRIT mayapple aliasd CHECK_NRPE: Socket timeout after 30 seconds. [17:31:53] 2013/05/18 17:31 ?? mayapple SRaid CHECK_NRPE: Error receiving data from daemon. [17:32:53] 2013/05/18 17:32 CRIT mayapple SRaid CHECK_NRPE: Socket timeout after 30 seconds. [17:33:53] 2013/05/18 17:32 ?? mayapple aliasd CHECK_NRPE: Error receiving data from daemon. [17:34:53] 2013/05/18 17:34 WARN damiana Free Memory WARNING - 7.0% (583068 kB) free! 
[17:34:53] 2013/05/18 17:33 CRIT mayapple aliasd CHECK_NRPE: Socket timeout after 30 seconds. [17:35:53] 2013/05/18 17:34 ?? mayapple aliasd CHECK_NRPE: Error receiving data from daemon. [17:36:53] 2013/05/18 17:36 CRIT damiana Free Memory CRITICAL - 3.3% (273024 kB) free! [17:36:53] 2013/05/18 17:34 CRIT ptolemy Environment CHECK_NRPE: Socket timeout after 30 seconds. [17:37:34] DaBPunkt: You're aware of "qstat -j 2002359": "05/18/2013 14:38:49 [0:17844]: can't set additional group id (uid=0, euid=0): Cannot allocate memory"? [17:37:53] 2013/05/18 17:36 CRIT mayapple aliasd CHECK_NRPE: Socket timeout after 30 seconds. [17:38:53] 2013/05/18 17:37 ?? mayapple aliasd CHECK_NRPE: Error receiving data from daemon. [17:40:53] 2013/05/18 17:40 WARN damiana Free Memory WARNING - 5.6% (472312 kB) free! [17:40:53] 2013/05/18 17:39 CRIT mayapple aliasd CHECK_NRPE: Socket timeout after 30 seconds. [17:40:53] 2013/05/18 17:40 OK ptolemy Environment ok: temperature ok fan ok voltage ok chassis ok [17:44:31] scfc_de: the reason is that damiana is short of memory again and will need another reboot soon [17:44:49] DaBPunkt: Where's the memory going? [17:44:54] 2013/05/18 17:44 CRIT damiana Free Memory CRITICAL - 4.9% (407584 kB) free! [17:45:50] scfc_de: the kernel eats it. There is a memory leak somewhere in the NFS daemon that nobody has been able to find [17:47:00] DaBPunkt: Can't we mount NFS on nightshade or another host? They have uptimes of more than a month. [17:47:29] (mount NFS: Mount the disks there, and then offer NFS from there.) [17:47:35] scfc_de: no, /home lives on a special-array [17:48:23] DaBPunkt: "special-array"? [17:49:03] an array of disks that is connected to both HA nodes [17:49:30] And that array is physically only accessible from damiana and turnera? [17:49:35] yes [17:50:08] Okay.
[17:51:12] 2013/05/18 16:34 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [17:51:18] 2013/05/18 16:03 CRIT adenia MySQL Access denied for user 'tsnagios7643'@'turnera-bge0.esi.toolserver.org' (using password: NO) [17:51:18] 2013/05/18 16:03 CRIT adenia NTP NTP CRITICAL: Server not synchronized, Offset unknown [17:51:18] 2013/05/18 15:58 CRIT amaranth Load avg. Timeout while attempting connection [17:51:29] test [18:00:45] I will reboot damiana now [18:02:20] 2013/05/18 18:01 CRIT ha-ldap.esi LDAP CRITICAL - Socket timeout after 10 seconds [18:03:20] 2013/05/18 18:03 CRIT damiana / Connection refused or timed out [18:03:20] 2013/05/18 18:03 CRIT damiana /tmp Connection refused or timed out [18:03:20] 2013/05/18 18:02 CRIT damiana DiskSuite Connection refused or timed out [18:03:20] 2013/05/18 18:02 CRIT damiana Environment IPMI Connection refused or timed out [18:03:20] 2013/05/18 18:02 CRIT damiana Load avg. Connection refused or timed out [18:03:21] 2013/05/18 18:02 CRIT damiana NTP CRITICAL - Socket timeout after 10 seconds [18:03:21] 2013/05/18 18:03 CRIT damiana SMTP CRITICAL - Socket timeout after 10 seconds [18:03:22] 2013/05/18 18:02 CRIT damiana ts-array5 Connection refused or timed out [18:03:22] 2013/05/18 18:02 CRIT ha-dns-auth Authoritative DNS CRITICAL - Plugin timed out while executing system call [18:03:23] 2013/05/18 18:02 CRIT ha-dns-auth PING CRITICAL - Host Unreachable (ha-dns-auth) [18:03:23] 2013/05/18 18:02 CRIT ha-dns-recursor.esi DNS recursor CRITICAL - Plugin timed out while executing system call [18:03:24] 2013/05/18 18:02 CRIT ha-dns-recursor.esi PING CRITICAL - Host Unreachable (ha-dns-recursor.esi) [18:03:24] 2013/05/18 18:02 CRIT ha-ldap.esi PING CRITICAL - Host Unreachable (ha-ldap.esi) [18:03:25] 2013/05/18 18:02 CRIT ha-nfs.esi NFS No route to host [18:04:20] 2013/05/18 18:03 CRIT damiana PING CRITICAL - Host Unreachable (damiana) [18:04:20] 2013/05/18 18:03 CRIT damiana SSH Connection refused [18:04:20] 2013/05/18 18:03 CRIT 
ha-sql.esi PING CRITICAL - Host Unreachable (ha-sql.esi) [18:04:20] 2013/05/18 18:03 CRIT nightshade APT CHECK_NRPE: Socket timeout after 30 seconds. [18:04:20] 2013/05/18 18:02 CRIT yarrow APT CHECK_NRPE: Socket timeout after 30 seconds. [18:05:20] 2013/05/18 18:05 OK damiana / DISK OK - free space: / 22298 MB (31% inode=95%): [18:05:21] 2013/05/18 18:04 OK damiana /tmp DISK OK - free space: /tmp 14195 MB (99% inode=99%): [18:05:21] 2013/05/18 18:04 OK damiana Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [18:05:21] 2013/05/18 18:04 OK damiana Free Memory OK - 86.2% (7221484 kB) free. [18:05:21] 2013/05/18 18:04 OK damiana Load avg. OK - load average: 1.30, 0.64, 0.25 [18:05:21] 2013/05/18 18:04 OK damiana PING PING OK - Packet loss = 0%, RTA = 0.48 ms [18:05:21] 2013/05/18 18:05 OK damiana SMTP SMTP OK - 0.463 sec. response time [18:05:22] 2013/05/18 18:05 OK damiana SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [18:05:22] 2013/05/18 18:04 OK damiana ts-array5 2/2 paths are active [18:05:23] 2013/05/18 18:04 OK ha-nfs.esi NFS TCP OK - 0.010 second response time on port 2049 [18:05:23] 2013/05/18 18:04 OK ha-nfs.esi PING PING OK - Packet loss = 0%, RTA = 0.18 ms [18:05:24] 2013/05/18 18:04 WARN nightshade APT APT WARNING: 67 packages available for upgrade (0 critical updates). [18:05:24] 2013/05/18 18:04 ?? ortelius Sun Grid Engine execd Cannot execute /sge/GE/bin/sol-amd64/qstat [18:05:25] 2013/05/18 18:04 ?? willow Sun Grid Engine execd Cannot execute /sge/GE/bin/sol-amd64/qstat [18:06:20] 2013/05/18 18:05 ?? ha-dns-auth Authoritative DNS check_dns: Invalid hostname/address - ha-dns-auth [18:07:20] 2013/05/18 18:06 ?? ha-dns-recursor.esi DNS recursor check_dns: Invalid hostname/address - ha-dns-recursor.esi [18:07:21] 2013/05/18 18:06 OK ha-proxy.esi PING PING OK - Packet loss = 0%, RTA = 0.47 ms [18:07:21] 2013/05/18 18:06 CRIT yarrow Sun Grid Engine execd CHECK_NRPE: Socket timeout after 30 seconds. 
[18:08:21] 2013/05/18 18:07 OK ha-www PING OK - Packet loss = 0%, RTA = 0.19 ms [18:08:21] 2013/05/18 18:07 OK ha-dns-auth Authoritative DNS DNS OK: 0.072 seconds response time. 1.www.toolserver.org returns 91.198.174.203 [18:08:21] 2013/05/18 18:07 OK ha-dns-auth PING PING OK - Packet loss = 0%, RTA = 0.17 ms [18:08:21] 2013/05/18 18:07 OK ha-dns-recursor.esi DNS recursor DNS OK: 0.342 seconds response time. www.google.com returns 74.125.136.103,74.125.136.104,74.125.136.105,74.125.136.106,74.125.136.147,74.125.136.99 [18:08:21] 2013/05/18 18:07 OK ha-dns-recursor.esi PING PING OK - Packet loss = 0%, RTA = 0.13 ms [18:08:21] 2013/05/18 18:07 OK ha-ldap.esi LDAP LDAP OK - 0.004 seconds response time [18:08:21] 2013/05/18 18:07 OK ha-ldap.esi PING PING OK - Packet loss = 0%, RTA = 0.43 ms [18:08:22] 2013/05/18 18:07 OK ha-proxy.esi HTTP proxy HTTP OK: HTTP/1.0 302 Moved Temporarily - 1114 bytes in 1.027 second response time [18:08:22] 2013/05/18 18:07 OK ha-sql.esi PING PING OK - Packet loss = 0%, RTA = 0.20 ms [18:08:23] 2013/05/18 18:07 OK ha-www HTTP svn HTTP OK: HTTP/1.1 200 OK - 310 bytes in 0.143 second response time [18:08:23] 2013/05/18 18:07 CRIT nightshade APT CHECK_NRPE: Socket timeout after 30 seconds. [18:08:24] 2013/05/18 18:06 CRIT nightshade Sun Grid Engine execd CHECK_NRPE: Socket timeout after 30 seconds. [18:08:24] 2013/05/18 18:07 CRIT wolfsbane Sun Grid Engine execd CHECK_NRPE: Socket timeout after 30 seconds. [18:08:25] 2013/05/18 18:06 CRIT yarrow APT CHECK_NRPE: Socket timeout after 30 seconds. [18:09:21] 2013/05/18 18:01 CRIT hemlock /home CHECK_NRPE: Socket timeout after 30 seconds. [18:09:21] 2013/05/18 18:08 CRIT ortelius Sun Grid Engine execd CHECK_NRPE: Socket timeout after 30 seconds. [18:09:21] 2013/05/18 18:07 CRIT ortelius toolserver.org HTTP CRITICAL - Socket timeout after 10 seconds [18:09:21] 2013/05/18 18:08 CRIT willow Sun Grid Engine execd CHECK_NRPE: Socket timeout after 30 seconds. 
[18:09:21] 2013/05/18 18:02 CRIT yarrow aliasd CHECK_NRPE: Socket timeout after 30 seconds. [18:11:21] 2013/05/18 18:11 CRIT damiana / Connection refused or timed out [18:11:21] 2013/05/18 18:11 CRIT damiana /tmp Connection refused or timed out [18:11:21] 2013/05/18 18:10 CRIT damiana Free Memory Timeout while attempting connection [18:11:21] 2013/05/18 18:10 CRIT damiana Load avg. Connection refused or timed out [18:11:21] 2013/05/18 18:11 CRIT damiana SMTP CRITICAL - Socket timeout after 10 seconds [18:12:22] 2013/05/18 18:11 CRIT damiana DiskSuite Connection refused or timed out [18:12:22] 2013/05/18 18:11 CRIT damiana Environment IPMI Connection refused or timed out [18:12:22] 2013/05/18 18:11 CRIT damiana NTP CRITICAL - Socket timeout after 10 seconds [18:12:22] 2013/05/18 18:11 CRIT damiana PING CRITICAL - Host Unreachable (damiana) [18:12:22] 2013/05/18 18:11 CRIT damiana SSH CRITICAL - Socket timeout after 10 seconds [18:12:23] 2013/05/18 18:11 CRIT damiana ts-array5 Connection refused or timed out [18:12:23] 2013/05/18 18:11 CRIT ha-dns-auth Authoritative DNS CRITICAL - Plugin timed out while executing system call [18:12:24] 2013/05/18 18:11 CRIT ha-dns-auth PING CRITICAL - Host Unreachable (ha-dns-auth) [18:12:24] 2013/05/18 18:11 CRIT ha-dns-recursor.esi DNS recursor CRITICAL - Plugin timed out while executing system call [18:12:25] 2013/05/18 18:11 CRIT ha-dns-recursor.esi PING CRITICAL - Host Unreachable (ha-dns-recursor.esi) [18:12:25] 2013/05/18 18:11 CRIT ha-ldap.esi LDAP Could not bind to the LDAP server [18:12:26] 2013/05/18 18:11 CRIT ha-ldap.esi PING CRITICAL - Host Unreachable (ha-ldap.esi) [18:12:26] 2013/05/18 18:11 CRIT ha-nfs.esi NFS No route to host [18:12:27] 2013/05/18 18:12 CRIT ha-nfs.esi PING CRITICAL - Host Unreachable (ha-nfs.esi) [18:13:22] 2013/05/18 18:13 OK damiana / DISK OK - free space: / 22294 MB (31% inode=95%): [18:13:23] 2013/05/18 18:13 OK damiana /tmp DISK OK - free space: /tmp 14698 MB (99% inode=99%): [18:13:23] 
2013/05/18 18:06 CRIT yarrow / CHECK_NRPE: Socket timeout after 30 seconds. [18:13:23] 2013/05/18 18:06 CRIT yarrow /var/tmp CHECK_NRPE: Socket timeout after 30 seconds. [18:13:23] 2013/05/18 18:06 CRIT yarrow Sensors CHECK_NRPE: Socket timeout after 30 seconds. [18:14:24] 2013/05/18 18:13 CRIT ha-ldap.esi CRITICAL - Host Unreachable (ha-ldap.esi) [18:14:24] 2013/05/18 18:13 CRIT ha-proxy.esi CRITICAL - Host Unreachable (ha-proxy.esi) [18:14:24] 2013/05/18 18:14 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [18:14:24] 2013/05/18 18:13 OK damiana Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [18:14:24] 2013/05/18 18:13 OK damiana Free Memory OK - 85.3% (7149520 kB) free. [18:14:24] 2013/05/18 18:13 OK damiana Load avg. OK - load average: 1.20, 0.64, 0.25 [18:14:24] 2013/05/18 18:13 OK damiana PING PING OK - Packet loss = 0%, RTA = 0.52 ms [18:14:25] 2013/05/18 18:13 OK damiana ts-array5 2/2 paths are active [18:14:25] 2013/05/18 18:13 OK ha-nfs.esi PING PING OK - Packet loss = 0%, RTA = 0.16 ms [18:14:26] 2013/05/18 18:06 CRIT nightshade / CHECK_NRPE: Socket timeout after 30 seconds. [18:14:26] 2013/05/18 18:06 CRIT nightshade /tmp CHECK_NRPE: Socket timeout after 30 seconds. [18:14:27] 2013/05/18 18:07 CRIT nightshade /var CHECK_NRPE: Socket timeout after 30 seconds. [18:14:27] 2013/05/18 18:07 CRIT nightshade /var/tmp CHECK_NRPE: Socket timeout after 30 seconds. [18:14:28] 2013/05/18 18:07 CRIT nightshade Environment IPMI CHECK_NRPE: Socket timeout after 30 seconds. 
[18:15:23] 2013/05/18 18:14 CRIT ha-dns-recursor.esi CRITICAL - Host Unreachable (ha-dns-recursor.esi) [18:15:24] 2013/05/18 18:15 OK ha-proxy.esi PING OK - Packet loss = 0%, RTA = 0.30 ms [18:15:24] 2013/05/18 18:15 OK ha-ldap.esi PING PING OK - Packet loss = 0%, RTA = 0.34 ms [18:15:24] 2013/05/18 18:15 OK ha-proxy.esi HTTP proxy HTTP OK: HTTP/1.0 302 Moved Temporarily - 1114 bytes in 0.124 second response time [18:16:25] 2013/05/18 18:15 OK ha-dns-recursor.esi PING OK - Packet loss = 0%, RTA = 0.25 ms [18:16:25] 2013/05/18 18:15 OK ha-ldap.esi PING OK - Packet loss = 0%, RTA = 0.34 ms [18:16:25] 2013/05/18 18:15 OK ha-www PING OK - Packet loss = 0%, RTA = 0.27 ms [18:16:25] 2013/05/18 18:15 OK ha-dns-auth Authoritative DNS DNS OK: 0.069 seconds response time. 1.www.toolserver.org returns 91.198.174.203 [18:16:25] 2013/05/18 18:15 OK ha-dns-auth PING PING OK - Packet loss = 0%, RTA = 0.19 ms [18:16:25] 2013/05/18 18:15 OK ha-dns-recursor.esi DNS recursor DNS OK: 0.200 seconds response time. 
www.google.com returns 74.125.136.103,74.125.136.104,74.125.136.105,74.125.136.106,74.125.136.147,74.125.136.99 [18:16:25] 2013/05/18 18:15 OK ha-dns-recursor.esi PING PING OK - Packet loss = 0%, RTA = 0.17 ms [18:16:26] 2013/05/18 18:15 OK ha-ldap.esi LDAP LDAP OK - 0.001 seconds response time [18:16:26] 2013/05/18 18:11 CRIT ha-nfs.esi NFS Connection refused [18:16:27] 2013/05/18 18:15 OK ha-proxy.esi PING PING OK - Packet loss = 0%, RTA = 0.30 ms [18:16:27] 2013/05/18 18:15 OK ha-sql.esi PING PING OK - Packet loss = 0%, RTA = 0.15 ms [18:16:28] 2013/05/18 18:15 OK ha-www HTTP svn HTTP OK: HTTP/1.1 200 OK - 310 bytes in 0.180 second response time [18:18:24] 2013/05/18 18:17 OK hemlock /home DISK OK - free space: /home 12864 MB (25% inode=80%): [18:18:24] 2013/05/18 18:17 OK nightshade / DISK OK - free space: / 1593 MB (89% inode=94%): [18:18:24] 2013/05/18 18:17 OK nightshade /tmp DISK OK - free space: /tmp 3321 MB (74% inode=99%): [18:18:24] 2013/05/18 18:18 OK nightshade /var DISK OK - free space: /var 9825 MB (73% inode=48%): [18:18:24] 2013/05/18 18:18 OK nightshade /var/tmp DISK OK - free space: /var/tmp 872 MB (98% inode=99%): [18:18:25] 2013/05/18 18:18 WARN nightshade APT APT WARNING: 67 packages available for upgrade (0 critical updates). 
[18:18:25] 2013/05/18 18:18 OK nightshade Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [18:18:26] 2013/05/18 18:18 WARN ortelius Sun Grid Engine execd NRPE: Unable to read output [18:18:26] 2013/05/18 18:18 WARN ortelius toolserver.org HTTP HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.730 second response time [18:18:27] 2013/05/18 18:18 WARN willow Sun Grid Engine execd NRPE: Unable to read output [18:18:27] 2013/05/18 18:17 OK yarrow / DISK OK - free space: / 1582 MB (88% inode=94%): [18:18:28] 2013/05/18 18:18 OK yarrow /tmp DISK OK - free space: /tmp 4086 MB (96% inode=99%): [18:18:28] 2013/05/18 18:18 OK yarrow /var DISK OK - free space: /var 11630 MB (87% inode=96%): [18:18:29] 2013/05/18 18:17 OK yarrow /var/tmp DISK OK - free space: /var/tmp 827 MB (97% inode=99%): [18:19:26] 2013/05/18 18:18 OK ha-nfs.esi NFS TCP OK - 0.000 second response time on port 2049 [18:19:26] 2013/05/18 18:18 OK nightshade Sensors sensor ok [18:19:26] 2013/05/18 18:18 ?? nightshade Sun Grid Engine execd Error with qhost: error: commlib error: got select error (Connection refused) [18:19:26] 2013/05/18 18:19 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.311 second response time [18:20:26] 2013/05/18 18:20 WARN nightshade Load avg. WARNING - load average: 6.01, 21.17, 16.49 [18:20:26] 2013/05/18 18:19 OK yarrow Load avg. OK - load average: 4.39, 14.82, 11.53 [18:22:26] 2013/05/18 18:22 OK nightshade Load avg. OK - load average: 1.67, 14.49, 14.61 [18:26:26] 2013/05/18 18:25 CRIT nightshade Sun Grid Engine execd CRITICAL: execd not communicating [18:26:26] 2013/05/18 18:25 CRIT yarrow Sun Grid Engine execd CRITICAL: execd not communicating [18:27:27] 2013/05/18 18:26 OK nightshade Sun Grid Engine execd Host and Queues Ok [19:18:53] Does anyone actually use recentchanges.rc_params? [19:19:53] Take it away, and wait for the response :-). [19:23:12] Merlissimo: What ports do execds listen on?
"qstat -f" shows yarrow as unresponsive, but I see execd in the process list. "qping -info yarrow 31117 execd 1" gives "Connection refused". [19:27:53] scfc_de: try to restart the service [19:28:34] How? It's not in /etc/init.d/rc.d. [19:29:05] scfc_de: sure it is: sgeexecd.toolserver [19:30:07] Ah, wrong directory. [19:30:48] Will the child processes (/var/sge/spool/execd/yarrow/job_scripts/*) be restarted? [19:31:03] they should [19:31:21] Let's see. [19:34:06] @replag [19:35:32] @replag [19:35:33] DaBPunkt: s1-rr-a: 10h 52m 49s [-]; s1-rr-a-wd: 2w 1d 22h 4m 26s [-]; s1-user-c: 3w 1d 9m 19s [-]; s1-user-wd: 3w 1d 19h 5m 29s [-]; s2-rr: 9h 23m 37s [-]; s2-user: 9h 23m 37s [-]; s2-user-c: error; s2-user-wd: 2w 4d 16h 32m 46s [-] [19:35:34] DaBPunkt: s3-user: 35s [-]; s3-user-wd: 1w 4d 15h 57m 9s [-]; s4-user-wd: 4w 6d 18h 28m 26s [-]; s5-rr-a: 11s [-]; s5-rr-a-wd: 1w 13h 22m 42s [-]; s5-user: 9h 23m 37s [-]; s5-user-c: 9h 23m 39s [-]; s6-user: 51s [-] [19:35:35] DaBPunkt: s6-user-wd: 1w 4d 15h 45m 24s [-]; s7-user-wd: 1w 2d 9h 18m 8s [-] [19:35:46] ok, works on shell [19:38:34] execd seems to be up and running at port 537 ("qping -info yarrow 537 execd 1"), but "qstat -f" hasn't noticed that yet. [19:46:01] there were some defective sge_shepherds running. I killed them and restarted sge [19:46:04] on yarrow [19:46:33] 2013/05/18 19:45 OK yarrow Sun Grid Engine execd Host and Queues Ok [20:06:38] DaBPunkt: The damiana reboot doesn't seem to have had an effect: "05/18/2013 19:48:36 [0:23348]: can't set additional group id (uid=0, euid=0): Cannot allocate memory" ("qstat -j 2002360").
[20:14:50] @replag [20:14:50] russblau: s1-rr-a: 7h 47m 14s [-]; s1-rr-a-wd: 2w 1d 22h 43m 42s [-]; s1-user: 40s [-]; s1-user-c: 3w 1d 48m 35s [-]; s1-user-wd: 3w 1d 19h 44m 45s [-]; s2-rr: 10h 2m 53s [-]; s2-user: 10h 2m 53s [-]; s2-user-c: error [20:14:51] russblau: s2-user-wd: 2w 4d 17h 12m 3s [-]; s3-user: 22s [-]; s3-user-wd: 1w 4d 16h 36m 26s [-]; s4-user-wd: 4w 6d 19h 7m 43s [-]; s5-rr-a: 14s [-]; s5-rr-a-wd: 1w 14h 1m 59s [-]; s5-user: 10h 2m 54s [-]; s5-user-c: 10h 2m 56s [-] [20:14:52] russblau: s6-user-wd: 1w 4d 16h 24m 41s [-]; s7-user: 12s [-]; s7-user-wd: 1w 2d 9h 57m 25s [-] [20:16:28] scfc_de: sorry, no idea at the moment where the problem could be [20:18:35] scfc_de: but AFAIS the bot is running with sge at the moment at least [21:01:36] 2013/05/18 20:58 CRIT damiana Free Memory CRITICAL - 3.1% (259608 kB) free! [21:19:38] 2013/05/18 21:13 CRIT yarrow aliasd Connection refused [21:35:14] [[User talk:Dab]] ! https://wiki.toolserver.org/w/index.php?diff=8009&oldid=7935&rcid=21922 * 2.207.78.135 * (+374) (/* Aussehen dieses Wiki */ new section) [21:46:49] [[User talk:Dab]] ! https://wiki.toolserver.org/w/index.php?diff=8010&oldid=8009&rcid=21923 * Tim.Landscheidt * (+201) (/* Aussehen dieses Wiki */ Antwort.) [22:28:14] WTF?!? [22:28:26] I put tsnag on my ignore list. [22:28:37] [[User talk:Dab]] ! https://wiki.toolserver.org/w/index.php?diff=8011&oldid=8010&rcid=21924 * 2.207.78.135 * (+112) (/* Aussehen dieses Wiki */ ) [22:34:15] Cyberpower678: scfc_de: no, just for testing [22:34:54] scfc_de, what's that supposed to mean? [22:36:03] Cyberpower678: It was just a test. [22:36:52] How did tsnag bypass my ignore command [22:36:53] ? [22:38:39] Cyberpower678: It took the disguise of ts-nag instead of tsnag.
[22:39:00] -.- [22:47:41] 2013/05/18 22:47 CRIT damiana /tmp Timeout while attempting connection [22:47:41] 2013/05/18 22:47 CRIT ha-dns-recursor.esi PING CRITICAL - Host Unreachable (ha-dns-recursor.esi) [22:47:41] 2013/05/18 22:47 CRIT ha-nfs.esi NFS No route to host [22:48:41] 2013/05/18 22:48 CRIT damiana / Connection refused or timed out [22:48:41] 2013/05/18 22:47 CRIT damiana DiskSuite Connection refused or timed out [22:48:41] 2013/05/18 22:47 CRIT damiana Environment IPMI Connection refused or timed out [22:48:41] 2013/05/18 22:47 CRIT damiana Load avg. Connection refused or timed out [22:48:41] 2013/05/18 22:48 CRIT damiana NTP CRITICAL - Socket timeout after 10 seconds [22:48:42] 2013/05/18 22:48 CRIT damiana PING CRITICAL - Host Unreachable (damiana) [22:48:42] 2013/05/18 22:48 CRIT damiana SMTP No route to host [22:48:43] 2013/05/18 22:48 CRIT damiana SSH No route to host [22:48:43] 2013/05/18 22:47 CRIT damiana ts-array5 Connection refused or timed out [22:48:44] 2013/05/18 22:47 CRIT ha-dns-recursor.esi DNS recursor CRITICAL - Plugin timed out while executing system call [22:48:44] 2013/05/18 22:47 CRIT ha-ldap.esi LDAP Could not bind to the LDAP server [22:48:45] 2013/05/18 22:48 CRIT ha-ldap.esi PING CRITICAL - Host Unreachable (ha-ldap.esi) [22:48:45] 2013/05/18 22:48 CRIT ha-nfs.esi PING CRITICAL - Host Unreachable (ha-nfs.esi) [22:48:46] 2013/05/18 22:48 CRIT ha-proxy.esi HTTP proxy CRITICAL - Socket timeout after 10 seconds [22:49:42] 2013/05/18 22:49 CRIT ha-ldap.esi PING CRITICAL - Packet loss = 61%, RTA = 5458.31 ms [22:49:42] 2013/05/18 22:48 CRIT ha-dns-auth Authoritative DNS CRITICAL - Plugin timed out while executing system call [22:49:42] 2013/05/18 22:48 CRIT ha-dns-auth PING CRITICAL - Host Unreachable (ha-dns-auth) [22:50:15] Seriously? Please shut it off. [22:50:26] ts down again? 
:| [22:50:42] 2013/05/18 22:50 CRIT damiana CRITICAL - Host Unreachable (damiana) [22:50:42] 2013/05/18 22:50 CRIT ha-proxy.esi CRITICAL - Host Unreachable (ha-proxy.esi) [22:51:41] 2013/05/18 22:51 CRIT ha-dns-recursor.esi CRITICAL - Host Unreachable (ha-dns-recursor.esi) [22:51:41] 2013/05/18 22:50 CRIT ha-nfs.esi CRITICAL - Host Unreachable (ha-nfs.esi) [22:51:58] and it is dead again… [22:52:42] 2013/05/18 22:52 CRIT ha-sql.esi CRITICAL - Host Unreachable (ha-sql.esi) [22:52:42] 2013/05/18 22:52 CRIT ha-www CRITICAL - Host Unreachable (ha-www) [22:53:42] 2013/05/18 22:47 CRIT cassia SMTP CRITICAL - Socket timeout after 10 seconds [22:53:42] 2013/05/18 22:47 CRIT nightshade SMTP CRITICAL - Socket timeout after 10 seconds [22:53:42] 2013/05/18 22:47 CRIT ortelius toolserver.org HTTP CRITICAL - Socket timeout after 10 seconds [22:53:42] 2013/05/18 22:47 CRIT rosemary SMTP CRITICAL - Socket timeout after 10 seconds [22:53:42] 2013/05/18 22:47 CRIT thyme SMTP CRITICAL - Socket timeout after 10 seconds [22:53:43] 2013/05/18 22:47 CRIT z-dat-s4-a SMTP CRITICAL - Socket timeout after 10 seconds [22:53:43] 2013/05/18 22:47 CRIT z-dat-s6-a SMTP CRITICAL - Socket timeout after 10 seconds [22:54:42] 2013/05/18 22:54 OK damiana PING OK - Packet loss = 0%, RTA = 0.31 ms [22:54:42] 2013/05/18 22:54 OK ha-nfs.esi PING OK - Packet loss = 0%, RTA = 0.48 ms [22:54:42] 2013/05/18 22:47 CRIT adenia SMTP CRITICAL - Socket timeout after 10 seconds [22:54:42] 2013/05/18 22:54 OK damiana PING PING OK - Packet loss = 0%, RTA = 0.52 ms [22:54:42] 2013/05/18 22:54 OK ha-nfs.esi PING PING OK - Packet loss = 0%, RTA = 0.19 ms [22:54:43] 2013/05/18 22:47 CRIT hemlock /home CHECK_NRPE: Socket timeout after 30 seconds. [22:54:43] 2013/05/18 22:47 CRIT hyacinth SMTP CRITICAL - Socket timeout after 10 seconds [22:54:44] 2013/05/18 22:47 CRIT nightshade / CHECK_NRPE: Socket timeout after 30 seconds. [22:54:44] 2013/05/18 22:47 CRIT nightshade /tmp CHECK_NRPE: Socket timeout after 30 seconds. 
[22:54:45] 2013/05/18 22:48 CRIT nightshade /var CHECK_NRPE: Socket timeout after 30 seconds. [22:54:45] 2013/05/18 22:47 CRIT nightshade /var/tmp CHECK_NRPE: Socket timeout after 30 seconds. [22:54:46] 2013/05/18 22:47 CRIT nightshade Environment IPMI CHECK_NRPE: Socket timeout after 30 seconds. [22:54:46] 2013/05/18 22:47 CRIT nightshade Load avg. CHECK_NRPE: Socket timeout after 30 seconds. [22:54:47] 2013/05/18 22:47 CRIT nightshade Sensors CHECK_NRPE: Socket timeout after 30 seconds. [22:55:45] 2013/05/18 22:55 CRIT ha-dns-auth CRITICAL - Host Unreachable (ha-dns-auth) [22:55:45] 2013/05/18 22:55 OK damiana / DISK OK - free space: / 21408 MB (29% inode=95%): [22:55:45] 2013/05/18 22:54 OK damiana Environment IPMI ok: temperature ok fan ok voltage ok chassis ok [22:55:45] 2013/05/18 22:54 OK damiana Free Memory OK - 91.3% (7647296 kB) free. [22:55:45] 2013/05/18 22:54 OK damiana Load avg. OK - load average: 1.08, 0.57, 0.23 [22:57:01] 2013/05/18 22:56 OK ha-dns-auth PING OK - Packet loss = 0%, RTA = 0.69 ms [22:57:01] 2013/05/18 22:56 OK ha-dns-recursor.esi PING OK - Packet loss = 0%, RTA = 523.10 ms [22:57:01] 2013/05/18 22:56 OK ha-ldap.esi PING OK - Packet loss = 0%, RTA = 0.74 ms [22:57:01] 2013/05/18 22:56 OK ha-proxy.esi PING OK - Packet loss = 0%, RTA = 0.28 ms [22:57:01] 2013/05/18 22:56 OK ha-sql.esi PING OK - Packet loss = 0%, RTA = 511.35 ms [22:57:02] 2013/05/18 22:56 OK ha-www PING OK - Packet loss = 0%, RTA = 0.41 ms [22:57:02] 2013/05/18 22:56 OK ha-proxy.esi PING PING OK - Packet loss = 0%, RTA = 0.23 ms [22:57:03] 2013/05/18 22:56 WARN nightshade Load avg. WARNING - load average: 11.41, 23.17, 13.78 [22:57:03] 2013/05/18 22:56 OK ortelius SMTP SMTP OK - 8.870 sec. response time [22:57:04] 2013/05/18 22:56 CRIT ortelius Sun Grid Engine execd CHECK_NRPE: Socket timeout after 30 seconds. [22:57:04] 2013/05/18 22:56 OK ptolemy SMTP SMTP OK - 0.857 sec. response time [22:57:05] 2013/05/18 22:56 OK z-dat-s7-a SMTP SMTP OK - 5.019 sec. 
response time [22:57:20] and we are back [22:58:02] 2013/05/18 22:57 OK adenia SMTP SMTP OK - 0.003 sec. response time [22:58:02] 2013/05/18 22:57 OK cassia SMTP SMTP OK - 0.002 sec. response time [22:58:02] 2013/05/18 22:57 OK ha-dns-auth Authoritative DNS DNS OK: 0.070 seconds response time. 1.www.toolserver.org returns 91.198.174.203 [22:58:02] 2013/05/18 22:57 OK ha-dns-auth PING PING OK - Packet loss = 0%, RTA = 0.28 ms [22:58:02] 2013/05/18 22:57 OK ha-dns-recursor.esi PING PING OK - Packet loss = 0%, RTA = 0.14 ms [22:58:03] 2013/05/18 22:57 OK ha-ldap.esi PING PING OK - Packet loss = 0%, RTA = 0.55 ms [22:58:03] 2013/05/18 22:57 OK ha-proxy.esi HTTP proxy HTTP OK: HTTP/1.0 302 Moved Temporarily - 1114 bytes in 0.961 second response time [22:58:04] 2013/05/18 22:57 OK ha-sql.esi PING PING OK - Packet loss = 0%, RTA = 0.44 ms [22:58:04] 2013/05/18 22:57 OK ha-www HTTP svn HTTP OK: HTTP/1.1 200 OK - 310 bytes in 0.163 second response time [22:58:05] 2013/05/18 22:56 OK hemlock /home DISK OK - free space: /home 12895 MB (25% inode=80%): [22:58:05] 2013/05/18 22:57 OK hyacinth SMTP SMTP OK - 0.003 sec. response time [22:58:06] 2013/05/18 22:57 OK nightshade SMTP SMTP OK - 0.005 sec. response time [22:58:06] 2013/05/18 22:57 OK nightshade Sun Grid Engine execd Host and Queues Ok [22:58:07] 2013/05/18 22:57 WARN ortelius Sun Grid Engine execd NRPE: Unable to read output [23:00:01] 2013/05/18 22:59 OK nightshade Load avg. OK - load average: 2.30, 13.89, 11.85 [23:01:05] [[Special:Log/newusers]] create * The Illusive Man * (New user account) [23:13:21] DaBPunkt, good now can you switch that tsnag that's supposed to be on my ignore list. [23:15:42] Cyberpower678: I did hours ago [23:20:31] DaBPunkt, toolserver is still slow. [23:37:52] DaBPunkt: Things a bit better now?