[00:04:02] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [00:25:52] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:26:32] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:26:52] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 352163.000000 [00:27:31] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:28:32] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:28:42] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1141335.000000 [00:29:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [00:29:51] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [00:29:51] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [00:29:52] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [00:29:52] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:12] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:12] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[00:30:12] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:12] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:13] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:13] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:32] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:30:32] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:32] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:32] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:30:32] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:30:33] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:30:42] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2314393.000000 [00:30:42] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:52] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [00:30:52] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1426449.000000 [00:30:52] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:52] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[00:30:52] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:53] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:53] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:54] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:30:54] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:31:12] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:31:12] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:31:12] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:31:31] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:31:41] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:31:42] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:31:52] SSH on mayapple is CRITICAL: Server answer: [00:31:52] RAID on thyme is UNKNOWN: NRPE: Unable to read output [00:32:02] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [00:32:02] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:32:02] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:32:12] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[00:32:12] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:32:12] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:32:12] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:32:12] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [00:32:22] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 692749.000000 [00:32:31] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:32:31] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:32:31] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:32:52] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [00:33:01] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 503617.000000 [00:33:12] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:33:31] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:33:42] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [00:34:02] / on thyme is UNKNOWN: NRPE: Unable to read output [00:34:02] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1316367.000000 [00:34:03] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:34:03] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:36:22] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [00:36:51] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [00:37:02] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [00:37:12] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [00:37:12] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [00:37:32] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:38:02] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [00:38:22] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 73721 [00:38:22] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 34131 [00:40:02] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [00:40:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [00:40:42] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 65344.000000 [00:42:13] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [00:44:03] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 5868 [00:44:12] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 286233 MB (5% inode=64%): [00:44:41] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 5865.000000 [00:45:21] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 65478.000000 [00:46:41] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [00:46:52] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [00:51:41] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL
[00:56:52] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 63132 MB (10% inode=99%): [01:25:55] Ha! Inodes have run out on yarrow's /var. [01:26:02] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:26:32] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:26:52] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 352125.000000 [01:27:32] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:28:33] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:28:42] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1144938.000000 [01:29:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [01:29:51] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [01:29:52] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [01:29:52] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [01:29:52] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:30:01] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:30:13] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:30:13] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[01:30:13] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:30:13] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:30:13] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:30:13] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:30:32] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:30:32] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:30:32] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:30:32] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:30:42] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:30:43] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:30:43] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2317840.000000 [01:30:51] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1429974.000000 [01:30:51] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [01:30:52] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:30:52] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:31:02] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[01:31:02] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:31:02] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:31:02] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:31:02] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:31:03] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:31:12] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:31:32] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:31:52] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:31:52] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:31:52] SSH on mayapple is CRITICAL: Server answer: [01:31:52] RAID on thyme is UNKNOWN: NRPE: Unable to read output [01:32:02] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [01:32:12] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:32:12] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[01:32:12] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:32:12] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:32:12] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:32:13] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:32:13] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:32:14] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:32:14] / on wolfsbane is WARNING: DISK WARNING - free space: / 5982 MB (19% inode=93%): [01:32:15] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:32:21] /tmp on wolfsbane is WARNING: DISK WARNING - free space: / 5982 MB (19% inode=93%): [01:32:32] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:32:32] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:32:32] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:33:02] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 500904.000000 [01:33:02] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [01:33:12] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:33:19] alexz, amir, betacommand, cbm, cdpark, devunt, dpl, earwig, enwp10, erfgoed, farbodebrahimi, fekepp, giftbot, gifti, hoo, hydriz, javadyou, jimmy, kolossos, liangent, lvova, merl, merliwbot, pasqual, putnik, rdab, root, russell, sk, toolserverdb, valhallasw, wolf: Someone around with a screen session on yarrow? 
[01:33:20] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 688867.000000 [01:33:33] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:33:33] putnik: ping [01:33:42] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [01:34:13] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:34:13] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:35:02] / on thyme is UNKNOWN: NRPE: Unable to read output [01:35:03] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1319335.000000 [01:36:52] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [01:37:03] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [01:37:12] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [01:37:21] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [01:37:21] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [01:38:02] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[01:39:21] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 68129 [01:39:21] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 36958 [01:40:02] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [01:40:03] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [01:40:42] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 67131.000000 [01:42:12] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [01:44:03] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 6040 [01:44:12] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 285959 MB (5% inode=64%): [01:44:42] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 6043.000000 [01:45:21] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 67310.000000 [01:46:42] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [01:46:51] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [01:51:42] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [01:53:03] Tim Landscheidt * [Toolserver-l] Inodes have run out on yarrow's /var [01:56:52] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 60954 MB (9% inode=99%): [02:06:52] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.004 second response time [02:26:02] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:26:43] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[02:26:52] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 352909.000000 [02:27:41] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:28:42] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:28:52] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1148539.000000 [02:29:52] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [02:29:52] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [02:29:53] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [02:30:02] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:30:13] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:30:13] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:30:13] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:30:13] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:30:21] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:30:21] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[02:30:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [02:30:42] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:30:42] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:30:42] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:30:42] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:30:43] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:30:43] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:30:43] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2321233.000000 [02:30:51] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [02:30:51] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1433451.000000 [02:31:01] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:31:01] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:31:02] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:31:02] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:31:02] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[02:31:11] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:31:42] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:31:52] SSH on mayapple is CRITICAL: Server answer: [02:31:52] RAID on thyme is UNKNOWN: NRPE: Unable to read output [02:31:52] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:32:02] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:32:02] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:32:02] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:32:02] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:32:12] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [02:32:12] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [02:32:12] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:32:12] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:32:12] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:32:13] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:32:13] / on wolfsbane is WARNING: DISK WARNING - free space: / 6136 MB (20% inode=93%): [02:32:14] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[02:32:14] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:32:21] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:32:21] /tmp on wolfsbane is WARNING: DISK WARNING - free space: / 6139 MB (20% inode=93%): [02:32:41] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:32:41] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:32:41] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:33:01] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 498419.000000 [02:33:12] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:33:12] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:33:21] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 684007.000000 [02:33:43] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:33:43] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [02:34:02] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [02:34:12] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:34:12] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:34:31] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:35:03] / on thyme is UNKNOWN: NRPE: Unable to read output [02:35:03] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1320440.000000 [02:35:03] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[02:36:53] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [02:37:12] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [02:37:13] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [02:37:22] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [02:37:22] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [02:39:21] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 62852 [02:39:21] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 31896 [02:40:43] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 68919.000000 [02:41:02] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [02:41:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [02:42:12] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [02:44:13] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 285865 MB (5% inode=64%): [02:45:01] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 6296 [02:45:22] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 69096.000000 [02:45:42] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 6292.000000 [02:46:41] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [02:46:52] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [02:56:17] [[User talk:Gingping121]] !N https://wiki.toolserver.org/w/index.php?oldid=8001&rcid=21898 * Gingping121 * (+995) (Created page with "In The far east you can get all kinds of food and delightful places Inside the far east, each and every year amounts of [http://www.chinaexpeditiontours.com/ China Tours] thing...")
[02:56:52] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 62855 MB (10% inode=99%): [03:05:12] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 69828 [03:07:14] / on wolfsbane is OK: DISK OK - free space: / 7860 MB (26% inode=93%): [03:07:20] /tmp on wolfsbane is OK: DISK OK - free space: / 9239 MB (30% inode=93%): [03:14:12] MySQL slave on z-dat-s5-b is OK: Uptime: 68604 Threads: 21 Questions: 107947094 Slow queries: 326 Opens: 10934 Flush tables: 1 Open tables: 256 Queries per second avg: 1573.481 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 857 [03:18:08] [[User talk:Gingping121]] ! https://wiki.toolserver.org/w/index.php?diff=8002&oldid=8001&rcid=21899 * Gingping121 * (+1288) (/* There are plenty of diverse meals available within Cina trip */ new section) [03:21:12] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 70738 [03:21:42] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [03:26:12] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:26:42] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:26:52] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 354591.000000 [03:27:42] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:28:52] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1152144.000000 [03:29:43] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:29:52] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [03:29:52] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates).
[03:29:52] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [03:30:13] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:30:13] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:30:13] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:30:21] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:30:22] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:30:22] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:30:22] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:30:32] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [03:30:52] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2324661.000000 [03:30:52] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [03:30:52] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1436932.000000 [03:31:01] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[03:31:11] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:31:12] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:31:12] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:31:12] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:31:13] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:31:42] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:31:42] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:31:42] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:31:42] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:31:42] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:31:42] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:31:43] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:31:52] SSH on mayapple is CRITICAL: Server answer: [03:31:52] RAID on thyme is UNKNOWN: NRPE: Unable to read output [03:32:02] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:32:02] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:32:11] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[03:32:11] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [03:32:12] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [03:32:12] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:32:12] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:32:12] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:32:12] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:32:18] [[User talk:Gingping121]] ! https://wiki.toolserver.org/w/index.php?diff=8003&oldid=8002&rcid=21900 * Gingping121 * (+1096) (/* Diablo 3 gold is excellent to buy some thing in the game */ new section) [03:32:21] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:32:21] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:32:41] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:32:41] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:32:41] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:33:01] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 497670.000000 [03:33:13] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:33:13] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:33:13] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:33:13] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[03:33:13] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:33:22] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 680168.000000 [03:34:11] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [03:34:42] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:34:43] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [03:35:12] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:35:12] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:36:02] / on thyme is UNKNOWN: NRPE: Unable to read output [03:36:02] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1322229.000000 [03:36:02] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [03:36:34] [[Special:Log/block]] block * Legoktm * (blocked [[User:Gingping121]] with an expiry time of infinite (account creation disabled): spam) [03:36:46] [[Special:Log/delete]] delete * Legoktm * (deleted "[[User talk:Gingping121]]": spam) [03:37:12] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [03:37:12] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [03:37:21] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [03:37:22] Load avg.
on thyme is UNKNOWN: NRPE: Unable to read output [03:37:52] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [03:39:12] MySQL slave on z-dat-s5-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2122 [03:39:21] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 56972 [03:39:22] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 23835 [03:40:13] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 71791 [03:41:01] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [03:41:01] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [03:41:11] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:41:42] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 70912.000000 [03:42:22] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [03:45:01] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 6095 [03:45:12] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 285789 MB (5% inode=64%): [03:45:32] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 72105.000000 [03:45:52] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 6065.000000 [03:46:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [03:46:52] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [03:56:52] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 62771 MB (10% inode=99%): [04:21:42] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [04:26:52] wikidata 
replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 356523.000000 [04:27:11] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:27:42] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:28:42] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:28:52] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1155744.000000 [04:29:52] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [04:29:53] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [04:29:53] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [04:30:22] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:30:22] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:30:22] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:30:22] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:30:22] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:30:23] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:30:31] s4 replag on z-dat-s5-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2115.000000 [04:30:42] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[04:30:42] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [04:31:12] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:31:12] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:31:12] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:31:12] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:31:12] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:31:12] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:31:22] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:31:42] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:31:42] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:31:42] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:31:42] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:31:42] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:31:43] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:31:43] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[04:31:52] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2328110.000000 [04:31:52] SSH on mayapple is CRITICAL: Server answer: [04:31:52] RAID on thyme is UNKNOWN: NRPE: Unable to read output [04:31:52] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1440451.000000 [04:31:52] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [04:32:01] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:32:11] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [04:32:11] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:32:12] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:32:12] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [04:32:12] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:32:12] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:32:12] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:32:13] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:32:21] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:32:22] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:32:41] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[04:32:42] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:32:43] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:33:02] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 497770.000000 [04:33:12] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:33:22] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:33:22] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:33:22] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 678632.000000 [04:34:11] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:34:12] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:34:21] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [04:34:42] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:34:42] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [04:35:12] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:35:12] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:36:01] / on thyme is UNKNOWN: NRPE: Unable to read output [04:36:02] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1325612.000000 [04:36:02] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [04:37:22] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [04:37:22] Load avg. 
on thyme is UNKNOWN: NRPE: Unable to read output [04:38:12] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [04:38:12] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [04:38:52] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [04:39:01] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.009 second response time [04:39:21] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 54950 [04:39:22] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 26940 [04:41:01] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [04:41:01] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [04:41:11] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 75424 [04:41:42] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 73151.000000 [04:42:22] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [04:43:52] /sql on z-dat-s1-b is WARNING: DISK WARNING - free space: /sql 81963 MB (8% inode=99%): [04:45:02] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 4819 [04:45:12] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 285642 MB (5% inode=64%): [04:45:32] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 75676.000000 [04:45:52] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 4778.000000 [04:46:52] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [04:46:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [04:56:52] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 62705 MB (10% inode=99%): [05:02:02] toolserver.org HTTP on wolfsbane is 
CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:04:52] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3581.000000 [05:05:01] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3587 [05:05:52] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3615.000000 [05:06:02] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3620 [05:06:52] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3562.000000 [05:07:04] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3566 [05:21:51] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [05:24:32] s4 replag on z-dat-s5-b is OK: QUERY OK: SELECT ts_rc_age() returned 1667.000000 [05:27:22] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:27:43] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:27:52] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 357404.000000 [05:28:43] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:28:51] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1159345.000000 [05:29:52] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [05:30:01] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [05:30:01] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [05:30:22] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[05:30:22] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:30:22] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:30:22] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:30:23] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:30:23] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:30:42] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [05:30:43] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:31:12] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:31:22] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:31:22] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:31:22] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:31:22] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:31:22] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[05:31:23] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:31:42] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:31:42] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:31:42] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:31:42] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:31:42] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:31:43] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:31:43] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:31:51] SSH on mayapple is CRITICAL: Server answer: [05:31:51] RAID on thyme is UNKNOWN: NRPE: Unable to read output [05:31:51] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2331484.000000 [05:32:11] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [05:32:11] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [05:32:12] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:32:12] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:32:22] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:32:23] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:32:23] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[05:32:23] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:32:23] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:32:23] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:32:23] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:32:42] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:32:42] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:32:43] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:32:51] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1442042.000000 [05:33:01] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [05:33:01] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 498918.000000 [05:33:11] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:33:22] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:33:22] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 679123.000000 [05:33:22] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [05:34:28] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:34:28] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:34:28] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[05:34:42] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [05:34:42] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:35:11] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:35:22] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:36:02] / on thyme is UNKNOWN: NRPE: Unable to read output [05:36:03] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1327763.000000 [05:36:03] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [05:36:03] MySQL slave on rosemary is OK: Uptime: 366056 Threads: 45 Questions: 289037397 Slow queries: 142590 Opens: 24214 Flush tables: 1 Open tables: 3340 Queries per second avg: 789.598 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1792 [05:36:52] s1 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1787.000000 [05:37:22] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [05:37:23] Load avg. 
on thyme is UNKNOWN: NRPE: Unable to read output [05:38:11] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [05:38:12] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [05:38:52] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [05:39:23] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 53418 [05:39:23] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 12656 [05:41:02] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [05:41:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [05:41:11] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 77291 [05:41:42] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 75246.000000 [05:42:22] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [05:45:32] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 76627.000000 [05:46:12] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 285529 MB (5% inode=64%): [05:46:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [05:46:53] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [05:52:11] MySQL slave on z-dat-s5-b is OK: Uptime: 78084 Threads: 6 Questions: 132128805 Slow queries: 607 Opens: 34914 Flush tables: 1 Open tables: 256 Queries per second avg: 1692.136 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [05:56:52] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 62609 MB (10% inode=99%): [06:00:23] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3461 [06:02:23] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:03:11] toolserver.org HTTP 
on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.004 second response time [06:04:22] MySQL slave on z-dat-s2-b is OK: Uptime: 78870 Threads: 10 Questions: 65347702 Slow queries: 980 Opens: 821169 Flush tables: 1 Open tables: 256 Queries per second avg: 828.549 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1341 [06:11:12] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 76505 [06:12:11] MySQL slave on z-dat-s5-b is OK: Uptime: 79284 Threads: 9 Questions: 132323590 Slow queries: 615 Opens: 35208 Flush tables: 1 Open tables: 256 Queries per second avg: 1668.982 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 297 [06:27:23] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:27:52] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:27:52] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 357379.000000 [06:28:52] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:28:52] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1162946.000000 [06:29:52] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [06:30:01] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [06:30:02] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [06:30:23] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:30:23] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[06:30:23] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:30:23] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:30:24] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:30:32] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:30:42] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [06:30:52] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:31:22] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:31:22] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:31:22] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:31:22] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:31:22] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:31:23] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:31:23] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[06:31:42] SSH on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:31:43] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:31:43] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:31:51] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:31:52] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:31:52] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:31:52] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:31:52] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:31:53] SSH on mayapple is CRITICAL: Server answer: [06:31:53] RAID on thyme is UNKNOWN: NRPE: Unable to read output [06:31:53] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2334780.000000 [06:32:23] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [06:32:23] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:32:23] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:32:23] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:32:23] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:32:24] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:32:24] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[06:32:25] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:32:25] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:32:26] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:32:52] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:32:52] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:32:52] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:32:52] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1444181.000000 [06:33:01] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 500465.000000 [06:33:01] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [06:33:11] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [06:33:23] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:33:24] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:33:24] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 679232.000000 [06:33:24] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [06:34:22] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:34:23] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:34:23] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[06:34:51] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:35:23] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:35:23] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:35:43] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [06:36:02] / on thyme is UNKNOWN: NRPE: Unable to read output [06:37:02] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1328653.000000 [06:37:02] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [06:37:23] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [06:37:23] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [06:38:11] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [06:38:22] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [06:38:52] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [06:39:23] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 51803 [06:41:02] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [06:41:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [06:41:41] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 77562.000000 [06:42:23] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [06:45:32] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 77962.000000 [06:46:11] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 285489 MB (5% inode=64%): [06:46:52] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [06:46:52] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [06:51:41] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL
[06:56:52] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 62457 MB (10% inode=99%): [07:20:22] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:28:22] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:28:53] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:28:53] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:28:53] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1166548.000000 [07:28:53] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 357307.000000 [07:30:02] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [07:30:02] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [07:30:02] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [07:30:23] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:30:23] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:30:23] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:30:23] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:30:32] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:30:52] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:31:22] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:31:23] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:31:23] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:31:23] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:31:23] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:31:31] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:31:43] SSH on z-dat-s2-b is OK: SSH OK - OpenSSH_5.5p1 Debian-6+squeeze3 (protocol 2.0) [07:31:43] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [07:31:52] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:31:52] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:31:52] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:31:52] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:31:52] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:31:53] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:31:53] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[07:31:53] SSH on mayapple is CRITICAL: Server answer: [07:31:54] RAID on thyme is UNKNOWN: NRPE: Unable to read output [07:32:01] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2338174.000000 [07:32:23] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [07:32:23] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:32:23] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:32:23] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:32:23] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:32:24] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:32:24] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:32:25] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:32:25] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:32:26] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:32:26] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:32:31] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:32:52] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:32:52] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[07:32:52] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:32:52] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1447686.000000 [07:33:01] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 501514.000000 [07:33:02] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [07:33:10] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [07:33:23] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:33:23] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 679109.000000 [07:33:23] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:34:22] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:34:22] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [07:34:51] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [07:35:23] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:35:23] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:35:23] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:35:23] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:35:42] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [07:36:02] / on thyme is UNKNOWN: NRPE: Unable to read output [07:37:03] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1329447.000000 [07:37:03] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[07:37:22] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [07:38:11] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [07:38:23] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [07:38:23] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [07:38:52] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [07:39:23] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 50881 [07:41:02] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [07:41:43] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 79995.000000 [07:42:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [07:42:22] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [07:45:43] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 80267.000000 [07:46:11] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 285219 MB (5% inode=64%): [07:47:51] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [07:47:52] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [07:51:42] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [07:56:52] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 62223 MB (10% inode=99%): [08:21:22] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:28:32] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:29:52] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:29:52] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[08:29:52] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1170208.000000 [08:29:52] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 357122.000000 [08:30:02] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [08:30:02] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [08:30:02] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [08:30:32] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:30:32] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:30:33] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:30:33] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:30:51] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:31:22] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:31:32] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:31:32] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:31:32] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:31:32] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[08:31:32] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:31:52] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:31:52] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:31:52] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:31:52] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:31:52] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:31:53] RAID on thyme is UNKNOWN: NRPE: Unable to read output [08:31:53] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:31:54] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:32:22] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:32:31] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:32:31] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:32:31] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:32:31] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:32:31] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:32:32] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[08:32:32] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:32:41] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [08:32:51] SSH on mayapple is CRITICAL: Server answer: [08:32:52] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:32:52] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:32:52] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:33:02] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 501099.000000 [08:33:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2341474.000000 [08:33:02] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1451216.000000 [08:33:02] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [08:33:12] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [08:33:22] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [08:33:22] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:33:22] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:33:22] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:33:22] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[08:33:32] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:34:21] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 679401.000000 [08:34:21] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:34:34] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:34:34] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [08:34:52] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:35:42] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [08:36:01] / on thyme is UNKNOWN: NRPE: Unable to read output [08:36:21] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:36:22] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:36:22] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:36:22] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:37:12] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [08:38:01] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1329405.000000 [08:38:12] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [08:38:22] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [08:39:02] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [08:39:22] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [08:39:22] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output
[08:40:22] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 50489 [08:41:01] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [08:41:43] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 82527.000000 [08:43:01] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [08:43:22] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [08:45:42] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 82921.000000 [08:46:12] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 285027 MB (5% inode=64%): [08:47:12] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.006 second response time [08:47:51] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [08:47:51] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [08:51:42] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [08:54:12] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 83426 [08:56:53] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 61985 MB (10% inode=99%): [09:06:26] @replag [09:06:30] Pyfisch: s1-rr-a: 14h 1m 10s [-0.79 s/s]; s1-rr-a-wd: 1w 4d 15h 21m 25s [+0.48 s/s]; s1-user-c: 1w 6d 13h 40m 9s [+1.00 s/s]; s1-user-wd: 2w 2d 19h 39m 57s [+0.87 s/s]; s2-user-c: error; s2-user-wd: 2w 1d 9h 4m 45s [+0.42 s/s]; s3-user: 27s [-0.00 s/s]; s3-user-wd: 1w 1d 36m 41s [-0.62 s/s] [09:06:31] Pyfisch: s4-user-wd: 3w 6d 2h 56m 42s [+0.94 s/s]; s5-rr-a: 23h 8m 41s [+0.58 s/s]; s5-rr-a-wd: 4d 3h 2m 53s [+0.14 s/s]; s5-user: 23h 15m 1s [+0.59 s/s]; s6-user-wd: 1w 20h 49m 13s [-0.43 s/s]; s7-user: 26s [-]; s7-user-wd: 5d 19h 12m 28s [-0.12 s/s]
[09:09:22] MySQL slave on z-dat-s5-b is OK: Uptime: 89909 Threads: 6 Questions: 135576903 Slow queries: 1054 Opens: 37078 Flush tables: 1 Open tables: 256 Queries per second avg: 1507.934 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [09:27:21] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 84314 [09:28:32] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:29:22] MySQL slave on z-dat-s5-b is OK: Uptime: 91109 Threads: 11 Questions: 215781271 Slow queries: 1099 Opens: 37953 Flush tables: 1 Open tables: 252 Queries per second avg: 2368.385 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 189 [09:29:52] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:52] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:30:01] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1173814.000000 [09:30:02] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [09:30:02] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [09:30:02] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [09:30:31] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:30:31] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:30:51] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[09:30:52] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 355854.000000 [09:31:22] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:31:31] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:31:31] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:31:31] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:31:32] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:31:52] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:31:52] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:31:52] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:32:31] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:32:32] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:32:32] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:32:33] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:32:33] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[09:32:33] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:32:33] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:32:33] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:32:33] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:32:34] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:32:34] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:32:41] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [09:32:52] SSH on mayapple is CRITICAL: Server answer: [09:32:52] RAID on thyme is UNKNOWN: NRPE: Unable to read output [09:32:52] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:32:52] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:32:52] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:32:52] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:32:53] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:32:53] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:32:54] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:33:02] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 500932.000000 [09:33:02] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1454744.000000 [09:33:02] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2344895.000000 [09:33:02] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [09:33:21] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [09:33:21] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:33:22] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:33:22] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:33:22] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:34:11] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [09:34:26] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 679570.000000 [09:34:26] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:34:32] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:34:52] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:35:32] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:35:32] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[09:36:02] / on thyme is UNKNOWN: NRPE: Unable to read output [09:36:22] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:36:22] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:36:23] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:36:23] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:36:42] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [09:38:02] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1327276.000000 [09:38:02] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [09:38:22] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [09:39:01] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [09:39:10] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [09:39:21] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [09:39:22] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output
[09:40:22] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 50379 [09:41:01] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [09:42:41] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 84836.000000 [09:43:02] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [09:43:03] Finne Boonen * [Toolserver-l] Amsterdam hackathon workshops [09:43:22] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [09:46:22] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 284830 MB (5% inode=64%): [09:46:42] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 85307.000000 [09:47:51] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [09:47:52] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [09:51:11] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:51:43] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [09:53:05] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [09:53:06] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [09:53:06] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:53:07] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt.
[09:53:07] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:53:08] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [09:53:08] / on thyme is UNKNOWN: NRPE: Unable to read output [09:53:14] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [09:53:14] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [09:53:14] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [09:53:14] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:53:14] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:53:15] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:53:15] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:53:16] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:53:16] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:53:17] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [09:53:17] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:56:53] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 61721 MB (10% inode=99%): [09:57:54] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:58:14] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 85892 [10:01:44] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.010 second response time [10:03:14] MySQL slave on z-dat-s5-b is OK: Uptime: 93148 Threads: 10 Questions: 216545217 Slow queries: 1164 Opens: 40349 Flush tables: 1 Open tables: 256 Queries per second avg: 2324.743 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1471 [10:09:22] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:46:14] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.011 second response time [10:52:14] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [10:52:14] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [10:52:14] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [10:52:14] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [10:52:14] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 284397 MB (5% inode=64%): [10:52:14] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [10:52:14] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:52:15] FC 0/2 [hemlock] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:52:15] FC 0/10 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[10:52:24] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [10:52:24] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [10:52:24] FC 0/11 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:52:24] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 49836 [10:52:24] FC 0/9 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:52:24] FC 0/20 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:52:24] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [10:52:33] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:52:33] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:52:43] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:52:43] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [10:52:43] FC 0/23 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:52:43] eth0 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:52:43] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[10:52:43] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:52:44] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:52:44] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1178782.000000 [10:52:45] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:52:54] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 87578.000000 [10:52:54] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 499714.000000 [10:52:54] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1327072.000000 [10:52:54] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 678199.000000 [10:52:54] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 87569.000000 [10:52:55] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:52:55] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:52:56] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:52:56] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:52:57] SSH on mayapple is CRITICAL: Server answer: [10:52:57] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:52:58] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:52:58] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:52:59] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:53:14] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). 
[10:53:14] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [10:53:14] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:53:14] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:53:14] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [10:53:15] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:53:15] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [10:53:16] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:53:16] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:53:17] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:53:17] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:53:18] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:53:33] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [10:53:33] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [10:53:34] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2349466.000000 [10:53:34] FC 0/21 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[10:53:34] FC 0/12 [thyme] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:53:34] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:53:34] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [10:53:35] FC 0/13 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:53:35] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [10:53:36] FC 0/22 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:53:36] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [10:53:43] FC 0/14 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [10:54:03] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[10:56:54] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 61488 MB (10% inode=99%): [11:06:23] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:35:53] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:38:39] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1328069.000000 [11:38:39] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 89335.000000 [11:38:39] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 677279.000000 [11:38:39] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 500052.000000 [11:38:39] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 89329.000000 [11:38:40] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:38:40] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:38:41] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:38:41] SSH on mayapple is CRITICAL: Server answer: [11:38:42] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1460997.000000 [11:38:42] RAID on thyme is UNKNOWN: NRPE: Unable to read output [11:38:43] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:38:43] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[11:38:44] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [11:38:54] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [11:38:55] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [11:38:56] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [11:38:56] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:38:56] FC 0/1 [far1-n1-oe16-esams B1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:38:56] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:44:04] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:44:35] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:47:24] toolserver.org HTTP on ortelius is WARNING: HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.580 second response time [11:49:24] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.006 second response time [11:59:21] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [11:59:21] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:59:21] FC 0/6 [rosemary] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. 
[11:59:21] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:59:21] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [11:59:22] SSH on mayapple is CRITICAL: Server answer: [11:59:22] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 61231 MB (10% inode=99%): [11:59:23] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [11:59:23] FC 0/18 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:59:25] FC 0/7 [daphne] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:59:25] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [11:59:26] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:59:26] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [11:59:26] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:59:26] FC 0/8 [adenia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host fsw1-n1-oe16-esams.mgmt. [11:59:45] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:59:55] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:59:55] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:59:55] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[12:04:55] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:25:46] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.018 second response time [12:56:35] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [12:56:36] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [12:58:25] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:58:36] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 501894.000000 [12:58:36] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [12:58:36] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 349918.000000 [12:58:36] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 678006.000000 [12:58:45] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:58:45] RAID on thyme is UNKNOWN: NRPE: Unable to read output [12:58:46] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 48224 [12:58:46] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 92104.000000 [12:58:46] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [12:58:46] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [12:58:55] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2356330.000000
[12:58:56] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:58:56] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:58:56] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:58:56] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:58:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [12:58:56] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [12:59:06] / on thyme is UNKNOWN: NRPE: Unable to read output [12:59:06] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [12:59:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1327624.000000 [12:59:06] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [12:59:06] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:59:06] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:59:07] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:59:07] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:59:08] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [12:59:16] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [12:59:16] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:59:16] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [12:59:16] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:59:16] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:59:17] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:59:17] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [12:59:18] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[12:59:18] SSH on mayapple is CRITICAL: Server answer: [12:59:19] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [12:59:19] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [12:59:20] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [12:59:26] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [12:59:26] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [12:59:26] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 92102.000000 [12:59:26] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [12:59:36] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:59:36] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [12:59:36] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:59:45] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:59:55] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:59:55] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:59:55] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[13:03:16] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 60964 MB (9% inode=99%): [13:03:25] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 283654 MB (5% inode=64%): [13:03:45] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1186642.000000 [13:03:55] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1464999.000000 [13:04:16] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 92359 [13:10:16] MySQL slave on z-dat-s5-b is OK: Uptime: 104370 Threads: 11 Questions: 219740867 Slow queries: 1214 Opens: 41095 Flush tables: 1 Open tables: 256 Queries per second avg: 2105.402 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 21 [13:18:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [13:25:45] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [13:29:05] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2160.000000 [13:43:00] Merlissimo: Did you free inodes on yarrow's /var? Now there's 96 % free. [13:44:05] s4 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1433.000000 [13:56:36] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [13:58:36] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 503067.000000 [13:58:45] RAID on thyme is UNKNOWN: NRPE: Unable to read output [13:58:45] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 46291 [13:58:45] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 94255.000000 [13:58:45] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[13:58:46] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [13:58:46] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [13:58:55] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2359565.000000 [13:59:05] / on thyme is UNKNOWN: NRPE: Unable to read output [13:59:05] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [13:59:06] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 351690.000000 [13:59:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1328393.000000 [13:59:06] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [13:59:06] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:59:06] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:59:07] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:59:07] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [13:59:16] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [13:59:16] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [13:59:16] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:59:16] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:59:16] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:59:16] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:59:16] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [13:59:17] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:59:17] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [13:59:18] SSH on mayapple is CRITICAL: Server answer: [13:59:18] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [13:59:19] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [13:59:26] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:59:26] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [13:59:26] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [13:59:26] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 94267.000000 [13:59:36] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [13:59:36] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:59:36] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [13:59:36] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [13:59:36] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 678203.000000 [13:59:36] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:59:55] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:59:55] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:59:55] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:59:55] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[13:59:55] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [13:59:56] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:59:56] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:59:57] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:59:57] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [14:00:45] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:03:16] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 60667 MB (9% inode=99%): [14:03:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 283328 MB (5% inode=64%): [14:03:45] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1190242.000000 [14:04:56] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1468617.000000 [14:18:35] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [14:25:16] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 95269 [14:25:46] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [14:36:16] MySQL slave on z-dat-s5-b is OK: Uptime: 109530 Threads: 6 Questions: 221121832 Slow queries: 1275 Opens: 41808 Flush tables: 1 Open tables: 256 Queries per second avg: 2018.824 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [14:41:05] Tim Landscheidt * Re: [Toolserver-l] Inodes have run out on yarrow's /var [14:56:36] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [14:58:36] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 503157.000000 [14:58:56] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[14:58:56] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2362867.000000 [14:59:06] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [14:59:06] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [14:59:06] / on thyme is UNKNOWN: NRPE: Unable to read output [14:59:06] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 349607.000000 [14:59:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1329941.000000 [14:59:06] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [14:59:06] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:59:07] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:59:07] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [14:59:26] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [14:59:26] SSH on mayapple is CRITICAL: Server answer: [14:59:26] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [14:59:27] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [14:59:27] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:59:27] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:59:27] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:59:28] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:59:28] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [14:59:29] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). 
[14:59:29] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 96552.000000 [14:59:36] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [14:59:36] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [14:59:36] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [14:59:36] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:59:36] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:59:37] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 678151.000000 [14:59:44] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:59:44] RAID on thyme is UNKNOWN: NRPE: Unable to read output [14:59:44] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 43784 [14:59:45] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 96582.000000 [14:59:45] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [14:59:56] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [14:59:56] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [14:59:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [14:59:56] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[14:59:56] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:59:57] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [15:00:05] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:00:07] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:00:07] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:00:07] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:00:07] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:00:14] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:00:14] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:00:14] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [15:00:56] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:03:26] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 60390 MB (9% inode=99%): [15:03:36] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 282910 MB (5% inode=64%): [15:03:57] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1193847.000000 [15:04:56] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1472079.000000 [15:16:45] Sun Grid Engine execd on nightshade is CRITICAL: CRITICAL: execd not communicating [15:18:37] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [15:25:55] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [15:40:29] Nemo_bis: https://jira.toolserver.org/browse/TS-1648. I don't think that nscd is causing nightshade to overload. 
[15:47:55] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1872.000000 [15:48:15] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1886 [15:50:00] scfc_de: thanks, why not? [15:51:05] right now it's better and nscd is still 100 % [15:51:15] Nemo_bis: Because at the moment, nightshade's load is low (3.69), but nscd (and two other scripts) are still consuming 100 % of CPU. Nightshade seems to be rather fast ATM. [15:51:45] but why is it constantly 100 %? [15:52:41] the situation I described in the ticket happens very frequently [15:52:53] I have no idea, but Alchimista's and Dispenser's scripts are as well, but overall CPU is idle 60 %. [15:53:21] (Looking at "top" now.) [15:53:50] I'm not sure if that number is actually correct, unless I got hacked [15:54:44] scfc_de: sorry, i didn't catch the conversation, what's up with the scripts? [15:54:49] i'm out of context [15:54:54] Dispenser: Is checklinks.py checking external links? [15:55:09] yes [15:55:16] Alchimista: Probably nothing, "top" just says they're consuming 100 % CPU. [15:56:13] Dispenser: That would probably explain why nscd is high as well as it has a lot to do, but I would never imagine that something could push it to 100 %. [15:56:45] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [15:57:15] (Unless you're checking thousands of links per second :-).) [15:58:19] Kill, they all have crash tolerance anyway [15:58:35] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 503067.000000 [15:59:02] I wouldn't know why. Overall, load is okay and CPU is still idling most of the time. [15:59:05] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[15:59:06] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2366240.000000 [15:59:07] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [15:59:07] / on thyme is UNKNOWN: NRPE: Unable to read output [15:59:07] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 347619.000000 [15:59:07] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1329728.000000 [15:59:07] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [15:59:07] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [15:59:15] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [15:59:16] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:59:16] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:59:26] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [15:59:26] SSH on mayapple is CRITICAL: Server answer: [15:59:26] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [15:59:36] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [15:59:36] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [15:59:36] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:59:36] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:59:36] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:59:37] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[15:59:37] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [15:59:45] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:59:45] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 678655.000000 [15:59:45] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:59:55] RAID on thyme is UNKNOWN: NRPE: Unable to read output [15:59:55] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 41670 [15:59:55] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 98936.000000 [15:59:55] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [15:59:55] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:59:56] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [15:59:56] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [16:00:05] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [16:00:05] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [16:00:16] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:00:16] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:00:16] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:00:16] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[16:00:16] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:00:17] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:00:17] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:00:18] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [16:00:26] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 98970 [16:00:26] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:00:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:00:26] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 98874.000000 [16:00:26] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [16:00:27] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [16:01:12] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:02:03] scfc_de: on night? i've got an idea which script is causing it. a silly question, which flag allows top to show the command script? 
[16:03:26] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 60141 MB (9% inode=99%): [16:04:36] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 282493 MB (5% inode=64%): [16:04:50] $ ps -fu$LOGNAME [16:04:56] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1197511.000000 [16:05:06] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1475631.000000 [16:16:55] Sun Grid Engine execd on nightshade is CRITICAL: CRITICAL: execd not communicating [16:17:18] [[Special:Log/newusers]] create * Darrel73 * (New user account) [16:18:42] sf [16:18:44] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [16:19:01] scfc_de: that's the case right now, but it often has very high wait cpu [16:19:08] of course munin being broken doesn't help :( [16:20:47] Nemo_bis: If Munin was the *only* thing that was broken ... :-) [16:21:32] it helps make the rest more broken gradually :) [16:21:53] Nemo_bis: I'm not denying that sometimes the load is very high, I just don't think it's connected to nscd. [16:22:52] scfc_de: but is it normal that nscd is so high, anyway? [16:23:25] I know nothing but I see people are often forced to kill/restart nscd when it behaves like this [16:25:55] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [16:26:24] Nemo_bis: It looks certainly very odd. But as DaBPunkt wants to upgrade the Debian hosts tomorrow (and no root's around anyway), I think it's best to ignore it until there's better evidence. [16:27:38] well, I can't do anything anyway :) [16:29:15] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [16:30:28] Nemo_bis: That makes us two :-). 
[16:34:13] scfc_de: does that warning about sge above mean that qsub works on hosts other than nightshade (where it seemed broken to me)? [16:35:31] Nemo_bis: Which warning? [16:35:41] Nemo_bis: Ah, shit, the nightshade queues are down as well. [16:36:48] Now all Linux queues are in "au". Merlissimo to the rescue, please. [16:38:46] scfc_de: If you're looking into our performance issues I'd suggest figuring out what changed in the MySQL upgrades. Since Feb, INSERTs are up and SELECTs are down, and EXPLAIN doesn't work anymore. [16:39:55] /tmp on willow is WARNING: DISK WARNING - free space: / 20348 MB (19% inode=98%): [16:40:55] / on willow is WARNING: DISK WARNING - free space: / 20345 MB (19% inode=98%): [16:44:26] /sql on z-dat-s1-b is WARNING: DISK WARNING - free space: /sql 81453 MB (8% inode=99%): [16:44:55] Dispenser: I have seen your link (and wondered as well), but you have to ask DaBPunkt about that, I don't know anything regarding that. EXPLAIN has been disabled at least since last year (https://jira.toolserver.org/browse/TS-1585), but the fix that fale found has not been applied yet. 
[16:46:16] toolserver.org HTTP on ortelius is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 4.285 second response time [16:48:16] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3298 [16:48:26] MySQL slave on z-dat-s5-b is OK: Uptime: 117458 Threads: 6 Questions: 223195026 Slow queries: 1358 Opens: 42646 Flush tables: 1 Open tables: 255 Queries per second avg: 1900.211 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 711 [16:48:55] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3308.000000 [16:49:06] toolserver.org HTTP on ortelius is WARNING: HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.817 second response time [16:50:15] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:55:26] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 101457 [16:56:26] MySQL slave on z-dat-s5-b is OK: Uptime: 117938 Threads: 4 Questions: 223463610 Slow queries: 1362 Opens: 42657 Flush tables: 1 Open tables: 255 Queries per second avg: 1894.754 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [16:56:45] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [16:58:35] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 502754.000000 [16:59:05] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[16:59:05] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2369653.000000 [16:59:06] / on thyme is UNKNOWN: NRPE: Unable to read output [16:59:06] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [16:59:06] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 346223.000000 [16:59:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1330427.000000 [16:59:15] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [16:59:16] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:59:16] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:59:25] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [16:59:26] SSH on mayapple is CRITICAL: Server answer: [16:59:26] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [16:59:45] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [16:59:45] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 678700.000000 [16:59:45] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:59:45] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[16:59:55] RAID on thyme is UNKNOWN: NRPE: Unable to read output [16:59:55] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 39127 [16:59:55] / on willow is OK: DISK OK - free space: / 30722 MB (29% inode=99%): [16:59:56] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 101663.000000 [16:59:56] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [16:59:56] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:59:56] /tmp on willow is OK: DISK OK - free space: / 30722 MB (29% inode=99%): [16:59:57] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [16:59:57] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [17:00:05] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [17:00:05] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [17:00:06] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [17:00:06] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [17:00:16] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:00:16] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:00:16] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:00:16] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[17:00:16] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:00:17] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:00:17] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:00:18] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [17:00:26] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:00:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:00:26] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 101688.000000 [17:00:26] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [17:00:35] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [17:00:35] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [17:00:35] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [17:00:35] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:00:35] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:00:36] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:00:36] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:01:06] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[17:03:26] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 59876 MB (9% inode=99%): [17:05:06] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1479155.000000 [17:05:35] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 279852 MB (5% inode=64%): [17:05:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3603.000000 [17:05:55] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1201171.000000 [17:06:16] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3614 [17:16:56] Sun Grid Engine execd on nightshade is CRITICAL: CRITICAL: execd not communicating [17:18:46] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [17:25:55] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [17:29:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [17:31:48] @replag [17:31:51] russblau: s1-rr-a: 10h 20m 14s [-0.44 s/s]; s1-rr-a-wd: 1w 4d 21h 5m 19s [+0.68 s/s]; s1-user: 1h 6m 9s [-0.02 s/s]; s1-user-c: 1w 6d 22h 5m 31s [+1.00 s/s]; s1-user-wd: 2w 3d 3h 18m 48s [+0.91 s/s]; s2-rr: 33s [-0.51 s/s]; s2-user: 33s [-0.51 s/s]; s2-user-c: error [17:31:52] russblau: s2-user-wd: 2w 1d 9h 36m 39s [+0.06 s/s]; s3-user: 13s [-0.00 s/s]; s3-user-wd: 1w 1d 2m 54s [-0.07 s/s]; s4-user-wd: 3w 6d 10h 44m 47s [+0.93 s/s]; s5-rr-a: 1d 4h 32m 11s [+0.64 s/s]; s5-rr-a-wd: 4d 4m 50s [-0.35 s/s]; s5-user: 1d 4h 32m 11s [+0.63 s/s]; s6-user-wd: 1w 20h 41m 25s [-0.02 s/s] [17:31:53] russblau: s7-user-wd: 5d 19h 45m 8s [+0.06 s/s] [17:35:55] / on wolfsbane is WARNING: DISK WARNING - free space: / 6214 MB (20% inode=93%): [17:36:05] /tmp on wolfsbane is WARNING: DISK WARNING - free space: / 6220 MB (20% inode=93%): 
[17:56:45] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [17:57:49] [[Job scheduling]] ! https://wiki.toolserver.org/w/index.php?diff=8004&oldid=7721&rcid=21904 * Tim.Landscheidt * (+136) (Add link to http://toolserver.org/~timl/grid-status.php.) [17:59:06] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2373015.000000 [17:59:06] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:59:06] / on thyme is UNKNOWN: NRPE: Unable to read output [17:59:06] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [17:59:06] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 345724.000000 [17:59:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1331113.000000 [17:59:16] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [17:59:16] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:59:16] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:59:26] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [17:59:26] SSH on mayapple is CRITICAL: Server answer: [17:59:26] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [17:59:35] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 503419.000000 [17:59:45] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [17:59:45] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 679851.000000 [17:59:45] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:59:45] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[17:59:55] RAID on thyme is UNKNOWN: NRPE: Unable to read output [17:59:55] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 33978 [17:59:55] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 103642.000000 [17:59:55] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [17:59:55] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:59:56] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [17:59:56] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [18:00:05] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [18:00:05] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [18:00:05] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [18:00:05] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [18:00:16] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:00:16] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:00:16] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:00:16] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:00:16] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[18:00:17] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [18:00:17] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:00:18] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:00:25] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:00:25] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:00:25] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 103659.000000 [18:00:35] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [18:00:35] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [18:00:36] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [18:00:36] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [18:00:45] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:00:46] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:00:46] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:00:46] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:01:06] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[18:03:26] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 59564 MB (9% inode=99%): [18:04:06] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:05:06] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1482648.000000 [18:05:35] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 279381 MB (5% inode=64%): [18:05:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 4223.000000 [18:05:56] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1204771.000000 [18:06:16] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 4214 [18:16:55] Sun Grid Engine execd on nightshade is CRITICAL: CRITICAL: execd not communicating [18:17:05] Marlen Caemmerer * Re: [Toolserver-l] Inodes have run out on yarrow's /var (fwd) [18:18:05] / on ortelius is WARNING: DISK WARNING - free space: / 6249 MB (20% inode=92%): [18:18:45] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [18:25:55] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [18:27:55] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok [18:29:06] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok [18:29:17] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [18:31:56] hello [18:32:02] maintenance tonight :D [18:32:29] going to install patches on solaris, reboot may be required afterwards - ill let you know [18:35:56] / on wolfsbane is WARNING: DISK WARNING - free space: / 5702 MB (19% inode=93%): [18:36:07] /tmp on wolfsbane is WARNING: DISK WARNING - free space: / 5720 MB (19% inode=93%): [18:38:02] nosy: Re yarrow, I noticed you didn't reboot it after cleaning up /var. Should we? 
[18:38:42] scfc_de: well not sure - id rather not do it since some people might have to restart jobs [18:39:08] is there anything broken? [18:40:10] Usually, when /var fills up, funny things can happen, so I haven't checked that every service is still up. [18:40:45] Also, the SGE queues for the Linux servers are down. Could you look into that as well? [18:41:17] scfc_de: yes i can i just wanted to start the updates - this takes a while anyway [18:41:24] nosy: np [18:41:32] ill go look [18:56:45] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [18:59:06] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:59:07] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2376453.000000 [18:59:08] / on thyme is UNKNOWN: NRPE: Unable to read output [18:59:08] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [18:59:08] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1332684.000000 [18:59:08] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 346826.000000 [18:59:16] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [18:59:17] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:59:17] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds 
[18:59:26] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [18:59:26] SSH on mayapple is CRITICAL: Server answer: [18:59:26] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [18:59:35] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 503939.000000 [18:59:45] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [18:59:45] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 680873.000000 [18:59:45] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:59:45] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:59:55] RAID on thyme is UNKNOWN: NRPE: Unable to read output [18:59:55] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 30907 [18:59:55] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 105778.000000 [18:59:56] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [18:59:56] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:59:56] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [18:59:56] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [19:00:05] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[19:00:05] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [19:00:06] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [19:00:16] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:00:16] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:00:16] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:00:16] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:00:16] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:00:17] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [19:00:17] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:00:25] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 105777.000000 [19:00:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:00:26] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:00:34] scfc_de: ok, queues enabled again [19:00:45] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:00:46] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:00:46] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:00:46] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:01:06] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:01:26] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:01:35] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [19:01:35] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [19:01:35] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[19:01:36] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [19:02:15] nosy: Thanks. What was the problem? [19:03:06] SMTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:03:09] scfc_de: an old job was stuck in error state due to the failure at yarrow - it was submitted to nightshade but did not run either (dont know why) kicking the job did it afais [19:03:11] Is it possible the switch isn't functioning right? All the problems seem networking related. [19:03:26] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 59325 MB (9% inode=99%): [19:04:08] nosy: And the SGE logs on nightshade show nothing? :-( [19:05:06] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1486064.000000 [19:05:06] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:05:27] nosy: DaBPunkt searched for the hardware tool to reboot mayapple some time ago. Do you know what's mayapple's status? [19:05:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 4370.000000 [19:05:55] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1208371.000000 [19:05:58] scfc_de: afaik its not really online, no console access [19:06:12] so we have to wait for someone to be in the data center [19:06:15] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 4379 [19:06:18] logs say [19:06:22] May 7 18:45:47 nightshade sge_execd: nss_ldap: could not search LDAP server - Server is unavailable [19:06:26] thats all i find [19:06:35] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 279109 MB (5% inode=64%): [19:06:55] probably the daemon does not work afterwards...hmm...ill try to check [19:07:05] So the SGE errors are just a consequence of yet another LDAP failure. [19:07:22] We really have no way to reboot remote machines?! 
[19:07:35] SMTP on willow is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:07:58] not mayapple [19:08:04] the rest does [19:08:17] but this was intended as a test host [19:08:25] forgot why it has no management [19:09:17] Ah, okay, that sounds more sane :-). [19:12:06] / on ortelius is OK: DISK OK - free space: /var/run/.patchSafeMode/root 7178 MB (23% inode=92%): [19:13:26] SMTP on willow is OK: SMTP OK - 0.003 sec. response time [19:14:35] SMTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:18:45] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [19:25:55] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [19:27:15] SMTP on turnera is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:29:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [19:29:26] SMTP on damiana is OK: SMTP OK - 0.002 sec. response time [19:30:45] Load avg. on ptolemy is WARNING: WARNING - load average: 19.51, 15.77, 10.18 [19:33:45] Load avg. on ptolemy is OK: OK - load average: 13.12, 14.86, 10.85 [19:35:54] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:36:55] / on wolfsbane is WARNING: DISK WARNING - free space: /var/run/.patchSafeMode/root 4880 MB (16% inode=93%): [19:37:05] /tmp on wolfsbane is WARNING: DISK WARNING - free space: / 4851 MB (16% inode=93%): [19:40:35] SMTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:43:35] SMTP on willow is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:49:36] rebooting damiana [19:49:50] ortelius and willow will follow soon [19:51:48] nosy: damiana = one of the "head nodes"/LDAP servers? As they seem to cause most of the trouble, do we have some public log how often they reboot? 
[19:52:10] yes one ldap [19:52:26] but currently turnera is running ldap [19:52:42] hm...public log...either munin or nagios [19:53:56] ok will reboot ortelius now [19:54:04] willow follows afterwards [19:54:33] And "white" on http://munin.toolserver.org/Miscellaneous/damiana/index.html#system means rebooted? [19:54:52] nosy: how soon? [19:54:58] (willow) [19:55:10] Sun Grid Engine execd on ortelius is CRITICAL: Connection refused by host [19:55:34] (Looks rather as if that shows Munin failures. *Argl*.) [19:55:39] Danny_B: a few minutes - when ortelius is back [19:55:44] scfc_de: right failures [19:56:01] but when the graph is small again it was a reboot [19:56:34] ah, need to save my stuff [19:56:42] gimme couple mins [19:56:53] np [19:57:01] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [19:58:04] But neither damiana nor turnera are scheduled to reboot, say, every day at x o'clock? [19:59:10] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2379623.000000 [19:59:11] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [19:59:11] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1333433.000000 [19:59:11] / on thyme is UNKNOWN: NRPE: Unable to read output [19:59:11] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[19:59:11] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 347637.000000 [19:59:21] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [19:59:21] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:59:22] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:59:31] SSH on mayapple is CRITICAL: Server answer: [19:59:31] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [19:59:50] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:59:50] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:00:01] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 681536.000000 [20:00:02] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 504648.000000 [20:00:02] RAID on thyme is UNKNOWN: NRPE: Unable to read output [20:00:02] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 28245 [20:00:02] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 108094.000000 [20:00:02] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [20:00:03] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [20:00:03] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [20:00:04] SMTP on turnera is OK: SMTP OK - 0.010 sec. 
response time [20:00:11] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:00:11] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [20:00:11] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [20:00:21] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [20:00:21] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:00:21] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [20:00:30] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:00:30] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:00:31] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 108114.000000 [20:00:32] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:00:32] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:00:32] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:00:32] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:00:32] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:00:39] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [20:01:11] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:01:24] ok ortelius is back [20:01:31] Danny_B: ready? [20:01:39] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [20:01:40] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:01:40] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:01:40] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:01:40] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[20:02:22] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:02:31] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [20:02:31] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [20:02:31] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [20:02:56] nosy: But neither damiana nor turnera are scheduled to reboot, say, every day at x o'clock? [20:03:07] nope [20:03:26] dab only has a "memory exhausted" cron job that reboots [20:03:31] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 59064 MB (9% inode=99%): [20:03:34] its a bug in the cluster software [20:03:49] but we d need to buy it to fix the bug [20:04:07] was fine if ts was migrated by oct 2012 [20:04:33] nosy: go ahead [20:04:46] nosy: cluster software = load balancer/high-availability stuff? [20:04:51] ok then ill do so [20:05:10] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1489616.000000 [20:05:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [20:05:10] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:05:11] re [20:05:45] nosy: Cluster software = load balancer/high-availability stuff? [20:06:02] yes its called solaris cluster afaik [20:06:02] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1211978.000000 [20:06:02] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 4995.000000 [20:06:21] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 4997 [20:06:38] Do you know what's the price tag for it? [20:07:02] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 278831 MB (5% inode=64%): [20:07:40] 1300 dont know if for one node per year or all together [20:08:04] Okay, thanks. 
[20:09:17] probably makes more sense to try to rebuild it on a linux - so i hope wmde will have a second ts admin soon [20:10:08] nosy: Me, too :-). This needs better documentation so that we can understand what's needed for what, and what could be replaced by other software, and where we need to spend money (like management access to mayapple). [20:11:14] ok willow is back [20:13:02] ok waiting for wolfsbane to finish, going to reboot it afterwards and then turnera [20:15:01] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [20:18:30] scfc_de: you think anyone would like to read this? ;) [20:19:43] might make sense...ill talk to silke if there are any concerns [20:20:11] [[Labs-Moving-Survey]] ! 10https://wiki.toolserver.org/w/index.php?diff=8005&oldid=7993&rcid=21905 * 86.128.158.37 * (+211) (/* …move when possible, i.e. when Labs gets equivalent or better */ ) [20:20:49] nosy: *Like* to read probably not :-), but I don't see many people who are satisfied with the current situation. [20:21:13] scfc_de: ok...but what would the list change then? ;) [20:21:37] might be helpful if someone wants to do something [20:23:28] nosy: Well, without a list, there'll be no progress whatsoever. Somewhere we got to start. If we have a "list", we can ask WMDE for stuff. If we're just moaning, noone will listen. [20:26:01] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [20:27:21] wtf... [20:29:21] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [20:29:30] SMTP on damiana is OK: SMTP OK - 0.002 sec. response time [20:31:15] Well, at least wolfsbane is still responding :-). [20:31:21] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:32:47] (Uh, after a minute or so, ortelius responds as well.) 
[20:33:46] scfc_de: yes very strange [20:34:09] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.010 second response time [20:34:27] i thought it was tcp connections (too many in time_wait or close_wait) but decreasing the values for timeouts did not really do [20:34:47] and i dont see any real issues [20:34:57] might have to file a bug at zeus support... [20:35:15] will reboot wolfsbane [20:35:43] Now I get a 502 Bad Gateway at ortelius. [20:35:43] funny thing is: when i try to work at ts its quite well possible i have to fix things first... [20:36:11] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:36:14] Who breaks them? :-) [20:36:31] yeah me too [20:37:02] / on wolfsbane is WARNING: DISK WARNING - free space: /var/run/.patchSafeMode/root 4520 MB (15% inode=93%): [20:37:11] /tmp on wolfsbane is UNKNOWN: CHECK_NRPE: Error receiving data from daemon. [20:37:11] Sun Grid Engine execd on wolfsbane is UNKNOWN: CHECK_NRPE: Error receiving data from daemon. [20:37:52] oh it answers... [20:40:48] wolfsbane hasn't come up yet. [20:46:02] / on wolfsbane is WARNING: DISK WARNING - free space: / 6072 MB (20% inode=93%): [20:46:02] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [20:46:02] /tmp on wolfsbane is WARNING: DISK WARNING - free space: / 6072 MB (20% inode=93%): [20:47:02] root@wolfsbane:~# netstat -f inet|wc -l [20:47:02] 5103 [20:47:05] just FYI [20:50:59] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2103 [20:51:37] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:51:37] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:51:37] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:52:08] APT on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[20:52:09] / on wolfsbane is OK: DISK OK - free space: / 9991 MB (33% inode=93%): [20:57:40] SMTP on damiana is OK: SMTP OK - 0.161 sec. response time [20:57:40] SMTP on willow is OK: SMTP OK - 0.034 sec. response time [20:57:40] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [20:57:40] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [20:57:40] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 110159.000000 [20:57:40] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 278364 MB (5% inode=64%): [20:57:50] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [20:57:50] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [20:57:50] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [20:57:50] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:57:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [20:57:50] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:58:00] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:58:00] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:58:00] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:58:10] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:58:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:58:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:59:35] tried to switch services from damiana to turnera - everything switched now [20:59:46] web looks much better know, does it? [21:00:35] All responding well. (So I'm mostly concerned with LDAP and MySQL load balancing.) 
[21:02:49] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2684 [21:03:30] Load avg. on nightshade is WARNING: WARNING - load average: 11.03, 18.51, 16.18 [21:05:07] scfc_de: ts never sleeps, he? ;) [21:05:30] Load avg. on nightshade is OK: OK - load average: 6.03, 14.16, 14.87 [21:07:39] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:07:39] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:08:07] Well, it sucks big time if after four hours you get "Lost connection to MySQL server during query". And the 404s the webserver produces when LDAP is down (not to speak of aliasd) don't feel cozy either. [21:08:20] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:08:20] APT on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:08:20] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:09:39] nosy: Now SGE seems to be down. qstat takes very long. [21:12:31] nosy: "NFS server ha-nfs.esi not responding still trying" [21:13:30] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:13:40] /var/tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:13:50] Sun Grid Engine execd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:13:50] Environment IPMI on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:13:50] aliasd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:13:58] turnera off too? [21:14:00] Load avg. on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:00] Environment IPMI on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:02] damiana too? [21:14:08] nagios working? 
[21:14:09] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:14:10] Load avg. on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:10] /home on hemlock is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:20] Sensors on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:20] / on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:29] /tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:29] Sun Grid Engine execd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:29] SRaid on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:30] aliasd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:30] /var on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:30] / on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:39] Sensors on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:39] /tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:49] /var on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:14:49] /var/tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:15:04] is there any i-hate-button? [21:15:59] ortelius and wolfsbane seem to be down as well. [21:17:49] / on nightshade is OK: DISK OK - free space: / 1593 MB (89% inode=94%): [21:17:49] Sensors on nightshade is OK: sensor ok [21:17:50] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [21:17:50] aliasd on yarrow is OK: TCP OK - 0.013 second response time on port 984 [500 Not found.] [21:17:50] Environment IPMI on yarrow is OK: ok: temperature ok fan ok voltage ok chassis ok [21:17:50] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). 
[21:17:59] /tmp on nightshade is OK: DISK OK - free space: /tmp 3488 MB (78% inode=99%): [21:17:59] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok [21:17:59] SRaid on yarrow is OK: OK md0 status=[UU]. [21:17:59] /var on nightshade is OK: DISK OK - free space: /var 9850 MB (73% inode=48%): [21:18:00] aliasd on nightshade is OK: TCP OK - 0.004 second response time on port 984 [500 Not found.] [21:18:00] / on yarrow is OK: DISK OK - free space: / 1582 MB (88% inode=94%): [21:18:00] /home on hemlock is OK: DISK OK - free space: /home 13483 MB (26% inode=81%): [21:18:09] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [21:18:09] /var/tmp on nightshade is OK: DISK OK - free space: /var/tmp 872 MB (98% inode=99%): [21:18:09] Sensors on yarrow is OK: sensor ok [21:18:09] /tmp on yarrow is OK: DISK OK - free space: /tmp 4085 MB (96% inode=99%): [21:18:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [21:18:10] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [21:18:19] /var on yarrow is OK: DISK OK - free space: /var 11652 MB (87% inode=96%): [21:18:19] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok [21:18:20] Environment IPMI on nightshade is OK: ok: temperature ok fan ok voltage ok chassis ok [21:18:20] /var/tmp on yarrow is OK: DISK OK - free space: /var/tmp 827 MB (97% inode=99%): [21:18:40] Load avg. on yarrow is WARNING: WARNING - load average: 14.89, 22.74, 15.51 [21:18:42] seems anything has the old ipv6 ranges in the dns now - dont know why [21:18:54] trying to bring nfs back first [21:19:00] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.011 second response time [21:21:07] waiting for turnera to boot [21:21:29] strange...dns seems ok again [21:21:39] Load avg. 
on yarrow is OK: OK - load average: 3.85, 14.02, 13.36 [21:23:39] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:24:20] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:24:20] APT on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:24:20] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:24:28] oh dear... [21:24:40] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:27:51] MySQL slave on z-dat-s2-b is OK: Uptime: 134282 Threads: 15 Questions: 124896295 Slow queries: 1630 Opens: 1271956 Flush tables: 1 Open tables: 256 Queries per second avg: 930.104 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [21:29:11] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:29:31] Sun Grid Engine execd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:29:31] SRaid on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:29:31] aliasd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:29:32] / on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:29:41] /tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:29:42] Sensors on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:29:51] Sun Grid Engine execd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:29:51] /var on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:29:52] /var/tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:29:52] aliasd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:30:01] Environment IPMI on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:30:16] hello all. How is the maintenance going? 
[21:30:20] / on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:30:21] Sensors on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:30:30] /tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:30:30] /var on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:30:41] /var/tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:30:52] Environment IPMI on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:31:24] Hey folks. I'm timing out when trying to SSH to the toolserver. [21:31:38] Is there maintenance happening right now? [21:32:01] NTP on turnera is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [21:32:19] halfak_: yes [21:32:20] / on turnera is CRITICAL: NRPE: Command check_root not defined [21:32:23] ok turnera is coming back [21:32:31] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [21:32:31] DiskSuite on turnera is CRITICAL: CRITICAL - submirror d11 of mirror d10 is Resyncing and submirror d12 of mirror d10 is Resyncing [21:32:32] lets see if anything recovery [21:32:42] meanwhile dns has wrong infos again [21:32:51] Free Memory on turnera is CRITICAL: NRPE: Command check_free_mem not defined [21:33:01] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2216.000000 [21:34:01] NTP on turnera is OK: NTP OK: Offset 0.006443 secs [21:34:34] DiskSuite on turnera is OK: OK - No disk failures detected [21:35:13] nfs is not available too [21:35:24] I see. I should have read my email. 
[21:36:52] SSH on willow is CRITICAL: Server answer: [21:39:40] Environment IPMI on nightshade is OK: ok: temperature ok fan ok voltage ok chassis ok [21:39:40] /var/tmp on yarrow is OK: DISK OK - free space: /var/tmp 827 MB (97% inode=99%): [21:39:40] /var on yarrow is OK: DISK OK - free space: /var 11653 MB (87% inode=96%): [21:39:40] aliasd on yarrow is OK: TCP OK - 0.009 second response time on port 984 [500 Not found.] [21:39:40] Environment IPMI on yarrow is OK: ok: temperature ok fan ok voltage ok chassis ok [21:39:40] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [21:39:50] SSH on willow is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [21:39:50] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [21:39:50] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [21:39:50] / on nightshade is OK: DISK OK - free space: / 1593 MB (89% inode=94%): [21:39:51] Sensors on nightshade is OK: sensor ok [21:40:04] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok [21:40:04] /var on nightshade is OK: DISK OK - free space: /var 9850 MB (73% inode=48%): [21:40:04] /tmp on nightshade is OK: DISK OK - free space: /tmp 3488 MB (78% inode=99%): [21:40:04] aliasd on nightshade is OK: TCP OK - 0.003 second response time on port 984 [500 Not found.] [21:40:04] SRaid on yarrow is OK: OK md0 status=[UU]. 
[21:40:04] / on yarrow is OK: DISK OK - free space: / 1582 MB (88% inode=94%): [21:40:10] /var/tmp on nightshade is OK: DISK OK - free space: /var/tmp 872 MB (98% inode=99%): [21:40:10] /tmp on yarrow is OK: DISK OK - free space: /tmp 4085 MB (96% inode=99%): [21:40:10] Sensors on yarrow is OK: sensor ok [21:40:10] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [21:40:10] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [21:40:20] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok [21:40:59] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.010 second response time [21:41:55] turnera is toggling the nic on/off [21:45:21] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:45:21] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:45:41] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:45:48] toggling stopped [21:46:02] cluster is electing the nfs master... [21:46:20] APT on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:46:41] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:47:11] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [21:50:25] [[Special:Log/newusers]] create 10 * Jorgen91 * (New user account) [21:51:10] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:51:30] Sun Grid Engine execd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:51:30] SRaid on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:51:31] aliasd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:51:31] / on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:51:40] /tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[21:51:41] Sensors on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:51:51] Sun Grid Engine execd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:51:51] /var on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:51:51] Environment IPMI on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:51:51] /var/tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:51:51] aliasd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:52:00] Environment IPMI on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:52:20] / on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:52:20] Sensors on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:52:30] /tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:52:30] /var on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:52:40] /var/tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:53:45] nfs had old statd entries [21:53:49] tried to renew them [21:53:56] lets see if that works [21:56:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 507947.000000 [21:56:51] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). 
[21:56:51] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 684703.000000 [21:57:01] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [21:57:01] RAID on thyme is UNKNOWN: NRPE: Unable to read output [21:57:01] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 24750 [21:57:01] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 113522.000000 [21:57:01] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc [21:57:01] s4 replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3624.000000 [21:57:02] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [21:57:02] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 5160.000000 [21:57:03] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1218638.000000 [21:57:03] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:57:11] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2385624.000000 [21:57:11] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1496176.000000 [21:57:11] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:57:11] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:57:11] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[21:57:11] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:57:20] / on thyme is UNKNOWN: NRPE: Unable to read output [21:57:20] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [21:57:21] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1339247.000000 [21:57:21] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:57:21] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:57:21] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:57:21] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:57:22] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [21:57:30] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 5165 [21:57:30] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [21:57:31] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:57:31] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:57:31] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [21:57:31] SSH on mayapple is CRITICAL: Server answer: [21:57:31] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:57:32] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:57:32] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:57:33] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [21:57:40] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [21:57:40] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). 
[21:57:41] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 278245 MB (5% inode=64%): [21:57:41] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [21:57:50] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [21:57:50] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [21:57:50] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:57:51] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [21:57:51] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:58:00] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:58:00] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:58:00] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3582.000000 [21:58:00] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:58:10] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:58:10] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:58:10] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:59:55] fucking statd...cluster...package...old...state files... with old ips [22:00:51] SSH on willow is CRITICAL: Server answer: [22:02:11] Load avg. on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:02:30] MySQL slave on daphne is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3302 [22:03:40] oh interesting...turnera also forgot we renumbered it [22:03:45] it has its old ips again [22:04:00] if we have bad luck the raid stuff there is broken [22:04:17] explains the dns problems too... 
[22:06:59] / on nightshade is OK: DISK OK - free space: / 1593 MB (89% inode=94%): [22:07:00] Sensors on nightshade is OK: sensor ok [22:07:00] Environment IPMI on yarrow is OK: ok: temperature ok fan ok voltage ok chassis ok [22:07:00] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [22:07:00] /var on nightshade is OK: DISK OK - free space: /var 9850 MB (73% inode=48%): [22:07:00] SRaid on yarrow is OK: OK md0 status=[UU]. [22:07:00] /tmp on nightshade is OK: DISK OK - free space: /tmp 3488 MB (78% inode=99%): [22:07:01] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok [22:07:01] aliasd on nightshade is OK: TCP OK - 0.013 second response time on port 984 [500 Not found.] [22:07:02] / on yarrow is OK: DISK OK - free space: / 1582 MB (88% inode=94%): [22:07:09] /var/tmp on nightshade is OK: DISK OK - free space: /var/tmp 872 MB (98% inode=99%): [22:07:10] /tmp on yarrow is OK: DISK OK - free space: /tmp 4085 MB (96% inode=99%): [22:07:10] Sensors on yarrow is OK: sensor ok [22:07:10] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [22:07:11] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [22:07:11] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [22:07:11] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [22:07:20] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok [22:07:20] /var on yarrow is OK: DISK OK - free space: /var 11653 MB (87% inode=96%): [22:07:20] Environment IPMI on nightshade is OK: ok: temperature ok fan ok voltage ok chassis ok [22:07:20] /var/tmp on yarrow is OK: DISK OK - free space: /var/tmp 827 MB (97% inode=99%): [22:07:20] toolserver.org HTTP on ortelius is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 1.907 second response time [22:07:20] aliasd on yarrow is OK: TCP OK - 0.002 second response time on port 984 [500 Not found.] 
[22:07:50] SSH on willow is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [22:08:00] toolserver.org HTTP on wolfsbane is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.006 second response time [22:08:20] toolserver.org HTTP on ortelius is OK: HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.044 second response time [22:08:30] looks better [22:10:31] Yes. Did we go back to before-something, or are the IP issues now fixed for good? [22:12:41] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:12:41] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:12:48] (Okay, that was too early :-).) [22:13:37] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:13:37] APT on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:13:37] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:13:49] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [22:13:49] APT on yarrow is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [22:13:59] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [22:14:17] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [22:14:18] APT on nightshade is WARNING: APT WARNING: 67 packages available for upgrade (0 critical updates). [22:14:44] ok turnera disabled first - nfs back [22:14:48] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 16.00, 32.44, 51.12 [22:14:48] Load avg. on yarrow is WARNING: WARNING - load average: 12.32, 15.82, 19.94 [22:19:27] SSH on turnera is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:19:58] FMA on turnera is CRITICAL: ERROR - unexpected output from snmpwalk [22:20:08] NTP on turnera is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:20:18] /tmp on turnera is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[22:20:18] PING on turnera is CRITICAL: CRITICAL - Plugin timed out after 10 seconds [22:20:18] DiskSuite on turnera is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:20:18] Environment IPMI on turnera is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:20:18] SMTP on turnera is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:20:19] Load avg. on turnera is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:20:48] MySQL slave on daphne is OK: Uptime: 1910049 Threads: 52 Questions: 225860252 Slow queries: 414111 Opens: 80411 Flush tables: 1 Open tables: 1916 Queries per second avg: 118.248 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1795 [22:20:48] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 352535.000000 [22:20:48] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 112625.000000 [22:20:57] s4 replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1748.000000 [22:21:48] Load avg. on yarrow is OK: OK - load average: 3.81, 7.45, 14.36 [22:24:48] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:24:58] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs [22:25:18] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:25:18] APT on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:25:28] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:25:48] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:25:48] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 58542 MB (9% inode=99%): [22:30:48] Load avg. on yarrow is WARNING: WARNING - load average: 8.65, 15.22, 15.27 [22:32:48] NTP on rosemary is CRITICAL: NTP CRITICAL: Server not synchronized, Offset unknown [22:48:08] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs
[22:48:48] Load avg. on nightshade is WARNING: WARNING - load average: 7.16, 10.81, 19.25 [22:50:17] Load avg. on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:54:38] NFS server ha-nfs.esi not responding still trying [22:55:54] still working on servers? [22:56:05] yes [22:56:58] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 686228.000000 [22:57:08] SMTP on z-dat-s5-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:57:17] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:57:17] SMTP on sage is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:57:17] aliasd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:57:17] /tmp on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:57:17] still working [22:57:37] /mnt on thyme is UNKNOWN: NRPE: Unable to read output [22:57:37] Environment IPMI on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:57:38] Load avg. on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:57:47] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 5046 [22:57:48] /tmp on thyme is UNKNOWN: NRPE: Unable to read output [22:57:49] SSH on mayapple is CRITICAL: Server answer: [22:57:49] Environment IPMI on thyme is UNKNOWN: NRPE: Unable to read output [22:57:49] SMTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:57:49] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 278126 MB (5% inode=64%): [22:57:49] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [22:57:49] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates).
[22:57:50] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 509529.000000 [22:57:50] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [22:57:51] APT on yucca is WARNING: APT WARNING: 39 packages available for upgrade (0 critical updates). [22:57:58] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [22:57:58] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:57:58] RAID on thyme is UNKNOWN: NRPE: Unable to read output [22:57:58] MySQL slave on z-dat-s1-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 24659 [22:57:58] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 113924.000000 [22:57:58] ethernet 0/1/12 [csw1-esams:1/24] on asw-oe10-esams.mgmt is CRITICAL: GigabitEthernet0/1/12:DOWN: 1 int NOK : CRITICAL [22:57:58] FMA on amaranth is CRITICAL: ERROR - unexpected output from snmpwalk [22:57:59] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc [22:57:59] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 5037.000000 [22:58:00] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1222297.000000 [22:58:08] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 2388957.000000 [22:58:18] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1499691.000000 [22:58:18] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[22:58:18] / on thyme is UNKNOWN: NRPE: Unable to read output [22:58:18] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on thyme (146) [22:58:18] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 1340494.000000 [22:58:19] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:58:19] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:58:20] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:58:20] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:58:21] APT on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:58:21] NTP on mayapple is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:58:22] SMTP on z-dat-s1-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:58:22] NTP on amaranth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:58:28] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:58:28] Load avg. on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:58:48] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:58:48] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [22:58:48] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [22:58:57] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:02:08] NTP on cassia is CRITICAL: NTP CRITICAL: Offset 10.05665 secs [23:13:48] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:14:08] toolserver.org HTTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:14:18] aliasd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:18] Environment IPMI on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[23:14:18] /home on hemlock is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:18] Load avg. on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:37] Sun Grid Engine execd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:37] aliasd on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:37] /var on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:37] / on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:38] SRaid on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:38] /var/tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:38] Sensors on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:39] /tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:48] Sun Grid Engine execd on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:14:48] /var on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:15:07] / on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:15:07] /var/tmp on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:15:08] Environment IPMI on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:15:08] Sensors on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:15:28] /tmp on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:20:48] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 354019.000000 [23:20:48] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 114613.000000 [23:22:48] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:22:48] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[23:23:17] Sun Grid Engine execd on willow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:23:17] APT on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:23:27] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:24:58] NTP on ptolemy is CRITICAL: NTP CRITICAL: Offset 11.505836 secs [23:25:48] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 58476 MB (9% inode=99%): [23:27:28] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [23:44:27] APT on sage is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [23:44:27] Load avg. on yarrow is CRITICAL: CRITICAL - load average: 18.51, 30.09, 32.76 [23:44:27] APT on z-dat-s2-b is WARNING: APT WARNING: 34 packages available for upgrade (0 critical updates). [23:44:27] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 278101 MB (5% inode=64%): [23:44:36] FMA on thyme is CRITICAL: ERROR - unexpected output from snmpwalk [23:44:36] Load avg. on thyme is UNKNOWN: NRPE: Unable to read output [23:44:36] APT on z-dat-s1-b is WARNING: APT WARNING: 35 packages available for upgrade (0 critical updates). [23:44:36] SMF on amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:44:36] SMTP on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:44:45] SMTP on wolfsbane is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:44:46] SRaid on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:44:46] SMF on web.amaranth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:44:55] Sensors on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:44:55] / on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[23:44:55] Sun Grid Engine execd on mayapple is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:48:35] NTP on hyacinth is CRITICAL: NTP CRITICAL: Offset 10.358761 secs [23:49:36] s4 replag on z-dat-s5-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2337.000000 [23:49:46] Sun Grid Engine execd on nightshade is UNKNOWN: Error with qhost: error: commlib error: got select error (Connection refused) [23:50:05] Sun Grid Engine execd on yarrow is UNKNOWN: Error with qhost: error: commlib error: got select error (Connection refused) [23:50:26] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2371.000000 [23:51:17] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 7.89, 27.97, 56.23 [23:51:17] /sql on ptolemy is CRITICAL: DISK CRITICAL - free space: /sql 58452 MB (9% inode=99%): [23:51:25] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 116301.000000 [23:51:36] NTP on cassia is CRITICAL: NTP CRITICAL: Offset 10.05665 secs [23:51:45] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok [23:52:05] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok [23:54:26] Sun Grid Engine execd on ortelius is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:54:46] APT on yarrow is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:55:05] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:55:25] APT on nightshade is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.