[00:06:05] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 403073 MB (7% inode=39%): [00:08:26] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:18:27] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32575.000000 [00:18:45] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 32598 [00:22:46] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [00:23:15] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [00:27:26] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [00:31:45] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [00:32:55] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [00:34:45] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [00:56:46] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [01:06:04] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 402847 MB (7% inode=39%): [01:18:26] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 36178.000000 [01:18:45] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 36198 [01:23:27] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [01:23:46] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:27:26] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [01:32:36] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [01:34:46] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [01:56:46] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [01:59:55] /sql on thyme is WARNING: 
DISK WARNING - free space: /sql 193251 MB (20% inode=99%): [02:06:05] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 402781 MB (7% inode=39%): [02:18:27] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 39778.000000 [02:18:56] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 39803 [02:23:26] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [02:23:46] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:28:25] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [02:32:47] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [02:34:47] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [02:38:27] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:56:46] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [03:06:26] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 400807 MB (7% inode=39%): [03:18:27] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 43379.000000 [03:18:56] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 43406 [03:23:27] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [03:23:56] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:28:28] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [03:32:57] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [03:34:56] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [03:38:26] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[03:52:56] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [03:56:56] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [04:06:26] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 401598 MB (7% inode=39%): [04:08:28] /tmp on ortelius is WARNING: DISK WARNING - free space: /tmp 2195 MB (19% inode=99%): [04:16:25] /tmp on ortelius is OK: DISK OK - free space: /tmp 2365 MB (21% inode=99%): [04:18:35] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 46987.000000 [04:18:56] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 47007 [04:19:26] /tmp on ortelius is WARNING: DISK WARNING - free space: /tmp 2100 MB (19% inode=99%): [04:23:35] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [04:23:56] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:28:35] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [04:31:25] /tmp on ortelius is OK: DISK OK - free space: /tmp 3555 MB (28% inode=99%): [04:33:56] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [04:34:56] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [04:44:35] s1 replag on thyme is CRITICAL: (Service Check Timed Out) [04:45:05] s1 replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 526.000000 [04:56:56] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [05:07:25] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 401473 MB (7% inode=39%): [05:18:45] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 50593.000000 [05:18:55] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 50607 [05:23:35] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [05:23:56] SMF 
on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [05:28:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [05:33:56] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [05:34:56] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [05:56:56] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [06:07:25] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 401404 MB (7% inode=39%): [06:18:57] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 54208 [06:19:45] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 54257.000000 [06:23:35] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [06:24:05] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:29:45] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [06:33:56] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [06:35:05] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [06:47:45] Load avg. on willow is WARNING: WARNING - load average: 19.80, 13.57, 10.64 [06:50:45] Load avg. 
on willow is OK: OK - load average: 10.12, 12.96, 11.01 [06:57:05] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [07:07:25] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 401158 MB (7% inode=39%): [07:19:07] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 57813 [07:19:45] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 57858.000000 [07:23:45] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [07:24:07] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [07:29:45] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [07:34:07] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [07:36:05] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [07:53:25] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[07:57:06] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [08:07:35] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 401090 MB (7% inode=39%): [08:19:07] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 61413 [08:19:46] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 61458.000000 [08:24:07] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:24:45] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [08:29:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [08:33:25] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [08:35:05] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [08:36:05] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [08:45:43] (created) [TS-1413] cassia: connection errors because of "MySQL server has gone away" and s5 replication has stopped; Toolserver: Databases; Critical Bug <https://jira.toolserver.org/browse/TS-1413> (merl) [08:57:07] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [08:58:05] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 39757 MB (9% inode=99%): [08:58:42] (assigned) [TS-1413] cassia: connection errors because of "MySQL server has gone away" and s5 replication has stopped <https://jira.toolserver.org/browse/TS-1413> (Marlen Caemmerer) [09:02:43] (commented) [TS-1413] cassia: connection errors because of "MySQL server has gone away" and s5 replication has stopped <https://jira.toolserver.org/browse/TS-1413> (Marlen Caemmerer) [09:08:35] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 400590 MB (7% inode=39%): [09:10:06] /sql on z-dat-s4-a is CRITICAL: DISK CRITICAL - free space: /sql 21817 MB (5% inode=99%): [09:19:07] MySQL slave on 
cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 65013 [09:19:55] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 65065.000000 [09:24:07] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:24:46] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [09:29:54] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [09:31:05] /sql on cassia is CRITICAL: DISK CRITICAL - free space: /sql 58479 MB (4% inode=98%): [09:35:16] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [09:36:15] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [09:42:16] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [09:43:15] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 37053 MB (9% inode=99%): [09:44:15] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 79402 MB (19% inode=99%): [10:08:35] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 400962 MB (7% inode=39%): [10:15:27] Sun Grid Engine execd on wolfsbane is WARNING: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.139160/1.00, alarm hl:np_load_long=0.168457/1.50, alarm hl:mem_free=554.000000M/600M, alarm hl:tmp_free=14155M/100M, alarm hl:available=1/0 [10:19:27] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 68629 [10:20:52] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 68727.000000 [10:24:25] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:24:53] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [10:26:25] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [10:27:33] SSH on hyacinth 
is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:28:03] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:28:26] SSH on hyacinth is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [10:28:43] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [10:30:25] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [10:31:44] MySQL slave on z-dat-s3-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1933 [10:34:53] MySQL slave on z-dat-s6-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1949 [10:35:25] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [10:36:24] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [10:36:44] MySQL slave on z-dat-s3-a is CRITICAL: (Service Check Timed Out) [10:36:53] MySQL slave on z-dat-s6-a is CRITICAL: (Service Check Timed Out) [10:37:33] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:37:33] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:37:33] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:37:33] SSH on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:37:53] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:37:53] SMF on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:37:53] SMF on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:37:53] /tmp on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:37:53] Load avg. on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:37:54] /tmp on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:37:54] SMF on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[10:37:55] / on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:37:55] SMF on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:37:56] / on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:37:56] SMF on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:37:57] Load avg. on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:38:02] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:38:03] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:38:12] Environment IPMI on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:38:33] SMTP on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:38:33] SMTP on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:38:34] SMTP on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:38:53] MySQL on z-dat-s7-a is CRITICAL: (Service Check Timed Out) [10:38:53] MySQL slave on z-dat-s7-a is CRITICAL: (Service Check Timed Out) [10:38:53] Load avg. on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:38:54] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:38:54] SMTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:38:54] /tmp on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:38:54] /sql on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:38:54] /sql on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:38:54] /tmp on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:38:55] / on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:38:55] Load avg. on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:38:56] Load avg. on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[10:39:03] MySQL on z-dat-s3-a is CRITICAL: (Service Check Timed Out) [10:39:13] s4 replag on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [10:39:13] / on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:39:13] / on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:39:13] /sql on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:39:13] /tmp on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:39:53] MySQL on z-dat-s6-a is CRITICAL: (Service Check Timed Out) [10:39:54] MySQL on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [10:39:54] MySQL slave on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [10:40:08] umm, any ts admins around? What's going on? ^^ [10:40:34] /sql on z-dat-s7-a is OK: DISK OK - free space: /sql 75541 MB (18% inode=99%): [10:40:34] SMTP on z-dat-s6-a is OK: SMTP OK - 7.890 sec. response time [10:40:34] SMTP on z-dat-s3-a is OK: SMTP OK - 9.128 sec. response time [10:40:34] MySQL on z-dat-s6-a is OK: Uptime: 226971 Threads: 36 Questions: 48339001 Slow queries: 21967 Opens: 477128 Flush tables: 1 Open tables: 2736 Queries per second avg: 212.974 [10:40:34] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 63985 MB (15% inode=99%): [10:40:34] SMTP on z-dat-s7-a is OK: SMTP OK - 4.719 sec. response time [10:40:34] Load avg. on z-dat-s6-a is OK: OK - load average: 0.03, 0.18, 0.56 [10:40:35] /tmp on z-dat-s4-a is OK: DISK OK - free space: /tmp 1788 MB (99% inode=99%): [10:40:35] / on hyacinth is OK: DISK OK - free space: / 8218 MB (27% inode=85%): [10:40:36] /tmp on hyacinth is OK: DISK OK - free space: /tmp 1775 MB (100% inode=99%): [10:40:36] /tmp on z-dat-s6-a is OK: DISK OK - free space: /tmp 1772 MB (99% inode=99%): [10:40:37] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 115177 MB (11% inode=98%): [10:40:48] SMTP on z-dat-s4-a is OK: SMTP OK - 0.005 sec. response time [10:40:48] SMTP on hyacinth is OK: SMTP OK - 0.002 sec. 
response time [10:40:49] MySQL on z-dat-s7-a is OK: Uptime: 2184321 Threads: 11 Questions: 671191792 Slow queries: 118072 Opens: 4959601 Flush tables: 1 Open tables: 6502 Queries per second avg: 307.277 [10:40:49] MySQL slave on z-dat-s7-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2115 [10:40:50] /tmp on z-dat-s7-a is OK: DISK OK - free space: /tmp 1857 MB (99% inode=99%): [10:40:53] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [10:40:53] Environment IPMI on hyacinth is OK: ok: temperature ok fan ok voltage ok chassis ok [10:41:24] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [10:41:25] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [10:41:25] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [10:42:02] Sumana Harihareswara * [Toolserver-l] Upcoming hackathon for experts AND newbies: Washington, DC, USA July 10-11 [10:46:25] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.192383/1.10, alarm hl:np_load_long=0.176269/1.55, alarm hl:mem_free=235.000000M/500M, alarm hl:tmp_free=14092M/200M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.192383/1.00, alarm hl:np_load_long=0.176269/1.50, alarm hl:mem_free=235.000000M/600M, alarm hl:tmp_free= [10:47:25] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [10:54:09] c-moll: ?? [10:57:40] I'm talking about the usual tsnag flooding of the channel. Normal spam or more than normal? Everything ok? 
[10:59:13] looks normal [10:59:49] * Betacommand has tsnag comments hidden unless he views the unfiltered channel [11:00:25] MySQL slave on z-dat-s3-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3633 [11:02:25] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.211426/1.95, alarm hl:tmp_free=45568M/100M, alarm hl:np_load_avg=0.997559/2.0, alarm hl:mem_free=298.000000M/350M, alarm hl:available=1/0 [11:04:25] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [11:05:25] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.463379/1.10, alarm hl:np_load_long=0.262695/1.55, alarm hl:mem_free=206.000000M/500M, alarm hl:tmp_free=14032M/200M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.463379/1.00, alarm hl:np_load_long=0.262695/1.50, alarm hl:mem_free=206.000000M/600M, alarm hl:tmp_free= [11:05:33] MySQL slave on z-dat-s6-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3643 [11:08:44] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 398250 MB (7% inode=39%): [11:11:13] MySQL slave on z-dat-s7-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3633 [11:11:26] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.182617/1.95, alarm hl:tmp_free=45545M/100M, alarm hl:np_load_avg=1.125000/2.0, alarm hl:mem_free=247.000000M/350M, alarm hl:available=1/0 [11:12:04] MySQL slave on z-dat-s4-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3608 [11:12:34] s4 replag on z-dat-s4-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3632.000000 [11:20:25] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes 
Slave SQL: Yes Seconds Behind Master: 72289 [11:20:53] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 72327.000000 [11:24:24] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:24:54] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [11:30:25] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [11:35:26] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [11:36:25] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [11:40:25] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.755859/1.95, alarm hl:tmp_free=45467M/100M, alarm hl:np_load_avg=0.810059/2.0, alarm hl:mem_free=340.000000M/350M, alarm hl:available=1/0 [11:43:25] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [11:45:37] @replag [11:45:38] Merlissimo: s2-user: 1w 21h 38m 36s [-0.58 s/s]; s3-rr-a: 1h 45m 7s [+0.97 s/s]; s3-user: 1h 45m 7s [+0.97 s/s]; s4-user: 1h 23m 7s [+0.76 s/s]; s5-rr-a: 20h 30m 15s [+1.00 s/s]; s5-user: 20h 30m 15s [+1.00 s/s]; s6-rr-a: 1h 38m 25s [+0.92 s/s]; s6-user: 1h 38m 25s [+0.92 s/s] [11:45:39] Merlissimo: s7-rr-a: 1h 32m 14s [+0.88 s/s]; s7-user: 1h 32m 14s [+0.88 s/s] [11:55:33] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:55:33] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:56:24] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [11:56:25] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [12:00:33] MySQL slave on z-dat-s3-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 7166 [12:04:43] s4 replag on z-dat-s4-a is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3580.000000 [12:05:03] 
MySQL slave on z-dat-s4-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3402 [12:05:44] MySQL slave on z-dat-s6-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 7025 [12:05:54] [[Special:Log/newusers]] create * Bieliznasklepnet * (New user account) [12:08:44] s4 replag on z-dat-s4-a is OK: QUERY OK: SELECT ts_rc_age() returned 1613.000000 [12:08:44] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 398455 MB (7% inode=39%): [12:09:03] MySQL slave on z-dat-s4-a is OK: Uptime: 2211249 Threads: 9 Questions: 142354020 Slow queries: 25056 Opens: 43338 Flush tables: 1 Open tables: 1012 Queries per second avg: 64.377 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1440 [12:11:25] MySQL slave on z-dat-s7-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 6920 [12:20:24] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 75892 [12:20:53] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 75926.000000 [12:23:25] /sql on cassia is WARNING: DISK WARNING - free space: /sql 84935 MB (7% inode=99%): [12:24:25] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:25:03] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [12:28:04] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:28:16] [[Special:Log/newusers]] create * AshuROGBEER * (New user account) [12:28:52] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [12:31:25] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [12:36:25] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [12:36:25] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [12:39:34] Sun Grid Engine execd on 
wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.282715/1.10, alarm hl:np_load_long=0.236328/1.55, alarm hl:mem_free=172.000000M/500M, alarm hl:tmp_free=14052M/200M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.282715/1.00, alarm hl:np_load_long=0.236328/1.50, alarm hl:mem_free=172.000000M/600M, alarm hl:tmp_free= [12:40:25] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [12:46:24] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.220703/1.10, alarm hl:np_load_long=0.222656/1.55, alarm hl:mem_free=424.000000M/500M, alarm hl:tmp_free=14011M/200M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.220703/1.00, alarm hl:np_load_long=0.222656/1.50, alarm hl:mem_free=424.000000M/600M, alarm hl:tmp_free= [13:00:34] MySQL slave on z-dat-s3-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 10459 [13:05:43] MySQL slave on z-dat-s6-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 9795 [13:08:44] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 398307 MB (7% inode=39%): [13:12:23] MySQL slave on z-dat-s7-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 10127 [13:20:53] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 15534.000000 [13:21:23] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 14940 [13:24:33] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:25:03] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [13:27:34] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.883301/1.95, alarm 
hl:tmp_free=45176M/100M, alarm hl:np_load_avg=0.865234/2.0, alarm hl:mem_free=286.000000M/350M, alarm hl:available=1/0 [13:28:03] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:28:12] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:28:53] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [13:29:03] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [13:32:23] MySQL slave on cassia is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3439 [13:32:23] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [13:32:53] s5 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2902.000000 [13:35:53] s5 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1334.000000 [13:36:23] MySQL slave on cassia is OK: Uptime: 4067153 Threads: 21 Questions: 4740361000 Slow queries: 1548033 Opens: 10667553 Flush tables: 1 Open tables: 16135 Queries per second avg: 1165.523 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 990 [13:36:33] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [13:36:34] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [13:43:33] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [13:47:34] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.849609/1.95, alarm hl:tmp_free=45124M/100M, alarm hl:np_load_avg=0.831055/2.0, alarm hl:mem_free=250.000000M/350M, alarm hl:available=1/0 [13:51:34] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [13:54:53] SSH on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:55:03] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds 
[13:55:44] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:55:44] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:56:03] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:56:04] /tmp on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:56:12] Load avg. on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:56:13] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:56:34] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 115668 MB (11% inode=98%): [13:56:34] /tmp on z-dat-s6-a is OK: DISK OK - free space: /tmp 1761 MB (99% inode=99%): [13:56:34] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [13:56:34] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [13:56:44] Load avg. on z-dat-s7-a is OK: OK - load average: 1.18, 1.59, 2.04 [13:56:44] SSH on hyacinth is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [13:56:53] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [13:56:53] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [14:00:33] /sql on thyme is WARNING: DISK WARNING - free space: /sql 192598 MB (20% inode=99%): [14:01:12] MySQL slave on z-dat-s3-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 11631 [14:05:53] MySQL slave on z-dat-s6-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 5908 [14:08:54] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 398256 MB (7% inode=39%): [14:09:34] MySQL slave on z-dat-s7-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3455 [14:14:34] MySQL slave on z-dat-s7-a is OK: Uptime: 2197147 Threads: 7 Questions: 672300798 Slow queries: 121003 Opens: 4959661 Flush tables: 1 Open tables: 6526 Queries per second avg: 305.988 Slave IO: Yes Slave SQL: Yes Seconds 
Behind Master: 1438 [14:20:03] MySQL slave on z-dat-s6-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3466 [14:24:35] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [14:25:12] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [14:28:03] MySQL slave on z-dat-s6-a is OK: Uptime: 240650 Threads: 7 Questions: 49390874 Slow queries: 25970 Opens: 477749 Flush tables: 1 Open tables: 2797 Queries per second avg: 205.239 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1765 [14:32:23] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [14:36:34] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [14:36:34] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [14:55:13] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:55:34] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [14:55:34] Environment IPMI on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[14:55:34] SMTP on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:55:43] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:55:43] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:56:34] s4 replag on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [14:56:34] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [14:56:35] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.668457/1.95, alarm hl:tmp_free=44949M/100M, alarm hl:np_load_avg=0.738769/2.0, alarm hl:mem_free=228.000000M/350M, alarm hl:available=1/0 [14:56:43] MySQL on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [14:56:43] MySQL slave on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [14:56:53] MySQL on z-dat-s6-a is CRITICAL: (Service Check Timed Out) [14:57:04] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [14:57:23] MySQL slave on z-dat-s6-a is CRITICAL: (Service Check Timed Out) [14:57:33] SMTP on z-dat-s3-a is OK: SMTP OK - 4.511 sec. 
response time [14:57:33] Environment IPMI on hyacinth is OK: ok: temperature ok fan ok voltage ok chassis ok [14:57:33] MySQL slave on z-dat-s6-a is OK: Uptime: 242424 Threads: 11 Questions: 49666114 Slow queries: 26121 Opens: 477815 Flush tables: 1 Open tables: 2797 Queries per second avg: 204.872 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 287 [14:57:33] MySQL on z-dat-s6-a is OK: Uptime: 242426 Threads: 11 Questions: 49666203 Slow queries: 26122 Opens: 477815 Flush tables: 1 Open tables: 2797 Queries per second avg: 204.871 [14:57:34] MySQL on z-dat-s4-a is OK: Uptime: 2221366 Threads: 18 Questions: 142849100 Slow queries: 25216 Opens: 43444 Flush tables: 1 Open tables: 1018 Queries per second avg: 64.306 [14:57:34] s4 replag on z-dat-s4-a is OK: QUERY OK: SELECT ts_rc_age() returned 292.000000 [14:57:34] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [14:57:54] MySQL slave on z-dat-s7-a is CRITICAL: (Service Check Timed Out) [14:57:54] MySQL on z-dat-s7-a is CRITICAL: (Service Check Timed Out) [14:58:04] MySQL slave on z-dat-s7-a is OK: Uptime: 2199757 Threads: 9 Questions: 672584713 Slow queries: 121182 Opens: 4960286 Flush tables: 1 Open tables: 6533 Queries per second avg: 305.754 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 308 [14:58:04] MySQL slave on z-dat-s4-a is OK: Uptime: 2221389 Threads: 15 Questions: 142851595 Slow queries: 25218 Opens: 43444 Flush tables: 1 Open tables: 1018 Queries per second avg: 64.307 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 254 [14:58:04] MySQL on z-dat-s7-a is OK: Uptime: 2199757 Threads: 9 Questions: 672584715 Slow queries: 121182 Opens: 4960286 Flush tables: 1 Open tables: 6533 Queries per second avg: 305.754 [14:58:04] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [15:01:43] MySQL slave on z-dat-s3-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 4657 [15:09:04] /aux0 on hemlock is WARNING: DISK WARNING - free space: 
/aux0 396972 MB (7% inode=39%): [15:19:43] MySQL slave on z-dat-s3-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3597 [15:25:12] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [15:25:35] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:25:42] MySQL slave on z-dat-s3-a is CRITICAL: (Service Check Timed Out) [15:26:13] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:26:34] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:26:34] Environment IPMI on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:26:43] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:26:43] SMTP on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:26:43] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:26:43] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:26:53] SSH on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:26:53] Load avg. on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:03] Load avg. on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:03] /sql on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:04] / on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:04] /tmp on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:04] /sql on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:04] / on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:04] /tmp on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:05] SMF on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:05] SMF on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[15:27:06] SMF on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:06] SMF on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:07] SMF on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:27:12] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [15:27:13] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [15:27:14] Environment IPMI on hyacinth is OK: ok: temperature ok fan ok voltage ok chassis ok [15:27:22] Load avg. on z-dat-s4-a is OK: OK - load average: 0.54, 1.46, 2.45 [15:27:35] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [15:27:35] Load avg. on z-dat-s6-a is OK: OK - load average: 0.68, 1.47, 2.44 [15:27:35] SMTP on z-dat-s7-a is OK: SMTP OK - 0.042 sec. response time [15:27:35] / on z-dat-s6-a is OK: DISK OK - free space: / 8216 MB (27% inode=85%): [15:27:35] /sql on z-dat-s6-a is OK: DISK OK - free space: /sql 115161 MB (11% inode=98%): [15:27:35] /tmp on z-dat-s6-a is OK: DISK OK - free space: /tmp 1969 MB (99% inode=99%): [15:27:36] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [15:27:36] / on z-dat-s4-a is OK: DISK OK - free space: / 8216 MB (27% inode=85%): [15:27:36] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 63812 MB (15% inode=99%): [15:27:37] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [15:27:37] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [15:27:38] SMF on z-dat-s6-a is OK: OK - all services online [15:27:38] SMF on z-dat-s7-a is OK: OK - all services online [15:27:39] SMF on z-dat-s3-a is OK: OK - all services online [15:28:03] MySQL slave on z-dat-s3-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3591 [15:32:23] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [15:32:33] Sun Grid Engine execd on willow is WARNING: 
medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.750000/1.95, alarm hl:tmp_free=44852M/100M, alarm hl:np_load_avg=0.762207/2.0, alarm hl:mem_free=305.000000M/350M, alarm hl:available=1/0 [15:35:34] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [15:36:34] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [15:36:34] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [15:38:34] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:42:04] MySQL slave on z-dat-s3-a is OK: Uptime: 293278 Threads: 19 Questions: 253949690 Slow queries: 48756 Opens: 3007739 Flush tables: 1 Open tables: 16383 Queries per second avg: 865.900 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1708 [15:49:44] 3(created) [ACCAPP-531] Request for a Toolserver Account for Fa-wiki; Account Approval; New Account <10https://jira.toolserver.org/browse/ACCAPP-531> (Mahan Minayi) [16:06:13] SSH on adenia is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:07:04] SSH on adenia is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [16:09:04] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396885 MB (7% inode=39%): [16:17:34] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.830078/1.95, alarm hl:tmp_free=44730M/100M, alarm hl:np_load_avg=0.871582/2.0, alarm hl:mem_free=290.000000M/350M, alarm hl:available=1/0 [16:24:04] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1947.000000 [16:25:13] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [16:26:34] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:27:33] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [16:32:23] 
Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [16:34:04] s4 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1747.000000 [16:34:34] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.209473/1.95, alarm hl:tmp_free=44685M/100M, alarm hl:np_load_avg=0.980957/2.0, alarm hl:mem_free=185.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.209473/2.3, alarm hl:np_load_long=0.909668/2.5, alarm hl:cpu=96.500000/98, alarm hl:mem_free=185.000000M/200M, al [16:36:34] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [16:36:34] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [16:38:33] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:39:02] Dr. Trigon * Re: [Toolserver-l] Anoter SGE question [16:40:34] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [16:42:02] Dr. 
Trigon * Re: [Toolserver-l] Install OpenCV libraries and python bindings [16:53:04] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [17:10:04] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396792 MB (7% inode=39%): [17:14:43] Sun Grid Engine execd on wolfsbane is WARNING: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.253906/1.00, alarm hl:np_load_long=0.285156/1.50, alarm hl:mem_free=535.000000M/600M, alarm hl:tmp_free=13743M/100M, alarm hl:available=1/0 [17:15:43] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [17:16:43] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.041504/1.95, alarm hl:tmp_free=44577M/100M, alarm hl:np_load_avg=1.026855/2.0, alarm hl:mem_free=319.000000M/350M, alarm hl:available=1/0 [17:17:44] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [17:25:24] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [17:26:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [17:32:23] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [17:36:44] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [17:36:54] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [17:46:53] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.004395/1.95, alarm hl:tmp_free=44499M/100M, alarm hl:np_load_avg=0.953613/2.0, alarm hl:mem_free=272.000000M/350M, alarm hl:available=1/0 [17:48:54] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [17:51:53] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm 
hl:np_load_short=0.268066/1.10, alarm hl:np_load_long=0.232910/1.55, alarm hl:mem_free=359.000000M/500M, alarm hl:tmp_free=13682M/200M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.268066/1.00, alarm hl:np_load_long=0.232910/1.50, alarm hl:mem_free=359.000000M/600M, alarm hl:tmp_free= [17:51:53] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.182617/1.95, alarm hl:tmp_free=44482M/100M, alarm hl:np_load_avg=1.092285/2.0, alarm hl:mem_free=313.000000M/350M, alarm hl:available=1/0 [17:53:43] Sun Grid Engine execd on wolfsbane is OK: short-sol@wolfsbane OK: medium-sol@wolfsbane OK [18:02:55] Sun Grid Engine execd on wolfsbane is WARNING: short-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.356934/1.10, alarm hl:np_load_long=0.250976/1.55, alarm hl:mem_free=179.000000M/500M, alarm hl:tmp_free=13660M/200M, alarm hl:available=1/0: medium-sol@wolfsbane exceedes load threshold: alarm hl:np_load_short=0.356934/1.00, alarm hl:np_load_long=0.250976/1.50, alarm hl:mem_free=179.000000M/600M, alarm hl:tmp_free= [18:09:54] /tmp on ortelius is WARNING: DISK WARNING - free space: /tmp 2474 MB (20% inode=99%): [18:10:14] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396745 MB (7% inode=39%): [18:10:54] /tmp on ortelius is OK: DISK OK - free space: /tmp 2674 MB (22% inode=99%): [18:25:23] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [18:26:54] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [18:32:34] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [18:36:54] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [18:36:54] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [19:10:14] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 
398009 MB (7% inode=39%): [19:25:33] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [19:26:55] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [19:32:02] DaB. * Re: [Toolserver-l] Install OpenCV libraries and python bindings [19:32:33] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [19:35:15] multichill: poke [19:36:54] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [19:36:54] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [19:42:10] Betacommand: What's up? [19:43:15] multichill: I know you wrote flickrripper.py and I'm trying to fix an issue with it, and thought it might be easier for you to fix [19:44:10] lines 157-161 where it grabs the description and uses it for the file name [19:45:10] when you start working with non-latin descriptions it doesn't handle multi-byte characters well, it ended up with a title over 320 bytes :/ [19:45:24] the max mediawiki lets you have is 255 [19:46:41] Lol [19:47:17] multichill: really rather a pain in the ass [19:47:27] So the check should probably encode it and then see how long it is? [19:47:34] correct [19:47:36] Or just lower the limit a bit? [19:48:05] thai letters for example are 3 bytes [19:48:33] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:48:39] * Betacommand notes it was discovered with flickrripper.py -autonomous -user_id:40561337@N07 -addcategory:"Files from Abhisit Vejjajiva Flickr stream" [19:49:46] Betacommand: Could you file a bug for this? 
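A minimal sketch of the "encode it and then see how long it is" check discussed above. It is not the actual flickrripper.py code: the function name `truncate_utf8` and the constant `MAX_TITLE_BYTES` are hypothetical, the sketch assumes Python 3 and UTF-8 titles, and the 255-byte limit is taken from the conversation. A real file name would probably also need room for the file extension, which is not handled here.

```python
# Hypothetical helper, not taken from flickrripper.py.
MAX_TITLE_BYTES = 255  # byte limit mentioned in the discussion above


def truncate_utf8(title, max_bytes=MAX_TITLE_BYTES):
    """Cut title so that its UTF-8 encoding fits in max_bytes."""
    encoded = title.encode("utf-8")
    if len(encoded) <= max_bytes:
        return title
    # Slice on bytes, then decode; errors="ignore" drops a multi-byte
    # character that the slice cut in half at the end.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


# 100 Thai letters are 300 bytes in UTF-8 (3 bytes each), so 85 fit.
thai_title = "\u0e01" * 100
print(len(thai_title.encode("utf-8")), len(truncate_utf8(thai_title)))
```

Measuring bytes rather than characters lets Latin titles keep up to 255 characters, while a title made only of 3-byte characters is cut at 85, which is where the "cut it down to 85" figure comes from.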
[19:49:47] multichill: you would need to cut it down to 85 to be safe [19:50:51] multichill: filing [19:51:06] I helped :p [19:52:31] there is a similar issue with panoramiopicker.py [19:52:42] it breaks when the & character is used [19:52:56] python panoramiopicker.py -autonomous -set:2333416 -addcategory:"Files from V&A Dudush Panoramio stream" [19:52:58] that hangs [19:53:13] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [19:53:22] multichill: http://toolserver.org/~pywikipedia/b3536400 [19:55:39] ToAruShiroiNeko: I guess urlencoding is not working properly somewhere [19:57:14] their account name is V&A Dudush I believe [20:09:02] Dr. Trigon * Re: [Toolserver-l] Install OpenCV libraries and python bindings [20:10:23] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 398081 MB (7% inode=39%): [20:12:26] I remember reading up on how MySQL handles things, and the 255 byte limit allows it to sort in memory, if you instead define a title as VARCHAR(255) it'll resort to temp-files for sorting [20:12:35] BTDT [20:25:34] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [20:27:03] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [20:32:34] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [20:37:53] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [20:38:03] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [20:43:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.931152/1.95, alarm hl:tmp_free=43997M/100M, alarm hl:np_load_avg=0.912598/2.0, alarm hl:mem_free=320.000000M/350M, alarm hl:available=1/0 [20:50:03] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [20:53:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow 
exceedes load threshold: alarm hl:np_load_short=0.836914/1.95, alarm hl:tmp_free=43971M/100M, alarm hl:np_load_avg=0.929688/2.0, alarm hl:mem_free=308.000000M/350M, alarm hl:available=1/0 [21:10:24] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396372 MB (7% inode=39%): [21:26:34] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [21:27:05] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [21:33:33] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [21:38:05] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [21:38:05] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [22:01:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.193848/1.95, alarm hl:tmp_free=43778M/100M, alarm hl:np_load_avg=1.481445/2.0, alarm hl:mem_free=290.000000M/350M, alarm hl:available=1/0 [22:05:03] Sun Grid Engine execd on willow is OK: testqueue@willow disabled: medium-sol@willow OK: longrun-sol@willow OK [22:11:24] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 396449 MB (7% inode=39%): [22:12:00] DaB [22:14:03] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.701172/1.95, alarm hl:tmp_free=43739M/100M, alarm hl:np_load_avg=1.440430/2.0, alarm hl:mem_free=232.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.701172/2.3, alarm hl:np_load_long=1.311524/2.5, alarm hl:cpu=99.400000/98, alarm hl:mem_free=232.000000M/200M, al [22:26:44] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [22:27:05] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [22:33:44] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [22:39:04] SMF on 
damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [22:39:04] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [23:12:27] /aux0 on hemlock is WARNING: DISK WARNING - free space: /aux0 395989 MB (7% inode=39%): [23:26:45] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [23:27:13] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [23:34:44] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [23:40:03] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [23:40:13] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default