[00:07:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:13:55] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3545.000000 [00:17:55] s4 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1257.000000 [00:26:47] Load avg. on nightshade is WARNING: WARNING - load average: 22.53, 20.64, 16.51 [00:33:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [00:34:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 32126 [00:35:26] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 201184.000000 [00:36:45] Load avg. on nightshade is OK: OK - load average: 10.28, 13.40, 14.78 [00:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 90388 MB (14% inode=99%): [00:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [00:41:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[00:42:55] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [00:43:07] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 177887.000000 [00:43:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [00:43:16] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [00:43:46] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 22148.000000 [00:43:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [00:43:47] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [00:45:35] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [00:45:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [00:45:45] Load avg. on nightshade is WARNING: WARNING - load average: 15.86, 17.08, 16.14 [00:47:25] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 176701 [00:48:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 154954 MB (2% inode=69%): [00:53:46] Load avg. on nightshade is OK: OK - load average: 10.47, 13.29, 14.85 [01:06:46] Load avg. on nightshade is WARNING: WARNING - load average: 15.62, 18.35, 16.66 [01:07:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:16:46] Load avg. on nightshade is OK: OK - load average: 11.03, 13.21, 14.81 [01:24:16] Load avg. 
on ortelius is WARNING: WARNING - load average: 23.71, 20.19, 12.68 [01:33:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [01:34:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 32286 [01:36:15] Load avg. on ortelius is CRITICAL: CRITICAL - load average: 31.41, 24.77, 18.62 [01:36:25] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 204263.000000 [01:39:16] Load avg. on ortelius is WARNING: WARNING - load average: 15.66, 22.67, 19.07 [01:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 90204 MB (14% inode=99%): [01:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [01:41:17] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:42:55] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [01:43:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [01:43:17] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [01:43:46] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 16389.000000 [01:43:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [01:43:46] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [01:44:05] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 147753.000000 [01:45:36] Sun Grid Engine 
execd on willow is WARNING: NRPE: Unable to read output [01:45:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [01:45:48] @replag [01:45:48] DaBPunkt: s1-rr-a-wd: 35s [+0.01 s/s]; s1-user-wd: 33s [+0.01 s/s]; s2-rr: 8h 39m 9s [-1.14 s/s]; s2-user: 8h 39m 9s [-1.14 s/s]; s2-user-c: error; s2-user-wd: 2d 8h 41m 58s [+0.19 s/s]; s3-user: 16s [-0.07 s/s]; s3-user-wd: 35s [+0.01 s/s] [01:45:49] DaBPunkt: s4-user-wd: 4h 33m 18s [-1.57 s/s]; s5-user: 1d 16h 56m 32s [-9.56 s/s]; s5-user-c: error; s6-user-wd: 33s [+0.00 s/s]; s7-user-wd: 33s [+0.00 s/s] [01:46:17] @replag [01:46:17] DaBPunkt: s1-rr-a-wd: 35s [-]; s1-user-wd: 40s [+0.25 s/s]; s2-rr: 8h 38m 20s [-1.72 s/s]; s2-user: 8h 38m 20s [-1.72 s/s]; s2-user-c: error; s2-user-wd: 2d 8h 41m 56s [-0.07 s/s]; s3-user: 23s [+0.24 s/s]; s3-user-wd: 32s [-0.10 s/s] [01:46:18] DaBPunkt: s4-user-wd: 4h 32m 54s [-0.84 s/s]; s5-rr-a: 11s [-]; s5-user: 1d 16h 54m 58s [-3.29 s/s]; s5-user-c: error; s6-user-wd: 33s [-]; s7-user-wd: 34s [+0.03 s/s] [01:47:26] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 147177 [01:48:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 157339 MB (2% inode=69%): [02:07:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:11:17] Load avg. on ortelius is OK: OK - load average: 6.05, 11.54, 14.31 [02:22:16] Load avg. on ortelius is CRITICAL: CRITICAL - load average: 36.02, 23.81, 18.41 [02:33:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). 
[02:34:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 28479 [02:36:26] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 203992.000000 [02:40:16] Load avg. on ortelius is WARNING: WARNING - load average: 9.79, 17.70, 19.69 [02:41:06] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 90315 MB (14% inode=99%): [02:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [02:41:17] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:42:56] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [02:43:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [02:43:16] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [02:43:46] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 12619.000000 [02:43:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [02:43:46] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [02:44:05] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 144007.000000 [02:45:35] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [02:46:07] nacht ts [02:46:09] @replag [02:46:09] DaBPunkt: s1-rr-a-wd: 28s [-0.00 s/s]; s1-user-c: 13s [-0.11 s/s]; s1-user-wd: 26s [-0.00 s/s]; s2-rr: 7h 40m 13s [-0.97 s/s]; s2-user: 7h 40m 13s [-0.97 s/s]; s2-user-c: error; s2-user-wd: 2d 8h 41m 12s [-0.01 s/s]; s3-user: 1m 46s [+0.02 s/s] [02:46:10] DaBPunkt: s3-user-wd: 
28s [-0.00 s/s]; s4-rr-a: 13s [-]; s4-user: 13s [-]; s4-user-wd: 3h 23m 0s [-1.17 s/s]; s5-user: 1d 15h 57m 12s [-0.96 s/s]; s5-user-c: error; s6-user-wd: 27s [-0.00 s/s]; s7-user-wd: 28s [-0.00 s/s] [02:46:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [02:47:14] legoktm: I will take a look tomorrow [02:47:26] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 143706 [02:47:27] /sql on rosemary is WARNING: DISK WARNING - free space: /sql 67280 MB (6% inode=99%): [02:47:27] it eventually started so nothing urgent [02:47:31] but thanks [02:48:29] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 156250 MB (2% inode=69%): [02:56:15] Load avg. on ortelius is CRITICAL: CRITICAL - load average: 31.48, 19.79, 18.37 [02:58:17] Load avg. on ortelius is WARNING: WARNING - load average: 17.57, 20.33, 18.88 [03:03:15] Load avg. on ortelius is OK: OK - load average: 2.80, 9.67, 14.56 [03:07:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:33:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [03:34:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 23896 [03:36:26] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 203574.000000 [03:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 89876 MB (14% inode=99%): [03:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [03:41:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[03:42:55] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [03:43:45] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 5462.000000 [03:43:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [03:43:46] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [03:44:06] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 140756.000000 [03:44:16] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [03:44:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [03:45:35] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [03:46:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [03:47:26] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 140762 [03:48:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 156637 MB (2% inode=69%): [04:07:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:09:45] wikidata replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3555.000000 [04:17:46] wikidata replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1626.000000 [04:33:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). 
[04:34:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 17231 [04:37:29] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 202855.000000 [04:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 89775 MB (14% inode=99%): [04:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [04:41:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:42:56] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [04:43:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [04:43:47] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [04:44:06] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 141293.000000 [04:44:31] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [04:44:31] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [04:46:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [04:46:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [04:47:25] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 141394 [04:48:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 159482 MB (2% inode=69%): [05:07:16] NTP on yucca is 
CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:10:26] Free Memory on turnera is CRITICAL: CRITICAL - 2.2% (182608 kB) free! [05:17:16] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3487 [05:33:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [05:34:16] MySQL slave on z-dat-s2-b is OK: Uptime: 17137 Threads: 12 Questions: 23088402 Slow queries: 219 Opens: 399095 Flush tables: 1 Open tables: 256 Queries per second avg: 1347.283 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1651 [05:38:25] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 201548.000000 [05:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 89666 MB (14% inode=99%): [05:42:15] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:42:54] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [05:44:45] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [05:44:46] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [05:45:05] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 141525.000000 [05:45:15] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [05:45:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [05:46:46] FMA on amaranth is CRITICAL: Failed components: 
hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [05:47:35] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [05:48:25] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 141622 [05:49:24] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 159826 MB (3% inode=69%): [06:07:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:10:25] Free Memory on turnera is CRITICAL: CRITICAL - 2.0% (168728 kB) free! [06:11:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [06:33:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [06:38:24] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 198885.000000 [06:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 89538 MB (14% inode=99%): [06:42:15] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[06:43:56] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [06:44:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [06:44:47] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [06:45:04] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 140520.000000 [06:45:15] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [06:45:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [06:47:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [06:47:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [06:49:25] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 140704 [06:49:25] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 160210 MB (3% inode=69%): [07:08:15] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:10:35] Free Memory on turnera is CRITICAL: CRITICAL - 1.8% (150280 kB) free! 
[07:11:15] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL
[07:19:37] Sun Grid Engine execd on wolfsbane is CRITICAL: (Service Check Timed Out)
[07:21:44] Sun Grid Engine execd on willow is CRITICAL: (Service Check Timed Out)
[07:37:59] ugh
[07:37:59] again
[07:42:22] MySQL slave on rosemary is CRITICAL: (Return code of 139 is out of bounds)
[07:43:25] MySQL slave on rosemary is OK: Uptime: 386758 Threads: 7 Questions: 323516124 Slow queries: 250339 Opens: 22578 Flush tables: 1 Open tables: 3056 Queries per second avg: 836.482 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1434
[07:48:06] /sql on rosemary is WARNING: DISK WARNING - free space: /sql 67498 MB (6% inode=99%):
[07:48:14] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:48:23] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates).
[07:48:33] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[07:49:05] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[07:49:05] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default
[07:49:05] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 194311.000000
[07:49:05] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds)
[07:49:14] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds)
[07:49:14] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7
[07:50:06] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 160538 MB (3% inode=69%):
[08:02:05] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output
[08:03:06] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output
[08:08:33] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:12:05] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL
[08:30:53] Sun Grid Engine execd on wolfsbane is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[08:31:43] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output
[08:33:46] SMTP on thyme is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:33:46] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:33:47] SMTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:33:54] SMTP on z-dat-s2-b is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:34:13] SMTP on rosemary is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:37:24] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output
[08:38:24] ts-array5 on damiana is CRITICAL: Connection refused by host
[08:38:25] Free Memory on damiana is CRITICAL: Connection refused by host
[08:38:25] /tmp on damiana is CRITICAL: Connection refused by host
[08:38:25] Environment IPMI on damiana is CRITICAL: Connection refused by host
[08:38:54] PING on daphne is CRITICAL: CRITICAL - Plugin timed out after 10 seconds
[08:39:06] SMTP on damiana is CRITICAL: Connection refused
[08:39:06] SSH on damiana is CRITICAL: Connection refused
[08:39:06] Load avg. on damiana is CRITICAL: Connection refused by host
[08:39:06] DiskSuite on damiana is CRITICAL: Connection refused by host
[08:39:14] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:39:14] / on damiana is CRITICAL: Connection refused by host
[08:39:54] PING on daphne is OK: PING OK - Packet loss = 0%, RTA = 0.33 ms
[08:42:24] Sun Grid Engine execd on yarrow is UNKNOWN: Execution timeout exceeded
[08:46:15] toolserver.org HTTP on ortelius is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:48:05] SMTP on yarrow is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:48:34] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:48:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates).
[08:49:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 168153.000000 [08:49:14] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:49:36] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [08:49:46] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [08:50:05] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 160053 MB (3% inode=69%): [08:54:35] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [09:00:13] hello [09:00:24] yes its broken ;) [09:00:29] i'll try to fix [09:00:31] Hi nosy. :-) [09:00:59] Hi Susan [09:01:08] I'll be back when it works [09:07:02] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok [09:07:02] SMTP on z-dat-s4-a is OK: SMTP OK - 0.176 sec. response time [09:07:02] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [09:07:02] SMTP on z-dat-s2-b is OK: SMTP OK - 0.015 sec. 
response time [09:07:07] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [09:07:07] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 144698.000000 [09:08:56] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:09:26] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [09:12:06] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [09:13:16] MySQL on ha-sql.esi is CRITICAL: Access denied for user tsnagios7643@turnera-bge0 (using password: YES) [09:32:36] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [09:34:26] Load avg. on nightshade is WARNING: WARNING - load average: 14.67, 16.42, 14.44 [09:38:16] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [09:38:26] Environment IPMI on damiana is CRITICAL: Connection refused by host [09:38:27] Free Memory on damiana is CRITICAL: Connection refused by host [09:39:06] ts-array5 on damiana is CRITICAL: Connection refused by host [09:39:07] /tmp on damiana is CRITICAL: Connection refused by host [09:39:16] DiskSuite on damiana is CRITICAL: Connection refused by host [09:39:37] Load avg. 
on damiana is CRITICAL: Connection refused by host [09:39:56] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:39:56] SMTP on damiana is CRITICAL: Connection refused [09:39:56] SSH on damiana is CRITICAL: Connection refused [09:40:06] / on damiana is CRITICAL: Connection refused by host [09:43:45] still something wrong? [09:44:03] I keep receiving PHP Warning:  mysql_connect(): Lost connection to MySQL server at 'reading initial communication packet', system error: 110 [09:48:26] Load avg. on nightshade is OK: OK - load average: 13.15, 14.47, 14.95 [09:49:06] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:49:16] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 157737.000000 [09:49:26] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [09:49:46] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [09:49:56] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:50:36] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [09:50:55] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 160368 MB (3% inode=69%): [09:51:58] liangent: kinda [09:52:04] nosy said she was working on it [09:55:26] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [10:06:35] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [10:06:46] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 88930 MB (14% inode=99%): [10:07:06] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [10:07:16] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 148199.000000 [10:07:56] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [10:08:56] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:09:27] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [10:11:26] Load avg. on nightshade is WARNING: WARNING - load average: 18.46, 16.65, 15.60 [10:12:06] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [10:18:26] Load avg. 
on nightshade is CRITICAL: CRITICAL - load average: 31.98, 25.04, 19.74 [10:32:36] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [10:38:16] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [10:38:26] Environment IPMI on damiana is CRITICAL: Connection refused by host [10:38:26] Free Memory on damiana is CRITICAL: Connection refused by host [10:39:06] ts-array5 on damiana is CRITICAL: Connection refused by host [10:39:07] /tmp on damiana is CRITICAL: Connection refused by host [10:39:16] DiskSuite on damiana is CRITICAL: Connection refused by host [10:39:26] Load avg. on nightshade is WARNING: WARNING - load average: 14.50, 17.89, 19.79 [10:39:36] Load avg. on damiana is CRITICAL: Connection refused by host [10:39:56] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:39:56] SMTP on damiana is CRITICAL: Connection refused [10:39:56] SSH on damiana is CRITICAL: Connection refused [10:40:06] / on damiana is CRITICAL: Connection refused by host [10:49:06] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:49:16] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 145475.000000 [10:49:26] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [10:49:46] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [10:49:56] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[10:50:36] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [10:50:56] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 160993 MB (3% inode=69%): [10:55:26] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [11:06:36] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [11:06:45] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 88614 MB (14% inode=99%): [11:07:06] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [11:07:26] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 151722.000000 [11:07:56] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [11:08:56] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:09:26] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [11:12:06] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [11:32:36] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [11:38:16] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [11:38:26] Environment IPMI on damiana is CRITICAL: Connection refused by host [11:38:26] Free Memory on damiana is CRITICAL: Connection refused by host [11:39:06] ts-array5 on damiana is CRITICAL: Connection 
refused by host [11:39:06] /tmp on damiana is CRITICAL: Connection refused by host [11:39:16] DiskSuite on damiana is CRITICAL: Connection refused by host [11:39:25] Load avg. on nightshade is WARNING: WARNING - load average: 20.09, 17.62, 16.61 [11:39:36] Load avg. on damiana is CRITICAL: Connection refused by host [11:39:56] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:39:56] SMTP on damiana is CRITICAL: Connection refused [11:39:56] SSH on damiana is CRITICAL: Connection refused [11:40:06] / on damiana is CRITICAL: Connection refused by host [11:49:16] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 128859.000000 [11:49:26] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [11:49:45] MySQL slave on z-dat-s2-b is CRITICAL: (Return code of 139 is out of bounds) [11:49:57] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [11:50:07] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:50:36] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [11:50:56] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 161251 MB (3% inode=68%): [11:52:26] Load avg. on nightshade is OK: OK - load average: 12.11, 13.87, 14.91 [11:52:46] /sql on ptolemy is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:52:56] /mnt user-store on rosemary is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:53:06] CAM on hemlock is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:53:16] /sql on rosemary is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. 
[11:53:17] SSH on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:53:26] Environment IPMI on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:53:26] APT on yucca is UNKNOWN: CHECK_NRPE: Error receiving data from daemon. [11:55:26] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [11:58:45] /aux0 on hemlock is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:46] /tmp on cassia is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:46] / on ptolemy is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:47] SMF on ptolemy is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:47] Load avg. on rosemary is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:47] Load avg. on adenia is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:47] Environment IPMI on cassia is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:47] /home on hemlock is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:47] ethernet 0/1/16 [far1-n1-oe16-b-esams.mgmt] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [11:58:57] ethernet 0/1/3 [adenia] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. 
[11:58:57] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR opening session: No response from remote host fsw1-n1-oe16-esams.mgmt during discovery. [11:58:57] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR opening session: No response from remote host fsw1-n1-oe16-esams.mgmt during discovery. [11:58:57] /tmp on hemlock is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:57] /tmp on ptolemy is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:57] ethernet 0/1/17 [sage] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [11:58:57] ethernet 0/1/4 [ts-array5 controller A] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [11:58:58] FC 0/5 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR opening session: No response from remote host fsw1-n1-oe16-esams.mgmt during discovery. [11:58:58] FC 0/17 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR opening session: No response from remote host fsw1-n1-oe16-esams.mgmt during discovery. [11:58:59] / on thyme is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:58:59] Load avg. on cassia is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:59:00] /sql on z-dat-s5-b is UNKNOWN: CHECK_NRPE: Error receiving data from daemon. [11:59:00] / on z-dat-s5-b is UNKNOWN: CHECK_NRPE: Error receiving data from daemon. [11:59:06] / on hyacinth is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:59:06] / on rosemary is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [11:59:06] Environment on ptolemy is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. 
Check the remote server logs for error messages. [11:59:06] ethernet 0/1/18 [far2-n1-oe16-a-esams.mgmt] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [11:59:06] ethernet 0/1/5 [ts-array5 controller B] on asw-oe16-esams.mgmt is UNKNOWN: ERROR: Description table : No response from remote host asw-oe16-esams.mgmt. [11:59:06] FC 0/0 [SAN far1-n1-oe16-esams A1] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR opening session: No response from remote host fsw1-n1-oe16-esams.mgmt during discovery. [11:59:17] FC 0/19 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR opening session: No response from remote host fsw1-n1-oe16-esams.mgmt during discovery. [11:59:17] / on z-dat-s3-a is UNKNOWN: CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages. [12:03:36] ethernet 0/1/16 [far1-n1-oe16-b-esams.mgmt] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/16:UP:1 UP: OK [12:03:37] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is OK: FC port 0/15:UP:1 UP: OK [12:03:37] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is OK: FC port 0/3:UP:1 UP: OK [12:03:46] /aux0 on hemlock is OK: DISK OK - free space: / 5501 MB (27% inode=87%): [12:03:47] /tmp on cassia is OK: DISK OK - free space: /tmp 10431 MB (99% inode=99%): [12:03:47] / on ptolemy is OK: DISK OK - free space: / 9799 MB (49% inode=91%): [12:03:47] ethernet 0/1/3 [adenia] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/3:UP:1 UP: OK [12:03:47] FC 0/16 on fsw1-n1-oe16-esams.mgmt is OK: FC port 0/16:DOWN:1 UP: OK [12:03:47] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is OK: FC port 0/4:UP:1 UP: OK [12:03:48] Load avg. on rosemary is OK: OK - load average: 3.46, 3.40, 3.41 [12:03:48] SMF on ptolemy is OK: OK - all services online [12:03:49] Load avg. 
on adenia is OK: OK - load average: 1.11, 1.11, 1.10 [12:03:49] ethernet 0/1/4 [ts-array5 controller A] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/4:UP:1 UP: OK [12:03:50] ethernet 0/1/17 [sage] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/17:UP:1 UP: OK [12:03:50] FC 0/17 on fsw1-n1-oe16-esams.mgmt is OK: FC port 0/17:DOWN:1 UP: OK [12:03:51] FC 0/5 on fsw1-n1-oe16-esams.mgmt is OK: FC port 0/5:DOWN:1 UP: OK [12:03:51] /home on hemlock is OK: DISK OK - free space: /home 17258 MB (34% inode=85%): [12:05:25] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:05:25] SSH on damiana is CRITICAL: Connection refused [12:05:25] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:05:25] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [12:05:25] / on damiana is CRITICAL: Connection refused by host [12:05:26] ts-array5 on damiana is CRITICAL: Connection refused by host [12:05:26] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicati [12:05:33] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 154976.000000 [12:05:33] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:08:13] MySQL on ha-sql.esi is OK: Uptime: 20 Threads: 1 Questions: 7 Slow queries: 0 Opens: 15 Flush tables: 1 Open tables: 8 Queries per second avg: 0.350 [12:08:25] legoktm@nightshade:~$ mysql -h sql-s7-rr metawiki_p [12:08:25] ERROR 2003 (HY000): Can't connect to MySQL server on 'sql-s7-rr' (111) [12:11:36] SSH on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds 
[12:13:46] NTP on turnera is WARNING: NTP WARNING: Server has the LI_ALARM bit set, Offset -0.00343 secs [12:23:46] NTP on turnera is OK: NTP OK: Offset -0.010739 secs [12:39:56] s5 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2168.000000 [12:39:56] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2161.000000 [12:40:04] DaB. * [Toolserver-announce] Reboot of the linux-userland-server this evening. [12:40:06] s4 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2168.000000 [12:40:49] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2211.000000 [12:40:49] s1 replag on thyme is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2215.000000 [12:45:16] SSH on damiana is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [12:46:46] NTP on damiana is OK: NTP OK: Offset -0.003367 secs [12:48:26] / on damiana is OK: DISK OK - free space: / 43821 MB (60% inode=95%): [12:48:26] ts-array5 on damiana is OK: 2/2 paths are active [12:48:27] /tmp on damiana is OK: DISK OK - free space: /tmp 11802 MB (99% inode=99%): [12:48:36] DiskSuite on damiana is OK: OK - No disk failures detected [12:48:46] Environment IPMI on damiana is OK: ok: temperature ok fan ok voltage ok chassis ok [12:48:46] Free Memory on damiana is OK: OK - 54.1% (4536720 kB) free. [12:48:56] Load avg. on damiana is OK: OK - load average: 0.54, 0.52, 0.50 [12:49:57] SMTP on damiana is OK: SMTP OK - 0.149 sec. 
response time [12:51:40] /sql on rosemary is CRITICAL: DISK CRITICAL - free space: /sql 58279 MB (5% inode=99%): [13:03:56] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3608.000000 [13:03:57] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3601.000000 [13:04:06] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 3609.000000 [13:04:36] /sql on rosemary is WARNING: DISK WARNING - free space: /sql 72420 MB (7% inode=99%): [13:04:36] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 125144.000000 [13:04:46] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [13:04:46] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [13:04:56] s5 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2586.000000 [13:04:56] Load avg. 
on nightshade is CRITICAL: CRITICAL - load average: 60.00, 53.29, 65.05 [13:04:57] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [13:04:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [13:05:06] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [13:05:06] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [13:05:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 8971 [13:05:17] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 88081 MB (14% inode=99%): [13:05:26] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [13:05:27] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [13:05:27] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 161736 MB (3% inode=68%): [13:05:35] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [13:05:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [13:05:56] s5 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1585.000000 [13:05:56] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3556.000000 [13:06:06] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 158617.000000 [13:06:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[13:06:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:06:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:10:06] MySQL slave on z-dat-s3-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2157 [13:10:26] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3421 [13:10:57] MySQL slave on thyme is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3173 [13:10:57] MySQL slave on daphne is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2009 [13:11:06] MySQL slave on z-dat-s3-a is OK: Uptime: 1817307 Threads: 23 Questions: 1703927702 Slow queries: 67411 Opens: 16151782 Flush tables: 1 Open tables: 16384 Queries per second avg: 937.611 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1767 [13:11:36] SSH on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:11:46] s4 replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1743.000000 [13:11:56] MySQL slave on daphne is OK: Uptime: 5411802 Threads: 23 Questions: 1255875213 Slow queries: 113303 Opens: 251438 Flush tables: 1 Open tables: 1492 Queries per second avg: 232.62 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1671 [13:20:46] s1 replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 1711.000000 [13:20:56] MySQL slave on thyme is OK: Uptime: 393724 Threads: 16 Questions: 276627403 Slow queries: 35975 Opens: 2716 Flush tables: 1 Open tables: 444 Queries per second avg: 702.592 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1699 [13:24:56] s1 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1799.000000 [13:25:26] MySQL slave on rosemary is OK: Uptime: 407280 Threads: 15 Questions: 341308372 Slow queries: 267490 Opens: 23594 Flush tables: 1 Open tables: 3093 Queries per second avg: 838.18 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1743 [13:32:55] 
hi, i have problems with commonshelper [13:33:28] db-cluster: sql-s5-rr InfoINFO MNT-1298 Slow throu commons-import [13:33:47] db-cluster: sql-s5-user InfoINFO MNT-1298 Slow throu commons-import [13:43:57] MySQL on z-dat-s5-b is CRITICAL: Cant connect to MySQL server on z-dat-s5-b (146) [13:45:27] SSH on yucca is OK: SSH OK - OpenSSH_5.5p1 Debian-6+squeeze3 (protocol 2.0) [13:47:51] Steinsplitter: ts has been up and down very randomly for the past few days [13:48:28] ah, okay. Thank you [14:01:56] MySQL on z-dat-s5-b is OK: Uptime: 1260 Threads: 1 Questions: 28331 Slow queries: 0 Opens: 82 Flush tables: 1 Open tables: 72 Queries per second avg: 22.484 [14:04:06] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7208.000000 [14:04:35] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 124968.000000 [14:04:45] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [14:04:46] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [14:04:57] Load avg. 
on nightshade is CRITICAL: CRITICAL - load average: 73.65, 115.60, 138.02 [14:04:57] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [14:04:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [14:05:06] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [14:05:17] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 161350 [14:05:17] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 9604 [14:05:17] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 87804 MB (14% inode=99%): [14:05:26] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [14:05:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 162240 MB (3% inode=68%): [14:05:26] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [14:05:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [14:06:06] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 161293.000000 [14:06:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[14:06:17] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:06:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:20:06] s4 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3502.000000 [14:27:06] s4 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1267.000000 [14:35:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [15:04:36] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 127029.000000 [15:04:46] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [15:04:47] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [15:04:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [15:04:57] Load avg. 
on nightshade is CRITICAL: CRITICAL - load average: 44.33, 46.48, 54.35 [15:04:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [15:05:06] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [15:05:16] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 145279 [15:05:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 12059 [15:05:16] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 87532 MB (14% inode=99%): [15:05:26] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [15:05:27] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 162530 MB (3% inode=68%): [15:05:27] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [15:05:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [15:06:06] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 145309.000000 [15:06:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[15:06:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:06:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:35:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [16:04:46] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [16:04:46] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [16:04:56] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 31.20, 34.08, 35.23 [16:04:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [16:04:56] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [16:05:06] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [16:05:16] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 121391 [16:05:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 14454 [16:05:16] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 87278 MB (14% inode=99%): [16:05:27] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [16:05:27] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [16:05:35] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 
162728 MB (3% inode=68%): [16:05:36] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 129690.000000 [16:05:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [16:06:00] [[Special:Log/newusers]] create * Olivier Bommel * (New user account) [16:06:06] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 120630.000000 [16:06:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:06:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:06:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:35:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [16:54:56] Load avg. on nightshade is WARNING: WARNING - load average: 15.24, 14.35, 19.87 [17:04:46] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [17:04:46] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [17:04:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [17:04:56] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [17:05:06] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [17:05:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 8487 [17:05:16] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 105854 [17:05:16] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 86987 MB (14% inode=99%): [17:05:26] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default 
[17:05:26] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [17:05:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [17:06:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 133351.000000 [17:06:06] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 105404.000000 [17:06:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:06:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:06:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:06:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 163376 MB (3% inode=68%): [17:08:56] wikidata replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2096.000000 [17:13:56] Load avg. 
on nightshade is OK: OK - load average: 10.31, 11.48, 14.63 [17:28:16] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3174 [17:31:16] MySQL slave on z-dat-s2-b is OK: Uptime: 7389 Threads: 15 Questions: 6910304 Slow queries: 259 Opens: 202626 Flush tables: 1 Open tables: 256 Queries per second avg: 935.215 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1384 [17:35:17] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [18:04:47] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [18:04:47] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [18:04:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [18:04:56] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [18:05:06] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [18:05:16] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 82637 [18:05:16] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 86707 MB (14% inode=99%): [18:05:26] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [18:05:36] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [18:05:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). 
[18:06:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 135464.000000 [18:06:07] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 82579.000000 [18:06:15] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:06:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:06:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:06:27] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 163745 MB (3% inode=68%): [18:08:56] wikidata replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 1842.000000 [18:09:57] Load avg. on nightshade is WARNING: WARNING - load average: 18.33, 17.32, 14.44 [18:10:56] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 32.20, 21.58, 16.09 [18:13:56] Load avg. on nightshade is WARNING: WARNING - load average: 22.50, 23.96, 18.16 [18:35:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [18:40:57] Load avg. on nightshade is WARNING: WARNING - load average: 15.45, 15.82, 17.01 [18:44:56] Load avg. 
on nightshade is OK: OK - load average: 7.72, 11.45, 14.96 [19:04:46] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [19:04:47] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [19:04:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [19:04:56] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [19:05:05] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [19:05:17] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 50591 [19:05:17] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 86279 MB (14% inode=99%): [19:05:27] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [19:05:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [19:06:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 135003.000000 [19:06:16] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 49430.000000 [19:06:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[19:06:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:06:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:06:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 164458 MB (3% inode=68%): [19:06:26] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [19:08:57] wikidata replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3233.000000 [19:35:30] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [19:49:23] wikidata replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1736.000000 [19:52:13] hello all [19:59:44] If you have any files open on nightshade you should close them NOW! [20:04:47] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [20:04:47] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [20:04:57] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [20:04:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [20:05:06] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [20:05:16] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 12414 [20:05:26] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [20:05:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). 
[20:06:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 130830.000000 [20:06:16] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 12269.000000 [20:06:17] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 85999 MB (14% inode=99%): [20:06:17] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:06:17] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:06:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:06:36] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 164945 MB (3% inode=68%): [20:06:36] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [20:08:48] nightshade is back [20:09:54] thanks :) [20:15:06] NTP on nightshade is WARNING: NTP WARNING: Server has the LI_ALARM bit set, Offset 0.048174 secs [20:15:46] APT on nightshade is WARNING: APT WARNING: 4 packages available for upgrade (0 critical updates). warnings detected. run with -v for information. [20:20:49] If you have any files open on yarrow you should close them NOW! 
[20:21:16] MySQL slave on z-dat-s5-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2830 [20:21:16] wikidata replag on z-dat-s5-b is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2830.000000 [20:23:16] MySQL slave on z-dat-s5-b is OK: Uptime: 24136 Threads: 4 Questions: 338201013 Slow queries: 293 Opens: 3459 Flush tables: 1 Open tables: 256 Queries per second avg: 14012.305 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 947 [20:23:17] wikidata replag on z-dat-s5-b is OK: QUERY OK: SELECT ts_rc_age() returned 947.000000 [20:24:06] NTP on nightshade is OK: NTP OK: Offset 0.115522 secs [20:27:50] ok, yarrow is back too. Maintenance done [20:31:32] [[Operators]] https://wiki.toolserver.org/w/index.php?diff=7836&oldid=7831&rcid=21623 * Dab * (+25) (/* General rights */ +services) [20:32:37] NTP on yarrow is WARNING: NTP WARNING: Server has the LI_ALARM bit set, Offset 0.016341 secs [20:37:36] aliasd on mayapple is CRITICAL: Connection refused [20:42:37] NTP on yarrow is OK: NTP OK: Offset 0.042458 secs [20:47:07] [[Operators]] https://wiki.toolserver.org/w/index.php?diff=7837&oldid=7836&rcid=21624 * Dab * (+26) (/* General rights */ +SGE (qdel and co)) [20:50:55] DaBPunkt: hi [20:51:03] hello [20:51:12] DaBPunkt: what needs to be done to take down amaranth? [20:51:26] can any root shut it down? or it needs someone special? [20:51:35] any root can [20:51:50] root → ts-root [20:51:56] right [20:52:16] ok [20:52:34] idk when someone will be back in the datacenter again. i guess monday [20:52:54] how's nosy? [20:53:08] I will be away at Monday. So Nosy have to do it [20:53:22] ok [20:53:55] how is she though? everyone back home? [20:54:01] and how about your family? [20:54:17] Didn't spoke with her yet [20:54:34] hrmmm, and what about other roots? is silke a root? 
[20:54:52] no [20:54:54] ok [20:55:09] * jeremyb_ has to run away [20:55:10] bbl [20:55:15] bye jeremyb_ [21:02:20] * valhallasw hands DaBPunkt a beer for his hard work on saturday. Although I seem to remember something about a ts-root not liking beer... [21:04:46] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [21:04:47] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [21:04:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [21:05:06] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [21:05:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [21:05:26] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [21:05:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [21:05:56] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [21:06:06] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 125639.000000 [21:06:16] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 85486 MB (14% inode=99%): [21:06:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[21:06:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:06:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:06:36] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 167548 MB (3% inode=68%): [21:06:36] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [21:37:36] aliasd on mayapple is CRITICAL: Connection refused [22:04:47] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [22:04:47] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [22:04:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [22:05:06] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [22:05:14] @replag [22:05:15] DaBPunkt: s1-rr-a-wd: 39s [-0.00 s/s]; s1-user-wd: 39s [-0.00 s/s]; s2-user-c: error; s2-user-wd: 1d 7h 33m 55s [-1.67 s/s]; s3-user: 16s [-0.02 s/s]; s3-user-wd: 39s [-0.00 s/s]; s4-rr-a: 11s [-0.00 s/s]; s4-user: 11s [-0.00 s/s] [22:05:16] DaBPunkt: s4-user-wd: 18m 44s [-0.07 s/s]; s5-user-c: error; s6-user-wd: 39s [-0.00 s/s]; s7-user-wd: 39s [-0.00 s/s] [22:05:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [22:05:26] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [22:05:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). 
[22:05:56] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [22:06:05] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 113315.000000 [22:06:17] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 85253 MB (14% inode=99%): [22:06:17] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:06:17] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:06:26] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:06:36] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 171060 MB (3% inode=68%): [22:06:36] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [22:10:24] DaBPunkt: Could we address some of the Nagios/tsnag issues at some time? Some tests seem to be broken (for example SGE execd on Solaris servers), others aren't clear: Does "DISK WARNING - free space" mean we need more disk space, or is just the threshold for alarm too low? [22:13:09] scfc_de: sure. I will reservate a termin in 2014 ;-) [22:25:51] We don't have to wait that long :-). I asked Coren recently if he would volunteer as another TS root, and when he is finished with the initial Labs setup and has some more time, he is willing to if you are okay with that. So in a few weeks, we should be able to increase the admin manpower by a reasonable factor. 
[22:36:39] scfc_de, I thought about what you said that you saw me as a ts root [22:37:36] aliasd on mayapple is CRITICAL: Connection refused [22:37:48] I would be willing to help there [22:38:10] but it's more likely that I wouldn't fit DaBPunkt criteria for return-value of the inversion :) [22:38:56] Platonides: inve*st*ion :-). [22:39:47] oops [22:40:49] tricked by the name in Spanish :) [22:41:41] Silke will probably be online again on Monday, and then we could see what steps need to be taken. BTW, you're based in (mainland) Spain? [22:42:11] maybe I overstate the requirements a little bit. [22:42:49] scfc_de, yes [22:43:34] But I work on a config for Operators anyway. It could be a prepare-step for other roots too [22:44:30] DaBPunkt: I saw that on toolserver-puppet. It looks promising. But of course it would be nice if there are more people around with the "general key" to all "rooms" :-). [22:49:03] sure. But I prefer small steps [22:50:04] Is the config ready to promote Platonides to such an operator yet? [22:51:09] in a few days [22:51:22] What needs to be done? [22:51:34] the sudo-config [22:51:56] and a global alias for rm [22:52:06] Free Memory on turnera is CRITICAL: CRITICAL - 1.5% (127176 kB) free! [22:52:11] and a mail to the users telling about the alias [22:52:34] and define the RULES [22:52:49] a global alias for rm? is that a sudoable rm ? [22:52:53] looks dangrous [22:52:57] but if Platonides is willing to play the genuia pig that is VERY welcoming [22:52:57] *dangerous [22:53:45] Platonides: it should be something like alias rm=rm -I [22:54:21] should be harmless [22:54:45] toolserver.org no longer answer [22:54:46] assuming it doesn't affect people scripts [22:55:04] I don't remember if alias run when called through ssh [22:55:30] phe: let me look [22:56:11] I mean a command like "ssh host rm /my/file" [22:56:28] /home is still there and the webservers are working. I will check the loadbalancer [22:56:57] Hmmm. 
It's already set for me, and I don't have it in my ~/.bashrc? No trace in /etc/bash.bashrc and /etc/profile.d/. Hmmm^2. [22:57:31] (BTW, what is the connection between this alias and the operators group?) [22:59:15] NFS server ha-nfs.esi not responding still trying [23:07:38] nfs again not working [23:07:54] tNFS server ha-nfs.esi not responding still trying [23:08:35] Dab already took his hammer and he is repairing it [23:09:21] awesome [23:13:25] that's great. I get output of damiana's console, but can not enter commands [23:15:39] sent a hard-reset [23:16:50] not sure if anybody mentioned it, but these nfs problems started to appear after last reboot few days ago, in case it helps [23:17:34] Danny_B|webgate: it is a kernel-memory-bug in solaris [23:17:56] great, now turnera ALSO rebooted… [23:17:58] so obviously yes [23:27:14] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:27:14] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [23:27:15] aliasd on mayapple is CRITICAL: Connection refused [23:27:15] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_SAS_PORT_DEGRADED.description:S27:Tray.85.Controller.B.Port.2:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.de [23:27:24] SMTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:32:12] it SHOULD work again [23:32:23] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:32:24] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [23:32:24] aliasd on mayapple is CRITICAL: Connection refused [23:32:24] CAM on hemlock is CRITICAL: CRITICAL - 
Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [23:37:14] wikidata replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2314.000000 [23:37:33] Sun Grid Engine execd on nightshade is UNKNOWN: Error with qhost: error: commlib error: got select error (Connection refused) [23:37:53] wikidata replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3598.000000 [23:37:54] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 6.77, 15.26, 37.75 [23:37:54] wikidata replag on thyme is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2363.000000 [23:38:03] s4 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2339.000000 [23:38:04] wikidata replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2300.000000 [23:38:14] Sun Grid Engine execd on yarrow is UNKNOWN: Error with qhost: error: commlib error: got select error (Connection refused) [23:38:33] wikidata replag on z-dat-s6-a is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2129.000000 [23:38:33] wikidata replag on z-dat-s7-a is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2128.000000 [23:39:15] @replag [23:39:15] DaBPunkt: s1-rr-a: 23m 44s [-]; s1-rr-a-wd: 30m 14s [-]; s1-user: 23m 45s [-]; s1-user-c: 23m 45s [-]; s1-user-wd: 30m 14s [-]; s2-rr: 18m 41s [-]; s2-user: 18m 41s [-]; s2-user-c: error [23:39:16] DaBPunkt: s2-user-wd: 1d 4h 23m 25s [-]; s3-user: 25m 56s [-]; s3-user-wd: 30m 13s [-]; s4-rr-a: 23m 45s [-]; s4-user: 23m 45s [-]; s4-user-wd: 56m 8s [-]; s5-rr-a: 23m 50s [-]; s5-user: 22m 34s [-] [23:39:17] DaBPunkt: s5-user-c: error; s6-user: 23m 49s [-]; s6-user-wd: 30m 12s [-]; s7-user: 23m 44s [-]; s7-user-wd: 30m 13s [-] [23:39:28] great… [23:39:33] wikidata replag on z-dat-s6-a is OK: 
QUERY OK: SELECT ts_rc_age() returned 1789.000000 [23:39:34] wikidata replag on z-dat-s7-a is OK: QUERY OK: SELECT ts_rc_age() returned 1789.000000 [23:39:52] wikidata replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 1622.000000 [23:40:03] wikidata replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1599.000000 [23:40:52] seems to be working again :) [23:41:13] wikidata replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1610.000000 [23:42:08] good night [23:45:53] wikidata replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1716.000000 [23:46:13] Sun Grid Engine execd on yarrow is CRITICAL: CRITICAL: execd not communicating [23:47:14] Sun Grid Engine execd on yarrow is UNKNOWN: Error with qhost: error: commlib error: got select error (Connection refused) [23:49:52] Load avg. on nightshade is WARNING: WARNING - load average: 4.14, 5.55, 20.00 [23:55:34] again [23:57:23] Load avg. on nightshade is OK: OK - load average: 7.40, 6.24, 14.64