[00:16:17] legoktm: i pinged nosy. She was online at 19:00 and fixed it. also replication was started again [00:16:17] @replag [00:16:17] Merlissimo: s1-rr-a-wd: 39s [-]; s1-user-c: 1h 46m 34s [-]; s1-user-wd: 38s [-]; s2-rr: 3m 0s [-]; s2-user: 3m 0s [-]; s2-user-c: error; s2-user-wd: 2d 4h 24m 17s [-]; s3-user: 23s [-] [00:16:17] Merlissimo: s3-user-wd: 36s [-]; s4-user-wd: 36s [-]; s5-user: 3d 4h 9m 50s [-]; s5-user-c: error; s6-user-wd: 40s [-]; s7-user-wd: 34s [-] [00:16:17] awesome thanks :) [00:16:18] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [00:16:18] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). warnings detected. run with -v for information. [00:16:18] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 151055 MB (2% inode=69%): [00:16:18] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [00:16:19] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [00:16:19] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [00:16:19] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:16:19] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [00:16:19] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [00:16:19] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [00:16:19] s4 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3495.000000 [00:16:19] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates).
[00:16:19] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [00:16:19] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 95791 MB (15% inode=99%): [00:16:19] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [00:16:19] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [00:16:19] Merlissimo: The job at 23:30Z gave me an error message from my script. It looks as if the cgdelete error "overwrote" the script's stderr. So your patch didn't cause errors, it just hid mine :-). [00:16:19] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 189059.000000 [00:16:20] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [00:16:20] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [00:16:20] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [00:16:20] s4 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1756.000000 [00:16:20] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 273513 [00:16:21] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 273505.000000 [00:16:25] /sql on rosemary is WARNING: DISK WARNING - free space: /sql 68537 MB (7% inode=99%): [00:29:15] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [00:31:04] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). warnings detected. run with -v for information.
[00:31:14] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 150733 MB (2% inode=69%): [00:31:34] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [00:32:15] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [00:32:15] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [00:32:33] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:32:57] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [00:32:57] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [00:32:57] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [00:32:57] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [00:33:06] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates).
[00:33:06] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [00:33:15] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 189490.000000 [00:33:57] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [00:33:57] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 95512 MB (15% inode=99%): [00:35:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [00:35:57] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [00:36:57] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [00:38:55] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 272264 [00:39:16] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 272268.000000 [01:28:56] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2039 [01:29:15] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [01:31:05] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). warnings detected. run with -v for information. [01:31:15] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 150734 MB (2% inode=69%): [01:32:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [01:32:16] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). 
[01:32:33] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [01:32:57] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [01:32:57] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [01:32:57] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [01:32:58] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [01:33:03] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [01:33:04] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [01:33:16] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 191627.000000 [01:33:58] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [01:33:58] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 95344 MB (15% inode=99%): [01:35:58] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [01:35:58] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [01:36:58] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). 
[01:38:57] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 269300 [01:39:16] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 269279.000000 [01:55:57] MySQL slave on z-dat-s2-b is OK: Uptime: 46640 Threads: 11 Questions: 52753279 Slow queries: 1378 Opens: 514992 Flush tables: 1 Open tables: 256 Queries per second avg: 1131.73 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1541 [02:01:33] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [02:29:16] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [02:32:04] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). warnings detected. run with -v for information. [02:32:15] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 150678 MB (2% inode=69%): [02:32:42] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [02:32:57] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [02:32:57] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [02:32:57] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [02:32:57] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [02:33:04] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). 
[02:33:04] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (3 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.CommunicationLost.desc: [02:33:17] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [02:33:17] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [02:33:58] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [02:33:59] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 95513 MB (15% inode=99%): [02:34:15] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 189934.000000 [02:35:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [02:36:56] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [02:37:57] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [02:38:57] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 258713 [02:40:15] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 258537.000000 [03:01:33] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [03:29:16] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [03:32:03] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). warnings detected. run with -v for information. 
[03:32:15] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 150512 MB (2% inode=69%): [03:32:43] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [03:32:58] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [03:32:58] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [03:32:58] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [03:32:58] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [03:33:05] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [03:33:05] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [03:33:15] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [03:33:15] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [03:34:14] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 190512.000000 [03:34:58] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [03:34:58] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 94931 MB (15% inode=99%): [03:35:58] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [03:36:57] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [03:37:56] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). 
[03:38:58] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 250896 [03:40:15] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 250824.000000 [04:01:34] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [04:29:14] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [04:32:15] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 150121 MB (2% inode=69%): [04:32:24] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). warnings detected. run with -v for information. [04:32:58] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [04:32:58] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [04:32:58] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [04:32:58] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [04:32:58] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [04:33:16] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [04:33:16] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). 
[04:33:24] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [04:33:24] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [04:34:15] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 191270.000000 [04:34:58] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 94839 MB (15% inode=99%): [04:34:58] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [04:36:58] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [04:36:58] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [04:37:57] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [04:38:58] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 239232 [04:40:15] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 239089.000000 [05:01:34] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [05:29:25] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [05:32:25] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). warnings detected. run with -v for information. [05:33:14] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 149658 MB (2% inode=69%): [05:33:25] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [05:33:25] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). 
[05:33:25] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [05:33:25] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [05:33:58] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [05:33:59] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [05:34:00] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [05:34:00] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:34:00] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [05:34:25] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 186411.000000 [05:35:57] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [05:35:57] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 94749 MB (15% inode=99%): [05:37:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [05:37:57] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [05:37:57] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). 
[05:38:58] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 235044 [05:40:24] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 234993.000000 [06:01:34] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [06:29:25] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [06:32:24] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [06:33:14] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 148021 MB (2% inode=69%): [06:33:25] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [06:33:25] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [06:33:25] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [06:33:25] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [06:34:04] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [06:34:04] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [06:34:25] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 183500.000000 [06:34:56] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [06:34:57] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). 
[06:34:57] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [06:35:57] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [06:35:57] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 94598 MB (15% inode=99%): [06:37:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [06:37:57] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [06:37:57] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [06:39:56] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 231548 [06:40:25] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 231575.000000 [07:01:34] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [07:04:25] Free Memory on turnera is WARNING: WARNING - 5.1% (424192 kB) free! [07:05:25] Free Memory on turnera is CRITICAL: CRITICAL - 4.8% (400708 kB) free! [07:06:24] Free Memory on turnera is WARNING: WARNING - 5.4% (449224 kB) free! [07:29:25] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [07:32:24] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [07:33:15] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 148012 MB (2% inode=69%): [07:34:03] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [07:34:04] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[07:34:25] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [07:34:26] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 177834.000000 [07:34:26] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [07:34:26] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [07:34:26] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [07:34:56] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [07:35:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [07:35:56] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 94445 MB (15% inode=99%): [07:35:56] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [07:35:57] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [07:37:24] Free Memory on turnera is CRITICAL: CRITICAL - 2.2% (184296 kB) free! [07:38:24] Free Memory on turnera is OK: OK - 10.0% (836968 kB) free. [07:38:56] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [07:38:56] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [07:38:56] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). 
[07:39:57] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 233231 [07:41:24] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 233144.000000 [07:47:24] Free Memory on turnera is WARNING: WARNING - 6.3% (526684 kB) free! [07:48:24] Free Memory on turnera is OK: OK - 7.1% (591060 kB) free. [08:01:34] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [08:25:57] Load avg. on nightshade is WARNING: WARNING - load average: 29.20, 22.46, 14.32 [08:30:27] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [08:32:35] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [08:33:14] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 150992 MB (2% inode=69%): [08:34:03] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [08:34:03] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:34:35] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). 
[08:34:57] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [08:35:23] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [08:35:24] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 175864.000000 [08:35:24] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [08:35:34] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [08:35:57] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [08:35:57] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 94243 MB (15% inode=99%): [08:35:57] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [08:35:57] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [08:38:56] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [08:38:57] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [08:38:57] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [08:40:57] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 230129 [08:42:23] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 230047.000000
[08:45:56] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 30.95, 25.25, 20.00 [08:53:25] Free Memory on turnera is CRITICAL: CRITICAL - 2.2% (184144 kB) free! [09:01:34] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [09:30:23] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [09:33:14] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 151164 MB (2% inode=69%): [09:33:33] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [09:34:04] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [09:34:04] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:35:24] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [09:35:25] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 163518.000000 [09:35:25] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [09:35:33] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [09:35:56] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [09:35:56] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [09:35:56] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 94005 MB (15% inode=99%): [09:36:34] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates).
[09:36:56] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [09:36:56] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [09:38:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [09:38:57] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [09:38:57] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [09:41:56] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 227189 [09:42:24] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 227173.000000 [09:45:57] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 22.42, 24.14, 24.68 [09:53:24] Free Memory on turnera is CRITICAL: CRITICAL - 2.6% (219380 kB) free! [10:27:56] Load avg. on nightshade is WARNING: WARNING - load average: 12.54, 16.59, 19.78 [10:28:24] Environment IPMI on damiana is OK: ok: temperature ok fan ok voltage ok chassis ok [10:31:33] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [10:33:34] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [10:34:03] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [10:34:03] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [10:34:13] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 149964 MB (2% inode=69%):
[10:34:57] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 29.07, 21.16, 20.19 [10:35:24] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [10:35:38] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [10:35:57] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [10:35:57] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [10:35:57] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 93746 MB (15% inode=99%): [10:36:23] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 158796.000000 [10:36:24] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [10:36:33] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [10:36:56] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [10:36:56] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [10:39:56] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [10:39:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [10:39:57] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [10:41:56] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 228149 [10:42:44] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 228154.000000
[10:44:56] Load avg. on nightshade is WARNING: WARNING - load average: 15.92, 18.87, 19.86 [10:53:24] Free Memory on turnera is CRITICAL: CRITICAL - 3.0% (254980 kB) free! [11:04:56] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2037 [11:09:56] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 29.01, 23.85, 20.04 [11:10:56] Load avg. on nightshade is WARNING: WARNING - load average: 21.32, 22.64, 19.87 [11:11:57] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 22.12, 22.86, 20.13 [11:19:56] MySQL slave on z-dat-s2-b is OK: Uptime: 80480 Threads: 7 Questions: 108387819 Slow queries: 1868 Opens: 876331 Flush tables: 1 Open tables: 256 Queries per second avg: 1346.767 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1726 [11:28:56] MySQL slave on z-dat-s2-b is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2104 [11:31:34] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [11:31:57] Load avg. on nightshade is WARNING: WARNING - load average: 26.98, 19.14, 17.22 [11:33:33] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [11:33:56] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 32.45, 23.25, 18.91 [11:34:03] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [11:34:14] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds.
[11:34:14] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 148297 MB (2% inode=69%): [11:35:24] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [11:35:37] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [11:35:57] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [11:35:57] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [11:35:57] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 93454 MB (15% inode=99%): [11:36:23] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 159015.000000 [11:36:24] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [11:36:34] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [11:36:57] Load avg. on nightshade is WARNING: WARNING - load average: 24.01, 23.77, 19.87 [11:36:57] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [11:36:57] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [11:38:57] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 33.67, 26.31, 21.19 [11:40:57] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [11:40:57] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [11:40:57] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). 
[11:41:57] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 226783 [11:42:44] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 226737.000000 [11:52:24] /sql on rosemary is WARNING: DISK WARNING - free space: /sql 67245 MB (6% inode=99%): [11:53:23] Free Memory on turnera is CRITICAL: CRITICAL - 1.8% (152972 kB) free! [11:55:57] MySQL slave on z-dat-s2-b is CRITICAL: Cant connect to MySQL server on z-dat-s2-b (146) [12:01:57] MySQL on z-dat-s2-b is CRITICAL: Cant connect to MySQL server on z-dat-s2-b (146) [12:27:43] It looks like the toolserver webpages are unreachable. E.g. http://toolserver.org/~darkdadaah [12:31:35] Ssh also asks for a password. [12:41:15] Federico Leva (Nemo) * Re: [Toolserver-l] Possible LDAP outtime this morning, major disruption [12:42:33] Cf mail... [12:52:36] PING on willow is OK: PING OK - Packet loss = 0%, RTA = 0.23 ms [12:53:26] PING on asw-oe16-esams.mgmt is OK: PING OK - Packet loss = 0%, RTA = 0.46 ms [12:53:46] PING on fsw2-n1-oe16-esams.mgmt is OK: PING OK - Packet loss = 0%, RTA = 0.32 ms [12:54:16] PING on nightshade is OK: PING OK - Packet loss = 0%, RTA = 3.26 ms [12:54:26] PING on adenia is OK: PING OK - Packet loss = 0%, RTA = 0.24 ms [12:55:15] PING on amaranth is OK: PING OK - Packet loss = 0%, RTA = 117.00 ms [12:55:48] DiskSuite on damiana is CRITICAL: (Return code of 137 is out of bounds) [12:55:48] /tmp on nightshade is CRITICAL: (Return code of 137 is out of bounds) [12:55:48] aliasd on nightshade is CRITICAL: (Return code of 137 is out of bounds) [12:55:48] / on yucca is CRITICAL: (Return code of 137 is out of bounds) [12:55:58] / on cassia is CRITICAL: (Return code of 137 is out of bounds) [12:55:58] Environment IPMI on damiana is CRITICAL: (Return code of 137 is out of bounds) [12:55:58] APT on nightshade is CRITICAL: (Return code of 137 is out of bounds) [12:55:58] /tmp on yucca is CRITICAL: (Return code of 137 is 
out of bounds) [12:55:58] /tmp on adenia is CRITICAL: (Return code of 137 is out of bounds) [12:55:59] /sql on cassia is CRITICAL: (Return code of 137 is out of bounds) [12:55:59] RAID on cassia is CRITICAL: (Return code of 137 is out of bounds) [12:56:00] Free Memory on damiana is CRITICAL: (Return code of 137 is out of bounds) [12:56:00] Environment IPMI on nightshade is CRITICAL: (Return code of 137 is out of bounds) [12:56:05] uff [12:56:08] /tmp on rosemary is CRITICAL: (Return code of 137 is out of bounds) [12:56:08] PING on cassia is CRITICAL: CRITICAL - Plugin timed out after 10 seconds [12:56:08] APT on yucca is CRITICAL: (Return code of 137 is out of bounds) [12:56:08] Environment IPMI on adenia is CRITICAL: (Return code of 137 is out of bounds) [12:56:08] /sql/data/dewiki on cassia is CRITICAL: (Return code of 137 is out of bounds) [12:56:09] SMTP on cassia is CRITICAL: (Return code of 137 is out of bounds) [12:56:09] Load avg. on damiana is CRITICAL: (Return code of 137 is out of bounds) [12:56:10] Load avg. on nightshade is CRITICAL: (Return code of 137 is out of bounds) [12:56:10] Environment IPMI on rosemary is CRITICAL: (Return code of 137 is out of bounds) [12:56:11] s1 replag on rosemary is CRITICAL: (Return code of 137 is out of bounds) [12:56:11] Environment IPMI on yucca is CRITICAL: (Return code of 137 is out of bounds) [12:56:18] /tmp on cassia is CRITICAL: (Return code of 137 is out of bounds) [12:56:18] FC 0/15 [fsw2-n1-oe16-esams] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR opening session: SNMPv3 support is unavailable (Required module Digest/HMAC.pm not found). [12:56:18] FC 0/3 [cassia] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR opening session: SNMPv3 support is unavailable (Required module Digest/HMAC.pm not found). 
[12:56:18] RAID on hyacinth is CRITICAL: (Return code of 137 is out of bounds) [12:56:18] RAID on ptolemy is CRITICAL: (Return code of 137 is out of bounds) [12:56:19] s4 replag on rosemary is CRITICAL: (Return code of 137 is out of bounds) [12:56:19] Load avg. on yucca is CRITICAL: (Return code of 137 is out of bounds) [12:56:20] Load avg. on adenia is CRITICAL: (Return code of 137 is out of bounds) [12:56:20] Environment IPMI on cassia is CRITICAL: (Return code of 137 is out of bounds) [12:56:21] s4 replag on cassia is CRITICAL: (Return code of 137 is out of bounds) [12:56:21] FC 0/16 on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR opening session: SNMPv3 support is unavailable (Required module Digest/HMAC.pm not found). [12:56:22] FC 0/4 [hyacinth] on fsw1-n1-oe16-esams.mgmt is UNKNOWN: ERROR opening session: SNMPv3 support is unavailable (Required module Digest/HMAC.pm not found). [12:56:33] SMTP on ptolemy is CRITICAL: (Return code of 137 is out of bounds) [12:56:33] MySQL on rosemary is CRITICAL: (Return code of 137 is out of bounds) [12:56:34] MySQL slave on thyme is CRITICAL: (Return code of 137 is out of bounds) [12:56:34] DiskSuite on turnera is CRITICAL: (Return code of 137 is out of bounds) [12:56:35] Environment IPMI on wolfsbane is CRITICAL: (Return code of 137 is out of bounds) [12:56:35] /tmp on yarrow is CRITICAL: (Return code of 137 is out of bounds) [12:56:36] Sun Grid Engine execd on yarrow is CRITICAL: (Return code of 137 is out of bounds) [12:56:36] PING on nightshade is CRITICAL: CRITICAL - Plugin timed out after 10 seconds [12:56:37] Load avg. 
on cassia is CRITICAL: (Return code of 137 is out of bounds) [12:56:38] Sensors on mayapple is CRITICAL: (Return code of 137 is out of bounds) [12:56:39] /tmp on ptolemy is CRITICAL: (Return code of 137 is out of bounds) [12:56:39] MySQL slave on rosemary is CRITICAL: (Return code of 137 is out of bounds) [12:56:39] Environment IPMI on turnera is CRITICAL: (Return code of 137 is out of bounds) [12:56:39] APT on yarrow is CRITICAL: (Return code of 137 is out of bounds) [12:56:40] aliasd on yarrow is CRITICAL: (Return code of 137 is out of bounds) [12:56:40] SMTP on yucca is CRITICAL: (Return code of 137 is out of bounds) [12:56:41] MySQL on cassia is CRITICAL: (Return code of 137 is out of bounds) [12:56:41] / on damiana is CRITICAL: (Return code of 137 is out of bounds) [12:56:57] PING on adenia is CRITICAL: CRITICAL - Plugin timed out after 10 seconds [12:57:03] tsnag is panicking :P [12:57:07] PING on cassia is OK: PING OK - Packet loss = 0%, RTA = 0.22 ms [12:57:36] NFS server ha-nfs.esi not responding still trying [12:57:48] PING on thyme is OK: PING OK - Packet loss = 0%, RTA = 0.25 ms [12:58:46] can't run my web tools [13:00:48] Even the toolserver.org website is down. [13:03:04] Everything is back now. [13:06:02] not necessarily [13:06:52] s2 has issues at least [13:07:08] Well, I can ssh and the webpages are ok, at least. [13:07:20] /home has issues [13:07:26] at least on willow [13:07:37] hem, yes, I can't even ls :/ [13:07:45] exactly [13:08:11] there is nfs issue, see my post of error msg [13:10:47] It seemed to work for a second, but not anymore. [13:11:32] somebody should poke dab or nosy or any other root [13:23:00] It's a conspiracy, it went down right when I about to do a check so I could remove somebody for inactivity :P [13:23:08] *I was about to [14:01:50] bah still no improvement [14:04:41] nfs [14:05:17] i think we are really in need for speed of roots atm [14:40:26] Free Memory on turnera is OK: OK - 80.4% (6736780 kB) free. 
[14:40:34] Sun Grid Engine execd on nightshade is UNKNOWN: Error with qhost: error: commlib error: got select error (Connection refused) [14:40:35] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [14:40:35] aliasd on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:40:35] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [14:40:35] s4 replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8861.000000 [14:40:35] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [14:40:44] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8866.000000 [14:40:45] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [14:40:45] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [14:40:45] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10381.000000 [14:40:45] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8866.000000 [14:40:45] Load avg. 
on nightshade is CRITICAL: CRITICAL - load average: 238.75, 261.34, 261.78 [14:40:45] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [14:40:46] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8889.000000 [14:40:46] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [14:40:54] s4 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8865.000000 [14:40:55] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [14:40:55] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) [14:40:55] MySQL slave on z-dat-s7-a is CRITICAL: (Return code of 139 is out of bounds) [14:40:55] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [14:40:55] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8919.000000 [14:40:55] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8900.000000 [14:41:04] MySQL slave on z-dat-s3-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 8820 [14:41:05] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8889.000000 [14:41:05] MySQL slave on daphne is CRITICAL: (Return code of 139 is out of bounds) [14:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 93173 MB (15% inode=99%): [14:41:05] MySQL slave on thyme is CRITICAL: (Return code of 139 is out of bounds) [14:41:05] Sun Grid Engine execd on yarrow is UNKNOWN: Error with qhost: error: commlib error: got select error (Connection refused) [14:41:17] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [14:41:17] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8912.000000 [14:41:17] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 
seconds. [14:41:17] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 8872 [14:41:17] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [14:41:17] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [14:41:17] / on damiana is CRITICAL: Connection refused by host [14:41:17] ts-array5 on damiana is CRITICAL: Connection refused by host [14:41:17] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [14:41:24] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8923.000000 [14:41:24] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 210247.000000 [14:41:24] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 8923.000000 [14:41:25] wikidata replag on z-dat-s2-b is CRITICAL: (Service Check Timed Out) [14:41:25] aliasd on nightshade is OK: TCP OK - 0.091 second response time on port 984 [500 Not found.] 
[14:41:54] MySQL on z-dat-s2-b is CRITICAL: (Service Check Timed Out) [14:42:04] MySQL slave on z-dat-s2-b is CRITICAL: (Service Check Timed Out) [14:45:24] Sun Grid Engine execd on nightshade is CRITICAL: CRITICAL: execd not communicating [14:46:05] s5 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 2101.000000 [14:46:06] Sun Grid Engine execd on yarrow is OK: Host and Queues Ok [14:47:05] s5 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 313.000000 [14:47:16] /tmp on damiana is CRITICAL: Connection refused by host [14:47:26] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [14:47:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 151282 MB (2% inode=69%): [14:47:26] DiskSuite on damiana is CRITICAL: Connection refused by host [14:47:26] /sql on rosemary is WARNING: DISK WARNING - free space: /sql 67296 MB (6% inode=99%): [14:47:35] MySQL on ha-sql.esi is CRITICAL: Access denied for user tsnagios7643@turnera-bge0 (using password: YES) [14:47:36] Environment IPMI on damiana is CRITICAL: Connection refused by host [14:47:36] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [14:47:36] Free Memory on damiana is CRITICAL: Connection refused by host [14:47:45] Load avg. 
on damiana is CRITICAL: Connection refused by host [14:48:05] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:48:06] SMTP on damiana is CRITICAL: Connection refused [14:48:16] SSH on damiana is CRITICAL: Connection refused [14:48:48] *sighs* SGE allowed to run tons of multiple instance of my bots instead of only one [14:52:35] s4 replag on daphne is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3316.000000 [14:52:55] MySQL slave on z-dat-s6-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3294 [14:53:05] MySQL slave on daphne is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3070 [14:54:55] MySQL slave on z-dat-s6-a is OK: Uptime: 1737130 Threads: 3 Questions: 650248160 Slow queries: 91864 Opens: 3033918 Flush tables: 1 Open tables: 3226 Queries per second avg: 374.323 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1644 [14:55:36] s4 replag on daphne is OK: QUERY OK: SELECT ts_rc_age() returned 1512.000000 [14:56:05] MySQL slave on daphne is OK: Uptime: 5331650 Threads: 4 Questions: 1223308171 Slow queries: 108444 Opens: 248073 Flush tables: 1 Open tables: 1485 Queries per second avg: 229.442 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1193 [14:59:55] s4 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3311.000000 [15:00:56] MySQL slave on z-dat-s7-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3434 [15:01:31] Ah, everything looks ok now. 
[15:02:11] * Merlissimo searches tsbot [15:04:06] MySQL slave on z-dat-s3-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3570 [15:04:56] s4 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1408.000000 [15:05:54] MySQL slave on z-dat-s7-a is OK: Uptime: 1737791 Threads: 7 Questions: 662512796 Slow queries: 34946 Opens: 2450068 Flush tables: 1 Open tables: 3932 Queries per second avg: 381.238 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1638 [15:13:06] MySQL slave on z-dat-s3-a is OK: Uptime: 1738217 Threads: 28 Questions: 1583787975 Slow queries: 65334 Opens: 15141271 Flush tables: 1 Open tables: 16384 Queries per second avg: 911.156 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1658 [15:28:45] s1 replag on thyme is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3531.000000 [15:29:06] MySQL slave on thyme is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3444 [15:38:05] MySQL slave on thyme is OK: Uptime: 315556 Threads: 12 Questions: 219267484 Slow queries: 29427 Opens: 2261 Flush tables: 1 Open tables: 415 Queries per second avg: 694.860 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1774 [15:38:46] s1 replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 1711.000000 [15:40:35] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [15:40:36] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [15:40:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [15:40:45] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [15:40:46] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). 
[15:40:46] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 13980.000000 [15:40:46] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 4268.000000 [15:40:46] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 11.45, 14.58, 22.13 [15:40:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [15:40:46] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 12489.000000 [15:40:47] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [15:40:55] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [15:40:56] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [15:40:56] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 12518.000000 [15:40:56] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 12500.000000 [15:41:06] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 92809 MB (15% inode=99%): [15:41:17] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [15:41:17] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 12513.000000 [15:41:17] MySQL slave on rosemary is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 4213 [15:41:17] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [15:41:17] ts-array5 on damiana is CRITICAL: Connection refused by host [15:41:17] / on damiana is CRITICAL: Connection refused by host [15:41:17] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). 
[15:41:18] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [15:41:18] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [15:41:26] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 12524.000000 [15:41:26] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 12525.000000 [15:41:26] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 211594.000000 [15:41:46] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: Too many connections [15:42:16] MySQL on z-dat-s2-b is CRITICAL: Too many connections [15:42:26] MySQL slave on z-dat-s2-b is CRITICAL: Too many connections [15:43:46] Load avg. on nightshade is WARNING: WARNING - load average: 8.41, 11.62, 19.64 [15:45:46] s1 replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3583.000000 [15:46:15] MySQL slave on rosemary is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3524 [15:47:16] /tmp on damiana is CRITICAL: Connection refused by host [15:47:26] DiskSuite on damiana is CRITICAL: Connection refused by host [15:47:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 152221 MB (2% inode=69%): [15:47:45] Load avg. on damiana is CRITICAL: Connection refused by host [15:48:04] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:48:16] SSH on damiana is CRITICAL: Connection refused [15:48:26] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). 
[15:48:36] Environment IPMI on damiana is CRITICAL: Connection refused by host [15:48:36] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [15:48:36] Free Memory on damiana is CRITICAL: Connection refused by host [15:48:46] Sun Grid Engine execd on nightshade is UNKNOWN: Execution timeout exceeded [15:49:05] SMTP on damiana is CRITICAL: Connection refused [15:49:26] SSH on nightshade is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:54:45] Load avg. on nightshade is OK: OK - load average: 8.50, 9.76, 14.73 [16:00:46] s1 replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1723.000000 [16:01:16] MySQL slave on rosemary is OK: Uptime: 330230 Threads: 12 Questions: 270674300 Slow queries: 217399 Opens: 18994 Flush tables: 1 Open tables: 3033 Queries per second avg: 819.653 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1648 [16:05:36] Sun Grid Engine execd on nightshade is OK: Host and Queues Ok [16:06:16] SSH on nightshade is OK: SSH OK - OpenSSH_5.5p1 Debian-6+squeeze3 (protocol 2.0) [16:20:45] Load avg. on nightshade is WARNING: WARNING - load average: 23.84, 18.65, 16.12 [16:31:45] Load avg. on nightshade is OK: OK - load average: 12.81, 14.05, 14.96 [16:32:05] Silke Meyer * [Toolserver-announce] Workshops at the Amsterdam Hackathon [16:40:46] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [16:40:47] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). 
[16:40:47] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 17581.000000 [16:40:47] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [16:40:47] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 16088.000000 [16:40:47] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [16:40:56] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [16:40:56] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [16:40:56] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 16118.000000 [16:40:56] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 16100.000000 [16:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 92553 MB (15% inode=99%): [16:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [16:41:16] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 16114.000000 [16:41:17] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:41:17] ts-array5 on damiana is CRITICAL: Connection refused by host [16:41:17] / on damiana is CRITICAL: Connection refused by host [16:41:17] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). 
[16:41:25] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 16125.000000 [16:41:26] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 16125.000000 [16:41:27] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 214742.000000 [16:41:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [16:41:36] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [16:41:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [16:42:16] MySQL on z-dat-s2-b is CRITICAL: Too many connections [16:42:17] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [16:42:17] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [16:42:26] MySQL slave on z-dat-s2-b is CRITICAL: Too many connections [16:42:46] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: Too many connections [16:46:45] Load avg. on nightshade is WARNING: WARNING - load average: 25.26, 21.30, 17.35 [16:47:26] DiskSuite on damiana is CRITICAL: Connection refused by host [16:47:27] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 152241 MB (2% inode=69%): [16:47:45] Load avg. 
on damiana is CRITICAL: Connection refused by host [16:48:05] John * Re: [Toolserver-l] Possible LDAP outtime this morning, major disruption [16:48:16] SSH on damiana is CRITICAL: Connection refused [16:48:17] /tmp on damiana is CRITICAL: Connection refused by host [16:48:36] Environment IPMI on damiana is CRITICAL: Connection refused by host [16:48:37] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [16:48:37] Free Memory on damiana is CRITICAL: Connection refused by host [16:49:04] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:49:26] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [16:49:46] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 29.83, 25.54, 19.70 [16:50:05] SMTP on damiana is CRITICAL: Connection refused [17:08:45] Load avg. on nightshade is WARNING: WARNING - load average: 16.28, 18.34, 19.85 [17:10:45] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 20.77, 19.42, 20.05 [17:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 92212 MB (15% inode=99%): [17:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [17:41:16] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 19713.000000 [17:41:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:41:16] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). 
[17:41:16] / on damiana is CRITICAL: Connection refused by host [17:41:17] ts-array5 on damiana is CRITICAL: Connection refused by host [17:41:26] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 19725.000000 [17:41:27] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 19725.000000 [17:41:27] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 218080.000000 [17:41:45] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 21241.000000 [17:41:46] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [17:41:46] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [17:41:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [17:41:46] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 19748.000000 [17:41:46] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [17:41:55] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [17:41:56] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [17:41:56] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 19779.000000 [17:41:56] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 19760.000000 [17:42:16] MySQL on z-dat-s2-b is CRITICAL: Too many connections [17:42:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [17:42:16] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null 
:OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [17:42:26] MySQL slave on z-dat-s2-b is CRITICAL: Too many connections [17:42:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [17:42:36] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [17:42:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [17:42:46] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: Too many connections [17:47:46] Load avg. on damiana is CRITICAL: Connection refused by host [17:48:16] SSH on damiana is CRITICAL: Connection refused [17:48:17] /tmp on damiana is CRITICAL: Connection refused by host [17:48:26] DiskSuite on damiana is CRITICAL: Connection refused by host [17:48:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 151488 MB (2% inode=69%): [17:48:36] Environment IPMI on damiana is CRITICAL: Connection refused by host [17:49:05] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:49:36] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [17:49:36] Free Memory on damiana is CRITICAL: Connection refused by host [17:50:04] SMTP on damiana is CRITICAL: Connection refused [17:50:25] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [18:03:45] Load avg. on nightshade is WARNING: WARNING - load average: 10.51, 13.83, 19.58 [18:17:45] Load avg. on nightshade is OK: OK - load average: 9.56, 12.16, 14.96 [18:31:45] Load avg. 
on nightshade is WARNING: WARNING - load average: 21.91, 18.69, 16.30 [18:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 92004 MB (15% inode=99%): [18:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [18:41:16] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 23314.000000 [18:41:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [18:41:16] ts-array5 on damiana is CRITICAL: Connection refused by host [18:41:16] / on damiana is CRITICAL: Connection refused by host [18:41:17] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [18:41:26] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 23325.000000 [18:41:26] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 23325.000000 [18:41:45] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 218413.000000 [18:41:45] Load avg. on nightshade is OK: OK - load average: 10.82, 13.65, 15.00 [18:42:16] MySQL on z-dat-s2-b is CRITICAL: Too many connections [18:42:17] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [18:42:17] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [18:42:25] MySQL slave on z-dat-s2-b is CRITICAL: Too many connections [18:42:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [18:42:37] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). 
[18:42:37] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [18:42:45] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 24901.000000 [18:42:46] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [18:42:46] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [18:42:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [18:42:46] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 23408.000000 [18:42:46] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [18:42:56] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [18:42:56] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [18:42:56] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 23438.000000 [18:42:56] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 23421.000000 [18:43:45] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: Too many connections [18:47:46] Load avg. on damiana is CRITICAL: Connection refused by host [18:48:16] SSH on damiana is CRITICAL: Connection refused [18:48:16] /tmp on damiana is CRITICAL: Connection refused by host [18:48:26] DiskSuite on damiana is CRITICAL: Connection refused by host [18:48:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 155271 MB (2% inode=69%): [18:48:35] Environment IPMI on damiana is CRITICAL: Connection refused by host [18:49:06] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:49:36] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). 
[18:49:36] Free Memory on damiana is CRITICAL: Connection refused by host [18:50:05] SMTP on damiana is CRITICAL: Connection refused [18:51:16] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [19:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 91749 MB (15% inode=99%): [19:41:20] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [19:41:20] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 26914.000000 [19:41:20] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:41:20] ts-array5 on damiana is CRITICAL: Connection refused by host [19:41:20] / on damiana is CRITICAL: Connection refused by host [19:41:20] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [19:41:26] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 26925.000000 [19:41:26] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 26925.000000 [19:41:46] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 214818.000000 [19:42:16] MySQL on z-dat-s2-b is CRITICAL: Too many connections [19:42:17] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [19:42:17] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [19:42:25] MySQL slave on z-dat-s2-b is CRITICAL: Too many connections [19:42:45] wikidata replag on daphne is 
CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 28500.000000 [19:42:46] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [19:42:47] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [19:42:47] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [19:42:47] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 27008.000000 [19:42:47] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [19:42:55] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [19:43:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [19:43:36] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [19:43:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [19:43:56] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [19:43:56] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 27098.000000 [19:43:56] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 27080.000000 [19:44:46] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: Too many connections [19:47:45] Load avg. 
on damiana is CRITICAL: Connection refused by host [19:48:16] SSH on damiana is CRITICAL: Connection refused [19:48:17] /tmp on damiana is CRITICAL: Connection refused by host [19:48:26] DiskSuite on damiana is CRITICAL: Connection refused by host [19:48:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 156252 MB (2% inode=69%): [19:48:36] Environment IPMI on damiana is CRITICAL: Connection refused by host [19:49:05] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:49:35] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [19:49:35] Free Memory on damiana is CRITICAL: Connection refused by host [19:50:05] SMTP on damiana is CRITICAL: Connection refused [19:51:15] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [19:54:42] Wow, that must have been an exciting afternoon: [2013-03-01 14:40:44] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 238.75, 261.34, 261.78 [20:04:46] Load avg. on nightshade is WARNING: WARNING - load average: 21.31, 18.31, 15.28 [20:27:08] hello all [20:27:45] MySQL on ha-sql.esi is CRITICAL: Access denied for user tsnagios7643@turnera-bge0 (using password: YES) [20:33:49] Evening, DaBPunkt! [20:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 91395 MB (15% inode=99%): [20:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [20:41:16] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 30513.000000 [20:41:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[20:41:16] / on damiana is CRITICAL: Connection refused by host [20:41:16] ts-array5 on damiana is CRITICAL: Connection refused by host [20:41:17] APT on yarrow is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). [20:41:45] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 216814.000000 [20:42:16] MySQL on z-dat-s2-b is CRITICAL: Cant connect to MySQL server on z-dat-s2-b (146) [20:42:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [20:42:16] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [20:42:26] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 30585.000000 [20:42:26] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 30585.000000 [20:42:26] MySQL slave on z-dat-s2-b is CRITICAL: Cant connect to MySQL server on z-dat-s2-b (146) [20:42:46] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32101.000000 [20:42:46] APT on yucca is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [20:42:46] APT on z-dat-s2-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [20:42:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [20:42:46] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 30608.000000 [20:42:47] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [20:42:55] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [20:43:09] what did you do with my servers? 
[20:43:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [20:43:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [20:43:36] APT on nightshade is CRITICAL: APT CRITICAL: 8 packages available for upgrade (4 critical updates). [20:43:56] MySQL slave on z-dat-s5-b is CRITICAL: (Return code of 139 is out of bounds) [20:43:56] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 30698.000000 [20:43:56] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 30681.000000 [20:44:47] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s2-b (146) [20:45:16] APT on yarrow is OK: APT OK: 0 packages available for upgrade (0 critical updates). [20:47:46] Load avg. on damiana is CRITICAL: Connection refused by host [20:48:01] Nothing, they seem to have a mind of their own :-). [20:48:26] DiskSuite on damiana is CRITICAL: Connection refused by host [20:48:27] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 157015 MB (2% inode=69%): [20:48:35] Environment IPMI on damiana is CRITICAL: Connection refused by host [20:49:06] NTP on damiana is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:49:16] SSH on damiana is CRITICAL: Connection refused [20:49:16] /tmp on damiana is CRITICAL: Connection refused by host [20:49:36] APT on z-dat-s5-b is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [20:49:36] Free Memory on damiana is CRITICAL: Connection refused by host [20:50:04] SMTP on damiana is CRITICAL: Connection refused [20:51:16] APT on mayapple is CRITICAL: APT CRITICAL: 4 packages available for upgrade (4 critical updates). 
[20:51:35] APT on nightshade is OK: APT OK: 0 packages available for upgrade (0 critical updates). [20:52:41] MySQL on ha-sql.esi is OK: Uptime: 51 Threads: 15 Questions: 2550 Slow queries: 0 Opens: 23 Flush tables: 1 Open tables: 14 Queries per second avg: 50.0 [20:54:16] SSH on damiana is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [20:54:26] APT on mayapple is OK: APT OK: 0 packages available for upgrade (0 critical updates). [20:55:06] SMTP on damiana is OK: SMTP OK - 0.210 sec. response time [20:55:16] / on damiana is OK: DISK OK - free space: / 42729 MB (59% inode=95%): [20:55:17] ts-array5 on damiana is OK: 2/2 paths are active [20:55:17] /tmp on damiana is OK: DISK OK - free space: /tmp 13230 MB (99% inode=99%): [20:55:25] DiskSuite on damiana is OK: OK - No disk failures detected [20:55:35] Environment IPMI on damiana is WARNING: NRPE: Unable to read output [20:55:36] Free Memory on damiana is OK: OK - 77.8% (6515668 kB) free. [20:55:46] Load avg. on damiana is OK: OK - load average: 0.09, 0.05, 0.04 [20:55:55] NTP on damiana is OK: NTP OK: Offset -0.000314 secs [20:56:45] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 25.76, 22.59, 20.24 [20:57:46] Load avg. on nightshade is WARNING: WARNING - load average: 19.28, 21.33, 19.95 [20:58:46] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 25.56, 22.35, 20.36 [21:00:45] APT on yucca is OK: APT OK: 0 packages available for upgrade (0 critical updates). [21:07:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:13:36] APT on z-dat-s5-b is OK: APT OK: 0 packages available for upgrade (0 critical updates). 
[21:29:36] Environment IPMI on damiana is OK: ok: temperature ok fan ok voltage ok chassis ok [21:41:04] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 91044 MB (14% inode=99%): [21:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [21:41:16] wikidata replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 10795.000000 [21:41:16] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [21:42:15] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 220305.000000 [21:42:56] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [21:43:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [21:43:16] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [21:43:26] wikidata replag on z-dat-s7-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 9121.000000 [21:43:27] wikidata replag on z-dat-s6-a is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 7159.000000 [21:43:35] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [21:43:45] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 33754.000000 [21:43:45] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [21:43:45] wikidata replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 11753.000000 [21:43:45] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [21:44:35] FMA 
on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [21:44:55] MySQL slave on z-dat-s5-b is CRITICAL: (Service Check Timed Out) [21:44:55] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 21529.000000 [21:44:56] wikidata replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 15831.000000 [21:48:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 157444 MB (2% inode=69%): [21:57:26] wikidata replag on z-dat-s6-a is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3387.000000 [21:59:46] Load avg. on nightshade is CRITICAL: CRITICAL - load average: 14.25, 18.43, 20.55 [22:03:25] wikidata replag on z-dat-s6-a is OK: QUERY OK: SELECT ts_rc_age() returned 1597.000000 [22:04:46] Load avg. on nightshade is WARNING: WARNING - load average: 13.36, 17.79, 19.88 [22:05:16] wikidata replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3458.000000 [22:06:26] wikidata replag on z-dat-s7-a is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3528.000000 [22:07:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:11:16] wikidata replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1587.000000 [22:13:26] wikidata replag on z-dat-s7-a is OK: QUERY OK: SELECT ts_rc_age() returned 1478.000000 [22:14:46] MySQL on z-dat-s5-b is CRITICAL: Too many connections [22:19:46] Load avg. on nightshade is OK: OK - load average: 8.76, 10.53, 14.68 [22:19:46] wikidata replag on thyme is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3499.000000 [22:26:46] wikidata replag on thyme is OK: QUERY OK: SELECT ts_rc_age() returned 1537.000000 [22:28:46] APT on z-dat-s2-b is OK: APT OK: 0 packages available for upgrade (0 critical updates). 
[22:28:56] wikidata replag on rosemary is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3579.000000 [22:29:06] MySQL on z-dat-s2-b is OK: Uptime: 26 Threads: 7 Questions: 147 Slow queries: 0 Opens: 956 Flush tables: 1 Open tables: 38 Queries per second avg: 5.653 [22:33:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [22:34:16] DaBPunkt: for some reason one of my SGE jobs is continually in the "qw" state, and i'm not sure why since some other jobs which are nearly the same went to "r" afterwards. is there an issue with sql-s1-rr? that's the only difference between the jobs [22:34:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 38335 [22:34:26] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 197321.000000 [22:34:55] wikidata replag on rosemary is OK: QUERY OK: SELECT ts_rc_age() returned 1779.000000 [22:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 90823 MB (14% inode=99%): [22:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [22:41:17] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[22:42:16] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: Cant connect to MySQL server on z-dat-s5-b (146) [22:42:56] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [22:43:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [22:43:17] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A: [22:43:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [22:43:45] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 32670.000000 [22:43:45] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [22:43:46] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [22:44:36] FMA on amaranth is CRITICAL: Failed components: hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [22:44:46] MySQL on z-dat-s5-b is OK: Uptime: 6 Threads: 1 Questions: 14 Slow queries: 0 Opens: 20 Flush tables: 1 Open tables: 13 Queries per second avg: 2.333 [22:44:56] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 19083.000000 [22:45:45] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 224113 [22:48:26] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 157476 MB (2% inode=69%): [23:07:16] NTP on yucca is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:09:25] legoktm: When was that job scheduled? I had a similar experience with a job submitted on Feb 25 IIRC, but since then everything went smoothly. 
[23:33:46] APT on yucca is CRITICAL: APT CRITICAL: 1 packages available for upgrade (1 critical updates). [23:34:16] MySQL slave on z-dat-s2-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 34374 [23:35:25] wikidata replag on z-dat-s2-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 200766.000000 [23:37:50] scfc_de: it's on an hourly cronjob [23:41:05] /sql on ptolemy is WARNING: DISK WARNING - free space: /sql 90545 MB (14% inode=99%): [23:41:16] Virtual disks on far1-n1-oe16-esams.mgmt is CRITICAL: OK 3, WARN 0, CRIT 1: far1-n1-fast3 FTOL, far1-n1-bulk CRIT, far1-n1-fast2 FTOL, far1-n1-fast FTOL [23:41:17] Environment IPMI on thyme is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [23:42:55] MySQL slave on z-dat-s4-a is CRITICAL: (Return code of 139 is out of bounds) [23:43:06] wikidata replag on z-dat-s5-b is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 194262.000000 [23:43:16] SMF on web.amaranth is CRITICAL: ERROR - maintenance: svc:/application/jira:default [23:43:17] CAM on hemlock is CRITICAL: CRITICAL - Storage ts-array5 (2 errors, 1 warning): null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.B:, null :OSGi.com.sun.storage.cam.agent(device.2530):event.ProblemEvent.REC_EXPIRED_BATTERY.description:S17:Tray.85.Battery.A:, null :OSGi.com.sun.storage.cam.agent(com.sun.netstorage.fm.storade.agent.Messages):monitor.Communicatio [23:43:46] wikidata replag on daphne is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 27423.000000 [23:43:46] Sun Grid Engine execd on ortelius is WARNING: NRPE: Unable to read output [23:43:46] Sun Grid Engine execd on wolfsbane is WARNING: NRPE: Unable to read output [23:44:36] Sun Grid Engine execd on willow is WARNING: NRPE: Unable to read output [23:44:37] FMA on amaranth is CRITICAL: Failed components: 
hc://:product-id=SUN-FIRE-X4150:server-id=amaranth:chassis-id=0819QAR1D1:serial=518545072303039020:part=72T256520HFD3SB:revision=--/motherboard=0/memory-controller=1/dram-channel=2/dimm=3/rank=7 [23:44:54] s4 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 13365.000000 [23:46:26] MySQL slave on z-dat-s5-b is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 192319 [23:48:25] /mnt user-store on rosemary is CRITICAL: DISK CRITICAL - free space: /mnt 157856 MB (2% inode=69%):