[00:02:45] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.020996/1.95, alarm hl:np_load_avg=1.623047/2.0, alarm hl:mem_free=214.000000M/350M, alarm hl:available=1/0 [00:04:45] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [00:09:45] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [00:13:35] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.451172/1.10, alarm hl:np_load_long=0.902344/1.55, alarm hl:mem_free=15869.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.451172/1.00, alarm hl:np_load_long=0.902344/1.50, alarm hl:mem_free=15869.000000M/600M, alarm hl:available=1/0 [00:14:15] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 61425.000000 [00:15:14] MySQL slave on cassia is CRITICAL: (Return code of 139 is out of bounds) [00:17:34] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [00:17:46] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=0.961914/1.95, alarm hl:np_load_avg=1.160156/2.0, alarm hl:mem_free=287.000000M/350M, alarm hl:available=1/0 [00:27:45] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [00:33:45] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [00:45:05] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [00:50:25] Load avg. on willow is WARNING: WARNING - load average: 18.61, 14.49, 11.98 [00:51:25] Load avg. on willow is OK: OK - load average: 12.29, 13.30, 11.70 [00:57:25] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 305201 MB (5% inode=33%): [01:02:26] Load avg. 
on willow is WARNING: WARNING - load average: 17.79, 14.54, 12.43 [01:02:56] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.163086/1.95, alarm hl:np_load_avg=1.810547/2.0, alarm hl:mem_free=889.000000M/350M, alarm hl:available=1/0 [01:03:55] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [01:07:56] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.618164/1.95, alarm hl:np_load_avg=2.161621/2.0, alarm hl:mem_free=946.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.618164/2.3, alarm hl:np_load_long=1.747559/2.5, alarm hl:cpu=86.100000/98, alarm hl:mem_free=946.000000M/200M, alarm hl:available=1/0 [01:09:55] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [01:14:15] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 65027.000000 [01:15:15] MySQL slave on cassia is CRITICAL: (Return code of 139 is out of bounds) [01:27:55] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:33:55] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [01:45:15] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [01:57:24] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 305101 MB (5% inode=33%): [02:02:25] Load avg. on willow is WARNING: WARNING - load average: 16.58, 14.06, 12.30 [02:02:56] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.005371/1.95, alarm hl:np_load_avg=1.747070/2.0, alarm hl:mem_free=782.000000M/350M, alarm hl:available=1/0 [02:03:35] Load avg. 
on willow is OK: OK - load average: 13.07, 13.42, 12.18 [02:03:55] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [02:07:46] Load avg. on willow is WARNING: WARNING - load average: 17.17, 15.42, 13.24 [02:10:06] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [02:14:05] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.493164/1.95, alarm hl:np_load_avg=2.362793/2.0, alarm hl:mem_free=382.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.493164/2.3, alarm hl:np_load_long=1.932129/2.5, alarm hl:cpu=93.700000/98, alarm hl:mem_free=382.000000M/200M, alarm hl:available=1/0 [02:14:34] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 68643.000000 [02:15:15] MySQL slave on cassia is CRITICAL: (Return code of 139 is out of bounds) [02:28:05] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:33:54] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [02:45:15] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [02:57:46] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303185 MB (5% inode=33%): [03:09:06] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.624024/1.95, alarm hl:np_load_avg=2.026367/2.0, alarm hl:mem_free=729.000000M/350M, alarm hl:available=1/0 [03:10:05] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [03:10:06] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [03:14:15] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.000000/1.95, alarm hl:np_load_avg=2.168457/2.0, alarm hl:mem_free=505.000000M/350M, alarm 
hl:available=1/0 [03:14:35] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 72243.000000 [03:15:16] MySQL slave on cassia is CRITICAL: (Return code of 139 is out of bounds) [03:16:46] Load avg. on willow is WARNING: WARNING - load average: 12.79, 15.47, 15.82 [03:28:05] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:33:55] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [03:35:47] Load avg. on willow is OK: OK - load average: 8.35, 13.12, 14.79 [03:45:25] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [03:53:49] [[Special:Log/newusers]] create * CRCComputer01 * (New user account) [03:57:45] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303161 MB (5% inode=33%): [04:02:46] Load avg. on willow is WARNING: WARNING - load average: 15.21, 13.77, 13.43 [04:03:46] Load avg. on willow is OK: OK - load average: 13.41, 13.60, 13.40 [04:10:06] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [04:14:35] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 75844.000000 [04:15:15] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.712891/1.95, alarm hl:np_load_avg=1.806152/2.0, alarm hl:mem_free=244.000000M/350M, alarm hl:available=1/0 [04:15:15] MySQL slave on cassia is CRITICAL: (Return code of 139 is out of bounds) [04:20:15] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [04:26:05] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.994629/1.95, alarm hl:np_load_avg=1.862305/2.0, alarm hl:mem_free=519.000000M/350M, alarm hl:available=1/0 [04:28:06] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:28:55] RAID on adenia is CRITICAL: 
CHECK_NRPE: Socket timeout after 30 seconds. [04:32:46] Load avg. on willow is WARNING: WARNING - load average: 14.26, 15.50, 14.62 [04:34:06] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [04:34:45] Load avg. on willow is OK: OK - load average: 12.46, 14.59, 14.39 [04:45:35] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [04:48:45] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [04:57:46] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303102 MB (5% inode=33%): [05:06:22] [[User:CRCComputer01]] !NM https://wiki.toolserver.org/w/index.php?oldid=7183&rcid=9571 * CRCComputer01 * (+245) (Created page with "Computer Repair New York, serving all 5 boroughs of New York:Manhattan,Queens, Brooklyn. Free pickup & delivery. Laptop repairs. All work done onsite by our computer repair speci...") [05:10:15] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [05:13:55] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:14:15] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.974121/1.95, alarm hl:np_load_avg=2.154785/2.0, alarm hl:mem_free=133.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.974121/2.3, alarm hl:np_load_long=1.773926/2.5, alarm hl:cpu=98.700000/98, alarm hl:mem_free=133.000000M/200M, alarm hl:available=1/0 [05:14:36] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 79445.000000 [05:14:47] Load avg. on willow is WARNING: WARNING - load average: 22.02, 17.76, 14.53 [05:15:25] MySQL slave on cassia is CRITICAL: (Return code of 139 is out of bounds) [05:23:45] Load avg. 
on willow is OK: OK - load average: 9.57, 14.41, 14.58 [05:24:16] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [05:28:15] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [05:29:45] Load avg. on willow is WARNING: WARNING - load average: 12.63, 15.06, 14.96 [05:33:15] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.741211/1.95, alarm hl:np_load_avg=2.210449/2.0, alarm hl:mem_free=283.000000M/350M, alarm hl:available=1/0 [05:34:15] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [05:45:45] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [05:57:25] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.855957/1.95, alarm hl:np_load_avg=2.996094/2.0, alarm hl:mem_free=717.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.855957/2.3, alarm hl:np_load_long=2.741699/2.5, alarm hl:cpu=99.700000/98, alarm hl:mem_free=717.000000M/200M, alarm hl:available=1/0 [05:57:45] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 303041 MB (5% inode=33%): [06:06:45] Load avg. on willow is CRITICAL: CRITICAL - load average: 38.18, 27.25, 24.12 [06:10:13] [[Special:Log/newusers]] create * Crocodile * (New user account) [06:10:15] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [06:10:58] [[User:Crocodile]] !N https://wiki.toolserver.org/w/index.php?oldid=7184&rcid=9573 * Crocodile * (+43) (http://en.wikipedia.org/wiki/User:General) [06:11:12] [[User:Crocodile]] ! https://wiki.toolserver.org/w/index.php?diff=7185&oldid=7184&rcid=9574 * Crocodile * (+8) () [06:12:25] [[User:CRCComputer01]] ! 
https://wiki.toolserver.org/w/index.php?diff=7186&oldid=7183&rcid=9575 * Crocodile * (-245) (dont advertise) [06:15:35] MySQL slave on cassia is CRITICAL: (Return code of 139 is out of bounds) [06:15:35] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 83105.000000 [06:23:46] Load avg. on willow is WARNING: WARNING - load average: 9.98, 16.06, 19.86 [06:24:25] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [06:24:45] Load avg. on willow is CRITICAL: CRITICAL - load average: 32.29, 20.93, 21.32 [06:27:24] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.716309/1.95, alarm hl:np_load_avg=2.263672/2.0, alarm hl:mem_free=818.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.716309/2.3, alarm hl:np_load_long=2.527832/2.5, alarm hl:cpu=76.600000/98, alarm hl:mem_free=818.000000M/200M, alarm hl:available=1/0 [06:27:45] Load avg. 
on willow is WARNING: WARNING - load average: 12.88, 17.22, 19.80 [06:28:25] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:29:25] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [06:34:15] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [06:46:44] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [06:57:54] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 302872 MB (5% inode=33%): [07:03:26] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.049316/1.95, alarm hl:np_load_avg=3.169434/2.0, alarm hl:mem_free=540.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.049316/2.3, alarm hl:np_load_long=2.479492/2.5, alarm hl:cpu=71.700000/98, alarm hl:mem_free=540.000000M/200M, alarm hl:available=1/0 [07:10:25] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [07:15:25] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [07:15:35] MySQL slave on cassia is CRITICAL: (Return code of 139 is out of bounds) [07:15:35] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 86706.000000 [07:23:54] (commented) [TS-1370] replication on sql-s5-user stopped <https://jira.toolserver.org/browse/TS-1370> (Marlen Caemmerer) [07:27:04] Load avg. on willow is WARNING: WARNING - load average: 12.75, 14.81, 16.02 [07:28:25] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [07:34:25] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [07:40:05] Load avg. 
on willow is OK: OK - load average: 8.98, 13.25, 14.97 [07:43:52] (commented) [TS-1370] replication on sql-s5-user stopped <https://jira.toolserver.org/browse/TS-1370> (Marlen Caemmerer) [07:46:44] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [07:57:05] Load avg. on willow is WARNING: WARNING - load average: 13.36, 16.18, 15.36 [07:57:24] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.652832/1.95, alarm hl:np_load_avg=2.017578/2.0, alarm hl:mem_free=786.000000M/350M, alarm hl:available=1/0 [07:58:05] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 302812 MB (5% inode=33%): [07:58:05] Load avg. on willow is OK: OK - load average: 9.85, 14.77, 14.91 [07:58:25] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [08:03:25] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.440430/1.95, alarm hl:np_load_avg=2.331543/2.0, alarm hl:mem_free=498.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.440430/2.3, alarm hl:np_load_long=2.071777/2.5, alarm hl:cpu=93.100000/98, alarm hl:mem_free=498.000000M/200M, alarm hl:available=1/0 [08:11:25] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [08:15:35] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 60652 [08:15:36] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 60644.000000 [08:16:25] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [08:20:05] Load avg. on willow is WARNING: WARNING - load average: 11.30, 14.30, 15.56 [08:24:04] Load avg. on willow is OK: OK - load average: 11.32, 13.27, 14.90 [08:27:05] Load avg. on willow is WARNING: WARNING - load average: 10.75, 14.09, 15.10 [08:28:04] Load avg. 
on willow is OK: OK - load average: 8.89, 13.09, 14.69 [08:28:25] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:30:27] hello [08:30:30] world [08:31:03] word [08:31:07] alter [08:31:25] so, speaking of config you remind me about puppet [08:31:56] there was some talk (maybe a couple months ago) about puppet. and i think you're already using it but the configs weren't sanitized? [08:32:50] anyway, just wondering what the latest is there. will the configs possibly be published soon? this month? (and maintained publicly not just one time). so that people can contribute patches to them [08:33:39] jeremyb: as always i don't say anything without my fellow admins :) [08:33:55] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [08:33:59] nosy: is the sky blue? ;) [08:34:10] jeremyb: why do you think it would be a good idea? [08:34:19] which? [08:34:21] to share the configs and which would be sensible? [08:34:26] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [08:34:27] currently it's grey here [08:34:29] :D [08:35:03] well, the example that came up on the list was about lists of packages that people needed installed [08:35:36] oh i see but we are getting away from the "replication from the master"-discussion [08:35:51] yes, mostly unrelated [08:36:15] that's what makes toolserver sometimes hard - lots of issues, partially related to each other [08:37:41] imho on this topic: if we don't get to a point where most of the people use a file to place their requirements in we will stick to tickets or use this as a fallback way [08:38:35] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [08:39:07] what about just publishing the whole puppet repo except for e.g. keys and passwords? [08:39:57] would be a good start probably [08:40:09] like putting it into a svn or something [08:40:15] git! 
[08:40:21] or git [08:40:41] whatever the current version control system is called...:D [08:41:14] heh [08:41:37] i started with cvs but had a glimpse into rcs too for historical interest [08:41:57] i only know RCS from magazine articles [08:42:01] CVS is first hand [08:42:10] anyway: i'll put this on my list of questions [08:42:41] thanks ;) [08:42:48] you're welcome [08:43:36] jeremyb: are you working for wmf? [08:43:43] i isn't [08:44:15] i see [08:44:38] see you in DC? [08:44:43] or NYC? [08:44:49] https://wikimania2012.wikimedia.org/wiki/Wikimania_Takes_Manhattan [08:45:18] yes they first take manhattan then they take berlin :D [08:45:24] that's where i will be [08:45:52] i think dab first introduced me to that song. never heard of it before he did [08:46:29] rly? wow [08:46:55] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [08:46:58] yes [08:47:05] / on wolfsbane is WARNING: DISK WARNING - free space: / 6272 MB (20% inode=93%): [08:47:31] what about "seen the lights go out on broadway"? :D [08:48:04] sometimes funny things happen here with our music [08:48:28] we listen to english words on the radio and guess it must come from the us or uk [08:48:39] but quite often this is not the case [08:48:48] yeah [08:48:52] but good old billy joel should be known [08:49:14] well i'm missing quite a bit of music (and movie) knowledge to begin with [08:50:05] Load avg. on willow is WARNING: WARNING - load average: 14.11, 17.17, 16.52 [08:50:12] ok good :) i lack a good portion of film knowledge too [08:51:09] 1st star wars was episode one (in a theater in its initial release run) [08:51:19] jeremyb: but how should the "puppet configs in git" thing work? [08:51:19] anyway, see you later ;) [08:51:29] i mean we could check this in but it would be read only [08:51:30] do you know how the WMF is doing it? [08:51:31] and then? [08:51:35] no [08:51:50] there's a puppet git repo [08:51:56] do they have a user base that always wants something special? 
[08:51:59] with no secrets in it. there's a few branches [08:52:33] most people can't write to any branches, they just submit changesets to a review queue [08:52:42] and eventually they're either merged or not [08:53:23] and then there's 2 other repos with some secrets. one is for just ops and it's prod secrets and one i think nearly everyone can see and is either for labs or dummy values [08:54:13] anyway, you could e.g. have people filing tickets with the puppet diff for what they're requesting already attached [08:54:21] and then you just rubber stamp and deploy ;) [08:56:05] Load avg. on willow is OK: OK - load average: 9.21, 13.00, 14.89 [08:56:44] (but even better than a diff is a link to a pull request or a branch to pull from) [08:58:05] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 302742 MB (5% inode=33%): [09:11:35] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [09:12:36] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.575195/1.95, alarm hl:np_load_avg=2.125000/2.0, alarm hl:mem_free=857.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.575195/2.3, alarm hl:np_load_long=1.833984/2.5, alarm hl:cpu=93.700000/98, alarm hl:mem_free=857.000000M/200M, alarm hl:available=1/0 [09:13:15] Load avg. on willow is WARNING: WARNING - load average: 19.04, 16.04, 14.45 [09:14:16] Load avg. 
on willow is OK: OK - load average: 12.32, 14.71, 14.08 [09:14:35] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [09:15:44] MySQL slave on cassia is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 4212 [09:15:45] s5 replag on cassia is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 4204.000000 [09:18:45] MySQL slave on cassia is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3524 [09:18:45] s5 replag on cassia is WARNING: QUERY WARNING: SELECT ts_rc_age() returned 3496.000000 [09:23:01] :) it was me) [09:23:09] got you to paste. heh [09:24:19] thanks nosy, that's the one about the CA.. ack [09:28:36] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:28:56] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:29:16] / on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:16] /sql on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:16] / on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:16] /tmp on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:16] Load avg. on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:17] Load avg. on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:17] /sql on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:18] / on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:18] /tmp on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:19] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[09:29:24] SSH on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:29:24] SSH on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:29:25] SSH on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:29:35] Load avg. on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:45] Load avg. on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:29:45] SMTP on z-dat-s7-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:29:55] SMTP on z-dat-s4-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:29:56] SMTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:30:06] /tmp on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:30:06] /tmp on z-dat-s7-a is OK: DISK OK - free space: /tmp 2201 MB (99% inode=99%): [09:30:06] / on z-dat-s3-a is OK: DISK OK - free space: / 8343 MB (27% inode=85%): [09:30:06] / on z-dat-s6-a is OK: DISK OK - free space: / 8343 MB (27% inode=85%): [09:30:06] Load avg. on z-dat-s3-a is OK: OK - load average: 0.72, 1.54, 1.88 [09:30:06] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 64767 MB (15% inode=99%): [09:30:06] /sql on z-dat-s3-a is OK: DISK OK - free space: /sql 163899 MB (16% inode=99%): [09:30:07] /tmp on z-dat-s3-a is OK: DISK OK - free space: /tmp 2202 MB (99% inode=99%): [09:30:07] / on z-dat-s7-a is OK: DISK OK - free space: / 8343 MB (27% inode=85%): [09:30:08] SMF on z-dat-s7-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:30:08] SMF on z-dat-s3-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:30:09] SMF on z-dat-s6-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:30:09] SMF on z-dat-s4-a is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:30:10] SMF on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:30:17] Load avg. 
on z-dat-s7-a is OK: OK - load average: 0.66, 1.46, 1.84 [09:30:36] /tmp on z-dat-s6-a is OK: DISK OK - free space: /tmp 2225 MB (99% inode=99%): [09:31:05] MySQL on z-dat-s6-a is CRITICAL: (Service Check Timed Out) [09:31:05] MySQL on z-dat-s3-a is CRITICAL: (Service Check Timed Out) [09:31:05] MySQL slave on z-dat-s6-a is CRITICAL: (Service Check Timed Out) [09:31:05] s4 replag on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [09:31:05] MySQL slave on z-dat-s3-a is CRITICAL: (Service Check Timed Out) [09:31:46] MySQL slave on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [09:31:56] MySQL on z-dat-s4-a is CRITICAL: (Service Check Timed Out) [09:32:05] MySQL on z-dat-s7-a is CRITICAL: (Service Check Timed Out) [09:32:06] MySQL slave on z-dat-s7-a is CRITICAL: (Service Check Timed Out) [09:32:37] NTP on hyacinth is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:32:45] SMTP on z-dat-s3-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:34:25] SMTP on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:34:36] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [09:34:36] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.359375/1.10, alarm hl:np_load_long=0.721680/1.55, alarm hl:mem_free=14064.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.359375/1.00, alarm hl:np_load_long=0.721680/1.50, alarm hl:mem_free=14064.000000M/600M, alarm hl:available=1/0 [09:34:56] MySQL on z-dat-s3-a is OK: Uptime: 825246 Threads: 25 Questions: 918046924 Slow queries: 62696 Opens: 8951599 Flush tables: 1 Open tables: 16384 Queries per second avg: 1112.452 [09:34:56] MySQL slave on z-dat-s3-a is OK: Uptime: 825246 Threads: 25 Questions: 918046925 Slow queries: 62696 Opens: 8951599 Flush tables: 1 Open tables: 16384 Queries per second avg: 
1112.452 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 515 [09:35:37] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [09:35:46] MySQL slave on cassia is OK: Uptime: 337516 Threads: 27 Questions: 556119318 Slow queries: 65774 Opens: 1171987 Flush tables: 1 Open tables: 12590 Queries per second avg: 1647.682 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1673 [09:35:46] s5 replag on cassia is OK: QUERY OK: SELECT ts_rc_age() returned 1661.000000 [09:36:55] MySQL on z-dat-s7-a is OK: Uptime: 825367 Threads: 12 Questions: 284241302 Slow queries: 24948 Opens: 2480693 Flush tables: 1 Open tables: 6255 Queries per second avg: 344.381 [09:36:56] MySQL slave on z-dat-s7-a is OK: Uptime: 825367 Threads: 12 Questions: 284241303 Slow queries: 24948 Opens: 2480693 Flush tables: 1 Open tables: 6255 Queries per second avg: 344.381 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 604 [09:37:45] MySQL on z-dat-s4-a is OK: Uptime: 825417 Threads: 19 Questions: 131486279 Slow queries: 18116 Opens: 14380 Flush tables: 1 Open tables: 664 Queries per second avg: 159.296 [09:37:46] SMTP on z-dat-s7-a is OK: SMTP OK - 1.557 sec. response time [09:37:46] MySQL on z-dat-s6-a is OK: Uptime: 825417 Threads: 11 Questions: 165778304 Slow queries: 53913 Opens: 1120681 Flush tables: 2 Open tables: 2050 Queries per second avg: 200.841 [09:37:47] MySQL slave on z-dat-s6-a is OK: Uptime: 825417 Threads: 10 Questions: 165778305 Slow queries: 53913 Opens: 1120681 Flush tables: 2 Open tables: 2050 Queries per second avg: 200.841 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 688 [09:37:47] s4 replag on z-dat-s4-a is OK: QUERY OK: SELECT ts_rc_age() returned 567.000000 [09:37:47] SMTP on z-dat-s3-a is OK: SMTP OK - 3.841 sec. 
response time [09:37:47] SMF on z-dat-s7-a is OK: OK - all services online [09:37:47] SMF on z-dat-s6-a is OK: OK - all services online [09:37:47] SMF on z-dat-s3-a is OK: OK - all services online [09:37:48] SMF on z-dat-s4-a is OK: OK - all services online [09:37:48] SMF on hyacinth is OK: OK - all services online [09:37:48] MySQL slave on z-dat-s4-a is OK: Uptime: 825418 Threads: 14 Questions: 131486303 Slow queries: 18121 Opens: 14380 Flush tables: 1 Open tables: 664 Queries per second avg: 159.296 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 567 [09:37:49] SMTP on hyacinth is OK: SMTP OK - 0.003 sec. response time [09:37:49] SMTP on z-dat-s4-a is OK: SMTP OK - 0.028 sec. response time [09:38:05] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:38:05] Load avg. on z-dat-s4-a is OK: OK - load average: 0.62, 0.53, 1.20 [09:38:05] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [09:38:05] Load avg. on z-dat-s6-a is OK: OK - load average: 0.86, 0.58, 1.21 [09:38:16] SSH on z-dat-s4-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:38:17] SMTP on z-dat-s6-a is OK: SMTP OK - 0.003 sec. 
response time [09:38:17] SSH on z-dat-s7-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:38:17] SSH on z-dat-s3-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [09:38:24] NTP on hyacinth is OK: NTP OK: Offset 0.001101 secs [09:47:05] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [09:47:15] / on wolfsbane is WARNING: DISK WARNING - free space: / 6022 MB (20% inode=93%): [09:55:15] / on wolfsbane is OK: DISK OK - free space: / 7661 MB (25% inode=93%): [09:55:35] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.534180/1.95, alarm hl:np_load_avg=1.761230/2.0, alarm hl:mem_free=870.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.534180/2.3, alarm hl:np_load_long=1.538086/2.5, alarm hl:cpu=80.600000/98, alarm hl:mem_free=870.000000M/200M, alarm hl:available=1/0 [09:56:35] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [09:58:05] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 302613 MB (5% inode=33%): [10:02:36] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.139160/1.95, alarm hl:np_load_avg=1.907226/2.0, alarm hl:mem_free=911.000000M/350M, alarm hl:available=1/0 [10:03:35] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.099609/1.00, alarm hl:np_load_long=0.708008/1.50, alarm hl:mem_free=14716.000000M/600M, alarm hl:available=1/0 [10:04:36] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [10:08:15] Load avg. on willow is WARNING: WARNING - load average: 14.73, 15.67, 13.85 [10:09:15] Load avg. 
on willow is OK: OK - load average: 10.92, 14.39, 13.52 [10:11:45] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [10:28:45] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:34:45] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [10:38:59] (commented) [TS-1369] SSL certificate problem, bad/outdated HTTPS CA <https://jira.toolserver.org/browse/TS-1369> (Krinkle) [10:47:15] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [10:58:05] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 302552 MB (5% inode=33%): [10:58:45] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.052734/1.00, alarm hl:np_load_long=0.918945/1.50, alarm hl:mem_free=14912.000000M/600M, alarm hl:available=1/0 [11:02:45] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.194824/1.95, alarm hl:np_load_avg=2.175781/2.0, alarm hl:mem_free=710.000000M/350M, alarm hl:available=1/0 [11:03:16] Load avg. on willow is WARNING: WARNING - load average: 14.73, 16.67, 15.10 [11:04:45] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [11:07:45] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.467773/1.95, alarm hl:np_load_avg=2.172852/2.0, alarm hl:mem_free=1167.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.467773/2.3, alarm hl:np_load_long=1.955566/2.5, alarm hl:cpu=89.800000/98, alarm hl:mem_free=1167.000000M/200M, alarm hl:available=1/0 [11:09:15] Load avg. 
on willow is OK: OK - load average: 10.43, 14.82, 14.90 [11:11:55] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [11:13:45] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [11:22:45] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.142578/1.10, alarm hl:np_load_long=1.258789/1.55, alarm hl:mem_free=14485.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.142578/1.00, alarm hl:np_load_long=1.258789/1.50, alarm hl:mem_free=14485.000000M/600M, alarm hl:available=1/0 [11:24:52] (assigned) [TS-1369] SSL certificate problem, bad/outdated HTTPS CA <https://jira.toolserver.org/browse/TS-1369> (Marlen Caemmerer) [11:28:45] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:34:46] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [11:48:15] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [11:58:16] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 302503 MB (5% inode=33%): [12:02:45] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.589355/1.95, alarm hl:np_load_avg=1.858399/2.0, alarm hl:mem_free=990.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.589355/2.3, alarm hl:np_load_long=1.550781/2.5, alarm hl:cpu=85.300000/98, alarm hl:mem_free=990.000000M/200M, alarm hl:available=1/0 [12:03:15] Load avg. on willow is WARNING: WARNING - load average: 15.69, 14.38, 12.37 [12:04:05] FC 0/5 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/5:UP: 1 int NOK : CRITICAL [12:04:16] Load avg. 
on willow is OK: OK - load average: 12.06, 13.53, 12.19 [12:04:46] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [12:11:55] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [12:11:55] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [12:28:45] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:34:45] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [12:48:25] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [12:49:54] FC 0/5 on fsw1-n1-oe16-esams.mgmt is OK: FC port 0/5:DOWN:1 UP: OK [12:50:45] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.147949/1.95, alarm hl:np_load_avg=1.731934/2.0, alarm hl:mem_free=1165.000000M/350M, alarm hl:available=1/0 [12:51:45] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [12:54:05] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [12:58:05] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [12:58:15] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 302450 MB (5% inode=33%): [13:09:54] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [13:12:05] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [13:13:15] Load avg. on willow is WARNING: WARNING - load average: 19.19, 14.69, 12.71 [13:14:16] Load avg. 
on willow is OK: OK - load average: 14.10, 13.94, 12.57 [13:28:05] FC 0/11 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [13:28:46] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:34:55] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [13:43:54] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:48:25] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [13:49:45] maintenance for turnera [13:49:48] now [13:52:32] SSH on turnera is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:52:32] SMTP on turnera is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:52:52] / on turnera is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:53:03] NTP on turnera is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:53:23] Free Memory on turnera is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:53:23] Load avg. on turnera is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:53:23] /tmp on turnera is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:53:23] DiskSuite on turnera is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [13:53:33] Environment IPMI on turnera is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
[13:54:53] ethernet 0/1/8 [turnera bge1] on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/8:DOWN: 1 int NOK : CRITICAL [13:54:53] ethernet 0/1/6 [turnera bge0] on asw-oe16-esams.mgmt is CRITICAL: GigabitEthernet0/1/6:DOWN: 1 int NOK : CRITICAL [13:55:03] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [13:56:53] (created) [TS-1372] Maintenance for turnera; Toolserver; Task <https://jira.toolserver.org/browse/TS-1372> (Marlen Caemmerer) [13:58:12] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [14:01:42] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 302355 MB (5% inode=33%): [14:01:53] ethernet 0/1/8 [turnera bge1] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/8:UP:1 UP: OK [14:01:54] ethernet 0/1/6 [turnera bge0] on asw-oe16-esams.mgmt is OK: GigabitEthernet0/1/6:UP:1 UP: OK [14:03:32] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [14:03:54] Load avg. on turnera is OK: OK - load average: 1.99, 0.77, 0.29 [14:03:54] /tmp on turnera is OK: DISK OK - free space: /tmp 11293 MB (99% inode=99%): [14:03:54] Free Memory on turnera is OK: OK - 87.1% (3644868 kB) free. [14:04:13] Environment IPMI on turnera is OK: ok: temperature ok fan ok voltage ok chassis ok [14:04:23] / on turnera is OK: DISK OK - free space: / 48185 MB (67% inode=95%): [14:04:23] SSH on turnera is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [14:04:23] SMTP on turnera is OK: SMTP OK - 0.575 sec. response time [14:04:42] Free Memory on damiana is CRITICAL: CRITICAL - 4.9% (203804 kB) free! 
[14:16:30] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [14:18:52] (resolved) [TS-1372] Maintenance for turnera <https://jira.toolserver.org/browse/TS-1372> (Marlen Caemmerer) [14:19:39] NTP on turnera is OK: NTP OK: Offset 0.003132 secs [14:28:30] FC 0/11 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [14:28:50] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [14:35:10] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [14:37:35] [[Special:Log/newusers]] create * Fekepp * (New user account) [14:55:59] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [14:56:02] Platonides * Re: [Toolserver-l] Spam problem on Toolserver wiki [14:58:50] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [15:02:20] Load avg. on willow is WARNING: WARNING - load average: 15.25, 13.95, 12.08 [15:02:30] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 299945 MB (5% inode=33%): [15:03:19] Load avg. 
on willow is OK: OK - load average: 12.39, 13.17, 11.91 [15:15:50] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [15:16:40] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [15:19:53] (created) [ACCAPP-504] RENDER project tool development; Account Approval; New Account <https://jira.toolserver.org/browse/ACCAPP-504> (Felix Leif Keppmann) [15:23:02] Hydriz Wikipedia * Re: [Toolserver-l] Interwiki bot MMP planning [15:28:30] FC 0/11 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [15:28:50] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:35:12] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [15:36:52] (resolved) [MNT-1199] Configured damiana's and tunera's MGMT-IPs and put them into DNS <https://jira.toolserver.org/browse/MNT-1199> (Marlen Caemmerer) [15:48:54] hello all [15:51:16] hi DaBPunkt [15:51:25] hi DaBPunkt [15:55:52] (created) [TS-1373] Send some SFPs to Haarlem; Toolserver; Task <https://jira.toolserver.org/browse/TS-1373> (Marlen Caemmerer) [15:56:10] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [15:58:50] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [16:02:30] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 299858 MB (5% inode=33%): [16:16:44] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [16:16:45] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [16:21:07] (resolved) [TS-1369] SSL certificate problem, bad/outdated HTTPS CA <https://jira.toolserver.org/browse/TS-1369> (Marlen Caemmerer) [16:21:09] (commented) [TS-1293] Add more fibre cables to the SAN installation for redundancy <https://jira.toolserver.org/browse/TS-1293> (Marlen Caemmerer) 
[16:22:02] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [16:22:55] (commented) [TS-1294] Configure second SAN switch <https://jira.toolserver.org/browse/TS-1294> (Marlen Caemmerer) [16:28:44] FC 0/11 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [16:28:52] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:29:53] (created) [TS-1374] Clean up nagios regarding SAN connections; Toolserver; Task <https://jira.toolserver.org/browse/TS-1374> (Marlen Caemmerer) [16:29:54] (assigned) [TS-1374] Clean up nagios regarding SAN connections <https://jira.toolserver.org/browse/TS-1374> (Marlen Caemmerer) [16:29:58] (resolved) [TS-1139] It takes too much time to start up. <https://jira.toolserver.org/browse/TS-1139> (Marlen Caemmerer) [16:31:52] (assigned) [TS-1371] Add account to cvn and swmtbot groups <https://jira.toolserver.org/browse/TS-1371> (Marlen Caemmerer) [16:31:55] (assigned) [TS-1367] Account renewal for z <https://jira.toolserver.org/browse/TS-1367> (Marlen Caemmerer) [16:32:03] (assigned) [TS-1366] Lost my SSH key pair, please update <https://jira.toolserver.org/browse/TS-1366> (Marlen Caemmerer) [16:35:13] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [16:56:13] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [16:59:44] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [17:01:45] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [17:02:44] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 301869 MB (5% inode=33%): [17:16:44] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [17:16:44] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [17:20:02] Merlijn van Deen * Re: [Toolserver-l] Interwiki bot MMP 
planning [17:22:03] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [17:28:43] FC 0/11 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [17:28:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [17:35:21] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [17:46:53] [[OpenStreetMap]] !M https://wiki.toolserver.org/w/index.php?diff=7187&oldid=6633&rcid=9577 * Kolossos * (+53) (/* projects */ +WIWOSM) [17:56:21] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [18:00:11] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [18:00:11] SMF on willow is OK: OK - all services online [18:03:13] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 301793 MB (5% inode=33%): [18:03:13] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [18:17:13] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [18:29:13] FC 0/11 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [18:29:43] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [18:35:43] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [18:52:11] [[Interwiki bot MMP planning]] ! 
https://wiki.toolserver.org/w/index.php?diff=7188&oldid=7127&rcid=9578 * 2.1.18.231 * (+122) (+HerculeBot) [18:56:42] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [19:00:23] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [19:02:43] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.108398/1.95, alarm hl:np_load_avg=1.948242/2.0, alarm hl:mem_free=1099.000000M/350M, alarm hl:available=1/0 [19:03:13] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 301683 MB (5% inode=33%): [19:03:14] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [19:03:43] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [19:10:55] [[Special:Log/newusers]] create * Mgedawy * (New user account) [19:14:05] [[User:Mgedawy]] !N https://wiki.toolserver.org/w/index.php?oldid=7189&rcid=9580 * Mgedawy * (+363) (New!) [19:15:35] [[User talk:Mgedawy]] !N https://wiki.toolserver.org/w/index.php?oldid=7190&rcid=9581 * Mgedawy * (+26) (Redirected page to [[User:Mgedawy]]) [19:17:23] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [19:22:03] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [19:24:13] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.889649/1.10, alarm hl:np_load_long=1.042969/1.55, alarm hl:mem_free=15119.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.889649/1.00, alarm hl:np_load_long=1.042969/1.50, alarm hl:mem_free=15119.000000M/600M, alarm hl:available=1/0 [19:26:13] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [19:26:33] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [19:27:00] [[Transferring files]] ! 
https://wiki.toolserver.org/w/index.php?diff=7191&oldid=5952&rcid=9582 * 41.254.5.2 * (-2444) () [19:27:22] [[Transferring files]] M https://wiki.toolserver.org/w/index.php?diff=7192&oldid=7191&rcid=9583 * Dispenser * (+2444) (Reverted edits by [[Special:Contributions/41.254.5.2|41.254.5.2]] ([[User talk:41.254.5.2|talk]]) to last revision by [[User:93.32.188.231|93.32.188.231]]) [19:29:13] FC 0/11 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [19:30:43] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [19:35:43] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [19:36:13] [[Interwiki bot MMP planning]] ! https://wiki.toolserver.org/w/index.php?diff=7193&oldid=7188&rcid=9584 * Mgedawy * (+442) (+GedawyBot) [19:41:15] [[Special:Log/newusers]] create * Rosner * (New user account) [19:56:43] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [19:58:13] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.292969/1.10, alarm hl:np_load_long=0.837890/1.55, alarm hl:mem_free=15085.000000M/500M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.292969/1.00, alarm hl:np_load_long=0.837890/1.50, alarm hl:mem_free=15085.000000M/600M, alarm hl:available=1/0 [20:00:14] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [20:01:23] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [20:03:13] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 301578 MB (5% inode=33%): [20:03:22] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [20:17:22] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [20:29:22] FC 0/11 on fsw1-n1-oe16-esams.mgmt is 
CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [20:30:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [20:35:53] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [20:56:53] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [21:01:23] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [21:03:23] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [21:04:23] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 301575 MB (5% inode=33%): [21:07:59] (assigned) [TS-1367] Account renewal for z <https://jira.toolserver.org/browse/TS-1367> (DaB.) [21:08:01] (resolved) [TS-1367] Account renewal for z <https://jira.toolserver.org/browse/TS-1367> (DaB.) [21:17:33] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [21:29:33] FC 0/11 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [21:31:03] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [21:36:12] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [21:57:12] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [22:01:34] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [22:03:33] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [22:05:24] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 301448 MB (5% inode=33%): [22:17:43] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [22:29:00] nacht ts [22:29:43] FC 0/11 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [22:31:13] SMF on turnera is CRITICAL: ERROR - offline: 
svc:/system/cluster/scsymon-srv:default [22:32:24] Load avg. on willow is WARNING: WARNING - load average: 16.59, 15.45, 12.80 [22:33:33] Load avg. on willow is OK: OK - load average: 10.80, 13.91, 12.43 [22:36:13] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [22:51:23] / on wolfsbane is WARNING: DISK WARNING - free space: / 6279 MB (20% inode=93%): [22:57:13] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL [23:01:43] FC 0/12 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/12:UP: 1 int NOK : CRITICAL [23:02:33] Load avg. on willow is WARNING: WARNING - load average: 15.55, 15.40, 13.46 [23:03:33] Load avg. on willow is OK: OK - load average: 12.38, 14.46, 13.25 [23:03:45] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [23:05:24] /aux0 on hemlock is CRITICAL: DISK CRITICAL - free space: /aux0 301350 MB (5% inode=33%): [23:17:44] FC 0/8 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/8:UP: 1 int NOK : CRITICAL [23:29:53] FC 0/11 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/11:UP: 1 int NOK : CRITICAL [23:31:12] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [23:36:13] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default offline: svc:/system/cluster/scsymon-srv:default [23:51:23] / on wolfsbane is WARNING: DISK WARNING - free space: / 6107 MB (20% inode=93%): [23:57:12] FC 0/15 on fsw1-n1-oe16-esams.mgmt is CRITICAL: FC port 0/15:UP: 1 int NOK : CRITICAL