[00:02:02] DaB. * Re: [Toolserver-l] S1 replag [00:02:59] (commented) [MNT-1225] Growing replag on S1 due to a database migration at WMF <https://jira.toolserver.org/browse/MNT-1225> (DaB.) [00:02:59] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [00:18:06] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [00:22:32] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [00:23:57] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.106445/1.75, alarm hl:np_load_avg=1.125000/2.0, alarm hl:mem_free=182.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.106445/1.9, alarm hl:np_load_long=1.127441/2.25, alarm hl:mem_free=182.000000M/200M, alarm hl:available=1/0 [00:24:57] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [00:26:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 789102.000000 [00:27:04] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 789114.000000 [00:27:55] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.274902/1.75, alarm hl:np_load_avg=1.181152/2.0, alarm hl:mem_free=239.000000M/350M, alarm hl:available=1/0 [00:35:57] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [00:37:57] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [00:39:24] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [00:46:57] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [00:48:57] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [00:54:56] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 42005 MB (10% inode=99%): [01:05:56] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 98861 MB (24% inode=99%): [01:26:57] 
s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 792701.000000 [01:28:05] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 792774.000000 [01:36:56] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [01:38:57] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [01:40:23] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [01:46:57] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [01:49:56] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [02:12:58] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.771973/1.75, alarm hl:np_load_avg=1.742676/2.0, alarm hl:mem_free=788.000000M/350M, alarm hl:available=1/0 [02:13:54] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [02:15:04] someone please recommend a very simple python web library/toolkit. Something like CGI.pm for Perl, please? [02:16:21] Dispenser: ^? [02:16:38] python comes preloaded with cgi.py [02:16:52] http://docs.python.org/library/cgi.html [02:17:08] That's what I use, anyways. [02:17:52] It really depends on what you need to do. I rolled my own implementation to be compatible with pywikibot. [02:18:20] I'm having some trouble with an SQL query: http://pastebin.com/AsWExzm6 [02:18:22] thanks, that's good enough [02:18:47] I can't get it to run in a reasonable amount of time with the ORDER BY statement. [02:21:52] Tim1357: Replag is still a week behind [02:22:36] Would replag effect the time it takes the query to finish? [02:22:40] *affect [02:26:13] No, but databases are read-only and your project watchlist tool uses user databases [02:26:38] Yeah, which is why I'm trying to re-write it to use the recentchanges table and not use any user database. 
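For the CGI question above (something like CGI.pm, but for Python), here is a minimal sketch of a CGI-style handler using only the standard library. It parses the query string with `urllib.parse` rather than the `cgi` module recommended in the chat, because that module was removed in Python 3.13. The function name `render_response` and the `name` parameter are made up for illustration.

```python
import html
import os
from urllib.parse import parse_qs


def render_response(query_string):
    """Build a full CGI response (headers plus body) for a query string."""
    params = parse_qs(query_string)
    # parse_qs maps each key to a list of values; take the first one.
    name = params.get("name", ["world"])[0]
    # Escape user input before putting it into HTML.
    body = "<p>Hello, {}!</p>".format(html.escape(name))
    # A CGI response is the headers, a blank line, then the body.
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body


if __name__ == "__main__":
    # The web server passes the query string in the QUERY_STRING variable.
    print(render_response(os.environ.get("QUERY_STRING", "")), end="")
```

Keeping the response in a pure function makes the script testable without a web server; only the last two lines depend on the CGI environment.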
[02:26:56] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.806641/1.75, alarm hl:np_load_avg=1.740234/2.0, alarm hl:mem_free=684.000000M/350M, alarm hl:available=1/0 [02:27:36] EXPLAIN doesn't show any real difference between the two tables. You might want to try a sub-query. [02:27:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 796361.000000 [02:29:04] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 796434.000000 [02:30:42] Tim1357: Likely your query is cached, try SELECT SQL_NO_CACHE in the future [02:32:18] Tim1357: Also, u_dispenser_p.projectbanner (updated weekly) might be a useful fallback [02:33:44] Dispenser: You're awesome. That's exactly what I want. [02:34:46] Dispenser: Does it do category-based wikiprojects too? [02:35:16] No, just project banner unfornately [02:35:39] unfortunately* (god that's a difficult word to spell) [02:37:16] See /home/dispenser/scripts/weekly.sh for details [02:37:56] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [02:39:55] Load avg. on willow is WARNING: WARNING - load average: 15.45, 14.91, 14.00 [02:39:55] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [02:40:22] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [02:45:29] Dispenser: Do you know of a database that lists the number of members of a wikiproject. [02:45:37] Well, I guess I could do a count() on your database. [02:46:29] SELECT pb_title, COUNT(*) FROM u_dispenser_p.projectbanner GROUP BY 1; [02:46:56] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [02:48:38] People need to start documenting useful databases on the wiki [02:49:58] Load avg. 
on willow is OK: OK - load average: 12.81, 14.59, 14.47 [02:49:58] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [02:59:57] /sql on rosemary is WARNING: DISK WARNING - free space: /sql 131716 MB (13% inode=99%): [03:06:56] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.588867/1.10, alarm hl:np_load_long=0.976562/1.55, alarm hl:mem_free=20509.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.588867/1.00, alarm hl:np_load_long=0.976562/1.50, alarm hl:mem_free=20509.000000M/350M, alarm hl:available=1/0 [03:08:58] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [03:28:56] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 800022.000000 [03:29:04] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 800035.000000 [03:34:57] /sql on z-dat-s4-a is WARNING: DISK WARNING - free space: /sql 34890 MB (8% inode=99%): [03:37:58] /sql on z-dat-s4-a is CRITICAL: DISK CRITICAL - free space: /sql 23368 MB (5% inode=99%): [03:37:58] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [03:39:56] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [03:40:23] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [03:43:57] /sql on z-dat-s4-a is OK: DISK OK - free space: /sql 60266 MB (14% inode=99%): [03:47:05] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [03:47:54] /sql on thyme is CRITICAL: DISK CRITICAL - free space: /sql 55821 MB (5% inode=99%): [03:50:57] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [04:28:57] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 803622.000000 [04:29:04] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 
803635.000000 [04:37:56] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [04:39:55] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [04:40:24] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [04:48:04] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [04:51:56] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [05:18:04] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [05:23:57] /sql on thyme is OK: DISK OK - free space: /sql 281093 MB (29% inode=99%): [05:26:04] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 806907 [05:29:05] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 806816.000000 [05:29:56] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 807281.000000 [05:38:56] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [05:39:57] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [05:40:24] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [05:42:32] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [05:49:04] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [05:51:57] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [06:10:56] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.623535/1.75, alarm hl:np_load_avg=1.954101/2.0, alarm hl:mem_free=904.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.623535/1.9, alarm hl:np_load_long=1.499512/2.25, alarm hl:mem_free=904.000000M/200M, alarm hl:available=1/0 [06:14:54] Load avg. 
on willow is WARNING: WARNING - load average: 19.50, 16.55, 13.12 [06:26:04] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 798439 [06:29:05] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 797859.000000 [06:29:55] Load avg. on willow is CRITICAL: CRITICAL - load average: 30.25, 23.86, 19.57 [06:30:56] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 810942.000000 [06:39:54] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [06:40:54] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [06:41:23] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [06:44:54] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.019531/1.00, alarm hl:np_load_long=0.764648/1.50, alarm hl:mem_free=20075.000000M/350M, alarm hl:available=1/0 [06:45:55] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [06:49:05] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [06:52:56] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [06:56:55] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.143555/1.10, alarm hl:np_load_long=0.890625/1.55, alarm hl:mem_free=19971.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.143555/1.00, alarm hl:np_load_long=0.890625/1.50, alarm hl:mem_free=19971.000000M/350M, alarm hl:available=1/0 [07:03:54] Load avg. 
on willow is WARNING: WARNING - load average: 17.09, 17.65, 19.79 [07:11:55] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.197266/1.75, alarm hl:np_load_avg=2.241699/2.0, alarm hl:mem_free=1147.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.197266/1.9, alarm hl:np_load_long=2.388184/2.25, alarm hl:mem_free=1147.000000M/200M, alarm hl:available=1/0 [07:26:04] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 779136 [07:30:04] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 777631.000000 [07:30:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 814541.000000 [07:40:05] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [07:40:55] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [07:41:23] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [07:50:03] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [07:53:55] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [08:03:54] Load avg. on willow is WARNING: WARNING - load average: 16.86, 18.56, 18.36 [08:09:34] Save me a search: what is the geographical location of the TS? 
[08:11:55] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.164551/1.75, alarm hl:np_load_avg=2.226562/2.0, alarm hl:mem_free=794.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.164551/1.9, alarm hl:np_load_long=2.262695/2.25, alarm hl:mem_free=794.000000M/200M, alarm hl:available=1/0 [08:26:05] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 758336 [08:30:04] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 757273.000000 [08:31:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 818202.000000 [08:38:54] (commented) [MNT-1225] Growing replag on S1 due to a database migration at WMF <https://jira.toolserver.org/browse/MNT-1225> (Marlen Caemmerer) [08:40:56] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [08:40:56] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [08:41:23] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [08:44:53] (created) [TS-1342] please remove me from the forward list for root@toolserver.org; Toolserver; Minor Task <https://jira.toolserver.org/browse/TS-1342> (Daniel Kinzler) [08:50:55] Load avg. on willow is CRITICAL: CRITICAL - load average: 28.42, 22.78, 20.18 [08:50:55] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [08:52:55] Load avg. on willow is WARNING: WARNING - load average: 19.50, 21.43, 20.00 [08:54:54] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [08:58:04] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [09:01:56] Load avg. 
on willow is CRITICAL: CRITICAL - load average: 21.16, 20.65, 20.02 [09:12:55] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.813965/1.75, alarm hl:np_load_avg=2.623535/2.0, alarm hl:mem_free=666.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.813965/1.9, alarm hl:np_load_long=2.523438/2.25, alarm hl:mem_free=666.000000M/200M, alarm hl:available=1/0 [09:17:32] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [09:27:04] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 741725 [09:27:54] (assigned) [TS-1340] Cronie jobs not being run intermittently <https://jira.toolserver.org/browse/TS-1340> (Marlen Caemmerer) [09:29:56] (commented) [TS-1340] Cronie jobs not being run intermittently <https://jira.toolserver.org/browse/TS-1340> (Marlen Caemmerer) [09:31:05] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 741024.000000 [09:31:55] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 821801.000000 [09:33:55] Load avg. on willow is CRITICAL: CRITICAL - load average: 24.07, 23.29, 23.06 [09:40:56] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [09:40:56] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [09:42:23] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [09:51:03] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [09:54:55] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [09:55:32] RAID on hyacinth is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. 
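The `ts_rc_age()` values in the Nagios lines above appear to be replication lag in seconds, which is hard to read at this size. A small sketch for turning them into the week/day/hour form that the `@replag` bot prints elsewhere in this log; `format_replag` is a hypothetical helper, not part of any Toolserver tooling.

```python
def format_replag(seconds):
    """Format a lag given in seconds as e.g. '1w 2d 12h 16m 41s'."""
    remaining = int(seconds)
    parts = []
    for label, size in (("w", 604800), ("d", 86400), ("h", 3600), ("m", 60)):
        count, remaining = divmod(remaining, size)
        # Skip leading zero units, but keep zeros once a larger unit is shown.
        if count or parts:
            parts.append("{}{}".format(count, label))
    parts.append("{}s".format(remaining))
    return " ".join(parts)


# The value reported for rosemary at 09:31:55.
print(format_replag(821801.0))  # → 1w 2d 12h 16m 41s
```

So by this point rosemary was a little over nine and a half days behind, while thyme's value (741024) was already shrinking.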
[09:56:04] RAID on hyacinth is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [10:13:55] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.279785/1.75, alarm hl:np_load_avg=3.121094/2.0, alarm hl:mem_free=313.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.279785/1.9, alarm hl:np_load_long=3.047852/2.25, alarm hl:mem_free=313.000000M/200M, alarm hl:available=1/0 [10:15:55] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.356445/1.10, alarm hl:np_load_long=0.750000/1.55, alarm hl:mem_free=20050.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.356445/1.00, alarm hl:np_load_long=0.750000/1.50, alarm hl:mem_free=20050.000000M/350M, alarm hl:available=1/0 [10:20:54] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [10:27:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 729261 [10:31:13] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 729068.000000 [10:32:56] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 825463.000000 [10:33:55] Load avg. 
on willow is CRITICAL: CRITICAL - load average: 26.37, 27.39, 26.50 [10:41:55] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [10:41:55] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [10:42:42] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [10:47:13] Sun Grid Engine execd on willow is CRITICAL: medium-sol@willow in unknown state: longrun-sol@willow in unknown state [10:51:03] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [10:54:54] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [10:55:03] SSH on z-dat-s6-a is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:55:55] SSH on z-dat-s6-a is OK: SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0) [10:57:02] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=14.526855/1.75, alarm hl:np_load_avg=12.093750/2.0, alarm hl:mem_free=305.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=14.526855/1.9, alarm hl:np_load_long=9.131348/2.25, alarm hl:mem_free=305.000000M/200M, alarm hl:available=1/0 [11:27:13] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 724600 [11:31:13] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 724185.000000 [11:33:54] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 829127.000000 [11:34:04] Load avg. 
on willow is CRITICAL: CRITICAL - load average: 36.25, 36.28, 37.91 [11:42:14] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [11:42:23] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [11:42:43] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [11:44:14] Sun Grid Engine execd on ortelius is WARNING: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.039062/1.00, alarm hl:np_load_long=0.628906/1.50, alarm hl:mem_free=21144.000000M/350M, alarm hl:available=1/0 [11:45:14] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [11:51:13] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [11:55:14] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [11:57:23] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.294434/1.75, alarm hl:np_load_avg=3.897949/2.0, alarm hl:mem_free=287.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.294434/1.9, alarm hl:np_load_long=4.288574/2.25, alarm hl:mem_free=287.000000M/200M, alarm hl:available=1/0 [12:18:14] Environment IPMI on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [12:18:54] Environment IPMI on adenia is OK: ok: temperature ok fan ok voltage ok chassis ok [12:27:25] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 714263 [12:31:35] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 712874.000000 [12:34:03] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 832730.000000 [12:34:34] Load avg. 
on willow is CRITICAL: CRITICAL - load average: 25.97, 27.78, 29.18 [12:37:34] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.410156/1.10, alarm hl:np_load_long=0.815430/1.55, alarm hl:mem_free=20583.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=2.410156/1.00, alarm hl:np_load_long=0.815430/1.50, alarm hl:mem_free=20583.000000M/350M, alarm hl:available=1/0 [12:39:35] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [12:42:34] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [12:42:34] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [12:42:53] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [12:50:57] (commented) [TS-1340] Cronie jobs not being run intermittently <https://jira.toolserver.org/browse/TS-1340> (DaB.) [12:51:33] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [12:55:10] hello all [12:55:33] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [12:55:55] (resolved) [TS-1342] please remove me from the forward list for root@toolserver.org <https://jira.toolserver.org/browse/TS-1342> (DaB.) 
[12:57:42] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.322754/1.75, alarm hl:np_load_avg=3.381836/2.0, alarm hl:mem_free=265.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.322754/1.9, alarm hl:np_load_long=3.460938/2.25, alarm hl:mem_free=265.000000M/200M, alarm hl:available=1/0 [13:09:34] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) [13:11:58] (commented) [TS-1340] Cronie jobs not being run intermittently <https://jira.toolserver.org/browse/TS-1340> (Marlen Caemmerer) [13:27:34] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 695817 [13:31:35] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 694978.000000 [13:34:03] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 836333.000000 [13:34:36] Load avg. on willow is CRITICAL: CRITICAL - load average: 23.45, 26.04, 27.20 [13:42:43] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [13:42:43] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [13:43:01] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [13:51:42] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [13:55:43] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [13:57:55] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.013184/1.75, alarm hl:np_load_avg=3.023926/2.0, alarm hl:mem_free=245.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.013184/1.9, alarm hl:np_load_long=3.019043/2.25, alarm hl:mem_free=245.000000M/200M, alarm hl:available=1/0 [14:10:33] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) 
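On the "Return code of 139 is out of bounds" message above: Nagios expects plugin exit codes 0 to 3, and shells conventionally report a process killed by signal N as 128 + N. Under that convention 139 would mean signal 11 (SIGSEGV), so the check plugin on z-dat-s6-a may have been crashing, although nothing in the log confirms this. A minimal sketch of that decoding; `describe_exit_code` is a hypothetical helper.

```python
import signal

# Exit codes a Nagios plugin is expected to return.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def describe_exit_code(code):
    """Explain a plugin exit code, decoding the shell's 128 + N convention."""
    if code in NAGIOS_STATES:
        return NAGIOS_STATES[code]
    if code > 128:
        try:
            return "killed by " + signal.Signals(code - 128).name
        except ValueError:
            pass  # no signal with that number on this platform
    return "out of bounds"


print(describe_exit_code(139))
```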
[14:12:52] (created) [MNT-1227] Re-Import of enwiki; Maintenance; Minor work <https://jira.toolserver.org/browse/MNT-1227> (DaB.) [14:18:55] (updated) [MNT-1227] Re-Import of enwiki <https://jira.toolserver.org/browse/MNT-1227> (DaB.) [14:28:33] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 683111 [14:32:33] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 682049.000000 [14:34:03] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 839934.000000 [14:35:32] Load avg. on willow is CRITICAL: CRITICAL - load average: 22.88, 22.38, 23.45 [14:42:53] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [14:42:53] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [14:43:11] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [14:44:53] (commented) [MNT-1227] Re-Import of enwiki <https://jira.toolserver.org/browse/MNT-1227> (DaB.) [14:51:54] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [14:55:54] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [14:58:54] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.651367/1.75, alarm hl:np_load_avg=2.559570/2.0, alarm hl:mem_free=1004.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.651367/1.9, alarm hl:np_load_long=2.602539/2.25, alarm hl:mem_free=1004.000000M/200M, alarm hl:available=1/0 [15:00:04] /sql on rosemary is WARNING: DISK WARNING - free space: /sql 129471 MB (13% inode=99%): [15:05:33] Load avg. on willow is WARNING: WARNING - load average: 17.54, 18.53, 19.98 [15:10:33] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) [15:10:33] Load avg. 
on willow is CRITICAL: CRITICAL - load average: 26.27, 20.59, 20.26 [15:16:33] Load avg. on willow is WARNING: WARNING - load average: 19.36, 19.64, 19.99 [15:28:34] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 669680 [15:32:33] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 668805.000000 [15:34:13] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 843543.000000 [15:43:14] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [15:43:14] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [15:43:21] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [15:52:14] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [15:56:13] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [15:56:39] [[User talk:Tomtomn00]] ! https://wiki.toolserver.org/w/index.php?diff=6965&oldid=6938&rcid=9171 * Tomtomn00 * (+31) () [15:59:13] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.854980/1.75, alarm hl:np_load_avg=3.381348/2.0, alarm hl:mem_free=1103.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.854980/1.9, alarm hl:np_load_long=2.967285/2.25, alarm hl:mem_free=1103.000000M/200M, alarm hl:available=1/0 [16:00:33] Load avg. on willow is CRITICAL: CRITICAL - load average: 37.62, 29.43, 24.83 [16:10:35] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) [16:14:34] Load avg. on willow is WARNING: WARNING - load average: 16.54, 17.76, 19.92 [16:22:22] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [16:28:34] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 658310 [16:28:34] Load avg. 
on willow is OK: OK - load average: 9.18, 10.79, 14.80 [16:31:41] THAT's an interesting bug! By specifying the database engine MySQL will let you create temporary tables on read-only databases [16:32:34] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 657129.000000 [16:34:22] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 847153.000000 [16:35:23] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.973145/1.75, alarm hl:np_load_avg=1.702149/2.0, alarm hl:mem_free=621.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.973145/1.9, alarm hl:np_load_long=1.827637/2.25, alarm hl:mem_free=621.000000M/200M, alarm hl:available=1/0 [16:35:44] Load avg. on willow is WARNING: WARNING - load average: 24.23, 15.85, 15.36 [16:36:52] It seems the default engine might be InnoDB [16:37:23] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [16:37:43] Load avg. on willow is OK: OK - load average: 11.83, 14.09, 14.77 [16:38:56] * multichill pokes jeremyb [16:41:06] Dispenser: creating temp tables is allowed on read-only, but you must have write access to a database; this has also changed in the latest version (not available on TS) [16:41:40] then you don't even need any grants and can use every temp feature [16:43:24] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [16:43:28] All that I know is temporary tables work now if I specify ENGINE=MyISAM for our locked enwiki_p database. 
Without that I get the 1290 --read-only error [16:43:32] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [16:44:12] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [16:52:22] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [16:55:54] (commented) [TS-1213] Allow using temprorary tables on RR servers <https://jira.toolserver.org/browse/TS-1213> (Dispenser) [16:55:56] (commented) [MAGNUS-311] Catscan broken on plwiki <https://jira.toolserver.org/browse/MAGNUS-311> (Dispenser) [16:57:13] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [17:00:00] @replag [17:00:00] matthewrbowker: s1-rr-a: 1w 2d 19h 44m 56s [+1.00 s/s]; s1-user: 1w 2d 19h 44m 56s [+1.00 s/s]; s2-user: 16s [-0.00 s/s]; s3-rr-a: 23s [-0.00 s/s]; s3-user: 23s [-0.00 s/s]; s6-rr-a: 4h 53m 32s [+0.18 s/s]; s6-user: 4h 53m 32s [+0.18 s/s] [17:06:32] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.464844/1.75, alarm hl:np_load_avg=1.498535/2.0, alarm hl:mem_free=270.000000M/350M, alarm hl:available=1/0 [17:07:32] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [17:11:32] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) [17:12:34] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.602051/1.75, alarm hl:np_load_avg=1.521973/2.0, alarm hl:mem_free=237.000000M/350M, alarm hl:available=1/0 [17:29:32] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 632541 [17:32:43] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 630954.000000 [17:34:34] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 850763.000000 [17:43:32] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [17:43:52] 
FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [17:44:13] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [17:45:43] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.238281/1.10, alarm hl:np_load_long=0.827148/1.55, alarm hl:mem_free=20334.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.238281/1.00, alarm hl:np_load_long=0.827148/1.50, alarm hl:mem_free=20334.000000M/350M, alarm hl:available=1/0 [17:47:43] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [17:52:32] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [17:57:12] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [18:11:54] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) [18:12:32] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.582520/1.75, alarm hl:np_load_avg=1.534668/2.0, alarm hl:mem_free=249.000000M/350M, alarm hl:available=1/0 [18:17:32] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [18:29:42] MySQL slave on thyme is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 613394 [18:32:54] s1 replag on thyme is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 612637.000000 [18:34:43] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 854372.000000 [18:43:43] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [18:44:13] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [18:44:13] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [18:52:43] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [18:53:43] MySQL on thyme is CRITICAL: Cant connect to 
MySQL server on thyme (146) [18:54:57] (commented) [MNT-1227] Re-Import of enwiki <https://jira.toolserver.org/browse/MNT-1227> (DaB.) [18:57:13] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [19:02:43] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.758301/1.75, alarm hl:np_load_avg=1.495605/2.0, alarm hl:mem_free=751.000000M/350M, alarm hl:available=1/0 [19:03:44] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [19:06:48] MySQL on thyme is OK: Uptime: 204 Threads: 2 Questions: 42 Slow queries: 0 Opens: 16 Flush tables: 1 Open tables: 9 Queries per second avg: 0.205 [19:06:48] MySQL slave on thyme is WARNING: No slaves defined [19:08:53] (commented) [MNT-1227] Re-Import of enwiki <https://jira.toolserver.org/browse/MNT-1227> (DaB.) [19:12:48] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=1.911621/1.75, alarm hl:np_load_avg=1.566406/2.0, alarm hl:mem_free=362.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=1.911621/1.9, alarm hl:np_load_long=1.383789/2.25, alarm hl:mem_free=362.000000M/200M, alarm hl:available=1/0 [19:12:48] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) [19:24:52] (commented) [MNT-1227] Re-Import of enwiki <https://jira.toolserver.org/browse/MNT-1227> (DaB.) 
[19:32:58] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Unknown database enwiki_p [19:34:41] very funny nagios… [19:34:50] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 857977.000000 [19:43:57] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [19:44:07] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.526367/1.10, alarm hl:np_load_long=0.792969/1.55, alarm hl:mem_free=19301.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.526367/1.00, alarm hl:np_load_long=0.792969/1.50, alarm hl:mem_free=19301.000000M/350M, alarm hl:available=1/0 [19:44:18] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [19:44:48] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [19:45:07] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [19:52:58] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [19:57:48] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [19:59:58] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.224609/1.75, alarm hl:np_load_avg=1.725098/2.0, alarm hl:mem_free=696.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.224609/1.9, alarm hl:np_load_long=1.435547/2.25, alarm hl:mem_free=696.000000M/200M, alarm hl:available=1/0 [19:59:58] Load avg. 
on willow is WARNING: WARNING - load average: 16.89, 13.96, 11.61 [20:07:00] MySQL slave on thyme is WARNING: No slaves defined [20:09:55] (created) [ACCAPP-484] Account activation for user celicni for Wikimedia projects; Account Approval; New Account <https://jira.toolserver.org/browse/ACCAPP-484> (Goran Obradovic) [20:12:59] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) [20:12:59] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [20:22:59] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [20:26:59] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.317383/1.75, alarm hl:np_load_avg=2.083008/2.0, alarm hl:mem_free=432.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.317383/1.9, alarm hl:np_load_long=2.035156/2.25, alarm hl:mem_free=432.000000M/200M, alarm hl:available=1/0 [20:33:08] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Unknown database enwiki_p [20:35:49] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 861641.000000 [20:37:20] Sun Grid Engine execd on ortelius is WARNING: short-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.148438/1.10, alarm hl:np_load_long=0.709961/1.55, alarm hl:mem_free=19592.000000M/300M, alarm hl:available=1/0: medium-sol@ortelius exceedes load threshold: alarm hl:np_load_short=1.148438/1.00, alarm hl:np_load_long=0.709961/1.50, alarm hl:mem_free=19592.000000M/350M, alarm hl:available=1/0 [20:38:20] Sun Grid Engine execd on ortelius is OK: short-sol@ortelius OK: medium-sol@ortelius OK [20:44:19] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [20:44:28] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [20:44:57] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [20:47:57] RAID on adenia is 
OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [20:53:19] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [20:57:49] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [20:59:18] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [21:00:07] Load avg. on willow is WARNING: WARNING - load average: 11.05, 14.85, 17.27 [21:03:19] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.602539/1.75, alarm hl:np_load_avg=2.462402/2.0, alarm hl:mem_free=414.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.602539/1.9, alarm hl:np_load_long=2.374512/2.25, alarm hl:mem_free=414.000000M/200M, alarm hl:available=1/0 [21:07:00] MySQL slave on thyme is WARNING: No slaves defined [21:11:06] Load avg. on willow is CRITICAL: CRITICAL - load average: 27.04, 22.45, 20.36 [21:13:06] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) [21:25:07] Load avg. 
on willow is WARNING: WARNING - load average: 17.79, 19.30, 19.96 [21:26:52] (created) [TS-1343] s6 is currently not replicating; Toolserver; Bug <https://jira.toolserver.org/browse/TS-1343> (Marlen Caemmerer) [21:28:51] (assigned) [TS-1343] s6 is currently not replicating <https://jira.toolserver.org/browse/TS-1343> (Marlen Caemmerer) [21:33:19] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Unknown database enwiki_p [21:35:59] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 865244.000000 [21:40:29] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [21:44:28] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.173828/1.75, alarm hl:np_load_avg=1.974121/2.0, alarm hl:mem_free=321.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.173828/1.9, alarm hl:np_load_long=2.173340/2.25, alarm hl:mem_free=321.000000M/200M, alarm hl:available=1/0 [21:44:37] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [21:44:58] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [21:45:17] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [21:53:28] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [21:57:57] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [22:03:28] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [22:06:59] MySQL slave on thyme is WARNING: No slaves defined [22:07:53] (commented) [MNT-1225] Growing replag on S1 due to a database migration at WMF <https://jira.toolserver.org/browse/MNT-1225> (Cyberpower678) [22:10:27] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=3.041504/1.75, alarm hl:np_load_avg=2.069336/2.0, alarm hl:mem_free=378.000000M/350M, alarm 
hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=3.041504/1.9, alarm hl:np_load_long=2.015625/2.25, alarm hl:mem_free=378.000000M/200M, alarm hl:available=1/0 [22:13:18] MySQL slave on z-dat-s6-a is CRITICAL: (Return code of 139 is out of bounds) [22:13:27] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [22:13:53] (created) [UTRS-90] Blocked emails, IPs, Usernames: not able to view appeal after block. SQL error returned.; UTRS: Main Interface; Critical Bug <https://jira.toolserver.org/browse/UTRS-90> (DeltaQuad) [22:16:18] Load avg. on willow is OK: OK - load average: 14.16, 13.86, 14.93 [22:27:52] (commented) [TS-1343] s6 is currently not replicating <https://jira.toolserver.org/browse/TS-1343> (Marlen Caemmerer) [22:33:07] RAID on adenia is CRITICAL: CHECK_NRPE: Socket timeout after 30 seconds. [22:33:54] (updated) [UTRS-90] No appeal matching ID causes SQL error <https://jira.toolserver.org/browse/UTRS-90> (DeltaQuad) [22:34:17] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Unknown database enwiki_p [22:36:07] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 868856.000000 [22:37:17] Load avg. 
on willow is WARNING: WARNING - load average: 17.99, 15.41, 13.77 [22:38:18] MySQL slave on z-dat-s6-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2007 [22:39:17] MySQL slave on z-dat-s6-a is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 30328 [22:40:18] MySQL slave on z-dat-s6-a is WARNING: SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2039 [22:42:00] @replag [22:42:01] Thehelpfulone: s1-rr-a: 1w 3d 1h 26m 57s [+1.00 s/s]; s1-user: 1w 3d 1h 26m 57s [+1.00 s/s]; s3-rr-a: 58s [+0.02 s/s]; s3-user: 58s [+0.02 s/s]; s6-rr-a: 8h 5m 49s [-1.68 s/s]; s6-user: 8h 5m 49s [-1.68 s/s] [22:44:48] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [22:45:08] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [22:45:18] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [22:52:38] RAID on adenia is OK: OK - TOTAL: 2: FAILED: 0: DEGRADED: 0 [22:53:38] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [22:54:38] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.078125/1.75, alarm hl:np_load_avg=2.113281/2.0, alarm hl:mem_free=452.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.078125/1.9, alarm hl:np_load_long=2.000977/2.25, alarm hl:mem_free=452.000000M/200M, alarm hl:available=1/0 [22:58:08] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default [23:07:17] MySQL slave on thyme is WARNING: No slaves defined [23:34:31] s1 replag on thyme is CRITICAL: QUERY CRITICAL: Unknown database enwiki_p [23:36:18] s1 replag on rosemary is CRITICAL: QUERY CRITICAL: SELECT ts_rc_age() returned 872464.000000 [23:37:27] Load avg. 
on willow is WARNING: WARNING - load average: 16.79, 16.34, 17.27 [23:40:15] night ts [23:43:47] Sun Grid Engine execd on willow is OK: medium-sol@willow OK: longrun-sol@willow OK [23:44:57] FMA on yarrow is CRITICAL: ERROR - unexpected output from snmpwalk [23:45:18] SMF on willow is CRITICAL: ERROR - maintenance: svc:/network/puppetmasterd:default [23:45:27] SMF on turnera is CRITICAL: ERROR - offline: svc:/system/cluster/scsymon-srv:default [23:49:37] Sun Grid Engine execd on willow is WARNING: medium-sol@willow exceedes load threshold: alarm hl:np_load_short=2.101562/1.75, alarm hl:np_load_avg=1.956055/2.0, alarm hl:mem_free=427.000000M/350M, alarm hl:available=1/0: longrun-sol@willow exceedes load threshold: alarm hl:np_load_short=2.101562/1.9, alarm hl:np_load_long=2.050781/2.25, alarm hl:mem_free=427.000000M/200M, alarm hl:available=1/0 [23:49:53] (commented) [UTRS-90] No appeal matching ID causes SQL error <https://jira.toolserver.org/browse/UTRS-90> (DeltaQuad) [23:53:48] RAID on daphne is CRITICAL: ERROR - TOTAL: 2: FAILED: 0: DEGRADED: 1 [23:58:17] SMF on damiana is CRITICAL: ERROR - maintenance: svc:/network/ldap/client:default