[00:45:13] PROBLEM Current Load is now: WARNING on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: WARNING - load average: 5.46, 5.44, 5.10
[00:50:54] RECOVERY Current Load is now: OK on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: OK - load average: 2.35, 4.66, 4.97
[01:05:02] 01/08/2013 - 01:05:02 - Creating a mountpoint
[01:05:03] 01/08/2013 - 01:05:02 - Failed to mount the key volume /mnt/keys/
[01:05:44] PROBLEM Total processes is now: WARNING on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS WARNING: 169 processes
[01:10:03] 01/08/2013 - 01:10:02 - Failed to mount the key volume /mnt/keys/
[01:15:04] 01/08/2013 - 01:15:02 - Failed to mount the key volume /mnt/keys/
[01:15:42] RECOVERY Total processes is now: OK on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS OK: 92 processes
[01:20:03] 01/08/2013 - 01:20:02 - Failed to mount the key volume /mnt/keys/
[01:25:02] 01/08/2013 - 01:25:02 - Failed to mount the key volume /mnt/keys/
[01:28:26] ryan_lane: how can I learn more about ^^ ?
[01:28:46] (I just merged in the puppetized scripts, fearing this is related)
[01:28:54] I think that is on labs-nfs1
[01:30:03] 01/08/2013 - 01:30:02 - Failed to mount the key volume /mnt/keys/
[01:32:06] Ah, so it /is/ because I replaced the scripts.
[01:32:09] But maybe doesn't matter...
[01:32:27] it doesn't
[01:32:29] I just disabled that
[01:32:30] * andrewbogott backed up the old ones on labstore2, did not think to do so on labs-nfs1
[01:32:50] and removed the bot :)
[01:33:06] great, I was just trying to figure out how to do so.
[01:33:23] I edited the root crontab to remove manage-exports
[01:33:29] and stopped irceco
[01:33:33] *ircecho
[01:33:47] and seconds later I wondered… why isn't manage-exports in the crontab?
[01:34:07] So... I will return to the business of going home.
[01:34:13] :D
[01:55:25] RECOVERY Free ram is now: OK on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: OK: 22% free memory
[03:11:10] !log bots madman: Installed php-pear, HTTP_Request2, Log on bots-2.
[03:11:12] Logged the message, Master
[04:03:24] PROBLEM Free ram is now: WARNING on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: Warning: 19% free memory
[06:05:42] RECOVERY Current Load is now: OK on parsoid-roundtrip7-8core.pmtpa.wmflabs 10.4.1.26 output: OK - load average: 4.65, 4.71, 4.97
[06:32:44] PROBLEM Total processes is now: WARNING on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS WARNING: 156 processes
[06:47:43] RECOVERY Total processes is now: OK on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS OK: 150 processes
[07:52:42] PROBLEM Total processes is now: WARNING on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS WARNING: 152 processes
[09:10:52] PROBLEM Free ram is now: WARNING on aggregator2.pmtpa.wmflabs 10.4.0.193 output: Warning: 19% free memory
[09:58:51] test
[10:05:59] Damianz, jeremyb - I implemented safe shutdown for a bot, so in case you need to restart it, don't kill it from terminal, use this @restart instead, otherwise some files may not be written correctly
[10:10:48] petan: make sure to update the doc somewhere on labsconsole :-]
[10:10:50] !botrestart
[10:10:54] !restartbot
[10:11:18] !botrestart is To restart the bot use the @restart command
[10:11:18] Key was added
[10:11:27] @alias
[10:13:23] yup
[11:23:22] RECOVERY Free ram is now: OK on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: OK: 36% free memory
[12:05:43] PROBLEM Total processes is now: WARNING on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS WARNING: 152 processes
[12:10:43] RECOVERY Total processes is now: OK on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS OK: 150 processes
[12:36:44] PROBLEM Total processes is now: WARNING on parsoid-spof.pmtpa.wmflabs 10.4.0.33 output: PROCS WARNING: 152 processes
[14:12:40] <^demon> Gerrit's going to be unavailable for just a short bit--rebooting manganese.
[14:41:14] !tunnel
[14:41:15] ssh -f user@bastion.wmflabs.org -L :server: -N Example for sftp "ssh chewbacca@bastion.wmflabs.org -L 6000:bots-1:22 -N" will open bots-1:22 as localhost:6000
[14:44:09] !ping
[14:44:09] pong
[14:48:23] !petan
[14:48:23] Petr Bena - http://enwp.org/User:Petrb
[14:48:28] haha
[14:48:38] :o
[14:49:48] My bots are such a mess!!! I use pywikipedia for some, wikibotclases for others, peachy for others ....
[15:33:44] PROBLEM Free ram is now: WARNING on dumps-bot1.pmtpa.wmflabs 10.4.0.4 output: Warning: 19% free memory
[16:08:03] PROBLEM Free ram is now: CRITICAL on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: Connection refused by host
[16:09:23] PROBLEM Total processes is now: CRITICAL on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: Connection refused by host
[16:10:54] PROBLEM Current Load is now: CRITICAL on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: Connection refused by host
[16:10:56] PROBLEM dpkg-check is now: CRITICAL on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: Connection refused by host
[16:11:34] PROBLEM Current Users is now: CRITICAL on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: Connection refused by host
[16:12:14] PROBLEM Disk Space is now: CRITICAL on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: Connection refused by host
[16:14:25] PROBLEM Total processes is now: WARNING on vumi-metrics.pmtpa.wmflabs 10.4.1.13 output: PROCS WARNING: 151 processes
[16:19:23] RECOVERY Total processes is now: OK on vumi-metrics.pmtpa.wmflabs 10.4.1.13 output: PROCS OK: 141 processes
[16:20:52] RECOVERY Current Load is now: OK on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: OK - load average: 1.08, 1.20, 0.76
[16:20:52] RECOVERY dpkg-check is now: OK on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: All packages OK
[16:21:32] RECOVERY Current Users is now: OK on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: USERS OK - 0 users currently logged in
[16:22:12] RECOVERY Disk Space is now: OK on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: DISK OK
[16:23:02] RECOVERY Free ram is now: OK on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: OK: 856% free memory
[16:24:22] RECOVERY Total processes is now: OK on mwang-dev.pmtpa.wmflabs 10.4.1.14 output: PROCS OK: 94 processes
[16:54:23] PROBLEM host: mwang-devel.pmtpa.wmflabs is DOWN address: 10.4.1.61 CRITICAL - Host Unreachable (10.4.1.61)
[16:59:53] PROBLEM host: mwang-dev1.pmtpa.wmflabs is DOWN address: 10.4.1.67 CRITICAL - Host Unreachable (10.4.1.67)
[17:05:13] RECOVERY host: mwang-dev1.pmtpa.wmflabs is UP address: 10.4.1.67 PING OK - Packet loss = 0%, RTA = 0.87 ms
[17:05:53] PROBLEM Current Load is now: CRITICAL on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: Connection refused by host
[17:05:53] PROBLEM dpkg-check is now: CRITICAL on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: Connection refused by host
[17:06:33] PROBLEM Current Users is now: CRITICAL on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: Connection refused by host
[17:07:13] PROBLEM Disk Space is now: CRITICAL on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: Connection refused by host
[17:08:04] PROBLEM Free ram is now: CRITICAL on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: Connection refused by host
[17:09:23] PROBLEM Total processes is now: CRITICAL on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: Connection refused by host
[17:19:22] RECOVERY Total processes is now: OK on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: PROCS OK: 84 processes
[17:20:54] RECOVERY Current Load is now: OK on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: OK - load average: 0.19, 0.82, 0.73
[17:20:54] RECOVERY dpkg-check is now: OK on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: All packages OK
[17:21:34] RECOVERY Current Users is now: OK on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: USERS OK - 0 users currently logged in
[17:22:14] RECOVERY Disk Space is now: OK on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: DISK OK
[17:23:04] RECOVERY Free ram is now: OK on mwang-dev1.pmtpa.wmflabs 10.4.1.67 output: OK: 856% free memory
[17:57:53] petan, ? :)
[17:58:33] ignore me :)
[19:21:12] petan: which bot? why can't you just trap SIGINT or SIGTERM?
[19:24:57] andrewbogott: morning :-] I have commented on your log rotate file for GlusterFS log files https://gerrit.wikimedia.org/r/42796
[19:25:21] andrewbogott: might want to add a few more options in the log rotate conf. Also the summary says 3 weeks are kept but the conf suggest 2 days :-]
[19:32:17] hashar: Thanks! And, d'oh, I was testing with 'days' so I could see immediate results, forgot to update of course.
[19:41:59] andrewbogott: I think logrotated has a dry run option. You can have it run specifying a conf file + a target directory
[19:42:03] and it shows you what it would do
[19:42:20] a nice way to debug it out and find out whether the conf is working as intended.
[19:42:35] hashar: Yep, I figured that out after changing 'days' to 'weeks' but nonetheless forgot to switch it back.
[19:56:10] Ryan_Lane, any idea what I can do to encourage the gluster client to notice new logfiles?
[19:57:03] hm
[19:57:11] I'd say ask in the #gluster channel?
[19:57:16] maybe a HUP would do it
[19:57:24] I hope so, because otherwise I have no clue
[20:45:18] !tunnel
[20:45:19] ssh -f user@bastion.wmflabs.org -L :server: -N Example for sftp "ssh chewbacca@bastion.wmflabs.org -L 6000:bots-1:22 -N" will open bots-1:22 as localhost:6000
[20:51:53] PROBLEM Free ram is now: CRITICAL on aggregator1.pmtpa.wmflabs 10.4.0.79 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:56:52] PROBLEM Free ram is now: WARNING on aggregator1.pmtpa.wmflabs 10.4.0.79 output: Warning: 9% free memory
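
Note on the [19:21:12] question about trapping SIGINT or SIGTERM: the log does not say how the bot's @restart is implemented, so the following is only a rough sketch of what signal-based safe shutdown could look like in Python. The names `ShutdownFlag` and `run`, and the `step`/`flush` callbacks, are hypothetical and not taken from any bot mentioned here.

```python
import signal


class ShutdownFlag:
    """Records that a termination signal arrived so the main loop can exit cleanly."""

    def __init__(self):
        self.requested = False
        self.signum = None

    def handle(self, signum, frame):
        # Keep the handler tiny: only note the request, do the real work later.
        self.requested = True
        self.signum = signum

    def install(self, signums=(signal.SIGINT, signal.SIGTERM)):
        # signal.signal() may only be called from the main thread.
        for signum in signums:
            signal.signal(signum, self.handle)


def run(step, flush, shutdown, max_steps=None):
    """Call step() until shutdown is requested, then call flush() once."""
    done = 0
    try:
        while not shutdown.requested and (max_steps is None or done < max_steps):
            step()
            done += 1
    finally:
        # Hypothetical hook where pending files would be written out.
        flush()
    return done
```

With `install()` called at startup, a plain `kill` (SIGTERM) or Ctrl-C (SIGINT) would let the current step finish and then run `flush()`; SIGKILL cannot be trapped this way.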
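
Note on the !tunnel reply: the sftp example forwards a local port through the bastion to port 22 on bots-1. As an illustration only, the same command line can be assembled programmatically; `tunnel_command` is a hypothetical helper, and the argument order simply mirrors the bot's example.

```python
def tunnel_command(user, local_port, target_host, target_port,
                   bastion="bastion.wmflabs.org", background=False):
    """Build the argv for an SSH local port forward through a bastion host."""
    for port in (local_port, target_port):
        if not 0 < int(port) < 65536:
            raise ValueError(f"port out of range: {port}")
    argv = ["ssh"]
    if background:
        argv.append("-f")  # go to the background once the session is set up
    argv += [
        f"{user}@{bastion}",
        # local_port on this machine is forwarded to target_host:target_port
        "-L", f"{local_port}:{target_host}:{target_port}",
        "-N",  # run no remote command, only forward
    ]
    return argv
```

`tunnel_command("chewbacca", 6000, "bots-1", 22)` reproduces the bot's example, and the list could be passed to `subprocess.run`. Once the tunnel is up, something like `sftp -P 6000 localhost` should reach bots-1.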
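
Note on the [19:41:59] dry-run suggestion: logrotate has a debug mode (`-d`) that reports what it would do without rotating anything. A minimal sketch of wrapping it follows; the helper names are hypothetical and any conf path passed in is a placeholder, not the file from the Gerrit change.

```python
import subprocess


def logrotate_dry_run_argv(conf_path):
    """Argv for logrotate in debug mode, which reports actions without rotating."""
    return ["logrotate", "-d", str(conf_path)]


def dry_run(conf_path, runner=subprocess.run):
    """Run the dry run and return everything logrotate reported."""
    # The runner is injectable so the wrapper can be exercised without logrotate.
    result = runner(logrotate_dry_run_argv(conf_path),
                    capture_output=True, text=True)
    # Return both streams so nothing logrotate reports is lost.
    return result.stdout + result.stderr
```

Reading that output is one way to catch a mismatch like the one hashar spotted, where the summary promises 3 weeks of retention but the conf keeps 2 days.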