[00:02:42] ssmollett: I'll still be on IRC if you need anything
[01:22:56] fyi, i might break labs puppet right now :)
[01:39:53] PROBLEM Current Load is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:40:43] PROBLEM Current Users is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:03] PROBLEM dpkg-check is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:03] PROBLEM Disk Space is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:03] PROBLEM Free ram is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:28] PROBLEM Free ram is now: CRITICAL on puppet-lucid puppet-lucid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:33] PROBLEM Total Processes is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:43] PROBLEM Current Load is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:42:13] PROBLEM Free ram is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:42:33] PROBLEM Current Users is now: CRITICAL on puppet-lucid puppet-lucid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:43:43] PROBLEM Total Processes is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:44:03] PROBLEM dpkg-check is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:44:23] PROBLEM Total Processes is now: CRITICAL on puppet-lucid puppet-lucid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:44:43] PROBLEM Current Users is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:45:03] PROBLEM Current Load is now: CRITICAL on puppet-lucid puppet-lucid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:45:13] PROBLEM Disk Space is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:46:03] PROBLEM Current Load is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:46:53] PROBLEM Total Processes is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:47:13] PROBLEM Free ram is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:47:23] * Damianz runs around and stabs network stuff
[01:48:13] PROBLEM dpkg-check is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:49:43] PROBLEM Disk Space is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:50:13] PROBLEM Current Users is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:50:33] PROBLEM Current Users is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:50:33] PROBLEM Current Load is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
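The flood above — every service on a host going CRITICAL with "CHECK_NRPE: Error - Could not complete SSL handshake." at once — means Nagios cannot talk to the NRPE agent at all, not that TLS itself is broken; the usual cause (confirmed later in this log) is the monitoring server missing from NRPE's allowed_hosts. A quick way to reproduce the symptom by hand from the Nagios host, assuming the stock Ubuntu plugin path:

```sh
# Run on the Nagios server. A reachable, correctly configured NRPE daemon
# replies with its version string; an allowed_hosts mismatch reproduces the
# SSL-handshake error quoted in the alerts above.
/usr/lib/nagios/plugins/check_nrpe -H hugglewiki

# Separate the network question from the config question: is anything
# even listening on NRPE's default port?
nc -zv hugglewiki 5666
```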
[01:51:03] PROBLEM Free ram is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:23] PROBLEM Current Users is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:23] PROBLEM Total Processes is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:33] PROBLEM Free ram is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:48] PROBLEM Disk Space is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:48] PROBLEM Current Users is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:53] PROBLEM dpkg-check is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:52:03] PROBLEM Free ram is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:52:03] PROBLEM Total Processes is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:52:08] PROBLEM dpkg-check is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:52:23] PROBLEM Disk Space is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:52:53] PROBLEM dpkg-check is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:03] PROBLEM Current Users is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:13] PROBLEM Current Load is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:23] PROBLEM Disk Space is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:33] PROBLEM Disk Space is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:33] PROBLEM Current Load is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:43] PROBLEM dpkg-check is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:54:03] PROBLEM Free ram is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:54:23] PROBLEM Current Load is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:54:33] PROBLEM Total Processes is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:54:43] PROBLEM Total Processes is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[02:25:37] !project jenkins
[02:25:38] https://labsconsole.wikimedia.org/wiki/Nova_Resource:jenkins
[02:45:31] @regsearch *
[02:56:23] There it is :)
[02:56:26] (Wasn't me btw)
[02:56:27] @whoami
[02:56:27] You are trusted identified by name .*@wikipedia/.*
[02:56:27] lol
[02:56:36] !wm-bot
[02:56:36] http://meta.wikimedia.org/wiki/WM-Bot
[02:57:10] Ohhh... it's on apache1?
[02:57:23] lulz
[02:57:35] @search *
[02:57:36] Results (found 2): password, help,
[02:58:24] !gettingstarted is Welcome to Wikimedia Labs! Get yourself started at https://labsconsole.wikimedia.org/wiki/Getting_started
[02:58:24] Key was added!
[02:58:43] !gettingstarted | Hydriz
[02:58:43] Hydriz: Welcome to Wikimedia Labs! Get yourself started at https://labsconsole.wikimedia.org/wiki/Getting_started
[02:58:49] ah great
[02:59:28] !help
[02:59:28] want docs? ask for "!wm-bot". all keywords? try "@regsearch .*"
[02:59:36] !docs
[02:59:59] !docs is View complete documentation at https://labsconsole.wikimedia.org/wiki/Help:Contents
[02:59:59] Key was added!
[03:00:58] !docs | Hydriz
[03:00:58] Hydriz: View complete documentation at https://labsconsole.wikimedia.org/wiki/Help:Contents
[03:01:01] :)
[03:01:10] lulz
[03:01:17] can you help make documentation? :P
[03:01:34] Not right now, it's 3AM UK time... I need some sleep :P
[03:01:45] lol
[03:01:49] I just woke up haha
[03:01:54] Ooer
[03:09:20] 3AM? best time for working!
[03:09:32] Wow... beetstra's unblockbot.pl is slowly eating all the RAM on bots-2 >.>
[03:10:05] Hence why nagios is having a go it at
[03:10:09] at it*
[03:16:33] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 26% free memory
[03:19:58] !log incubator Importing incubatorwiki dump of 20120120 into prefixexport's enwiki
[03:19:59] Logged the message, Master
[03:22:05] !log bots Update packages on all bots instances (excluding apache1 which was done on the 23rd)
[03:22:07] Logged the message, Master
[03:22:22] Beetstra: PM?
[03:22:58] hi methecooldude
[03:23:32] what's up?
[03:24:42] eh, is it technically possible to create m1.medium?
[03:24:51] seems like it always fails for me
[03:26:33] PROBLEM dpkg-check is now: CRITICAL on bots-2 bots-2 output: DPKG CRITICAL dpkg reports broken packages
[03:26:58] Shut it nagios... it hasn't finished yet!
[03:31:33] RECOVERY dpkg-check is now: OK on bots-2 bots-2 output: All packages OK
[03:32:18] Hydriz: Yea, might need more hardware which isn't there yet
[03:32:23] PROBLEM Free ram is now: WARNING on bots-cb bots-cb output: Warning: 19% free memory
[03:32:36] I see
[03:32:41] then can I ask you something?
[03:32:47] Sure
[03:32:58] How do you guys mount filesystems across instances?
[03:33:11] something like what NFS share and things
[03:33:18] which I don't quite understand
[03:33:34] Hydriz: I don't have a clue, you will need to ask either Ryan or petrb
[03:33:41] I see
[03:33:48] Sorry... petan*
[03:34:05] The bots instances has this feature
[03:34:13] which I tried to figure out how to do
[03:34:21] but failed epically
[03:36:17] Hydriz: Ah... I see, you would get a failed message anyway, since you are not a sysadmin in the bots project, but even I'm getting the same message
[03:36:31] nono
[03:36:36] I did that on my own project
[03:36:43] Oh right
[03:37:11] I can't sudo in the bots project lol
[03:40:00] Bloody ClueBot 3! Eating 60% RAM
[03:40:12] !log incubator Created new instance incubator-nfs for Incubator file storage, with s1.large setup
[03:40:13] Logged the message, Master
[03:41:14] !log bots (bots-cb) Restarting ClueBot 3... how much RAM do you need!
[03:41:16] Logged the message, Master
[03:42:47] lol we seem to be competing in SAL
[03:42:59] Hydriz: Hehe
[03:43:44] PROBLEM Current Load is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[03:43:49] ...
[03:43:53] I just ran puppet
[03:44:24] PROBLEM Current Users is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
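Several episodes in this log ("I just ran puppet" above, "Rerunning puppet ... to keep nagios quiet" below) come down to forcing an immediate puppet agent run instead of waiting for the scheduled one. A one-liner, assuming the 2.x-era agent that labs instances ran at the time:

```sh
# Force an immediate puppet run on the instance, with verbose output:
sudo puppet agent --test
```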
[03:44:45] !log incubator Rerunning puppet on incubator-nfs to keep nagios quiet
[03:44:45] Logged the message, Master
[03:45:09] PROBLEM Disk Space is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[03:45:24] grr
[03:45:44] PROBLEM Free ram is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[03:47:04] PROBLEM Total Processes is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[03:47:29] !log incubator I think Hydriz broke it :P
[03:47:29] Logged the message, Master
[03:47:34] PROBLEM dpkg-check is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[03:47:34] RECOVERY Free ram is now: OK on bots-cb bots-cb output: OK: 78% free memory
[03:47:45] god damn lulz
[03:50:11] Hydriz: http://www.siamkia.com/open-source-help/how-to-fix-check-nrpe-error-could-not-complete-ssl-handshake.html ?
[03:50:41] hmm
[03:51:11] it was always automatic
[03:51:50] * Hydriz feels evil to just leave it as it is
[03:55:06] Hydriz: sudo nano /etc/nagios/nrpe_local.cfg and check that "allowed_hosts=10.4.0.34"
[03:55:24] * Hydriz checks
[03:55:42] its blank
[03:55:50] Ooer
[03:56:24] Ok, fill that file with http://privatepaste.com/189014eacc then
[03:56:29] and its blank for my other instances
[03:56:51] Has Nagios alerted about the others as well?
[03:57:06] not really
[03:57:32] Bet it will if you made a Puppet change on them :P
[03:57:42] But don't test that
[03:57:52] done
[03:58:06] Now just wait... Nagios will re-check soon
[03:58:34] okie
[03:58:43] What instance is it?
[03:58:56] prefixexport, deployment and incubator-nfs
[03:59:09] Yea, Nagios is all red for them
[03:59:28] http://nagios.wmflabs.org/nagios3/ - Click Service Problems
[04:02:33] FARK how to mount nfs...
[04:03:06] omg I just feel like killing the server
[04:03:41] Hydriz: Reboot it first...
[04:03:48] oh wait
[04:03:50] Then kill it :P
[04:03:54] I think I noticed someting
[04:03:57] *something
[04:03:58] What?
[04:04:04] some /etc/export
[04:04:09] *exports
[04:04:34] Heh, that's not on bos-cb
[04:04:37] bots-cb
[04:04:46] more like bots-nfs
[04:04:52] thats the global file server for bots
[04:04:56] Ah
[04:05:02] yes
[04:05:03] I see
[04:06:07] nope, still failed
[04:06:08] haiz
[04:06:58] oh, there is a package to install
[04:09:57] access denied...
[04:15:47] Hydriz: where did you get access denied?
[04:15:58] from deployment
[04:16:13] I am trying to get my deployment instance access to my incubator-nfs
[04:16:36] I have installed nfs-kernel-server
[04:16:44] set up /etc/exports
[04:16:54] still denied
[04:19:25] seems like I need to reboot
[04:20:44] RECOVERY Free ram is now: OK on incubator-nfs incubator-nfs output: OK: 90% free memory
[04:20:55] YES
[04:20:58] finally got it
[04:21:07] so it just lacks reboot
[04:21:14] sorry for the trouble people!
[04:21:28] * Hydriz deserves a slap
[04:22:04] RECOVERY Total Processes is now: OK on incubator-nfs incubator-nfs output: PROCS OK: 109 processes
[04:22:15] Hydriz: See, I said reboot :P
[04:22:34] RECOVERY dpkg-check is now: OK on incubator-nfs incubator-nfs output: All packages OK
[04:22:36] lulz
[04:23:00] i think reboot wasn't necessary
[04:23:25] worst case you would have needed to reload some kernel modules. but even that's extreme
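The fix methecooldude dictates at 03:55 is worth spelling out, since the same problem recurs all evening: NRPE only accepts connections from addresses listed in allowed_hosts, and a blank nrpe_local.cfg means the Nagios server (10.4.0.34 here) gets its handshake refused. A minimal sketch — the original privatepaste contents are gone, so anything beyond the allowed_hosts line would be an assumption:

```sh
# /etc/nagios/nrpe_local.cfg -- local NRPE overrides on the instance.
# 10.4.0.34 is the labs Nagios server quoted in this log.
allowed_hosts=10.4.0.34
```

The daemon only reads this file at startup, so the change also needs a restart of nagios-nrpe-server (the exact command comes up at 07:19 below) before the checks go green.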
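For the NFS saga just concluded (04:02-04:21), the three moving parts are the export on the server, reloading the export table, and the mount on the client. A sketch of the setup Hydriz ends up with — /mnt/1 and the client mount point /1 come from the !log entry further down, while the network range on the export line is purely illustrative:

```sh
# --- on incubator-nfs (the server) ---
sudo apt-get install nfs-kernel-server   # the package Hydriz found he was missing

# Export /mnt/1 to the client network (the CIDR here is hypothetical):
echo '/mnt/1 10.4.0.0/24(rw,sync,no_subtree_check)' | sudo tee -a /etc/exports
sudo exportfs -ra                        # reload the export table; no reboot needed
sudo showmount -e localhost              # verify the export is actually live

# --- on each client (prefixexport, deployment) ---
sudo apt-get install nfs-common
sudo mkdir -p /1
sudo mount -t nfs incubator-nfs:/mnt/1 /1
```

The "access denied" followed by "fixed after a reboot" pattern is the classic sign that /etc/exports was edited but exportfs was never run; the reboot only helps because the init script reloads the export table on the way up, which matches the "reboot wasn't necessary" verdict above. (On the df oddity that follows at 04:30: a 237G filesystem showing 225G available with only 188M used is consistent with ext's default 5% reserved-blocks allowance, adjustable with tune2fs -m.)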
[04:23:44] RECOVERY Current Load is now: OK on incubator-nfs incubator-nfs output: OK - load average: 0.46, 0.17, 0.06
[04:24:09] this nfs host is just create
[04:24:12] *created
[04:24:18] so rebooting isn't much of an issue
[04:24:24] RECOVERY Current Users is now: OK on incubator-nfs incubator-nfs output: USERS OK - 1 users currently logged in
[04:24:24] unlike my other instances
[04:25:04] RECOVERY Disk Space is now: OK on incubator-nfs incubator-nfs output: DISK OK
[04:25:15] * Hydriz feels like he just went around the world just to mount that drive
[04:30:37] Seems weird:
[04:30:38] Filesystem Size Used Avail Use% Mounted on
[04:30:38] /dev/vdb 237G 188M 225G 1% /mnt
[04:31:02] Size of 237GB and has only 225GB available
[04:31:28] when only 188MB used
[04:34:02] !log incubator Mounted incubator-nfs:/mnt/1 onto /1 of prefixexport and deployment instances
[04:34:03] Logged the message, Master
[05:44:35] Hydriz: did you really name your instance deployment?
[06:04:14] johnduhart: Yes, by accident :P
[06:05:18] but I haven't done anything to it yet, so I can delete and recreate with a different name
[06:09:21] !log incubator Deleting instance deployment as the name is too generic, may conflict with Deployment-prep project
[06:09:22] Logged the message, Master
[06:17:55] Did labs console just die out?
[06:24:54] PROBLEM host: incubator-dep is DOWN address: incubator-dep CRITICAL - Host Unreachable (incubator-dep)
[06:44:54] yeah, should be like that
[06:44:55] does it timeout via the web?
[06:44:56] or give an error?
[06:44:58] no, it just loads continuously
[06:44:58] no error
[06:44:58] no webpage
[06:44:59] oh, were you using a socks proxy?
[06:45:19] should be
[06:45:32] something like accessing instance from localhost:8080 thing
[06:45:36] are you still connected via ssh?
[06:45:41] yes
[06:45:59] turn the proxy off in your browser, then try labsconsole again
[06:46:12] I either use two browsers, or use foxyproxy in firefox
[06:46:49] because, when you use the proxy, it sends your browser requests through the proxy
[06:47:23] no, still doesn't work
[06:47:38] I even closed all ssh sessions
[06:47:50] its quite random
[06:47:57] just got it after I created a new instance
[06:48:56] and the problem seems to be isolated to my connection to labsconsole
[06:49:00] everything else works
[06:50:39] try this: telnet labsconsole.wikimedia.org 443
[06:51:21] Trying 208.80.153.135...
[06:51:21] Connected to virt0.wikimedia.org.
[06:51:21] Escape character is '^]'.
[06:51:32] then...
[06:51:33] Connection closed by foreign host.
[06:51:43] have you tried another browser?
[06:51:56] not yet
[06:52:00] trying now
[06:52:17] and it loads trololol
[06:52:29] my system is trolling me
[06:52:33] heh
[06:52:47] Firefox works
[06:52:51] Chrome doesn't load
[07:03:49] yeah, I don't have issues with my other instances, sigh
[07:03:49] ah, lemme check that
[07:04:10] * Hydriz feels like he is the troublemaker here
[07:04:38] no route to host
[07:04:59] hm
[07:06:13] Hydriz: did you delete it and re-create it?
[07:06:24] for this instance?
[07:06:26] yes
[07:06:39] I deleted the instance deployment and created under a different name
[07:06:47] which is this
[07:07:01] and yes it encountered an error earlier
[07:07:06] ah. ok
[07:07:09] so had to recreated
[07:07:14] *recreate
[07:07:26] when you delete an instance and recreate it, the dns takes a while to purge
[07:07:35] there's a one hour ttl on the entry
[07:07:48] when the instance is recreated, it gets a different IP address
[07:07:50] so it will take one hour to purge?
[07:07:57] I can do it manually really quick
[07:08:10] @search gdfg
[07:08:11] No results found! :|
[07:08:12] thanks :)
[07:08:17] @regsearch ..
[07:08:18] Results (found 84): instance, morebots, git, bang, nagios, bot, labs-home-wm, labs-nagios-wm, labs-morebots, gerrit-wm, wiki, labs, extension, wm-bot, putty, gerrit, change, revision, monitor, alert, password, unicorn, help, bz, os-change, instancelist, instance-json, leslie's-reset, damianz's-reset, amend, credentials, queue, sal, info, security, logging, ask, sudo, access, $realm, keys, $site, bug, pageant, blueprint-dns, bots, stucked, rt, pxe, ghsh, group, pathconflict, terminology, etherpad, epad, nova-resource, pastebin, newgrp, osm-bug, Ryan, bastion, ryanland, afk, test, initial-login, account-questions, manage-projects, rights, new-labsuser, cs, puppet, new-ldapuser, projects, quilt, labs-project, openstack-manager, wikitech, load, load-all, socks-proxy, wl, domain, gettingstarted, docs,
[07:08:21] @regsearch *
[07:08:21] This regex is totally bad
[07:08:26] Hydriz: ^
[07:08:27] :P
[07:08:36] omg I did that query
[07:08:38] fixed
[07:08:44] and I crashed the bot
[07:08:46] should work now
[07:08:47] yes
[07:08:54] thanks!
[07:08:57] :D
[07:09:06] lol no
[07:09:09] still exists
[07:09:17] sorry. purged in the wrong order
[07:09:18] it works now
[07:09:56] yep
[07:09:57] thanks!
[07:10:38] yw
[07:10:45] oh yes
[07:11:02] is it possible to change the instance from, say, m1.small to m1.medium?
[07:11:24] RECOVERY host: incubator-dep is UP address: incubator-dep PING OK - Packet loss = 0%, RTA = 0.75 ms
[07:11:29] nope
[07:11:33] I see
[07:11:36] resizing doesn't currently work
[07:11:42] possibly in the future
[07:11:49] because m1.medium breaks everything
[07:11:53] how so?
[07:11:57] like, we just can't create it
[07:12:02] really? what happens?
[07:12:12] not sure
[07:12:17] something about ruby's download
[07:12:27] eh?
[07:12:37] then wants you to run apt-get with some fixing parameter
[07:12:54] I can't really recall what happens
[07:13:01] but I know it stops in mid air
[07:13:04] ah. the ruby bug
[07:13:14] it isn't the instance type causing it
[07:13:30] it occasionally happens on any instance type
[07:13:35] and I haven't tracked it down yet
[07:13:40] usually deleting and recreating works
[07:13:50] but it always affect me creating m1.medium
[07:13:59] let me test
[07:14:00] so I want medium but always fail to
[07:14:14] why do you need medium?
[07:14:24] PROBLEM Current Users is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:14:30] (I'm looking into it, just curious)
[07:14:45] yeah, I am curious :P
[07:15:04] PROBLEM Disk Space is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:15:07] I haven't had issues with mediums in the past
[07:15:44] PROBLEM Free ram is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:16:32] hm. sure enough I get that error
[07:17:04] yeah
[07:17:07] that makes no sense
[07:17:19] obstructs the path to medium :P
[07:17:24] PROBLEM Total Processes is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:17:31] trying again :)
[07:17:34] PROBLEM dpkg-check is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:17:47] nagios: be quiet
[07:18:09] restart nagios-nrpe-server service
[07:18:50] any syntax in doing so?
[07:19:04] PROBLEM Current Load is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:19:12] /etc/init.d/nagios-nrpe-server restart
[07:19:16] and again it failed.
[07:19:19] that makes no sense
[07:19:21] heh
[07:20:04] RECOVERY Disk Space is now: OK on incubator-dep incubator-dep output: DISK OK
[07:20:15] RECOVERY Current Load is now: OK on prefixexport prefixexport output: OK - load average: 0.05, 0.24, 0.35
[07:20:35] RECOVERY Free ram is now: OK on incubator-dep incubator-dep output: OK: 91% free memory
[07:20:45] RECOVERY Current Users is now: OK on prefixexport prefixexport output: USERS OK - 4 users currently logged in
[07:20:46] I wonder if larges build
[07:20:57] 4 users?
[07:21:00] oh yes
[07:21:05] RECOVERY Free ram is now: OK on prefixexport prefixexport output: OK: 30% free memory
[07:21:07] I am running 4 terminals
[07:21:10] forgot
[07:22:11] large failed too
[07:22:13] wtf
[07:22:25] RECOVERY Disk Space is now: OK on prefixexport prefixexport output: DISK OK
[07:22:25] RECOVERY Total Processes is now: OK on incubator-dep incubator-dep output: PROCS OK: 84 processes
[07:22:35] RECOVERY Total Processes is now: OK on prefixexport prefixexport output: PROCS OK: 106 processes
[07:22:40] RECOVERY dpkg-check is now: OK on incubator-dep incubator-dep output: All packages OK
[07:22:46] lulz
[07:23:15] RECOVERY dpkg-check is now: OK on prefixexport prefixexport output: All packages OK
[07:24:05] RECOVERY Current Load is now: OK on incubator-dep incubator-dep output: OK - load average: 0.00, 0.01, 0.09
[07:24:25] RECOVERY Current Users is now: OK on incubator-dep incubator-dep output: USERS OK - 2 users currently logged in
[07:25:05] PROBLEM host: testmedium is DOWN address: testmedium check_ping: Invalid hostname/address - testmedium
[08:04:03] PROBLEM Current Load is now: CRITICAL on incubator-sql incubator-sql output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:04:23] PROBLEM Current Users is now: CRITICAL on incubator-sql incubator-sql output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:05:08] PROBLEM Disk Space is now: CRITICAL on incubator-sql incubator-sql output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:05:29] !log incubator Creating new instance incubator-sql for hosting MySQL databases on the incubator projects
[08:05:30] Logged the message, Master
[08:05:43] PROBLEM Free ram is now: CRITICAL on incubator-sql incubator-sql output: CHECK_NRPE: Error - Could not complete SSL handshake.
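On the stale-DNS issue Ryan explains back at 07:07: deleting and recreating an instance gives it a new IP, but the old A record can sit in caches for up to the one-hour TTL. A way to watch the record expire, assuming dnsutils is installed — the FQDN suffix and resolver address below are illustrative, not taken from the log:

```sh
# The second column of the answer is the remaining TTL in seconds,
# so you can see exactly how stale the cached record is:
dig +noall +answer incubator-dep.pmtpa.wmflabs

# Compare against the resolver that was purged manually
# (resolver IP is hypothetical):
dig +noall +answer incubator-dep.pmtpa.wmflabs @10.4.0.1
```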
[08:09:03] RECOVERY Current Load is now: OK on incubator-sql incubator-sql output: OK - load average: 0.44, 0.50, 0.31
[08:09:23] RECOVERY Current Users is now: OK on incubator-sql incubator-sql output: USERS OK - 1 users currently logged in
[08:10:03] RECOVERY Disk Space is now: OK on incubator-sql incubator-sql output: DISK OK
[08:10:43] RECOVERY Free ram is now: OK on incubator-sql incubator-sql output: OK: 85% free memory
[08:41:33] PROBLEM dpkg-check is now: CRITICAL on incubator-sql incubator-sql output: DPKG CRITICAL dpkg reports broken packages
[08:43:39] broken packages, yes. MySQL is still in the middle of installing
[09:00:56] !wl
[09:00:57] https://www.mediawiki.org/wiki/Wikimedia_Labs here you can find more
[09:01:33] RECOVERY dpkg-check is now: OK on incubator-sql incubator-sql output: All packages OK
[09:16:56] !log incubator Created new instance incubator-live for hosting MediaWiki files so that we can avoid having the same files on different servers
[09:16:57] Logged the message, Master
[09:54:34] PROBLEM dpkg-check is now: CRITICAL on incubator-sql incubator-sql output: DPKG CRITICAL dpkg reports broken packages
[09:57:41] !log incubator Deleting instance incubator-sql to rename to incubator-sql1, partially also due to severe misconfiguration of mysql installation
[09:57:42] Logged the message, Master
[10:03:43] !log deployment-prep installed ffmpeg on deployment-web (required by TMH to extract stills)
[10:03:44] Logged the message, Master
[10:26:35] RECOVERY Current Users is now: OK on bots-4 bots-4 output: USERS OK - 2 users currently logged in
[10:26:35] RECOVERY Total Processes is now: OK on bots-4 bots-4 output: PROCS OK: 85 processes
[10:26:55] RECOVERY dpkg-check is now: OK on bots-4 bots-4 output: All packages OK
[10:28:15] PROBLEM dpkg-check is now: CRITICAL on incubator-sql1 incubator-sql1 output: DPKG CRITICAL dpkg reports broken packages
[10:28:15] !log bots Fix nagios issue of bots-4 on SSL handshake by enabling 10.4.0.34 as allowed host in /etc/nagios/nrpe_local.cfg
[10:28:16] Logged the message, Master
[10:28:35] RECOVERY Current Load is now: OK on bots-4 bots-4 output: OK - load average: 0.36, 0.09, 0.03
[10:28:35] RECOVERY Disk Space is now: OK on bots-4 bots-4 output: DISK OK
[10:33:15] RECOVERY dpkg-check is now: OK on incubator-sql1 incubator-sql1 output: All packages OK
[12:56:15] PROBLEM dpkg-check is now: CRITICAL on incubator-sql1 incubator-sql1 output: DPKG CRITICAL dpkg reports broken packages
[13:23:28] !log incubator Deleting the incubator-sql1 instance as having another SQL server proves to be worthless
[13:23:30] Logged the message, Master
[13:32:09] !log incubator Creating the incubator-bots instance for hosting Wikimedia Incubator bots
[13:32:11] Logged the message, Master
[13:40:13] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 17% free memory
[13:42:41] linkwatcher.pl is taking quite a bit of CPU?
[13:42:57] Beetstra: ping
[13:43:10] Yes
[13:43:15] It also does a lot of work
[13:43:20] heh
[13:43:32] looks like bots-2 is your server :P
[13:43:48] yeah, sorry, kind of
[13:44:04] consider splitting up?
[13:44:12] impossible?
[13:44:17] like make linkwatcher.pl get its own server
[13:44:26] Lets just have one instance running at the edge
[13:44:33] The other two bots don't do anything big
[13:44:53] 32% lol
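When a bot is "slowly eating all the RAM" as unblockbot and ClueBot 3 do in this log, the first diagnostic is a per-process memory ranking rather than the aggregate figure Nagios reports. A generic sketch (nothing bot-specific assumed):

```sh
# Top ten resident-memory consumers on the instance:
ps aux --sort=-rss | head -n 11

# Track a suspect process over time (substitute a real PID for 12345):
watch -n 5 'ps -o pid,rss,vsz,cmd -p 12345'
```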
[13:45:07] you mean, XLinkBot and unblockbot?
[13:45:25] no
[13:45:35] unblockbot has a memory problem sometimes, I have to work on that this weekend
[13:45:36] nothing much
[13:45:41] I see
[13:45:51] [16:45:22] LW: 3 days, 22:19:24 hours active; RC: last 0 sec. ago; Reading approx. 772 wikis; Queues: P1=0; P2=0; P3=0; W=0; A1=0; A2=0; M=0; Total: 2382547 edits (420 PM); 312560 IP edits (15.9%; 55 PM); Watched: 1958863 (82.2%; 346 PM); Links: 112721 edits (5.7%; 19 PM); 333186 total (58 PM; 0.17 per edit; 2.95 per EL add edit); 39529 WL (11.8%; 6 PM); 2754 BL (0.8%; 0 PM); 246 RL (0%; 0 PM); 990 AL (0.2%; 0 PM)
[13:45:59] just letting you know about that 17% :P
[13:46:30] in other words, LiWa3 is parsing 772 wikis, with an edit speed of 420 edits per minute, parsing of those 346 edits per minute, finding 58 link additions per minute ...
[13:47:28] in which channel is it in?
[14:24:33] New review: Dzahn; "yep, changing the bugzilla logo to the one from commons, now unrelated to changing that link in the ..." [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2013
[14:24:33] Change merged: Dzahn; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2013
[14:32:44] hydriz: #wikipedia-en-spam and #svn-wp-spam
[14:33:06] svn?
[15:55:13] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 20% free memory
[16:03:13] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 19% free memory
[18:53:13] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 22% free memory
[19:06:11] 01/25/2012 - 19:06:11 - Updating keys for zaran
[19:06:15] 01/25/2012 - 19:06:15 - Updating keys for zaran
[19:06:19] 01/25/2012 - 19:06:18 - Updating keys for zaran
[19:09:10] Ryan_Lane: any progress on those squid scripts :p
[19:13:41] Ryan_Lane : there is a french wikisource contributor who would like to have access to the wikisource instance on the labs, he is a good php developper and has some patches to test
[19:13:52] can I give him access to the wikisource project ?
[19:14:20] and if so, how can I do it ?
[19:21:41] Zaran: First, he will have to come in here and ask Ryan to set him up an account, then you can add him to the wikisource instance by clicking Add Member on https://labsconsole.wikimedia.org/wiki/Special:NovaProject
[19:21:54] petan|wk: how can I get squid purged?
[19:22:54] thanks methecooldude, he's just joined the channel (Tpt)
[19:22:57] * hexmode sees deployment-squid and decides to give it a try
[19:23:48] Hi, or should I say bonsoir :)
[19:23:54] Tpt: ^
[19:24:13] methecooldude: Hi.
[19:25:11] Tpt: What kind of patches do you have?
[19:26:26] johnduhart: Now, nothing. It's to test params and new extensions and maybe to work with Zaran for ProofreadPages
[19:26:57] hexmode: I thought you weren't doing new extension testing right now?
[19:27:39] johnduhart: nope, I'm not. why?
[19:28:16] Tpt: Deployment-prep isn't doing new extension testing right now, what were you looking to test?
[19:29:21] Things like FeaturedFeeds https://www.mediawiki.org/wiki/Extension:FeaturedFeeds
[19:30:32] I don't think that's being targets for 1.19wmf1 atm, will probably be done later.
[19:30:36] targeted*
[19:30:57] Tpt, you'll be able to test it live on WP in 1:30 ;)
[19:31:14] johnduhart : Tpt only wants access to the wikisource project
[19:31:15] johnduhart: featuredfeeds is being released now ^^
[19:31:23] which is a playground for wikisource
[19:31:40] hexmode: Okay so we should make sure to update our configuration
[19:32:03] johnduhart: can you do that?
[19:32:10] Zaran: I wouldn't call it a playground. It's a staging area for before 1.19wmf1
[19:32:18] hexmode: Nope, don't have root anymore.
[19:32:37] can I give it to you?
[19:33:51] No, I don't want to get involved. I'm willing to advise but at the end of the it's your and petan's project.
[19:33:58] heh
[19:34:01] I tried
[19:34:32] sorry.
[19:34:35] any suggestions for how to sync it
[19:34:37] np
[19:34:52] New patchset: Lcarr; "adding nrpe_local to stop the breakage of puppet" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2086
[19:34:52] hexmode: InitaliseSettings can just be copied from live
[19:35:02] Check for anything new in CommonSettings
[19:35:07] k I'll look at that
[19:35:09] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2086
[19:35:09] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/2086
[19:35:15] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2086
[19:35:51] It'd be nice if we could get an svn history of commonsettings
[19:41:13] PROBLEM dpkg-check is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:41:43] PROBLEM Free ram is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:41:48] johnduhart: that would probably be a git history of commonsettings
[19:41:51] I'll ask
[19:42:12] hexmode: I'm pretty sure it's a private svn repo unless they changed that
[19:42:23] PROBLEM Free ram is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:42:33] PROBLEM Current Load is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:42:43] LeslieCarr: Congratz!
[19:43:08] PROBLEM dpkg-check is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:43:09] methecooldude: ?
[19:43:13] PROBLEM Total Processes is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:43:18] PROBLEM Current Load is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:43:23] PROBLEM Total Processes is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:43:24] LeslieCarr: You patch, nagios goes mad :)
[19:43:30] haha
[19:43:37] but puppet is now working ;)
[19:43:53] PROBLEM Total Processes is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:44:03] PROBLEM Current Load is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:44:43] PROBLEM Disk Space is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:44:53] PROBLEM dpkg-check is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:45:08] PROBLEM Current Users is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:45:23] PROBLEM Current Load is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:45:23] PROBLEM Current Users is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
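hexmode and johnduhart's wish at 19:35-19:41 for a history of CommonSettings can be approximated even without access to the private repo: keep the copied config under a throwaway local git and snapshot it from cron. A sketch — the /srv/wmf-config path and the hourly schedule are invented for illustration:

```sh
# One-time setup (path is hypothetical):
cd /srv/wmf-config
git init
git add CommonSettings.php InitialiseSettings.php
git commit -m 'baseline copy from live'

# Crontab entry: commit hourly, but only when something actually changed.
# 0 * * * * cd /srv/wmf-config && git add -A && { git diff --cached --quiet || git commit -qm 'auto snapshot'; }
```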
[19:45:53] PROBLEM Current Load is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:45:53] PROBLEM Disk Space is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:13] gotta add myself to the nagios project first...
[19:46:23] PROBLEM Disk Space is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:33] PROBLEM Current Users is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:33] PROBLEM dpkg-check is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:43] PROBLEM Disk Space is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:48] PROBLEM Current Users is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:58] PROBLEM Total Processes is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:09] 01/25/2012 - 19:47:09 - Creating a home directory for lcarr at /export/home/nagios/lcarr
[19:47:13] PROBLEM Total Processes is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:23] PROBLEM Disk Space is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:33] PROBLEM Disk Space is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:33] PROBLEM dpkg-check is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:33] PROBLEM Current Load is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:33] PROBLEM Free ram is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:33] PROBLEM Current Users is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:43] PROBLEM Total Processes is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:48:03] PROBLEM Disk Space is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:48:09] 01/25/2012 - 19:48:09 - Updating keys for lcarr
[19:48:38] PROBLEM Current Load is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:48:43] PROBLEM Disk Space is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:49:03] PROBLEM Free ram is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:49:03] PROBLEM dpkg-check is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:49:32] if anyone with nagios access wants to check this out while i wait for the bot to update my permissions
[19:49:38] it's probably related to nrpe_local
[19:49:58] PROBLEM dpkg-check is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:08] PROBLEM Disk Space is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:08] PROBLEM Current Users is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:08] PROBLEM Current Load is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:13] PROBLEM Current Load is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM dpkg-check is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM Free ram is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM Free ram is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM dpkg-check is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM Current Users is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM Total Processes is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:33] PROBLEM Total Processes is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:43] PROBLEM Free ram is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:43] PROBLEM Current Load is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:43] PROBLEM Disk Space is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:43] PROBLEM dpkg-check is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:51:03] PROBLEM Current Load is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:51:03] PROBLEM Total Processes is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:52:23] PROBLEM Free ram is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:52:23] PROBLEM Free ram is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:52:23] PROBLEM dpkg-check is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:52:48] PROBLEM Current Load is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:52:53] PROBLEM Disk Space is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:08] PROBLEM Current Users is now: CRITICAL on deployment-sql deployment-sql output: Connection refused by host
[19:53:08] PROBLEM Total Processes is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:33] PROBLEM Disk Space is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:33] PROBLEM Free ram is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:33] PROBLEM Current Users is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:33] PROBLEM Free ram is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:53] PROBLEM Free ram is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:58] PROBLEM Current Users is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:54:23] PROBLEM dpkg-check is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:54:23] PROBLEM Disk Space is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:54:23] PROBLEM Disk Space is now: CRITICAL on deployment-sql deployment-sql output: Connection refused by host
[19:54:23] PROBLEM Free ram is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:54:48] PROBLEM dpkg-check is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:54:53] PROBLEM Current Load is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:55:08] PROBLEM Free ram is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:55:08] PROBLEM Total Processes is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:55:13] PROBLEM Current Users is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:55:23] PROBLEM Disk Space is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:55:48] PROBLEM Total Processes is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:56:47] !log bots moved unblockbot from bots-2 to bots-3
[19:56:48] Logged the message, Master
[19:57:28] PROBLEM dpkg-check is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:57:41] Beetstra: /me waits for Nagios to report that free ram is ok now :)
[19:58:34] bots-2 is now at 22% .. will go up and down a bit due to LiWa3 modules taking more memory, dying and respawning (ad infinitum)
[19:58:48] PROBLEM Total Processes is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:58:50] waiting for nagios to run puppet again so it knows i can have sudo access :(
[19:59:03] PROBLEM Total Processes is now: CRITICAL on deployment-sql deployment-sql output: Connection refused by host
[19:59:33] PROBLEM Current Users is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:59:38] PROBLEM dpkg-check is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:00:16] could someone give labs-nagios-wm_ some hands .. so it can shake hands again
[20:00:28] PROBLEM dpkg-check is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:00:38] PROBLEM Free ram is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:00:48] PROBLEM Disk Space is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:00:48] PROBLEM Disk Space is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:00:48] PROBLEM Total Processes is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:01:43] PROBLEM Current Load is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:02:03] PROBLEM Total Processes is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:03] PROBLEM Current Users is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:08] PROBLEM dpkg-check is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:23] PROBLEM Current Load is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:23] PROBLEM Current Users is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:28] PROBLEM Current Users is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:43] PROBLEM Current Users is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:43] PROBLEM Current Load is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:53] PROBLEM Total Processes is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:58] PROBLEM Current Users is now: CRITICAL on master master output: Connection refused by host
[20:03:58] PROBLEM dpkg-check is now: CRITICAL on deployment-sql deployment-sql output: Connection refused by host
[20:03:58] PROBLEM Disk Space is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:58] PROBLEM Total Processes is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:04:03] PROBLEM Total Processes is now: CRITICAL on master master output: Connection refused by host
[20:04:23] PROBLEM Current Load is now: CRITICAL on deployment-sql deployment-sql output: Connection refused by host
[20:04:23] PROBLEM Current Load is now: CRITICAL on master master output: Connection refused by host
[20:04:23] PROBLEM Free ram is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:04:38] PROBLEM Current Load is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:04:53] PROBLEM Current Load is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:05:03] PROBLEM Disk Space is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:05:18] PROBLEM dpkg-check is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:05:49] labs-nagios-wm_: shaddup
[20:06:08] PROBLEM Free ram is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:08] PROBLEM Disk Space is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:08] PROBLEM Current Users is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:08] PROBLEM dpkg-check is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:08] PROBLEM Current Users is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:09] PROBLEM Free ram is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:09] PROBLEM Current Users is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:18] PROBLEM dpkg-check is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:28] PROBLEM dpkg-check is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:33] PROBLEM Free ram is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:33] PROBLEM Total Processes is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:38] PROBLEM Disk Space is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:43] PROBLEM Free ram is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:53] PROBLEM Total Processes is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:07:08] PROBLEM Total Processes is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:07:23] PROBLEM Total Processes is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:07:43] PROBLEM Current Users is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:13] PROBLEM Current Users is now: CRITICAL on asher1 asher1 output: Connection refused by host
[20:08:13] PROBLEM Disk Space is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:23] PROBLEM Free ram is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:33] PROBLEM dpkg-check is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:33] PROBLEM Current Users is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:38] PROBLEM dpkg-check is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:53] PROBLEM Disk Space is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:53] PROBLEM Current Load is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:53] PROBLEM Total Processes is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:58] PROBLEM Current Load is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:58] PROBLEM Disk Space is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:58] PROBLEM Current Load is now: CRITICAL on asher1 asher1 output: Connection refused by host
[20:09:03] PROBLEM Total Processes is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:08] PROBLEM Disk Space is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:33] PROBLEM Current Users is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:33] PROBLEM dpkg-check is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:33] PROBLEM Current Load is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:38] PROBLEM Current Load is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:38] PROBLEM Current Load is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:38] PROBLEM Free ram is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:38] PROBLEM Current Users is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:38] PROBLEM Total Processes is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:58] PROBLEM Free ram is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:08] PROBLEM Free ram is now: CRITICAL on master master output: Connection refused by host
[20:10:08] PROBLEM Current Load is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:08] PROBLEM Total Processes is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:13] PROBLEM Free ram is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:13] PROBLEM Current Users is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:13] PROBLEM Disk Space is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:18] PROBLEM Total Processes is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:38] PROBLEM Total Processes is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:58] PROBLEM Total Processes is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:03] PROBLEM Disk Space is now: CRITICAL on master master output: Connection refused by host
[20:11:23] PROBLEM dpkg-check is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:23] PROBLEM Current Load is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:23] PROBLEM Current Load is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:23] PROBLEM Disk Space is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:23] PROBLEM Total Processes is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:28] PROBLEM Free ram is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:28] PROBLEM dpkg-check is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:28] PROBLEM Disk Space is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:43] PROBLEM Free ram is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:43] PROBLEM Current Load is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:48] PROBLEM Current Users is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:48] PROBLEM Free ram is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:48] PROBLEM Current Users is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:48] PROBLEM Total Processes is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host
[20:11:53] PROBLEM Free ram is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:53] PROBLEM Current Users is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:53] PROBLEM Free ram is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:08] PROBLEM dpkg-check is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:08] PROBLEM Current Users is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:08] PROBLEM dpkg-check is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:08] PROBLEM Total Processes is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:13] PROBLEM Current Users is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:13] PROBLEM Free ram is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:18] PROBLEM Free ram is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:18] PROBLEM Current Load is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:28] PROBLEM Free ram is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:28] PROBLEM Current Load is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:28] PROBLEM dpkg-check is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:28] PROBLEM dpkg-check is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:28] PROBLEM dpkg-check is now: CRITICAL on master master output: Connection refused by host [20:12:38] PROBLEM Free ram is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:38] PROBLEM Free ram is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:38] PROBLEM Current Users is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host [20:12:38] PROBLEM Current Users is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:38] PROBLEM Current Users is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:38] PROBLEM Current Users is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:48] PROBLEM Total Processes is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:53] PROBLEM Current Users is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:53] PROBLEM Total Processes is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:58] PROBLEM Free ram is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:58] PROBLEM Total Processes is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:03] PROBLEM Free ram is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:03] PROBLEM Current Load is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:03] PROBLEM Current Users is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:28] PROBLEM Disk Space is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:48] PROBLEM Total Processes is now: CRITICAL on asher1 asher1 output: Connection refused by host [20:13:53] PROBLEM Free ram is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:53] PROBLEM Total Processes is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:58] PROBLEM dpkg-check is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:13] PROBLEM dpkg-check is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:13] PROBLEM Disk Space is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:18] PROBLEM dpkg-check is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host [20:14:28] PROBLEM Current Load is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:38] PROBLEM Disk Space is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:38] PROBLEM dpkg-check is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:14:48] PROBLEM Current Load is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:48] PROBLEM Current Load is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:48] PROBLEM Disk Space is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:53] PROBLEM Current Users is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:53] PROBLEM dpkg-check is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:53] PROBLEM Free ram is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:53] PROBLEM Free ram is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:03] PROBLEM Total Processes is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:18] PROBLEM Total Processes is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:23] PROBLEM Current Users is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:23] PROBLEM Current Users is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:23] PROBLEM Current Users is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:33] PROBLEM dpkg-check is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:33] PROBLEM Total Processes is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Free ram is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Disk Space is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Free ram is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Disk Space is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Current Users is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Disk Space is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Current Users is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:48] PROBLEM Current Load is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:48] PROBLEM Current Load is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:58] PROBLEM Current Load is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:58] PROBLEM Total Processes is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:03] PROBLEM dpkg-check is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:16:08] PROBLEM Current Load is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:08] PROBLEM Current Load is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:08] PROBLEM Current Load is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:08] PROBLEM Current Load is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:28] PROBLEM dpkg-check is now: CRITICAL on asher1 asher1 output: Connection refused by host [20:16:28] PROBLEM Current Load is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:38] PROBLEM Current Users is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:38] PROBLEM dpkg-check is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:38] PROBLEM dpkg-check is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:38] PROBLEM Disk Space is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:38] PROBLEM Current Load is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host [20:16:38] PROBLEM Free ram is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:39] PROBLEM dpkg-check is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:39] PROBLEM dpkg-check is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:40] PROBLEM Current Load is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:40] PROBLEM Total Processes is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:48] PROBLEM Disk Space is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:58] PROBLEM Current Users is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:58] PROBLEM Current Load is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:08] PROBLEM Disk Space is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:08] PROBLEM Total Processes is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:13] PROBLEM dpkg-check is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:13] PROBLEM Disk Space is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:13] PROBLEM Disk Space is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host [20:17:13] PROBLEM dpkg-check is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:13] PROBLEM Free ram is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:17:13] PROBLEM dpkg-check is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:18] PROBLEM Free ram is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:18] PROBLEM Disk Space is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:18] PROBLEM Current Load is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:18] PROBLEM Disk Space is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:18] PROBLEM dpkg-check is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:28] PROBLEM Disk Space is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:28] PROBLEM Current Users is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:28] PROBLEM Total Processes is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:33] PROBLEM Disk Space is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:33] PROBLEM Disk Space is now: CRITICAL on asher1 asher1 output: Connection refused by host [20:17:33] PROBLEM Current Users is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:38] PROBLEM Free ram is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:38] PROBLEM Disk Space is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:38] PROBLEM Current Users is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:38] PROBLEM Current Users is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:38] PROBLEM Current Users is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:17:38] PROBLEM Free ram is now: CRITICAL on asher1 asher1 output: Connection refused by host [20:17:48] PROBLEM dpkg-check is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:48] PROBLEM Total Processes is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:53] PROBLEM Total Processes is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:58] PROBLEM Total Processes is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:03] PROBLEM Total Processes is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:13] PROBLEM Disk Space is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:13] PROBLEM Disk Space is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:23] PROBLEM Disk Space is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:18:33] PROBLEM Total Processes is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:43] PROBLEM Disk Space is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:48] PROBLEM Free ram is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:53] PROBLEM Total Processes is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:03] PROBLEM Current Users is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:03] PROBLEM Current Load is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:03] PROBLEM Free ram is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host [20:19:03] PROBLEM Disk Space is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:19:03] PROBLEM dpkg-check is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:03] PROBLEM Free ram is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:03] PROBLEM Current Users is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:08] PROBLEM Current Load is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:18] PROBLEM Total Processes is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:23] PROBLEM dpkg-check is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:23] PROBLEM dpkg-check is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:23] PROBLEM Current Load is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:28] PROBLEM Disk Space is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:33] PROBLEM Current Users is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:53] PROBLEM Free ram is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:53] PROBLEM Total Processes is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:58] PROBLEM Current Users is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:08] PROBLEM Free ram is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:08] PROBLEM Current Load is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:08] PROBLEM Free ram is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:08] PROBLEM Total Processes is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:13] PROBLEM Disk Space is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:20:28] PROBLEM Current Load is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:28] PROBLEM dpkg-check is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:20:38] PROBLEM Free ram is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:20:48] PROBLEM Disk Space is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:48] PROBLEM Current Load is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:48] PROBLEM Current Load is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:48] PROBLEM dpkg-check is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:53] PROBLEM Disk Space is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:21:23] PROBLEM dpkg-check is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:21:23] PROBLEM Total Processes is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:21:28] PROBLEM Total Processes is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:21:33] PROBLEM Current Users is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:03] PROBLEM Free ram is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:03] PROBLEM dpkg-check is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:13] PROBLEM Current Load is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:13] PROBLEM Total Processes is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:18] PROBLEM Current Users is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:18] PROBLEM Total Processes is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM Free ram is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM Disk Space is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM dpkg-check is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM Current Users is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM Current Users is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM Current Load is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:53] PROBLEM Disk Space is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:53] PROBLEM Current Users is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:22:53] PROBLEM Current Users is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:53] PROBLEM Disk Space is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:53] PROBLEM Current Load is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:22:53] PROBLEM Current Users is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:03] PROBLEM Free ram is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:03] PROBLEM Total Processes is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:08] PROBLEM Disk Space is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:28] PROBLEM Disk Space is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:33] PROBLEM dpkg-check is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:33] PROBLEM Free ram is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:03] PROBLEM Current Load is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:03] PROBLEM dpkg-check is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:03] PROBLEM dpkg-check is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:03] PROBLEM Total Processes is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:08] PROBLEM dpkg-check is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:43] PROBLEM Total Processes is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:48] PROBLEM Current Load is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:25:13] PROBLEM Free ram is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:25:43] PROBLEM Current Load is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:26:23] PROBLEM Total Processes is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:28:13] PROBLEM Current Load is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:29:43] PROBLEM Free ram is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:30:03] PROBLEM Disk Space is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:30:13] PROBLEM dpkg-check is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:30:13] PROBLEM Current Users is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:35:52] hi ryan: would it be possible to send me an email whenever there is a code review waiting in one of the analytics repositories? 
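[Editor's note: when the handshake failures come from NRPE rejecting the monitoring server, the usual remedy, and what the nrpe_local.cfg change reviewed below is aiming at, is to list the Nagios host in allowed_hosts on each instance and restart the daemon. A minimal sketch, assuming the Debian/Ubuntu layout where /etc/nagios/nrpe.cfg includes nrpe_local.cfg; the monitoring server's address here is a made-up placeholder, not the real labs one:

    # /etc/nagios/nrpe_local.cfg should carry a line like:
    #   allowed_hosts=127.0.0.1,10.4.0.120
    # then restart NRPE so it rereads its config:
    sudo /etc/init.d/nagios-nrpe-server restart
]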
[20:40:18] 01/25/2012 - 20:40:18 - Creating a home directory for demon at /export/home/jenkins/demon [20:41:19] 01/25/2012 - 20:41:19 - Updating keys for demon [21:27:31] New patchset: Lcarr; "allowing nagios host in nrpe_local.cfg" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2092 [21:27:47] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2092 [21:27:54] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2092 [21:29:23] RECOVERY Current Users is now: OK on nova-ldap1 nova-ldap1 output: USERS OK - 0 users currently logged in [21:29:33] RECOVERY Disk Space is now: OK on nova-ldap1 nova-ldap1 output: DISK OK [21:29:33] RECOVERY Total Processes is now: OK on nova-ldap1 nova-ldap1 output: PROCS OK: 84 processes [21:30:19] hexmode: squid? [21:30:28] it's a bad idea to do anything there [21:31:12] LeslieCarr LeslieCarr LeslieCarr [21:31:17] you broke something, did you [21:31:22] hey [21:31:24] haha [21:31:27] * petan looks suspiciously at her [21:31:32] well in one way i fixed something [21:31:33] RECOVERY Current Load is now: OK on nova-ldap1 nova-ldap1 output: OK - load average: 0.08, 0.06, 0.02 [21:31:37] I see [21:31:43] however in another way i totally broke something :) [21:31:50] :O [21:31:51] bah [21:32:04] i added in the nrpe_local file so nagios could have it [21:32:09] and puppet wouldn't keep erroring [21:32:16] but now everything in nagios has ssl errors [21:32:20] as you might be able to see above :) [21:32:28] yay [21:32:29] I see [21:32:35] because you replaced our template [21:32:40] we made a template with Ryan [21:32:47] so that there was one on prod and one on labs [21:33:09] oh [21:33:31] so the merge ryan did this weekend made it so it was looking for a file [21:33:41] but i am guessing that it overwrote some templating bit [21:33:55] probably yes [21:34:03] RECOVERY dpkg-check is now: OK on nova-ldap1 nova-ldap1 output: All packages OK [21:34:03] RECOVERY Current Users is now: OK on nova-production1 nova-production1 output: USERS OK - 3 users currently logged in [21:34:04] look at that, it does have a template bit there [21:34:15] original file looks different [21:34:25] I will try to recover it [21:34:33] RECOVERY Current Users is now: OK on client1-lcarr client1-lcarr output: USERS OK - 0 users currently logged in [21:34:33] RECOVERY Disk Space is now: OK on nova-production1 nova-production1 output: DISK OK [21:34:36] but I am bad with git :P [21:34:46] johnduhart: you know git :P [21:34:53] RECOVERY Current Users is now: OK on puppet-lucid puppet-lucid output: USERS OK - 0 users currently logged in [21:34:57] how do I retrieve a file from an older revision [21:35:23] RECOVERY Total Processes is now: OK on bots-3 bots-3 output: PROCS OK: 96 processes [21:35:28] RECOVERY Disk Space is now: OK on bots-2 bots-2 output: DISK OK [21:35:33] RECOVERY dpkg-check is now: OK on nova-production1 nova-production1 output: All packages OK [21:35:37] git checkout e4f9aa8010ad4cfe5a8699252d4ef2e246394695 should get us the version before the merge happened [21:36:03] RECOVERY Total Processes is now: OK on puppet-lucid puppet-lucid output: PROCS OK: 84 processes [21:36:05] Just be careful, that rolls the whole repo back into a detached HEAD state [21:36:21] get what you need and checkout HEAD [21:36:23] PROBLEM Total Processes is now: WARNING on nova-production1 nova-production1 output: PROCS WARNING: 159 processes [21:36:28] LeslieCarr: templates/nagios/nrpe_local.cfg.erb [21:36:28] RECOVERY Current Users is now: OK on bots-2 bots-2 output: USERS OK - 0 users currently logged in [21:36:28] RECOVERY dpkg-check is now: OK on bots-2 bots-2 output: All packages OK [21:36:43] RECOVERY Current Load is now: OK on puppet-lucid puppet-lucid output: OK - load average: 0.00, 0.04, 0.01 [21:36:46] It's like a time machine, don't touch anything. [21:36:56] omg [21:37:00] wrong one [21:37:13] RECOVERY Total Processes is now: OK on client1-lcarr client1-lcarr output: PROCS OK: 81 processes [21:37:18] RECOVERY Total Processes is now: OK on analytics analytics output: PROCS OK: 95 processes [21:37:22] nope [21:37:26] yep [21:37:42] here we go [21:37:50] it's temporary anyway [21:37:53] so LeslieCarr [21:37:57] templates/nagios/nrpe_local.cfg.erb [21:38:01] that's the file we are missing here [21:38:16] it's in the revision you sent me [21:39:22] LeslieCarr: I have no idea if putting it back actually would fix it [21:39:22] i'm going to try and mostly restore the old version :) [21:39:30] there is probably a change in nagios.pp too [21:40:07] but I don't know what change on prod was done [21:40:15] is there a log or something saying why we changed the nagios config? [21:40:33] if I knew the reason I could try to find a way to implement it [21:41:08] New patchset: Lcarr; "restoring odl nrpe.pp to pre-merge state" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2093 [21:41:20] ryan did a big merge of production into test [21:41:32] and the template wasn't in production itself at the time i guess [21:41:43] so that would have killed a lot of changes that were only in the test repo [21:41:47] New review: Petrb; "I think you are missing the template file" [operations/puppet] (test) C: 0; - https://gerrit.wikimedia.org/r/2093 [21:42:12] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2093 [21:42:13] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2093 [21:42:32] ok let's see [21:42:37] petan: the template file is still there [21:42:48] RECOVERY Current Users is now: OK on labs-relay labs-relay output: USERS OK - 0 users currently logged in [21:42:56] recovery :) [21:42:58] RECOVERY Current Load is now: OK on labs-relay labs-relay output: OK - load average: 0.02, 0.05, 0.01 [21:42:58] RECOVERY Current Users is now: OK on deployment-web deployment-web output: USERS OK - 0 users currently logged in [21:42:58] RECOVERY Current Users is now: OK on labs-nfs1 labs-nfs1 output: USERS OK - 0 users currently logged in [21:42:58] RECOVERY Total Processes is now: OK on vivek-puppet vivek-puppet output: PROCS OK: 80 processes [21:42:59] that looks ok [21:43:00] :D [21:43:03] RECOVERY dpkg-check is now: OK on wep wep output: All packages OK [21:43:18] RECOVERY Disk Space is now: OK on deployment-web deployment-web output: DISK OK [21:43:18] RECOVERY Disk Space is now: OK on phabricator1 phabricator1 output: DISK OK [21:43:18] RECOVERY Disk Space is now: OK on labs-relay labs-relay output: DISK OK [21:43:38] RECOVERY Total Processes is now: OK on mediahandler-test mediahandler-test output: PROCS OK: 78 processes [21:43:48] RECOVERY dpkg-check is now: OK on bots-sql2 bots-sql2 output: All packages OK [21:43:48] RECOVERY dpkg-check is now: OK on phabricator1 phabricator1 output: All packages OK [21:43:58] RECOVERY Current Load is now: OK on wep wep output: OK - load average: 0.11, 0.06, 0.01 [21:44:05] hexmode: around?
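[Editor's note: the checkout suggested above moves the whole tree to e4f9aa80, which is why it lands in a detached-HEAD state. To pull back just the one lost file there is a narrower form that leaves HEAD where it is; a sketch, assuming work happens on the (test) branch named in the gerrit messages:

    # restore just the missing template from the pre-merge commit,
    # without moving HEAD off the current branch:
    git checkout e4f9aa8010ad4cfe5a8699252d4ef2e246394695 -- templates/nagios/nrpe_local.cfg.erb
    # or, after a whole-tree checkout like the one above, return with:
    git checkout test
]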
[21:44:18] RECOVERY dpkg-check is now: OK on deployment-web deployment-web output: All packages OK [21:44:18] RECOVERY Total Processes is now: OK on mobile-feeds mobile-feeds output: PROCS OK: 96 processes [21:44:23] RECOVERY Total Processes is now: OK on gerrit gerrit output: PROCS OK: 83 processes [21:44:28] RECOVERY Disk Space is now: OK on jenkins2 jenkins2 output: DISK OK [21:44:38] RECOVERY Free ram is now: OK on jenkins2 jenkins2 output: OK: 74% free memory [21:44:38] RECOVERY dpkg-check is now: OK on mobile-feeds mobile-feeds output: All packages OK [21:44:41] hexmode: ping pong me [21:44:48] RECOVERY Total Processes is now: OK on deployment-web deployment-web output: PROCS OK: 99 processes [21:44:50] I will handle squid then [21:44:53] RECOVERY Current Load is now: OK on mediahandler-test mediahandler-test output: OK - load average: 0.00, 0.01, 0.00 [21:44:53] RECOVERY Current Load is now: OK on bots-sql2 bots-sql2 output: OK - load average: 0.00, 0.03, 0.00 [21:44:58] RECOVERY Disk Space is now: OK on wikisource-web wikisource-web output: DISK OK [21:45:08] RECOVERY Current Load is now: OK on wikisource-web wikisource-web output: OK - load average: 0.00, 0.02, 0.00 [21:45:08] RECOVERY Total Processes is now: OK on wikisource-web wikisource-web output: PROCS OK: 90 processes [21:45:18] RECOVERY dpkg-check is now: OK on jenkins2 jenkins2 output: All packages OK [21:45:18] RECOVERY dpkg-check is now: OK on mediahandler-test mediahandler-test output: All packages OK [21:45:28] RECOVERY Total Processes is now: OK on labs-relay labs-relay output: PROCS OK: 80 processes [21:45:33] RECOVERY dpkg-check is now: OK on vivek-puppet vivek-puppet output: All packages OK [21:45:33] RECOVERY dpkg-check is now: OK on nova-dev1 nova-dev1 output: All packages OK [21:45:38] RECOVERY Current Load is now: OK on jenkins2 jenkins2 output: OK - load average: 0.31, 0.12, 0.04 [21:46:08] RECOVERY Free ram is now: OK on deployment-backup deployment-backup output: OK: 44% free memory [21:46:18] RECOVERY Total Processes is now: OK on jenkins2 jenkins2 output: PROCS OK: 82 processes [21:46:28] RECOVERY Disk Space is now: OK on deployment-backup deployment-backup output: DISK OK [21:46:28] RECOVERY Current Load is now: OK on bots-sql3 bots-sql3 output: OK - load average: 0.40, 0.15, 0.11 [21:46:28] RECOVERY Disk Space is now: OK on mobile-feeds mobile-feeds output: DISK OK [21:46:38] RECOVERY Free ram is now: OK on pageviews pageviews output: OK: 73% free memory [21:46:38] RECOVERY Current Users is now: OK on deployment-backup deployment-backup output: USERS OK - 0 users currently logged in [21:46:48] RECOVERY dpkg-check is now: OK on bots-4 bots-4 output: All packages OK [21:46:58] RECOVERY Free ram is now: OK on bots-sql3 bots-sql3 output: OK: 68% free memory [21:47:08] RECOVERY Current Load is now: OK on deployment-web deployment-web output: OK - load average: 0.00, 0.01, 0.00 [21:47:08] RECOVERY Disk Space is now: OK on labs-ocg1 labs-ocg1 output: DISK OK [21:47:18] RECOVERY Free ram is now: OK on bots-1 bots-1 output: OK: 86% free memory [21:47:18] RECOVERY Current Load is now: OK on labs-ocg1 labs-ocg1 output: OK - load average: 0.18, 0.07, 0.02 [21:47:18] RECOVERY Current Users is now: OK on labs-ocg1 labs-ocg1 output: USERS OK - 0 users currently logged in [21:47:28] RECOVERY Current Users is now: OK on mobile-feeds mobile-feeds output: USERS OK - 1 users currently logged in [21:47:28] RECOVERY Total Processes is now: OK on labs-ocg1 labs-ocg1 output: PROCS OK: 79 processes [21:47:38] RECOVERY Current 
Users is now: OK on jenkins2 jenkins2 output: USERS OK - 0 users currently logged in [21:47:53] hey petan can't look at nagios.pp yet, got prod packet loss [21:47:58] RECOVERY Current Users is now: OK on bots-sql3 bots-sql3 output: USERS OK - 0 users currently logged in [21:47:58] RECOVERY Total Processes is now: OK on pageviews pageviews output: PROCS OK: 96 processes [21:48:26] ok [21:48:28] RECOVERY Free ram is now: OK on labs-ocg1 labs-ocg1 output: OK: 90% free memory [21:48:28] RECOVERY dpkg-check is now: OK on pageviews pageviews output: All packages OK [21:48:38] RECOVERY Disk Space is now: OK on bots-4 bots-4 output: DISK OK [21:48:48] RECOVERY Total Processes is now: OK on deployment-backup deployment-backup output: PROCS OK: 80 processes [21:48:53] RECOVERY Current Load is now: OK on incubator-nfs incubator-nfs output: OK - load average: 0.17, 0.05, 0.01 [21:48:53] RECOVERY Current Load is now: OK on bots-4 bots-4 output: OK - load average: 0.07, 0.08, 0.02 [21:48:58] RECOVERY Current Load is now: OK on deployment-backup deployment-backup output: OK - load average: 0.00, 0.02, 0.00 [21:49:18] RECOVERY Current Users is now: OK on bots-1 bots-1 output: USERS OK - 0 users currently logged in [21:49:18] RECOVERY dpkg-check is now: OK on deployment-backup deployment-backup output: All packages OK [21:49:38] RECOVERY Disk Space is now: OK on incubator-nfs incubator-nfs output: DISK OK [21:49:38] RECOVERY Free ram is now: OK on master master output: OK: 93% free memory [21:49:48] RECOVERY Current Users is now: OK on incubator-nfs incubator-nfs output: USERS OK - 0 users currently logged in [21:49:48] RECOVERY Total Processes is now: OK on labs-realserver labs-realserver output: PROCS OK: 79 processes [21:49:58] RECOVERY Disk Space is now: OK on master master output: DISK OK [21:49:58] RECOVERY Disk Space is now: OK on pageviews pageviews output: DISK OK [21:50:08] RECOVERY dpkg-check is now: OK on labs-realserver labs-realserver output: All packages OK [21:50:18] RECOVERY Total Processes is now: OK on deployment-transcoding deployment-transcoding output: PROCS OK: 79 processes [21:50:28] RECOVERY Total Processes is now: OK on bots-4 bots-4 output: PROCS OK: 83 processes [21:50:33] RECOVERY Total Processes is now: OK on bots-1 bots-1 output: PROCS OK: 93 processes [21:50:38] RECOVERY dpkg-check is now: OK on labs-ocg1 labs-ocg1 output: All packages OK [21:50:48] RECOVERY Current Users is now: OK on bastion1 bastion1 output: USERS OK - 8 users currently logged in [21:50:48] RECOVERY dpkg-check is now: OK on master master output: All packages OK [21:50:48] RECOVERY Current Load is now: OK on pageviews pageviews output: OK - load average: 0.19, 0.58, 0.59 [21:50:58] RECOVERY Free ram is now: OK on bots-4 bots-4 output: OK: 90% free memory [21:50:58] RECOVERY Current Load is now: OK on incubator-live incubator-live output: OK - load average: 0.13, 0.06, 0.02 [21:51:08] RECOVERY dpkg-check is now: OK on bots-1 bots-1 output: All packages OK [21:51:08] RECOVERY Current Users is now: OK on master master output: USERS OK - 0 users currently logged in [21:51:08] RECOVERY Current Users is now: OK on pageviews pageviews output: USERS OK - 0 users currently logged in [21:51:08] RECOVERY Total Processes is now: OK on master master output: PROCS OK: 95 processes [21:51:13] RECOVERY Total Processes is now: OK on bots-sql3 bots-sql3 output: PROCS OK: 83 processes [21:51:18] RECOVERY Current Load is now: OK on bots-1 bots-1 output: OK - load average: 0.12, 0.06, 0.01 [21:51:28] RECOVERY Disk Space is 
now: OK on bots-1 bots-1 output: DISK OK [21:51:28] RECOVERY Current Load is now: OK on labs-realserver labs-realserver output: OK - load average: 0.00, 0.02, 0.00 [21:51:28] RECOVERY dpkg-check is now: OK on bots-sql3 bots-sql3 output: All packages OK [21:51:38] RECOVERY Total Processes is now: OK on incubator-nfs incubator-nfs output: PROCS OK: 91 processes [21:51:48] RECOVERY dpkg-check is now: OK on deployment-transcoding deployment-transcoding output: All packages OK [21:51:48] RECOVERY Disk Space is now: OK on bots-sql3 bots-sql3 output: DISK OK [21:51:58] RECOVERY Free ram is now: OK on incubator-nfs incubator-nfs output: OK: 85% free memory [21:51:58] RECOVERY Current Load is now: OK on deployment-transcoding deployment-transcoding output: OK - load average: 0.02, 0.03, 0.01 [21:51:58] RECOVERY Current Load is now: OK on bastion1 bastion1 output: OK - load average: 0.04, 0.05, 0.01 [21:52:08] RECOVERY Current Users is now: OK on bots-sql1 bots-sql1 output: USERS OK - 0 users currently logged in [21:52:08] RECOVERY dpkg-check is now: OK on incubator-live incubator-live output: All packages OK [21:52:08] RECOVERY Current Load is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: OK - load average: 0.24, 0.08, 0.02 [21:52:08] PROBLEM Disk Space is now: WARNING on deployment-transcoding deployment-transcoding output: DISK WARNING - free space: / 46 MB (3% inode=54%): [21:52:18] RECOVERY Current Users is now: OK on deployment-transcoding deployment-transcoding output: USERS OK - 0 users currently logged in [21:52:18] RECOVERY Disk Space is now: OK on labs-realserver labs-realserver output: DISK OK [21:52:28] RECOVERY Disk Space is now: OK on embed-sandbox embed-sandbox output: DISK OK [21:52:28] RECOVERY Current Users is now: OK on incubator-live incubator-live output: USERS OK - 0 users currently logged in [21:52:28] RECOVERY Current Load is now: OK on venus venus output: OK - load average: 0.31, 0.08, 0.02 [21:52:28] RECOVERY dpkg-check is now: OK on deployment-squid deployment-squid output: All packages OK [21:52:38] RECOVERY Disk Space is now: OK on incubator-live incubator-live output: DISK OK [21:52:38] RECOVERY Free ram is now: OK on bots-nfs bots-nfs output: OK: 88% free memory [21:52:38] RECOVERY Current Users is now: OK on labs-realserver labs-realserver output: USERS OK - 0 users currently logged in [21:52:38] RECOVERY Current Users is now: OK on aggregator1 aggregator1 output: USERS OK - 0 users currently logged in [21:52:38] RECOVERY Current Users is now: OK on bots-4 bots-4 output: USERS OK - 1 users currently logged in [21:52:43] LeslieCarr: can you force puppet to run on all instances?
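[Editor's note: as the exchange just below shows, there was no fleet-wide push available to a labs user; the route was a manual agent run per instance over SSH. For reference, a sketch of the Puppet 2.x-era invocation quoted below:

    # one-shot foreground agent run; -t (--test) implies --onetime,
    # --verbose and --no-daemonize on the old puppetd client
    sudo puppetd -tv
]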
[21:52:48] RECOVERY Total Processes is now: OK on p-b p-b output: PROCS OK: 88 processes [21:52:53] RECOVERY Free ram is now: OK on bots-sql1 bots-sql1 output: OK: 88% free memory [21:52:53] RECOVERY Current Users is now: OK on pad1 pad1 output: USERS OK - 0 users currently logged in [21:52:53] RECOVERY Free ram is now: OK on deployment-squid deployment-squid output: OK: 88% free memory [21:52:58] RECOVERY Free ram is now: OK on pad1 pad1 output: OK: 89% free memory [21:52:58] RECOVERY Free ram is now: OK on client1-lcarr client1-lcarr output: OK: 55% free memory [21:52:58] RECOVERY dpkg-check is now: OK on bastion1 bastion1 output: All packages OK [21:52:58] RECOVERY Total Processes is now: OK on bastion1 bastion1 output: PROCS OK: 163 processes [21:53:03] RECOVERY Current Load is now: OK on nginx-dev1 nginx-dev1 output: OK - load average: 0.25, 0.10, 0.03 [21:53:03] RECOVERY Total Processes is now: OK on embed-sandbox embed-sandbox output: PROCS OK: 81 processes [21:53:08] RECOVERY Disk Space is now: OK on wikistats-01 wikistats-01 output: DISK OK [21:53:08] RECOVERY dpkg-check is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: All packages OK [21:53:08] RECOVERY Free ram is now: OK on labs-realserver labs-realserver output: OK: 63% free memory [21:53:09] petan: SSH into the instance, run sudo puppetd -tv ? [21:53:18] RECOVERY Free ram is now: OK on bastion1 bastion1 output: OK: 80% free memory [21:53:18] RECOVERY Current Users is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: USERS OK - 0 users currently logged in [21:53:18] RECOVERY Total Processes is now: OK on bots-nfs bots-nfs output: PROCS OK: 90 processes [21:53:20] RoanKattouw: I can't ssh to all instances [21:53:22] only to mine [21:53:23] RECOVERY Disk Space is now: OK on test3 test3 output: DISK OK [21:53:23] RECOVERY Free ram is now: OK on aggregator1 aggregator1 output: OK: 91% free memory [21:53:27] Right [21:53:28] RECOVERY Current Load is now: OK on ganglia-collector ganglia-collector output: OK - load average: 0.16, 0.09, 0.02 [21:53:28] RECOVERY Free ram is now: OK on gerrit gerrit output: OK: 78% free memory [21:53:28] RECOVERY dpkg-check is now: OK on pad1 pad1 output: All packages OK [21:53:28] RECOVERY Total Processes is now: OK on deployment-squid deployment-squid output: PROCS OK: 84 processes [21:53:38] RECOVERY Free ram is now: OK on deployment-transcoding deployment-transcoding output: OK: 71% free memory [21:53:38] RECOVERY Disk Space is now: OK on nginx-dev1 nginx-dev1 output: DISK OK [21:53:38] RECOVERY Disk Space is now: OK on bots-sql1 bots-sql1 output: DISK OK [21:53:38] RECOVERY Disk Space is now: OK on venus venus output: DISK OK [21:53:38] RECOVERY dpkg-check is now: OK on ganglia-collector ganglia-collector output: All packages OK [21:53:39] RECOVERY Total Processes is now: OK on wikistats-01 wikistats-01 output: PROCS OK: 93 processes [21:53:43] RECOVERY Disk Space is now: OK on bastion1 bastion1 output: DISK OK [21:53:43] RECOVERY dpkg-check is now: OK on embed-sandbox embed-sandbox output: All packages OK [21:53:48] RECOVERY Total Processes is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: PROCS OK: 100 processes [21:53:53] RECOVERY Current Users is now: OK on ganglia-collector ganglia-collector output: USERS OK - 0 users currently logged in [21:53:53] RECOVERY dpkg-check is now: OK on venus venus output: All packages OK [21:53:53] RECOVERY Current Users is now: OK on turnkey-1 turnkey-1 output: USERS OK - 0 users currently logged in [21:53:53] RECOVERY Current Load is now: OK on pad1 
pad1 output: OK - load average: 0.09, 0.05, 0.01 [21:53:53] RECOVERY Free ram is now: OK on incubator-live incubator-live output: OK: 82% free memory [21:53:54] RECOVERY Free ram is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: OK: 88% free memory [21:53:54] RECOVERY dpkg-check is now: OK on incubator-nfs incubator-nfs output: All packages OK [21:53:55] RECOVERY Current Users is now: OK on wikistats-01 wikistats-01 output: USERS OK - 0 users currently logged in [21:53:58] RECOVERY Current Load is now: OK on p-b p-b output: OK - load average: 0.09, 0.11, 0.04 [21:53:58] RECOVERY Total Processes is now: OK on ganglia-collector ganglia-collector output: PROCS OK: 82 processes [21:54:03] RECOVERY Current Load is now: OK on bots-sql1 bots-sql1 output: OK - load average: 0.05, 0.06, 0.02 [21:54:03] RECOVERY Disk Space is now: OK on turnkey-1 turnkey-1 output: DISK OK [21:54:03] RECOVERY dpkg-check is now: OK on wikistats-01 wikistats-01 output: All packages OK [21:54:03] RECOVERY Disk Space is now: OK on pad1 pad1 output: DISK OK [21:54:03] RECOVERY Current Users is now: OK on test3 test3 output: USERS OK - 0 users currently logged in [21:54:04] RECOVERY Current Load is now: OK on master master output: OK - load average: 0.00, 0.04, 0.01 [21:54:08] RECOVERY Current Load is now: OK on turnkey-1 turnkey-1 output: OK - load average: 0.02, 0.04, 0.01 [21:54:08] RECOVERY Current Users is now: OK on nginx-dev1 nginx-dev1 output: USERS OK - 0 users currently logged in [21:54:18] RECOVERY dpkg-check is now: OK on deployment-dbdump deployment-dbdump output: All packages OK [21:54:18] RECOVERY Total Processes is now: OK on turnkey-1 turnkey-1 output: PROCS OK: 86 processes [21:54:28] RECOVERY Free ram is now: OK on turnkey-1 turnkey-1 output: OK: 90% free memory [21:54:28] RECOVERY Current Load is now: OK on nova-dev4 nova-dev4 output: OK - load average: 0.19, 0.08, 0.07 [21:54:28] RECOVERY Current Users is now: OK on bots-nfs bots-nfs output: USERS OK - 0 users currently logged in [21:54:28] RECOVERY dpkg-check is now: OK on test3 test3 output: All packages OK [21:54:28] RECOVERY Disk Space is now: OK on nova-dev3 nova-dev3 output: DISK OK [21:54:29] RECOVERY Free ram is now: OK on embed-sandbox embed-sandbox output: OK: 90% free memory [21:54:29] RECOVERY Total Processes is now: OK on ganglia-master ganglia-master output: PROCS OK: 85 processes [21:54:38] RECOVERY Current Users is now: OK on ganglia-master ganglia-master output: USERS OK - 0 users currently logged in [21:54:38] RECOVERY Disk Space is now: OK on aggregator1 aggregator1 output: DISK OK [21:54:38] RECOVERY Total Processes is now: OK on aggregator1 aggregator1 output: PROCS OK: 101 processes [21:54:43] RECOVERY dpkg-check is now: OK on aggregator1 aggregator1 output: All packages OK [21:54:48] RECOVERY Current Load is now: OK on embed-sandbox embed-sandbox output: OK - load average: 0.02, 0.04, 0.01 [21:54:48] RECOVERY Free ram is now: OK on test3 test3 output: OK: 90% free memory [21:54:48] RECOVERY Current Load is now: OK on deployment-squid deployment-squid output: OK - load average: 0.01, 0.03, 0.00 [21:54:48] RECOVERY Free ram is now: OK on ganglia-master ganglia-master output: OK: 90% free memory [21:54:48] RECOVERY Disk Space is now: OK on deployment-squid deployment-squid output: DISK OK [21:54:49] RECOVERY Free ram is now: OK on p-b p-b output: OK: 80% free memory [21:54:54] okay, i am ignoring labs-nagios-wm_ now until it stops spewing [21:54:58] RECOVERY Total Processes is now: OK on incubator-live incubator-live
output: PROCS OK: 90 processes [21:55:03] RECOVERY Free ram is now: OK on nova-dev4 nova-dev4 output: OK: 65% free memory [21:55:03] RECOVERY Current Load is now: OK on wikistats-01 wikistats-01 output: OK - load average: 0.72, 0.20, 0.06 [21:55:03] RECOVERY Disk Space is now: OK on deployment-nfs-memc deployment-nfs-memc output: DISK OK [21:55:03] RECOVERY Free ram is now: OK on vumi-gw1 vumi-gw1 output: OK: 90% free memory [21:55:03] RECOVERY Disk Space is now: OK on labs-build1 labs-build1 output: DISK OK [21:55:08] RECOVERY dpkg-check is now: OK on ganglia-master ganglia-master output: All packages OK [21:55:08] RECOVERY Total Processes is now: OK on venus venus output: PROCS OK: 85 processes [21:55:13] RECOVERY Disk Space is now: OK on nova-dev4 nova-dev4 output: DISK OK [21:55:13] RECOVERY Current Load is now: OK on labs-lvs1 labs-lvs1 output: OK - load average: 0.31, 0.08, 0.02 [21:55:13] RECOVERY Total Processes is now: OK on bots-sql1 bots-sql1 output: PROCS OK: 80 processes [21:55:18] RECOVERY Total Processes is now: OK on test3 test3 output: PROCS OK: 76 processes [21:55:30] heh [21:55:33] RECOVERY dpkg-check is now: OK on turnkey-1 turnkey-1 output: All packages OK [21:55:33] RECOVERY dpkg-check is now: OK on bots-sql1 bots-sql1 output: All packages OK [21:55:33] RECOVERY Current Users is now: OK on reportcard1 reportcard1 output: USERS OK - 0 users currently logged in [21:55:33] RECOVERY Current Users is now: OK on vumi-gw1 vumi-gw1 output: USERS OK - 0 users currently logged in [21:55:33] RECOVERY Current Load is now: OK on deployment-dbdump deployment-dbdump output: OK - load average: 0.02, 0.03, 0.01 [21:55:34] RECOVERY Free ram is now: OK on reportcard1 reportcard1 output: OK: 66% free memory [21:55:34] RECOVERY Current Users is now: OK on venus venus output: USERS OK - 0 users currently logged in [21:55:43] RECOVERY Free ram is now: OK on nginx-dev1 nginx-dev1 output: OK: 86% free memory [21:55:43] RECOVERY Current Users is now: OK on p-b p-b output: USERS OK - 0 users currently logged in [21:55:43] RECOVERY Current Users is now: OK on deployment-squid deployment-squid output: USERS OK - 0 users currently logged in [21:55:43] RECOVERY Free ram is now: OK on labs-lvs1 labs-lvs1 output: OK: 90% free memory [21:55:43] RECOVERY dpkg-check is now: OK on p-b p-b output: All packages OK [21:55:43] RECOVERY dpkg-check is now: OK on nova-dev3 nova-dev3 output: All packages OK [21:55:44] LeslieCarr: it's possible to turn it off from nagios.wmflabs.org [21:55:53] RECOVERY Current Load is now: OK on deployment-nfs-memc deployment-nfs-memc output: OK - load average: 0.05, 0.05, 0.01 [21:55:53] RECOVERY Total Processes is now: OK on deployment-nfs-memc deployment-nfs-memc output: PROCS OK: 95 processes [21:55:58] RECOVERY Current Load is now: OK on aggregator1 aggregator1 output: OK - load average: 0.00, 0.04, 0.00 [21:55:58] RECOVERY Current Users is now: OK on deployment-nfs-memc deployment-nfs-memc output: USERS OK - 1 users currently logged in [21:55:59] but no one asked me for access there :| [21:56:03] RECOVERY Total Processes is now: OK on pad1 pad1 output: PROCS OK: 82 processes [21:56:08] RECOVERY Total Processes is now: OK on nova-dev4 nova-dev4 output: PROCS OK: 118 processes [21:56:13] RECOVERY Total Processes is now: OK on deployment-dbdump deployment-dbdump output: PROCS OK: 87 processes [21:56:18] RECOVERY Current Load is now: OK on mobile-enwp mobile-enwp output: OK - load average: 0.32, 0.10, 0.03 [21:56:18] RECOVERY Current Load is now: OK on incubator-dep
incubator-dep output: OK - load average: 0.07, 0.03, 0.01 [21:56:18] RECOVERY Disk Space is now: OK on labs-lvs1 labs-lvs1 output: DISK OK [21:56:28] RECOVERY Free ram is now: OK on venus venus output: OK: 86% free memory [21:56:28] RECOVERY Free ram is now: OK on fwserver1 fwserver1 output: OK: 88% free memory [21:56:28] RECOVERY Current Load is now: OK on bots-nfs bots-nfs output: OK - load average: 0.01, 0.04, 0.01 [21:56:28] RECOVERY Current Users is now: OK on deployment-dbdump deployment-dbdump output: USERS OK - 1 users currently logged in [21:56:38] RECOVERY Current Load is now: OK on ganglia-master ganglia-master output: OK - load average: 0.04, 0.05, 0.01 [21:56:38] RECOVERY Free ram is now: OK on deployment-dbdump deployment-dbdump output: OK: 82% free memory [21:56:38] RECOVERY Current Load is now: OK on bots-cb bots-cb output: OK - load average: 0.45, 0.40, 0.36 [21:56:38] RECOVERY Disk Space is now: OK on incubator-dep incubator-dep output: DISK OK [21:56:38] RECOVERY dpkg-check is now: OK on nova-dev4 nova-dev4 output: All packages OK [21:56:39] RECOVERY dpkg-check is now: OK on labs-lvs1 labs-lvs1 output: All packages OK [21:56:39] RECOVERY dpkg-check is now: OK on nginx-dev1 nginx-dev1 output: All packages OK [21:56:40] RECOVERY Total Processes is now: OK on nova-dev3 nova-dev3 output: PROCS OK: 77 processes [21:56:48] RECOVERY Disk Space is now: OK on p-b p-b output: DISK OK [21:56:48] RECOVERY Free ram is now: OK on bots-3 bots-3 output: OK: 52% free memory [21:56:48] RECOVERY dpkg-check is now: OK on mobile-enwp mobile-enwp output: All packages OK [21:56:48] RECOVERY dpkg-check is now: OK on bots-nfs bots-nfs output: All packages OK [21:56:48] RECOVERY Current Users is now: OK on labs-lvs1 labs-lvs1 output: USERS OK - 0 users currently logged in [21:56:49] RECOVERY Current Users is now: OK on nova-dev3 nova-dev3 output: USERS OK - 0 users currently logged in [21:56:49] RECOVERY Free ram is now: OK on incubator-dep incubator-dep output: OK: 91% free memory [21:56:58] RECOVERY Current Load is now: OK on test3 test3 output: OK - load average: 0.00, 0.04, 0.05 [21:56:58] RECOVERY Free ram is now: OK on deployment-nfs-memc deployment-nfs-memc output: OK: 77% free memory [21:56:58] RECOVERY Total Processes is now: OK on nginx-dev1 nginx-dev1 output: PROCS OK: 80 processes [21:57:05] oh sorry, i added myself there today to try and fix some of the nagios stuff, but since i couldn't run puppet it took a while for puppet to give me sudo [21:57:08] RECOVERY Current Users is now: OK on nova-daas-1 nova-daas-1 output: USERS OK - 0 users currently logged in [21:57:08] RECOVERY Disk Space is now: OK on deployment-dbdump deployment-dbdump output: DISK OK [21:57:08] RECOVERY Current Users is now: OK on embed-sandbox embed-sandbox output: USERS OK - 0 users currently logged in [21:57:08] RECOVERY Disk Space is now: OK on bots-nfs bots-nfs output: DISK OK [21:57:18] RECOVERY Current Users is now: OK on bots-cb bots-cb output: USERS OK - 1 users currently logged in [21:57:18] RECOVERY Disk Space is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: DISK OK [21:57:18] RECOVERY Total Processes is now: OK on reportcard1 reportcard1 output: PROCS OK: 91 processes [21:57:23] RECOVERY Free ram is now: OK on nova-dev3 nova-dev3 output: OK: 62% free memory [21:57:23] RECOVERY Free ram is now: OK on wikistats-01 wikistats-01 output: OK: 79% free memory [21:57:23] RECOVERY dpkg-check is now: OK on labs-build1 labs-build1 output: All packages OK [21:57:28] RECOVERY Disk Space is now: OK on 
ganglia-master ganglia-master output: DISK OK [21:57:38] RECOVERY Current Users is now: OK on labs-build1 labs-build1 output: USERS OK - 0 users currently logged in [21:57:38] RECOVERY Free ram is now: OK on ganglia-collector ganglia-collector output: OK: 82% free memory [21:57:38] RECOVERY Disk Space is now: OK on mobile-enwp mobile-enwp output: DISK OK [21:57:38] RECOVERY Total Processes is now: OK on mobile-enwp mobile-enwp output: PROCS OK: 92 processes [21:57:46] New patchset: Lcarr; "Removed payments file from labs as no payments cluster" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2094 [21:57:48] RECOVERY dpkg-check is now: OK on bots-cb bots-cb output: All packages OK [21:57:48] RECOVERY Disk Space is now: OK on ganglia-collector ganglia-collector output: DISK OK [21:57:48] RECOVERY Free ram is now: OK on nova-daas-1 nova-daas-1 output: OK: 65% free memory [21:57:53] LeslieCarr: I can fix it [21:57:58] RECOVERY Current Load is now: OK on reportcard1 reportcard1 output: OK - load average: 0.01, 0.05, 0.02 [21:57:58] RECOVERY Free ram is now: OK on labs-build1 labs-build1 output: OK: 89% free memory [21:57:58] RECOVERY dpkg-check is now: OK on reportcard1 reportcard1 output: All packages OK [21:58:05] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2094 [21:58:08] i got it now :) [21:58:08] RECOVERY dpkg-check is now: OK on deployment-nfs-memc deployment-nfs-memc output: All packages OK [21:58:08] RECOVERY Current Load is now: OK on labs-build1 labs-build1 output: OK - load average: 0.00, 0.04, 0.00 [21:58:08] RECOVERY Current Load is now: OK on vumi-gw1 vumi-gw1 output: OK - load average: 0.01, 0.04, 0.00 [21:58:08] RECOVERY Current Load is now: OK on nova-dev2 nova-dev2 output: OK - load average: 0.39, 0.17, 0.10 [21:58:14] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2094 [21:58:27] now, removed the reference to the payments cluster [21:58:28] RECOVERY Current Load is now: OK on nova-dev3 nova-dev3 output: OK - load average: 0.03, 0.03, 0.00 [21:58:31] let's see what else breaks! 
[21:58:38] RECOVERY Current Users is now: OK on nova-dev4 nova-dev4 output: USERS OK - 1 users currently logged in
[21:58:38] RECOVERY dpkg-check is now: OK on nova-daas-1 nova-daas-1 output: All packages OK
[21:58:48] RECOVERY Disk Space is now: OK on nova-dev2 nova-dev2 output: DISK OK
[21:58:48] RECOVERY dpkg-check is now: OK on incubator-dep incubator-dep output: All packages OK
[21:58:58] RECOVERY Total Processes is now: OK on incubator-dep incubator-dep output: PROCS OK: 82 processes
[21:59:08] RECOVERY Total Processes is now: OK on labs-build1 labs-build1 output: PROCS OK: 78 processes
[21:59:13] RECOVERY Total Processes is now: OK on vumi-gw1 vumi-gw1 output: PROCS OK: 79 processes
[21:59:18] RECOVERY Total Processes is now: OK on fwserver1 fwserver1 output: PROCS OK: 80 processes
[21:59:33] RECOVERY Total Processes is now: OK on bots-cb bots-cb output: PROCS OK: 123 processes
[21:59:38] RECOVERY Disk Space is now: OK on bots-cb bots-cb output: DISK OK
[21:59:38] RECOVERY Disk Space is now: OK on nova-daas-1 nova-daas-1 output: DISK OK
[21:59:38] RECOVERY dpkg-check is now: OK on vumi-gw1 vumi-gw1 output: All packages OK
[21:59:38] RECOVERY Current Load is now: OK on nova-daas-1 nova-daas-1 output: OK - load average: 0.04, 0.07, 0.06
[21:59:38] RECOVERY Current Users is now: OK on mobile-enwp mobile-enwp output: USERS OK - 0 users currently logged in
[21:59:38] RECOVERY Disk Space is now: OK on reportcard1 reportcard1 output: DISK OK
[21:59:43] RECOVERY Free ram is now: OK on bots-cb bots-cb output: OK: 63% free memory
[22:00:03] RECOVERY Total Processes is now: OK on labs-lvs1 labs-lvs1 output: PROCS OK: 80 processes
[22:00:09] RECOVERY dpkg-check is now: OK on fwserver1 fwserver1 output: All packages OK
[22:00:09] RECOVERY Disk Space is now: OK on fwserver1 fwserver1 output: DISK OK
[22:00:09] RECOVERY dpkg-check is now: OK on nova-dev2 nova-dev2 output: All packages OK
[22:00:09] RECOVERY Current Load is now: OK on fwserver1 fwserver1 output: OK - load average: 0.14, 0.10, 0.03
[22:00:23] RECOVERY Disk Space is now: OK on vumi-gw1 vumi-gw1 output: DISK OK
[22:00:23] RECOVERY Current Users is now: OK on nova-dev2 nova-dev2 output: USERS OK - 0 users currently logged in
[22:00:23] RECOVERY Total Processes is now: OK on nova-daas-1 nova-daas-1 output: PROCS OK: 114 processes
[22:00:28] RECOVERY Current Users is now: OK on fwserver1 fwserver1 output: USERS OK - 0 users currently logged in
[22:00:33] RECOVERY Free ram is now: OK on mobile-enwp mobile-enwp output: OK: 60% free memory
[22:00:33] RECOVERY Total Processes is now: OK on nova-dev2 nova-dev2 output: PROCS OK: 119 processes
[22:00:38] RECOVERY Free ram is now: OK on nova-dev2 nova-dev2 output: OK: 67% free memory
[22:00:53] RECOVERY Current Users is now: OK on incubator-dep incubator-dep output: USERS OK - 1 users currently logged in
[22:03:23] RECOVERY Free ram is now: OK on puppet-lucid puppet-lucid output: OK: 40% free memory
[22:03:43] RECOVERY Free ram is now: OK on nova-ldap1 nova-ldap1 output: OK: 73% free memory
[22:05:13] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 21% free memory
[22:05:33] RECOVERY Current Users is now: OK on prefixexport prefixexport output: USERS OK - 0 users currently logged in
[22:06:23] RECOVERY Current Load is now: OK on prefixexport prefixexport output: OK - load average: 0.10, 0.08, 0.02
[22:06:23] RECOVERY Free ram is now: OK on prefixexport prefixexport output: OK: 77% free memory
[22:07:03] PROBLEM Total Processes is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host
[22:07:08] RECOVERY Total Processes is now: OK on prefixexport prefixexport output: PROCS OK: 93 processes
[22:07:23] RECOVERY Disk Space is now: OK on prefixexport prefixexport output: DISK OK
[22:07:23] RECOVERY dpkg-check is now: OK on search-test search-test output: All packages OK
[22:07:43] RECOVERY dpkg-check is now: OK on prefixexport prefixexport output: All packages OK
[22:08:03] RECOVERY Current Load is now: OK on search-test search-test output: OK - load average: 0.09, 0.09, 0.03
[22:08:13] RECOVERY Disk Space is now: OK on search-test search-test output: DISK OK
[22:08:43] RECOVERY Free ram is now: OK on miniswarm miniswarm output: OK: 62% free memory
[22:08:43] RECOVERY Current Users is now: OK on search-test search-test output: USERS OK - 0 users currently logged in
[22:08:43] RECOVERY Free ram is now: OK on search-test search-test output: OK: 70% free memory
[22:08:53] RECOVERY Free ram is now: OK on canonical-bridge canonical-bridge output: OK: 72% free memory
[22:08:53] RECOVERY Free ram is now: OK on incubator-bots incubator-bots output: OK: 67% free memory
[22:09:03] PROBLEM dpkg-check is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host
[22:09:03] PROBLEM Disk Space is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host
[22:10:04] New patchset: Lcarr; "changing nagios service to nagios3 service" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2095
[22:10:23] RECOVERY Free ram is now: OK on labs-relay labs-relay output: OK: 88% free memory
[22:10:33] RECOVERY Total Processes is now: OK on search-test search-test output: PROCS OK: 95 processes
[22:10:53] RECOVERY Current Load is now: OK on feeds feeds output: OK - load average: 0.21, 0.09, 0.03
[22:11:23] RECOVERY Free ram is now: OK on feeds feeds output: OK: 86% free memory
[22:11:33] RECOVERY Free ram is now: OK on mediahandler-test mediahandler-test output: OK: 60% free memory
[22:11:43] PROBLEM Current Load is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host
[22:12:33] RECOVERY Free ram is now: OK on hugglewiki hugglewiki output: OK: 67% free memory
[22:12:53] RECOVERY Free ram is now: OK on mobile-feeds mobile-feeds output: OK: 69% free memory
[22:13:00] New patchset: Lcarr; "changing nagios service to nagios3 service" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2095
[22:13:03] PROBLEM Current Users is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host
[22:13:13] RECOVERY Free ram is now: OK on nova-dev1 nova-dev1 output: OK: 76% free memory
[22:13:33] RECOVERY Total Processes is now: OK on feeds feeds output: PROCS OK: 88 processes
[22:13:38] RECOVERY Free ram is now: OK on phabricator1 phabricator1 output: OK: 66% free memory
[22:13:43] RECOVERY Free ram is now: OK on labs-nfs1 labs-nfs1 output: OK: 87% free memory
[22:13:59] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2095
[22:13:59] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2095
[22:14:23] RECOVERY Current Users is now: OK on bots-apache1 bots-apache1 output: USERS OK - 0 users currently logged in
[22:14:33] RECOVERY Free ram is now: OK on wikisource-web wikisource-web output: OK: 85% free memory
[22:14:43] RECOVERY Disk Space is now: OK on feeds feeds output: DISK OK
[22:14:43] RECOVERY Current Users is now: OK on feeds feeds output: USERS OK - 0 users currently logged in
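The r2095 change merged above renames the puppet-managed service from nagios to nagios3. On Ubuntu the nagios3 package installs its init script under that name, so the puppet service resource has to match it; a hedged sanity check on the monitoring host, not taken from the log, might look like:

    # illustrative only: confirm which init script the nagios package installed,
    # then poke the renamed service
    ls /etc/init.d/ | grep -i nagios   # expect "nagios3" on a nagios3 install
    sudo service nagios3 status        # must match the name puppet now manages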
[22:14:43] RECOVERY Free ram is now: OK on wep wep output: OK: 75% free memory
[22:15:03] RECOVERY Free ram is now: OK on bots-apache1 bots-apache1 output: OK: 87% free memory
[22:15:03] RECOVERY dpkg-check is now: OK on feeds feeds output: All packages OK
[22:15:13] RECOVERY dpkg-check is now: OK on bots-apache1 bots-apache1 output: All packages OK
[22:15:13] RECOVERY Total Processes is now: OK on bots-apache1 bots-apache1 output: PROCS OK: 91 processes
[22:15:23] RECOVERY Disk Space is now: OK on bots-apache1 bots-apache1 output: DISK OK
[22:15:23] RECOVERY Free ram is now: OK on vivek-puppet vivek-puppet output: OK: 83% free memory
[22:15:33] RECOVERY Free ram is now: OK on bots-sql2 bots-sql2 output: OK: 77% free memory
[22:15:33] RECOVERY Free ram is now: OK on deployment-web deployment-web output: OK: 65% free memory
[22:15:43] RECOVERY Current Load is now: OK on bots-apache1 bots-apache1 output: OK - load average: 0.10, 0.07, 0.02
[22:16:09] hexmode: what the heck is happening
[22:16:53] RECOVERY Current Load is now: OK on deployment-sql deployment-sql output: OK - load average: 0.18, 0.12, 0.04
[22:18:03] RECOVERY dpkg-check is now: OK on deployment-sql deployment-sql output: All packages OK
[22:18:13] RECOVERY Disk Space is now: OK on deployment-sql deployment-sql output: DISK OK
[22:19:03] RECOVERY Current Users is now: OK on deployment-sql deployment-sql output: USERS OK - 0 users currently logged in
[22:19:13] RECOVERY Total Processes is now: OK on deployment-sql deployment-sql output: PROCS OK: 79 processes
[22:21:48] petan: doh i had the bot still muted, everything looks ok puppet side now, does it look okay from your end ?
[22:21:58] not really
[22:22:02] someone broke beta
[22:22:06] but it wasn't you :)
[22:22:20] LeslieCarr: I think it's ok, apart from that the instances need to have puppet forced
[22:22:27] because I can't do puppetd -tv
[22:22:32] on all
[22:22:36] but that should be fixed later
[22:22:41] cool
[22:22:44] I hope
[22:22:50] now I need to find hexmode
[22:23:13] RECOVERY Current Load is now: OK on pad2 pad2 output: OK - load average: 0.36, 0.08, 0.03
[22:23:20] now… i need to break puppet again ;)
[22:24:03] RECOVERY dpkg-check is now: OK on asher1 asher1 output: All packages OK
[22:24:13] RECOVERY Total Processes is now: OK on pad2 pad2 output: PROCS OK: 89 processes
[22:24:18] RECOVERY Disk Space is now: OK on asher1 asher1 output: DISK OK
[22:24:18] RECOVERY Current Users is now: OK on asher1 asher1 output: USERS OK - 0 users currently logged in
[22:24:53] RECOVERY Free ram is now: OK on pad2 pad2 output: OK: 84% free memory
[22:25:03] RECOVERY dpkg-check is now: OK on pad2 pad2 output: All packages OK
[22:25:33] RECOVERY Free ram is now: OK on asher1 asher1 output: OK: 93% free memory
[22:26:23] RECOVERY Current Load is now: OK on asher1 asher1 output: OK - load average: 0.02, 0.06, 0.03
[22:26:33] RECOVERY Total Processes is now: OK on asher1 asher1 output: PROCS OK: 100 processes
[22:26:38] RECOVERY Disk Space is now: OK on pad2 pad2 output: DISK OK
[22:27:03] RECOVERY dpkg-check is now: OK on deployment-wmsearch deployment-wmsearch output: All packages OK
[22:27:28] !sal
[22:27:28] https://labsconsole.wikimedia.org/wiki/Server_Admin_Log see it and you will know all you need
[22:27:33] RECOVERY Current Users is now: OK on pad2 pad2 output: USERS OK - 0 users currently logged in
[22:27:59] !log deployment-prep reverted unlogged changes made to config which broke whole site
[22:28:01] Logged the message, Master
[22:28:43] RECOVERY Current Load is now: OK on deployment-wmsearch deployment-wmsearch output: OK - load average: 0.01, 0.03, 0.00
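petan's point at 22:22 above is that every instance still needs a manually forced puppet run; `puppetd -tv` is the 2.x-era agent invocation used on these hosts (-t for a one-off test run, -v for verbose output). A minimal sketch of forcing that across a few instances, with the host list purely illustrative:

    # hypothetical loop from a host that has ssh access to the project instances
    for host in deployment-web deployment-sql deployment-dbdump; do
        ssh "$host" 'sudo puppetd -tv'
    done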
[22:28:53] RECOVERY Disk Space is now: OK on deployment-wmsearch deployment-wmsearch output: DISK OK
[22:30:33] RECOVERY Free ram is now: OK on deployment-wmsearch deployment-wmsearch output: OK: 89% free memory
[22:30:33] RECOVERY Total Processes is now: OK on deployment-wmsearch deployment-wmsearch output: PROCS OK: 94 processes
[22:30:53] RECOVERY Current Users is now: OK on deployment-wmsearch deployment-wmsearch output: USERS OK - 0 users currently logged in
[22:36:43] RECOVERY Current Load is now: OK on nova-production1 nova-production1 output: OK - load average: 0.69, 0.29, 0.11
[22:37:03] RECOVERY Free ram is now: OK on nova-production1 nova-production1 output: OK: 71% free memory
[22:37:03] RECOVERY Total Processes is now: OK on nova-production1 nova-production1 output: PROCS OK: 159 processes
[22:37:33] RECOVERY Free ram is now: OK on analytics analytics output: OK: 65% free memory
[22:38:03] RECOVERY Current Users is now: OK on nova-production1 nova-production1 output: USERS OK - 3 users currently logged in
[22:39:03] RECOVERY Disk Space is now: OK on nova-production1 nova-production1 output: DISK OK
[22:39:03] RECOVERY dpkg-check is now: OK on nova-production1 nova-production1 output: All packages OK
[22:41:50] petan: What was changed?
[22:42:23] johnduhart: tons of stuff by someone, and it broke the site
[22:42:33] Like what?
[22:42:44] it seemed to me like someone replaced the whole CommonSettings and other files, maybe with the production version
[22:42:50] sigh
[22:42:58] all local paths disappeared and were replaced with non-existing ones
[22:42:59] hexmode: Did you copy paste commonsettings?
[22:43:21] + someone added some cache-bypassing header hardcoded into settings
[22:43:33] I committed it before I reverted, so it's still in git
[22:43:43] but I don't know how to get it outside of labs
[22:44:16] probably push to github
[22:45:07] or just ssh there and check it
[22:47:21] mdale: did you change it?
[22:47:34] I guess it was you or hexm ode
[22:47:41] have not touched it
[22:47:45] ah ok
[22:47:58] anyway there is a git repository, so if you make a change, please commit it
[22:48:09] I don't know who all have access there
[22:48:25] but I think they should know it, I added a notice to motd.local
[22:48:35] * tail
[22:53:01] petan, you can also clone from outside
[22:53:22] I am really dumb when it comes to git
[22:53:40] I don't have access to that repo from outside
[22:53:49] I have access to outside from repo :)
[22:53:50] I'm not a git expert either
[22:54:21] but having to use it, I grasped some basic usage :)
[22:54:23] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 19% free memory
[22:54:26] where's the repo?
[22:54:59] in /usr/local/apache
[22:55:08] where should I write it so that people know that :)
[22:55:16] I have a feeling it's like on every wall
[22:55:17] :D
[22:55:34] there is a big label when you ssh to any instance telling you that
[22:55:48] uh? no
[22:55:52] I see a big motd there
[22:55:53] really?
[22:55:58] ah that's it
[22:56:00] with deployment cluster in ascii art
[22:56:04] but no mention of git there
[22:56:07] there is a text under it
[22:56:12] telling you to read help :)
[22:56:18] which is a wiki page
[22:56:24] containing information about git
[22:56:52] but it's good that people at least see the ascii :)
[22:56:56] not even there
[22:57:01] oh
[22:57:02] http://labs.wikimedia.beta.wmflabs.org/wiki/Help doesn't mention git xD
[22:57:06] yay
[22:57:18] still, I'd mention "commit all changes to git" in the motd
[22:57:54] fixed
[22:58:01] right, problem is that it can change
[22:58:09] and I would have to change the motd on all instances then
[22:58:24] that's why I prefer to have it all on wiki :)
[22:58:37] the less information we have in the motd, the more curious people will be to read the wiki
[22:58:55] _if_ they read it
[22:59:00] heh :)
[22:59:05] true
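Stepping back to petan's recovery at 22:43: committing the broken state before reverting, so the evidence stays in history, is a handy pattern. A rough sketch of it against the repo discussed here, with the commit message invented:

    cd /usr/local/apache
    git add -A
    git commit -m "snapshot: broken config as found"   # keep the bad state in history
    git revert --no-edit HEAD                          # new commit restoring the last good tree

Because the snapshot commit captures the breakage and the revert is itself a commit, both states remain inspectable later, which is exactly what lets petan say "it's still in git".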
[23:00:22] I have a feeling we should move the help to another wiki than the one which actually depends on that instance
[23:00:26] like mediawiki.org
[23:03:17] ok, so I have set up the path now
[23:03:25] to directly connect deployment-web
[23:03:52] um
[23:04:00] so git clone ssh://deployment-web/usr/local/apache should clone that repo
[23:04:03] you shouldn't use -web for anything other than reloading apache
[23:04:03] yep
[23:04:22] that path /usr/local/apache exists on all instances
[23:04:39] dbdump is perfect for maintenance and such
[23:04:45] so it was in bastion, too?
[23:04:52] in bastion not
[23:04:56] all instances in deployment
[23:04:57] heh
[23:05:17] which one would have been preferred for that kind of meta-action?
[23:05:25] dbdump
[23:05:40] that is an instance you can overload how you want and it wouldn't break anything
[23:06:00] it has access to config, db and everything and is completely separate
[23:06:12] it's used for running large imports
[23:06:22] or update etc
[23:06:30] I will show you
[23:06:40] !log deployment-prep updating svn
[23:06:41] Logged the message, Master
[23:07:49] Platonides: now I am running bin/updatedata on dbdump
[23:08:00] which runs update.php for all wikis we have
[23:08:04] I'm having problems connecting to dbdump
[23:08:21] deployment-dbdump
[23:08:21] I first tried from here with an ssh tunnel, and I got channel 0: open failed: administratively prohibited: open failed
[23:08:33] then bastion said Name or service not known
[23:08:45] ssh deployment-dbdump
[23:08:56] it's prefixed
[23:08:57] oh, ok
[23:09:05] I would like to make a standard from that
[23:09:10] I will talk to Ryan about it
[23:09:11] it's a good idea
[23:09:22] I don't like how people randomly name their instances, one day it will conflict
[23:09:44] like an instance called sql is probably a bad idea
[23:11:00] petan: I'm pretty sure it is a standard already...
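Platonides' connection trouble at 23:08 and its fix, hopping through the bastion and using the instance's prefixed name, can be captured in an ssh client config. The host patterns below are assumptions; only bastion and deployment-dbdump are attested in the log, and ssh -W needs OpenSSH 5.4 or newer:

    # ~/.ssh/config (sketch): route labs instances through the bastion
    #   Host deployment-*
    #       ProxyCommand ssh -W %h:%p bastion
    #
    # then, per petan's advice, clone from dbdump rather than -web:
    git clone ssh://deployment-dbdump/usr/local/apache beta-config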
[23:12:40] petan, that version does indeed look like a copy from deployment
[23:12:46] I mean, from real wikipedia
[23:13:14] comparing CommonSettings.php, the only difference is $wmgUseFeaturedFeeds added in the real one
[23:14:33] johnduhart: if it's a standard we should tell people ;0
[23:14:43] because there are dozens of weird names
[23:15:23] the person who did it probably thought we configured test identically to prod
[23:15:28] actually I would like to
[23:15:36] we would need to create new paths on the fs
[23:15:43] and move live to /home/wikipedia
[23:15:53] but I want to discuss it with someone from ops
[23:16:00] what
[23:16:20] to make it more identical to production, so that the paths would stay the same
[23:16:32] and we wouldn't need to overwrite it in Commons
[23:16:52] production doesn't have /usr/local/apache
[23:16:56] uh
[23:16:59] yes it does
[23:17:12] um, maybe it does, but it doesn't look like that
[23:17:14] from configs
[23:17:26] I have a feeling it's in /home/wikipedia
[23:17:38] I've seen it in some configs
[23:17:41] okay have fun with that
[23:18:04] I definitely won't start moving it without having clear info on how it exists on prod
[23:18:35] you don't have access there?
[23:18:41] no
[23:19:00] ops do
[23:19:00] I thought you had
[23:19:03] um... I don't...
[23:19:53] if I had, wikimedia would be down ^^ likely
[23:19:53] oh, don't worry
[23:19:53] it's considered a rite of passage ;)
[23:20:24] actually I support that idea of brion's to get rid of the need for devs to have shell access at all to be involved in the wikimedia cluster
[23:20:35] or Roan's
[23:21:02] having a git repository with all configs and all stuff in puppet etc
[23:21:10] well, once there's a beta site
[23:21:15] it seems doable
[23:21:21] from what I have heard there is no versioning on prod right now, or there is but probably some weird one
[23:21:33] No there's versioning
[23:21:37] I have long liked the idea of having to go through a repository for changes
[23:21:47] johnduhart: mu tante told me there is nothing like what we have on beta
[23:21:59] like you change a config, it's not tracked
[23:21:59] though not being able to look at the real live hacks, just through svn, I could be considered biased :)
[23:22:01] petan: It's a private svn repo.
[23:22:32] I guess mu tante would know about it
[23:22:40] I know there is a svn repo for files
[23:22:43] but not for config
[23:22:53] I do remember about a private svn repo
[23:23:06] yes it's for stuff like extensions and wiki files, afaik
[23:23:35] but configuration probably doesn't live there
[23:23:36] or if it does, mutatnte didn't know it :)
[23:23:40] * mu tante
[23:26:51] Platonides: I think it would be cool just to make changes to labs and if it's ok, merge it to prod
[23:27:00] indeed
[23:27:06] I think it's a goal of labs... maybe
[23:27:13] although that would require production to be more sane :)
[23:27:19] heh :)
[23:27:28] that would be a side effect of labs, making it sane
[23:39:23] petan: wmf-config is a private SVN repo. I want to move out the private stuff and move it to a public git repo managed in Gerrit, but I don't have time to actually do that in the short term
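The 23:13 comparison of CommonSettings.php against the production copy can be reproduced with a plain unified diff. Both paths below are assumptions, since only /usr/local/apache is attested above and how the production file was obtained is not said:

    # hedged sketch: compare beta's config with a locally saved production copy
    diff -u /usr/local/apache/wmf-config/CommonSettings.php \
            /tmp/prod-CommonSettings.php | less

A single added $wmgUseFeaturedFeeds line, as reported, would show up as one "+" hunk.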
[23:57:29] :o
[23:57:34] irc feed should be quite easy
[23:58:30] I may try it tomorrow
[23:58:33] ok
[23:58:42] maybe create a separate instance
[23:58:48] but it doesn't really need to be
[23:58:50] I was thinking so
[23:58:51] it's up to you
[23:58:54] ok
[23:59:18] I suppose I'll bu you if I encounter some problem
[23:59:22] as you are always here :)
[23:59:27] *bug
[23:59:29] don't forget to set up security before creating the instance
[23:59:30] good night
[23:59:41] because the only solution is to nuke the instance and start again