[00:02:42] ssmollett: I'll still be on IRC if you need anything
[01:22:56] fyi, i might break labs puppet right now :)
[01:39:53] PROBLEM Current Load is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:40:43] PROBLEM Current Users is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:03] PROBLEM dpkg-check is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:03] PROBLEM Disk Space is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:03] PROBLEM Free ram is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:28] PROBLEM Free ram is now: CRITICAL on puppet-lucid puppet-lucid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:33] PROBLEM Total Processes is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:41:43] PROBLEM Current Load is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:42:13] PROBLEM Free ram is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:42:33] PROBLEM Current Users is now: CRITICAL on puppet-lucid puppet-lucid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:43:43] PROBLEM Total Processes is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:44:03] PROBLEM dpkg-check is now: CRITICAL on hugglewiki hugglewiki output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:44:23] PROBLEM Total Processes is now: CRITICAL on puppet-lucid puppet-lucid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:44:43] PROBLEM Current Users is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:45:03] PROBLEM Current Load is now: CRITICAL on puppet-lucid puppet-lucid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:45:13] PROBLEM Disk Space is now: CRITICAL on ubuntu1-pgehres ubuntu1-pgehres output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:46:03] PROBLEM Current Load is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:46:53] PROBLEM Total Processes is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:47:13] PROBLEM Free ram is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:47:23] * Damianz runs around and stabs network stuff
[01:48:13] PROBLEM dpkg-check is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:49:43] PROBLEM Disk Space is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:50:13] PROBLEM Current Users is now: CRITICAL on gerrit gerrit output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:50:33] PROBLEM Current Users is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:50:33] PROBLEM Current Load is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
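The flood above — every service on a host going CRITICAL with "CHECK_NRPE: Error - Could not complete SSL handshake." at once — means Nagios cannot talk to the NRPE agent at all, not that TLS itself is broken; the usual cause (confirmed later in this log) is the monitoring server missing from NRPE's allowed_hosts. A quick way to reproduce the symptom by hand from the Nagios host, assuming the stock Ubuntu plugin path:

```sh
# Run on the Nagios server. A reachable, correctly configured NRPE daemon
# replies with its version string; an allowed_hosts mismatch reproduces the
# SSL-handshake error quoted in the alerts above.
/usr/lib/nagios/plugins/check_nrpe -H hugglewiki

# Separate the network question from the config question: is anything
# even listening on NRPE's default port?
nc -zv hugglewiki 5666
```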
[01:51:03] PROBLEM Free ram is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:23] PROBLEM Current Users is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:23] PROBLEM Total Processes is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:33] PROBLEM Free ram is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:48] PROBLEM Disk Space is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:48] PROBLEM Current Users is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:51:53] PROBLEM dpkg-check is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:52:03] PROBLEM Free ram is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:52:03] PROBLEM Total Processes is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:52:08] PROBLEM dpkg-check is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:52:23] PROBLEM Disk Space is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:52:53] PROBLEM dpkg-check is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:03] PROBLEM Current Users is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:13] PROBLEM Current Load is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:23] PROBLEM Disk Space is now: CRITICAL on ganglia-master ganglia-master output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:33] PROBLEM Disk Space is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:33] PROBLEM Current Load is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:53:43] PROBLEM dpkg-check is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:54:03] PROBLEM Free ram is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:54:23] PROBLEM Current Load is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:54:33] PROBLEM Total Processes is now: CRITICAL on phabricator1 phabricator1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[01:54:43] PROBLEM Total Processes is now: CRITICAL on jenkins2 jenkins2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[02:25:37] !project jenkins
[02:25:38] https://labsconsole.wikimedia.org/wiki/Nova_Resource:jenkins
[02:45:31] @regsearch *
[02:56:23] There it is :)
[02:56:26] (Wasn't me btw)
[02:56:27] @whoami
[02:56:27] You are trusted identified by name .*@wikipedia/.*
[02:56:27] lol
[02:56:36] !wm-bot
[02:56:36] http://meta.wikimedia.org/wiki/WM-Bot
[02:57:10] Ohhh... it's on apache1?
[02:57:23] lulz
[02:57:35] @search *
[02:57:36] Results (found 2): password, help,
[02:58:24] !gettingstarted is Welcome to Wikimedia Labs! Get yourself started at https://labsconsole.wikimedia.org/wiki/Getting_started
[02:58:24] Key was added!
[02:58:43] !gettingstarted | Hydriz
[02:58:43] Hydriz: Welcome to Wikimedia Labs! Get yourself started at https://labsconsole.wikimedia.org/wiki/Getting_started
[02:58:49] ah great
[02:59:28] !help
[02:59:28] want docs? ask for "!wm-bot". all keywords? try "@regsearch .*"
[02:59:36] !docs
[02:59:59] !docs is View complete documentation at https://labsconsole.wikimedia.org/wiki/Help:Contents
[02:59:59] Key was added!
[03:00:58] !docs | Hydriz
[03:00:58] Hydriz: View complete documentation at https://labsconsole.wikimedia.org/wiki/Help:Contents
[03:01:01] :)
[03:01:10] lulz
[03:01:17] can you help make documentation? :P
[03:01:34] Not right now, it's 3AM UK time... I need some sleep :P
[03:01:45] lol
[03:01:49] I just woke up haha
[03:01:54] Ooer
[03:09:20] 3AM? best time for working!
[03:09:32] Wow... beetstra's unblockbot.pl is slowly eating all the RAM on bots-2 >.>
[03:10:05] Hence why nagios is having a go it at
[03:10:09] at it*
[03:16:33] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 26% free memory
[03:19:58] !log incubator Importing incubatorwiki dump of 20120120 into prefixexport's enwiki
[03:19:59] Logged the message, Master
[03:22:05] !log bots Update packages on all bots instances (excluding apache1 which was done on the 23rd)
[03:22:07] Logged the message, Master
[03:22:22] Beetstra: PM?
[03:22:58] hi methecooldude
[03:23:32] what's up?
[03:24:42] eh, is it technically possible to create m1.medium?
[03:24:51] seems like it always fails for me
[03:26:33] PROBLEM dpkg-check is now: CRITICAL on bots-2 bots-2 output: DPKG CRITICAL dpkg reports broken packages
[03:26:58] Shut it nagios... it hasn't finished yet!
[03:31:33] RECOVERY dpkg-check is now: OK on bots-2 bots-2 output: All packages OK
[03:32:18] Hydriz: Yea, might need more hardware which isn't there yet
[03:32:23] PROBLEM Free ram is now: WARNING on bots-cb bots-cb output: Warning: 19% free memory
[03:32:36] I see
[03:32:41] then can I ask you something?
[03:32:47] Sure
[03:32:58] How do you guys mount filesystems across instances?
[03:33:11] something like what NFS share and things
[03:33:18] which I don't quite understand
[03:33:34] Hydriz: I don't have a clue, you will need to ask either Ryan or petrb
[03:33:41] I see
[03:33:48] Sorry... petan*
[03:34:05] The bots instances has this feature
[03:34:13] which I tried to figure out how to do
[03:34:21] but failed epically
[03:36:17] Hydriz: Ah... I see, you would get a failed message anyway, since you are not a sysadmin in the bots project, but even I'm getting the same message
[03:36:31] nono
[03:36:36] I did that on my own project
[03:36:43] Oh right
[03:37:11] I can't sudo in the bots project lol
[03:40:00] Bloody ClueBot 3! Eating 60% RAM
[03:40:12] !log incubator Created new instance incubator-nfs for Incubator file storage, with s1.large setup
[03:40:13] Logged the message, Master
[03:41:14] !log bots (bots-cb) Restarting ClueBot 3... how much RAM do you need!
[03:41:16] Logged the message, Master
[03:42:47] lol we seem to be competing in SAL
[03:42:59] Hydriz: Hehe
[03:43:44] PROBLEM Current Load is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[03:43:49] ...
[03:43:53] I just ran puppet
[03:44:24] PROBLEM Current Users is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
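Several episodes in this log ("I just ran puppet" above, "Rerunning puppet ... to keep nagios quiet" below) come down to forcing an immediate puppet agent run instead of waiting for the scheduled one. A one-liner, assuming the 2.x-era agent that labs instances ran at the time:

```sh
# Force an immediate puppet run on the instance, with verbose output:
sudo puppet agent --test
```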
[03:44:45] !log incubator Rerunning puppet on incubator-nfs to keep nagios quiet
[03:44:45] Logged the message, Master
[03:45:09] PROBLEM Disk Space is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[03:45:24] grr
[03:45:44] PROBLEM Free ram is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[03:47:04] PROBLEM Total Processes is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[03:47:29] !log incubator I think Hydriz broke it :P
[03:47:29] Logged the message, Master
[03:47:34] PROBLEM dpkg-check is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[03:47:34] RECOVERY Free ram is now: OK on bots-cb bots-cb output: OK: 78% free memory
[03:47:45] god damn lulz
[03:50:11] Hydriz: http://www.siamkia.com/open-source-help/how-to-fix-check-nrpe-error-could-not-complete-ssl-handshake.html ?
[03:50:41] hmm
[03:51:11] it was always automatic
[03:51:50] * Hydriz feels evil to just leave it as it is
[03:55:06] Hydriz: sudo nano /etc/nagios/nrpe_local.cfg and check that "allowed_hosts=10.4.0.34"
[03:55:24] * Hydriz checks
[03:55:42] its blank
[03:55:50] Ooer
[03:56:24] Ok, fill that file with http://privatepaste.com/189014eacc then
[03:56:29] and its blank for my other instances
[03:56:51] Has Nagios alerted about the others as well?
[03:57:06] not really
[03:57:32] Bet it will if you made a Puppet change on them :P
[03:57:42] But don't test that
[03:57:52] done
[03:58:06] Now just wait... Nagios will re-check soon
[03:58:34] okie
[03:58:43] What instance is it?
[03:58:56] prefixexport, deployment and incubator-nfs
[03:59:09] Yea, Nagios is all red for them
[03:59:28] http://nagios.wmflabs.org/nagios3/ - Click Service Problems
[04:02:33] FARK how to mount nfs...
[04:03:06] omg I just feel like killing the server
[04:03:41] Hydriz: Reboot it first...
[04:03:48] oh wait
[04:03:50] Then kill it :P
[04:03:54] I think I noticed someting
[04:03:57] *something
[04:03:58] What?
[04:04:04] some /etc/export
[04:04:09] *exports
[04:04:34] Heh, that's not on bos-cb
[04:04:37] bots-cb
[04:04:46] more like bots-nfs
[04:04:52] thats the global file server for bots
[04:04:56] Ah
[04:05:02] yes
[04:05:03] I see
[04:06:07] nope, still failed
[04:06:08] haiz
[04:06:58] oh, there is a package to install
[04:09:57] access denied...
[04:15:47] Hydriz: where did you get access denied?
[04:15:58] from deployment
[04:16:13] I am trying to get my deployment instance access to my incubator-nfs
[04:16:36] I have installed nfs-kernel-server
[04:16:44] set up /etc/exports
[04:16:54] still denied
[04:19:25] seems like I need to reboot
[04:20:44] RECOVERY Free ram is now: OK on incubator-nfs incubator-nfs output: OK: 90% free memory
[04:20:55] YES
[04:20:58] finally got it
[04:21:07] so it just lacks reboot
[04:21:14] sorry for the trouble people!
[04:21:28] * Hydriz deserves a slap
[04:22:04] RECOVERY Total Processes is now: OK on incubator-nfs incubator-nfs output: PROCS OK: 109 processes
[04:22:15] Hydriz: See, I said reboot :P
[04:22:34] RECOVERY dpkg-check is now: OK on incubator-nfs incubator-nfs output: All packages OK
[04:22:36] lulz
[04:23:00] i think reboot wasn't necessary
[04:23:25] worst case you would have needed to reload some kernel modules. but even that's extreme
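The fix methecooldude dictates at 03:55 is worth spelling out, since the same problem recurs all evening: NRPE only accepts connections from addresses listed in allowed_hosts, and a blank nrpe_local.cfg means the Nagios server (10.4.0.34 here) gets its handshake refused. A minimal sketch — the original privatepaste contents are gone, so anything beyond the allowed_hosts line would be an assumption:

```sh
# /etc/nagios/nrpe_local.cfg -- local NRPE overrides on the instance.
# 10.4.0.34 is the labs Nagios server quoted in this log.
allowed_hosts=10.4.0.34
```

The daemon only reads this file at startup, so the change also needs a restart of nagios-nrpe-server (the exact command comes up at 07:19 below) before the checks go green.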
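For the NFS saga just concluded (04:02-04:21), the three moving parts are the export on the server, reloading the export table, and the mount on the client. A sketch of the setup Hydriz ends up with — /mnt/1 and the client mount point /1 come from the !log entry further down, while the network range on the export line is purely illustrative:

```sh
# --- on incubator-nfs (the server) ---
sudo apt-get install nfs-kernel-server   # the package Hydriz found he was missing

# Export /mnt/1 to the client network (the CIDR here is hypothetical):
echo '/mnt/1 10.4.0.0/24(rw,sync,no_subtree_check)' | sudo tee -a /etc/exports
sudo exportfs -ra                        # reload the export table; no reboot needed
sudo showmount -e localhost              # verify the export is actually live

# --- on each client (prefixexport, deployment) ---
sudo apt-get install nfs-common
sudo mkdir -p /1
sudo mount -t nfs incubator-nfs:/mnt/1 /1
```

The "access denied" followed by "fixed after a reboot" pattern is the classic sign that /etc/exports was edited but exportfs was never run; the reboot only helps because the init script reloads the export table on the way up, which matches the "reboot wasn't necessary" verdict above. (On the df oddity that follows at 04:30: a 237G filesystem showing 225G available with only 188M used is consistent with ext's default 5% reserved-blocks allowance, adjustable with tune2fs -m.)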
[04:23:44] RECOVERY Current Load is now: OK on incubator-nfs incubator-nfs output: OK - load average: 0.46, 0.17, 0.06
[04:24:09] this nfs host is just create
[04:24:12] *created
[04:24:18] so rebooting isn't much of an issue
[04:24:24] RECOVERY Current Users is now: OK on incubator-nfs incubator-nfs output: USERS OK - 1 users currently logged in
[04:24:24] unlike my other instances
[04:25:04] RECOVERY Disk Space is now: OK on incubator-nfs incubator-nfs output: DISK OK
[04:25:15] * Hydriz feels like he just went around the world just to mount that drive
[04:30:37] Seems weird:
[04:30:38] Filesystem Size Used Avail Use% Mounted on
[04:30:38] /dev/vdb 237G 188M 225G 1% /mnt
[04:31:02] Size of 237GB and has only 225GB available
[04:31:28] when only 188MB used
[04:34:02] !log incubator Mounted incubator-nfs:/mnt/1 onto /1 of prefixexport and deployment instances
[04:34:03] Logged the message, Master
[05:44:35] Hydriz: did you really name your instance deployment?
[06:04:14] johnduhart: Yes, by accident :P
[06:05:18] but I haven't done anything to it yet, so I can delete and recreate with a different name
[06:09:21] !log incubator Deleting instance deployment as the name is too generic, may conflict with Deployment-prep project
[06:09:22] Logged the message, Master
[06:17:55] Did labs console just die out?
[06:24:54] PROBLEM host: incubator-dep is DOWN address: incubator-dep CRITICAL - Host Unreachable (incubator-dep)
[06:44:54] yeah, should be like that
[06:44:55] does it timeout via the web?
[06:44:56] or give an error?
[06:44:58] no, it just loads continuously
[06:44:58] no error
[06:44:58] no webpage
[06:44:59] oh, were you using a socks proxy?
[06:45:19] should be
[06:45:32] something like accessing instance from localhost:8080 thing
[06:45:36] are you still connected via ssh?
[06:45:41] yes
[06:45:59] turn the proxy off in your browser, then try labsconsole again
[06:46:12] I either use two browsers, or use foxyproxy in firefox
[06:46:49] because, when you use the proxy, it sends your browser requests through the proxy
[06:47:23] no, still doesn't work
[06:47:38] I even closed all ssh sessions
[06:47:50] its quite random
[06:47:57] just got it after I created a new instance
[06:48:56] and the problem seems to be isolated to my connection to labsconsole
[06:49:00] everything else works
[06:50:39] try this: telnet labsconsole.wikimedia.org 443
[06:51:21] Trying 208.80.153.135...
[06:51:21] Connected to virt0.wikimedia.org.
[06:51:21] Escape character is '^]'.
[06:51:32] then...
[06:51:33] Connection closed by foreign host.
[06:51:43] have you tried another browser?
[06:51:56] not yet
[06:52:00] trying now
[06:52:17] and it loads trololol
[06:52:29] my system is trolling me
[06:52:33] heh
[06:52:47] Firefox works
[06:52:51] Chrome doesn't load
[07:03:49] yeah, I don't have issues with my other instances, sigh
[07:03:49] ah, lemme check that
[07:04:10] * Hydriz feels like he is the troublemaker here
[07:04:38] no route to host
[07:04:59] hm
[07:06:13] Hydriz: did you delete it and re-create it?
[07:06:24] for this instance?
[07:06:26] yes
[07:06:39] I deleted the instance deployment and created under a different name
[07:06:47] which is this
[07:07:01] and yes it encountered an error earlier
[07:07:06] ah. ok
[07:07:09] so had to recreated
[07:07:14] *recreate
[07:07:26] when you delete an instance and recreate it, the dns takes a while to purge
[07:07:35] there's a one hour ttl on the entry
[07:07:48] when the instance is recreated, it gets a different IP address
[07:07:50] so it will take one hour to purge?
[07:07:57] I can do it manually really quick
[07:08:10] @search gdfg
[07:08:11] No results found! :|
[07:08:12] thanks :)
[07:08:17] @regsearch ..
[07:08:18] Results (found 84): instance, morebots, git, bang, nagios, bot, labs-home-wm, labs-nagios-wm, labs-morebots, gerrit-wm, wiki, labs, extension, wm-bot, putty, gerrit, change, revision, monitor, alert, password, unicorn, help, bz, os-change, instancelist, instance-json, leslie's-reset, damianz's-reset, amend, credentials, queue, sal, info, security, logging, ask, sudo, access, $realm, keys, $site, bug, pageant, blueprint-dns, bots, stucked, rt, pxe, ghsh, group, pathconflict, terminology, etherpad, epad, nova-resource, pastebin, newgrp, osm-bug, Ryan, bastion, ryanland, afk, test, initial-login, account-questions, manage-projects, rights, new-labsuser, cs, puppet, new-ldapuser, projects, quilt, labs-project, openstack-manager, wikitech, load, load-all, socks-proxy, wl, domain, gettingstarted, docs,
[07:08:21] @regsearch *
[07:08:21] This regex is totally bad
[07:08:26] Hydriz: ^
[07:08:27] :P
[07:08:36] omg I did that query
[07:08:38] fixed
[07:08:44] and I crashed the bot
[07:08:46] should work now
[07:08:47] yes
[07:08:54] thanks!
[07:08:57] :D
[07:09:06] lol no
[07:09:09] still exists
[07:09:17] sorry. purged in the wrong order
[07:09:18] it works now
[07:09:56] yep
[07:09:57] thanks!
[07:10:38] yw
[07:10:45] oh yes
[07:11:02] is it possible to change the instance from, say, m1.small to m1.medium?
[07:11:24] RECOVERY host: incubator-dep is UP address: incubator-dep PING OK - Packet loss = 0%, RTA = 0.75 ms
[07:11:29] nope
[07:11:33] I see
[07:11:36] resizing doesn't currently work
[07:11:42] possibly in the future
[07:11:49] because m1.medium breaks everything
[07:11:53] how so?
[07:11:57] like, we just can't create it
[07:12:02] really? what happens?
[07:12:12] not sure
[07:12:17] something about ruby's download
[07:12:27] eh?
[07:12:37] then wants you to run apt-get with some fixing parameter
[07:12:54] I can't really recall what happens
[07:13:01] but I know it stops in mid air
[07:13:04] ah. the ruby bug
[07:13:14] it isn't the instance type causing it
[07:13:30] it occasionally happens on any instance type
[07:13:35] and I haven't tracked it down yet
[07:13:40] usually deleting and recreating works
[07:13:50] but it always affect me creating m1.medium
[07:13:59] let me test
[07:14:00] so I want medium but always fail to
[07:14:14] why do you need medium?
[07:14:24] PROBLEM Current Users is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:14:30] (I'm looking into it, just curious)
[07:14:45] yeah, I am curious :P
[07:15:04] PROBLEM Disk Space is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:15:07] I haven't had issues with mediums in the past
[07:15:44] PROBLEM Free ram is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:16:32] hm. sure enough I get that error
[07:17:04] yeah
[07:17:07] that makes no sense
[07:17:19] obstructs the path to medium :P
[07:17:24] PROBLEM Total Processes is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:17:31] trying again :)
[07:17:34] PROBLEM dpkg-check is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:17:47] nagios: be quiet
[07:18:09] restart nagios-nrpe-server service
[07:18:50] any syntax in doing so?
[07:19:04] PROBLEM Current Load is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[07:19:12] /etc/init.d/nagios-nrpe-server restart
[07:19:16] and again it failed.
[07:19:19] that makes no sense
[07:19:21] heh
[07:20:04] RECOVERY Disk Space is now: OK on incubator-dep incubator-dep output: DISK OK
[07:20:15] RECOVERY Current Load is now: OK on prefixexport prefixexport output: OK - load average: 0.05, 0.24, 0.35
[07:20:35] RECOVERY Free ram is now: OK on incubator-dep incubator-dep output: OK: 91% free memory
[07:20:45] RECOVERY Current Users is now: OK on prefixexport prefixexport output: USERS OK - 4 users currently logged in
[07:20:46] I wonder if larges build
[07:20:57] 4 users?
[07:21:00] oh yes
[07:21:05] RECOVERY Free ram is now: OK on prefixexport prefixexport output: OK: 30% free memory
[07:21:07] I am running 4 terminals
[07:21:10] forgot
[07:22:11] large failed too
[07:22:13] wtf
[07:22:25] RECOVERY Disk Space is now: OK on prefixexport prefixexport output: DISK OK
[07:22:25] RECOVERY Total Processes is now: OK on incubator-dep incubator-dep output: PROCS OK: 84 processes
[07:22:35] RECOVERY Total Processes is now: OK on prefixexport prefixexport output: PROCS OK: 106 processes
[07:22:40] RECOVERY dpkg-check is now: OK on incubator-dep incubator-dep output: All packages OK
[07:22:46] lulz
[07:23:15] RECOVERY dpkg-check is now: OK on prefixexport prefixexport output: All packages OK
[07:24:05] RECOVERY Current Load is now: OK on incubator-dep incubator-dep output: OK - load average: 0.00, 0.01, 0.09
[07:24:25] RECOVERY Current Users is now: OK on incubator-dep incubator-dep output: USERS OK - 2 users currently logged in
[07:25:05] PROBLEM host: testmedium is DOWN address: testmedium check_ping: Invalid hostname/address - testmedium
[08:04:03] PROBLEM Current Load is now: CRITICAL on incubator-sql incubator-sql output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:04:23] PROBLEM Current Users is now: CRITICAL on incubator-sql incubator-sql output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:05:08] PROBLEM Disk Space is now: CRITICAL on incubator-sql incubator-sql output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:05:29] !log incubator Creating new instance incubator-sql for hosting MySQL databases on the incubator projects
[08:05:30] Logged the message, Master
[08:05:43] PROBLEM Free ram is now: CRITICAL on incubator-sql incubator-sql output: CHECK_NRPE: Error - Could not complete SSL handshake.
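On the stale-DNS issue Ryan explains back at 07:07: deleting and recreating an instance gives it a new IP, but the old A record can sit in caches for up to the one-hour TTL. A way to watch the record expire, assuming dnsutils is installed — the FQDN suffix and resolver address below are illustrative, not taken from the log:

```sh
# The second column of the answer is the remaining TTL in seconds,
# so you can see exactly how stale the cached record is:
dig +noall +answer incubator-dep.pmtpa.wmflabs

# Compare against the resolver that was purged manually
# (resolver IP is hypothetical):
dig +noall +answer incubator-dep.pmtpa.wmflabs @10.4.0.1
```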
[08:09:03] RECOVERY Current Load is now: OK on incubator-sql incubator-sql output: OK - load average: 0.44, 0.50, 0.31
[08:09:23] RECOVERY Current Users is now: OK on incubator-sql incubator-sql output: USERS OK - 1 users currently logged in
[08:10:03] RECOVERY Disk Space is now: OK on incubator-sql incubator-sql output: DISK OK
[08:10:43] RECOVERY Free ram is now: OK on incubator-sql incubator-sql output: OK: 85% free memory
[08:41:33] PROBLEM dpkg-check is now: CRITICAL on incubator-sql incubator-sql output: DPKG CRITICAL dpkg reports broken packages
[08:43:39] broken packages, yes. MySQL is still in the middle of installing
[09:00:56] !wl
[09:00:57] https://www.mediawiki.org/wiki/Wikimedia_Labs here you can find more
[09:01:33] RECOVERY dpkg-check is now: OK on incubator-sql incubator-sql output: All packages OK
[09:16:56] !log incubator Created new instance incubator-live for hosting MediaWiki files so that we can avoid having the same files on different servers
[09:16:57] Logged the message, Master
[09:54:34] PROBLEM dpkg-check is now: CRITICAL on incubator-sql incubator-sql output: DPKG CRITICAL dpkg reports broken packages
[09:57:41] !log incubator Deleting instance incubator-sql to rename to incubator-sql1, partially also due to severe misconfiguration of mysql installation
[09:57:42] Logged the message, Master
[10:03:43] !log deployment-prep installed ffmpeg on deployment-web (required by TMH to extract stills)
[10:03:44] Logged the message, Master
[10:26:35] RECOVERY Current Users is now: OK on bots-4 bots-4 output: USERS OK - 2 users currently logged in
[10:26:35] RECOVERY Total Processes is now: OK on bots-4 bots-4 output: PROCS OK: 85 processes
[10:26:55] RECOVERY dpkg-check is now: OK on bots-4 bots-4 output: All packages OK
[10:28:15] PROBLEM dpkg-check is now: CRITICAL on incubator-sql1 incubator-sql1 output: DPKG CRITICAL dpkg reports broken packages
[10:28:15] !log bots Fix nagios issue of bots-4 on SSL handshake by enabling 10.4.0.34 as allowed host in /etc/nagios/nrpe_local.cfg
[10:28:16] Logged the message, Master
[10:28:35] RECOVERY Current Load is now: OK on bots-4 bots-4 output: OK - load average: 0.36, 0.09, 0.03
[10:28:35] RECOVERY Disk Space is now: OK on bots-4 bots-4 output: DISK OK
[10:33:15] RECOVERY dpkg-check is now: OK on incubator-sql1 incubator-sql1 output: All packages OK
[12:56:15] PROBLEM dpkg-check is now: CRITICAL on incubator-sql1 incubator-sql1 output: DPKG CRITICAL dpkg reports broken packages
[13:23:28] !log incubator Deleting the incubator-sql1 instance as having another SQL server proves to be worthless
[13:23:30] Logged the message, Master
[13:32:09] !log incubator Creating the incubator-bots instance for hosting Wikimedia Incubator bots
[13:32:11] Logged the message, Master
[13:40:13] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 17% free memory
[13:42:41] linkwatcher.pl is taking quite a bit of CPU?
[13:42:57] Beetstra: ping
[13:43:10] Yes
[13:43:15] It also does a lot of work
[13:43:20] heh
[13:43:32] looks like bots-2 is your server :P
[13:43:48] yeah, sorry, kind of
[13:44:04] consider splitting up?
[13:44:12] impossible?
[13:44:17] like make linkwatcher.pl get its own server
[13:44:26] Lets just have one instance running at the edge
[13:44:33] The other two bots don't do anything big
[13:44:53] 32% lol
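When a bot is "slowly eating all the RAM" as unblockbot and ClueBot 3 do in this log, the first diagnostic is a per-process memory ranking rather than the aggregate figure Nagios reports. A generic sketch (nothing bot-specific assumed):

```sh
# Top ten resident-memory consumers on the instance:
ps aux --sort=-rss | head -n 11

# Track a suspect process over time (substitute a real PID for 12345):
watch -n 5 'ps -o pid,rss,vsz,cmd -p 12345'
```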
[13:45:07] you mean, XLinkBot and unblockbot?
[13:45:25] no
[13:45:35] unblockbot has a memory problem sometimes, I have to work on that this weekend
[13:45:36] nothing much
[13:45:41] I see
[13:45:51] [16:45:22] LW: 3 days, 22:19:24 hours active; RC: last 0 sec. ago; Reading approx. 772 wikis; Queues: P1=0; P2=0; P3=0; W=0; A1=0; A2=0; M=0; Total: 2382547 edits (420 PM); 312560 IP edits (15.9%; 55 PM); Watched: 1958863 (82.2%; 346 PM); Links: 112721 edits (5.7%; 19 PM); 333186 total (58 PM; 0.17 per edit; 2.95 per EL add edit); 39529 WL (11.8%; 6 PM); 2754 BL (0.8%; 0 PM); 246 RL (0%; 0 PM); 990 AL (0.2%; 0 PM)
[13:45:59] just letting you know about that 17% :P
[13:46:30] in other words, LiWa3 is parsing 772 wikis, with an edit speed of 420 edits per minute, parsing of those 346 edits per minute, finding 58 link additions per minute ...
[13:47:28] in which channel is it in?
[14:24:33] New review: Dzahn; "yep, changing the bugzilla logo to the one from commons, now unrelated to changing that link in the ..." [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/2013
[14:24:33] Change merged: Dzahn; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2013
[14:32:44] hydriz: #wikipedia-en-spam and #svn-wp-spam
[14:33:06] svn?
[15:55:13] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 20% free memory
[16:03:13] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 19% free memory
[18:53:13] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 22% free memory
[19:06:11] 01/25/2012 - 19:06:11 - Updating keys for zaran
[19:06:15] 01/25/2012 - 19:06:15 - Updating keys for zaran
[19:06:19] 01/25/2012 - 19:06:18 - Updating keys for zaran
[19:09:10] Ryan_Lane: any progress on those squid scripts :p
[19:13:41] Ryan_Lane : there is a french wikisource contributor who would like to have access to the wikisource instance on the labs, he is a good php developper and has some patches to test
[19:13:52] can I give him access to the wikisource project ?
[19:14:20] and if so, how can I do it ?
[19:21:41] Zaran: First, he will have to come in here and ask Ryan to set him up an account, then you can add him to the wikisource instance by clicking Add Member on https://labsconsole.wikimedia.org/wiki/Special:NovaProject
[19:21:54] petan|wk: how can I get squid purged?
[19:22:54] thanks methecooldude, he's just joined the channel (Tpt)
[19:22:57] * hexmode sees deployment-squid and decides to give it a try
[19:23:48] Hi, or should I say bonsoir :)
[19:23:54] Tpt: ^
[19:24:13] methecooldude: Hi.
[19:25:11] Tpt: What kind of patches do you have?
[19:26:26] johnduhart: Now, nothing. It's to test params and new extensions and maybe to work with Zaran for ProofreadPages
[19:26:57] hexmode: I thought you weren't doing new extension testing right now?
[19:27:39] johnduhart: nope, I'm not. why?
[19:28:16] Tpt: Deployment-prep isn't doing new extension testing right now, what were you looking to test?
[19:29:21] Things like FeaturedFeeds https://www.mediawiki.org/wiki/Extension:FeaturedFeeds
[19:30:32] I don't think that's being targets for 1.19wmf1 atm, will probably be done later.
[19:30:36] targeted*
[19:30:57] Tpt, you'll be able to test it live on WP in 1:30 ;)
[19:31:14] johnduhart : Tpt only wants access to the wikisource project
[19:31:15] johnduhart: featuredfeeds is being released now ^^
[19:31:23] which is a playground for wikisource
[19:31:40] hexmode: Okay so we should make sure to update our configuration
[19:32:03] johnduhart: can you do that?
[19:32:10] Zaran: I wouldn't call it a playground. It's a staging area for before 1.19wmf1
[19:32:18] hexmode: Nope, don't have root anymore.
[19:32:37] can I give it to you?
[19:33:51] No, I don't want to get involved. I'm willing to advise but at the end of the it's your and petan's project.
[19:33:58] heh
[19:34:01] I tried
[19:34:32] sorry.
[19:34:35] any suggestions for how to sync it
[19:34:37] np
[19:34:52] New patchset: Lcarr; "adding nrpe_local to stop the breakage of puppet" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2086
[19:34:52] hexmode: InitaliseSettings can just be copied from live
[19:35:02] Check for anything new in CommonSettings
[19:35:07] k I'll look at that
[19:35:09] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2086
[19:35:09] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/2086
[19:35:15] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2086
[19:35:51] It'd be nice if we could get an svn history of commonsettings
[19:41:13] PROBLEM dpkg-check is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:41:43] PROBLEM Free ram is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:41:48] johnduhart: that would probably be a git history of commonsettings
[19:41:51] I'll ask
[19:42:12] hexmode: I'm pretty sure it's a private svn repo unless they changed that
[19:42:23] PROBLEM Free ram is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:42:33] PROBLEM Current Load is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:42:43] LeslieCarr: Congratz!
[19:43:08] PROBLEM dpkg-check is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:43:09] methecooldude: ?
[19:43:13] PROBLEM Total Processes is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:43:18] PROBLEM Current Load is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:43:23] PROBLEM Total Processes is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:43:24] LeslieCarr: You patch, nagios goes mad :)
[19:43:30] haha
[19:43:37] but puppet is now working ;)
[19:43:53] PROBLEM Total Processes is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:44:03] PROBLEM Current Load is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:44:43] PROBLEM Disk Space is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:44:53] PROBLEM dpkg-check is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:45:08] PROBLEM Current Users is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:45:23] PROBLEM Current Load is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:45:23] PROBLEM Current Users is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
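hexmode and johnduhart's wish at 19:35-19:41 for a history of CommonSettings can be approximated even without access to the private repo: keep the copied config under a throwaway local git and snapshot it from cron. A sketch — the /srv/wmf-config path and the hourly schedule are invented for illustration:

```sh
# One-time setup (path is hypothetical):
cd /srv/wmf-config
git init
git add CommonSettings.php InitialiseSettings.php
git commit -m 'baseline copy from live'

# Crontab entry: commit hourly, but only when something actually changed.
# 0 * * * * cd /srv/wmf-config && git add -A && { git diff --cached --quiet || git commit -qm 'auto snapshot'; }
```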
[19:45:53] PROBLEM Current Load is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:45:53] PROBLEM Disk Space is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:13] gotta add myself to the nagios project first...
[19:46:23] PROBLEM Disk Space is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:33] PROBLEM Current Users is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:33] PROBLEM dpkg-check is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:43] PROBLEM Disk Space is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:48] PROBLEM Current Users is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:46:58] PROBLEM Total Processes is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:09] 01/25/2012 - 19:47:09 - Creating a home directory for lcarr at /export/home/nagios/lcarr
[19:47:13] PROBLEM Total Processes is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:23] PROBLEM Disk Space is now: CRITICAL on labs-nfs1 labs-nfs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:33] PROBLEM Disk Space is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:33] PROBLEM dpkg-check is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:33] PROBLEM Current Load is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:33] PROBLEM Free ram is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:33] PROBLEM Current Users is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:47:43] PROBLEM Total Processes is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:48:03] PROBLEM Disk Space is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:48:09] 01/25/2012 - 19:48:09 - Updating keys for lcarr
[19:48:38] PROBLEM Current Load is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:48:43] PROBLEM Disk Space is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:49:03] PROBLEM Free ram is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:49:03] PROBLEM dpkg-check is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:49:32] if anyone with nagios access wants to check this out while i wait for the bot to update my permissions
[19:49:38] it's probably related to nrpe_local
[19:49:58] PROBLEM dpkg-check is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:08] PROBLEM Disk Space is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:08] PROBLEM Current Users is now: CRITICAL on client1-lcarr client1-lcarr output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:08] PROBLEM Current Load is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:13] PROBLEM Current Load is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM dpkg-check is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM Free ram is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM Free ram is now: CRITICAL on wep wep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM dpkg-check is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM Current Users is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:23] PROBLEM Total Processes is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:33] PROBLEM Total Processes is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:43] PROBLEM Free ram is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:43] PROBLEM Current Load is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:43] PROBLEM Disk Space is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:50:43] PROBLEM dpkg-check is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:51:03] PROBLEM Current Load is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:51:03] PROBLEM Total Processes is now: CRITICAL on wikisource-web wikisource-web output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:52:23] PROBLEM Free ram is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:52:23] PROBLEM Free ram is now: CRITICAL on vivek-puppet vivek-puppet output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:52:23] PROBLEM dpkg-check is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:52:48] PROBLEM Current Load is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:52:53] PROBLEM Disk Space is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:08] PROBLEM Current Users is now: CRITICAL on deployment-sql deployment-sql output: Connection refused by host
[19:53:08] PROBLEM Total Processes is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:33] PROBLEM Disk Space is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:33] PROBLEM Free ram is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:33] PROBLEM Current Users is now: CRITICAL on mediahandler-test mediahandler-test output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:33] PROBLEM Free ram is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:53] PROBLEM Free ram is now: CRITICAL on feeds feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:53:58] PROBLEM Current Users is now: CRITICAL on bots-sql2 bots-sql2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:54:23] PROBLEM dpkg-check is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:54:23] PROBLEM Disk Space is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:54:23] PROBLEM Disk Space is now: CRITICAL on deployment-sql deployment-sql output: Connection refused by host
[19:54:23] PROBLEM Free ram is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:54:48] PROBLEM dpkg-check is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:54:53] PROBLEM Current Load is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:55:08] PROBLEM Free ram is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:55:08] PROBLEM Total Processes is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:55:13] PROBLEM Current Users is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:55:23] PROBLEM Disk Space is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:55:48] PROBLEM Total Processes is now: CRITICAL on nova-dev1 nova-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:56:47] !log bots moved unblockbot from bots-2 to bots-3
[19:56:48] Logged the message, Master
[19:57:28] PROBLEM dpkg-check is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:57:41] Beetstra: /me waits for Nagios to report that free ram is ok now :)
[19:58:34] bots-2 is now at 22% .. will go up and down a bit due to LiWa3 modules taking more memory, dying and respawning (ad infinitum)
[19:58:48] PROBLEM Total Processes is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:58:50] waiting for nagios to run puppet again so it knows i can have sudo access :(
[19:59:03] PROBLEM Total Processes is now: CRITICAL on deployment-sql deployment-sql output: Connection refused by host
[19:59:33] PROBLEM Current Users is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[19:59:38] PROBLEM dpkg-check is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:00:16] could someone give labs-nagios-wm_ some hands .. so it can shake hands again
[20:00:28] PROBLEM dpkg-check is now: CRITICAL on mobile-feeds mobile-feeds output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:00:38] PROBLEM Free ram is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:00:48] PROBLEM Disk Space is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:00:48] PROBLEM Disk Space is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:00:48] PROBLEM Total Processes is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:01:43] PROBLEM Current Load is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:02:03] PROBLEM Total Processes is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:03] PROBLEM Current Users is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:08] PROBLEM dpkg-check is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:23] PROBLEM Current Load is now: CRITICAL on pageviews pageviews output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:23] PROBLEM Current Users is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:28] PROBLEM Current Users is now: CRITICAL on deployment-backup deployment-backup output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:43] PROBLEM Current Users is now: CRITICAL on bots-apache1 bots-apache1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:43] PROBLEM Current Load is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:53] PROBLEM Total Processes is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:58] PROBLEM Current Users is now: CRITICAL on master master output: Connection refused by host
[20:03:58] PROBLEM dpkg-check is now: CRITICAL on deployment-sql deployment-sql output: Connection refused by host
[20:03:58] PROBLEM Disk Space is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:03:58] PROBLEM Total Processes is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:04:03] PROBLEM Total Processes is now: CRITICAL on master master output: Connection refused by host
[20:04:23] PROBLEM Current Load is now: CRITICAL on deployment-sql deployment-sql output: Connection refused by host
[20:04:23] PROBLEM Current Load is now: CRITICAL on master master output: Connection refused by host
[20:04:23] PROBLEM Free ram is now: CRITICAL on bots-1 bots-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:04:38] PROBLEM Current Load is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:04:53] PROBLEM Current Load is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:05:03] PROBLEM Disk Space is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:05:18] PROBLEM dpkg-check is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:05:49] labs-nagios-wm_: shaddup
[20:06:08] PROBLEM Free ram is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:08] PROBLEM Disk Space is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:08] PROBLEM Current Users is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:08] PROBLEM dpkg-check is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:08] PROBLEM Current Users is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:09] PROBLEM Free ram is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:09] PROBLEM Current Users is now: CRITICAL on bots-sql3 bots-sql3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:18] PROBLEM dpkg-check is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:28] PROBLEM dpkg-check is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:33] PROBLEM Free ram is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:33] PROBLEM Total Processes is now: CRITICAL on labs-ocg1 labs-ocg1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:38] PROBLEM Disk Space is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:43] PROBLEM Free ram is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:06:53] PROBLEM Total Processes is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:07:08] PROBLEM Total Processes is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:07:23] PROBLEM Total Processes is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:07:43] PROBLEM Current Users is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:13] PROBLEM Current Users is now: CRITICAL on asher1 asher1 output: Connection refused by host
[20:08:13] PROBLEM Disk Space is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:23] PROBLEM Free ram is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:33] PROBLEM dpkg-check is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:33] PROBLEM Current Users is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:38] PROBLEM dpkg-check is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:53] PROBLEM Disk Space is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:53] PROBLEM Current Load is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:53] PROBLEM Total Processes is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:58] PROBLEM Current Load is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:58] PROBLEM Disk Space is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:08:58] PROBLEM Current Load is now: CRITICAL on asher1 asher1 output: Connection refused by host
[20:09:03] PROBLEM Total Processes is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:08] PROBLEM Disk Space is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:33] PROBLEM Current Users is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:33] PROBLEM dpkg-check is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:33] PROBLEM Current Load is now: CRITICAL on bots-4 bots-4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:38] PROBLEM Current Load is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:38] PROBLEM Current Load is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:38] PROBLEM Free ram is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:38] PROBLEM Current Users is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:38] PROBLEM Total Processes is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:09:58] PROBLEM Free ram is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:08] PROBLEM Free ram is now: CRITICAL on master master output: Connection refused by host
[20:10:08] PROBLEM Current Load is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:08] PROBLEM Total Processes is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:13] PROBLEM Free ram is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:13] PROBLEM Current Users is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:13] PROBLEM Disk Space is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:18] PROBLEM Total Processes is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:38] PROBLEM Total Processes is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:10:58] PROBLEM Total Processes is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:03] PROBLEM Disk Space is now: CRITICAL on master master output: Connection refused by host
[20:11:23] PROBLEM dpkg-check is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:23] PROBLEM Current Load is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:23] PROBLEM Current Load is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:23] PROBLEM Disk Space is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:23] PROBLEM Total Processes is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:28] PROBLEM Free ram is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:28] PROBLEM dpkg-check is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:28] PROBLEM Disk Space is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:43] PROBLEM Free ram is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:43] PROBLEM Current Load is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:48] PROBLEM Current Users is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:48] PROBLEM Free ram is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:48] PROBLEM Current Users is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:48] PROBLEM Total Processes is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host
[20:11:53] PROBLEM Free ram is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:53] PROBLEM Current Users is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:11:53] PROBLEM Free ram is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:08] PROBLEM dpkg-check is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:08] PROBLEM Current Users is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:08] PROBLEM dpkg-check is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:08] PROBLEM Total Processes is now: CRITICAL on deployment-transcoding deployment-transcoding output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:13] PROBLEM Current Users is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:13] PROBLEM Free ram is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:18] PROBLEM Free ram is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:18] PROBLEM Current Load is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:28] PROBLEM Free ram is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:28] PROBLEM Current Load is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:28] PROBLEM dpkg-check is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:28] PROBLEM dpkg-check is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake.
[20:12:28] PROBLEM dpkg-check is now: CRITICAL on master master output: Connection refused by host [20:12:38] PROBLEM Free ram is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:38] PROBLEM Free ram is now: CRITICAL on incubator-nfs incubator-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:38] PROBLEM Current Users is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host [20:12:38] PROBLEM Current Users is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:38] PROBLEM Current Users is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:38] PROBLEM Current Users is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:48] PROBLEM Total Processes is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:53] PROBLEM Current Users is now: CRITICAL on bastion1 bastion1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:53] PROBLEM Total Processes is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:58] PROBLEM Free ram is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:12:58] PROBLEM Total Processes is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:03] PROBLEM Free ram is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:03] PROBLEM Current Load is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:03] PROBLEM Current Users is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:28] PROBLEM Disk Space is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:48] PROBLEM Total Processes is now: CRITICAL on asher1 asher1 output: Connection refused by host [20:13:53] PROBLEM Free ram is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:53] PROBLEM Total Processes is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:13:58] PROBLEM dpkg-check is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:13] PROBLEM dpkg-check is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:13] PROBLEM Disk Space is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:18] PROBLEM dpkg-check is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host [20:14:28] PROBLEM Current Load is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:38] PROBLEM Disk Space is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:38] PROBLEM dpkg-check is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:14:48] PROBLEM Current Load is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:48] PROBLEM Current Load is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:48] PROBLEM Disk Space is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:53] PROBLEM Current Users is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:53] PROBLEM dpkg-check is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:53] PROBLEM Free ram is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:14:53] PROBLEM Free ram is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:03] PROBLEM Total Processes is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:18] PROBLEM Total Processes is now: CRITICAL on deployment-squid deployment-squid output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:23] PROBLEM Current Users is now: CRITICAL on incubator-live incubator-live output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:23] PROBLEM Current Users is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:23] PROBLEM Current Users is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:33] PROBLEM dpkg-check is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:33] PROBLEM Total Processes is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Free ram is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Disk Space is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Free ram is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Disk Space is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Current Users is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Disk Space is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:38] PROBLEM Current Users is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:48] PROBLEM Current Load is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:48] PROBLEM Current Load is now: CRITICAL on pad2 pad2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:58] PROBLEM Current Load is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:15:58] PROBLEM Total Processes is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:03] PROBLEM dpkg-check is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:16:08] PROBLEM Current Load is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:08] PROBLEM Current Load is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:08] PROBLEM Current Load is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:08] PROBLEM Current Load is now: CRITICAL on nova-dev3 nova-dev3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:28] PROBLEM dpkg-check is now: CRITICAL on asher1 asher1 output: Connection refused by host [20:16:28] PROBLEM Current Load is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:38] PROBLEM Current Users is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:38] PROBLEM dpkg-check is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:38] PROBLEM dpkg-check is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:38] PROBLEM Disk Space is now: CRITICAL on turnkey-1 turnkey-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:38] PROBLEM Current Load is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host [20:16:38] PROBLEM Free ram is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:39] PROBLEM dpkg-check is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:39] PROBLEM dpkg-check is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:40] PROBLEM Current Load is now: CRITICAL on bots-sql1 bots-sql1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:40] PROBLEM Total Processes is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:48] PROBLEM Disk Space is now: CRITICAL on wikistats-01 wikistats-01 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:58] PROBLEM Current Users is now: CRITICAL on bots-nfs bots-nfs output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:16:58] PROBLEM Current Load is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:08] PROBLEM Disk Space is now: CRITICAL on nginx-dev1 nginx-dev1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:08] PROBLEM Total Processes is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:13] PROBLEM dpkg-check is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:13] PROBLEM Disk Space is now: CRITICAL on venus venus output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:13] PROBLEM Disk Space is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host [20:17:13] PROBLEM dpkg-check is now: CRITICAL on ganglia-collector ganglia-collector output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:13] PROBLEM Free ram is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:17:13] PROBLEM dpkg-check is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:18] PROBLEM Free ram is now: CRITICAL on embed-sandbox embed-sandbox output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:18] PROBLEM Disk Space is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:18] PROBLEM Current Load is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:18] PROBLEM Disk Space is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:18] PROBLEM dpkg-check is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:28] PROBLEM Disk Space is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:28] PROBLEM Current Users is now: CRITICAL on test3 test3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:28] PROBLEM Total Processes is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:33] PROBLEM Disk Space is now: CRITICAL on pad1 pad1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:33] PROBLEM Disk Space is now: CRITICAL on asher1 asher1 output: Connection refused by host [20:17:33] PROBLEM Current Users is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:38] PROBLEM Free ram is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:38] PROBLEM Disk Space is now: CRITICAL on reportcard1 reportcard1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:38] PROBLEM Current Users is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:38] PROBLEM Current Users is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:38] PROBLEM Current Users is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:17:38] PROBLEM Free ram is now: CRITICAL on asher1 asher1 output: Connection refused by host [20:17:48] PROBLEM dpkg-check is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:48] PROBLEM Total Processes is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:53] PROBLEM Total Processes is now: CRITICAL on bots-cb bots-cb output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:17:58] PROBLEM Total Processes is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:03] PROBLEM Total Processes is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:13] PROBLEM Disk Space is now: CRITICAL on nova-dev4 nova-dev4 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:13] PROBLEM Disk Space is now: CRITICAL on vumi-gw1 vumi-gw1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:23] PROBLEM Disk Space is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:18:33] PROBLEM Total Processes is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:43] PROBLEM Disk Space is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:48] PROBLEM Free ram is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:18:53] PROBLEM Total Processes is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:03] PROBLEM Current Users is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:03] PROBLEM Current Load is now: CRITICAL on labs-lvs1 labs-lvs1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:03] PROBLEM Free ram is now: CRITICAL on deployment-wmsearch deployment-wmsearch output: Connection refused by host [20:19:03] PROBLEM Disk Space is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:19:03] PROBLEM dpkg-check is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:03] PROBLEM Free ram is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:03] PROBLEM Current Users is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:08] PROBLEM Current Load is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:18] PROBLEM Total Processes is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:23] PROBLEM dpkg-check is now: CRITICAL on fwserver1 fwserver1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:23] PROBLEM dpkg-check is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:23] PROBLEM Current Load is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:28] PROBLEM Disk Space is now: CRITICAL on labs-build1 labs-build1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:33] PROBLEM Current Users is now: CRITICAL on incubator-dep incubator-dep output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:53] PROBLEM Free ram is now: CRITICAL on nova-dev2 nova-dev2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:53] PROBLEM Total Processes is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:19:58] PROBLEM Current Users is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:08] PROBLEM Free ram is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:08] PROBLEM Current Load is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:08] PROBLEM Free ram is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:08] PROBLEM Total Processes is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:13] PROBLEM Disk Space is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:20:28] PROBLEM Current Load is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:28] PROBLEM dpkg-check is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:20:38] PROBLEM Free ram is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:20:48] PROBLEM Disk Space is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:48] PROBLEM Current Load is now: CRITICAL on nova-ldap1 nova-ldap1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:48] PROBLEM Current Load is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:48] PROBLEM dpkg-check is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:20:53] PROBLEM Disk Space is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:21:23] PROBLEM dpkg-check is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:21:23] PROBLEM Total Processes is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:21:28] PROBLEM Total Processes is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:21:33] PROBLEM Current Users is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:03] PROBLEM Free ram is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:03] PROBLEM dpkg-check is now: CRITICAL on prefixexport prefixexport output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:13] PROBLEM Current Load is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:13] PROBLEM Total Processes is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:18] PROBLEM Current Users is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:18] PROBLEM Total Processes is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM Free ram is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM Disk Space is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM dpkg-check is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM Current Users is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM Current Users is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:43] PROBLEM Current Load is now: CRITICAL on bots-2 bots-2 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:53] PROBLEM Disk Space is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:53] PROBLEM Current Users is now: CRITICAL on search-test search-test output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[20:22:53] PROBLEM Current Users is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:53] PROBLEM Disk Space is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:22:53] PROBLEM Current Load is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host [20:22:53] PROBLEM Current Users is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:03] PROBLEM Free ram is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:03] PROBLEM Total Processes is now: CRITICAL on canonical-bridge canonical-bridge output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:08] PROBLEM Disk Space is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:28] PROBLEM Disk Space is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:33] PROBLEM dpkg-check is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:23:33] PROBLEM Free ram is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:03] PROBLEM Current Load is now: CRITICAL on bots-3 bots-3 output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:03] PROBLEM dpkg-check is now: CRITICAL on analytics analytics output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:03] PROBLEM dpkg-check is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:03] PROBLEM Total Processes is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:08] PROBLEM dpkg-check is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:43] PROBLEM Total Processes is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:24:48] PROBLEM Current Load is now: CRITICAL on incubator-bots incubator-bots output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:25:13] PROBLEM Free ram is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:25:43] PROBLEM Current Load is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:26:23] PROBLEM Total Processes is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:28:13] PROBLEM Current Load is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:29:43] PROBLEM Free ram is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:30:03] PROBLEM Disk Space is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:30:13] PROBLEM dpkg-check is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:30:13] PROBLEM Current Users is now: CRITICAL on p-b p-b output: CHECK_NRPE: Error - Could not complete SSL handshake. [20:35:52] hi ryan: would it be possible to send me an email whenever there is a code review waiting in one of the analytics repositories? 
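[Editor's note: when the handshake failures come from NRPE rejecting the monitoring server, the usual remedy, and what the nrpe_local.cfg change reviewed below is aiming at, is to list the Nagios host in allowed_hosts on each instance and restart the daemon. A minimal sketch, assuming the Debian/Ubuntu layout where /etc/nagios/nrpe.cfg includes nrpe_local.cfg; the monitoring server's address here is a made-up placeholder, not the real labs one:

    # /etc/nagios/nrpe_local.cfg should carry a line like:
    #   allowed_hosts=127.0.0.1,10.4.0.120
    # then restart NRPE so it rereads its config:
    sudo /etc/init.d/nagios-nrpe-server restart
]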
[20:40:18] 01/25/2012 - 20:40:18 - Creating a home directory for demon at /export/home/jenkins/demon [20:41:19] 01/25/2012 - 20:41:19 - Updating keys for demon [21:27:31] New patchset: Lcarr; "allowing nagios host in nrpe_local.cfg" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2092 [21:27:47] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2092 [21:27:54] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2092 [21:29:23] RECOVERY Current Users is now: OK on nova-ldap1 nova-ldap1 output: USERS OK - 0 users currently logged in [21:29:33] RECOVERY Disk Space is now: OK on nova-ldap1 nova-ldap1 output: DISK OK [21:29:33] RECOVERY Total Processes is now: OK on nova-ldap1 nova-ldap1 output: PROCS OK: 84 processes [21:30:19] hexmode: squid? [21:30:28] it's a bad idea to do anything there [21:31:12] LeslieCarr LeslieCarr LeslieCarr [21:31:17] you broke something, did you [21:31:22] hey [21:31:24] haha [21:31:27] * petan looks suspiciously at her [21:31:32] well in one way i fixed something [21:31:33] RECOVERY Current Load is now: OK on nova-ldap1 nova-ldap1 output: OK - load average: 0.08, 0.06, 0.02 [21:31:37] I see [21:31:43] however in another way i totally broke something :) [21:31:50] :O [21:31:51] bah [21:32:04] i added in the nrpe_local file so nagios could have it [21:32:09] and puppet wouldn't keep erroring [21:32:16] but now everything in nagios has ssl errors [21:32:20] as you might be able to see above :) [21:32:28] yay [21:32:29] I see [21:32:35] because you replaced our template [21:32:40] we made a template with Ryan [21:32:47] so that there was one on prod and one on labs [21:33:09] oh [21:33:31] so the merge ryan did this weekend made it so it was looking for a file [21:33:41] but i am guessing that it overwrote some templating bit [21:33:55] probably yes [21:34:03] RECOVERY dpkg-check is now: OK on nova-ldap1 nova-ldap1 output: All packages OK [21:34:03] RECOVERY Current Users is now: OK on nova-production1 nova-production1 output: USERS OK - 3 users currently logged in [21:34:04] look at that, it does have a template bit there [21:34:15] original file looks different [21:34:25] I will try to recover it [21:34:33] RECOVERY Current Users is now: OK on client1-lcarr client1-lcarr output: USERS OK - 0 users currently logged in [21:34:33] RECOVERY Disk Space is now: OK on nova-production1 nova-production1 output: DISK OK [21:34:36] but I am bad with git :P [21:34:46] johnduhart: you know git :P [21:34:53] RECOVERY Current Users is now: OK on puppet-lucid puppet-lucid output: USERS OK - 0 users currently logged in [21:34:57] how do I retrieve a file from an older revision [21:35:23] RECOVERY Total Processes is now: OK on bots-3 bots-3 output: PROCS OK: 96 processes [21:35:28] RECOVERY Disk Space is now: OK on bots-2 bots-2 output: DISK OK [21:35:33] RECOVERY dpkg-check is now: OK on nova-production1 nova-production1 output: All packages OK [21:35:37] git checkout e4f9aa8010ad4cfe5a8699252d4ef2e246394695 should get us the version before the merge happened [21:36:03] RECOVERY Total Processes is now: OK on puppet-lucid puppet-lucid output: PROCS OK: 84 processes [21:36:05] Just be careful, that rolls the whole repo back into a detached HEAD state [21:36:21] get what you need and checkout HEAD [21:36:23] PROBLEM Total Processes is now: WARNING on nova-production1 nova-production1 output: PROCS WARNING: 159 processes [21:36:28] LeslieCarr: templates/nagios/nrpe_local.cfg.erb [21:36:28] RECOVERY Current Users is now: OK on bots-2 bots-2 output: USERS OK - 0 users currently logged in [21:36:28] RECOVERY dpkg-check is now: OK on bots-2 bots-2 output: All packages OK [21:36:43] RECOVERY Current Load is now: OK on puppet-lucid puppet-lucid output: OK - load average: 0.00, 0.04, 0.01 [21:36:46] It's like a time machine, don't touch anything. [21:36:56] omg [21:37:00] wrong one [21:37:13] RECOVERY Total Processes is now: OK on client1-lcarr client1-lcarr output: PROCS OK: 81 processes [21:37:18] RECOVERY Total Processes is now: OK on analytics analytics output: PROCS OK: 95 processes [21:37:22] nope [21:37:26] yep [21:37:42] here we go [21:37:50] it's temporary anyway [21:37:53] so LeslieCarr [21:37:57] templates/nagios/nrpe_local.cfg.erb [21:38:01] that's the file we are missing here [21:38:16] it's in the revision you sent me [21:39:22] LeslieCarr: I have no idea if putting it back actually would fix it [21:39:22] i'm going to try and mostly restore the old version :) [21:39:30] there is probably a change in nagios.pp too [21:40:07] but I don't know what change on prod was done [21:40:15] is there a log or something saying why we changed the nagios config? [21:40:33] if I knew the reason I could try to find a way to implement it [21:41:08] New patchset: Lcarr; "restoring odl nrpe.pp to pre-merge state" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2093 [21:41:20] ryan did a big merge of production into test [21:41:32] and the template wasn't in production itself at the time i guess [21:41:43] so that would have killed a lot of changes that were only in the test repo [21:41:47] New review: Petrb; "I think you are missing the template file" [operations/puppet] (test) C: 0; - https://gerrit.wikimedia.org/r/2093 [21:42:12] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2093 [21:42:13] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2093 [21:42:32] ok let's see [21:42:37] petan: the template file is still there [21:42:48] RECOVERY Current Users is now: OK on labs-relay labs-relay output: USERS OK - 0 users currently logged in [21:42:56] recovery :) [21:42:58] RECOVERY Current Load is now: OK on labs-relay labs-relay output: OK - load average: 0.02, 0.05, 0.01 [21:42:58] RECOVERY Current Users is now: OK on deployment-web deployment-web output: USERS OK - 0 users currently logged in [21:42:58] RECOVERY Current Users is now: OK on labs-nfs1 labs-nfs1 output: USERS OK - 0 users currently logged in [21:42:58] RECOVERY Total Processes is now: OK on vivek-puppet vivek-puppet output: PROCS OK: 80 processes [21:42:59] that looks ok [21:43:00] :D [21:43:03] RECOVERY dpkg-check is now: OK on wep wep output: All packages OK [21:43:18] RECOVERY Disk Space is now: OK on deployment-web deployment-web output: DISK OK [21:43:18] RECOVERY Disk Space is now: OK on phabricator1 phabricator1 output: DISK OK [21:43:18] RECOVERY Disk Space is now: OK on labs-relay labs-relay output: DISK OK [21:43:38] RECOVERY Total Processes is now: OK on mediahandler-test mediahandler-test output: PROCS OK: 78 processes [21:43:48] RECOVERY dpkg-check is now: OK on bots-sql2 bots-sql2 output: All packages OK [21:43:48] RECOVERY dpkg-check is now: OK on phabricator1 phabricator1 output: All packages OK [21:43:58] RECOVERY Current Load is now: OK on wep wep output: OK - load average: 0.11, 0.06, 0.01 [21:44:05] hexmode: around?
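[Editor's note: the checkout suggested above moves the whole tree to e4f9aa80, which is why it lands in a detached-HEAD state. To pull back just the one lost file there is a narrower form that leaves HEAD where it is; a sketch, assuming work happens on the (test) branch named in the gerrit messages:

    # restore just the missing template from the pre-merge commit,
    # without moving HEAD off the current branch:
    git checkout e4f9aa8010ad4cfe5a8699252d4ef2e246394695 -- templates/nagios/nrpe_local.cfg.erb
    # or, after a whole-tree checkout like the one above, return with:
    git checkout test
]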
[21:44:18] RECOVERY dpkg-check is now: OK on deployment-web deployment-web output: All packages OK [21:44:18] RECOVERY Total Processes is now: OK on mobile-feeds mobile-feeds output: PROCS OK: 96 processes [21:44:23] RECOVERY Total Processes is now: OK on gerrit gerrit output: PROCS OK: 83 processes [21:44:28] RECOVERY Disk Space is now: OK on jenkins2 jenkins2 output: DISK OK [21:44:38] RECOVERY Free ram is now: OK on jenkins2 jenkins2 output: OK: 74% free memory [21:44:38] RECOVERY dpkg-check is now: OK on mobile-feeds mobile-feeds output: All packages OK [21:44:41] hexmode: ping pong me [21:44:48] RECOVERY Total Processes is now: OK on deployment-web deployment-web output: PROCS OK: 99 processes [21:44:50] I will handle squid then [21:44:53] RECOVERY Current Load is now: OK on mediahandler-test mediahandler-test output: OK - load average: 0.00, 0.01, 0.00 [21:44:53] RECOVERY Current Load is now: OK on bots-sql2 bots-sql2 output: OK - load average: 0.00, 0.03, 0.00 [21:44:58] RECOVERY Disk Space is now: OK on wikisource-web wikisource-web output: DISK OK [21:45:08] RECOVERY Current Load is now: OK on wikisource-web wikisource-web output: OK - load average: 0.00, 0.02, 0.00 [21:45:08] RECOVERY Total Processes is now: OK on wikisource-web wikisource-web output: PROCS OK: 90 processes [21:45:18] RECOVERY dpkg-check is now: OK on jenkins2 jenkins2 output: All packages OK [21:45:18] RECOVERY dpkg-check is now: OK on mediahandler-test mediahandler-test output: All packages OK [21:45:28] RECOVERY Total Processes is now: OK on labs-relay labs-relay output: PROCS OK: 80 processes [21:45:33] RECOVERY dpkg-check is now: OK on vivek-puppet vivek-puppet output: All packages OK [21:45:33] RECOVERY dpkg-check is now: OK on nova-dev1 nova-dev1 output: All packages OK [21:45:38] RECOVERY Current Load is now: OK on jenkins2 jenkins2 output: OK - load average: 0.31, 0.12, 0.04 [21:46:08] RECOVERY Free ram is now: OK on deployment-backup deployment-backup output: OK: 44% free memory [21:46:18] RECOVERY Total Processes is now: OK on jenkins2 jenkins2 output: PROCS OK: 82 processes [21:46:28] RECOVERY Disk Space is now: OK on deployment-backup deployment-backup output: DISK OK [21:46:28] RECOVERY Current Load is now: OK on bots-sql3 bots-sql3 output: OK - load average: 0.40, 0.15, 0.11 [21:46:28] RECOVERY Disk Space is now: OK on mobile-feeds mobile-feeds output: DISK OK [21:46:38] RECOVERY Free ram is now: OK on pageviews pageviews output: OK: 73% free memory [21:46:38] RECOVERY Current Users is now: OK on deployment-backup deployment-backup output: USERS OK - 0 users currently logged in [21:46:48] RECOVERY dpkg-check is now: OK on bots-4 bots-4 output: All packages OK [21:46:58] RECOVERY Free ram is now: OK on bots-sql3 bots-sql3 output: OK: 68% free memory [21:47:08] RECOVERY Current Load is now: OK on deployment-web deployment-web output: OK - load average: 0.00, 0.01, 0.00 [21:47:08] RECOVERY Disk Space is now: OK on labs-ocg1 labs-ocg1 output: DISK OK [21:47:18] RECOVERY Free ram is now: OK on bots-1 bots-1 output: OK: 86% free memory [21:47:18] RECOVERY Current Load is now: OK on labs-ocg1 labs-ocg1 output: OK - load average: 0.18, 0.07, 0.02 [21:47:18] RECOVERY Current Users is now: OK on labs-ocg1 labs-ocg1 output: USERS OK - 0 users currently logged in [21:47:28] RECOVERY Current Users is now: OK on mobile-feeds mobile-feeds output: USERS OK - 1 users currently logged in [21:47:28] RECOVERY Total Processes is now: OK on labs-ocg1 labs-ocg1 output: PROCS OK: 79 processes [21:47:38] RECOVERY Current 
Users is now: OK on jenkins2 jenkins2 output: USERS OK - 0 users currently logged in [21:47:53] hey petan can't look at nagios.pp yet, got prod packet loss [21:47:58] RECOVERY Current Users is now: OK on bots-sql3 bots-sql3 output: USERS OK - 0 users currently logged in [21:47:58] RECOVERY Total Processes is now: OK on pageviews pageviews output: PROCS OK: 96 processes [21:48:26] ok [21:48:28] RECOVERY Free ram is now: OK on labs-ocg1 labs-ocg1 output: OK: 90% free memory [21:48:28] RECOVERY dpkg-check is now: OK on pageviews pageviews output: All packages OK [21:48:38] RECOVERY Disk Space is now: OK on bots-4 bots-4 output: DISK OK [21:48:48] RECOVERY Total Processes is now: OK on deployment-backup deployment-backup output: PROCS OK: 80 processes [21:48:53] RECOVERY Current Load is now: OK on incubator-nfs incubator-nfs output: OK - load average: 0.17, 0.05, 0.01 [21:48:53] RECOVERY Current Load is now: OK on bots-4 bots-4 output: OK - load average: 0.07, 0.08, 0.02 [21:48:58] RECOVERY Current Load is now: OK on deployment-backup deployment-backup output: OK - load average: 0.00, 0.02, 0.00 [21:49:18] RECOVERY Current Users is now: OK on bots-1 bots-1 output: USERS OK - 0 users currently logged in [21:49:18] RECOVERY dpkg-check is now: OK on deployment-backup deployment-backup output: All packages OK [21:49:38] RECOVERY Disk Space is now: OK on incubator-nfs incubator-nfs output: DISK OK [21:49:38] RECOVERY Free ram is now: OK on master master output: OK: 93% free memory [21:49:48] RECOVERY Current Users is now: OK on incubator-nfs incubator-nfs output: USERS OK - 0 users currently logged in [21:49:48] RECOVERY Total Processes is now: OK on labs-realserver labs-realserver output: PROCS OK: 79 processes [21:49:58] RECOVERY Disk Space is now: OK on master master output: DISK OK [21:49:58] RECOVERY Disk Space is now: OK on pageviews pageviews output: DISK OK [21:50:08] RECOVERY dpkg-check is now: OK on labs-realserver labs-realserver output: All packages OK [21:50:18] RECOVERY Total Processes is now: OK on deployment-transcoding deployment-transcoding output: PROCS OK: 79 processes [21:50:28] RECOVERY Total Processes is now: OK on bots-4 bots-4 output: PROCS OK: 83 processes [21:50:33] RECOVERY Total Processes is now: OK on bots-1 bots-1 output: PROCS OK: 93 processes [21:50:38] RECOVERY dpkg-check is now: OK on labs-ocg1 labs-ocg1 output: All packages OK [21:50:48] RECOVERY Current Users is now: OK on bastion1 bastion1 output: USERS OK - 8 users currently logged in [21:50:48] RECOVERY dpkg-check is now: OK on master master output: All packages OK [21:50:48] RECOVERY Current Load is now: OK on pageviews pageviews output: OK - load average: 0.19, 0.58, 0.59 [21:50:58] RECOVERY Free ram is now: OK on bots-4 bots-4 output: OK: 90% free memory [21:50:58] RECOVERY Current Load is now: OK on incubator-live incubator-live output: OK - load average: 0.13, 0.06, 0.02 [21:51:08] RECOVERY dpkg-check is now: OK on bots-1 bots-1 output: All packages OK [21:51:08] RECOVERY Current Users is now: OK on master master output: USERS OK - 0 users currently logged in [21:51:08] RECOVERY Current Users is now: OK on pageviews pageviews output: USERS OK - 0 users currently logged in [21:51:08] RECOVERY Total Processes is now: OK on master master output: PROCS OK: 95 processes [21:51:13] RECOVERY Total Processes is now: OK on bots-sql3 bots-sql3 output: PROCS OK: 83 processes [21:51:18] RECOVERY Current Load is now: OK on bots-1 bots-1 output: OK - load average: 0.12, 0.06, 0.01 [21:51:28] RECOVERY Disk Space is 
now: OK on bots-1 bots-1 output: DISK OK [21:51:28] RECOVERY Current Load is now: OK on labs-realserver labs-realserver output: OK - load average: 0.00, 0.02, 0.00 [21:51:28] RECOVERY dpkg-check is now: OK on bots-sql3 bots-sql3 output: All packages OK [21:51:38] RECOVERY Total Processes is now: OK on incubator-nfs incubator-nfs output: PROCS OK: 91 processes [21:51:48] RECOVERY dpkg-check is now: OK on deployment-transcoding deployment-transcoding output: All packages OK [21:51:48] RECOVERY Disk Space is now: OK on bots-sql3 bots-sql3 output: DISK OK [21:51:58] RECOVERY Free ram is now: OK on incubator-nfs incubator-nfs output: OK: 85% free memory [21:51:58] RECOVERY Current Load is now: OK on deployment-transcoding deployment-transcoding output: OK - load average: 0.02, 0.03, 0.01 [21:51:58] RECOVERY Current Load is now: OK on bastion1 bastion1 output: OK - load average: 0.04, 0.05, 0.01 [21:52:08] RECOVERY Current Users is now: OK on bots-sql1 bots-sql1 output: USERS OK - 0 users currently logged in [21:52:08] RECOVERY dpkg-check is now: OK on incubator-live incubator-live output: All packages OK [21:52:08] RECOVERY Current Load is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: OK - load average: 0.24, 0.08, 0.02 [21:52:08] PROBLEM Disk Space is now: WARNING on deployment-transcoding deployment-transcoding output: DISK WARNING - free space: / 46 MB (3% inode=54%): [21:52:18] RECOVERY Current Users is now: OK on deployment-transcoding deployment-transcoding output: USERS OK - 0 users currently logged in [21:52:18] RECOVERY Disk Space is now: OK on labs-realserver labs-realserver output: DISK OK [21:52:28] RECOVERY Disk Space is now: OK on embed-sandbox embed-sandbox output: DISK OK [21:52:28] RECOVERY Current Users is now: OK on incubator-live incubator-live output: USERS OK - 0 users currently logged in [21:52:28] RECOVERY Current Load is now: OK on venus venus output: OK - load average: 0.31, 0.08, 0.02 [21:52:28] RECOVERY dpkg-check is now: OK on deployment-squid deployment-squid output: All packages OK [21:52:38] RECOVERY Disk Space is now: OK on incubator-live incubator-live output: DISK OK [21:52:38] RECOVERY Free ram is now: OK on bots-nfs bots-nfs output: OK: 88% free memory [21:52:38] RECOVERY Current Users is now: OK on labs-realserver labs-realserver output: USERS OK - 0 users currently logged in [21:52:38] RECOVERY Current Users is now: OK on aggregator1 aggregator1 output: USERS OK - 0 users currently logged in [21:52:38] RECOVERY Current Users is now: OK on bots-4 bots-4 output: USERS OK - 1 users currently logged in [21:52:43] LeslieCarr: can you force puppet to run on all instances?
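[Editor's note: as the exchange just below shows, there was no fleet-wide push available to a labs user; the route was a manual agent run per instance over SSH. For reference, a sketch of the Puppet 2.x-era invocation quoted below:

    # one-shot foreground agent run; -t (--test) implies --onetime,
    # --verbose and --no-daemonize on the old puppetd client
    sudo puppetd -tv
]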
[21:52:48] RECOVERY Total Processes is now: OK on p-b p-b output: PROCS OK: 88 processes [21:52:53] RECOVERY Free ram is now: OK on bots-sql1 bots-sql1 output: OK: 88% free memory [21:52:53] RECOVERY Current Users is now: OK on pad1 pad1 output: USERS OK - 0 users currently logged in [21:52:53] RECOVERY Free ram is now: OK on deployment-squid deployment-squid output: OK: 88% free memory [21:52:58] RECOVERY Free ram is now: OK on pad1 pad1 output: OK: 89% free memory [21:52:58] RECOVERY Free ram is now: OK on client1-lcarr client1-lcarr output: OK: 55% free memory [21:52:58] RECOVERY dpkg-check is now: OK on bastion1 bastion1 output: All packages OK [21:52:58] RECOVERY Total Processes is now: OK on bastion1 bastion1 output: PROCS OK: 163 processes [21:53:03] RECOVERY Current Load is now: OK on nginx-dev1 nginx-dev1 output: OK - load average: 0.25, 0.10, 0.03 [21:53:03] RECOVERY Total Processes is now: OK on embed-sandbox embed-sandbox output: PROCS OK: 81 processes [21:53:08] RECOVERY Disk Space is now: OK on wikistats-01 wikistats-01 output: DISK OK [21:53:08] RECOVERY dpkg-check is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: All packages OK [21:53:08] RECOVERY Free ram is now: OK on labs-realserver labs-realserver output: OK: 63% free memory [21:53:09] petan: SSH into the instance, run sudo puppetd -tv ? [21:53:18] RECOVERY Free ram is now: OK on bastion1 bastion1 output: OK: 80% free memory [21:53:18] RECOVERY Current Users is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: USERS OK - 0 users currently logged in [21:53:18] RECOVERY Total Processes is now: OK on bots-nfs bots-nfs output: PROCS OK: 90 processes [21:53:20] RoanKattouw: I can't ssh to all instances [21:53:22] only to mine [21:53:23] RECOVERY Disk Space is now: OK on test3 test3 output: DISK OK [21:53:23] RECOVERY Free ram is now: OK on aggregator1 aggregator1 output: OK: 91% free memory [21:53:27] Right [21:53:28] RECOVERY Current Load is now: OK on ganglia-collector ganglia-collector output: OK - load average: 0.16, 0.09, 0.02 [21:53:28] RECOVERY Free ram is now: OK on gerrit gerrit output: OK: 78% free memory [21:53:28] RECOVERY dpkg-check is now: OK on pad1 pad1 output: All packages OK [21:53:28] RECOVERY Total Processes is now: OK on deployment-squid deployment-squid output: PROCS OK: 84 processes [21:53:38] RECOVERY Free ram is now: OK on deployment-transcoding deployment-transcoding output: OK: 71% free memory [21:53:38] RECOVERY Disk Space is now: OK on nginx-dev1 nginx-dev1 output: DISK OK [21:53:38] RECOVERY Disk Space is now: OK on bots-sql1 bots-sql1 output: DISK OK [21:53:38] RECOVERY Disk Space is now: OK on venus venus output: DISK OK [21:53:38] RECOVERY dpkg-check is now: OK on ganglia-collector ganglia-collector output: All packages OK [21:53:39] RECOVERY Total Processes is now: OK on wikistats-01 wikistats-01 output: PROCS OK: 93 processes [21:53:43] RECOVERY Disk Space is now: OK on bastion1 bastion1 output: DISK OK [21:53:43] RECOVERY dpkg-check is now: OK on embed-sandbox embed-sandbox output: All packages OK [21:53:48] RECOVERY Total Processes is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: PROCS OK: 100 processes [21:53:53] RECOVERY Current Users is now: OK on ganglia-collector ganglia-collector output: USERS OK - 0 users currently logged in [21:53:53] RECOVERY dpkg-check is now: OK on venus venus output: All packages OK [21:53:53] RECOVERY Current Users is now: OK on turnkey-1 turnkey-1 output: USERS OK - 0 users currently logged in [21:53:53] RECOVERY Current Load is now: OK on pad1 
pad1 output: OK - load average: 0.09, 0.05, 0.01 [21:53:53] RECOVERY Free ram is now: OK on incubator-live incubator-live output: OK: 82% free memory [21:53:54] RECOVERY Free ram is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: OK: 88% free memory [21:53:54] RECOVERY dpkg-check is now: OK on incubator-nfs incubator-nfs output: All packages OK [21:53:55] RECOVERY Current Users is now: OK on wikistats-01 wikistats-01 output: USERS OK - 0 users currently logged in [21:53:58] RECOVERY Current Load is now: OK on p-b p-b output: OK - load average: 0.09, 0.11, 0.04 [21:53:58] RECOVERY Total Processes is now: OK on ganglia-collector ganglia-collector output: PROCS OK: 82 processes [21:54:03] RECOVERY Current Load is now: OK on bots-sql1 bots-sql1 output: OK - load average: 0.05, 0.06, 0.02 [21:54:03] RECOVERY Disk Space is now: OK on turnkey-1 turnkey-1 output: DISK OK [21:54:03] RECOVERY dpkg-check is now: OK on wikistats-01 wikistats-01 output: All packages OK [21:54:03] RECOVERY Disk Space is now: OK on pad1 pad1 output: DISK OK [21:54:03] RECOVERY Current Users is now: OK on test3 test3 output: USERS OK - 0 users currently logged in [21:54:04] RECOVERY Current Load is now: OK on master master output: OK - load average: 0.00, 0.04, 0.01 [21:54:08] RECOVERY Current Load is now: OK on turnkey-1 turnkey-1 output: OK - load average: 0.02, 0.04, 0.01 [21:54:08] RECOVERY Current Users is now: OK on nginx-dev1 nginx-dev1 output: USERS OK - 0 users currently logged in [21:54:18] RECOVERY dpkg-check is now: OK on deployment-dbdump deployment-dbdump output: All packages OK [21:54:18] RECOVERY Total Processes is now: OK on turnkey-1 turnkey-1 output: PROCS OK: 86 processes [21:54:28] RECOVERY Free ram is now: OK on turnkey-1 turnkey-1 output: OK: 90% free memory [21:54:28] RECOVERY Current Load is now: OK on nova-dev4 nova-dev4 output: OK - load average: 0.19, 0.08, 0.07 [21:54:28] RECOVERY Current Users is now: OK on bots-nfs bots-nfs output: USERS OK - 0 users currently logged in [21:54:28] RECOVERY dpkg-check is now: OK on test3 test3 output: All packages OK [21:54:28] RECOVERY Disk Space is now: OK on nova-dev3 nova-dev3 output: DISK OK [21:54:29] RECOVERY Free ram is now: OK on embed-sandbox embed-sandbox output: OK: 90% free memory [21:54:29] RECOVERY Total Processes is now: OK on ganglia-master ganglia-master output: PROCS OK: 85 processes [21:54:38] RECOVERY Current Users is now: OK on ganglia-master ganglia-master output: USERS OK - 0 users currently logged in [21:54:38] RECOVERY Disk Space is now: OK on aggregator1 aggregator1 output: DISK OK [21:54:38] RECOVERY Total Processes is now: OK on aggregator1 aggregator1 output: PROCS OK: 101 processes [21:54:43] RECOVERY dpkg-check is now: OK on aggregator1 aggregator1 output: All packages OK [21:54:48] RECOVERY Current Load is now: OK on embed-sandbox embed-sandbox output: OK - load average: 0.02, 0.04, 0.01 [21:54:48] RECOVERY Free ram is now: OK on test3 test3 output: OK: 90% free memory [21:54:48] RECOVERY Current Load is now: OK on deployment-squid deployment-squid output: OK - load average: 0.01, 0.03, 0.00 [21:54:48] RECOVERY Free ram is now: OK on ganglia-master ganglia-master output: OK: 90% free memory [21:54:48] RECOVERY Disk Space is now: OK on deployment-squid deployment-squid output: DISK OK [21:54:49] RECOVERY Free ram is now: OK on p-b p-b output: OK: 80% free memory [21:54:54] okay, i am ignoring labs-nagios-wm_ now until it stops spewing [21:54:58] RECOVERY Total Processes is now: OK on incubator-live incubator-live
output: PROCS OK: 90 processes [21:55:03] RECOVERY Free ram is now: OK on nova-dev4 nova-dev4 output: OK: 65% free memory [21:55:03] RECOVERY Current Load is now: OK on wikistats-01 wikistats-01 output: OK - load average: 0.72, 0.20, 0.06 [21:55:03] RECOVERY Disk Space is now: OK on deployment-nfs-memc deployment-nfs-memc output: DISK OK [21:55:03] RECOVERY Free ram is now: OK on vumi-gw1 vumi-gw1 output: OK: 90% free memory [21:55:03] RECOVERY Disk Space is now: OK on labs-build1 labs-build1 output: DISK OK [21:55:08] RECOVERY dpkg-check is now: OK on ganglia-master ganglia-master output: All packages OK [21:55:08] RECOVERY Total Processes is now: OK on venus venus output: PROCS OK: 85 processes [21:55:13] RECOVERY Disk Space is now: OK on nova-dev4 nova-dev4 output: DISK OK [21:55:13] RECOVERY Current Load is now: OK on labs-lvs1 labs-lvs1 output: OK - load average: 0.31, 0.08, 0.02 [21:55:13] RECOVERY Total Processes is now: OK on bots-sql1 bots-sql1 output: PROCS OK: 80 processes [21:55:18] RECOVERY Total Processes is now: OK on test3 test3 output: PROCS OK: 76 processes [21:55:30] heh [21:55:33] RECOVERY dpkg-check is now: OK on turnkey-1 turnkey-1 output: All packages OK [21:55:33] RECOVERY dpkg-check is now: OK on bots-sql1 bots-sql1 output: All packages OK [21:55:33] RECOVERY Current Users is now: OK on reportcard1 reportcard1 output: USERS OK - 0 users currently logged in [21:55:33] RECOVERY Current Users is now: OK on vumi-gw1 vumi-gw1 output: USERS OK - 0 users currently logged in [21:55:33] RECOVERY Current Load is now: OK on deployment-dbdump deployment-dbdump output: OK - load average: 0.02, 0.03, 0.01 [21:55:34] RECOVERY Free ram is now: OK on reportcard1 reportcard1 output: OK: 66% free memory [21:55:34] RECOVERY Current Users is now: OK on venus venus output: USERS OK - 0 users currently logged in [21:55:43] RECOVERY Free ram is now: OK on nginx-dev1 nginx-dev1 output: OK: 86% free memory [21:55:43] RECOVERY Current Users is now: OK on p-b p-b output: USERS OK - 0 users currently logged in [21:55:43] RECOVERY Current Users is now: OK on deployment-squid deployment-squid output: USERS OK - 0 users currently logged in [21:55:43] RECOVERY Free ram is now: OK on labs-lvs1 labs-lvs1 output: OK: 90% free memory [21:55:43] RECOVERY dpkg-check is now: OK on p-b p-b output: All packages OK [21:55:43] RECOVERY dpkg-check is now: OK on nova-dev3 nova-dev3 output: All packages OK [21:55:44] LeslieCarr: it's possible to turn it off from nagios.wmflabs.org [21:55:53] RECOVERY Current Load is now: OK on deployment-nfs-memc deployment-nfs-memc output: OK - load average: 0.05, 0.05, 0.01 [21:55:53] RECOVERY Total Processes is now: OK on deployment-nfs-memc deployment-nfs-memc output: PROCS OK: 95 processes [21:55:58] RECOVERY Current Load is now: OK on aggregator1 aggregator1 output: OK - load average: 0.00, 0.04, 0.00 [21:55:58] RECOVERY Current Users is now: OK on deployment-nfs-memc deployment-nfs-memc output: USERS OK - 1 users currently logged in [21:55:59] but no one asked me for access there :| [21:56:03] RECOVERY Total Processes is now: OK on pad1 pad1 output: PROCS OK: 82 processes [21:56:08] RECOVERY Total Processes is now: OK on nova-dev4 nova-dev4 output: PROCS OK: 118 processes [21:56:13] RECOVERY Total Processes is now: OK on deployment-dbdump deployment-dbdump output: PROCS OK: 87 processes [21:56:18] RECOVERY Current Load is now: OK on mobile-enwp mobile-enwp output: OK - load average: 0.32, 0.10, 0.03 [21:56:18] RECOVERY Current Load is now: OK on incubator-dep
incubator-dep output: OK - load average: 0.07, 0.03, 0.01 [21:56:18] RECOVERY Disk Space is now: OK on labs-lvs1 labs-lvs1 output: DISK OK [21:56:28] RECOVERY Free ram is now: OK on venus venus output: OK: 86% free memory [21:56:28] RECOVERY Free ram is now: OK on fwserver1 fwserver1 output: OK: 88% free memory [21:56:28] RECOVERY Current Load is now: OK on bots-nfs bots-nfs output: OK - load average: 0.01, 0.04, 0.01 [21:56:28] RECOVERY Current Users is now: OK on deployment-dbdump deployment-dbdump output: USERS OK - 1 users currently logged in [21:56:38] RECOVERY Current Load is now: OK on ganglia-master ganglia-master output: OK - load average: 0.04, 0.05, 0.01 [21:56:38] RECOVERY Free ram is now: OK on deployment-dbdump deployment-dbdump output: OK: 82% free memory [21:56:38] RECOVERY Current Load is now: OK on bots-cb bots-cb output: OK - load average: 0.45, 0.40, 0.36 [21:56:38] RECOVERY Disk Space is now: OK on incubator-dep incubator-dep output: DISK OK [21:56:38] RECOVERY dpkg-check is now: OK on nova-dev4 nova-dev4 output: All packages OK [21:56:39] RECOVERY dpkg-check is now: OK on labs-lvs1 labs-lvs1 output: All packages OK [21:56:39] RECOVERY dpkg-check is now: OK on nginx-dev1 nginx-dev1 output: All packages OK [21:56:40] RECOVERY Total Processes is now: OK on nova-dev3 nova-dev3 output: PROCS OK: 77 processes [21:56:48] RECOVERY Disk Space is now: OK on p-b p-b output: DISK OK [21:56:48] RECOVERY Free ram is now: OK on bots-3 bots-3 output: OK: 52% free memory [21:56:48] RECOVERY dpkg-check is now: OK on mobile-enwp mobile-enwp output: All packages OK [21:56:48] RECOVERY dpkg-check is now: OK on bots-nfs bots-nfs output: All packages OK [21:56:48] RECOVERY Current Users is now: OK on labs-lvs1 labs-lvs1 output: USERS OK - 0 users currently logged in [21:56:49] RECOVERY Current Users is now: OK on nova-dev3 nova-dev3 output: USERS OK - 0 users currently logged in [21:56:49] RECOVERY Free ram is now: OK on incubator-dep incubator-dep output: OK: 91% free memory [21:56:58] RECOVERY Current Load is now: OK on test3 test3 output: OK - load average: 0.00, 0.04, 0.05 [21:56:58] RECOVERY Free ram is now: OK on deployment-nfs-memc deployment-nfs-memc output: OK: 77% free memory [21:56:58] RECOVERY Total Processes is now: OK on nginx-dev1 nginx-dev1 output: PROCS OK: 80 processes [21:57:05] oh sorry, i added myself there today to try and fix some of the nagios stuff, but since i couldn't run puppet it took a while for puppet to give me sudo [21:57:08] RECOVERY Current Users is now: OK on nova-daas-1 nova-daas-1 output: USERS OK - 0 users currently logged in [21:57:08] RECOVERY Disk Space is now: OK on deployment-dbdump deployment-dbdump output: DISK OK [21:57:08] RECOVERY Current Users is now: OK on embed-sandbox embed-sandbox output: USERS OK - 0 users currently logged in [21:57:08] RECOVERY Disk Space is now: OK on bots-nfs bots-nfs output: DISK OK [21:57:18] RECOVERY Current Users is now: OK on bots-cb bots-cb output: USERS OK - 1 users currently logged in [21:57:18] RECOVERY Disk Space is now: OK on ubuntu1-pgehres ubuntu1-pgehres output: DISK OK [21:57:18] RECOVERY Total Processes is now: OK on reportcard1 reportcard1 output: PROCS OK: 91 processes [21:57:23] RECOVERY Free ram is now: OK on nova-dev3 nova-dev3 output: OK: 62% free memory [21:57:23] RECOVERY Free ram is now: OK on wikistats-01 wikistats-01 output: OK: 79% free memory [21:57:23] RECOVERY dpkg-check is now: OK on labs-build1 labs-build1 output: All packages OK [21:57:28] RECOVERY Disk Space is now: OK on 
ganglia-master ganglia-master output: DISK OK [21:57:38] RECOVERY Current Users is now: OK on labs-build1 labs-build1 output: USERS OK - 0 users currently logged in [21:57:38] RECOVERY Free ram is now: OK on ganglia-collector ganglia-collector output: OK: 82% free memory [21:57:38] RECOVERY Disk Space is now: OK on mobile-enwp mobile-enwp output: DISK OK [21:57:38] RECOVERY Total Processes is now: OK on mobile-enwp mobile-enwp output: PROCS OK: 92 processes [21:57:46] New patchset: Lcarr; "Removed payments file from labs as no payments cluster" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2094 [21:57:48] RECOVERY dpkg-check is now: OK on bots-cb bots-cb output: All packages OK [21:57:48] RECOVERY Disk Space is now: OK on ganglia-collector ganglia-collector output: DISK OK [21:57:48] RECOVERY Free ram is now: OK on nova-daas-1 nova-daas-1 output: OK: 65% free memory [21:57:53] LeslieCarr: I can fix it [21:57:58] RECOVERY Current Load is now: OK on reportcard1 reportcard1 output: OK - load average: 0.01, 0.05, 0.02 [21:57:58] RECOVERY Free ram is now: OK on labs-build1 labs-build1 output: OK: 89% free memory [21:57:58] RECOVERY dpkg-check is now: OK on reportcard1 reportcard1 output: All packages OK [21:58:05] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2094 [21:58:08] i got it now :) [21:58:08] RECOVERY dpkg-check is now: OK on deployment-nfs-memc deployment-nfs-memc output: All packages OK [21:58:08] RECOVERY Current Load is now: OK on labs-build1 labs-build1 output: OK - load average: 0.00, 0.04, 0.00 [21:58:08] RECOVERY Current Load is now: OK on vumi-gw1 vumi-gw1 output: OK - load average: 0.01, 0.04, 0.00 [21:58:08] RECOVERY Current Load is now: OK on nova-dev2 nova-dev2 output: OK - load average: 0.39, 0.17, 0.10 [21:58:14] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2094 [21:58:27] now, removed the reference to the payments cluster [21:58:28] RECOVERY Current Load is now: OK on nova-dev3 nova-dev3 output: OK - load average: 0.03, 0.03, 0.00 [21:58:31] let's see what else breaks! 
[21:58:38] RECOVERY Current Users is now: OK on nova-dev4 nova-dev4 output: USERS OK - 1 users currently logged in
[21:58:38] RECOVERY dpkg-check is now: OK on nova-daas-1 nova-daas-1 output: All packages OK
[21:58:48] RECOVERY Disk Space is now: OK on nova-dev2 nova-dev2 output: DISK OK
[21:58:48] RECOVERY dpkg-check is now: OK on incubator-dep incubator-dep output: All packages OK
[21:58:58] RECOVERY Total Processes is now: OK on incubator-dep incubator-dep output: PROCS OK: 82 processes
[21:59:08] RECOVERY Total Processes is now: OK on labs-build1 labs-build1 output: PROCS OK: 78 processes
[21:59:13] RECOVERY Total Processes is now: OK on vumi-gw1 vumi-gw1 output: PROCS OK: 79 processes
[21:59:18] RECOVERY Total Processes is now: OK on fwserver1 fwserver1 output: PROCS OK: 80 processes
[21:59:33] RECOVERY Total Processes is now: OK on bots-cb bots-cb output: PROCS OK: 123 processes
[21:59:38] RECOVERY Disk Space is now: OK on bots-cb bots-cb output: DISK OK
[21:59:38] RECOVERY Disk Space is now: OK on nova-daas-1 nova-daas-1 output: DISK OK
[21:59:38] RECOVERY dpkg-check is now: OK on vumi-gw1 vumi-gw1 output: All packages OK
[21:59:38] RECOVERY Current Load is now: OK on nova-daas-1 nova-daas-1 output: OK - load average: 0.04, 0.07, 0.06
[21:59:38] RECOVERY Current Users is now: OK on mobile-enwp mobile-enwp output: USERS OK - 0 users currently logged in
[21:59:38] RECOVERY Disk Space is now: OK on reportcard1 reportcard1 output: DISK OK
[21:59:43] RECOVERY Free ram is now: OK on bots-cb bots-cb output: OK: 63% free memory
[22:00:03] RECOVERY Total Processes is now: OK on labs-lvs1 labs-lvs1 output: PROCS OK: 80 processes
[22:00:09] RECOVERY dpkg-check is now: OK on fwserver1 fwserver1 output: All packages OK
[22:00:09] RECOVERY Disk Space is now: OK on fwserver1 fwserver1 output: DISK OK
[22:00:09] RECOVERY dpkg-check is now: OK on nova-dev2 nova-dev2 output: All packages OK
[22:00:09] RECOVERY Current Load is now: OK on fwserver1 fwserver1 output: OK - load average: 0.14, 0.10, 0.03
[22:00:23] RECOVERY Disk Space is now: OK on vumi-gw1 vumi-gw1 output: DISK OK
[22:00:23] RECOVERY Current Users is now: OK on nova-dev2 nova-dev2 output: USERS OK - 0 users currently logged in
[22:00:23] RECOVERY Total Processes is now: OK on nova-daas-1 nova-daas-1 output: PROCS OK: 114 processes
[22:00:28] RECOVERY Current Users is now: OK on fwserver1 fwserver1 output: USERS OK - 0 users currently logged in
[22:00:33] RECOVERY Free ram is now: OK on mobile-enwp mobile-enwp output: OK: 60% free memory
[22:00:33] RECOVERY Total Processes is now: OK on nova-dev2 nova-dev2 output: PROCS OK: 119 processes
[22:00:38] RECOVERY Free ram is now: OK on nova-dev2 nova-dev2 output: OK: 67% free memory
[22:00:53] RECOVERY Current Users is now: OK on incubator-dep incubator-dep output: USERS OK - 1 users currently logged in
[22:03:23] RECOVERY Free ram is now: OK on puppet-lucid puppet-lucid output: OK: 40% free memory
[22:03:43] RECOVERY Free ram is now: OK on nova-ldap1 nova-ldap1 output: OK: 73% free memory
[22:05:13] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 21% free memory
[22:05:33] RECOVERY Current Users is now: OK on prefixexport prefixexport output: USERS OK - 0 users currently logged in
[22:06:23] RECOVERY Current Load is now: OK on prefixexport prefixexport output: OK - load average: 0.10, 0.08, 0.02
[22:06:23] RECOVERY Free ram is now: OK on prefixexport prefixexport output: OK: 77% free memory
[22:07:03] PROBLEM Total Processes is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host
[22:07:08] RECOVERY Total Processes is now: OK on prefixexport prefixexport output: PROCS OK: 93 processes
[22:07:23] RECOVERY Disk Space is now: OK on prefixexport prefixexport output: DISK OK
[22:07:23] RECOVERY dpkg-check is now: OK on search-test search-test output: All packages OK
[22:07:43] RECOVERY dpkg-check is now: OK on prefixexport prefixexport output: All packages OK
[22:08:03] RECOVERY Current Load is now: OK on search-test search-test output: OK - load average: 0.09, 0.09, 0.03
[22:08:13] RECOVERY Disk Space is now: OK on search-test search-test output: DISK OK
[22:08:43] RECOVERY Free ram is now: OK on miniswarm miniswarm output: OK: 62% free memory
[22:08:43] RECOVERY Current Users is now: OK on search-test search-test output: USERS OK - 0 users currently logged in
[22:08:43] RECOVERY Free ram is now: OK on search-test search-test output: OK: 70% free memory
[22:08:53] RECOVERY Free ram is now: OK on canonical-bridge canonical-bridge output: OK: 72% free memory
[22:08:53] RECOVERY Free ram is now: OK on incubator-bots incubator-bots output: OK: 67% free memory
[22:09:03] PROBLEM dpkg-check is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host
[22:09:03] PROBLEM Disk Space is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host
[22:10:04] New patchset: Lcarr; "changing nagios service to nagios3 service" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2095
[22:10:23] RECOVERY Free ram is now: OK on labs-relay labs-relay output: OK: 88% free memory
[22:10:33] RECOVERY Total Processes is now: OK on search-test search-test output: PROCS OK: 95 processes
[22:10:53] RECOVERY Current Load is now: OK on feeds feeds output: OK - load average: 0.21, 0.09, 0.03
[22:11:23] RECOVERY Free ram is now: OK on feeds feeds output: OK: 86% free memory
[22:11:33] RECOVERY Free ram is now: OK on mediahandler-test mediahandler-test output: OK: 60% free memory
[22:11:43] PROBLEM Current Load is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host
[22:12:33] RECOVERY Free ram is now: OK on hugglewiki hugglewiki output: OK: 67% free memory
[22:12:53] RECOVERY Free ram is now: OK on mobile-feeds mobile-feeds output: OK: 69% free memory
[22:13:00] New patchset: Lcarr; "changing nagios service to nagios3 service" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2095
[22:13:03] PROBLEM Current Users is now: CRITICAL on nova-production1 nova-production1 output: Connection refused by host
[22:13:13] RECOVERY Free ram is now: OK on nova-dev1 nova-dev1 output: OK: 76% free memory
[22:13:33] RECOVERY Total Processes is now: OK on feeds feeds output: PROCS OK: 88 processes
[22:13:38] RECOVERY Free ram is now: OK on phabricator1 phabricator1 output: OK: 66% free memory
[22:13:43] RECOVERY Free ram is now: OK on labs-nfs1 labs-nfs1 output: OK: 87% free memory
[22:13:59] New review: Lcarr; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/2095
[22:13:59] Change merged: Lcarr; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/2095
[22:14:23] RECOVERY Current Users is now: OK on bots-apache1 bots-apache1 output: USERS OK - 0 users currently logged in
[22:14:33] RECOVERY Free ram is now: OK on wikisource-web wikisource-web output: OK: 85% free memory
[22:14:43] RECOVERY Disk Space is now: OK on feeds feeds output: DISK OK
[22:14:43] RECOVERY Current Users is now: OK on feeds feeds output: USERS OK - 0 users currently logged in
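The r2095 change merged above renames the puppet-managed service from nagios to nagios3. On Ubuntu the nagios3 package installs its init script under that name, so the puppet service resource has to match it; a hedged sanity check on the monitoring host, not taken from the log, might look like:

    # illustrative only: confirm which init script the nagios package installed,
    # then poke the renamed service
    ls /etc/init.d/ | grep -i nagios   # expect "nagios3" on a nagios3 install
    sudo service nagios3 status        # must match the name puppet now manages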
[22:14:43] RECOVERY Free ram is now: OK on wep wep output: OK: 75% free memory
[22:15:03] RECOVERY Free ram is now: OK on bots-apache1 bots-apache1 output: OK: 87% free memory
[22:15:03] RECOVERY dpkg-check is now: OK on feeds feeds output: All packages OK
[22:15:13] RECOVERY dpkg-check is now: OK on bots-apache1 bots-apache1 output: All packages OK
[22:15:13] RECOVERY Total Processes is now: OK on bots-apache1 bots-apache1 output: PROCS OK: 91 processes
[22:15:23] RECOVERY Disk Space is now: OK on bots-apache1 bots-apache1 output: DISK OK
[22:15:23] RECOVERY Free ram is now: OK on vivek-puppet vivek-puppet output: OK: 83% free memory
[22:15:33] RECOVERY Free ram is now: OK on bots-sql2 bots-sql2 output: OK: 77% free memory
[22:15:33] RECOVERY Free ram is now: OK on deployment-web deployment-web output: OK: 65% free memory
[22:15:43] RECOVERY Current Load is now: OK on bots-apache1 bots-apache1 output: OK - load average: 0.10, 0.07, 0.02
[22:16:09] hexmode: what the heck is happening
[22:16:53] RECOVERY Current Load is now: OK on deployment-sql deployment-sql output: OK - load average: 0.18, 0.12, 0.04
[22:18:03] RECOVERY dpkg-check is now: OK on deployment-sql deployment-sql output: All packages OK
[22:18:13] RECOVERY Disk Space is now: OK on deployment-sql deployment-sql output: DISK OK
[22:19:03] RECOVERY Current Users is now: OK on deployment-sql deployment-sql output: USERS OK - 0 users currently logged in
[22:19:13] RECOVERY Total Processes is now: OK on deployment-sql deployment-sql output: PROCS OK: 79 processes
[22:21:48] petan: doh i had the bot still muted, everything looks ok puppet side now, does it look okay from your end ?
[22:21:58] not really
[22:22:02] someone broke beta
[22:22:06] but it wasn't you :)
[22:22:20] LeslieCarr: I think it's ok, apart from that the instances need to have puppet forced
[22:22:27] because I can't do puppetd -tv
[22:22:32] on all
[22:22:36] but that should be fixed later
[22:22:41] cool
[22:22:44] I hope
[22:22:50] now I need to find hexmode
[22:23:13] RECOVERY Current Load is now: OK on pad2 pad2 output: OK - load average: 0.36, 0.08, 0.03
[22:23:20] now… i need to break puppet again ;)
[22:24:03] RECOVERY dpkg-check is now: OK on asher1 asher1 output: All packages OK
[22:24:13] RECOVERY Total Processes is now: OK on pad2 pad2 output: PROCS OK: 89 processes
[22:24:18] RECOVERY Disk Space is now: OK on asher1 asher1 output: DISK OK
[22:24:18] RECOVERY Current Users is now: OK on asher1 asher1 output: USERS OK - 0 users currently logged in
[22:24:53] RECOVERY Free ram is now: OK on pad2 pad2 output: OK: 84% free memory
[22:25:03] RECOVERY dpkg-check is now: OK on pad2 pad2 output: All packages OK
[22:25:33] RECOVERY Free ram is now: OK on asher1 asher1 output: OK: 93% free memory
[22:26:23] RECOVERY Current Load is now: OK on asher1 asher1 output: OK - load average: 0.02, 0.06, 0.03
[22:26:33] RECOVERY Total Processes is now: OK on asher1 asher1 output: PROCS OK: 100 processes
[22:26:38] RECOVERY Disk Space is now: OK on pad2 pad2 output: DISK OK
[22:27:03] RECOVERY dpkg-check is now: OK on deployment-wmsearch deployment-wmsearch output: All packages OK
[22:27:28] !sal
[22:27:28] https://labsconsole.wikimedia.org/wiki/Server_Admin_Log see it and you will know all you need
[22:27:33] RECOVERY Current Users is now: OK on pad2 pad2 output: USERS OK - 0 users currently logged in
[22:27:59] !log deployment-prep reverted unlogged changes made to config which broke whole site
[22:28:01] Logged the message, Master
[22:28:43] RECOVERY Current Load is now: OK on deployment-wmsearch deployment-wmsearch output: OK - load average: 0.01, 0.03, 0.00
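petan's point at 22:22 above is that every instance still needs a manually forced puppet run; `puppetd -tv` is the 2.x-era agent invocation used on these hosts (-t for a one-off test run, -v for verbose output). A minimal sketch of forcing that across a few instances, with the host list purely illustrative:

    # hypothetical loop from a host that has ssh access to the project instances
    for host in deployment-web deployment-sql deployment-dbdump; do
        ssh "$host" 'sudo puppetd -tv'
    done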
[22:28:53] RECOVERY Disk Space is now: OK on deployment-wmsearch deployment-wmsearch output: DISK OK
[22:30:33] RECOVERY Free ram is now: OK on deployment-wmsearch deployment-wmsearch output: OK: 89% free memory
[22:30:33] RECOVERY Total Processes is now: OK on deployment-wmsearch deployment-wmsearch output: PROCS OK: 94 processes
[22:30:53] RECOVERY Current Users is now: OK on deployment-wmsearch deployment-wmsearch output: USERS OK - 0 users currently logged in
[22:36:43] RECOVERY Current Load is now: OK on nova-production1 nova-production1 output: OK - load average: 0.69, 0.29, 0.11
[22:37:03] RECOVERY Free ram is now: OK on nova-production1 nova-production1 output: OK: 71% free memory
[22:37:03] RECOVERY Total Processes is now: OK on nova-production1 nova-production1 output: PROCS OK: 159 processes
[22:37:33] RECOVERY Free ram is now: OK on analytics analytics output: OK: 65% free memory
[22:38:03] RECOVERY Current Users is now: OK on nova-production1 nova-production1 output: USERS OK - 3 users currently logged in
[22:39:03] RECOVERY Disk Space is now: OK on nova-production1 nova-production1 output: DISK OK
[22:39:03] RECOVERY dpkg-check is now: OK on nova-production1 nova-production1 output: All packages OK
[22:41:50] petan: What was changed?
[22:42:23] johnduhart: tons of stuff by someone, and it broke the site
[22:42:33] Like what?
[22:42:44] it seemed to me like someone replaced the whole CommonSettings and other files, maybe with the production version
[22:42:50] sigh
[22:42:58] all local paths disappeared and were replaced with non-existing ones
[22:42:59] hexmode: Did you copy paste commonsettings?
[22:43:21] + someone added some cache-bypassing header hardcoded into settings
[22:43:33] I committed it before I reverted, so it's still in git
[22:43:43] but I don't know how to get it outside of labs
[22:44:16] probably push to github
[22:45:07] or just ssh there and check it
[22:47:21] mdale: did you change it?
[22:47:34] I guess it was you or hexm ode
[22:47:41] have not touched it
[22:47:45] ah ok
[22:47:58] anyway there is a git repository, so if you make a change, please commit it
[22:48:09] I don't know who all have access there
[22:48:25] but I think they should know it, I added a notice to motd.local
[22:48:35] * tail
[22:53:01] petan, you can also clone from outside
[22:53:22] I am really dumb when it comes to git
[22:53:40] I don't have access to that repo from outside
[22:53:49] I have access to outside from repo :)
[22:53:50] I'm not a git expert either
[22:54:21] but having to use it, I grasped some basic usage :)
[22:54:23] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 19% free memory
[22:54:26] where's the repo?
[22:54:59] in /usr/local/apache
[22:55:08] where should I write it so that people know that :)
[22:55:16] I have a feeling it's like on every wall
[22:55:17] :D
[22:55:34] there is a big label when you ssh to any instance telling you that
[22:55:48] uh? no
[22:55:52] I see a big motd there
[22:55:53] really?
[22:55:58] ah that's it
[22:56:00] with deployment cluster in ascii art
[22:56:04] but no mention of git there
[22:56:07] there is a text under it
[22:56:12] telling you to read help :)
[22:56:18] which is a wiki page
[22:56:24] containing information about git
[22:56:52] but it's good that people at least see the ascii :)
[22:56:56] not even there
[22:57:01] oh
[22:57:02] http://labs.wikimedia.beta.wmflabs.org/wiki/Help doesn't mention git xD
[22:57:06] yay
[22:57:18] still, I'd mention "commit all changes to git" in the motd
[22:57:54] fixed
[22:58:01] right, problem is that it can change
[22:58:09] and I would have to change the motd on all instances then
[22:58:24] that's why I prefer to have it all on wiki :)
[22:58:37] the less information we have in the motd, the more curious people will be to read the wiki
[22:58:55] _if_ they read it
[22:59:00] heh :)
[22:59:05] true
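Stepping back to petan's recovery at 22:43: committing the broken state before reverting, so the evidence stays in history, is a handy pattern. A rough sketch of it against the repo discussed here, with the commit message invented:

    cd /usr/local/apache
    git add -A
    git commit -m "snapshot: broken config as found"   # keep the bad state in history
    git revert --no-edit HEAD                          # new commit restoring the last good tree

Because the snapshot commit captures the breakage and the revert is itself a commit, both states remain inspectable later, which is exactly what lets petan say "it's still in git".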
[23:00:22] I have a feeling we should move the help to another wiki than the one which actually depends on that instance
[23:00:26] like mediawiki.org
[23:03:17] ok, so I have set up the path now
[23:03:25] to directly connect deployment-web
[23:03:52] um
[23:04:00] so git clone ssh://deployment-web/usr/local/apache should clone that repo
[23:04:03] you shouldn't use -web for anything other than reloading apache
[23:04:03] yep
[23:04:22] that path /usr/local/apache exists on all instances
[23:04:39] dbdump is perfect for maintenance and such
[23:04:45] so it was in bastion, too?
[23:04:52] in bastion not
[23:04:56] all instances in deployment
[23:04:57] heh
[23:05:17] which one would have been preferred for that kind of meta-action?
[23:05:25] dbdump
[23:05:40] that is an instance you can overload how you want and it wouldn't break anything
[23:06:00] it has access to config, db and everything and is completely separate
[23:06:12] it's used for running large imports
[23:06:22] or update etc
[23:06:30] I will show you
[23:06:40] !log deployment-prep updating svn
[23:06:41] Logged the message, Master
[23:07:49] Platonides: now I am running bin/updatedata on dbdump
[23:08:00] which runs update.php for all wikis we have
[23:08:04] I'm having problems connecting to dbdump
[23:08:21] deployment-dbdump
[23:08:21] I first tried from here with an ssh tunnel, and I got channel 0: open failed: administratively prohibited: open failed
[23:08:33] then bastion said Name or service not known
[23:08:45] ssh deployment-dbdump
[23:08:56] it's prefixed
[23:08:57] oh, ok
[23:09:05] I would like to make a standard from that
[23:09:10] I will talk to Ryan about it
[23:09:11] it's a good idea
[23:09:22] I don't like how people randomly name their instances, one day it will conflict
[23:09:44] like an instance called sql is probably a bad idea
[23:11:00] petan: I'm pretty sure it is a standard already...
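Platonides' connection trouble at 23:08 and its fix, hopping through the bastion and using the instance's prefixed name, can be captured in an ssh client config. The host patterns below are assumptions; only bastion and deployment-dbdump are attested in the log, and ssh -W needs OpenSSH 5.4 or newer:

    # ~/.ssh/config (sketch): route labs instances through the bastion
    #   Host deployment-*
    #       ProxyCommand ssh -W %h:%p bastion
    #
    # then, per petan's advice, clone from dbdump rather than -web:
    git clone ssh://deployment-dbdump/usr/local/apache beta-config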
[23:12:40] petan, that version does indeed look like a copy from deployment
[23:12:46] I mean, from real wikipedia
[23:13:14] comparing CommonSettings.php, the only difference is $wmgUseFeaturedFeeds added in the real one
[23:14:33] johnduhart: if it's a standard we should tell people ;0
[23:14:43] because there are dozens of weird names
[23:15:23] the person who did it probably thought we configured test identically to prod
[23:15:28] actually I would like to
[23:15:36] we would need to create new paths on the fs
[23:15:43] and move live to /home/wikipedia
[23:15:53] but I want to discuss it with someone from ops
[23:16:00] what
[23:16:20] to make it more identical to production, so that the paths would stay the same
[23:16:32] and we wouldn't need to overwrite it in Commons
[23:16:52] production doesn't have /usr/local/apache
[23:16:56] uh
[23:16:59] yes it does
[23:17:12] um, maybe it does, but it doesn't look like that
[23:17:14] from configs
[23:17:26] I have a feeling it's in /home/wikipedia
[23:17:38] I've seen it in some configs
[23:17:41] okay have fun with that
[23:18:04] I definitely won't start moving it without having clear info on how it exists on prod
[23:18:35] you don't have access there?
[23:18:41] no
[23:19:00] ops do
[23:19:00] I thought you had
[23:19:03] um... I don't...
[23:19:53] if I had, wikimedia would be down ^^ likely
[23:19:53] oh, don't worry
[23:19:53] it's considered a rite of passage ;)
[23:20:24] actually I support that idea of brion's to get rid of the need for devs to have shell access at all to be involved in the wikimedia cluster
[23:20:35] or Roan's
[23:21:02] having a git repository with all configs and all stuff in puppet etc
[23:21:10] well, once there's a beta site
[23:21:15] it seems doable
[23:21:21] from what I have heard there is no versioning on prod right now, or there is but probably some weird one
[23:21:33] No there's versioning
[23:21:37] I have long liked the idea of having to go through a repository for changes
[23:21:47] johnduhart: mu tante told me there is nothing like what we have on beta
[23:21:59] like you change a config, it's not tracked
[23:21:59] though not being able to look at the real live hacks, just through svn, I could be considered biased :)
[23:22:01] petan: It's a private svn repo.
[23:22:32] I guess mu tante would know about it
[23:22:40] I know there is a svn repo for files
[23:22:43] but not for config
[23:22:53] I do remember about a private svn repo
[23:23:06] yes it's for stuff like extensions and wiki files, afaik
[23:23:35] but configuration probably doesn't live there
[23:23:36] or if it does, mutatnte didn't know it :)
[23:23:40] * mu tante
[23:26:51] Platonides: I think it would be cool just to make changes to labs and if it's ok, merge it to prod
[23:27:00] indeed
[23:27:06] I think it's a goal of labs... maybe
[23:27:13] although that would require production to be more sane :)
[23:27:19] heh :)
[23:27:28] that would be a side effect of labs, making it sane
[23:39:23] petan: wmf-config is a private SVN repo. I want to move out the private stuff and move it to a public git repo managed in Gerrit, but I don't have time to actually do that in the short term
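The 23:13 comparison of CommonSettings.php against the production copy can be reproduced with a plain unified diff. Both paths below are assumptions, since only /usr/local/apache is attested above and how the production file was obtained is not said:

    # hedged sketch: compare beta's config with a locally saved production copy
    diff -u /usr/local/apache/wmf-config/CommonSettings.php \
            /tmp/prod-CommonSettings.php | less

A single added $wmgUseFeaturedFeeds line, as reported, would show up as one "+" hunk.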
[23:57:29] :o
[23:57:34] irc feed should be quite easy
[23:58:30] I may try it tomorrow
[23:58:33] ok
[23:58:42] maybe create a separate instance
[23:58:48] but it doesn't really need to be
[23:58:50] I was thinking so
[23:58:51] it's up to you
[23:58:54] ok
[23:59:18] I suppose I'll bu you if I encounter some problem
[23:59:22] as you are always here :)
[23:59:27] *bug
[23:59:29] don't forget to set up security before creating the instance
[23:59:30] good night
[23:59:41] because the only solution is to nuke the instance and start again