[03:09:40] 06/21/2012 - 03:09:39 - Creating a home directory for pastakhov at /export/keys/pastakhov
[03:10:40] 06/21/2012 - 03:10:40 - Updating keys for pastakhov at /export/keys/pastakhov
[08:13:45] PROBLEM Free ram is now: WARNING on ganglia-test2 i-00000250 output: Warning: 19% free memory
[08:13:55] PROBLEM Free ram is now: WARNING on bots-sql2 i-000000af output: Warning: 18% free memory
[08:42:45] PROBLEM Total Processes is now: CRITICAL on aggregator-test1 i-000002bf output: PROCS CRITICAL: 205 processes
[08:47:45] PROBLEM Total Processes is now: WARNING on aggregator-test1 i-000002bf output: PROCS WARNING: 198 processes
[08:52:33] !dns
[08:52:37] @search dns
[08:52:37] No results were found, remember, the bot is searching through content of keys and their names
[08:52:50] !dns alias addresses
[08:52:50] Created new alias for this key
[08:52:52] !dns
[08:53:01] @regsearch addr
[08:53:01] Results (Found 3): new-labsuser, domain, account-questions,
[08:53:15] !addreses is https://labsconsole.wikimedia.org/wiki/Help:Addresses
[08:53:16] Key was added
[08:53:26] !addresses is https://labsconsole.wikimedia.org/wiki/Help:Addresses
[08:53:27] Key was added
[08:53:34] :P
[08:53:37] !dns
[08:53:37] https://labsconsole.wikimedia.org/wiki/Help:Addresses
[08:53:44] Danny_B|backup: ^
[08:55:24] !domain
[08:55:24] in case you want to assign a domain to your ip, you can use manage addresses to do that
[08:55:46] @regsearch Help
[08:55:46] Results (Found 13): docs, address, ssh, documentation, start, security, git, port-forwarding, instance, accountreq, puppetmaster::self, addreses, addresses,
[08:55:59] !address
[08:55:59] https://labsconsole.wikimedia.org/wiki/Help:Addresses
[08:56:02] !address del
[08:56:02] Successfully removed address
[08:56:05] !addres del
[08:56:05] Unable to find the specified key in db
[08:56:07] !addreses del
[08:56:08] Successfully removed addreses
[09:00:29] 06/21/2012 - 09:00:29 - Created a home directory for petrb in project(s): configtest
[09:01:30] 06/21/2012 - 09:01:30 - User petrb may have been modified in LDAP or locally, updating key in project(s): configtest
[09:21:58] PROBLEM dpkg-check is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[09:22:08] PROBLEM Free ram is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[09:23:30] 06/21/2012 - 09:23:30 - User zaran may have been modified in LDAP or locally, updating key in project(s): bastion,deployment-prep,wikisource-dev
[09:23:38] 06/21/2012 - 09:23:37 - Updating keys for zaran at /export/keys/zaran
[09:27:08] RECOVERY Free ram is now: OK on mobile-testing i-00000271 output: OK: 84% free memory
[09:31:58] RECOVERY dpkg-check is now: OK on mobile-testing i-00000271 output: All packages OK
[09:58:18] PROBLEM Puppet freshness is now: CRITICAL on deployment-apache31 i-000002d4 output: Puppet has not run in last 20 hours
[10:53:44] PROBLEM Current Load is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[10:58:44] RECOVERY Current Load is now: OK on mobile-testing i-00000271 output: OK - load average: 0.11, 0.18, 0.21
[11:21:54] PROBLEM Current Users is now: WARNING on bastion-restricted1 i-0000019b output: USERS WARNING - 9 users currently logged in
[11:27:54] PROBLEM Current Load is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[12:03:04] PROBLEM dpkg-check is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[12:12:54] RECOVERY dpkg-check is now: OK on mobile-testing i-00000271 output: All packages OK
[12:52:44] PROBLEM Free ram is now: CRITICAL on incubator-bot1 i-00000251 output: Critical: 5% free memory
[13:12:39] 06/21/2012 - 13:12:39 - Updating keys for cneubauer at /export/keys/cneubauer
[14:37:29] New review: Ottomata; "So, I've recently been working on this over here:" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/9342
[14:37:46] Ryan_Lane: do you have any idea how I can detect that MediaWiki is running on a labs instance?
[14:38:02] I tried looking at INSTANCE_NAME env variable, but that is not set while running under apache
[14:38:25] I'm not sure I understand the question
[14:38:27] I thought about setting a file in /etc/ and stat it
[14:38:30] oh sorry
[14:38:44] so I got some InitialiseSettings.php with configuration depending on $cluster .
[14:38:51] that variable can be either 'pmtpa' or 'wmflabs'
[14:39:17] default value is 'pmtpa', I would like to detect that I am running on labs and set $cluster = 'wmflabs' if so
[14:40:30] I thought about using posix_uname() to find out the server domain, but it is empty on labs :(
[14:41:47] hashar: oh
[14:41:55] can you change $cluster to $realm, btw?
[14:42:03] and use "production" or "labs"
[14:42:10] so that it is consistent with puppet?
[14:42:16] yup that is on my TODO list
[14:42:19] ok
[14:42:30] will even bug it to make sure I do that
[14:42:35] hashar: use the domain name
[14:42:46] if it's .wmflabs, it's labs
[14:42:55] if it's wmnet, it's production
[14:43:16] that is a good idea, my issue is merely how to retrieve it :(
[14:43:30] you can also make puppet set a file that you can include somewhere
[14:43:35] posix_uname() has an empty "domainname" entry
[14:43:48] it's probably better not to check for domain
[14:44:04] I'd add a file that can be included by puppet
[14:44:21] then I will do an if( file_exists( ) ) {}
[14:44:33] my fear is that it will add a stat() call on any request
[14:44:34] not sure if that is bad or not
[14:44:51] php_uname("n");
[14:45:15] php > print php_uname("n");
[14:45:16] i-000000d2
[14:45:16] only hostname,
[14:45:19] :(
[14:45:49] and I am pretty sure it is not reliable on production anyway
[14:45:59] I will go for the puppet class
[14:46:09] and a file in /etc/
[14:46:58] sounds better
[14:47:27] ...
[14:47:37] include_once('blah.php')
[14:47:42] then it'll go into apc
[14:47:57] problem solved :)
[14:48:00] ahah
[14:48:04] PROBLEM dpkg-check is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[14:53:04] RECOVERY dpkg-check is now: OK on mobile-testing i-00000271 output: All packages OK
[14:55:33] I came up with: /etc/wikimedia-realm { content => $::realm }}
[14:55:37] https://gerrit.wikimedia.org/r/12377
[14:55:47] Ryan_Lane paravoid -> https://gerrit.wikimedia.org/r/12377 :-)
[14:55:48] hashar, did you look at $_SERVER?
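For shell callers such as cron jobs, the /etc/wikimedia-realm file proposed above could be read roughly as follows. This is a minimal sketch, not the code from the Gerrit change: the function name `detect_realm` is hypothetical, and it assumes the file holds the single word "labs" or "production", falling back to "production" in the same way the chat defaults to 'pmtpa'.

```shell
#!/bin/sh
# Sketch only: detect_realm is a hypothetical helper, not part of any Wikimedia repo.
# Assumption: puppet writes "labs" or "production" into /etc/wikimedia-realm.

detect_realm() {
    # Optional first argument overrides the file path (handy for testing).
    realm_file="${1:-/etc/wikimedia-realm}"
    realm=""
    if [ -r "$realm_file" ]; then
        # Drop all whitespace, including the trailing newline.
        realm="$(tr -d '[:space:]' < "$realm_file")"
    fi
    case "$realm" in
        labs|production) echo "$realm" ;;
        # Missing, empty or unexpected content: fall back to production.
        *) echo "production" ;;
    esac
}
```

A caller would then branch on the result, for example `[ "$(detect_realm)" = "labs" ] && echo "running on labs"`.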
[14:56:11] just looking at .wmflabs.org in SERVER_NAME looks enough
[14:56:40] Platonides: I am not sure it is set on CLI
[14:56:59] doesn't seem to be :(
[14:57:04] in cli there's INSTANCENAME
[14:57:21] yeah but then that env variable is not set when running under apache
[14:57:29] unless it's there just for being run as a user :S
[14:57:34] we could check both
[14:57:45] apache doesn't source the environment
[14:57:49] and it shouldn't
[14:57:55] I am fine with that
[14:57:59] it's not an interactive user
[14:58:09] it doesn't even have a shell
[14:59:32] Ryan_Lane, I'm not meaning to source it
[14:59:50] It'd check in apache with the SERVER_NAME
[15:00:02] it's cronjobs that could need the environment
[15:00:32] why? if it's included in the mediawiki config, the crons will run fine
[15:00:49] the maintenance scripts include the config
[15:01:25] how would cronjob know if it's a wmflabs cronrunner or a production one?
[15:01:42] by looking at /etc/wikimedia-realm ?:-]
[15:02:23] !log deployment-prep updating MediaWiki to 80fbb70 (latest master)
[15:02:24] Logged the message, Master
[15:02:50] hashar, what are the steps to update wmflabs version?
[15:03:10] depends
[15:03:18] since the cluster keeps changing :-]
[15:03:21] hehehe, Select an editor. To change later, run 'select-editor'.
[15:03:21] 2. /bin/nano <---- easiest
[15:05:11] hashar, you go to /usr/local/apache/common-local/php-trunk/ and do git pull ?
[15:05:25] I am pretty sure I wrote some doc somewhere
[15:05:40] so yeah that
[15:05:53] then in extensions : git submodule update
[15:07:36] oh and git submodule checkout master, then git submodule pull
[15:08:00] finally update the DB with: foreachwiki update.php
[15:08:06] (which takes a loooong time)
[15:08:28] why don't we have a script with all those steps?
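The update steps listed above could be collected into one script that stops at the first failing step. This is a minimal sketch under stated assumptions, not an existing labs script: `run_step`, `update_beta` and the `DRY_RUN` switch are hypothetical names, and `git submodule foreach` stands in for the "checkout master, then pull" step because git has no `git submodule checkout` or `git submodule pull` subcommand. The path and the `foreachwiki update.php` call are taken from the chat.

```shell
#!/bin/sh
# Sketch only: run_step, update_beta and DRY_RUN are hypothetical names.

# Print a step, then run it unless DRY_RUN=1.
run_step() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        return 0
    fi
    "$@"
}

# Update the MediaWiki checkout and its extensions, then the databases.
# Returns non-zero as soon as one step fails.
update_beta() {
    mw_dir="${1:-/usr/local/apache/common-local/php-trunk}"
    run_step cd "$mw_dir" || return 1
    run_step git pull || return 1
    run_step cd extensions || return 1
    run_step git submodule update || return 1
    # Bring every extension submodule to the tip of master.
    run_step git submodule foreach 'git checkout master && git pull' || return 1
    # Slow step: runs the schema updater on every wiki.
    run_step foreachwiki update.php || return 1
}
```

Running `DRY_RUN=1; update_beta` only prints the six steps, which makes it possible to review them before a real run; each step's output is still visible during a real run.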
[15:08:48] too lazy to write one I guess
[15:09:02] and I like to review the output after running each step
[15:09:13] definitely could use something that would die out whenever a command has error
[15:09:28] set -e is your friend :)
[15:09:53] git submodule is just one command, so -e is probably useless there
[15:10:06] what?
[15:10:31] Ryan_Lane: paravoid can you please give your word on /etc/wikimedia-realm at https://gerrit.wikimedia.org/r/#/c/12377/ ? thankx!
[15:10:39] hint: set -e is equivalent to set -o errexit
[15:11:56] ohh
[15:12:09] Platonides: there are already scripts. The repo is mediawiki/extensions
[15:13:30] where are those scripts?
[15:13:41] I don't see on mediawiki/extensions any script to update wmflabs
[15:14:36] hashar: sorry a bit busy right now...
[15:15:06] I am pretty sure you all are. It is just my lame attempt to end up at the top of the queue!
[15:15:08] ;)
[15:18:34] Ryan_Lane: I wonder if my, yours, Andrew's and Peter (who's been helping me) time is actually more expensive than RAID controllers.
[15:19:20] paravoid: Depends if you're on a wage or salary.
[15:19:23] paravoid: hahaha
[15:19:39] I'd *much* prefer the raid controllers
[15:19:44] "Hardware justification: My time is worth more than the card"
[15:19:48] however, do you want to wait a month or two for them?
[15:19:49] probably why don't bother changing a dead fan on a server. Easier to just buy a new one :D
[15:21:21] paravoid: Have you given up sleeping entirely?
[15:21:40] andrewbogott: heh. I was actually wondering the same :)
[15:21:45] and/or are there nonstop riots outside your window such that sleeping is out of the question?
[15:22:37] grumble grumble
[15:22:54] even though sleeping is a problem, my biggest issue right now is RSI :/
[15:22:57] pain, just pain
[15:23:24] I had a bout a couple of months ago. It's the worst.
[15:23:36] paravoid: ew. that sucks
[15:23:43] been a while since my wrists have hurt me
[15:23:44] Lemme guess... pinky finger, right hand?
[15:24:26] andrewbogott: far off :) upper part of the wrist, left hand
[15:24:32] (I'm left-handed though)
[15:24:44] Do you use a mouse?
[15:24:54] I do
[15:25:03] I think using a trackpad actually helped with my wrists
[15:25:21] Yeah, when I switched to strictly using laptop trackpads a lot of my wrist problems went away. Not all though.
[15:26:03] paravoid: But, I'm not trying to give advice, just express sympathy.
[15:26:04] RECOVERY Puppet freshness is now: OK on queue-wiki1 i-000002b8 output: puppet ran at Thu Jun 21 15:26:04 UTC 2012
[15:26:17] thanks :)
[15:29:26] hm. what to fix now
[15:29:27] oh
[15:29:33] yeah. I'll test gluster
[15:30:54] RECOVERY Puppet freshness is now: OK on mwreview i-000002ae output: puppet ran at Thu Jun 21 15:30:39 UTC 2012
[15:31:44] PROBLEM Current Load is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[15:36:18] !log deployment-prep Closed {{bug|37500}} - migrates Apaches boxes to precise
[15:36:20] Logged the message, Master
[15:36:35] * Damianz gives hashar a cookie
[15:36:44] RECOVERY Current Load is now: OK on mobile-testing i-00000271 output: OK - load average: 0.09, 0.12, 0.16
[15:36:46] !log deployment-prep Closed {{bug|37217}} - thumbnail extraction for videos needs newer ffmpeg
[15:36:47] Logged the message, Master
[15:36:56] Damianz: thanks!!!
[15:39:07] hashar++
[15:39:24] RECOVERY Puppet freshness is now: OK on e3 i-00000291 output: puppet ran at Thu Jun 21 15:39:15 UTC 2012
[15:40:11] chrismcmahon: I still need to make 'beta' transcode the videos : /
[15:42:18] hashar, so I think this would do the job of bumping deployment-prep: http://pastebin.com/qa03z21x
[15:43:04] paravoid: The ciscos that you're messing with are different from the ones I'm messing with, right? Or are you also working on virt1001-1008?
[15:43:27] andrewbogott: no, I'm trying to install virt6-8
[15:43:29] pmtpa
[15:43:37] ok, great.
[15:43:59] PROBLEM Current Users is now: CRITICAL on gluster-2 i-000002e0 output: Connection refused by host
[15:44:34] PROBLEM Disk Space is now: CRITICAL on gluster-2 i-000002e0 output: Connection refused by host
[15:44:34] PROBLEM Free ram is now: UNKNOWN on gluster-3 i-000002e1 output: NRPE: Unable to read output
[15:44:59] PROBLEM dpkg-check is now: CRITICAL on e3 i-00000291 output: Connection refused by host
[15:45:09] PROBLEM Free ram is now: UNKNOWN on gluster-2 i-000002e0 output: NRPE: Unable to read output
[15:45:49] PROBLEM Free ram is now: UNKNOWN on gluster-1 i-000002df output: NRPE: Unable to read output
[15:46:39] PROBLEM Current Load is now: CRITICAL on e3 i-00000291 output: Connection refused by host
[15:46:39] PROBLEM Current Users is now: CRITICAL on e3 i-00000291 output: Connection refused by host
[15:46:39] PROBLEM Disk Space is now: CRITICAL on e3 i-00000291 output: Connection refused by host
[15:46:39] PROBLEM Free ram is now: CRITICAL on e3 i-00000291 output: Connection refused by host
[15:46:39] PROBLEM Total Processes is now: CRITICAL on e3 i-00000291 output: Connection refused by host
[15:48:59] RECOVERY Current Users is now: OK on gluster-2 i-000002e0 output: USERS OK - 0 users currently logged in
[15:49:02] Platonides: well extensions are submodules. So you can just use "git submodule foreach" instead of a recursive find :-]
[15:49:28] Platonides: -q is a good idea
[15:49:29] RECOVERY Disk Space is now: OK on gluster-2 i-000002e0 output: DISK OK
[15:50:20] Platonides: probably need to do the git submodule init / update before the checkout/pull
[15:50:28] hmm no
[15:50:30] that is fine
[15:50:50] that's just in case a new one was added but not checked out yet
[15:53:57] ok, I will place it somewhere in labs then
[15:55:03] what do you think of also adding a php -l line?
[15:55:03] find \( -name \*.php -or -name \*.inc \) | xargs -iFILE php -l FILE > /dev/null
[15:55:29] the faulty file would be already checked out, but at least it'd be easy to spot for the scapper
[16:11:39] RECOVERY Current Load is now: OK on e3 i-00000291 output: OK - load average: 0.23, 0.34, 0.33
[16:11:39] RECOVERY Disk Space is now: OK on e3 i-00000291 output: DISK OK
[16:11:39] RECOVERY Current Users is now: OK on e3 i-00000291 output: USERS OK - 0 users currently logged in
[16:11:39] RECOVERY Total Processes is now: OK on e3 i-00000291 output: PROCS OK: 97 processes
[16:11:44] RECOVERY Free ram is now: OK on e3 i-00000291 output: OK: 92% free memory
[16:14:59] RECOVERY dpkg-check is now: OK on e3 i-00000291 output: All packages OK
[16:17:11] Platonides: we don't scap on beta
[16:17:22] Platonides: and I assume extensions to be linted already :-]
[16:17:24] it's just an update
[16:17:32] php -l would slow things :-(
[16:17:36] yeah, that's what *should* happen
[16:17:38] so just sync and forget :-]
[16:17:52] that line caught php fatal errors several times :)
[16:17:54] on production we only lint when syncing a file
[16:18:02] and never lint when doing scap or syncing a dir
[16:18:02] to be strict, there's no sync either
[16:24:09] PROBLEM Puppet freshness is now: CRITICAL on deployment-apache30 i-000002d3 output: Puppet has not run in last 20 hours
[16:25:09] PROBLEM Puppet freshness is now: CRITICAL on deployment-cache-bits i-00000264 output: Puppet has not run in last 20 hours
[16:28:09] PROBLEM Puppet freshness is now: CRITICAL on wikistats-01 i-00000042 output: Puppet has not run in last 20 hours
[16:30:09] PROBLEM Puppet freshness is now: CRITICAL on deployment-bastion i-000002bd output: Puppet has not run in last 20 hours
[16:36:49] PROBLEM Current Users is now: CRITICAL on bastion-restricted1 i-0000019b output: USERS CRITICAL - 11 users currently logged in
[16:38:09] PROBLEM Puppet freshness is now: CRITICAL on deployment-feed i-00000118 output: Puppet has not run in last 20 hours
[16:39:09] PROBLEM Puppet freshness is now: CRITICAL on deployment-jobrunner05 i-0000028c output: Puppet has not run in last 20 hours
[16:43:09] PROBLEM Puppet freshness is now: CRITICAL on deployment-imagescaler01 i-0000025a output: Puppet has not run in last 20 hours
[16:44:09] PROBLEM Puppet freshness is now: CRITICAL on deployment-syslog i-00000269 output: Puppet has not run in last 20 hours
[17:31:42] Change abandoned: Jens Ohlig; "(no reason)" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/10567
[17:32:18] Change restored: Jens Ohlig; "(no reason)" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/10567
[17:58:53] petan: so, it's not as simple as just pushing that script into the public repo
[17:59:09] Ryan_Lane: what you mean
[17:59:12] there's no way to tell the irc server that the labs bot can only write into certain channels
[17:59:31] Ryan_Lane: labs bot?
[17:59:37] I have no idea how that thing works
[17:59:49] a bot connects to the server and writes into the channels
[17:59:57] labs currently send data over UDP but I don't know about some bot
[18:00:03] aha
[18:00:12] the bot takes the udp and writes it into the channels
[18:00:25] ok, so we could just let it listen to labs UDP as well
[18:00:43] labs wouldn't use production channel names
[18:00:56] or if u want I can write a better bot
[18:01:02] that won't help
[18:01:04] so that it's possible to separate it
[18:01:08] per IP
[18:01:29] that won't help
[18:01:33] the irc server is patched
[18:01:35] ok, what help
[18:01:38] only ops can write
[18:01:46] and the bot is an op
[18:01:53] Just give us the op passwords?
[18:01:54] :D
[18:01:57] nah
[18:02:02] but that's still same
[18:02:03] we'll need to run another server
[18:02:07] why
[18:02:07] Ryan_Lane: You know this would be an ideal use case for a proper queue....
[18:02:12] we could use this bot which can write
[18:02:17] ummm
[18:02:18] no
[18:02:21] it would just write to labs channels as well
[18:02:23] because thats the production one
[18:02:34] ok, so we could create another bot?
[18:02:47] any bot we use could write into the production channels
[18:02:52] that's not something we can do
[18:02:59] so, we'd need to run another server
[18:03:02] hm, that bot could have some access list
[18:03:10] so that we define which channels it is allowed to write to
[18:03:10] no, it can't
[18:03:18] not with the software we are running
[18:03:27] I was talking about improving it :)
[18:03:29] * Damianz wonders if you wrote an irc server in php :D
[18:03:33] and we're not going to modify the production server just so that labs can write to it
[18:03:35] it's open source or not
[18:03:38] when we can just make another server
[18:04:00] right we are back where we were months ago then
[18:04:03] Problem with another server is most people hate joining crappy servers so everyone will ignore it
[18:04:05] yep
[18:04:24] no getting around that, though
[18:04:26] Damianz: we don't require them to join this one
[18:04:34] Damianz: it's for people who deal with spam
[18:05:05] let's just add a server to the beta project
[18:05:08] Ryan_Lane: ok, in that case we are back in previous status: I need IP for IRCD
[18:05:12] :D
[18:05:15] If I had a fetish for spam I'd rather deal with on production than join yet another network and fight stuff pointlessly.
[18:05:27] Ryan_Lane: that ircd already exist, I made it months ago
[18:05:30] I just packaged our version of the software
[18:05:32] https://gerrit.wikimedia.org/r/#/c/12425/1
[18:05:34] before you decided to use production feed
[18:05:38] it should run the same
[18:05:39] aha
[18:05:51] tbh I don't like what you use on production, but fine
[18:05:56] Can we just have oauth, let beta use our labs details and bamn we fixed spamming issue?
[18:06:03] that ircd you use is obsolete a bit
[18:06:07] how would that fix spamming issues?
[18:06:15] ratbox is what efnet uses
[18:06:24] we're using an older version
[18:06:30] really? good I am not using efnet ^^
[18:06:37] ok
[18:06:49] the version I packaged is a newer version
[18:06:51] I hope it won't crash too much
[18:06:52] that's in lucid
[18:07:23] Ryan_Lane: can we have configs without pw's?
[18:07:27] the basic idea is that only the bot should be able to write into the irc server
[18:07:28] Ryan_Lane: I want to see them
[18:07:39] that's another issue
[18:07:48] hm
[18:07:57] I'm thinking of just running a second version of this in production
[18:08:04] don't tell me it's so hard to remove few passwords from a file :D
[18:08:26] sanitizing large config files is a pain
[18:08:44] I don't believe there is more than 2 passwords
[18:09:15] but I can create some experimental config myself
[18:10:08] btw what do I have to do, so that I am considered trustworthy enough to see a password of irc bot I would likely never used :D
[18:12:12] Sell your soul
[18:12:41] we don't let production passwords outside of production
[18:20:39] 06/21/2012 - 18:20:38 - Updating keys for mkroetzsch at /export/keys/mkroetzsch
[18:21:49] RECOVERY Current Users is now: OK on bastion-restricted1 i-0000019b output: USERS OK - 0 users currently logged in
[18:30:19] PROBLEM Disk Space is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[18:35:19] RECOVERY Disk Space is now: OK on mobile-testing i-00000271 output: DISK OK
[19:04:11] * petan guess he never make it to production
[19:22:39] 06/21/2012 - 19:22:39 - Creating a home directory for koosha at /export/keys/koosha
[19:23:37] 06/21/2012 - 19:23:37 - Updating keys for koosha at /export/keys/koosha
[19:25:39] 06/21/2012 - 19:25:39 - Updating keys for erosen at /export/keys/erosen
[19:31:39] 06/21/2012 - 19:31:39 - Updating keys for koosha at /export/keys/koosha
[19:59:07] PROBLEM Puppet freshness is now: CRITICAL on deployment-apache31 i-000002d4 output: Puppet has not run in last 20 hours
[20:52:30] 06/21/2012 - 20:52:30 - Created a home directory for erosen in project(s): analytics
[20:53:32] 06/21/2012 - 20:53:32 - User erosen may have been modified in LDAP or locally, updating key in project(s): analytics
[20:54:48] maplebed: just cloned the puppet scripts
[20:54:49] hi all, I'm trying to use labs and I can't get to bastion
[20:55:20] notmyname: excellent. let me know if you have questions.
[20:55:33] i've just had my account set up and I am hoping someone can help me make sure I have all of the necessary prereqs
[20:58:53] PROBLEM Current Load is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds.
[21:02:58] !initial-login | erosen
[21:02:58] erosen: https://labsconsole.wikimedia.org/wiki/Access#Initial_log_in
[21:03:21] !initial-login del
[21:03:21] Successfully removed initial-login
[21:03:25] !initial-login is https://labsconsole.wikimedia.org/wiki/Access#Initial_log_in_and_password_change
[21:03:25] Key was added
[21:03:26] so I followed the directions here pretty faithfully i think
[21:03:34] then you're good to go
[21:03:40] are you going to be using labs?
[21:03:45] or just gerrit?
[21:03:49] labs
[21:04:04] partly to help with the analytics teams
[21:04:16] and partly for my global development work (assuming that is kosher)
[21:04:23] ok. I just added you to the bastion
[21:04:41] you can use labs for anything wikimedia related
[21:05:17] just to be clear, if I have my ssh-agent set up correctly, I should be able to type:
[21:05:29] ssh erosen@bastion.wmflabs.org
[21:05:32] 06/21/2012 - 21:05:32 - Created a home directory for erosen in project(s): bastion
[21:05:41] dec added me to the analytics group
[21:05:46] dsc*
[21:06:11] yes, I just added you to the bastion project
[21:06:17] you should be able to get in now
[21:06:33] 06/21/2012 - 21:06:32 - User erosen may have been modified in LDAP or locally, updating key in project(s): bastion
[21:07:23] no luck so far:
[21:07:29] Connection closed by 208.80.153.194
[21:13:43] RECOVERY Current Load is now: OK on mobile-testing i-00000271 output: OK - load average: 0.30, 0.30, 0.26
[21:14:21] erosen: likely due to negative cache in nscd. gimme a sec
[21:14:49] erosen: try now
[21:15:26] yay!
[21:15:27] success
[21:16:43] just so I can know what happened, what did you do?
[21:17:44] did you just add me to the bastion project
[21:19:53] PROBLEM Current Users is now: CRITICAL on bastion-restricted1 i-0000019b output: USERS CRITICAL - 12 users currently logged in
[21:21:13] yes
[21:21:16] but
[21:21:24] you had tried to log-in before you were a member
[21:21:26] it cached that
[21:21:33] and it caches it for way too long
[21:30:43] RECOVERY HTTP is now: OK on blamemaps-s1 i-000002c3 output: HTTP OK: HTTP/1.1 200 OK - 453 bytes in 0.097 second response time
[21:31:23] awesome. thanks again for the prompt help
[21:32:35] Ryan_Lane: Would you expect autoinstall (netboot/preseed/partman/etc.) to install puppet and keys as part of the process, or am I missing a step?
[21:34:18] question about deployment-prep: "It is intended to follow the different configuration that Wikimedia uses for its sites, along with the on-wiki MediaWiki configuration" Does this mean MW config can be changed on-wiki?
[21:35:43] RECOVERY HTTP is now: OK on demo-web1 i-00000255 output: HTTP OK: HTTP/1.1 200 OK - 453 bytes in 0.153 second response time
[21:38:53] PROBLEM HTTP is now: CRITICAL on blamemaps-s1 i-000002c3 output: CRITICAL - Socket timeout after 10 seconds
[21:41:53] PROBLEM HTTP is now: WARNING on deployment-apache30 i-000002d3 output: HTTP WARNING: HTTP/1.1 403 Forbidden - 366 bytes in 0.015 second response time
[21:41:53] RECOVERY HTTP is now: OK on grail i-000002c6 output: HTTP OK: HTTP/1.1 200 OK - 453 bytes in 0.205 second response time
[21:43:53] PROBLEM HTTP is now: CRITICAL on demo-web1 i-00000255 output: CRITICAL - Socket timeout after 10 seconds
[21:47:03] PROBLEM HTTP is now: CRITICAL on deployment-apache30 i-000002d3 output: CRITICAL - Socket timeout after 10 seconds
[21:50:03] PROBLEM HTTP is now: CRITICAL on grail i-000002c6 output: CRITICAL - Socket timeout after 10 seconds
[21:55:14] Ryan_Lane: nm, I'm dumb, this is all explained on a wikitech page which I've already read
[22:03:03] PROBLEM Free ram is now: UNKNOWN on incubator-bot1 i-00000251 output: NRPE: Call to fork() failed
[22:06:53] PROBLEM Disk Space is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:06:53] PROBLEM Total Processes is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:07:33] PROBLEM Current Load is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:07:33] PROBLEM Current Users is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:08:03] PROBLEM Free ram is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:11:57] PROBLEM dpkg-check is now: CRITICAL on incubator-bot1 i-00000251 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:12:17] PROBLEM SSH is now: CRITICAL on incubator-bot1 i-00000251 output: Server answer:
[22:16:47] RECOVERY Disk Space is now: OK on incubator-bot1 i-00000251 output: DISK OK
[22:16:47] RECOVERY Total Processes is now: OK on incubator-bot1 i-00000251 output: PROCS OK: 130 processes
[22:16:57] RECOVERY dpkg-check is now: OK on incubator-bot1 i-00000251 output: All packages OK
[22:17:17] RECOVERY SSH is now: OK on incubator-bot1 i-00000251 output: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1 (protocol 2.0)
[22:17:37] RECOVERY Current Users is now: OK on incubator-bot1 i-00000251 output: USERS OK - 0 users currently logged in
[22:17:37] RECOVERY Current Load is now: OK on incubator-bot1 i-00000251 output: OK - load average: 0.91, 1.04, 0.80
[22:18:07] RECOVERY Free ram is now: OK on incubator-bot1 i-00000251 output: OK: 35% free memory
[22:41:03] PROBLEM Current Users is now: WARNING on bastion1 i-000000ba output: USERS WARNING - 6 users currently logged in
[23:15:57] RECOVERY Current Users is now: OK on bastion1 i-000000ba output: USERS OK - 5 users currently logged in
[23:44:57] RECOVERY Current Users is now: OK on bastion-restricted1 i-0000019b output: USERS OK - 0 users currently logged in