[09:24:07] andrewbogott: hey
[09:32:27] hm, calling log on web4 should show up here?
[09:32:37] it returned that my message was logged
[09:34:41] ah, let me fix that
[09:37:00] !log deployment-prep petrb: this is a test
[09:38:01] try now
[09:38:15] it was misconfigured on it
[09:38:19] anyway the log bot is down :P
[09:38:25] so it doesn't get to sal
[09:40:44] !log deployment-prep j: make sure all web instances have the same upload_max_filesize value in php.ini (100M)
[09:42:21] j^: I am just setting up web6
[09:43:23] ah, make sure to set upload_max_filesize to 100M, and call /home/j/tmh-deployment-web.sh
[09:46:29] petan: can you sync /etc/php5/apache2/php.ini between all web instances?
[09:46:45] right now they all have not quite the same setup
[09:48:12] i.e. web1 has larger max_execution_time, max_input_time, memory_limit compared to web4
[09:55:09] yes it would be better if it was somewhere on nfs
[09:55:20] or project store
[10:01:45] ok done
[10:01:57] Or you could use puppet :D
[10:02:01] now it's all in one folder which is mounted to all vm's
[10:02:10] Damianz: if we had any control over it
[10:02:22] I don't like waiting a month for my patch to be merged
[10:02:46] puppet is good for ops maybe
[10:02:50] Only a month?
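The php.ini drift described above (web1 and web4 disagreeing on max_execution_time, max_input_time, memory_limit) can be spot-checked with a short script before syncing. A minimal sketch — in practice you would loop this over ssh to each of web1..web6 (hostnames from the log); here it just reads a local php.ini path so it can run anywhere:

```shell
# check_php_ini: print the settings that were drifting between the
# deployment-prep web instances, read from a given php.ini path.
# (To compare hosts you'd run this via `ssh webN ...` for each instance.)
check_php_ini() {
    ini="$1"
    for key in upload_max_filesize max_execution_time max_input_time memory_limit; do
        # first uncommented "key = value" line, value only, whitespace stripped
        val=$(grep -E "^[[:space:]]*${key}[[:space:]]*=" "$ini" | head -n1 | cut -d= -f2- | tr -d '[:space:]')
        printf '%s=%s\n' "$key" "$val"
    done
}

# example: check_php_ini /etc/php5/apache2/php.ini
```

Diffing the output of this across instances shows at a glance which machine is out of sync.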
[10:04:02] what's wrong on nagios
[10:05:21] really I feel like creating a proper bot written in c would be a good idea
[10:05:58] j^: all machines are 100% same now
[10:18:13] RECOVERY dpkg-check is now: OK on deployment-web6 deployment-web6 output: All packages OK
[10:18:53] RECOVERY Current Load is now: OK on deployment-web6 deployment-web6 output: OK - load average: 0.61, 0.41, 0.23
[10:20:13] RECOVERY Disk Space is now: OK on deployment-web6 deployment-web6 output: DISK OK
[10:21:03] RECOVERY Free ram is now: OK on deployment-web6 deployment-web6 output: OK: 90% free memory
[10:22:23] RECOVERY Total Processes is now: OK on deployment-web6 deployment-web6 output: PROCS OK: 104 processes
[10:24:33] RECOVERY Current Users is now: OK on deployment-web6 deployment-web6 output: USERS OK - 0 users currently logged in
[11:25:08] http://commons.wikimedia.beta.wmflabs.org/robots.txt still allows google to index the lab instance and google is the most active user
[12:25:08] j^: do we know why it is?
[12:26:06] petan: not sure but noticed a lot of google bot requests in errors.log
[12:26:14] <^demon> I don't see any reason for any crawler to index it
[12:28:03] j^: inserted Disallow: /wiki/
[12:28:08] just now
[12:28:16] ^demon: neither do I
[12:28:30] <^demon> What about /w/index.php?
[12:28:33] if there is a line telling go the fuck out of this site, I will insert it
[12:28:37] <^demon> I'd just disallow: /
[12:28:42] ok
[12:28:43] done
[12:53:43] PROBLEM Disk Space is now: WARNING on wikistream-1 wikistream-1 output: DISK WARNING - free space: / 78 MB (5% inode=47%):
[13:03:43] RECOVERY Disk Space is now: OK on wikistream-1 wikistream-1 output: DISK OK
[13:10:37] !ping
[13:10:37] pong
[13:21:24] PROBLEM HTTP is now: CRITICAL on deployment-web6 deployment-web6 output: Connection refused
[13:26:44] PROBLEM Disk Space is now: WARNING on wikistream-1 wikistream-1 output: DISK WARNING - free space: / 78 MB (5% inode=47%):
[13:29:21] I think it's time to set some rules for nagios and I'll do it now. Anyone who does not handle a problem reported by nagios within 2 hours will have the check disabled
[13:29:47] you can request an account there so that you can reenable it
[13:30:25] we are getting spammed by errors reported from instances no one cares about
[13:31:14] PROBLEM Puppet freshness is now: CRITICAL on aggregator1 aggregator1 output: Puppet has not run in the last 10 hours
[13:31:14] ACKNOWLEDGEMENT Disk Space is now: WARNING on wikistream-1 wikistream-1 output: DISK WARNING - free space: / 78 MB (5% inode=47%):
[13:31:14] PROBLEM Puppet freshness is now: CRITICAL on analytics analytics output: Puppet has not run in the last 10 hours
[13:31:14] PROBLEM Puppet freshness is now: CRITICAL on asher1 asher1 output: Puppet has not run in the last 10 hours
[13:31:14] PROBLEM Puppet freshness is now: CRITICAL on backport backport output: Puppet has not run in the last 10 hours
[13:33:34] ACKNOWLEDGEMENT Current Load is now: CRITICAL on kripke kripke output: Connection refused by host
[13:33:49] ACKNOWLEDGEMENT dpkg-check is now: CRITICAL on orgcharts-dev orgcharts-dev output: DPKG CRITICAL dpkg reports broken packages
[13:34:19] ACKNOWLEDGEMENT dpkg-check is now: CRITICAL on puppet-lucid puppet-lucid output: DPKG CRITICAL dpkg reports broken packages
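The robots.txt change that ended the crawler discussion above — petan first added "Disallow: /wiki/", then per ^demon's suggestion switched to disallowing everything — amounts to writing a two-line file. A sketch; it writes to the current directory here, since the log doesn't say where the beta cluster's web root actually is:

```shell
# Block all crawlers from the whole site, as ^demon suggested
# ("I'd just disallow: /"). The 'User-agent: *' line is the one
# maplebed later recommends adding at 14:24:18.
cat > robots.txt <<'EOF'
User-agent: *
Disallow: /
EOF
```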
[13:34:34] ACKNOWLEDGEMENT dpkg-check is now: CRITICAL on test-oneiric test-oneiric output: DPKG CRITICAL dpkg reports broken packages
[13:34:49] ACKNOWLEDGEMENT dpkg-check is now: CRITICAL on utils-abogott utils-abogott output: DPKG CRITICAL dpkg reports broken packages
[13:35:10] mutante: you here?
[13:35:31] or andrewbogott
[13:38:06] yep
[13:38:14] do you have time to look into the puppet issue?
[13:38:17] re nagios
[13:38:49] it's been a few days and nagios marks it all as critical
[13:38:49] ok, I'll give it another try
[13:39:00] just check how it is done on prod
[13:39:05] I don't have access there to do that
[13:39:37] yea, i just couldnt find it either last time
[13:39:45] it must be in puppet config
[13:39:57] probably in some initialisation script
[13:40:43] if there is any editable script which is run on start
[13:40:46] we could use it
[13:41:01] I don't know how this passive check really works, but netcat could probably do that
[13:41:08] if we knew which port we need to send data to
[13:41:22] we could just echo "blah" | nc it
[13:41:28] in puppet script
[13:45:45] how is puppet run
[13:45:52] in cron?
[13:46:30] you must know who configured it on prod
[13:46:47] or else we could use the check I pushed to gerrit
[13:52:35] got it
[13:52:42] its in base.pp
[13:52:50] public puppet config
[13:53:10] command => "snmptrap -v 1 -c public nagios.wikimedia.org
[13:53:35] ok, can we change it so that it sends it to nagios
[13:53:35] that answers your question where the IP / hostname is configured
[13:53:44] -c public nagios.wikimedia.org
[13:53:48] ok, how do we change it
[13:53:50] yes
[13:54:01] "public" is the community string / password
[13:54:32] so public = default = public
[13:54:38] ok
[13:54:47] can we create a template so that it's different in labs
[13:54:58] hostname would be just nagios
[13:56:40] it would be cool if someone removed all instances which aren't being used
[13:56:52] creating a template would mean putting it in puppet
[13:57:00] it seems to me that someone created a bunch of instances they don't need and just left them running there
[13:57:20] mutante: we need to put it in puppet
[13:57:46] yes, but is it running on labs nagios again?
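The change being discussed — making the snmptrap target realm-dependent so labs instances report to their own nagios ("hostname would be just nagios") — could look roughly like this. This is only a sketch, not the actual change merged later as gerrit 3988; the variable names and the realm selector are assumptions modeled on the base.pp snippet quoted above, and the trailing "..." stands for the trap arguments the log truncates:

```puppet
# Sketch: pick the trap target by realm so labs sends freshness traps
# to its own nagios instead of production's. $::realm and the exec name
# are assumptions, not the merged code.
$nagios_host = $::realm ? {
    'labs'  => 'nagios',
    default => 'nagios.wikimedia.org',
}

exec { 'puppet snmp trap':
    # "public" is the SNMP community string (the default), per mutante above
    command => "snmptrap -v 1 -c public ${nagios_host} ...",
    path    => '/usr/bin:/bin',
}
```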
[13:58:33] !nagios
[13:58:33] http://nagios.wmflabs.org/nagios3
[13:58:40] there is a passive check on all instances
[13:58:45] yes snmp is
[13:58:48] running
[13:59:08] i mean if the puppet agent is running successfully and it has selected monitoring classes via labs console
[13:59:26] because there was an issue with the selected classes
[13:59:41] puppet surely isn't running well
[13:59:46] that's why we created this check
[13:59:58] there is a bunch of instances where it's broken
[14:00:11] I hope this check helps us to find them
[14:00:20] that's why I want to make it
[14:00:32] of course, just saying to apply the fix to labs nagios we need to run the agent
[14:00:44] and make sure which classes we can select
[14:00:55] we don't need to apply any patch to nagios in labs, we need to apply it to all instances we have
[14:01:01] nagios on labs was already updated by hand
[14:01:25] snmp is running there and waiting for messages from puppet
[14:01:58] I can't fix nagios, for that I need to have a command line
[14:02:10] there is a class which doesn't exist and I can't remove it
[14:02:18] but that's a minor issue
[14:02:26] ok
[14:02:54] what we need is to update puppet so that it sends the message to nagios
[14:02:59] every time it finishes a run
[14:03:22] right, for now let's fix base.pp
[14:03:41] I understand it that right now it sends the data to production nagios
[14:03:42] so are the labs instances really using that
[14:03:44] even from labs
[14:04:01] if there isn't any realm I am pretty sure they are using this
[14:05:41] I don't really know how it works, I am pretty bad with puppet
[14:08:13] ok, see, give me some time for fresh coffee and i'll come up with a gerrit change, just expect it to sit a little bit in review before its merged
[14:08:26] ok
[14:18:34] petan: sorry, I was sleeping... catching up on the backscroll now
[14:24:18] petan, j^: If beta is still getting crawled, maybe add 'User-agent: *' to robots.txt.
[14:42:22] petan: i have the change now, was just wondering for a little if this should be in the test or production branch, and found i have a messed up git repo when switching to test
[14:49:28] mutante: Labs machines get their puppet files from the test branch.
[14:49:50] Did you switch to test by doing a pull, a rebase, or a checkout?
[14:50:05] checkout
[14:50:20] Hm... that should work fine.
[14:50:46] (I was just thinking that maybe you'd accidentally tried to merge test + production which does not work fine.)
[14:51:24] ok, lets try again, to test
[14:51:38] stashing my changes first
[14:53:08] Hm... I never use 'stash' when I'm switching branches. Which is not to say that it won't work...
[14:53:26] Is there a stash per branch or just a global stash? (I've always assumed the latter.)
[14:55:26] git checkout test gave: "error: Your local changes to the following files...", then i used "git stash", then repeated git checkout test to switch
[14:55:33] As stash returns to HEAD I think per branch but no idea
[14:56:13] You could totally take a stash from branch a and unstash it in master though I guess, as they are just in a ref called stash.
[14:56:39] Yeah, so that suggests that it's global...
[14:57:02] Dunno... I'm scatterbrained enough that I think I would lose track of/forget what's in my stash. So I usually commit before switching branches.
[14:59:49] mutante: I also find it much simpler to create a patch compared to submitting it to gerrit
[15:00:35] I mean, getting it merged is much harder than creating it
[15:02:35] At the moment, any puppet change in test is applied to /all/ labs machines, so the bar is pretty high for getting changes in. Someday soon each project will have its own git branch, and the bar will be much lower.
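The stash question above ("per branch or just a global stash?") is easy to answer empirically: the stash is one global stack stored under refs/stash, not per-branch, so a change stashed on one branch pops fine on another. A throwaway-repo sketch (branch names are arbitrary):

```shell
# Demonstrate that git's stash is global: stash on branch "test",
# pop on the original branch.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=t@example.org -c user.name=t commit -q --allow-empty -m init
base=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on git version
git checkout -q -b test
echo change > file.txt
git add file.txt
git stash push -q                       # stash created while on "test"
git checkout -q "$base"
git stash pop -q                        # ...applies cleanly on the base branch
cat file.txt                            # -> change
```

This is also why mutante's stash-then-checkout workflow works: the stash follows you across branches rather than staying behind.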
[15:03:34] this is supposed to be applied to all machines
[15:03:57] it would be nice if puppet runs were enforced on machines where it never ran
[15:04:56] ok, i had lost the push-for-review-test alias as well
[15:05:02] git config alias.push-for-review-test "push puppet HEAD:refs/for/test"
[15:05:11] git push-for-review-test
[15:05:37] fatal: 'puppet' does not appear to be a git repository
[15:05:45] It makes puppetizing/packaging stuff hard until we have branches but I think we're heading that way soon... just need to break the current puppetmaster first though.
[15:05:49] I think you can just use 'git review'
[15:06:00] i stopped using that after i heard there would be issues
[15:06:08] git review is nicer than the hook stuff
[15:06:13] Oh, ok. *shrug* It's what I use, seems to work so far.
[15:06:19] Maybe you have newer info than I.
[15:08:18] Dear Labs, YUNOLETMELOGIN!?
[15:08:38] !access | Reedy
[15:08:38] Reedy: https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
[15:08:46] hehe
[15:08:46] I know
[15:08:54] @search login
[15:08:54] Results (found 1): newgrp,
[15:08:59] o.o
[15:09:01] !newgrp
[15:09:01] newgrp can be used to change your current group ID for a login session.
[15:09:02] I get errors with a ssh key that works fine on gerrit
[15:09:18] ok, which kind of error, on bastion or the machine behind it
[15:09:23] bastion
[15:09:29] can't even get onto the network
[15:09:33] Reedy: Do you have access to a linux box anyplace? Or a shell account on a different WMF machine that actually works?
[15:09:39] Yes
[15:09:40] sec
[15:09:41] To any WMF site
[15:09:45] to any of my other servers
[15:09:46] to svn
[15:09:48] to gerrit/git
[15:09:54] Reedy: you need to use bastion-restricted i think
[15:09:57] oh
[15:10:05] Reedy: Howsabout we see if you can log into labs from one of those machines. Get putty/whatever out of the equation?
[15:10:07] Reedy: Gerrit keys != ldap keys.
[15:10:13] 03/30/2012 - 15:10:13 - Creating a home directory for reedy at /export/home/bastion/reedy
[15:10:16] Oh, yeah, or that :)
[15:10:21] !log bastion gave access to Reedy
[15:10:25] meh
[15:10:30] Reedy: I added you to the bastion project
[15:10:32] try now
[15:10:39] uh? you didnt have that ?
[15:10:43] Apparently not
[15:10:44] he didn't
[15:10:49] * Reedy blames Ryan
[15:10:56] * Damianz thinks petan should for SALbot
[15:10:56] Yup, that works
[15:11:06] Though, per mutante, I should probably be using bastion-restricted
[15:11:09] Damianz: I thought you administer it
[15:11:13] 03/30/2012 - 15:11:13 - Updating keys for reedy
[15:11:14] so you never logged on there before?
[15:11:16] !hyperon
[15:11:16] admin of logs
[15:11:18] ah
[15:11:18] I wonder if I've been making that mistake for the 12 or so accounts I created this week...
[15:11:19] that's it
[15:11:27] hyperon is master of botlogs
[15:11:34] petan: No that I know of :P
[15:11:41] s/No/Not/
[15:11:42] !damianz is some weirdo around here
[15:11:43] Key was added!
[15:11:46] mutante: lol, nope
[15:11:55] * Damianz tells cluebot to remove petan from wikipedia :D
[15:12:11] ok
[15:12:15] there is no petan on wikipedia
[15:12:17] :P
[15:12:24] I use a different user name
[15:12:28] Reedy: then dont worry about what i said about -restricted :p
[15:12:30] !wiki petan
[15:12:30] http://en.wikipedia.org/wiki/petan
[15:12:56] Damianz: you can remove that XD
[15:12:57] Totally should make an article on petan :D
[15:13:00] hehe
[15:13:15] Though hmmm
[15:13:24] Ah, only ops/roots have been moved
[15:13:26] "Soon I'll also add in everyone else who has shell access. I'll send a follow-up email when that happens."
[15:13:29] have cluebot create it, no one is going to be brave enough to remove it
[15:13:40] See wikipedia is silly because anyone starting an article probably has a COI with what they are writing about so nothing gets made
[15:13:52] Reedy: exactly
[15:14:24] Righto :)
[15:14:47] !ryanland
[15:14:47] in case you want to get to the wonderful land of labs use the portal we call bastion, you will see an amazing world where vm's run happily and nfs is friends with ntfs, puppets are fresh and gerrit is ugly :O
[15:15:06] I should probably generate a separate key for labs though anyway ;)
[15:15:12] yeh
[15:15:17] that's what I recommend to everyone
[15:15:29] thanks petan
[15:15:33] At least I can logon now ;)
[15:15:42] I should setup sshproxy stuff but I hate typing full hostnames.
[15:21:47] see you in a few hours
[15:21:48] :P
[15:37:32] Reedy: I just now tried to document your recent problem... mind checking my work? https://labsconsole.wikimedia.org/wiki/Help:Access#Accessing_public_and_private_instances
[15:37:47] Just the first P at the top there. Clear enough?
[15:41:11] Looks good
[15:43:03] ok, thanks.
[16:25:57] New patchset: Dzahn; "set a different nagios hostname for snmp traps in labs realm / fix puppet freshness monitoring in labs nagios" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3988
[16:26:10] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3988
[16:27:09] New patchset: Dzahn; "set a different nagios hostname for snmp traps in labs realm / fix puppet freshness monitoring in labs nagios" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3988
[16:27:21] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3988
[16:34:29] New review: Dzahn; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/3988
[16:34:31] Change merged: Dzahn; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3988
[16:45:32] New review: Dzahn; "Stage[main]/Base::Puppet/Exec[puppet snmp trap]/returns: executed successfully" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3988
[16:54:42] New review: Dzahn; "tcpdump port 162 on labs nagios shows them now" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3988
[17:15:03] RECOVERY Puppet freshness is now: OK on venus venus output: puppet ran at Fri Mar 30 17:14:49 UTC 2012
[17:15:27] <- manual test using submit_check_result which needed add. fix
[17:18:54] !log nagios fixed path to nagios.cmd CommandFile in ./eventhandlers/submit_check_result which is called by snmp traps (/var/log/nagios in prod vs. /var/lib/nagios3 in labs), i would prefer to just change it on labs to be like prod
[17:24:53] RECOVERY Puppet freshness is now: OK on wikistats-01 wikistats-01 output: puppet ran at Fri Mar 30 17:24:35 UTC 2012
[17:50:59] !log nagios started snmptrapd with the options as used in production, now just the hostnames don't match nagios hostnames
[17:56:39] PROBLEM Puppet freshness is now: CRITICAL on shop-analytics-main1 shop-analytics-main1 output: Puppet has not run in the last 10 hours
[17:56:39] PROBLEM Puppet freshness is now: CRITICAL on swift-fe1 swift-fe1 output: Puppet has not run in the last 10 hours
[18:11:07] petan: Poke :D
[18:12:39] * Damianz questions methecooldude's motives in poking random guys on the internet
[18:14:27] Damianz: :P
[18:15:00] Damianz: Do you happen to know how one create a public_html directory for themselfs on bots?
[18:15:09] creates*
[18:17:22] Damianz: (It's for alejrb)
[18:18:43] I'd need a shell command that i can execute on any labs instance, that returns the "instance_name" / the "nice" name, and replaces something that just works as `hostname` in prod. like on labs instances the hostnames are the resource names, but not the instance names
[18:18:47] can a nova instance find out it's own instance_name? as opposed to asking the controller
[18:20:00] or labs nagios would have to be changed to use resource names (e.g. i-00000000123) as the host_names from Nagios' point of view
[18:20:50] cause now, after other issues, the only reason left those puppet freshness passive checks still fail is that hostname mismatch
[18:21:39] PROBLEM Puppet freshness is now: CRITICAL on contenthandler-demo contenthandler-demo output: Puppet has not run in the last 10 hours
[18:22:25] methecooldude: yeah
[18:23:22] Damianz: Ok... how...
[18:25:58] ssh bots-nfs; cd /export/public/; mkdir alejrb
[18:26:14] Just done it.
[18:26:24] Oh also chown to user.www-data for access.
[18:26:57] Damianz: Ahh, bots-nfs... ok
[18:26:58] methecooldude: You're useless at writing code, right?
[18:27:07] Ch'yea
[18:27:23] Sadtimes
[18:27:30] awesome, cheers
[18:27:34] Why, what'cha doing
[18:27:37] I want to write a new site for cb and have it more integrated for reports/issues etc
[18:27:50] Ohh
[18:27:51] But my freetime is like non-existent, then I have a little for 2 weeks then back to none.
[18:28:09] Also need to fix the reports with bad data but that needs a shit load of logparsing doing.
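The public_html setup Damianz walks through above ("ssh bots-nfs; cd /export/public/; mkdir alejrb", then chown to user.www-data) can be sketched as a small helper. The path and username come from the log but are treated as parameters here; the chown step needs root and the www-data group, so this unprivileged sketch only sets permissions and notes the real-world step in a comment:

```shell
# Sketch of the bots public_html setup: create a per-user directory
# under the shared export and make it group-readable for the web server.
make_public_html() {
    base="$1"; user="$2"            # e.g. /export/public and alejrb
    dir="$base/$user"
    mkdir -p "$dir"
    # real-world step on bots-nfs (requires root):
    #   chown "$user":www-data "$dir"
    chmod 0750 "$dir"               # owner rwx, group rx, others none
    printf '%s\n' "$dir"
}

# example: make_public_html /export/public alejrb
```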
[19:02:39] PROBLEM Puppet freshness is now: CRITICAL on resourceloader2-apache resourceloader2-apache output: Puppet has not run in the last 10 hours
[19:13:16] 03/30/2012 - 19:13:16 - Creating a home directory for erik at /export/home/deployment-prep/erik
[19:14:16] 03/30/2012 - 19:14:16 - Updating keys for erik
[19:18:29] mutante: I haven't done this myself, but this page roughly describes how an instance can learn information about its vm-ness: http://aws.amazon.com/code/1825
[19:18:48] Let me know if that's not what you actually wanted :)
[19:19:49] (Background info: labs runs on openstack which is theoretically API-compatible with Amazon's EC2. Hence that being an amazon page.)
[20:53:18] hexmode: you can make your own instance :)
[20:53:48] hexmode: but but but ... you're already here!
[20:53:51] * hexmode starts to figure it out
[20:53:55] :)
[20:54:05] im willin to help ya hexmode if ya need it
[20:54:10] mostly cuz you doing this saves me having to do it.
[20:54:12] hexmode: this might help: https://labsconsole.wikimedia.org/wiki/User:Bhartshorne/Path_to_a_New_Project
[20:54:13] ;]
[20:54:22] that's what I wrote up after having to figure all this shit out last week.
[20:54:28] err.. earlier this week.
[20:54:46] RobH: I'm all about making your job easier :)
[20:55:14] hexmode: wrong answer!
[20:55:19] ;)
[20:56:37] hrm... have to add it under deployment-prep
[21:24:05] LeslieCarr: so, if I set up a instance w/o any global groups, it should just use the standard puppet config, right?
[21:24:34] yeah, just set up an instance without clicking anything and it'll be the basic config :)
[21:24:58] * hexmode clicks
[21:27:19] * hexmode reads console output
[21:27:21] fun
[21:27:25] next step!
[21:33:44] PROBLEM Current Load is now: CRITICAL on bugzilla bugzilla output: CHECK_NRPE: Error - Could not complete SSL handshake.
[21:34:24] PROBLEM Current Users is now: CRITICAL on bugzilla bugzilla output: CHECK_NRPE: Error - Could not complete SSL handshake.
[21:35:04] PROBLEM Disk Space is now: CRITICAL on bugzilla bugzilla output: CHECK_NRPE: Error - Could not complete SSL handshake.
[21:35:28] ^^^ that's normal until puppet runs once or twice.
[21:35:44] PROBLEM Free ram is now: CRITICAL on bugzilla bugzilla output: CHECK_NRPE: Error - Could not complete SSL handshake.
[21:36:54] PROBLEM Total Processes is now: CRITICAL on bugzilla bugzilla output: CHECK_NRPE: Error - Could not complete SSL handshake.
[21:37:34] PROBLEM dpkg-check is now: CRITICAL on bugzilla bugzilla output: CHECK_NRPE: Error - Could not complete SSL handshake.
[21:40:52] hexmode: i hope you applied the web security stuff to it
[21:41:01] cuz afaik you only get to choose that crap once =P
[21:41:26] it's true that you must choose security groups before you launch the instance.
[21:41:35] web security?
[21:41:53] yep
[21:42:02] or else the ports for simple http are closed
[21:42:23] hrmm, as the cloud admin whatever, i should see this in the instance list right?
[21:42:27] group stuff, ports -- looks like it falls into the default for that group
[21:42:32] yep
[21:42:40] i wonder why i dont...
[21:43:19] hexmode: if the instance is in the default group, that means that other instances in the same project can access it on all ports but no instances in any other project (or in the outside world if you give it a public IP) will be able to access it.
[21:43:38] or you, via bastion proxy
[21:43:44] i found that out the hard way =P
[21:44:01] hm. you sure about that?
[21:44:11] wouldnt work for me and smokeping
[21:44:12] you should still be able to access it via ssh.
[21:44:20] not ssh, ssh passthrough for web interface
[21:44:27] should work you say?
[21:44:28] (I think the default group says ssh from any project.)
[21:44:36] no, we're agreeing.
[21:44:40] ok, cool
[21:44:55] you should not be able to hit it on port 80 from the bastion unless it's in an additional security group that specifically allows that.
[21:45:02] but you should be able to get to it via ssh.
[21:45:03] yea
[21:45:05] cool
[21:45:44] now why cant i see an instance named bugzilla under manage instances =/
[21:45:58] are you a member of the deployment-prep project?
[21:46:09] you can only see instances in projects you belong to.
[21:46:28] bah, i can add myself as admin, but view is restricted to projects...
[21:46:31] annoying.
[21:46:39] global admin should = can see all instances.
[21:47:04] if an op needs to kill an instance due to whatever reason, we have to add ourself to the group to do so
[21:47:10] that seems kinda mehj.
[21:47:12] meh even
[21:47:15] yup.
[21:47:27] add it to the list.
[21:47:53] you mean the novel?
[21:48:01] sorry, my bad.
[21:48:05] ;]
[21:48:16] 03/30/2012 - 21:48:16 - Creating a home directory for robh at /export/home/deployment-prep/robh
[21:48:33] hexmode: So your bugzilla instance is going to have to be deleted and redone
[21:48:40] as you only have security group default
[21:48:42] :P
[21:48:44] you need 'web'
[21:48:56] yeah, that bit is lame :(
[21:48:56] also, dont name it the same thing, as the old dns will cache for another hour after deletion ;]
[21:49:04] can you tell you did what i have done?
[21:49:05] heh
[21:49:06] no problem
[21:49:16] 03/30/2012 - 21:49:16 - Updating keys for robh
[21:49:25] so you may wanna call it bugzilla-dev or delete it and dont recreate it for an hour, or monday or whatever
[21:49:29] if you want the same instance name
[21:49:53] RobH: is bz4 in prod installed from just using cpan?
[21:50:03] * hexmode doesn't see a package for it
[21:50:13] that is a good question, i have no idea =[ lemme see
[21:51:06] hexmode: it must be, there is only an ubuntu bugzilla3 package
[21:51:10] well, that sucks =[
[21:51:17] bad
[21:51:22] I'll find one
[21:51:24] i checked on kaulen, no bz package installed for it
[21:51:29] I'm sure I saw one for it
[21:51:42] hi all, question.
[21:51:43] if its not in standard repos, we would have to import it into our repo
[21:51:57] which still, is easier than doing a cpan install from an ops standpoint.
[21:51:57] i'm attempting to restart a project i have running on a labs instance
[21:52:24] and all of a sudden i can't git pull from an external repo any more
[21:52:31] keys are still right
[21:52:36] repo is accessible from elsewhere
[21:53:01] but i get:
[21:53:03] dsc@reportcard1 srv $ git clone git@less.ly:kraken-ui.git
[21:53:04] Initialized empty Git repository in /srv/kraken-ui/.git/
[21:53:04] fatal: The remote end hung up unexpectedly
[21:53:08] and you could do so at some point on that instance right?
[21:53:11] yes.
[21:53:27] i get the same thing when i attempt to pull from an extant copy of that repo
[21:53:36] but not from elsewhere (such as the office)
[21:55:00] maybe ryan updated security groups ?
[21:55:30] bah, i was gonna check said groups
[21:55:34] PROBLEM host: bugzilla is DOWN address: bugzilla CRITICAL - Host Unreachable (bugzilla)
[21:55:37] but i have to add myself to the project to see the groups applied
[21:55:37] explain?
[21:55:40] soooo annoying.
[21:55:50] this sounds not worth it.
[21:55:58] dschoon: perhaps the security group assigned to the project has since been updated to block those ports for git pulls
[21:56:02] so don't bother unless you are bored or desiring procrastination
[21:56:07] yes, that seems likely.
[21:56:12] what port does git need?
[21:56:15] it last worked before the gluster meltdown
[21:56:21] it uses ssh, iirc
[21:56:38] maplebed pointed out that the default group allows ssh
[21:56:45] so if your pull is using ssh, i would think it would work
[21:56:59] as instances are auto-natted for that stuff was my understanding
[21:57:05] RobH: the security groups are for inbound connections, not outbound (IIRC)
[21:57:21] that makes sense but i was not certain so i didnt want to say
[21:57:29] ugh.
[21:57:29] ok.
[21:57:31] so security groups shouldnt be involved in this error
[21:57:34] is what i think
[21:57:37] we can resume this some time that is not now.
[21:57:40] like next week.
[21:57:46] there are some outbound rules, but I'm pretty sure they're all in the network, not the interface.
[21:57:52] (like, routers and such)
[22:02:26] what is generic::tcptweaks? And is it needed?
[22:02:57] it reduces the syn timeout by a lot and increases the initial window size to 10
[22:03:43] PROBLEM Current Load is now: CRITICAL on bz-dev bz-dev output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:04:23] PROBLEM Current Users is now: CRITICAL on bz-dev bz-dev output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:05:03] PROBLEM Disk Space is now: CRITICAL on bz-dev bz-dev output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:05:24] "puppetd -tv" is failing now: Error 400 on SERVER: Duplicate definition: Package[apache2] is already defined
[22:05:43] PROBLEM Free ram is now: CRITICAL on bz-dev bz-dev output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:06:53] PROBLEM Total Processes is now: CRITICAL on bz-dev bz-dev output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:07:21] LeslieCarr: got a sec to help me with this error?
[22:07:33] PROBLEM dpkg-check is now: CRITICAL on bz-dev bz-dev output: CHECK_NRPE: Error - Could not complete SSL handshake.
[22:10:36] * hexmode adds nagios
[22:13:44] PROBLEM HTTP is now: CRITICAL on bz-dev bz-dev output: Connection refused
[22:13:47] hexmode: wait 30s and do it again.
[22:14:17] maplebed: looks like it was checkbox confusion
[22:14:33] there should be more sanity checking in the checkboxes
[22:14:50] there probably won't be any sanity checking on the check boxes.
[22:15:03] it's up to you to choose a working puppet config
[22:15:10] (or to create one...)
[22:15:23] given that labs isn't a puppet parser, it'd be impossible to actually tell what a checkbox will do.
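The "Duplicate definition: Package[apache2] is already defined" failure quoted above is the usual symptom of two selected puppet classes both declaring the same package (the "overlapping puppet classes that rely on you to just not apply both" mentioned later). One conventional workaround — a sketch only; whether it fits the exact classes involved here is not known from the log — is to guard the declaration:

```puppet
# Puppet allows only one declaration per resource. If two applied
# classes might both want apache2, guard the declaration so only
# the first one wins. (Class layout here is hypothetical.)
if ! defined(Package['apache2']) {
    package { 'apache2':
        ensure => installed,
    }
}
```

The cleaner fix is to move the shared package into a single class that both consumers include, since `include` is idempotent while resource declarations are not.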
[22:15:50] yeah, I see...
[22:16:06] just finished a more-successful run
[22:16:19] still need moar memory
[22:25:44] RECOVERY Free ram is now: OK on bz-dev bz-dev output: OK: 72% free memory
[22:25:44] RECOVERY Disk Space is now: OK on bz-dev bz-dev output: DISK OK
[22:26:55] RECOVERY Total Processes is now: OK on bz-dev bz-dev output: PROCS OK: 94 processes
[22:27:34] RECOVERY dpkg-check is now: OK on bz-dev bz-dev output: All packages OK
[22:28:44] RECOVERY Current Load is now: OK on bz-dev bz-dev output: OK - load average: 0.12, 0.35, 0.39
[22:28:44] RECOVERY HTTP is now: OK on bz-dev bz-dev output: HTTP OK: HTTP/1.1 200 OK - 452 bytes in 0.004 second response time
[22:29:24] RECOVERY Current Users is now: OK on bz-dev bz-dev output: USERS OK - 2 users currently logged in
[22:35:33] yea second the 'there wont be sanity checking on puppet checkboxes'
[22:35:48] it relies on the user to download the git puppet stuff and read them before applying
[22:36:20] cuz there will always be overlapping puppet classes that rely on you to just not apply both =/
[22:37:11] hexmode: even if you dont get the puppetization of bz working, but instead use labs to make a customization of the templates for the needed change
[22:37:28] it will allow me to read your labs templates for bz and copy them over to production =]
[22:37:47] since i can see said changes, confirm they dont break things, and do the root level copy for you
[22:38:26] ideally we puppetize it all but if you hit wals with puppet and skip right to template hacking and documenting, its ok.
[22:38:34] s/wals/walls
[23:32:12] PROBLEM Puppet freshness is now: CRITICAL on aggregator1 aggregator1 output: Puppet has not run in the last 10 hours
[23:32:12] PROBLEM Puppet freshness is now: CRITICAL on analytics analytics output: Puppet has not run in the last 10 hours
[23:32:12] PROBLEM Puppet freshness is now: CRITICAL on asher1 asher1 output: Puppet has not run in the last 10 hours
[23:32:12] PROBLEM Puppet freshness is now: CRITICAL on backport backport output: Puppet has not run in the last 10 hours
[23:32:12] PROBLEM Puppet freshness is now: CRITICAL on bastion-restricted1 bastion-restricted1 output: Puppet has not run in the last 10 hours
[23:44:34] * jeremyb pokes maplebedd
[23:44:38] s/d$//
[23:47:47] anyone have any idea if it would be feasible to set up a labs instance with a copy of the en.wiki database (or simple.wiki if en is too big)?
[23:48:11] one already exists
[23:48:24] (for simple... I think it might be one of the complete ones)
[23:48:33] en is surely waaaay too big
[23:48:46] project is deployment-prep
[23:48:53] back from south america?
[23:51:20] yep
[23:52:32] cool, I'll send a request to the people in the basement
[23:52:49] whom?
[23:54:04] kaldari: johnduhart might be able to answer some about it tonight (if he's on). otherwise petan and I will be up in the europemorgen
[23:54:44] maplebed: anyway, I'm collapsing soon (2am here) but I wanted to apply for the guinea pig job ;-) anything special I need to know? what project should the cluster live in? what's your TZ? (I'm Europe/Berlin for a few more days. which of course comes with its own meatspace distractions so I might not be able to do it before someone else gets to it)
[23:55:09] jeremyb: sorry, the substitution didn't trigger my nick highlighting alert thingy.
[23:55:19] haha
[23:55:34] fantastic! I'm glad you're interested.
[23:55:51] I suggest creating a new project for it, but if you'd like I can add you to the existing swift project instead.
[23:55:58] well, I'll do that anyways so you can look at what I've done.
[23:56:12] your call
[23:56:32] spin up the cluster in a new project, and I'll also add you to the existing one so you can look.
[23:56:49] I'm in the SF office, so PST.
[23:56:50] can you make the new project then?
[23:56:55] PDT* !!!
[23:56:56] I think so.
[23:56:58] PDT.
[23:56:59] thank you.
[23:58:21] what's your labs name?
[23:58:24] ok, so, maybe catch you at ~midnight your time or else when you wake up. or maybe the wifi is still flaky tomorrow and I don't get on
[23:58:29] labs wiki name?
[23:58:29] =ircnick
[23:58:32] 03/30/2012 - 23:58:32 - Creating a project directory for swift2
[23:58:33] 03/30/2012 - 23:58:32 - Creating a home directory for jeremyb at /export/home/swift2/jeremyb
[23:58:33] 03/30/2012 - 23:58:32 - Creating a home directory for ben at /export/home/swift2/ben
[23:58:34] k.
[23:59:31] 03/30/2012 - 23:59:30 - Creating a home directory for jeremyb at /export/home/swift/jeremyb
[23:59:32] 03/30/2012 - 23:59:32 - Updating keys for ben
[23:59:32] 03/30/2012 - 23:59:32 - Updating keys for jeremyb