[00:01:20] PROBLEM Free ram is now: WARNING on bots-3 i-000000e5 output: Warning: 7% free memory
[01:07:36] 04/13/2012 - 01:07:36 - Creating a project directory for syslog-collection
[01:07:36] 04/13/2012 - 01:07:36 - Creating a home directory for laner at /export/home/syslog-collection/laner
[01:08:34] 04/13/2012 - 01:08:34 - Updating keys for laner
[01:41:00] PROBLEM Free ram is now: WARNING on incubator-bots2 i-00000119 output: Warning: 19% free memory
[01:57:13] 04/13/2012 - 01:57:13 - Creating a home directory for apmon at /export/home/bastion/apmon
[01:57:29] 04/13/2012 - 01:57:29 - Creating a home directory for apmon at /export/home/maps/apmon
[01:58:13] 04/13/2012 - 01:58:13 - Updating keys for apmon
[01:58:29] 04/13/2012 - 01:58:29 - Updating keys for apmon
[02:39:50] PROBLEM Disk Space is now: CRITICAL on nagios 127.0.0.1 output: DISK CRITICAL - free space: /home/laner 1813 MB (10% inode=83%):
[02:44:50] RECOVERY Disk Space is now: OK on nagios 127.0.0.1 output: DISK OK
[02:53:24] New patchset: Ryan Lane; "Change to move projects into another OU rename groups" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4836
[02:53:37] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/4836
[02:54:02] New review: Ryan Lane; "This requires LDAP changes too, and a lot of things need to be synched for this to work. Please don'..." [operations/puppet] (test); V: 0 C: -1; - https://gerrit.wikimedia.org/r/4836
[03:22:40] RECOVERY Puppet freshness is now: OK on nova-essex-test i-000001f9 output: puppet ran at Fri Apr 13 03:22:19 UTC 2012
[03:48:20] PROBLEM Free ram is now: WARNING on test-oneiric i-00000187 output: Warning: 14% free memory
[03:49:20] PROBLEM Free ram is now: WARNING on nova-daas-1 i-000000e7 output: Warning: 15% free memory
[03:59:20] PROBLEM Free ram is now: WARNING on utils-abogott i-00000131 output: Warning: 16% free memory
[04:03:20] PROBLEM Free ram is now: CRITICAL on test-oneiric i-00000187 output: Critical: 5% free memory
[04:03:20] PROBLEM Free ram is now: WARNING on orgcharts-dev i-0000018f output: Warning: 14% free memory
[04:13:20] RECOVERY Free ram is now: OK on test-oneiric i-00000187 output: OK: 97% free memory
[04:14:20] PROBLEM Free ram is now: CRITICAL on nova-daas-1 i-000000e7 output: Critical: 4% free memory
[04:15:10] PROBLEM Free ram is now: CRITICAL on test3 i-00000093 output: Critical: 3% free memory
[04:18:20] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f output: Critical: 5% free memory
[04:19:20] RECOVERY Free ram is now: OK on nova-daas-1 i-000000e7 output: OK: 93% free memory
[04:19:20] PROBLEM Free ram is now: CRITICAL on utils-abogott i-00000131 output: Critical: 4% free memory
[04:20:10] RECOVERY Free ram is now: OK on test3 i-00000093 output: OK: 96% free memory
[04:24:20] RECOVERY Free ram is now: OK on utils-abogott i-00000131 output: OK: 96% free memory
[04:28:20] RECOVERY Free ram is now: OK on orgcharts-dev i-0000018f output: OK: 96% free memory
[04:45:10] PROBLEM Puppet freshness is now: CRITICAL on nova-production1 i-0000007b output: Puppet has not run in last 20 hours
[04:49:10] PROBLEM Puppet freshness is now: CRITICAL on nova-gsoc1 i-000001de output: Puppet has not run in last 20 hours
[05:36:00] RECOVERY Free ram is now: OK on incubator-bots2 i-00000119 output: OK: 25% free memory
[06:04:52] PROBLEM Disk Space is now: WARNING on labs-nfs1 i-0000005d output: DISK WARNING - free space: /export 934 MB (5% inode=83%): /home/SAVE 934 MB (5% inode=83%):
[06:47:45] PROBLEM Disk Space is now: CRITICAL on nagios 127.0.0.1 output: DISK CRITICAL - free space: /home/dzahn 632 MB (3% inode=83%): /home/petrb 632 MB (3% inode=83%): /home/laner 632 MB (3% inode=83%):
[06:52:45] RECOVERY Disk Space is now: OK on nagios 127.0.0.1 output: DISK OK
[07:02:55] RECOVERY Disk Space is now: OK on aggregator1 i-0000010c output: DISK OK
[07:16:15] PROBLEM Disk Space is now: CRITICAL on labs-nfs1 i-0000005d output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:19:25] PROBLEM Current Load is now: WARNING on bots-3 i-000000e5 output: WARNING - load average: 7.02, 7.99, 5.54
[07:24:25] RECOVERY Current Load is now: OK on bots-3 i-000000e5 output: OK - load average: 3.05, 4.84, 4.83
[08:49:55] RECOVERY Disk Space is now: OK on labs-nfs1 i-0000005d output: DISK OK
[10:59:25] PROBLEM Disk Space is now: WARNING on ganglia-test i-00000202 output: DISK WARNING - free space: / 300 MB (3% inode=93%):
[11:04:25] PROBLEM Disk Space is now: CRITICAL on ganglia-test i-00000202 output: DISK CRITICAL - free space: / 70 MB (0% inode=93%):
[11:09:25] RECOVERY Disk Space is now: OK on ganglia-test i-00000202 output: DISK OK
[11:25:55] PROBLEM Disk Space is now: WARNING on aggregator1 i-0000010c output: DISK WARNING - free space: / 342 MB (3% inode=93%):
[11:30:55] PROBLEM Disk Space is now: CRITICAL on aggregator1 i-0000010c output: DISK CRITICAL - free space: / 91 MB (0% inode=93%):
[13:37:21] PROBLEM Disk Space is now: WARNING on ganglia-test i-00000202 output: DISK WARNING - free space: / 335 MB (3% inode=93%):
[13:42:21] PROBLEM Disk Space is now: CRITICAL on ganglia-test i-00000202 output: DISK CRITICAL - free space: / 105 MB (1% inode=93%):
[14:44:21] PROBLEM Free ram is now: WARNING on incubator-bots2 i-00000119 output: Warning: 19% free memory
[14:45:21] PROBLEM Puppet freshness is now: CRITICAL on nova-production1 i-0000007b output: Puppet has not run in last 20 hours
[14:49:21] PROBLEM Puppet freshness is now: CRITICAL on nova-gsoc1 i-000001de output: Puppet has not run in last 20 hours
[16:18:12] ping petan
[17:27:22] andrewbogott: when do you get into town?
[17:27:58] Around noon on sunday. Staying at the Triton, predictably.
[17:28:09] Do you want to meet up on Sunday sometime?
[17:29:07] (I have no particular agenda, really.)
[17:29:19] heh
[17:29:21] hm. maybe
[17:29:32] are any of the sessions on monday?
[17:29:49] Is there a schedule someplace? I've been wondering.
[17:30:01] should be one somewhere
[17:30:14] * andrewbogott regoogles
[17:30:56] ok, here it is: http://openstack.org/conference/san-francisco-2012/sessions/
[17:31:41] sooooo many
[17:32:09] shared_fs is on Monday at 11. nova/puppet/config on Wednesday...
[17:32:25] plugins first thing on tuesday.
[17:32:47] sharedfs is at 11, so we can likely meet up early and discuss it beforehand
[17:33:32] There's even a break scheduled immediately beforehand :)
[17:33:41] I want to go to the volume session @ 9:30.
[17:34:31] But, yeah, we can catch up on Monday morning.
[17:35:53] We can avoid a bit of AV comedy if you get me your slides on Sunday so I can merge them in.
[17:36:07] But it's no big deal either way.
[17:38:15] heh
[17:38:17] yeah
[17:38:26] I can get you the updated slides back before then
[18:12:57] Ryan_Lane: re
[18:13:13] I wanted to ask you something about the GSOC proposal
[18:13:17] involving labs
[18:14:03] ok
[18:14:16] please always just ask the question rather than telling me you'd like to ask the question
[18:14:26] I wish I could
[18:14:51] I'm not always here, and I can answer the question when I get back
[18:14:55] or email you a response
[18:15:25] so, what's up?
[18:15:36] I don't get that proposal
[18:16:02] does it mean we'll be able to create instances via API if it works?
[18:16:30] openstack has two APIs
[18:16:36] EC2 and a native one
[18:16:40] we're using EC2 right now
[18:16:52] the native API has more features
[18:17:01] ok
[18:17:09] and all of andrewbogott's work lately is only available in the native api
[18:17:18] but we don't support it in OpenStackManager right now
[18:17:27] And the nova commandline!
[18:17:35] yep
[18:18:34] I just explained my proposal with a for dummies version - now I feel like a dummy
[18:18:59] OrenDsk: so, the GSOC guy will add support to OpenStackManager for this
[18:19:02] I am kind of asking what functionalities will users see
[18:19:14] like me
[18:19:15] well, none at first
[18:19:22] it should be transparent
[18:19:32] but, at some point we'll be able to expose the API
[18:19:40] so people will be able to use the CLI to create instances
[18:19:47] and script the creation of instances and such
[18:19:52] I need that
[18:19:54] we'll add the new features over time
[18:20:04] such as instance resizing
[18:20:06] instance snapshots
[18:20:17] an admin api, for people like me
[18:20:41] so that labs admins can do most actions without doing it locally on the controller
[18:20:43] how about if I built a perfect instance for running tests
[18:21:05] then you should puppetize it and we should move it to production
[18:21:14] well, you, us, whatever
[18:21:18] someone should puppetize it
[18:21:24] but after the test is run it should be scrapped
[18:21:57] it just needs to be in a certain state for the tests to be repeatable - but takes days to set up
[18:22:40] I had saved some things like that in VMware
[18:22:57] yeah, that should be puppetized
[18:23:05] so that when the instance starts, it configures itself
[18:24:18] say it includes importing a wiki dump - wouldn't that take a long time even if it was puppetized?
[18:24:45] the database should be on hardware anyway
[18:24:51] not in the instance
[18:25:05] we're going to have user databases at some point
[18:25:35] I don't follow you
[18:25:52] and replicated copies of the databases
[18:26:08] so, for a test...
[18:26:12] jenkins would start instances
[18:26:28] and would make a call of to the database for the database the test would use
[18:26:31] *off
[18:26:45] then when the test is done, it would delete the instances and the database
[18:28:26] how much of that is possible today
[18:29:54] and how long would it take to replicate a database
[18:32:11] also why do you say the dbase should be in hardware?
[18:33:05] brb
[18:41:43] because vms suck
[18:42:13] Ryan_Lane: What was that cool looking mysql replication filter thing you mentioned ages ago?
[18:42:35] Also the new mysql from oracle FINALLY has semi decent replication options, shame it's still oracle though
[18:42:46] Well I saw new it's still in un-stable but anyway...
[18:55:44] DamianZ: tungsten
[19:06:39] <^demon> If the --depth option really crashes the git server, then sending this command could be possibly used for a DoS attack -> That's why I set severity to major."
[19:06:43] <^demon> Should your server's memory size be increased ?
[19:06:45] <^demon> Heh
[19:25:07] 04/13/2012 - 19:25:07 - Updating keys for faidon
[19:25:13] 04/13/2012 - 19:25:13 - Updating keys for faidon
[19:57:37] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/4836
[19:57:39] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4836
[20:10:19] i can't seem to login to gerrit via ssh/git due to publickey. is it possible that there is a delay updating the keys?
[20:13:26] joancreus: Did you update your key in the Gerrit web interface too?
[20:13:38] Just giving it to us is not enough, you also have to manually put it in Gerrit
[20:13:41] RoanKattouw: i just found out that
[20:13:45] (Yes, this sucks, there's a Gerrit bug open for that)
[20:13:49] now working
[20:13:51] thanks!
[20:15:46] RoanKattouw: can i change gerrit's password?
[20:17:57] * joancreus ignore what i said
[20:19:19] ignore what i said above. is there a way to change gerrit's git/ssh password? it might be due to the fact i git cloned without being identified
[20:19:22] is it possible?
[20:19:25] * DamianZ eats joancreus
[20:23:44] PROBLEM Current Load is now: CRITICAL on test-groupchange i-00000203 output: Connection refused by host
[20:24:24] PROBLEM Current Users is now: CRITICAL on test-groupchange i-00000203 output: Connection refused by host
[20:25:04] PROBLEM Disk Space is now: CRITICAL on test-groupchange i-00000203 output: Connection refused by host
[20:25:44] PROBLEM Free ram is now: CRITICAL on test-groupchange i-00000203 output: Connection refused by host
[20:26:54] PROBLEM Total Processes is now: CRITICAL on test-groupchange i-00000203 output: Connection refused by host
[20:27:34] PROBLEM dpkg-check is now: CRITICAL on test-groupchange i-00000203 output: Connection refused by host
[20:30:34] joancreus: I don't understand what you're asking here
[20:30:40] You want to change your Gerrit password?
[20:30:50] There is also no such thing as a git/ssh password
[20:35:02] Well the git+ssh password is the ssh password which is the general everything password as the logins change but the passwords remain the same.
[20:35:31] We don't authenticate SSH with passwords, we authenticate with a key
[20:35:44] Your key might have a passphrase, true
[20:37:32] i was trying to push to https :S
[20:37:35] fixed
[20:38:16] New patchset: Ryan Lane; "Adding missing parameter" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4917
[20:38:45] gerrit-wm: -_-
[20:38:50] where's my damn lint check?
[20:39:07] did someone fuck up lint checks again?
[20:39:14] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/4917
[20:39:27] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/4917
[20:39:29] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4917
[20:42:26] New patchset: Ryan Lane; "Adding default parameter for privileges" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4918
[20:42:55] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/4918
[20:42:57] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4918
[20:45:53] New patchset: Ryan Lane; "Incorrect project was added" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4919
[20:46:28] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/4919
[20:46:30] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4919
[20:48:42] New patchset: Ryan Lane; "Fix global project check" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4920
[20:48:57] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/4920
[20:49:00] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4920
[21:24:24] RECOVERY Current Users is now: OK on test-groupchange i-00000203 output: USERS OK - 0 users currently logged in
[21:25:04] RECOVERY Disk Space is now: OK on test-groupchange i-00000203 output: DISK OK
[21:25:44] RECOVERY Free ram is now: OK on test-groupchange i-00000203 output: OK: 71% free memory
[21:26:54] RECOVERY Total Processes is now: OK on test-groupchange i-00000203 output: PROCS OK: 78 processes
[21:27:34] RECOVERY dpkg-check is now: OK on test-groupchange i-00000203 output: All packages OK
[21:28:44] RECOVERY Current Load is now: OK on test-groupchange i-00000203 output: OK - load average: 0.40, 0.16, 0.10
[22:03:44] PROBLEM Current Load is now: CRITICAL on testing-groupchange i-00000204 output: Connection refused by host
[22:04:24] PROBLEM Current Users is now: CRITICAL on testing-groupchange i-00000204 output: Connection refused by host
[22:05:04] PROBLEM Disk Space is now: CRITICAL on testing-groupchange i-00000204 output: Connection refused by host
[22:05:44] PROBLEM Free ram is now: CRITICAL on testing-groupchange i-00000204 output: Connection refused by host
[22:06:54] PROBLEM Total Processes is now: CRITICAL on testing-groupchange i-00000204 output: Connection refused by host
[22:07:34] PROBLEM dpkg-check is now: CRITICAL on testing-groupchange i-00000204 output: Connection refused by host
[22:16:25] Ryan_Lane: http://prezi.com/2nrdx_b1k8cc/shared-filesystems-in-openstack-nova/ <- motion-sick conference attendees!
[22:28:52] andrewbogott: :D
[22:29:15] I can't decide if the motion adds or detracts.
[22:31:24] does it require network access to work?
[22:31:46] last conference the network access was *terrible*
[22:35:57] Ryan_Lane: Kind of. There's a local client but i think it's windows-only.
[22:36:08] Well... it's flash so it /should/ be portable but I haven't messed with it.
[22:36:11] heh
[22:36:17] I'd stick with using open office :)
[22:36:23] the network could totally suck
[22:36:41] it was nearly unusable at the last one
[22:36:46] Yeah, I'm not counting on having access.
[22:38:43] paravoid: http://www.mediawiki.org/wiki/Wikimedia_Labs
[22:39:24] RECOVERY Current Users is now: OK on testing-groupchange i-00000204 output: USERS OK - 0 users currently logged in
[22:40:04] RECOVERY Disk Space is now: OK on testing-groupchange i-00000204 output: DISK OK
[22:40:44] RECOVERY Free ram is now: OK on testing-groupchange i-00000204 output: OK: 71% free memory
[22:41:13] so he has my vote!
[22:41:19] My mouth hurts :(
[22:41:54] RECOVERY Total Processes is now: OK on testing-groupchange i-00000204 output: PROCS OK: 78 processes
[22:42:34] RECOVERY dpkg-check is now: OK on testing-groupchange i-00000204 output: All packages OK
[22:43:44] RECOVERY Current Load is now: OK on testing-groupchange i-00000204 output: OK - load average: 0.00, 0.04, 0.07
[22:45:44] New patchset: Ryan Lane; "Up version of ldap scripts" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4933
[22:48:03] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/4933
[22:48:06] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/4933
[23:05:27] Oh, duh, when I downloaded I got a mac /and/ a windows client, I just clicked on the wrong icon.
[23:05:35] A hazard of having Wine installed
[23:10:04] PROBLEM host: testing-groupchange is DOWN address: i-00000204 check_ping: Invalid hostname/address - i-00000204
[23:13:24] RECOVERY host: testing-groupchange is UP address: i-00000205 PING OK - Packet loss = 0%, RTA = 1.24 ms
[23:18:44] PROBLEM Free ram is now: CRITICAL on testing-groupchange i-00000205 output: CHECK_NRPE: Error - Could not complete SSL handshake.
[23:19:04] PROBLEM Disk Space is now: CRITICAL on testing-groupchange i-00000205 output: CHECK_NRPE: Error - Could not complete SSL handshake.
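Editor's note on the test flow discussed between 18:26:12 and 18:26:45: the idea (Jenkins starts instances, requests a database for the test, then deletes the instances and the database when the test is done) can be sketched roughly as below. This is only an illustration. `FakeLab`, `provisioned_environment`, and `run_job` are hypothetical names and an in-memory stand-in, not the OpenStack, OpenStackManager, or Jenkins API; the log itself says this was not yet possible at the time.

```python
from contextlib import contextmanager


class FakeLab:
    """Hypothetical in-memory stand-in for the instance and database services."""

    def __init__(self):
        self.instances = set()
        self.databases = set()

    def start_instance(self, name):
        self.instances.add(name)
        return name

    def create_database(self, name):
        self.databases.add(name)
        return name

    def delete_instance(self, name):
        self.instances.discard(name)

    def delete_database(self, name):
        self.databases.discard(name)


@contextmanager
def provisioned_environment(lab, job, instance_count=2):
    """Start instances and a database for one job; always clean both up."""
    instances = []
    database = None
    try:
        for i in range(instance_count):
            instances.append(lab.start_instance(f"{job}-{i}"))
        database = lab.create_database(f"{job}-db")
        yield instances, database
    finally:
        # Cleanup runs whether the test passed, failed, or raised.
        for name in instances:
            lab.delete_instance(name)
        if database is not None:
            lab.delete_database(database)


def run_job(lab, job, test):
    """Run a test callable against a freshly provisioned environment."""
    with provisioned_environment(lab, job) as (instances, database):
        return test(instances, database)
```

The context manager is one way to guarantee the "delete the instances and the database" step even when the test itself fails, so repeated runs start from a clean state.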