[00:00:03] I'll start over again... maybe it's busted because I built it while everything was failing. [00:00:28] could be [00:00:53] I need to make the build process less error prone [00:01:40] when we switch away from a central puppet master it should be slightly more reliable [00:01:49] the puppet master was likely dead when this tried to build [00:09:51] andrewbogott: having any more luck? [00:09:58] yep, can log in now. [00:10:05] ah. good [00:10:12] is ldap working? [00:10:15] do: id laner [00:10:18] It's good 'cause I was running out of synonyms for 'large' [00:10:25] if you get my account back with groups, it's working [00:10:39] it will only do that if you run it as root [00:11:22] I think it's working... I got back a great big record, can't testify to its accuracy. [00:11:42] accuracy isn't important. [00:11:50] it would tell you my account didn't exist otherwise [00:15:59] tell me again how to make apt not complain about ganglia-monitor? [00:22:56] ummm [00:22:59] sec [00:23:16] adduser --system --ingroup ganglia --home /var/lib/ganglia ganglia [00:23:46] Bah [00:23:46] bah. the scheduler I want to use doesn't exist in diablo [00:23:57] thanks [00:24:21] * Damianz goes to figure out how to port a normal schema into an ldif. [00:24:29] normal schema? [00:24:39] you mean openldap -> sun? [00:25:00] I have a script for this somewhere [00:25:49] Mhm [00:25:50] hm. seems simple scheduler will at least pick a host that has the least number of instances running [00:25:56] lemme try to find that [00:26:19] https://blogs.oracle.com/Ludo/entry/updated_schema_convert_py_script [00:26:24] Don't feel like going though a 600 schema file by hand to put it into an ldif to load lol [00:26:41] that'll convert it for you [00:26:44] :D [00:26:49] * Damianz gives Ryan_Lane a cookie [00:26:49] some of the schema may come by default [00:27:14] heh. I'm useful occasionally, eh? [00:27:54] A little, sometimes :P [00:28:18] Now just fix the freenode servers xD [00:28:23] heh [00:28:30] I'm going to try to change our scheduler [00:28:33] let's hope it works ;) [00:28:42] I think the max_cores option is going to be way too low [00:32:07] * Damianz takes the cookie off Ryan_Lane [00:32:10] 404 blog post ftl [00:32:11] :P [00:33:26] http://ludopoitou.wordpress.com/2009/07/31/updated-schema-convert-py-script-for-opends/ works though :D [00:33:34] heh [00:33:46] scheduler seems to be in place, now to try to create an instance :) [00:34:59] well, it was scheduled on the preferred node [00:35:42] and it's running. well, time to make that the new scheduler :) [00:39:21] that should keep things alive longer :) [00:39:36] so, whenever we get the new node in, all new instances will be created on that for a while [00:40:03] unfortunately, the hardware is different sizes, so the new node won't be properly assigned more instances [00:40:18] we'll need to switch hardware, or switch to a better scheduler [00:58:09] Hmm [02:20:51] Ryan_Lane: You there? I'd like to take this time to see if I can get a CVNBot working on Labs, but not sure where to start. I need to get a bunch of files uploaded somehow (SFTP possible?) and ssh into the bots project to run them. Where to start ? 
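(For reference on the scheduler switch above, 00:28 to 00:35: Diablo-era Nova took flag-style options in nova.conf, and the simple scheduler places new instances on the least-loaded host up to a core cap. A minimal sketch, assuming the Diablo flag names as I recall them and a placeholder core limit; check the release docs before copying:)

    # sketch only: flag names assumed from Diablo-era docs, the limit value is a placeholder
    echo '--scheduler_driver=nova.scheduler.simple.SimpleScheduler' | sudo tee -a /etc/nova/nova.conf
    echo '--max_cores=64' | sudo tee -a /etc/nova/nova.conf
    # service name varies by packaging
    sudo service nova-scheduler restart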
[02:21:20] would be good to talk to petan, since he's been running the bots project [02:21:31] sftp works [02:21:31] he's approved this and added me to the project [02:21:44] but I mean in general the login stuff and accessing the instance [02:21:51] https://labsconsole.wikimedia.org/wiki/Help:Access#Using_ProxyCommand_ssh_option [02:22:03] https://labsconsole.wikimedia.org/wiki/Help:Access#Accessing_public_and_private_instances [02:22:20] proxycommand should let you scp directly to the instance [02:22:33] So I use both of those sections or either? [02:23:04] there was something about forwarding that wasn't secure or something for people that use their key elsewhere? [02:23:05] well, agent forwarding won't let you connect directly [02:23:09] but…. [02:23:11] I have to go, actually [02:23:18] I need to catch a bus in like 5 minutes [02:23:22] ok, go! [02:23:49] * Ryan_Lane waves [02:23:59] petan: you there? :) [02:32:55] Okay, I've pasted the proxy config and got into ssh bots.wmflabs.org [02:32:56] nice [02:50:13] Hm.. mono/xbuild aren't available on the bots instance [02:50:29] Anyone around able to get that package on there? [02:50:37] the CVNBot uses it [02:50:41] (don't ask why) [03:04:31] * jeremyb waves [03:05:22] Krinkle: i can help with the proxycommand stuff if you like [03:05:47] oh, you're in [03:05:59] i wonder why it's named bots. what a horrible name [03:07:04] oh, huh, it doesn't exist. i guess it's an extra name for a node [03:08:58] aha, https://labsconsole.wikimedia.org/wiki/Special:NovaAddress [03:11:03] * jeremyb grumbles [03:13:35] Krinkle: bots.wmflabs.org is bots-apache1.pmtpa.wmflabs (an rfc 1918 non-publicly-routable IP); you should ssh to the *.*.wmflabs. (private IP) via bastion not directly to the bots.wmflabs.org address [03:14:18] hm.. difference? [03:14:47] seemed to work though [03:14:52] haven't done anything yet [03:14:54] it did so in fact [03:15:14] port 22 probably shouldn't be forwarded. so if the box is ever fixed so that it's no longer forwarded you won't be able to get in [03:16:41] wait, so the SFTP is not going to the same as port 80 and/or ssh ? [03:16:55] hold on a sec [03:17:04] I uploaded a html file via sftp, and saw it listed in ssh, and accessed via bots.wmflabs.org [03:17:13] on port 80 [03:17:14] also just good practice so that it's the same way to access that box as other labs boxes. and so that when you ssh to it you're not just sshing to "bots.wmflabs.org" (which is not obviously bots-apache1 at a glance and once you learn that it is you might later forget) [03:18:02] now, i assumed CVNbot was some kind of persistent process that 1) made edits or 2) made reports to IRC [03:18:11] does it have a web component? [03:18:11] 2) [03:18:14] no [03:18:26] so, then i think it doesn't belong on the apache box? [03:18:34] Well, that depends [03:18:43] So for now, no. [03:19:20] But if CVN is really migrating to Labs (which we plan to), then it does need to be on an apache host since there will be an API (web component) interfacing with the bots database, as well as a control panel for staff members of CVN to start/stop the bot [03:19:34] and for php/apache to be able to kill the process, it needs to be on the same machine, right? [03:19:55] jeremyb: so bots has multiple instances?
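(A minimal sketch of the ProxyCommand setup from the Help:Access page linked at 02:21, assuming an OpenSSH recent enough for -W; 'yourshellname' is a placeholder:)

    # append the proxy stanza to ~/.ssh/config
    printf '%s\n' \
      'Host *.pmtpa.wmflabs' \
      '    User yourshellname' \
      '    ProxyCommand ssh -a -W %h:%p yourshellname@bastion.wmflabs.org' \
      >> ~/.ssh/config
    # with that in place, sftp/scp reach the private name through bastion:
    sftp bots-apache1.pmtpa.wmflabs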
[03:20:05] ah I see [03:20:09] https://labsconsole.wikimedia.org/wiki/Special:NovaInstance [03:20:13] https://labsconsole.wikimedia.org/wiki/Nova_Resource:Bots#Instances_for_this_project [03:20:24] welcome to semantic MW ;) [03:20:51] so now it makes sense why not to ssh to bots.wmflabs.org but to bots-apache [03:21:00] bots-apache1* [03:21:04] yes [03:21:18] but that doesn't have a public hostname, so how to use SFTP on that? [03:21:20] Krinkle: you could have one box do passwordless ssh to the other box... [03:21:33] I heard about that [03:21:37] forwarding ssh keys [03:21:39] right? [03:21:42] not forwarding [03:21:56] proxying [03:22:00] i'm talking about for one to kill something on a different box (unattended) [03:22:11] for you to log in now is unrelated [03:22:24] oh, for CVN [03:22:47] Krinkle: [this is now for you to log in:] ryan linked to https://labsconsole.wikimedia.org/wiki/Help:Access#Using_ProxyCommand_ssh_option before. did you read it? (i wrote it so i can certainly help you set it up if you hit a bump) [03:22:52] Krinkle: what OS? [03:23:02] basically the control panel needs output of 'ps -f -u ' and ability to kill things by PPID [03:23:23] jeremyb: I did #Using_ProxyCommand_ssh_option [03:23:29] I have that in my config file [03:23:41] I assume that's why I was able to login to bots at all, or was that possible without it? [03:23:46] before I had to go through bastion first [03:23:52] and then use LDAP password to get to a project [03:23:56] but that's no longer possible [03:24:12] BRB [03:28:29] Krinkle: so you should now be able to `ssh bots-apache1.pmtpa.wmflabs` ? [03:29:18] Krinkle: idk what this business about using an LDAP password to ssh into a machine is. can't remember a time when that was required, sounds fishy [03:29:41] jeremyb: directly from my own shell? [03:29:47] Krinkle: yes [03:30:18] * jeremyb hopes Krinkle is not in europe atm ;P [03:30:50] worked [03:30:52] I am [03:33:29] hah [03:35:19] Krinkle: so, you have to decide where to put this stuff. or you can make a new box to play around with or puppetize with [03:35:54] Krinkle: you should now be able to rsync, sftp, scp, and plenty of other things directly from your local shell [03:36:13] I use SFTP from my FTP client though, not from shell [03:36:23] Krinkle: might work... [03:36:27] basically having 2 connections open [03:36:30] works :) [03:36:39] used it for prototype.wikimedia and toolserver.org as well [03:36:46] filezilla or ducky or what? [03:36:49] I use Panic Coda [03:37:01] no, i was saying might work with proxycommand [03:37:12] Excellent code editor with nice terminal, (local) file browser, and (s)ftp client [03:37:53] I assume Coda uses shell in the background since it always works flawlessly if I only enter a hostname [03:38:02] the local shell that is, with my config and all loaded [03:38:33] jeremyb: yeah [03:39:35] so, there was talk about this labs conference but idk if it ever happened [03:40:16] * jeremyb still doesn't really understand what the workflow's supposed to look like for stuff that eventually goes to prod [03:41:30] i.e. dev somewhere (maybe labs) and then test somewhere (in testlabs or your own project or where?) and then send to prod [03:43:22] also, there's otto's recent thread about merging changes that have been approved to a different branch. do they get reapproved? e.g.
send to operations/puppet:test first and then operations/puppet:production [03:44:24] petan: jeremyb: redesigned hacky frontpage index.html http://bots.wmflabs.org/ [03:44:37] at least it looks somewhat decent now [03:44:41] where's the old? [03:45:34]
Bots project hm [21:53:13] do I need Nova credentials? [21:53:20] is it saying you don't have any? [21:53:24] if so, log out and back in [21:53:25] yeah [21:53:29] I still can't trigger that bug. [21:54:08] it seems to hit people randomly, which I don't understand [21:54:42] wait, is php storing sessions on the filesystem? [21:54:50] that may actually be the issue [21:54:56] buhhhwuhhwhoa [21:55:24] bah. damn it [21:55:25] it is [21:55:32] well, everyone is about to lose their session :) [21:56:33] I bet people's sessions were being eaten with a cleanup script [21:56:40] oh great. now login isn't working for me [21:56:42] * Ryan_Lane sighs [21:57:08] hah [21:57:18] on labsconsole? [21:57:21] i'm still logged in a think [21:57:27] there we go [21:57:28] yup [21:57:41] maybe that bug will disappear now [21:57:48] still there [21:57:51] shoudl I log out/in [21:57:51] ? [21:57:56] well, I just switched to memcache [21:57:57] yeah [21:58:02] how did you trigger it? [21:58:09] what recent actions did you do? [21:58:11] heyyyyy [21:58:12] yes [21:58:13] better [21:58:47] it may just be related to /tmp cleanup, but I guess we'll find out soon [21:59:26] oo, i have to be a member of sysadmin role to configure the instance [21:59:43] which instance are you trying to configure? [21:59:47] https://labsconsole.wikimedia.org/wiki/Nova_Resource:I-000000e2 [21:59:49] analytics [22:00:48] ah. yeah. you aren't a sysadmin [22:01:09] drdee: are you ok with ottomata having sysadmin and netadmin in mobile-stats? [22:01:32] ottomata: anyone with sysadmin rights can give it to you [22:01:38] ok [22:01:43] i'll get that [22:01:46] I generally ask first :) [22:01:51] so I understand: [22:01:53] once i have access [22:01:55] i can configure my instance [22:01:59] and add puppet 'groups' to it? [22:02:11] ni [22:02:12] no [22:02:14] puppet class [22:02:14] hm [22:02:17] I can show you really quick [22:02:22] k, coming over [22:05:54] !project mobile-stats [22:05:55] https://labsconsole.wikimedia.org/wiki/Nova_Resource:mobile-stats [22:07:57] I'm enabling LiquidThreads on labsconsole [22:08:05] there's a possibility it'll die [22:08:53] seems to be working [22:15:46] hmm [22:21:33] git push gerrit :refs/heads/tmp [22:21:41] well, except your remote isn't gerrit ;) [22:21:50] where tmp is the new branch [22:22:06] you have to already have a commit to push [22:22:15] you can't create a branch without a commit [22:23:38] The web interface does allow you to create a branch without a commit [22:23:46] yep [22:23:48] But yeah this way they complement each other [22:25:04] basing it off a commit makes a lot of sense, though [22:25:49] uhu [22:26:42] so, reviewboard [22:28:28] ugh [22:28:30] it's like phabricator [22:29:10] I think if we had to choose between the two, I'd go with phabricator, since it's php [22:30:02] We were saying last night that one of the first things we should do with Phabricator is hack in automated merging [22:30:10] i.e. 
automatically merge changes that have been approved [22:30:18] That shouldn't be all that hard to do [22:30:36] yeah [22:30:50] I hate that it's based on diffs :( [22:31:55] Yeah, git commit objects are so much nicer [22:32:00] yeah [22:32:08] But it requires writing a git server wrapper in say c++ [22:32:29] Depending on how much extended data is in Phabricator's diff format, those diffs might really just be weird serializations of commit objects [22:32:31] In which case it's fine [22:32:54] I believe it's straight diffs [22:33:07] If we could change the backend so that recreating a commit from a diff would result in the same hash... that shouldn't be too hard to do using some of the lower-level git commands [22:33:08] I say that because you can manually generate them [22:33:18] Right [22:33:48] we'd need to wrap authn/authz around it too [22:34:08] ? [22:34:13] How so? [22:34:22] user x, y, and z are allowed to merge [22:34:25] others are not [22:34:28] Right [22:34:36] otherwise it isn't gated [22:34:37] Yeah if you have automatic merging that needs to be taken into account [22:34:45] <^demon> And make those on per-branch basis. [22:34:48] yep [22:34:55] we're basically re-creating gerrit's backend :D [22:34:55] <^demon> (You're quickly adding complexity that gerrit already has) [22:35:00] The script that you use to merge approved revs (which we don't really want to use) also needs to respect permissions and approval status [22:35:19] Ryan_Lane: Well recreating parts of Gerrit's backend in a sane language wouldn't be too terrible [22:35:25] But potentially a shitload of work [22:35:29] <^demon> Java's not totally insane. It could be worse. [22:35:30] yes [22:35:40] It could be something like OCaml, true [22:35:47] <^demon> Or ruby ;-) [22:35:50] right now, we have openstack people to help [22:36:36] Yeah [22:37:07] <^demon> Ryan_Lane: Do you know how receptive gerrit seems to be to 3rd party development? [22:37:10] with phabricator do we have any other help? [22:37:18] ^demon: they are receptive, but slow to act [22:37:29] At least another non-Google org is using it (OpenStack) [22:37:41] <^demon> Lots of people are using it :p [22:37:45] yeah [22:37:57] we really need to host an openstack meetup in the office [22:38:13] hell, we should host a gerrit oriented openstack meetup [22:38:22] YES [22:38:29] <^demon> RoanKattouw: Are you doing a WM2012 presentation? [22:38:33] Or maybe #openstack-infra -oriented more generally [22:38:38] yeah [22:38:43] ^demon: For VisualEditor yes. Maybe something else, haven't decided [22:38:49] ^demon: We could do something focused on git I guess? [22:38:55] <^demon> http://wikimania2012.wikimedia.org/wiki/Submissions/Practical_Git_for_Wikimedians - I submitted that today [22:39:08] Aha! [22:39:21] Nice [22:39:32] Should probably also be run in Berlin (if you're coming) [22:39:40] <^demon> Nope. [22:40:04] Meh; finals? [22:40:12] I really need to submit something [22:40:19] when is the last submission date? [22:40:24] March 18th? [22:40:24] the 16th of next month? [22:40:27] oh [22:40:27] ok [22:40:28] I think [22:40:31] Well somewhere around that time [22:40:38] I have a while, then [22:40:43] <^demon> RoanKattouw: My 1 summer class I have to take to graduate starts the monday after berlin ends. [22:40:50] heh, yeah that's annoying [22:41:04] Hopefully they'll send Brion and myself so we can do a git tutorial there [22:41:04] <^demon> And I'd rather not tempt the travel gods. 
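(On the point at 22:33 about recreating a commit from a stored diff so the hash comes out identical: a rough sketch with git plumbing; the parent, file name, identities, dates and message below are placeholders that would all have to match the original commit byte for byte:)

    PARENT=0123abc                      # placeholder: the parent the author committed on
    git checkout "$PARENT"
    git apply reviewed-change.diff      # placeholder: the diff stored by the review tool
    git add -A
    TREE=$(git write-tree)
    # identity and timestamps are part of the hash, so they must be replayed exactly
    export GIT_AUTHOR_NAME="Original Author" GIT_AUTHOR_EMAIL="author@example.org"
    export GIT_AUTHOR_DATE="1330000000 +0000"
    export GIT_COMMITTER_NAME="Original Author" GIT_COMMITTER_EMAIL="author@example.org"
    export GIT_COMMITTER_DATE="1330000000 +0000"
    echo "original commit message" | git commit-tree "$TREE" -p "$PARENT"
    # prints a sha1 that equals the original only if every field above matches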
[22:41:15] wheeee [22:41:49] <^demon> The tutorial I plan to do at WM2012 is going to be geared towards people who already understand git basics, but want to understand our workflow and how we're doing things. [22:42:32] Yeah [22:43:27] <^demon> brion: Can we go back in time and have you guys move to git like 6 or 7 years ago? Converting a 10 year repo history is NP-hard. [22:43:47] <^demon> :) [22:43:57] heh [22:45:18] and also write a sane code review tool for it? [22:45:22] heh [22:45:46] I guess we'd be using the mediawiki one for git if you guys switched to it way back then [22:47:37] <^demon> Oh yippie. [22:50:11] hey if you'd prefer to do your code reviews in notepad, i can go back in time and unwrite CodeReview :) [22:50:13] seems the openstack guys have a bunch of work queued [22:50:17] for gerrit and jenkins [22:50:29] brion: I actually like the mediawiki codereview tool [22:50:39] \o/ [22:50:49] its interface is much better than gerrit [22:50:57] that's a low bar ;) [22:53:48] brion: indeed it is [22:54:08] HP says they have 5 gerrit changes queued and waiting on legal to submit [22:54:19] (HP is an openstack member) [23:00:12] <^demon> Ryan_Lane: Are any of those changes ui fixes? [23:02:53] ^demon: probably all UI fixes [23:03:14] <^demon> Well gerrit needs to hurry up then :) [23:03:54] <^demon> Hmm. svn2git is choking on some of our release tags :\ [23:04:21] <^demon> All the old cvs2svn-generated ones are fine. They start messing up once brion started tagging manually with 1.6.0 [23:05:51] ^demon: HP lawyers need to hurry up, technically ;) [23:06:00] they can't submit the changes until they are signed off on [23:06:06] something to do with gerrit's CLA [23:07:40] <^demon> Lawyers... [23:07:46] yeah [23:07:50] CLA's too [23:07:59] they make things painful sometimes [23:08:23] <^demon> And this is why we don't do contributor agreements. [23:08:56] Ryan_Lane: any idea if bdale's involved with this? (former debian project leader, current SPI president, HP open source guy) [23:10:38] jeremyb: dunno [23:17:53] * jeremyb waves the Krinkle [23:17:57] hiya [23:18:30] Krinkle: so, did you find a home? bots-4? [23:18:43] Haven't tried logging into bots-4 yet [23:18:48] got other priorities right now [23:19:16] catching up on code review and bugzilla from last 2 weeks absence and mediawiki development [23:20:39] well FYI if you could get into other boxes the inside way instead of straight to the public address then this one should be no different. (you've already logged in to that project and now you can get through bastion too) [23:21:10] k, I'll try a straight log in and get out again [23:22:21] k worked [23:22:42] jeremyb: Is there a way I can avoid the fingerprint-warnings for every separate instance, or is that a good thing? [23:22:51] I know it's a good thing [23:23:05] but does it make sense to be able to whitelist stuff behind bastion.wmflabs ? [23:23:46] just wondering if there's a trick everybody is using that I just don't know about, it's ok if I should keep accepting them for each new instance. [23:26:28] well... [23:26:32] we'd like to automate it [23:26:34] but it's hard [23:26:59] what I'd really like to do is have the instance's fingerprint added to DNS [23:27:04] then have the instances configured to trust DNS [23:27:10] that also requires dnssec, though [23:27:38] and we can't use LDAP for DNS records then either, which means we need to write some more code [23:36:14] Krinkle: monkeysphere would do it...
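(A sketch of the fingerprint-in-DNS idea at 23:26: ssh-keygen can print SSHFP records for the zone, and the client opts in with one ssh option, though it only really helps once the zone is DNSSEC-signed and the resolver validates it. The hostname is just the one from this log:)

    # on a host that has the instance's key, emit SSHFP resource records for the zone
    ssh-keygen -r bots-apache1.pmtpa.wmflabs -f /etc/ssh/ssh_host_rsa_key.pub
    # client side, once the records are published and signed:
    echo 'VerifyHostKeyDNS yes' >> ~/.ssh/config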
[23:36:40] I have no clue what you guys are talking about [23:36:49] :D [23:36:58] Krinkle: talking about managing ssh public keys [23:37:03] that much I figured [23:37:04] so that labs users don't need to accept them [23:37:15] Krinkle: http://web.monkeysphere.info/ [23:37:19] it's possible to store the information in DNS, and let ssh pull it from there [23:37:23] but dnssec and monkey spheres are to exotic for me. [23:37:32] too* [23:37:39] and it's one word ;) [23:37:43] I know [23:37:43] dnssec signs the zone file [23:37:46] ;-) [23:37:55] Tell me if and when I need to do stuff, until than, happy shop talking :) [23:38:03] :D [23:38:12] Krinkle: bake me a cake [23:38:21] got some browsers to crash, so I'm going back to work [23:38:24] with spherical components [23:38:33] Krinkle: write dns support for bind-style zone files in our openstack dns driver [23:38:36] then enable dnssec for it [23:41:43] Ryan_Lane: should be enough to just expose pub keys by API or some other public way and then have a script for people to run to update their known_hosts. or even use the same ssh hook that monkeysphere uses. (can look it up) [23:42:32] that's a much more painful thing [23:42:33] Ryan_Lane: keys should be exposed somewhere in the web interface anyway even if we do dnssec. i can do the client side part... [23:42:38] and it puts the burden on the end-user [23:42:58] if we can publish the ssh keys, then we can also handle them in DNS too [23:43:35] how is it more painful? either way the end user has to do something. unless they do agent forwarding. but i'm of course against agent forwarding ;) [23:43:53] how so? the instances would be pre-configured to trust DNS [23:44:02] excluding the bastion node [23:44:06] i don't follow [23:44:39] put the ssh fingerprint of the host into DNS [23:44:45] you have to configure your local box to get the fingerprint from DNS and to make sure it's doing dnssec verification [23:44:46] configure ssh to trust it [23:44:58] who cares about the local box? [23:45:15] the only one it needs to know is the bastion instance [23:45:17] i do? everyone that's not doing agent forwarding does [23:46:02] proxycommand stores all the keys locally? [23:46:06] that's annoying [23:46:16] of course, how else would it work? [23:46:38] via the server? [23:46:42] Krinkle's using proxycommand and he's the one who asked [23:46:48] Ryan_Lane: i don't follow [23:46:53] it isn't difficult to trust dns for ssh [23:47:01] it's one more line to add to the ssh config [23:47:43] so, what then, everyone has to run their own local DNS servers too? in order to get the wmflabs TLD [23:48:04] or is the WMF going to expose public recursors? ;-) ;-) [23:48:12] hm [23:48:19] meh [23:48:26] I'll just continue to ignore the problem :D [23:48:27] anyway, i think my fix is fairly simple and globally applicable. we can have it work with no user action on bastion to cover the agent forwarders and with little work for everyone else [23:48:46] all you have to do is get the keys to somewhere where i can fetch them [23:48:53] * Ryan_Lane nods [23:48:58] we have plans for it. it's unfortunately hard [23:49:46] can't you just ssh-keyscan from a trusted host? or do some magic with facter? [23:50:13] you can't trust the instance [23:50:29] which means you need to probe from the server [23:51:00] I'd really love some way of pre-generating the ssh-keys, then injecting them [23:51:06] it may be possible. [23:51:20] i don't follow. you can't trust what? 
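(A sketch of the "get the keys to somewhere where i can fetch them" approach at 23:48; the URL is purely hypothetical and stale-entry handling is left out:)

    # hypothetical endpoint; nothing like this exists yet
    curl -s https://labsconsole.wikimedia.org/hostkeys -o /tmp/labs-hostkeys
    # entries published in known_hosts format could simply be appended
    cat /tmp/labs-hostkeys >> ~/.ssh/known_hosts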
[23:51:36] i'm not crazy about the pregeneration [23:51:41] you can't let the instance update the server [23:51:48] you could parse it out of the console output [23:51:59] yeah, that's nasty, though [23:52:03] what does that mean and why? [23:52:09] yes, it is nasty [23:52:21] pre-generation of the ssh-keys is the best option [23:52:26] what's wrong with that? [23:52:41] generate on the server, and inject into the instance [23:52:59] well first of all it means that if there's a security update to change the way they're generated then you don't pick it up [23:53:06] hell, if you delete and re-create an instance you can just re-inject the same key [23:53:12] but it just generally feels icky [23:53:53] why? the server's ssh version may be more up to date than the instance's [23:54:06] the instance builds using an image, that's possibly old [23:54:17] it generates the keys, then updates itself [23:54:49] then the server also always has direct access to the public kehys [23:54:50] *keys [23:57:10] i'm not getting what's wrong with the ssh-keyscan [23:58:05] where do you get the key from? [23:58:06] is it a timing thing? it should be successful any time the port's open and a quick failure when it's not open [23:58:14] from the network [23:58:20] ssh-keyscan -t rsa hostname [23:59:12] because we delete and re-create instances with the same names [23:59:13] but i still don't understand why we can't trust the box about its own key... (from above) [23:59:26] and they should have the same key [23:59:36] should they really? [23:59:47] hm. actually, I guess not
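(And the ssh-keyscan variant from 23:49 and 23:58, run from a host that is already trusted rather than from each laptop; since instance names get reused, any old entry is dropped first. A sketch of the idea rather than a settled fix, given the trust questions raised above:)

    HOST=bots-4.pmtpa.wmflabs          # the box mentioned above, as an example
    ssh-keygen -R "$HOST"              # names get reused, so clear any stale key first
    ssh-keyscan -t rsa "$HOST" >> ~/.ssh/known_hosts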