[00:31:32] spagewmf: for me too; taking a look
[00:38:58] oh, i don't have that key on this machine yet...
[00:39:19] or rather this machine's key in ldap
[00:42:57] * jeremyb waits for the bot
[00:45:41] 08/18/2012 - 00:45:40 - User jeremyb may have been modified in LDAP or locally, updating key in project(s): testlabs,bots,bastion,gerrit,globaleducation,search,deployment-prep,phabricator,upload-wizard,puppet-cleanup,testing,otrs,maps,swift,mailman
[00:45:51] 08/18/2012 - 00:45:51 - Updating keys for jeremyb at /export/keys/jeremyb
[01:05:39] 08/18/2012 - 01:05:39 - User jeremyb may have been modified in LDAP or locally, updating key in project(s): testlabs,bots,bastion,gerrit,globaleducation,search,deployment-prep,phabricator,upload-wizard,puppet-cleanup,testing,otrs,maps,swift,mailman
[01:05:49] 08/18/2012 - 01:05:49 - Updating keys for jeremyb at /export/keys/jeremyb
[01:05:57] thanks labs-home-wm ;)
[01:31:52] i'm stumped. idk how this ever worked
[01:32:01] 443 seems to not be in phab's sec group
[01:36:15] ok, now i'm really confused
[01:36:40] it was always drilled into me that sec groups can't be changed for an instance after it's creation
[01:36:55] i just changed one and it fixed the problem...
[01:37:10] phabricator's working again
[01:39:32] !log phabricator phabricator: fixed ssl. played with the nginx conf a little and couldn't get it working. tweaked nova sec group and that fixed it. (443 was missing entirely)
[01:39:33] Logged the message, Master
[01:59:25] hey Ryan_Lane
[01:59:34] 18 01:36:40 < jeremyb> it was always drilled into me that sec groups can't be changed for an instance after it's creation
[01:59:38] 18 01:36:54 < jeremyb> i just changed one and it fixed the problem...
[02:00:05] Ryan_Lane: are you planning to open up essex @ eqiad for crowdsourced testing?
[02:02:09] * jeremyb is running off again in a couple mins
[02:09:28] * jeremyb runs
[02:56:24] * jeremyb waves from train. will be back in 10 mins
[04:07:08] spagewmf: try again?
[04:07:19] 18 01:39:31 < jeremyb> !log phabricator phabricator: fixed ssl. played with the nginx conf a little and couldn't get it working. tweaked nova sec group and that fixed it. (443 was missing entirely)
[04:08:28] jeremyb I'm in, thanks!
[04:08:33] sure
[04:50:39] 08/18/2012 - 04:50:39 - User jeremyb may have been modified in LDAP or locally, updating key in project(s): testlabs,bots,bastion,gerrit,globaleducation,search,deployment-prep,phabricator,upload-wizard,puppet-cleanup,testing,otrs,maps,swift,mailman
[04:50:49] 08/18/2012 - 04:50:49 - Updating keys for jeremyb at /export/keys/jeremyb
[04:50:59] yeah, yeah
[09:54:02] Change on mediawiki a page Developer access was modified, changed by Kdambiec link https://www.mediawiki.org/w/index.php?diff=573484 edit summary:
[20:15:39] 08/18/2012 - 20:15:38 - Created a home directory for cariaso in project(s): bastion,testing
[20:15:47] 08/18/2012 - 20:15:47 - Creating a home directory for cariaso at /export/keys/cariaso
[20:20:41] 08/18/2012 - 20:20:41 - User cariaso may have been modified in LDAP or locally, updating key in project(s): bastion,testing
[20:20:50] 08/18/2012 - 20:20:50 - Updating keys for cariaso at /export/keys/cariaso
[21:25:36] !cli
[21:27:12] [jeremyb@irc wikimedia-labs]$
[21:27:47] ;)
[22:15:52] Change on mediawiki a page Developer access was modified, changed by Jeremyb link https://www.mediawiki.org/w/index.php?diff=573586 edit summary: /* User:kdambiec */ done
[22:16:01] thanks wm-bot
[22:16:18] when does petr get back?
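The fix !log'd above amounts to adding a TCP 443 rule to the project's Nova security group. The log doesn't say which tool was used; purely as a sketch (the credentials, auth URL, and the "phabricator" group name below are placeholders, not the real values), the same rule could be added with python-novaclient along these lines:

    # Hedged sketch: open TCP 443 in a Nova security group via python-novaclient.
    # Username, password, tenant, auth URL, and group name are assumed placeholders.
    from novaclient.v1_1 import client

    nova = client.Client("USERNAME", "PASSWORD", "PROJECT",
                         "http://openstack-controller.example.org:5000/v2.0/")

    # Find the project's security group and allow HTTPS from anywhere.
    group = next(g for g in nova.security_groups.list() if g.name == "phabricator")
    nova.security_group_rules.create(parent_group_id=group.id,
                                     ip_protocol="tcp",
                                     from_port=443,
                                     to_port=443,
                                     cidr="0.0.0.0/0")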
[22:44:53] hi, can someone help me out? I'm trying to create my first labs instance, but I'm just getting an error saying "Failed to create instance" after the add instance form
[22:45:03] the project is https://labsconsole.wikimedia.org/wiki/Nova_Resource:Zotero
[22:45:49] Interesting
[22:46:12] Might be one for Ryan_Lane as we're sorta about to upgrade a bunch of things. Don't suppose it gives any more info?
[22:47:15] tgr: try again
[22:47:28] it may be due to the 500 erros
[22:47:30] *errors
[22:47:32] Ah, it's one of *those* issues
[22:48:01] Btw did you fix that annoying changing profile = dropping session issue yet?
[22:48:32] not just yet
[22:48:33] will soon
[22:48:41] :)
[22:49:41] heh. first test of OpenStackManager against production LDAP resulted in PHP exhausting memory
[22:50:13] Rofl
[22:50:42] Ryan_Lane: it is an old error, I had the same problem at Wikimania
[22:50:49] I had a really annoying php exhausting memory issue the other day, simplexml_load_string was just making up its own limit and failing well below the limit :(
[22:50:58] tgr: yeah
[22:51:03] Same bug :P
[22:51:11] that bug will go away when we upgrade
[22:51:37] Damianz: my problem was that I was loading all projects, which was actually loading all of them from ldap
[22:51:50] Ah... that's rather a lot of data IIRC.
[22:51:56] I switched to doing a lazy load, which just loads their project names
[22:52:15] for admins I need the project names for the project filter
[22:52:27] I recall the frontend doing something like grabbing all the groups in an ou and filtering out by class or something.
[22:52:35] I may eventually need to switch to make the project filter a text box that loads the project names via ajax
[22:52:59] lazy loading does the trick really well right now, though
[22:53:01] The project filter thing is annoying but will really be required as the number of projects grows.
[22:53:06] yes
[22:53:29] I may be able to default it to enable all projects if a user's project list is under a certain number of projects
[22:54:07] multi-region naming is going to be hard
[22:54:37] Damianz: I usually get a stack trace afterwards when I go to Special:NovaInstance ("Error getting instance list from Nova: An unknown error has occurred.")
[22:54:41] dns naming, that is
[22:54:47] not sure if it is related, though
[22:54:48] tgr: yeah. that's a known issue
[22:54:55] tgr: if you just refresh the page, it'll re-port
[22:54:58] *post
[22:55:03] which will try to create it again
[22:55:40] ah. crap. I need to base the DNS domain on the region. I forgot about that
[22:55:44] Are the 2 zones going to have like MPLS sexiness so we can talk over private ips to the other range/migrate vms between the 2 and float the ips over (not sure how isolated zones are currently) or just public outside of each zone?
[22:56:07] well....
[22:56:18] that won't work so well
[22:56:26] the regions manage the floating IPs themselves
[22:56:36] so, you can't float an IP between regions
[22:56:46] which sucks, yes
[22:57:17] Ryan_Lane, so basically I can't use labs until the upgrade?
[22:57:24] I thought the idea was to have 1 master control cluster that can manage sub clusters (full os installs used as zones). (totally don't keep track with dev atm)
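The lazy load Ryan_Lane describes comes down to asking LDAP for project names only, instead of pulling back every project entry with all of its member and role attributes (the part that exhausted PHP's memory). OpenStackManager itself is PHP; purely as an illustration of the idea, with an assumed server URI, base DN, and objectClass, a Python sketch could look like:

    # Illustrative sketch of the "lazy load": fetch only the cn (project name)
    # attribute rather than whole project entries. URI, base DN, and objectClass
    # are hypothetical placeholders, not the real Labs LDAP layout.
    import ldap

    def list_project_names(uri="ldap://ldap.example.org",
                           base_dn="ou=projects,dc=example,dc=org"):
        conn = ldap.initialize(uri)
        conn.simple_bind_s()  # anonymous bind, fine for a sketch
        results = conn.search_s(base_dn, ldap.SCOPE_ONELEVEL,
                                "(objectClass=groupOfNames)", ["cn"])
        # Each result is (dn, attrs); keep just the name.
        return [attrs["cn"][0].decode() for _dn, attrs in results]

    if __name__ == "__main__":
        print(list_project_names())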
[22:57:31] tgr: you can, you just need to re-attempt occasionally
[22:57:49] Damianz: that's really difficult to do
[22:58:09] Though doing full layer3 stuff would be more "less open" I suppose, though for most orgs there's some form of inter-dc connectivity, it would cause issues losing a router though.
[22:58:29] Suppose if you want really good integration at re-announcing public ips and just losing tunnels it would never hit core
[22:58:33] well, for the floating IPs we're going to do BGP announcements
[22:58:41] so that isn't actually problematic
[22:59:05] Ryan_Lane: I tried on five different days at least, several times each, always get the same error
[22:59:11] hm
[22:59:19] tgr: what are you naming the instance?
[22:59:22] maybe the name is already in use
[22:59:28] the error checking for that sucks right now
[23:00:15] If we actually use BGP for public ips then failover for like loadbalancing services would be easy yeah, but if each region manages its own ips I can imagine integrating lb stuff (like the awesome python version I saw) could be tricky.
[23:00:42] It's a shame some of the plugins for like user<>cluster / cluster<>cluster vpn connections aren't more in core but then I suppose that's why there's plugins.
[23:00:53] Damianz: yes, that's one reason why multi-region is easier
[23:02:16] ideally, though, we could do load balancing with a combination of lvs and dns
[23:02:28] where we did anycast dns, or something
[23:02:38] tgr: what are you trying to name the instance?
[23:04:11] That would actually be pretty cool.
[23:05:12] doing that wouldn't be the easiest thing in the world :)
[23:05:20] It's a little bit weird to think about as everything really resides in the same layer 2 network with nat over the top so the points of breakage and not being able to route traffic mostly lie in inter-zone communication internally.
[23:05:45] yes. which is again why multi-region is easier :)
[23:05:50] Ryan_Lane: zotero1 or zotero1
[23:05:51] Which for redundant services, losing 1 dc wouldn't be an issue - clean failover. Losing the link between 2 and taking down "tunnels" for internal traffic would cause split brain mayhem potentially.
[23:05:51] 08/18/2012 - 23:05:51 - Creating a home directory for karun at /export/keys/karun
[23:06:05] tgr: try a different instance name
[23:06:08] but i tried now with a random postfix and still the same error
[23:06:11] ah
[23:06:11] hm
[23:06:45] tgr: can you try right now for me?
[23:06:50] while I'm tailing the log
[23:07:01] maybe it's some other issue
[23:07:10] I suppose really I'm thinking that 2 totally isolated clusters with communication over public ips would suck, but actually 2 isolated accessible on internal ips through network magic would be useful, more manageable and still allow failover of public ips and some thought about handling failure internally.
[23:07:25] * Damianz also should probably not put thoughts in irc as they most likely make no sense :P
[23:07:44] Ryan_Lane: trying
[23:07:52] eqiad and pmtpa will be able to talk to each other
[23:08:05] and done
[23:08:11] it failed?
[23:08:14] yes
[23:08:39] weird
[23:08:42] no error in the logs
[23:09:15] I still think full inter-datacenter failover would be awesome cool to do with openstack, but there are so many issues from storage to latency to integration with bgp/mpls etc. Useful > cool sometimes
[23:09:55] Damianz: you do that from the client side, not the server side
[23:09:58] meaning openstack doesn't do it
[23:10:04] you do it with your instances and your config
[23:10:51] 08/18/2012 - 23:10:51 - Updating keys for karun at /export/keys/karun
[23:11:18] tgr: let me get back with you about this
[23:11:28] tgr: I'm about to leave so I can't debug it right now
[23:11:49] Ryan_Lane: ok, thanks
[23:12:25] From the point of redundant applications, yes and it's not uber hard as long as you have access to stuff like RAs. For being able to live migrate machines across the country/auto failover a virtualised datacenter, let openstack do it. Though as I don't know anyone that supports that publicly (for internal stuff, certainly but that's back to app level) it falls under cool rather than practical
[23:13:23] RAs?
[23:13:50] Actually mean BGP announcements rather than RAs... totally got IPv6 on the brain
[23:13:54] Damianz: well, you'd have to either re-network the instance, or you'd have to have the networks work across regions
[23:14:35] you can do failover with DNS and anycast
[23:14:44] don't necessarily need BG
[23:14:45] BGP
[23:15:25] I can actually think of a way of doing that, too
[23:15:33] Oh yeah, it would totally work with one zone using anycast and terminating at a lb. BGP would make it faster. But that would be a single zone in openstack as far as I know. High level cluster awareness would be interesting though.
[23:15:34] again, not terribly easy
[23:15:55] Anycast is awesome in so many ways.
[23:16:06] yeah
[23:16:13] well, I'm talking about anycast dns
[23:16:52] then you do geo-loadbalancing via anycast dns
[23:17:10] and you failover by modifying the dns records
[23:17:52] We've already put some support for DNS into openstack. we really need to expand on it
[23:18:19] Yeah - the only issue you'd fight is cache on end users' dns recursers. Not a huge issue but just announcing for the ip routes to change would be faster. I'm not a huge fan of geodns on any level above getting people terminated to a local lb.
[23:18:38] Any crazy logic can be handled at the loadbalancers where it's internal and can change fast.
[23:19:09] well, you'd still do in-datacenter load balancing via bgp
[23:19:31] well, let me clarify that
[23:19:39] you'd still load balance your load balancers via bgp
[23:19:50] Yeah
[23:19:52] you'd do in-datacenter load balancing via lvs
[23:20:15] ok. I've got to run
[23:20:16] * Ryan_Lane waves
[23:20:22] * Damianz waves back
[23:22:06] Hmm I wonder how hard it would be to move things up the stack a little and route directly to the instances and just have "loadbalancers" to announce changes based on detected state. You'd hit all sorts of "not in the same subnet" issues I guess but for ipv6 if you could route traffic straight to hundreds of machines that would be cool and drop the number of public ips needed.
[23:22:17] Though you'd lose any layer7 stuff you *could* do at the lbs
[23:23:45] Ah damn, how is it half midnight
[23:23:53] * Damianz goes to tidy this code up and push it before sleep
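The failover idea discussed above - anycast DNS for geo-loadbalancing, LVS inside each datacenter, and failing over "by modifying the dns records" - could be prototyped with a small health-check-and-update script. This is only a sketch of that idea using dnspython dynamic updates: the zone, record, server addresses, and the bare TCP health check are assumptions for illustration, not the actual Labs setup.

    # Hedged sketch: if the primary datacenter's load balancer stops answering,
    # repoint the service's A record at the secondary via a dynamic DNS update.
    # All names and addresses below are made-up placeholders.
    import socket

    import dns.query
    import dns.update

    ZONE = "example.org"
    RECORD = "service"           # i.e. service.example.org
    DNS_SERVER = "203.0.113.53"  # authoritative server accepting dynamic updates
    PRIMARY = "198.51.100.10"    # e.g. pmtpa load balancer (assumed)
    SECONDARY = "192.0.2.10"     # e.g. eqiad load balancer (assumed)

    def healthy(ip, port=443, timeout=3):
        """Crude TCP health check against a load balancer."""
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return True
        except OSError:
            return False

    def point_record_at(ip):
        """Replace the A record so clients land in the surviving datacenter."""
        update = dns.update.Update(ZONE)
        update.replace(RECORD, 60, "A", ip)
        dns.query.tcp(update, DNS_SERVER)

    if __name__ == "__main__":
        point_record_at(PRIMARY if healthy(PRIMARY) else SECONDARY)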