[01:56:14] Ryan_Lane: ping? [01:57:21] sup? [01:57:22] so. [01:57:32] I just found something that's going to save me days of work :) [01:57:36] virt5 is emptied [01:57:38] I don't *need* to do the dns stuff right now [01:57:40] awesome [02:04:13] paravoid: so, what's up with the c_rehash revert? did you look at all yet? [02:04:25] I haven't yet no [02:04:34] * jeremyb could look a little but isn't really in a position to test i think? that needs a prod node? [02:04:42] I have several people waiting on me for reviews plus the VM migrations [02:04:55] (cause it only broke prod?) [02:05:08] probably tomorrow, right now I feel like just doing dumb work [02:05:16] (at this time...) [02:05:34] heh [02:06:02] * jeremyb pastes for reference: [02:06:03] 01 16:40:42 <@mark> (File[/etc/ssl/certs/star.wikibooks.org.pem] => Exec[c_rehash] => Class[Certificates::Base] => Install_certificate[star.wikibooks.org] => File[/etc/ssl/certs/star.wikibooks.org.pem]) [02:06:25] * jeremyb digs a little [02:06:47] Ryan_Lane: simple revision for you https://gerrit.wikimedia.org/r/#/c/17376/ [02:06:54] Fixes the annoying tabindex order on login at labs ;) [02:07:28] can't we just fix our auth code so that we don't have to do this in hacky ways? :) [02:07:49] this is a 30 second fix... :D [02:08:08] can't we just not offer a choice if there's no choice to make? [02:08:09] ;) [02:08:25] Reedy: well, yes, but it breaks things for third parties [02:08:25] but yeah, having to manually set this based on randomness is daft [02:08:33] not that we have any [02:08:34] but still [02:08:35] :) [02:08:54] merged [02:08:55] let's deploy [02:10:03] wow. I almost did something really really bad [02:10:18] Ryan_Lane: permission to reboot all of your VMs? incl. the ones in the bots project? [02:10:24] paravoid: go for it [02:11:05] Reedy: deployed [02:11:05] worked [02:11:17] Yay [02:11:25] I almost did a git pull in the parent directory [02:11:27] I was going to log a bug, but it would've taken longer to fill out the form [02:11:36] I ctrl-c'd out of it quickly enough :) [02:12:10] paravoid: what are you still doing awake? :) [02:12:17] heh [02:12:28] Not something you want to be doing when you don't want to have to fix it now [02:12:31] zzz have VMs zzz migrate [02:13:00] :D [02:13:18] go to sleep! heh [02:14:52] Reedy and me have just an hour time difference you know [02:14:53] hooray, we can stick anything we want into the metadata with the nova api! [02:15:05] and update it [02:15:17] yeah, I knew that [02:15:22] Aren't you +2 on us? [02:15:25] It's 03:15 here [02:15:31] ah, right [02:15:33] damn [02:15:59] nova api doesn't list the security groups that an instance is a member of, though :( [02:16:06] so I'm adding it to the metadata for now [02:17:57] I'm pondering whether I should just leave a script that nova-manage vm list | grep -v virt[6-8] | cold-migrate and go to sleep :P [02:18:34] might want to bring virt5 into the mix first [02:18:40] by removing the gluster mount [02:19:11] ? [02:19:17] virt5 is a cisco box [02:19:37] that's why I was saying we should evacuate it first :) [02:19:38] yeah [02:20:06] oh so you mean remove the gluster mount and start populating VMs on it [02:20:10] yeah [02:20:27] yeah, should better do that sooner than later if I want to keep the cluster balanced [02:20:30] also, right now, if someone creates an instance, it'll end up on virt5 [02:20:37] heh, right [02:20:45] can we drain virt1-5? [02:20:53] i.e. tell openstack to never allocate vms there? 
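
An aside on the "drain virt1-5" question just asked: one way to tell the scheduler to stop placing new VMs on a host is to mark its nova-compute service as disabled, then move the remaining instances off with the one-liner quoted earlier. This is only a sketch — the exact nova-manage syntax varies by release, and in the conversation that follows the simpler route wins (empty the hosts first, then just stop nova-compute).

# Possible drain procedure, not the one actually used here.
# Mark the old compute nodes as disabled so the scheduler skips them
# (check `nova-manage service --help` for the flag syntax on this release).
for host in virt1 virt2 virt3 virt4 virt5; do
    nova-manage service disable --host=$host --service=nova-compute
done

# Then cold-migrate whatever is still running on them, as discussed above.
nova-manage vm list | grep -v 'virt[6-8]' > /tmp/nova-vm-list
# ...feed the instance numbers from /tmp/nova-vm-list to cold-migrate...
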
[02:20:53] I'm fine with that [02:20:58] ah [02:21:07] well, when they are empty we can just disable nova-compute [02:21:13] I mean now [02:21:17] hm [02:21:23] I can't think of an easy way [02:21:39] :( [02:21:41] it's likely possible, but I'd need to find the way how [02:22:15] so, what's the process of removing the gluster mount? [02:22:26] I really don't want to read gluster docs right now :) [02:22:32] umount [02:22:47] and remove from fstab [02:22:51] eh? just that? [02:22:55] yeah [02:22:57] it's just a mount :) [02:23:07] it's not a replica? [02:23:10] nope [02:23:19] cool [02:23:21] virt1-4 are [02:23:30] and even on those, the mount isn't needed for the replica [02:23:51] gluster is easy. it's too bad it's so buggy [02:23:55] well yes, but if it had a replica I'd like to remove that too [02:23:59] * Ryan_Lane nods [02:24:09] we should wait until all nodes are done before doing so [02:24:24] otherwise we need to shrink the volume [02:25:35] and it needs to be done in pairs [02:25:37] umounted [02:25:55] sweet [02:26:11] we need to mount the local disk on /var/lib/nova/instances, now [02:26:33] I know I know [02:26:38] heh [02:26:40] I was just surprised umount worked :P [02:26:41] * Ryan_Lane nods [02:26:44] :D [02:28:20] gah, it's different than the others [02:28:26] :( [02:28:31] maybe we should rebuild it then [02:28:40] for now we can disable nova-compute [02:28:50] way ahead of you [02:28:51] ah [02:28:52] crap [02:28:54] I did that first [02:28:56] we can't rebuild it [02:28:58] heh [02:29:02] we can't? [02:29:03] why not? [02:29:06] ppa [02:29:10] fuuuuck [02:29:13] \o/ [02:29:17] never again [02:29:22] never again will I use a ppa [02:29:41] it's not too bad [02:29:55] * Ryan_Lane goes to a conference room with a whiteboard to write that 200 times [02:30:12] it just has a /dev/md1 for swap, which makes instance storage on /dev/md2 instead of /dev/md1 like the others [02:30:22] ah [02:30:23] yeah [02:30:24] otoh, we can live with three nodes for now and add it back when we upgrade [02:30:29] indeed [02:30:54] three nodes is way more than enough for now :) [02:31:09] just half a terrabyte of ram [02:31:48] yeah [02:31:55] more ram than the other hosts combined [02:33:56] Mem: 48386 48165 220 0 5 17915 [02:33:59] -/+ buffers/cache: 30244 18142 [02:34:01] Swap: 50159 11376 38783 [02:34:02] no wonder people were complaining [02:34:06] yep [02:36:18] cloudcracker.com is awesome [02:37:21] even has a rest api [02:38:43] ok. I'm off [02:38:45] * Ryan_Lane waves [02:38:59] go to sleep already. heh [07:39:11] hello [07:43:09] hi [07:43:17] !ping [07:43:17] pong [07:43:25] !log [07:43:30] :/ [07:44:56] ;-( [07:46:04] grmblbl apparently Gluster hasn't been updated :/ [07:54:11] petan: is there anything of interest on deployment-wmsearch or can we just get rid of it ? [07:54:46] hashar we really want to have a woring search there [07:55:05] sec [07:55:13] I must agree, but if we do so I guess we will want to start out using a fresh instance [07:55:19] + using Ubuntu Precise :-) [07:56:10] -bash: /usr/bin/groups: cannot execute binary file [07:56:10] Connection to deployment-wmsearch.pmtpa.wmflabs closed. [07:56:15] I guess it got corrupted [07:57:13] ok [07:57:21] !delete it [07:57:44] I should implement control interface for nova in this bot [07:57:54] so we can manage instances using !start !stop !delete !create [07:57:58] :D [07:58:00] !log deployment-prep {{bug|38748}} deleting unused/corrupted deployment-wmsearch instance. 
(had stuff like: -bash: /usr/bin/groups: cannot execute binary file. Connection to deployment-wmsearch.pmtpa.wmflabs closed.) [07:58:02] Logged the message, Master. [07:59:21] petan: Ryan wants to implements a MediaWiki API interface [07:59:37] this would need something more [07:59:39] petan: so he could start using more AJAX in the interface [07:59:46] you would need to auth using bot [07:59:57] but it would be cool [08:00:06] well the bot could get an auth token or a specific account [08:01:05] we could even have the bot act in the name of the user triggering the action [08:01:27] by using the irc cloak as an auth [08:04:50] heh [08:04:53] true [08:05:01] you could just set a cloak in your preferences in console [08:05:35] it would also log it to SAL by design [08:05:56] I should make a bug for that before I forget [08:07:02] we will make this channel an interactive shell :D [08:07:09] multiplayer bash [08:07:11] :P [08:07:22] I was actually thinking of a bot like that [08:07:39] it would be connected using ssh and it would send the output to channel and take input of users as commands to terminal [08:07:53] so that you could share the shell using irc [08:18:15] petan: #jenkins as a nice setup [08:18:33] that let them do most administrative tasks via the channel [08:18:40] for example forking a repo on github [08:18:47] triggering a jenkins build [08:18:51] or even a release :-) [08:19:00] and also interact with bugs [08:27:17] :) [08:27:45] irc is a nice terminal for multiple users :P [08:27:52] you just need to extend it a bit [08:51:04] Change on 12mediawiki a page Developer access was modified, changed by Kozuch link https://www.mediawiki.org/w/index.php?diff=568227 edit summary: [12:45:44] 08/02/2012 - 12:45:44 - Creating a home directory for rotsee at /export/keys/rotsee [12:46:46] 08/02/2012 - 12:46:46 - Updating keys for rotsee at /export/keys/rotsee [14:22:58] Change on 12mediawiki a page Developer access was modified, changed by Matma Rex link https://www.mediawiki.org/w/index.php?diff=568344 edit summary: [15:00:52] andrewbogott: ping? [15:03:20] howdy [15:04:56] hashar why my SUL on labs doesn't work anymore [15:05:48] petan|wk: I have no idea [15:12:12] petan|wk: try investigating it a bit more :-D [15:12:25] hashar how do I generate new one [15:12:30] I need my account to work [15:12:31] like are the pictures showing ? where do they link to ? what the link error output is and so on [15:12:44] error is wrong username or password [15:12:54] Login error [15:12:55] Incorrect password entered. Please try again. [15:12:56] that's i [15:13:30] I will need to change it in sql [15:13:30] SUL seems insane, not sure why we can't use oauth and stick to standards [15:13:42] what encryption it uese [15:13:44] uses [15:13:52] paravoid: hi. Did you get any inspiration for -dbdump / syslog-ng / rsyslog madness ? :-] [15:13:52] well [15:13:56] I have an idea [15:14:58] !log deployment-prep yeah we lost udp2log again! -dbdump : /etc/init.d/udp2log-mw restart [15:14:59] Logged the message, Master. [15:16:16] hashar: sec, looking at the outage [15:16:24] oh outage :( [15:16:27] paravoid: I'm out for breakfast at the moment; I'm going to head home and will catch up with you again in ~15 minutes. Hopefully the nagios smoke will have cleared by then… [15:16:36] andrewbogott_afk: yeah :) [15:16:40] I'll be here [15:17:43] hashar why gu_password is null for all users? 
[15:17:48] in centralauth [15:17:53] I have no idea what gu_password is [15:18:04] it's in global user table [15:18:07] where are passwords? [15:18:11] I need to reset mine [15:18:28] you could use eval.php maybe [15:18:40] what should I eval [15:18:53] mwscript eval.php --wiki=labswiki (can't remember the central auth wiki, but I guess it is labswiki) [15:18:54] $wgPasswordtableOfPetrb [15:18:57] :P [15:19:01] then input some mediawiki magic such as: [15:19:03] it's centralauth db [15:19:11] $u = User::newFromName( 'Petrb' ); [15:19:12] oh [15:19:18] $u->setPassword( 'foosecret' ); [15:19:18] you mean it's not in that db [15:19:19] ok [15:19:21] sec [15:19:22] $u->saveSettings(); [15:19:34] need to look at includes/User.php for the exact method names [15:20:53] wtf [15:21:03] there are no passwords in db :| [15:22:18] maybe that is not used :-) [15:22:21] I can login fine [15:22:23] and SUL work [15:22:32] (logged on commons beta and got logged on enwiki beta) [15:22:46] got it [15:24:09] wtf [15:24:12] it still doesn't work [15:24:21] I changed the password by hand... [15:24:22] pff [15:26:41] hashar can you give sysop to Petrb2 on enwiki [15:26:45] I need to try something [15:26:56] teh main acc doesn't work [15:27:01] I don't know why [15:27:11] I overwrote the gu_password by hand and it's still same [15:27:35] not a bureaucrat there :-D [15:27:40] you are steward [15:27:42] I am sure [15:27:45] I gave u steward [15:28:00] just go to meta wiki and do Petrb@enwiki] [15:28:03] just go to meta wiki and do Petrb@enwiki [15:28:12] it will let you change the rights there [15:29:13] pohhhhh [15:29:14] on http://deployment.wikimedia.beta.wmflabs.org/ [15:29:38] You do not have permission to edit user rights on other wikis. :-D [15:29:39] meta.wikimedia [15:29:45] o.o [15:29:50] we had to screw things [15:30:04] I don't have the user rights on meta.wikimedia.beta.wmflabs.org [15:30:05] or some steward did [15:30:14] you have global user rights [15:30:21] or you had before stewards put their nose in [15:30:28] ok I will do it using sql... [15:30:36] sorry ;D [15:41:41] paravoid: OK, back. Two questions: [15:41:53] 1) Are you trying to warn people before you migrate their instances? [15:42:00] 2) How should we divide up the pool? [15:42:33] I'm trying, although it's not required [15:42:43] if I see them online, I give them a heads-up [15:42:52] also, I'm thinking of special handling bots [15:43:27] as for the pool, we're on separate timezones, so we can work on it on separate times I guess :) [15:43:48] I did something like 40 yesterday, we have 115 left [15:43:51] it won't take that long [15:44:30] I'd start with nova-manage vm list |grep laner :) [15:44:47] That's easy enough… just let me know when you quit for the day and I'll take over. [15:44:55] feel free to start [15:44:57] I have a huuuuge backlog [15:45:00] ok :) [15:45:11] and I block other people, like hashar and j^ [15:45:29] regarding bots… special handling how? Just making sure the instance owners are on call during migration? [15:45:50] so far my plan was "do everything else, and leave them last" :) [15:45:54] :) [15:45:55] but yeah, that sounds sensible [15:45:58] ping their owners [15:46:03] also bastion [15:46:14] Does your script mostly succeed without comment, or are there frequent special cases? [15:46:19] since killing bastion would be bad(tm) [15:46:23] Oh yeah, bastion, yikes! 
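
For reference, the eval.php recipe hashar dictated above, gathered into one runnable invocation — eval.php evaluates each line of PHP it reads from stdin, so a heredoc works. The wiki name and password are placeholders, and this only resets the local password row; on a CentralAuth wiki the global credential (gu_password) is handled separately, which may be why editing it by hand did not behave as expected.

# One-shot local password reset via MediaWiki's eval.php
# (wiki name and new password below are placeholders).
mwscript eval.php --wiki=labswiki <<'EOF'
$u = User::newFromName( 'Petrb' );
$u->setPassword( 'new-secret-here' );
$u->saveSettings();
EOF
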
[15:46:37] it has no error handling /at all/ [15:46:48] (ryan wrote it mostly) [15:47:06] be careful, the first argument is 0000NNN, without the i- or the instance- [15:47:47] hashar I fixed ur groups too [15:47:55] so that you should be able to use meta to change right everywhere [15:48:23] and it doesn't even check if you gave a second argument [15:48:27] petan|wk: thanks [15:48:32] (and fails miserably) [15:49:51] paravoid: OK, and to make sure I'm totally clear on what's happening… the grep -v virt[6-8] is because virt6,7,8 are the new servers and virt1,2,3,4,5 the old ones. [15:50:06] yep [15:50:13] we're migrating from virt1-5 to virt6-8 [15:50:18] But all in pmtpa? [15:50:24] virt5 is already empty because we want to fill that up [15:50:25] yes [15:50:31] virt5-8 are Ciscos [15:50:43] 'k. [15:50:43] virt1-4 are smallish servers [15:50:48] Let's see what I can break! [15:51:08] virt1-4 has 48G of RAM, virt5-8 have 192GB [15:52:46] Hm… I fear that Ryan is actively coding on one or more of these instances. [15:52:57] ryan gave a go-ahead for all of his VMsw [15:53:04] I asked him yesterday [15:53:08] ok [15:54:24] paravoid: How are you picking $TOHOST? Round-robin? [15:54:40] yes [15:55:40] Doing any checking to make sure the instances survive the operation? [16:01:00] not really :) [16:01:02] "Failed to create instance as the host could not be added to LDAP." [16:01:10] what's wrong? [16:01:24] ldap works... [16:02:27] paravoid: And we're expecting labconsole to display the old (now wrong) hostname, right? [16:02:50] andrewbogott: but ldap will be correct? [16:03:27] jeremyb: Hm, no idea. [16:03:30] andrewbogott: yes, Ryan was telling me that you're involved into something that will fix this or soemthing? :) [16:03:45] jeremyb: I don't think LDAP keeps instance<->node relationships [16:03:52] paravoid: Yeah, although it will work best in folsom :( [16:05:58] paravoid: huh, i guess so [16:06:10] folsom sounds like distant future? [16:08:42] yeah [16:08:47] we're about to upgrade to essex [16:08:59] although I surely hope the essex->folsom transition will be easier [16:09:17] it seems so, considering we'll be using keystone and the openstack api [16:09:33] but who knows, maybe we'll other kind of transitions then, like the networking stuff [16:18:28] andrewbogott: can you display as a semantic ask result whether or not someone is a sysadmin/netadmin for a given project ? [16:18:49] andrewbogott: btw, nova-manage vm list is remarkably slow (there's a launchpad bug about it), so I usually do nova-manage vm list > nova-is-slow [16:18:53] and then grep through that [16:19:13] https://labsconsole.wikimedia.org/w/index.php?title=Special:Ask&q=%5B%5BResource+Type%3A%3Aproject%5D%5D%5B%5Bmember%3A%3AUser%3AAndrew+Bogott%5D%5D&p=format%3Dbroadtable%2Fheaders%3Dshow%2Fmainlabel%3D-2D%2Flink%3Dall%2Fsearchlabel%3Dprojects%2Fclass%3Dsortable-20wikitable-20smwtable&po=%3F%0A%3FMember%0A%3FDescription%0A&eq=no [16:19:55] paravoid: btw, do you subscribe to all wnpp? [16:20:03] jeremyb: of course not, are you crazy? :) [16:20:08] I just read debian-devel [16:20:17] hah, ok, ok ;) [16:21:53] jeremyb: Should be possible but I'm not familiar with semantic search at all… [16:23:32] andrewbogott: well if you edit the project resource page it lists the users but not the type of relationship. 
(only that they are a member) [16:24:52] jeremyb: This page has roles… https://labsconsole.wikimedia.org/wiki/Special:NovaProject [16:25:19] andrewbogott: i know but that doesn't allow viewing roles for projects you're not a member of [16:25:25] But I suppose only for… yeah, what you said. [16:26:11] basically i wanted to see what [[special:novaproject]] would look like if i were user X [16:26:26] we used to have no filter at all and you could see all projects [16:30:29] paravoid: I get an instance lacking a DNS entry in pmtpa.wmflabs domain. Should I assign the bug to you ? :) https://bugzilla.wikimedia.org/show_bug.cgi?id=38846 [16:31:18] jeremyb: I agree that it's probably not useful to keep project roles secret from non-members. Probably worth logging a bug about it. [16:31:45] And, actually, I don't know if those roles are literally secret or just filtered to save space… that 'list projects' page is kind of a real-estate disaster in any case. needs work. [16:32:37] jeremyb: Sorry I'm not being very helpful :( [16:32:41] andrewbogott: it's nothing to do with space. it's all about perf [16:32:59] but in effect it then made some info secret [16:33:02] ish [16:33:49] andrewbogott: anyway, separate issue: that data should be exposed to semantic queries i think [16:34:05] I agree. [16:34:05] hashar: sigh. sure, although I'd have to ask Ryan [16:34:09] but I can be your tier-1 [16:34:14] paravoid: assigned :-] [16:34:39] paravoid: the instance creation just failed to create the DNS entry :-D I would fix it myself if I had the proper credentials :-] [16:38:11] paravoid: also apparently Gluster has not been updated on labs :/ [16:38:27] need to poke Ryan about it, he had a slot last night [16:38:40] yeah, he didn't keep me in the loop for that [16:39:12] Wednesday, August 1, 22:00-23:00 UTC (3pm-4pm PDT): Labs GlusterFS project storage upgrade [Ryan Lane] [16:39:38] maybe he faced a conflict with something else [16:39:55] * jeremyb wonders what this says: Οὔτε τι τῶν ἀνθρωπίνων ἄξιον ὂν μεγάλης σπουδῆς [16:40:05] it' [16:40:08] it's ancient greek [16:40:26] * jeremyb pulled it out of a chinese guy's mail footer [16:40:41] it's Plato [16:40:50] according to Google [16:41:04] "No human thing is of serious importance." [16:41:05] oh, google! of course i should have asked google [16:45:48] petan: d'you mind if nova-dev3 shuts down for a bit? [16:56:09] that's np [16:56:15] andrewbogott ok [16:58:35] petan|wk: I may actually just do all your non-bots instances in a lump, if that's OK with you. Will it ruin anyone's day to have the deployment-prep VMs reboot? [16:59:27] Just don't break them [16:59:28] :) [17:27:37] paravoid: Is it normal for migrations of big instances to take minutes rather than seconds? This last one is taking orders of magnitude longer than any of the earlier ones. [17:27:59] (And I can't immediately think of a good way to measure progress) [17:28:00] andrewbogott hold on [17:28:24] andrewbogott which instances we talk about now and what are you going to do with them? [17:29:00] nova-dev3 is an instance or not? are you talking about the nova compute node? [17:29:12] petan|wk: Right now I'm doing 000000d0. I'm just migrating them to faster hardware; it will just cause a few minute outage. [17:29:20] aah [17:29:25] nova-dev3 is a VM that claims to have been created by you. 
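
Pulling together the warnings about the migration script (no error handling, wants the bare 0000NNN number without the i- or instance- prefix, and nova-manage vm list is slow enough to be worth caching), a thin wrapper along these lines is the sort of thing one might put around it. cold-migrate is an internal script, so its interface — bare instance number followed by a target host — is assumed here from the conversation only.

#!/bin/bash
# Hypothetical wrapper around the internal cold-migrate script,
# based only on the interface described above.
set -e

if [ $# -ne 2 ]; then
    echo "usage: $0 <instance-id> <target-host>" >&2
    exit 1
fi

# Accept i-0000NNN or instance-0000NNN and strip the prefix,
# since cold-migrate wants only the bare number.
id="${1#i-}"
id="${id#instance-}"
target="$2"

# nova-manage vm list is remarkably slow, so cache it once and grep the cache.
cache=/tmp/nova-vm-list
[ -s "$cache" ] || nova-manage vm list > "$cache"

grep -q "$id" "$cache" || { echo "no such instance: $id" >&2; exit 1; }

cold-migrate "$id" "$target"
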
[17:29:36] hm, we should probably announce this in a mail next time [17:29:46] so that the people who run their bots know [17:30:00] I can tell you it's ok for me, but I don't know about the rest of people [17:30:21] I do not operate their bots, I don't even know how to bring them back [17:30:28] petan|wk: We're going to do bots last, and do some extra dilligence before migrating them, since they're the most likely to suffer from outages. [17:30:32] we really need to have a documentation base [17:30:39] ok [17:30:42] That is, 'instances in the bots project' [17:31:00] so are you going 000000000++ or randomly? [17:31:12] you said 00000d0 is now [17:31:47] bots-1 is 00a9 [17:32:04] so, I suppose you skipped it for now [17:32:39] Maybe bot authors should have init.d scripts and links in rc.d. [17:32:44] petan|wk: Not really doing them in numerical order, trying to do them per-project or per-owner so that I can notify people in IRC when possible. [17:32:46] petan|wk: we announced this last week [17:32:51] all my bots are on bots-1 and 2 so when wm-bot die, I know what's going on :) [17:33:08] Ryan_Lane ok (I knew about it so I didn't really check) [17:33:24] Ryan_Lane did you announce estimate times and instances that are affected? [17:33:44] all instances [17:33:47] ok [17:33:51] andrewbogott go ahead [17:33:55] and over the course of last week and this week [17:34:01] k [17:34:02] true [17:34:05] I remember [17:34:09] you said that last wekk [17:34:11] week [17:34:13] Ryan_Lane: hi :) have you rescheduled the glusterFS upgrade? [17:34:31] not yet. I'm actually worried that it does actually require downtime [17:34:37] we were on a pretty old beta [17:36:00] doh [17:36:14] that kind of block deployement on beta since it relies on git to get changes [17:36:29] though I can switch everything to be owned by root and requires user to do sudo git [17:36:29] I'm asking the gluster people now [17:37:11] great :) [17:38:44] going to play with my daughter, will be back later on :D [17:41:45] good news, we're past the point of needing downtime [17:53:39] ok. I'm going to reschedule the gluster upgrade for today [17:55:28] Bleh. Where's the docs about using the SAL for labs? [17:58:02] petan|wk: ^ [17:59:13] Reedy I don't know if we have any docs [17:59:15] !sal [17:59:15] https://labsconsole.wikimedia.org/wiki/Server_Admin_Log see it and you will know all you need [17:59:25] !logging [17:59:25] To log a message, use the following format: !log [17:59:31] Reedy that's all we have [17:59:38] The second one is what I wanted : [17:59:41] Thanks [17:59:45] yw [18:00:02] !sal is !logging [18:00:02] This key already exist - remove it, if you want to change it [18:00:11] ah [18:00:12] heh [18:00:36] Wikidata were asking [18:03:00] our documentation could use some love [18:03:40] WikiLovesDocuments?!? [18:03:44] Life could use some love too [18:04:17] hashar, petan|wk : I've now migrated all of deployment-prep. Y'all might want to give it a once-over and make sure things are up and running. [18:04:33] (One or two of the VMs might still be booting.) [18:05:01] https://labsconsole.wikimedia.org/w/index.php?title=Server_Admin_Log&diff=5331&oldid=2554 [18:05:02] preilly: you mean wikisource? [18:05:33] !log [18:05:44] preilly: as long as we can tie toolserver to it [18:05:46] ok [18:06:17] !log deployment-prep Migrated all instances to new hardware [18:06:20] Logged the message, Master. [18:15:54] andrewbogott: congratulations :-) [18:19:02] hashar: Does that mean it still works? 
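
On the "bot authors should have init.d scripts and links in rc.d" idea above, a minimal skeleton is enough for a bot to come back on its own after a reboot or migration. Everything here — the bot name, path and account — is made up for illustration.

#!/bin/sh
### BEGIN INIT INFO
# Provides:          mybot
# Required-Start:    $network $remote_fs
# Required-Stop:     $network $remote_fs
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: example labs bot
### END INIT INFO
# /etc/init.d/mybot -- hypothetical wrapper so a bot restarts itself
# when an instance reboots (e.g. for a migration).

DAEMON=/data/project/mybot/run.sh   # placeholder path
PIDFILE=/var/run/mybot.pid
RUNAS=mybot                         # placeholder account

case "$1" in
  start)
    start-stop-daemon --start --background --make-pidfile \
        --pidfile "$PIDFILE" --chuid "$RUNAS" --exec "$DAEMON"
    ;;
  stop)
    start-stop-daemon --stop --pidfile "$PIDFILE" --retry 10
    ;;
  restart)
    "$0" stop
    "$0" start
    ;;
  *)
    echo "usage: $0 {start|stop|restart}" >&2
    exit 1
    ;;
esac

# Enable at boot with:  update-rc.d mybot defaults
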
[18:19:07] oops [18:36:08] Ryan_Lane: Failed to allocate new public IP address ? [18:37:02] Jarry1250: there is a quota per project on the IPs [18:37:21] mutante: Yes, Ryan just fulfilled my request on that (I think) [18:37:45] Jarry1250: ok, that was the one for "translatesvg"? [18:38:08] Yup (kept accidentally emailing the list from the wrong address, grr) [18:40:33] Jarry1250: i just confirmed you have a quota of 1 on that project [18:41:28] mutante: None listed, still failed to allocate (I'm a netadmin) [18:41:49] Jarry1250: the project is that? or the instance? [18:42:34] Well I'm looking at Special:NovaAddress, I assume that's a list of instances (I have the same name for each) [18:42:53] Then I'm doing "Allocate IP" [18:43:00] Confirm -> fail. [18:46:33] * jeremyb watches Ryan_Lane running in circles [18:46:50] Jarry1250: gimme a sec [18:46:57] maybe I need to add IPs to the pool [18:47:08] yep [18:47:13] Ryan_Lane: No problem [18:47:52] Jarry1250: it'll work now [18:48:16] Thanks! Works like a charm. [18:49:52] great [18:51:07] Well, the IP assignment worked like a charm. Now to work out why it's timing out *potters around* [18:51:43] * Madman|busy is happy to putter away on his development wiki hidden from the outside world. :D [18:53:51] btw, I noticed there was a slight problem with the certs manifest in that it'd fail if the symlinks it attempted to create (for the cert fingerprint) already existed. [18:53:57] Jarry1250: if you are trying to ping that would be normal, ICMP isn't allowed (by default) [18:54:14] But it didn't seem to break anything depending on it, just my OCD, and it was easily fixed. :) [18:56:21] Change on 12mediawiki a page Developer access was modified, changed by Nkansahrexford link https://www.mediawiki.org/w/index.php?diff=568458 edit summary: [18:57:07] !log pediapress Migrated all instances to new hardware [18:57:08] Logged the message, Master. [19:04:14] mutante: No, actually, I just literally shoved it into my browser [19:04:26] It was working over proxy before (can't test right now) [19:04:59] Jarry1250: check "manage security groups", most likely you are lacking rules to allow 80 and 443 [19:05:18] Nope, allowing both [19:05:32] (over tcp) [19:06:02] ...sigh. [19:07:03] Jarry1250: hmm, the webserver might need a restart to listen on the new IP [19:07:26] mutante: Ah, okay. == reboot instance? [19:08:37] Jarry1250: /etc/init.d/apache restart [19:08:49] kk, will try that [19:09:08] doesn't that whinge at you that you should use service apache restart? [19:10:12] /usr/sbin/apachectl graceful :P [19:10:17] whichever works [19:11:17] reboot [19:11:17] :D [19:13:11] mutante: /usr/sbin/apachectl graceful seems to have run successfully (as root) -- no output -- but no change, still timing out [19:14:27] What interfaces/ips etc is apache listening on? [19:15:51] Reedy: According to ports.conf, 80, 443 [19:16:04] !log reportcard Migrated all instances to new hardware [19:16:05] Logged the message, Master. [19:16:15] Well, 443 conditional [19:17:06] And your VirtualHost entry? [19:17:13] ? [19:20:08] Reedy: That's supposed to be in ports.conf? [19:20:17] no, in sites-enabled [19:20:37] I'm guessing swift is supposed to have bound the ip etc to teh machine? [19:20:59] huh? 
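
While the web-server angle is being chased here, two quick checks usually settle whether apache is the problem at all (the actual fix turns out further down to be associating the floating IP, which the instance itself never sees because public addresses are NATed):

# Is apache actually listening on 80/443 on the instance?
sudo netstat -tlnp | grep -E ':(80|443) '

# Which VirtualHosts did it load, and from which files?
sudo apache2ctl -S

# Since the public IP is NATed, test from the instance itself first:
curl -sI http://localhost/ | head -1
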
[19:21:23] Reedy: Ah, right, then a load of other bits [19:21:29] (Inside that tag, I mean) [19:23:33] Reedy: http://pastebin.com/bt8Mpzir if it helps [19:25:20] jeremyb: we can either have a database service, or we can just extend puppet to install mysql with credentials [19:25:25] jeremyb: inside of a container [19:25:36] where the user doesn't have shell access [19:26:07] what is "database service"? [19:26:15] like a compute service [19:26:22] "create a database like this for me" [19:26:28] are you considering reddwarft? [19:26:32] reddwarf* [19:26:33] not anymore [19:26:48] now it's a generic PaaS [19:26:55] it doesn't offer us anything [19:26:59] huh [19:27:24] we could use PaaS for bot labs [19:27:35] I can think of easier ways to handle that [19:27:51] a sane deployment system is mostly what we need for bots [19:28:32] not just for irc bots or wiki bots [19:28:35] but for web apps [19:28:46] i was mostly thinking web apps [19:29:04] !log incubator Migrated all instances to new hardware [19:29:04] I don't think a PaaS will help much with that either [19:29:05] Logged the message, Master. [19:29:10] most PaaS lock you into a language [19:29:20] the ones that don't are just deployment systems ;) [19:29:56] well i thought most have a choice of langs but not unlimited options for langs [19:30:03] if they had enough choices... [19:30:25] they usually lock you into frameworks too [19:30:46] it would likely be a decent amount of work to integrate a PaaS [19:30:51] since it needs api access [19:31:05] and we need to finish some dev work to make api access doable [19:31:49] * Damianz PaaS's on Ryans SaaSness [19:31:55] heh [19:32:03] huh, never heard of this trystack thing before [19:32:27] Bots needs lots of puppet, chunks of mysql, a decent scheduler/job running thing and a half crazy queue system [19:32:38] jeremyb: it's interesting [19:32:43] it lets you program against the apis [19:33:50] !performance [19:34:38] !webscale [19:36:28] Ryan_Lane: anyway, not seeing where it says that it's morphed like that [19:36:33] into PaaS [19:36:51] jeremyb: they did a talk at the last summit [19:37:22] !log ganglia Migrated all instances to new hardware [19:37:24] Logged the message, Master. [19:37:29] and they never wrote it down? ;( [19:37:42] heh [19:37:50] that project has been flailing for ages now [19:40:05] Ryan_Lane: so what about prod replicas? that's separate from the DBaaS question? [19:40:13] separate, yes [19:40:18] likely just bare metal [19:40:26] when? ;-) [19:40:45] either 2 months or 4-6 months [19:40:48] Just give us the root details to the production boxes, promise not to break anything? :D [19:41:07] back in a bit [19:41:10] getting lunch [19:41:31] Ryan_Lane: i've been saying i'd get lunch for the last ~90 mins... [19:41:40] lunch? [19:41:42] * jeremyb shoudl do it [19:41:44] should* [19:41:44] I might head home for the day [19:49:01] <^demon> Ryan_Lane: I suddenly lost access to my gerrit instance. [19:49:18] <^demon> I can ssh to bastion but not to the instance. [19:49:32] <^demon> Back, odd. [19:50:13] ^demo: I'm migrating instances right now, maybe I just moved yours. [19:50:19] um… ^demon: that is [19:52:03] <^demon> Ok, everything seems fine :) [19:56:40] Odd, etherpad-lite just restarted for no apparent reason [20:01:15] marktraceur: That was me! Just migrated you to faster hardware. [20:01:34] Reedy: I take it there was nothing obvious in that earlier pastebin of mine? 
[20:02:05] the first line was the interesting one [20:02:32] Right you are :) And that was as it should be? [20:09:07] andrewbogott: Well gee! Thanks much! [20:09:59] marktraceur: Ryan sent a warning to the list that instances would be rebooting this week, but it's taking a long time to get through the list. Hope I didn't cause you to lose any work. [20:10:40] andrewbogott: No, I had just forgotten about the warning I guess [20:11:10] I guess I should expect the other one to go down relatively soon as well [20:11:58] 'relatively' => 'I touch it once every two weeks or so, it will be sooner than that' [20:12:44] Hmm, there's just no sign of my attempts to load the public IP in any log I can find [20:17:30] mutante: I also opened ICMP in my security group, but pings time out too. Any ideas? [20:21:33] Jarry1250: you didn't use default? [20:21:38] you should always use default [20:22:01] I really need to mod the interface to require it [20:22:20] there's situations where you wouldn't want to use it, but they are rare [20:22:57] Oh, the instance is in both [20:23:05] then ping and such should work [20:23:09] Jarry1250: oh [20:23:14] I just figured from mutante's earlier comment that the rules didn't inherit or something [20:23:18] did you associate the public Ip with the instance? [20:23:41] it won't show up on the system's logs [20:23:44] Yes, via Special:NovaAddress anyway [20:23:48] public IPs are mapped via NAT [20:23:55] the instance doesn't know about its own IP [20:25:38] kk, so what next to look at? [20:26:10] is the public IP not responding to ping? [20:26:31] paravoid, the wiki on the 'mwreview' instance doesn't seem to have survived the migration, and I don't know enough to usefully debug the issue. Are you awake enough to have a look? [20:27:09] Ryan_Lane: /facepalm [20:27:27] andrewbogott: was it one of the corrupted ones [20:27:28] ? [20:27:33] Jarry1250: ? [20:27:41] I'd added but not associated it [20:27:46] ah [20:27:47] *allocated [20:27:54] yeah. allocating just adds it to your project [20:27:55] Ryan_Lane: I don't know. I'm pretty sure that it was working properly before I migrated it. (Or, at least, the front page came up OK.) [20:28:02] associating lets you map it to a specificinstance [20:28:10] andrewbogott: what's the instance id? [20:28:53] I can walk you through my troubleshooting process as I do it [20:29:40] Ryan_Lane: Yeah, I was confused by my having the same identifier for each [20:29:48] * Ryan_Lane nods [20:29:50] Ryan_Lane: I-000002ae [20:29:52] It's probably best for newbies like me to differentiate [20:30:11] Jarry1250: which identifiers? [20:30:14] allocate and associate? [20:30:18] !addresses [20:30:18] https://labsconsole.wikimedia.org/wiki/Help:Addresses [20:30:29] those docs walk you through the process [20:30:31] kind of [20:30:39] andrewbogott: it isn't in the corrupted list [20:30:50] I'm going to go on virt0 and run: nova-manage vm list | grep 2ae [20:31:01] that tells me which host its on [20:31:07] virt8 [20:31:14] Ryan: No, instance and project names [20:31:20] Jarry1250: ahhh ok [20:31:43] andrewbogott: by tailing its console log, I can it that it's up [20:31:50] Ryan_Lane: The system came up just fine… it's possible this same problem would've appeared if I rebooted it in place. [20:31:51] it pings [20:31:57] oh [20:31:58] The problem is this: http://mwreview.wmflabs.org/wiki/index.php [20:32:01] did it just take a long time? 
[20:32:03] ah [20:32:12] database is down [20:32:24] weird [20:32:28] database shows as up [20:32:46] I started mysql already. And I don't see the file that should correspond to the named database. [20:33:11] ah. I see the problem [20:33:13] apparmor [20:33:23] Oh? Can you explain further? [20:33:35] weird [20:33:43] it seems my.cnf was reconfigured [20:33:46] same with appamor [20:33:49] apparmor [20:33:55] the data directory was on /mnt [20:33:59] now it's /var/lib/mysql [20:34:29] Is it because rebooting caused puppet to start working after having been off for ages? [20:34:45] (I think Erik tinkered with that system a bit since I last touched it, so not sure what its state was.) [20:35:44] probably, yeah [20:35:49] I made a puppet change for the role recently [20:35:52] I wonder if I broke it [20:55:54] Ryan_Lane: Are you still tinkering with I-000002ae or shall I try to fix it? [20:56:08] andrewbogott: I think it's likely due to a puppet change I made [20:56:13] do you mind investigating it? [20:56:20] nope, I'll look. [20:57:29] I think it's the datadir => $::mysql_datadir ? { stuff [20:57:36] it appears to not work properly [21:01:49] WTF [21:01:59] Why is sudo git pull in /var/lib/git/operations/puppet not working for me [21:02:04] It gives me a publickey error [21:02:46] Ryan_Lane, ^demon ----^^ [21:02:57] sec [21:02:58] <^demon> Oh what instance? gerrit? [21:03:03] Yes [21:03:08] It seems to be using a dedicated username [21:03:10] origin ssh://labs-puppet@gerrit.wikimedia.org:29418/operations/puppet.git (fetch) [21:03:15] RoanKattouw: try now [21:03:22] Nope [21:03:23] <^demon> Ah, gerrit instance isn't puppetized properly anyway. [21:03:28] <^demon> So puppet won't do you much good. [21:03:31] I know that, but [21:03:35] RoanKattouw: I don't see an attemp [21:03:38] *attempt [21:03:39] It has puppetmaster::self and the gerrit classes enabled [21:03:48] RoanKattouw: you sure you are using 29418? [21:03:52] hm [21:03:53] <^demon> Yes, but they fail horribly when you run puppetd -tv [21:03:53] So while Timo edits the CSS, Puppet will periodically destroy his work [21:04:07] it says so in the origin line. heh [21:04:17] oohhhhh [21:04:21] this is the instance [21:04:30] not production [21:04:33] <^demon> RoanKattouw: Look at the last time puppet ran. Puppet won't overwrite his work. [21:04:34] ignore me [21:04:53] lol [21:04:53] <^demon> The manifest is too broken for it to complete. [21:16:20] WHOA WHOA WHOA [21:16:24] LABS IS WORKING AND THE SITE IS NOT?? [21:17:29] dschoon: -_- [21:17:47] maybe we should redirect traffic to labs/ [21:17:48] ? [21:18:17] Ryan_Lane: You aren't using the XML interface for nova are you? (I'm not even sure what that is, but nova folks are talking about ripping it out.) [21:18:26] nope [21:18:27] json [21:18:39] andrewbogott: they support xml and json [21:18:42] json being the default [21:18:50] Right, I think that they may move to just json. [21:18:57] dschoon labs are powered by volunteers, partially ^^ :P [21:19:24] that makes them more stable :D [21:19:57] * Damianz kicks dschoon [21:20:58] "Wikimedia Labs: powered by volunteers, set on fire Ryan_Lane" [21:21:03] <3 [21:21:11] set on fire by? [21:21:16] heh [21:21:31] andrewbogott: yeah, they do. I have no problems with that :) [21:21:48] it's easier to only have one format that always works, rather than one that always works and another that never works [21:21:57] * andrewbogott nods [21:58:06] Ryan_Lane: did you get any informations from the Gluster team regarding the upgrade ? 
:-) [21:58:13] yes [21:58:14] I can do it [21:58:34] and plan on doing so right about now [22:01:49] Ryan_Lane++ thanks [22:02:39] Ryan_Lane: would be awesome :-) [22:02:49] hm. I don't see how to add a package into two distros with reprepro [22:03:17] chrismcmahon: Matthias did a nice job at fixing AFTv5 :-) [22:03:33] chrismcmahon: he pointed a missing dependency: ClickTracking was not installed :-D [22:04:25] hashar: yep, I caught that, thanks [22:04:39] chrismcmahon: I also need to write some goals for 2012 / 2013 :/ [22:04:48] hadn't have the occasion to think about it [22:05:07] will end up rushing a few ideas I get tomorrow. Will send them to Rob and CC you [22:05:28] sounds good [22:09:31] Ryan_Lane: what ever the end result is with Gluster, can you please mail me the result ? Would be great :-) [22:09:37] well.... [22:09:52] we're running into issues importing the package into reprepro [22:43:30] out for bed, good luck Ryan_Lane with Gluster :) [22:43:38] hashar: it's updating right now [22:53:15] Change on 12mediawiki a page Developer access was modified, changed by Burthsceh link https://www.mediawiki.org/w/index.php?diff=568543 edit summary: [23:32:45] hm. though keystone is kind of a pain in the ass in ways, it's also nice in some ways too. it keeps a service registry [23:32:57] so, you request the endpoints for all services and all regions [23:33:08] it'll give you back the service and all urls for contacting them [23:34:01] the application on the user side only needs to know the identity url, and it can get everything else from it [23:35:25] which means we can also completely shard per region. the only downside is that we'll need do to twice as many queries
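
Two of the loose ends above, sketched rather than prescribed. First, the reprepro question: the usual way to get one package into two distributions is to run the include once per codename, or include it into one and copy it across — repository path, codenames and filename here are placeholders.

reprepro -Vb /srv/apt includedeb lucid mypackage_1.0_amd64.deb
reprepro -Vb /srv/apt includedeb precise mypackage_1.0_amd64.deb
# or, after including into one distribution only:
reprepro -Vb /srv/apt copy precise lucid mypackage

Second, the keystone service registry described at the end: with the v2.0 API, a client that knows only the identity URL can request a token and read the full catalogue from the reply. Hostname, credentials and tenant below are placeholders.

# Request a token; the response carries access.serviceCatalog, a list of
# services each with per-region publicURL/internalURL/adminURL endpoints.
curl -s -H 'Content-Type: application/json' \
    -d '{"auth": {"tenantName": "someproject",
                  "passwordCredentials": {"username": "someuser",
                                          "password": "somepass"}}}' \
    http://keystone.example.org:5000/v2.0/tokens | python -mjson.tool
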