[00:01:02] Ryan_Lane: Consider this ldap entry: ('dc=wikimedia-lb,dc=wmflabs,ou=hosts,dc=wikimedia,dc=org', {'objectClass': ['domainrelatedobject', 'dnsdomain', 'domain', 'dcobject', 'top'], 'aRecord': ['208.80.153.193'], 'associatedDomain': ['wikimedia-lb.wmflabs.org', 'commons.wmflabs.org', 'meta.wmflabs.org', 'test.wmflabs.org'], 'dc': ['wikimedia-lb']}) [00:01:19] If I delete the host 'commons.wmflabs.org' then it becomes this: [00:01:33] ('dc=wikimedia-lb,dc=wmflabs,ou=hosts,dc=wikimedia,dc=org', {'objectClass': ['domainrelatedobject', 'dnsdomain', 'domain', 'dcobject', 'top'], 'aRecord': ['208.80.153.193'], 'associatedDomain': ['wikimedia-lb.wmflabs.org', 'meta.wmflabs.org', 'test.wmflabs.org'], 'dc': ['wikimedia-lb']}) [00:01:34] back [00:02:02] But what if I delete the host 'wikimedia-lb.wmflabs.org'? What does it become then? [00:02:10] (sorry for the giant pastes, all) [00:02:43] yeah. the object would need to be renamed [00:02:44] OrenBochman: i'm fairly certain the right solution is an HTTP cache [00:02:54] * jeremyb is busy watching a debian developer play jeopardy [00:02:55] renamed to what? Can I just pick an arbitrary other host name? [00:02:57] andrewbogott: any of the others is fine [00:03:05] Huh. OK. [00:03:28] maybe there's some better way I'm not thinking of [00:03:40] I wonder if I even handle that situation properly. heh [00:03:56] they are all a records in that situation [00:04:05] so, it doesn't actually matter which one is the entry [00:05:57] Yeah, it won't break any of my existing logic to just use the first remaining name. [00:25:10] andrewbogott: cool. [00:25:14] wow that was a late response [00:25:32] andrewbogott: did you want to discuss functionality currently in OSM, and some that would be useful for the future? [00:25:34] regarding DNS [00:25:53] Yes, but... tomorrow ok? [00:25:56] yep [00:34:03] PROBLEM Free ram is now: CRITICAL on deployment-web deployment-web output: Connection refused by host [00:35:13] PROBLEM HTTP is now: CRITICAL on deployment-web deployment-web output: Connection refused [00:35:23] PROBLEM Total Processes is now: CRITICAL on deployment-web deployment-web output: Connection refused by host [00:36:13] PROBLEM dpkg-check is now: CRITICAL on deployment-web deployment-web output: Connection refused by host [00:36:53] PROBLEM Current Load is now: CRITICAL on deployment-web deployment-web output: Connection refused by host [00:37:06] Rage. [00:37:33] PROBLEM Current Users is now: CRITICAL on deployment-web deployment-web output: Connection refused by host [00:38:13] PROBLEM Disk Space is now: CRITICAL on deployment-web deployment-web output: Connection refused by host [00:39:45] new instances always show up poorly in nagios :) [00:39:57] nagios pulls in the new instance faster than it comes up [00:40:00] I was raging about #mediawiki ;) [00:50:06] ah [01:01:08] deployment ugh [01:02:50] hexmode: ? [01:02:53] Ryan_Lane: so I just saw the above and now deployment-web isn't taking my key [01:03:08] deploment-sql is, though :P [01:03:35] Thought the disk would be filling up on -sql, not -web [01:03:48] hah. damn. 
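Going back to the 00:01–00:05 exchange about dropping a host name from that entry: a minimal sketch of the delete-and-rename logic with python-ldap. The DN and attribute values are the ones from the paste above; the function name, server URI, and bind credentials are illustrative, as is the assumption that the dc value is always the first label of the FQDN backing the RDN (python-ldap 3.x additionally wants attribute values as bytes).

    import ldap

    def delete_associated_host(conn, dn, entry, hostname):
        """Drop one associatedDomain value; rename the entry if that value
        is the one the RDN ('dc') is derived from."""
        remaining = [d for d in entry['associatedDomain'] if d != hostname]
        if not remaining:
            conn.delete_s(dn)  # nothing left, drop the whole host entry
            return
        conn.modify_s(dn, [(ldap.MOD_DELETE, 'associatedDomain', [hostname])])
        if hostname.split('.')[0] == entry['dc'][0]:
            # The deleted name backed the RDN; rename to any remaining name.
            # They are all A records here, so which one becomes the RDN
            # does not matter.
            conn.rename_s(dn, 'dc=%s' % remaining[0].split('.')[0], delold=1)

    # usage sketch (URI and credentials are placeholders):
    # conn = ldap.initialize('ldap://virt1.wikimedia.org')
    # conn.simple_bind_s('cn=admin,dc=wikimedia,dc=org', 'secret')
    # delete_associated_host(conn,
    #     'dc=wikimedia-lb,dc=wmflabs,ou=hosts,dc=wikimedia,dc=org',
    #     entry, 'wikimedia-lb.wmflabs.org')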
I deleted a message I thought was unused [01:04:12] 01/05/2012 - 01:04:12 - Creating a home directory for laner at /export/home/deployment-prep/laner [01:04:16] * Ryan_Lane looks [01:04:59] you OOM'd it :D [01:05:04] :) [01:05:13] 01/05/2012 - 01:05:12 - Updating keys for laner [01:07:39] hexmode: you can reboot it [01:07:42] via the interface [01:07:53] ah, didn't know that [01:07:55] ty [01:13:13] RECOVERY Disk Space is now: OK on deployment-web deployment-web output: DISK OK [01:14:03] RECOVERY Free ram is now: OK on deployment-web deployment-web output: OK: 90% free memory [01:15:13] RECOVERY HTTP is now: OK on deployment-web deployment-web output: HTTP OK: HTTP/1.1 200 OK - 542 bytes in 0.007 second response time [01:15:23] RECOVERY Total Processes is now: OK on deployment-web deployment-web output: PROCS OK: 107 processes [01:16:13] RECOVERY dpkg-check is now: OK on deployment-web deployment-web output: All packages OK [01:16:53] RECOVERY Current Load is now: OK on deployment-web deployment-web output: OK - load average: 0.11, 0.11, 0.05 [01:17:33] RECOVERY Current Users is now: OK on deployment-web deployment-web output: USERS OK - 1 users currently logged in [01:24:26] New patchset: Ryan Lane; "We aren't using nova-volume right now, and it throws errors since it's not configured. Remove it." [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1788 [01:24:39] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1788 [01:24:47] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1788 [01:24:48] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1788 [01:27:23] RECOVERY Total Processes is now: OK on deployment-sql deployment-sql output: PROCS OK: 79 processes [01:27:53] RECOVERY Disk Space is now: OK on nova-production1 nova-production1 output: DISK OK [01:28:13] RECOVERY dpkg-check is now: OK on deployment-sql deployment-sql output: All packages OK [01:28:53] RECOVERY Current Load is now: OK on deployment-sql deployment-sql output: OK - load average: 0.21, 0.20, 0.11 [01:29:33] RECOVERY Current Users is now: OK on deployment-sql deployment-sql output: USERS OK - 0 users currently logged in [01:30:13] RECOVERY Disk Space is now: OK on deployment-sql deployment-sql output: DISK OK [01:31:03] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 83% free memory [01:34:17] New patchset: Ryan Lane; "Adding requires to all services." [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1790 [01:42:55] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1790 [01:42:55] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1790 [04:13:19] yah hi! 
[04:15:06] yay [04:15:27] * jeremyb accidentally mentioned the channel name in #-ops and a troll came by ^^ [04:15:43] when typing /wg #wikimedia-labs :) [04:16:07] anyway, troll gave up i guess [13:54:13] PROBLEM Current Load is now: WARNING on bots-cb bots-cb output: WARNING - load average: 1.90, 21.53, 14.59 [14:14:13] RECOVERY Current Load is now: OK on bots-cb bots-cb output: OK - load average: 0.20, 0.67, 4.22 [15:53:32] !log deployment-prep import now running using mwimport [15:53:33] Logged the message, Master [16:47:59] !log deployment-prep created instance for dumps, current instance is overloaded [16:48:00] Logged the message, Master [16:53:53] PROBLEM Current Load is now: CRITICAL on deployment-dbdump deployment-dbdump output: Connection refused by host [16:54:33] PROBLEM Current Users is now: CRITICAL on deployment-dbdump deployment-dbdump output: Connection refused by host [16:55:13] PROBLEM Disk Space is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [16:56:03] PROBLEM Free ram is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [16:57:23] PROBLEM Total Processes is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [16:58:13] PROBLEM dpkg-check is now: CRITICAL on deployment-dbdump deployment-dbdump output: CHECK_NRPE: Error - Could not complete SSL handshake. [17:05:13] PROBLEM host: deployment-dbdump is DOWN address: deployment-dbdump CRITICAL - Host Unreachable (deployment-dbdump) [17:35:23] PROBLEM host: deployment-dbdump is DOWN address: deployment-dbdump CRITICAL - Host Unreachable (deployment-dbdump) [18:05:23] PROBLEM host: deployment-dbdump is DOWN address: deployment-dbdump CRITICAL - Host Unreachable (deployment-dbdump) [18:12:15] this is new: [18:12:15] If you are having access problems, please see: https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances [18:17:33] RECOVERY host: deployment-dbdump is UP address: deployment-dbdump PING OK - Packet loss = 0%, RTA = 0.90 ms [19:06:48] Ryan_Lane: is there a way to get shared nfs / gluster for a project [19:07:04] it would make stuff easier [19:08:58] Ryan_Lane: Random question: how did you set things up such that the puppet lint check doesn't run for the mediawiki repo? I've been looking at puppet/files/gerrit/hooks but I don't see it happening there [19:09:16] (I was messing around a bit, trying to implement PHP and JS linting for the mediawiki repo) [19:14:07] petan: that's what we're going to do when we get the volume storage in [19:14:13] ok [19:14:32] RoanKattouw: it should be happening there [19:14:40] RoanKattouw: the hooks need to be modified to do this [19:14:58] There is one that exempts operations/private [19:15:01] yes [19:15:06] I couldn't find how mediawiki is exempted, offhand [19:15:09] I'd *really* like this to be configurable [19:15:11] But maybe I didn't look hard enough [19:15:12] it may not be [19:15:12] Well [19:15:14] I did *some* work [19:15:18] But I suck at Python [19:15:23] it may be running for all I know [19:15:29] It's not [19:15:31] I checked [19:15:36] I'll submit my WIP to gerrit [19:16:05] I dunno why it wouldn't be running, then [19:16:47] <^demon> RoanKattouw: Anything exciting in your gerrit wc? [19:16:57] what's WIP and wc? [19:17:06] <^demon> work in progress, working copy [19:17:06] work in progress? 
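On the 19:08–19:19 lint-hook question: a rough sketch of how the patchset-created hook's hard-coded lint_test(options, helper) call could be made configurable per project. The lint_test name is quoted from the chat; the EXEMPT_PROJECTS set, the options.project attribute, and the project names are assumptions about how the hook is wired, not the actual hook code.

    # Hypothetical fragment for the patchset-created hook.
    EXEMPT_PROJECTS = set([
        'operations/private',   # already skipped today
        'mediawiki',            # placeholder: would get php -l / JSLint instead
    ])

    def maybe_lint(options, helper):
        # options.project is assumed to carry gerrit's --project argument
        if options.project in EXEMPT_PROJECTS:
            return
        lint_test(options, helper)  # the existing call from the hook

A cleaner long-term shape is a per-repository mapping from project to a list of lint commands, which is roughly the direction Roan's change 1794 heads in.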
[19:17:08] ah [19:17:08] Work In Progress, working copy [19:17:11] :D [19:17:33] <^demon> Or water closet. [19:17:50] * RoanKattouw runs git push-for-review-production and watches it hang [19:18:13] ^demon: I've started some reorg that'll allow us to run php -l and JSLint for MW commits [19:18:29] <^demon> Good :) [19:18:41] WTF, is gerrit down or something? [19:18:43] <^demon> I'll be doing a fresh dump this week. [19:18:45] in patchset-created, I'd like to see the following line changed to be configurable: lint_test(options, helper) [19:18:50] <^demon> Gonna try and get tags this go round. [19:18:51] Yes [19:18:58] Ryan_Lane: My work sort of has that [19:19:09] But it looks like the gerrit server is out to lunch :( [19:19:13] oh? [19:19:38] doesn't seem so to me [19:19:46] <^demon> WFM. [19:20:02] Hmm, git pull/push hang for me [19:26:58] It seems to be SSH auth that hangs [19:27:07] Fetching https:// URLs works, fetching ssh:// URLs doens' [19:27:09] t [19:30:47] ah. crap. hanging for me too [19:31:59] lemme restart gerrit [19:34:08] RoanKattouw: working now [19:34:22] not sure why it hung [19:35:34] Yay, thanks [19:35:40] https://gerrit.wikimedia.org/r/1794 is what I did [19:35:52] I don't know enough about Python to follow through but I think it illustrates what I wanted to do [19:37:00] wow. wtf. [19:37:08] labsconsole was just unresponsive [19:39:13] server reached MaxClients setting [19:39:15] hmm [19:39:16] really? [19:40:00] ah. puppet. right [19:40:08] well, that's a problem [20:26:05] !log deployment-prep importing MW ns full history to en_wikipedia [20:26:06] Logged the message, Master [20:35:27] be careful importing too much :) [20:35:48] hmm [20:35:55] lemme see if the cisco servers have been racked [20:36:06] if so, maybe we can get some database stuff going [20:36:18] I guess we should do the initial work in instances before deploying it, though [20:36:32] which? [20:36:38] ideally we'd have an API service that write into a queue [20:36:47] petan, we got some donations [20:36:50] then the database service would pick up messages from the queue [20:36:59] :o [20:37:39] the api should authenticate via accesskey/secret key [20:37:46] and authorization should be based on projects [20:37:57] we should have project quotas too [20:38:08] project A can create x databases [20:38:26] can use x space, etc [20:38:44] I wonder what resource limits mariadb has for this [20:41:45] definitely [20:43:34] ah. mysql has some [20:43:53] per-user [20:46:26] Change on 12mediawiki a page Wikimedia Labs/status was modified, changed by 216.38.130.165 link https://www.mediawiki.org/w/index.php?diff=481723 edit summary: /* 2011-11-30 */ [20:46:27] Change on 12mediawiki a page Wikimedia Labs/status was modified, changed by 216.38.130.165 link https://www.mediawiki.org/w/index.php?diff=481723 edit summary: /* 2011-11-30 */ [20:46:46] hmm. seems there is no size limit [20:46:48] so... [20:47:07] I guess a cron that checks database sizes, and enabled read-only mode when a database exceeds a quota [20:47:31] we could have a separate storage with some limit for each project where dbfiles would be located [20:48:18] that's slightly more difficult :) [20:48:34] then if we want to increase the quota, we need to increase the filesystem size [20:48:38] hm. [20:48:46] if it was in lvm [20:48:46] maybe filesystem quotas [20:48:54] though that's not terribly nice [20:48:58] but that's per user hm? 
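On the 20:37–20:54 quota discussion: a sketch of the cron job Ryan describes ("checks database sizes, and enable read-only mode when a database exceeds a quota"). The information_schema query and the REVOKE statement are standard MySQL; the MySQLdb driver, the fixed quota, and the one-grantee-named-after-the-schema convention are assumptions. MySQL's read_only flag is server-wide, so per-database "read-only" here really means revoking write privileges on that schema.

    import os
    import MySQLdb  # assumes the MySQLdb driver; pymysql is API-compatible

    QUOTA_BYTES = 2 * 1024 ** 3  # example quota: 2 GB per project database

    def enforce_quotas(conn):
        cur = conn.cursor()
        cur.execute("""
            SELECT table_schema, SUM(data_length + index_length)
            FROM information_schema.tables
            WHERE table_schema NOT IN ('mysql', 'information_schema')
            GROUP BY table_schema""")
        for schema, size in cur.fetchall():
            if size and size > QUOTA_BYTES:
                # Assumes each project schema has a single grantee of the same
                # name; leaving DELETE and DROP out of the REVOKE means the
                # owner can still free space to get back under quota.
                cur.execute(("REVOKE INSERT, UPDATE, CREATE, ALTER "
                             "ON `%s`.* FROM '%s'@'%%'") % (schema, schema))
                print("%s is over quota (%d bytes); writes revoked" % (schema, size))

    if __name__ == '__main__':
        enforce_quotas(MySQLdb.connect(
            read_default_file=os.path.expanduser('~/.my.cnf')))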
[20:49:01] it's nicer to set a database to read-only [20:49:17] running out of space really fucks mysql up [20:49:23] it'll corrupt itself [20:50:12] hm... [20:50:47] how would user remove data if it was read only [20:50:47] read-only is also really easy to set [20:50:55] ah. crap. [20:50:55] right [20:51:22] I wonder if we could use filesystem quotas for this [20:51:31] I don't think so [20:51:38] mysqld would need to run in different users [20:51:48] right. [20:52:08] what I want to avoid is one database stealing all the space of the others [20:52:27] sure [20:53:50] maybe mariadb can do that [20:53:58] doesn't look like it [20:54:47] ah [20:54:52] directory-tree quotas [21:05:31] hexmode: here? [21:25:47] bah. I forgot to live migrate instances back to virt4 [21:26:06] so nearly all instances are running on virt2 and virt3 [21:28:12] Ryan_Lane: please don't get deployment-sql down now :D [21:28:21] I'm not shutting anything down :) [21:28:29] it's called live-migration for a reason :) [21:28:34] yesterday some instances was rebooted [21:28:45] and that got broken import [21:28:45] they likely patched themselves [21:28:49] hm... [21:29:00] one of the deployment boxes OOM'd [21:29:00] Mark rebooted it [21:29:03] yes [21:29:09] probably import caused [21:29:12] OOM [21:29:22] it got it again when I restarted it [21:29:22] I can't help that :D [21:29:33] apergos did [21:29:34] ok. why the hell is nova-manage hanging? [21:29:34] :) [21:30:41] wow. weird [21:30:47] nova-compute service on virt2 is down [21:31:08] back up [21:31:42] and down again [21:31:43] wtf [21:36:16] Exercise [21:36:33] no clue what's wrong :( [21:36:43] whats wrong? [21:36:55] well, nothing from the perspective of you guys :) [21:37:04] but the nova-compute service on virt2 is down [21:37:15] if it is not an issue for customer. It is not an issue. [21:37:55] the only thing I know is that https://labsconsole.wikimedia.org/ is timing out [21:38:12] probvably related to whatever issue you are fighting against :\ [21:38:25] it is? [21:38:27] damn it [21:38:36] gerrit too [21:38:47] too many puppet connections [21:38:54] puppet and labsconsole are on same apache [21:38:55] well my git pull times out to [21:39:08] I don't know what the hell is wrong with gerrit :( [21:39:20] possible root cause: LDAP/ DNS [21:39:37] hmm. possible with gerrit [21:39:39] maybe ldap [21:39:42] (as a past network engineer: "It is not a network issue, ask the sysadmins" ) [21:39:59] ldap is indeed having issues [21:40:22] you really need a high availability LDAP service :D [21:40:24] wow. wtf. apache is having a ton of issues too [21:40:32] well, I'm working towards that [21:41:04] hmm [21:41:05] no [21:41:09] the entire system is having issues [21:41:24] I think I need to reboot virt1 [21:41:26] hmm [21:41:27] $ ssh gerrit.wikimedia.org // <-- close connection fast with a perm denied [21:41:28] Damn [21:41:31] is gerrit pointing at virt1? [21:41:35] it shouldn't be [21:41:36] but git pull on gerrit.wikimedia.org does not work [21:41:37] I was just going to say "Have you tried turning it off and on again?" [21:41:48] I'm getting kernel errors on virt1 [21:42:06] bah [21:42:15] I should point gerrit at the other LDAP servers [21:42:23] no reason it needs to be limited to just virt1 [21:42:32] http://gerrit.wikimedia.org/ works fine when unlogged :D [21:43:30] Your welcome :P [21:43:44] YAY [21:43:49] seriously? [21:44:40] btw jeremyb do you want +o here? 
[21:44:52] you talked about trolls or what [21:44:56] at morning [21:47:25] doesn't seem down [21:47:58] petan: sure. but consider me AFK for tonight :) [21:48:20] done [21:48:52] Ryan_Lane: gerrit git pull works again :D [21:48:57] yeah [21:49:00] virt1 is back up [21:49:07] I have no clue why it died [21:49:08] \o/ [21:49:27] !log testswarm Ryan is our hero [21:49:28] Logged the message, Master [21:49:45] what just happened ? [21:49:53] I have no clue [21:49:54] outage [21:49:55] :o [21:49:58] kernel errors on virt1 [21:50:04] well, outage-ish [21:50:05] :D [21:50:10] Krinkle: some french server gone on strike? [21:50:13] what kernel does it run [21:50:22] stock ubuntu lucid one [21:50:27] ah [21:50:34] that's not so bad for virt [21:50:50] oh great. now nova-compute is down on two nodes instead of just one [21:51:08] butI have some old kernel on one ubuntu box which really has troubles with vm's [21:51:21] I didn't update it because I would need to reboot it [21:51:36] I am still waiting for a good reason to do that :) [21:51:40] ah. it's back up on everything except virt2 [21:51:44] virt2 having issues is problematic [21:52:25] you know, rob gave me nodes in eqiad [21:52:37] I think at minimum I should be getting DNS and LDAP up there [21:52:40] and mysql [21:52:47] cool [21:52:48] PROBLEM Disk Space is now: WARNING on puppet-lucid puppet-lucid output: DISK WARNING - free space: / 59 MB (4% inode=35%): [21:53:08] PROBLEM Disk Space is now: CRITICAL on testpuppet testpuppet output: DISK CRITICAL - free space: / 0 MB (0% inode=16%): [21:53:09] ah nova-compute seems to be restarting properly on virt2 [21:53:47] of course, there's a billion instances, so this is gonna take a while :D [21:53:48] PROBLEM dpkg-check is now: CRITICAL on puppet-lucid puppet-lucid output: DPKG CRITICAL dpkg reports broken packages [21:53:53] heh [21:54:39] the too-many-clients issue with apache on virt1 was also due to the kernel issues [21:54:46] something was causing it to hang [21:55:44] Per deployment and testing of 1.9 on wmflabs: Might be an idea to provide a substantial list of tests to be performed by users. Lots of little tests, rather than a few big sheets of tests. So people can actively try the new software out, instead of maybe stumbling across a bug, or not. [21:55:54] hi [21:56:03] 1.19 sry [21:56:37] i think we have/had that for 1.18 [21:56:39] atm I am importing db there, so I think this will be sorted out by Mark later [21:56:47] no [21:56:52] it's for 1.19 [21:57:00] oh you mean a list [21:57:23] yup :p [21:57:26] Balck box for newbs [21:57:42] non technical things to try [21:58:03] FredGandt: my message was primarily focused on people who are running various user scripts etc [21:58:22] there are many of them and it sometimes happen that these get broken with mw update [21:58:53] user scripts is something we are importing only on simple wiki clone [21:58:59] https://www.mediawiki.org/wiki/Category:MediaWiki_test_plans [21:59:09] https://www.mediawiki.org/wiki/MediaWiki_manual_smoke_test [21:59:15] https://www.mediawiki.org/wiki/User:Sumanah/1.18_test_cases [21:59:23] FredGandt: ^ [22:00:21] Well it was just a thought. Focussed testing by a large number could accomplish what whimsical testing by a larger number might not. Red names! must learn how to use IRC [22:01:55] Cereal finished, off to make cups of tea. See ya! [22:02:02] :) [22:02:24] cool. 
nova-compute on virt2 is ok now [22:02:31] now back to migrating instances :) [22:02:42] the deployment labs enwiki seems to be misconfigured [22:03:58] definitely [22:04:04] hm. I wonder if everything started dying because of ldap failing [22:04:05] Danny_B|backup: I am in process of db import [22:04:14] after that I will sort out configuration [22:05:23] Ryan_Lane: what all is crashing now [22:05:33] instances seems to be running ok [22:06:49] nothing is crashing now [22:07:01] only thing that happened was ldap timing out temporarily [22:07:10] and that doesn't actually cause major problems [22:07:17] ah [22:07:18] well, it causes problems till the ldap server comes back up [22:07:28] why is the current structure used? [22:07:28] with a secondary server, even that wouldn't happen [22:07:34] i think it's not the best idea [22:07:42] Danny_B|backup: which structure you mean [22:07:52] labs should mirror real wikis as much as possible [22:08:03] this is not a final version [22:08:12] it's just a quick wiki set up before deployment [22:08:19] so there should be eg. en.wikipedia.wmflabs.org/wiki/ [22:08:20] there is a proposal for better way to clone them [22:08:33] so the paths will be same [22:08:44] only the domain will be different [22:08:46] hm... [22:09:07] thus all scripts which depend on paths will work [22:09:24] that wouldn't be actually hard to set up, but I don't think it's necessary I would need to talk with others about it [22:09:48] but yes that's a good point [22:10:02] if we give the instance a public IP address, it can have *.wikipedia.wmflabs.org DNS entry associated with it [22:10:24] * Danny_B|backup would even prefer labs.en.wikipedia.org [22:10:29] no [22:10:32] definitely not [22:10:34] security issue [22:10:36] but i know there was some issue with that [22:10:44] we registered a different domain just for this reason :) [22:10:45] tracking cookies etc [22:11:06] Ryan_Lane: yup, i remember our discussion about the domain structure for labs in haifa [22:11:25] crap. I can't delete * dns entries? :( [22:11:26] heh [22:11:31] what did I fuck up? P) [22:11:34] err :) [22:11:38] :) [22:11:42] Ryan_Lane: dns entries? ;-) [22:12:04] ah. wrong domain is being sent to the delete interface [22:14:29] Ryan_Lane: Is this going to turn out to be because my test driver was modifying actual live dns records? [22:14:45] heh [22:14:54] nah. you wouldn't have the password for doing so :) [22:15:04] Oh, good to know. [22:15:20] Hey, the console is back! 
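On the 22:07–22:10 point about mirroring production URLs: only the domain changes while /wiki/ and /w/ stay where path-dependent user scripts expect them. A hypothetical LocalSettings.php fragment for such a deployment-prep wiki (the hostname and protocol are illustrative; the path values match production Wikipedia's layout):

    # keep production's URL layout so path-dependent user scripts keep working
    $wgServer      = 'http://en.wikipedia.wmflabs.org';
    $wgScriptPath  = '/w';
    $wgArticlePath = '/wiki/$1';

With a public IP on the instance, a wildcard record such as *.wikipedia.wmflabs.org (mentioned at 22:10) can point every language subdomain at the same web host.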
[22:15:28] RECOVERY Total Processes is now: OK on deployment-dbdump deployment-dbdump output: PROCS OK: 81 processes [22:15:30] yeah [22:15:57] it may have something to do with my the openstack instances, though [22:16:00] err [22:16:09] sudo is set to hit ldap, then files [22:16:18] RECOVERY dpkg-check is now: OK on deployment-dbdump deployment-dbdump output: All packages OK [22:16:24] and the openstack instances use a *lot* of sudo [22:17:18] RECOVERY Current Load is now: OK on deployment-dbdump deployment-dbdump output: OK - load average: 0.37, 1.01, 0.55 [22:17:38] RECOVERY Current Users is now: OK on deployment-dbdump deployment-dbdump output: USERS OK - 1 users currently logged in [22:17:53] maybe I should set it to go to files, then ldap [22:17:55] that's likely saner [22:18:02] Danny_B|backup: I gave you some flags on deployment enwiki so you can import pages there [22:18:18] RECOVERY Disk Space is now: OK on deployment-dbdump deployment-dbdump output: DISK OK [22:18:58] RECOVERY Free ram is now: OK on deployment-dbdump deployment-dbdump output: OK: 90% free memory [22:19:05] * Danny_B|backup would prefer to have a lab clone of all cs wikis to test various extensions and setting we need there [22:19:08] yeah, files than ldap is saner. changing that :) [22:19:16] Danny_B|backup: that's the long-term goal [22:19:44] i know ;-) [22:20:00] i'm happy that labs finally started [22:20:15] Ryan_Lane: Is this an ok time for me to create a new instance, or will that make your life much much worse? [22:20:29] andrewbogott: no, it's cool [22:20:35] we aren't actually having any issues [22:22:36] New patchset: Ryan Lane; "Putting LDAP before files is insane, when most sudo is being handled by files." [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1796 [22:23:55] the home directory server is being a PITA too [22:24:04] it make a billion requests thanks to the number of users we have [22:24:04] :D [22:24:07] *makes [22:25:29] Danny_B|backup: there are some proposals on mediawiki.org you probably want to see them [22:26:48] PROBLEM host: master is DOWN address: master CRITICAL - Host Unreachable (master) [22:30:18] PROBLEM host: tempbuild is DOWN address: tempbuild CRITICAL - Host Unreachable (tempbuild) [22:32:19] RECOVERY host: tempbuild is UP address: tempbuild PING OK - Packet loss = 0%, RTA = 0.50 ms [22:32:44] "live" migration :D [22:32:53] tempbuild took fucking forever to move [22:32:53] hehe [22:34:09] PROBLEM dpkg-check is now: CRITICAL on nova-dev2 nova-dev2 output: Connection refused by host [22:34:49] PROBLEM Current Load is now: CRITICAL on nova-dev2 nova-dev2 output: Connection refused by host [22:35:39] PROBLEM Current Users is now: CRITICAL on nova-dev2 nova-dev2 output: Connection refused by host [22:36:19] PROBLEM Disk Space is now: CRITICAL on nova-dev2 nova-dev2 output: Connection refused by host [22:37:29] PROBLEM Free ram is now: CRITICAL on nova-dev2 nova-dev2 output: Connection refused by host [22:38:29] PROBLEM Total Processes is now: CRITICAL on nova-dev2 nova-dev2 output: Connection refused by host [22:47:09] Ryan_Lane: How can I review stuff on gerrit... namely your latest patch... [22:57:19] PROBLEM host: master is DOWN address: master CRITICAL - Host Unreachable (master) [23:04:02] hexmode: around? [23:04:11] yep [23:04:14] ok [23:04:16] sup, petan ? [23:04:16] methecooldude: do you have a labs account? [23:04:25] I posted a thread on VPT about deployment site [23:04:32] ! 
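The 22:16–22:23 sudo ordering (and change 1796, "Putting LDAP before files is insane") comes down to one nsswitch line, assuming the standard sudo-ldap setup where lookup order is taken from /etc/nsswitch.conf:

    # /etc/nsswitch.conf -- sketch; the real change was applied via puppet
    # before: LDAP is consulted first on every sudo, hammering the directory
    #sudoers:       ldap files
    # after: the local sudoers file wins, LDAP is only a fallback
    sudoers:        files ldap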
:) [23:04:41] hashar: https://labsconsole.wikimedia.org/wiki/User:Rich_Smith [23:04:50] also I think it would be cool to write down some information about it on mediawiki.org [23:05:29] I will install all extensions we use on english wikipedia + some which are in review queue [23:05:29] petan: yep... I'll be spending more time on it tomorrow [23:05:32] methecooldude: don't you have a review button in the patch set table? [23:05:51] hashar: Nope [23:06:04] ohh [23:06:13] maybe you are not allowed to review changes ? :-( [23:06:14] hashar: I can add myself as a reviewer, but that's about it [23:06:28] petan: I really, really want to get some commons people on there, too [23:06:34] the review button send you to a new page which let you +1 changes https://gerrit.wikimedia.org/r/#change,publish,1796,1 [23:06:35] sure [23:06:40] I will set up a common clone soon [23:06:40] but let me look at your vpt thing first [23:07:03] hashar: Arr, I can access that link [23:07:09] great :) [23:07:10] New review: Rich Smith; "(no comment)" [operations/puppet] (test) C: 1; - https://gerrit.wikimedia.org/r/1796 [23:07:39] hashar: But for reference, where is that button? [23:07:53] there is a line for each patch set [23:07:59] when clicking on it, it expands the patch set [23:08:02] let me find a screenshot [23:08:26] methecooldude: http://developer.qt.nokia.com/uploads/gerrit/3_3_2_viewing_change_overview.png [23:08:36] methecooldude: the review button is on the left [23:08:53] petan: I'm gonna edit of that for mw.o and commons .... I'll also talk to amir, siebrand and others about good non-english wiki's to test [23:09:05] methecooldude: the various other buttons let you check the diff [23:09:06] ok [23:09:35] hexmode: send me a message about it and I will read it later, ok? [23:09:40] on irc or mw.org [23:09:44] sure [23:09:52] hashar: OHH... that button :P [23:09:56] Herp derp [23:10:23] New review: Hashar; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/1796 [23:10:36] hexmode: you want to test uploading too? [23:10:47] methecooldude: gerrit GUI is not that good [23:10:52] I don't know if there is a lot of space for media [23:10:59] petan: I don't think so [23:11:01] ok [23:11:01] methecooldude: so you really want to keep asking :-) [23:11:21] commons does have some very interesting gadgets, though [23:11:29] right [23:12:19] methecooldude: I have signed your enwiki guestbook :D [23:16:05] petan: wmf email starts with petan? [23:18:30] nm found your email [23:18:52] nah [23:18:58] I don't have wmf mail [23:19:08] petr bena at gmail [23:23:11] methecooldude: just log in, and go to my change [23:23:20] methecooldude: your log in is your wiki username/password [23:24:29] oh. your question was already answered :D [23:27:19] PROBLEM host: master is DOWN address: master CRITICAL - Host Unreachable (master) [23:39:28] Ryan_Lane: If I wanted to set up a commons test site... is there an easy way that you know of that files older than 1hr would be deleted (so we could simiuate w/o disk space) [23:40:29] New review: Rich Smith; "(no comment)" [operations/puppet] (test) C: 1; - https://gerrit.wikimedia.org/r/1712 [23:42:42] * methecooldude is confused... [23:43:11] gerrit shows Ryan's patch as open, yet if I click on Merged, it's also there... so which is it? [23:46:09] oh. did I forget to merge it to test? 
[23:46:20] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/1796 [23:46:20] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/1796 [23:46:37] methecooldude: it's the same change cherry-picked to another branch [23:46:47] Ryan_Lane: Ahh, I see [23:48:17] the production branch runs our production infrastructure and the test branch runs labs [23:52:10] Ryan_Lane: Would you or Chad (who's not here?!) be interested in beefing up https://gerrit.wikimedia.org/r/#change,1794 ? I know what I want to do there and I explained it in the comments, but this is the first Python code I've written in my life so ... [23:53:36] :D [23:53:40] first python? [23:53:46] welcome to a much better language [23:54:51] RoanKattouw: you messed up the whitespace, though :) [23:55:06] Eh? [23:55:14] look at the diffs in gerrit [23:55:17] Well, I have yet to be convinced [23:55:29] Python doesn't have static functions, for instance [23:55:38] yes it does [23:56:09] I Googled for it and I was told that it wasn't in the language and you had to hack around that [23:56:26] Is the excess whitespace harmful? [23:56:31] yes [23:56:43] Hm, it made sense to me conceptually [23:56:46] But OK, I'll remove it [23:56:59] wait [23:57:04] it's not the newlines [23:57:19] PROBLEM host: master is DOWN address: master CRITICAL - Host Unreachable (master) [23:57:20] it's trailing space, or something along those lines [23:57:35] No, it's not [23:57:41] It's an indented empty line [23:57:48] ah [23:57:57] don't indent [23:59:16] hmm
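Two small footnotes on the 23:55–23:58 exchange. Python does have static (and class) methods without any workaround, e.g.:

    class LintRunner(object):          # class and method names are illustrative
        @staticmethod
        def is_exempt(project):
            """Callable as LintRunner.is_exempt('foo') with no instance."""
            return project == 'operations/private'

        @classmethod
        def from_config(cls, path):
            """Alternative constructor: receives the class, not an instance."""
            return cls()

And the whitespace Gerrit highlights is trailing spaces and whitespace-only lines; running git diff --check before pushing flags both (an indented blank line counts as trailing whitespace).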