[00:00:26] johnduhart: if it makes you feel any better, half my job is cleaning up after people :D [00:00:35] Ryan_Lane: hehe [00:01:00] thinking of that... [00:01:11] MaxSem: your home directory is really large [00:01:34] MaxSem: all projects same the same space for home directories, and they are kind of only meant for environment, at least for now [00:01:49] Well that's what happens when you download a copy of the internet [00:01:56] can you move the dump to a /mnt location of one of the instances in your project? [00:01:59] Ryan_Lane, I store dumps in there to share between instances [00:02:08] if there's a better way... [00:02:12] hm [00:02:28] why do multiple instances need access to it? [00:02:35] isn't it just getting loaded into a database? [00:03:13] you could make a single database instance, and let all instances access the database server [00:03:15] okay, I'll redownload it next time I'll need it [00:03:23] Ryan_Lane: i created the instance [00:04:10] coool [00:04:49] Ryan_Lane, done [00:04:56] MaxSem: soon we'll have shared storage for each project [00:05:03] sehr gut [00:05:05] sorry it's kind of a pain right now [00:06:33] heh, I also moved it because I feared that instance's mnt will not be enough for enwiki DB, but innodb on NFS made import choke much earlier [00:06:43] heh [00:06:51] hexmode: Is there a bug about how the checkboxes don't do anythin here: http://labs.wikimedia.beta.wmflabs.org/wiki/Special:GlobalUsers [00:06:58] well, I'd imagine it would [00:07:53] it would be NFS -> ext3 -> qcow2 -> glusterfs -> ext3 -> lvm -> raid10 [00:08:05] imagine the IO overhead in that :) [00:08:10] johnduhart: I think so but can't think of it offhand. File a new one and I'll dupe it if I find the original? [00:08:21] Meh, I'll fix it right now [00:08:28] heh [00:08:32] just 10MB/s on file copy /home -> /mnt [00:08:43] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [00:08:57] heh [00:08:59] Ryan_Lane: i can't ssh to it, even with agent forwarding [00:09:02] good night [00:09:11] -sql is aready dieing [00:09:12] or 4am is morning?:P [00:09:22] when we have the gluster storage per project, that'll change to: glusterfs -> xfs -> lvm -> raid6 [00:09:53] removing three levels of indirection, including one network one [00:10:12] hyperon: gimme a sec [00:11:03] johnduhart: already? [00:11:10] * hexmode goes to check traffic [00:11:11] arrrrgh [00:11:19] I removed my known_hosts accidentally [00:11:27] hyperon: it's working for me [00:11:54] hyperon: try now [00:13:26] nope [00:13:36] permission denied (publickey) [00:13:59] hm. auth log doesn't show any issues [00:14:13] try sshing to bastion, from bastion [00:14:25] err [00:14:26] sorry [00:14:33] try sshing to bastion1 from bastion [00:14:34] PROBLEM Total Processes is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [00:14:40] johnduhart: since -web doesn't have any traffic... why would -sql be dying? [00:14:41] the public IP won't work ;) [00:14:52] hexmode: Are you sure about that? [00:14:56] Where are you looking? [00:15:14] PROBLEM dpkg-check is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [00:15:21] johnduhart: /var/log/apache2/access.log on web [00:15:29] is there a better place? 
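A quick way to confirm the claim at [00:14:40] that -web has no traffic, assuming the stock Apache log location mentioned at [00:15:21]; the check_http filter is only a guess at how the monitoring probes identify themselves:

    # how many requests the current access log has seen
    wc -l /var/log/apache2/access.log
    # watch new requests arrive, ignoring monitoring probes
    tail -f /var/log/apache2/access.log | grep -v check_http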
[00:15:30] hexmode: Try other_*.log [00:15:34] ah [00:15:57] nope, it doesn't like my keys [00:16:04] PROBLEM Current Load is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [00:16:05] should there be a reason i get a 403 code for a wget in labs ? [00:16:43] depends on what you are doing [00:16:54] PROBLEM Current Users is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [00:16:58] do you not get one from another url? [00:17:00] err [00:17:02] trying to download a jar off the web [00:17:12] for the same url, on your local system [00:17:24] PROBLEM Disk Space is now: WARNING on puppet-lucid puppet-lucid output: DISK WARNING - free space: / 42 MB (3% inode=35%): [00:17:36] no [00:17:37] chmm [00:17:44] PROBLEM Disk Space is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [00:17:46] hyperon: does ssh to bastion1 work? [00:17:51] nope [00:17:55] still doesn't like my keys [00:18:00] your forwarding isn't set up properly, then [00:18:04] maybe an issue with my agent forwarding [00:18:14] PROBLEM Free ram is now: CRITICAL on nova-daas-1 nova-daas-1 output: CHECK_NRPE: Error - Could not complete SSL handshake. [00:18:27] type this on bastion: ssh-add -l [00:19:48] this agent has no identities [00:20:08] yeah, then your agent on your local system doesn't have the keys added [00:20:15] log out of bastion [00:20:20] and type the same command on your local system [00:20:55] and it also has no identities [00:21:01] ssh-add [00:21:18] then ssh-add -l [00:21:29] once the key is in there, it should work [00:21:35] what OS is this? [00:21:39] hyperon [00:22:18] hexmode: In a few days you may want to consider sitenotices for places like enwiki who bitch that they didn't get enough notice for a change [00:22:24] PROBLEM Disk Space is now: CRITICAL on puppet-lucid puppet-lucid output: DISK CRITICAL - free space: / 38 MB (2% inode=35%): [00:22:41] Gentoo Linux, kerne version 3.0.0-xen [00:22:45] when's prod deployment set for? [00:22:55] jeremyb: No set date. [00:23:06] Some time in feburary probably [00:23:10] Hopefully. [00:23:22] johnduhart: well if release is about a month and cluster is before release then i imagine someone has an idea [00:23:30] johnduhart: yes, I was hoping that this would be better than another "Jimmy" [00:23:42] hexmode: heh [00:23:54] Ryan_Lane: i did that initially [00:23:59] and still no identities [00:24:02] we could use the beta mascot [00:24:04] does it now? [00:24:06] which is who? [00:24:23] no identities now? [00:24:30] "A personal appeal from a MediaWiki developer" [00:24:33] hyperon: then the ssh-add command isn't working [00:24:57] hyperon: is your private key named id_rsa or id_dsa? [00:25:06] no, wikikey [00:25:09] ah [00:25:18] ssh-add -i wikikey [00:25:35] i think no -i [00:25:35] if your key doesn't have a default name, you have to tell the ssh-add command that [00:25:38] oh [00:25:39] right [00:25:40] just ssh-add wikikey [00:25:43] ssh-add wikikey [00:25:46] yeah, i have far far far too many keys to do defaults [00:25:58] * Ryan_Lane nods [00:26:00] me too [00:26:03] hyperon: you could do per domain defaults in ~/.ssh/config [00:26:07] ...even though i apparently have never needed to ever use agent forwarding anywhere else [00:26:36] hyperon: also you don't have to forward... 
there's the method i use doc'd on the wiki [00:27:19] ( https://labsconsole.wikimedia.org/wiki/Access#Using_ProxyCommand_ssh_option ) [00:28:12] weird weird weird [00:28:50] "Could not open a connection to your authentication agent." [00:29:03] do you have one running? [00:29:07] eval `ssh-agent` [00:29:47] yes [00:35:15] so maybe i suck at the internet or something [00:35:38] just start a new ssh agent [00:35:47] eval `ssh-agent` [00:35:56] maybe you started one, and didn't add the environment [00:36:18] if you just run ssh-agent, without the eval, it outputs the environment [00:36:23] it won't add it [00:36:31] which means ssh-add won't be able to talk to it [00:36:39] i've been running with the eval [00:36:49] ps aux | grep ssh-agent [00:36:52] is it running? [00:41:59] yes [00:42:01] yes it was [00:42:05] and i figured it out [00:42:11] i am the biggest idiot ever [00:42:26] and forgot to run ssh -A [00:43:04] heh [00:48:33] i'm confused, i thought key adding was broken even on local shell? [00:48:40] anyway, fixed i guess [00:52:51] errors occured while processing ganglia [00:52:54] as you said [00:52:58] so what do i do? [00:57:08] ah [00:57:09] sorry [00:57:12] gimme a se [00:57:14] *sec [00:57:56] as root, run: adduser --system --ingroup ganglia --home /var/lib/ganglia ganglia [00:58:07] then re-run it [00:58:20] next issue you'll run into is apache not wanting to start [00:59:23] in /etc/apache2/sites-enabled/000-default, you'll need to change all group references from your username, to openstack [00:59:33] since there is no hyperon group [00:59:42] but you are in the openstack group, since you are in the project [01:01:54] ah i just hit that [01:01:55] thanks [01:05:22] but keep user references hyperon? [01:06:50] yep [01:07:03] well [01:11:24] wait wait wait [01:11:28] i keep changing it [01:11:33] but when i run the script [01:11:36] it still fails [01:11:55] and when i look at the file again, all the values have changed back to hyperon [01:13:35] heh [01:13:43] I think you don't need to re-run the script after changing the file [01:13:52] you just need to restart apache [01:14:30] oh [01:14:49] i changed 000-template in devstack/files/ [01:15:52] ah [01:15:53] cool [01:16:32] so it worked now [01:16:34] it's up [01:16:39] cool [01:19:00] ok. out for a bit [02:38:34] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 24% free memory [02:40:54] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 20% free memory [02:51:05] in other news, i finally finished my htcpcp based, fully RFC-2324 compliant coffee advanced system [02:51:13] crap, wrong channel [03:03:14] RECOVERY Free ram is now: OK on nova-daas-1 nova-daas-1 output: OK: 68% free memory [03:03:54] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 18% free memory [03:04:34] RECOVERY Total Processes is now: OK on nova-daas-1 nova-daas-1 output: PROCS OK: 117 processes [03:05:14] RECOVERY dpkg-check is now: OK on nova-daas-1 nova-daas-1 output: All packages OK [03:05:34] Ryan_Lane: petan: where can i start documenting my progress? 
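Going back to the agent-forwarding trouble: the ProxyCommand setup linked at [00:27:19] avoids forwarding altogether. A minimal sketch of the relevant ~/.ssh/config stanza, with the user name, key file, and host patterns as placeholders rather than the exact values from the wiki page:

    # ~/.ssh/config on the local machine
    Host bastion
        HostName bastion.wmflabs.org
        User youruser
        IdentityFile ~/.ssh/wikikey

    Host *.pmtpa.wmflabs
        User youruser
        IdentityFile ~/.ssh/wikikey
        # hop through the bastion instead of forwarding the agent (no -A needed);
        # newer OpenSSH can use "ssh -W %h:%p bastion" here instead of nc
        ProxyCommand ssh youruser@bastion.wmflabs.org nc %h %p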
[03:05:54] !project openstack [03:05:54] https://labsconsole.wikimedia.org/wiki/Nova_Resource:openstack [03:06:04] RECOVERY Current Load is now: OK on nova-daas-1 nova-daas-1 output: OK - load average: 0.06, 0.15, 0.11 [03:06:06] there's a documentation link there [03:06:18] that let's you add documentaton [03:06:56] every project page has one [03:07:03] * Ryan_Lane disappears [03:07:04] RECOVERY Current Users is now: OK on nova-daas-1 nova-daas-1 output: USERS OK - 1 users currently logged in [03:07:44] RECOVERY Disk Space is now: OK on nova-daas-1 nova-daas-1 output: DISK OK [03:11:52] !log openstack started the daas project [03:11:53] Logged the message, Master [03:12:12] !log openstack created nova-daas-1, installed openstack with devstack [03:12:13] Logged the message, Master [03:15:28] wait a second [03:15:32] https://svn.wikimedia.org/viewvc/mediawiki/trunk/lucene-search-2/src/org/wikimedia/lsearch/util/PHPParser.java?view=markup&pathrev=108594 [03:15:42] OrenBochman: That supports InitialiseSettings [03:16:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [03:17:49] what ? [03:18:25] OrenBochman: You kept telling us how you needed a LocalSettings for search t work, and that configuration parser is written for InitialiseSettings [03:19:12] Ok [03:20:01] I was planning to refactor this file per rainmans suggestion [03:20:22] However it is not used exclusivly to configure things [03:30:34] johnduhart: in org.wikimedia.lsearch.util.Configure [03:31:12] /** Use maintenance/eval.php to get medawiki variables */ [03:31:13] public static String getVariable(String mediawiki, String var) throws IOException{ [03:31:15] return Command.exec(new String[] { [03:31:16] "/bin/bash", [03:31:18] "-c", [03:31:20] "cd "+mediawiki+" && (echo \"return \\$"+var+"\" | php maintenance/eval.php --conf "+mediawiki+"/LocalSettings.php | sed -e 's/^> // ; /^$/d')"}).trim(); [03:31:21] } [03:31:35] ... [03:31:42] Don't copy paste in chat [03:31:42] that's how the it actualy reads the variables [03:32:35] hm [03:38:25] I apreciate that you are looking into this [03:40:45] I've gone over the code - one of the issues is that it needs to build the localsetting.conf before it can find the localsettings.php, initiLisesettings and it does some other tasks too [03:42:02] I've updated the Lucene-search documentation today using the source to reflect the settings I see in current local-configuration.conf [03:43:51] Once thing that is stated there is that you need to start with a local installation and then you can add databases, and split index definitions [03:45:10] there are also some support comments that say it is not needed to have a local MediaWiki - but I'm pretty sure this is for setting up a non indexing searcher [04:23:54] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 21% free memory [04:56:05] Ryan_Lane, i tried to do the login process as described at [[Access]]. I'm pretty sure that i uploaded an SSH key to Special:NovaKey, but when i tried to log in, i received a Permission denied (publickey) error. [04:56:42] And now when i look at https://labsconsole.wikimedia.org/wiki/Special:NovaKey again, it says "There were no Nova credentials found for your user account. Please ask a Nova administrator to create credentials for you. 
" [04:58:17] aharoni: try loging in and out of the wiki [04:58:50] it might update your credentials [05:00:11] 01/15/2012 - 05:00:11 - Updating keys for amire80 [05:00:52] i did it, and it did something interesting: [05:02:24] at the bottom it showed a key that goes "gIbbeRish345345 amir.aharoni@mail.huji.ac.il", which i didn't upload. [05:02:45] that is my correct email, and the gibberish part looked like the gibberish from my real key. [05:02:51] it didn't let log in though. [05:04:26] so i tried uploading "ssh-rsa gIbbeRish345345", which is what i have in ~/.ssh/id_rsa.pub, and then it let me log in. [05:26:54] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 19% free memory [05:31:44] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 20% free memory [06:59:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [08:07:44] * Beetstra notices that he is teasing bots-2 too much ... [08:07:52] And not all my bots are running there yet .. [08:41:54] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 25% free memory [08:42:29] Snowolf: what's puttytray? [08:42:57] <+Snowolf> jeremyb: modified version of putty with a couple of more bells and whistles, ie clickable urls [08:43:06] <+Snowolf> http://puttytray.goeswhere.com/ [08:44:09] drdee: OrenBochman: ^^ [08:44:19] jeremyb: actually there's an even better version [08:44:26] http://ryara.net/putty-url/ [08:44:33] this one has pretty much same functions [08:44:37] but urls are configurable [08:44:57] well it ain't for me anyway... ;-) [08:45:16] Well, just thought I'd let you/whoever you're linking to know :) [08:45:26] yeah, i got it ;) [08:45:29] danke! [08:45:33] np :) [08:49:44] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 20% free memory [09:07:24] A database error has occurred. Did you forget to run maintenance/update.php after upgrading? See: https://www.mediawiki.org/wiki/Manual:Upgrading#Run_the_update_script [09:07:27] Query: DELETE FROM `msg_resource` [09:07:30] Function: MessageBlobStore::clear [09:07:32] Error: 1205 Lock wait timeout exceeded; try restarting transaction (deployment-sql) [09:07:39] @ http://labs.wikimedia.beta.wmflabs.org [11:08:08] 01/15/2012 - 11:08:08 - Updating keys for beetstra [11:08:11] 01/15/2012 - 11:08:11 - Updating keys for beetstra [11:08:20] good boy [11:09:54] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 17% free memory [11:12:18] Danny_B|backup: ? [11:13:47] petan: no longer, but that was an error i got at that moment so i reported it [11:13:53] ok [11:20:10] !deployment-prep updated to latest head all wikis [11:20:17] !log deployment-prep updated to latest head all wikis [11:20:18] Logged the message, Master [12:07:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [12:33:05] methecooldude: Hey there [12:42:44] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 20% free memory [12:48:59] !logs bots COIBot started with limited modules [12:49:08] !log bots COIBot started with limited modules [12:49:09] Logged the message, Master [12:54:54] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 23% free memory [13:12:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [13:12:54] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 19% free memory [13:45:16] re [13:46:47] hey [13:46:59] OrenBochman: need me? 
[13:47:11] what's up with search on beta :o [13:47:15] some progress [13:47:18] I'd like to give search another shot [13:47:21] ok [13:47:25] I have a time now [13:47:27] few hours [13:47:30] ok [13:47:33] me too [13:47:48] I need someone from ops to get me a config [13:48:23] meh [13:48:48] first off I'd like to know how to update the search code from svn [13:48:54] I've done some work [13:49:04] svn up [13:49:05] :o [13:49:11] ok [13:50:36] hyperon: around [14:28:35] petan: is there a way to tell shubversion not to include a folder in syncs ? [14:28:44] probably [14:28:57] ignore file [14:29:33] oh [14:30:04] https://www.google.com/search?q=svn+ignore+file&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:cs:official&client=firefox-a [14:36:13] Beetstra: you may wish to move some of your bots to server 3 [14:40:22] Hey, maybe this is a human error, but the spelling of Pali is incorrect in http://labs.wikimedia.beta.wmflabs.org/wiki/Special:SiteMatrix [14:42:21] bots-3, you mean? [14:42:26] yes [14:42:29] I see that the memory is slowly filling up .. [14:42:30] it's rather idle now [14:42:57] My account is active there? [14:42:57] ok I will create another instance in case we were out of memory [14:42:59] sure [14:43:42] I see the box is running without any swap ..? [14:43:55] yes [14:44:02] be glad ;) [14:44:06] IO really suck [14:47:44] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 20% free memory [14:55:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [15:00:44] I just got an error in opera while trying to open a test page [15:00:44] TrustedXFF: hosts file missing. You need to download it. [15:00:44] This is followed by a large backtrace [15:00:56] which ends at [15:00:58] #27 {main} [15:02:51] Sid-G: link [15:03:02] petan: hold on [15:03:04] thanks [15:03:07] :o [15:03:10] petan: http://hi.wikipedia.beta.wmflabs.org/wiki/????????? [15:03:23] uh [15:03:23] :( [15:03:28] pastebin coming up :) [15:03:36] it opens to me in ff [15:03:52] ok can you report it? [15:03:58] where? [15:04:06] http://labs.wikimedia.beta.wmflabs.org/wiki/Problem_reports?action=edit [15:04:17] include error log there [15:04:30] backtrace etc [15:04:40] petan: that ???s were a pagename [15:04:58] i wasn't talking of the mainpage [15:05:09] will report [15:05:12] ah I clicked the link [15:05:13] you sent me [15:05:30] it opened main page to me [15:05:30] hmm [15:05:33] seems its opera [15:05:44] similar error in another page [15:05:44] opera shouldn't trigger error in mediawiki [15:05:57] it is a bug which will be dealt with [15:06:18] known bug? [15:06:20] so no need to report? [15:06:36] btw, got a second one [15:06:40] yes please report it :) [15:06:57] MediaWiki internal error. 
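On the key-upload confusion at [05:02:24]-[05:04:26]: what Special:NovaKey needs is the full one-line OpenSSH public key, type prefix included. A sketch, assuming the key pair sits in the default location:

    # print the public key exactly as it should be pasted, "ssh-rsa ..." prefix and all
    cat ~/.ssh/id_rsa.pub
    # or recover the public half from the private key if the .pub file is missing
    ssh-keygen -y -f ~/.ssh/id_rsa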
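For the Subversion question at [14:28:35], the usual tools are the svn:ignore property (for unversioned, local-only files) or a sparse checkout (for versioned directories you don't want pulled on update). A sketch, with the directory name as a stand-in:

    # keep a local-only, unversioned directory out of svn status and accidental adds
    svn propset svn:ignore 'local-dumps' .
    # leave an already-versioned directory out of future updates (svn 1.6+)
    svn update --set-depth exclude local-dumps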
[15:06:57] ok :) [15:06:57] links, backtrace etc I will add it to bugzilla then [15:07:16] ok [15:07:29] although i could put it directly on bugzilla if u want [15:07:30] I am unable to reproduce it that's why you need to do that [15:07:45] ok, in that case insert it to bugzilla and report page and use {{tracked}} [15:07:52] petan: turn on turbo boost [15:07:55] in opera [15:08:02] I don't have opera [15:08:06] ok [15:08:16] its working fine with chrome so far [15:08:48] ok, will add it to both places [15:09:15] :) [15:09:25] :) [15:10:05] thank you for helping us with test ;) [15:10:19] my pleasure :D [15:10:36] lets me try out admin tools ;) [15:10:48] ok [15:11:37] admin tools need to be tested too [15:11:48] yeah [15:11:54] will do all i can :) [15:15:44] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 20% free memory [15:21:36] petan: first one https://bugzilla.wikimedia.org/show_bug.cgi?id=33741 [15:23:04] yay [15:23:17] :) [15:23:23] second one coming up [15:24:13] cool [15:24:32] meh, its probably a dupe [15:24:50] almost the same error [15:24:56] also ends in {main} [15:25:06] add it as a comment to this one maybe? [15:27:47] petan: need the entire backtrace at beta too? or just the basic info [15:27:47] ? [15:28:14] um... for bug? you can [15:28:27] it never hurt [15:28:32] ok [15:28:36] https://bugzilla.wikimedia.org/show_bug.cgi?id=33741 [15:28:43] XFF file is missing for TrustedXFF [15:28:55] yeah [15:36:09] wow, now i got an error in chrome [15:39:00] reedy: similar to 33741: [15:39:01] MediaWiki internal error. [15:39:02] Original exception: exception 'MWException' with message 'TrustedXFF: hosts file missing. You need to download it.' in /usr/local/apache/common/live/extensions/TrustedXFF/TrustedXFF.php:55 [15:39:25] add it to that one? or report as new bug? [15:39:30] nah [15:39:33] same thing [15:39:35] I would rather say this is a config error, Reedy where I should download it from? [15:39:37] could mention it on the same bug [15:39:45] is it mw error or config error? [15:39:48] "The file can be generated using the generate.php maintenance script." [15:39:51] config most likely [15:39:57] ok [15:40:06] ok [15:40:12] yeah, try running the generate script in the extension folder [15:40:26] generates it from the trusted-hosts.txt in the file [15:40:36] just need to then make sure it's wherever it needs to live [15:42:03] is this worth reporting? : Error: 1205 Lock wait timeout exceeded; try restarting transaction (deployment-sql) [15:42:22] any more info than that? [15:42:27] this is the last line of the error I got on chrome [15:42:32] yeah [15:42:42] Unable to open input file "trusted-xff.txt" [15:43:04] Sid-G: no don't report that [15:43:11] A database error has occurred. Did you forget to run maintenance/update.php after upgrading? See: https://www.mediawiki.org/wiki/Manual:Upgrading#Run_the_update_script [15:43:11] Query: DELETE FROM `msg_resource` [15:43:11] Function: MessageBlobStore::clear [15:43:11] Error: 1205 Lock wait timeout exceeded; try restarting transaction (deployment-sql) [15:43:11] thats all the info :) [15:43:11] that seems to be an issue with slow sql [15:43:12] got it once, not getting it again [15:43:13] ok Petan :) [15:45:17] Sid-G: try again [15:45:19] opera thing [15:45:52] !log deployment-prep ran live/extensions/TrustedXFF/generate.php [15:45:54] Logged the message, Master [15:49:39] * Beetstra is confused ... I ssh'd to bots-3 .. and now there I already have the dirs for the bots? 
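A sketch of the TrustedXFF fix applied around [15:40:12]-[15:45:52]; the extension path is the one from the error message and the file names come from the discussion above, so treat this as an outline rather than a verified recipe:

    # run the generator from inside the extension directory so it can find its
    # input list (trusted-hosts.txt, per [15:40:26]); it writes the hosts file
    # the extension was complaining about
    cd /usr/local/apache/common/live/extensions/TrustedXFF
    php generate.php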
[15:49:50] home is on nfs [15:50:07] wonderful .. no extra work [15:50:11] yup [15:50:57] what are those .nfs##### files? [15:51:24] deleted files which are opened [15:51:48] unless process release the inode it won't be removed [15:52:00] ah .. OK [15:52:08] I see it is a nohup.out .. [15:52:09] you probably forget close() [15:52:16] or something [15:52:54] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 25% free memory [15:53:17] LiWa3 is a beast .. [15:53:26] We should tell people to use less external links on Wikipedia ... [15:53:32] heh [15:53:34] forget references [15:53:41] and foremost: STOP SPAMMING [15:54:17] {{WMFLabsBot}} [15:54:25] don't forget to use it :P [15:54:30] on userpage [15:54:37] heh [15:54:41] * Beetstra cries a little .. [15:54:48] you don't have to :D [15:54:53] insert it [15:55:01] BOOHOO .. and why do the installs not reside on that frigging NFS [15:55:08] heh [15:55:12] they don't [15:55:13] Now I have to install all the perl modules on bots-3 [15:55:17] indeed [15:55:21] that should be on nfs too [15:55:22] Otherwise it won't run [15:55:24] YES [15:55:28] :-D [15:55:30] I will create a share ok [15:55:44] I can install them [15:56:36] LOL [15:56:38] bots-2 [15:56:39] :D [15:56:48] !nagios :P [15:56:48] http://nagios.wmflabs.org/nagios3 [15:57:02] funny load [15:57:15] yes, I installed a handful of perl modules on bots-2 (and upgraded another set) .. [15:57:33] can you share those between all bots-servers [15:57:40] that kind of ruined my idea to put it on nfs since I can't login there :D [15:57:49] yes I can [15:57:54] but I need to ssh first :D [15:57:57] by the way, does bots-sql3 also have a shell? [15:58:02] yes, don't use it [15:58:07] there is a phpmyadmin [15:58:18] http://bots.wmflabs.org/phpmyadmin/ [15:58:26] no need for shell :o [15:58:34] or use mysql from remote machine [15:58:51] phpmyadmin [15:58:52] lolol [15:58:58] and how are you going to upload a dump of 5 gig via phpmyadmin? [15:59:06] Reedy: it's on toolserver too [15:59:09] or use mysql from remote machine [15:59:10] And? [15:59:16] I don't use it I installed it because of users from ts [15:59:20] Doesn't make it any less funny :p [15:59:24] Reedy: huh? [15:59:33] and how are you going to upload a dump of 5 gig via phpmyadmin? [15:59:37] or use mysql from remote machine [15:59:44] ah [16:00:02] mysql -h bots-sql3 -u user -p dbname < dump.sql [16:00:06] Beetstra: ^ [16:00:22] storage is on gluster shared between vm's [16:00:29] Ah, OK [16:00:42] so it doesn't matter if it's on sql or application server [16:00:45] (you'll have to explain that again when there is sufficient diskspace, and all is running smoothly [16:00:50] Reedy: ? [16:00:59] ? [16:01:01] Reedy: what's wrong on mysql command :o [16:01:19] Beetstra: where isn't space? [16:01:28] nothing? [16:01:34] ah ok... didn't get you [16:01:51] Ryan was going to install more diskspace for MySQL .. the 5 gig dump is just one of the tables ... [16:02:10] actually Ryan is going to install a server for this [16:02:27] he won't do anything regarding the space of current vm's :o [16:02:36] once it [16:02:43] it's done I will move all db's there [16:03:05] sql is slow because of IO [16:03:12] which suck on storage we use [16:03:12] Ohai. I can haz sudo aptitude on bots-3? :-) [16:03:20] Coren: yes, log it :P [16:03:25] I mean log what you install etc [16:03:30] Need moar perl modules. [16:03:33] * Coren nods. [16:03:35] ok [16:03:36] np [16:03:39] Coren, petan is busy with that [16:03:56] I've installed a good handful on bots-2 .. 
but they are not shared with the other bots [16:03:57] if you guys log it I will see it and install all on nfs [16:04:03] so that it would be on all vm's [16:05:22] OK, here we go then [16:05:54] !log bots installing perl module POE [16:05:55] Logged the message, Master [16:06:08] Coren, what do you need? [16:06:26] (so that we do not do double work etc.) [16:06:50] Beetstra: I'm making a list now. Net::Oauth is one for sure, but I'm looking at what else is already there vs what I need. [16:07:06] Heh, only the basics seems to be there [16:07:22] I will first try and install mine, then see what is needed further [16:07:35] Aren't these sorts of things supposed to be puppetised? [16:07:46] rather than arbitarily just installing packages? [16:08:17] Reedy: I dunno. Are there per-projects puppet classes? [16:08:31] No idea [16:08:47] Beetstra: Also clearly missing for me is JSON. [16:09:58] Everything else seems to be there already. Well, except Text::Align::WagnerFischer but that one is baroque and not even on CPAN so I got a local install of it [16:12:07] Oh yeah, I think I /first/ have to upgrade CPAN .. [16:12:26] yes it is supposed to be in puppet [16:12:56] Coren: not yet [16:13:00] Ryan is working on that [16:13:19] The two packages I need are in the base distro: libnet-oauth-perl and libjson-perl [16:13:29] ok so I created /global on all application servers [16:13:52] it's not easy to puppetize stuff when things are going to change rapidly in few months [16:16:45] Coren, do you want JSON::XS ?? [16:17:36] * Beetstra installs anyway [16:18:19] Beetstra: Isn't it part of JSON since 2.0? [16:18:29] It asks for the install [16:18:35] Beetstra: Ah, no, just checked: will use if there [16:18:40] Beetstra: So, yes. :-) [16:18:47] too bad, that one fails to install too .. [16:18:48] sigh [16:19:09] Why don't you just apt-get the ubuntu prebuilt? [16:19:17] what part of Net did you need, it can't find Net::Oauth [16:19:57] Missing cap. It's Net::OAuth [16:21:16] I guess that the prebuilt will need to be used, bots-3 does not want to install a lot of things [16:21:35] POE failed, Net::OAuth does not work, JSON::XS failed (JSON worked) [16:23:03] Can you install the pre-built ones, Coren? [16:23:08] (not sure how to do that) [16:23:36] I'm out of CPAN [16:23:40] Sure. [16:24:12] And on bots-2 this went smoothly, no problems whatsoever [16:24:16] I'm not cool enough to be in the sudoers. [16:24:33] (or at least, I'm not NOPASSWD:) [16:24:39] * Beetstra wonders how he became so cool then [16:24:45] you can 'sudo su' ?? [16:25:11] Asks for a password. [16:25:16] Coren: wiki pw [16:25:23] petan: Ah. :-) [16:25:50] !log bots installing libnet-oauth-perl on bots-3 [16:25:51] Logged the message, Master [16:26:27] petan: opera thing works :) [16:26:30] Beetstra: There's a metric shitton of dependencies that went in; I'm guessing that's why a straight-from-cpan install didn't like you [16:26:33] !log POE install failed via CPAN, JSON installed - JSON::XS failed, Net::OAuth fails [16:26:33] POE is not a valid project. [16:27:11] but I did the same on bots-2 yesterday, installed all the ones I needed .. [16:27:13] !log bots installing libjson-perl on bots-3 [16:27:14] Logged the message, Master [16:27:15] !log bots POE install failed via CPAN, JSON installed - JSON::XS failed, Net::OAuth fails [16:27:16] Logged the message, Master [16:29:05] Full of joy. CSBot now works on bots-3. [16:29:14] CS? [16:29:18] counter strike! 
:o [16:30:00] !cs is labs can be used for running bots, usefull stuff for development or playing counter strike [16:30:00] Key was added! [16:30:40] Heh. It's the midly annoying CorenSearchBot [16:31:09] Only sniping going on is copy-and-paste articles. :_) [16:31:19] heh [16:31:42] So, as far as POC goes, we know it works. The question is: when do we go live? [16:33:55] live? [16:35:17] Well, the objective is to rehome CSBot there; but isn't this just testing? Or can I fire it up? [16:35:35] yes you can fire it up :P [16:35:42] it will be testing for a long time [16:35:44] :D [16:35:54] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 13% free memory [16:35:56] because there are many things we need to do untill we start prod [16:37:38] Yeah, I know you are low on memory bots-2 [16:37:44] I'm working on that problem .. [16:38:16] !log bots wp:en:CorenSearchBot now running on bots-3 [16:38:17] Logged the message, Master [16:39:05] I should probably write a nagios plugin to monitor it [16:40:54] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 25% free memory [16:41:01] :-) [16:41:15] How many bots-servers are there, petan? [16:41:16] 3? [16:41:32] petan what are the credentials for the OIArepository ? [16:42:40] OrenBochman: no idea [16:42:49] I don't even know what it is [16:43:01] Beetstra: application? [16:43:04] OAI [16:43:05] yes 3 but I can add more [16:43:13] :O [16:43:27] OAI - you said but you installed it and it is working ? [16:43:29] Because I think that LiWa3 is taking out one of them .. :-/ [16:43:47] OrenBochman: I don't remember I did [16:44:19] well have a look on the MediaWiki host [16:44:36] Beetstra: I don't see that all of them are utilized at least 80% [16:44:51] !nagios [16:44:51] http://nagios.wmflabs.org/nagios3 [16:47:46] * Beetstra is working on getting bots-2 to 80% ... [16:48:06] LiWa3 can't do its work, needs more threads activated .. [16:51:22] !log bots upgraded installed perl packages on bots-2 [16:51:23] Logged the message, Master [16:57:11] Hm. I see through CU at least three bots are now coming out of a small range of IPs. It would probably wise to not that on those IP talk pages; any block on those would cause havock. [16:58:28] What /is/ the range? [16:58:41] Coren, XLinkBot is one of them? [16:59:01] labsconsole should tell you the assigned ips [17:00:43] Beetstra: ClueBot, XLinkBot, SharedIPArchiveBot and AFCBot that I can see [17:01:21] But they're not fixed IPs; CSBot has two edits since the move, two IPs [17:01:25] That sounds about correct .. when I get it working, you will also see COIBot there [17:02:40] I.e.: those realy, really shouldn't be blocked. [17:02:44] petan: hey [17:05:29] XLinkBot can be turned off by changing the on-wiki settings .. but I think the last time someone blocked anyway [17:06:39] Coren, how do you install those packages .. I can't get POE to work [17:06:59] sudo aptitude install libfoo-perl [17:07:17] where depends on the module. Which one do you need? [17:07:24] for now, POE [17:07:59] and probably more [17:08:01] sudo aptitude install libpoe-perl [17:10:21] OK, that worked, and now perl still complains .. [17:10:39] (about POE not being installed) [17:10:52] o_O? [17:11:02] on bots-3? [17:11:04] oh wait .. [17:11:13] now it is one of the sub-packages .. 
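The pattern Coren is using at [17:06:59] is the Debian/Ubuntu convention for packaged Perl modules: Foo::Bar on CPAN generally ships as libfoo-bar-perl in the archive. The module names below are just the ones from this conversation:

    #   POE        -> libpoe-perl
    #   Net::OAuth -> libnet-oauth-perl
    #   JSON       -> libjson-perl
    sudo aptitude install libpoe-perl libnet-oauth-perl libjson-perl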
[17:11:39] Do: aptitude search libpoe [17:12:03] ah, thanks, that will help [17:15:52] !log bots installing libpoe-perl [17:15:53] Logged the message, Master [17:16:10] !log bots installing libpoe-component-irc-perl [17:16:10] Logged the message, Master [17:16:26] !bots installing libdbi-perl [17:16:26] http://www.mediawiki.org/wiki/Wikimedia_Labs/Create_a_bot_running_infrastructure proposal for bots [17:17:40] !bots installing libdbd-mysql [17:17:40] http://www.mediawiki.org/wiki/Wikimedia_Labs/Create_a_bot_running_infrastructure proposal for bots [17:17:47] !log bots installing libdbd-mysql [17:17:48] Logged the message, Master [17:17:51] !log bots installing libdbd-mysql-perl [17:17:51] Logged the message, Master [17:19:31] !log bots running COIBot from bots-3 [17:19:32] Logged the message, Master [17:19:58] Beetstra: did you figure out your unicode? [17:20:23] yes, had to re-inforce it in one of the modules of XLinkBot [17:20:36] Which is strange, because that was not necessary on the previous box XLinkBot was running on [17:21:01] you may have moved from 5.8 -> 5.10 [17:21:18] Yes, the previous box had a bit older version of perl [17:21:20] did you read `perldoc perlrun`'s section on -C? [17:22:01] bastion1 says > [17:22:03] gah [17:22:04] *** System restart required *** [17:22:46] yup, bastion1 has perl 5.10 [17:23:13] I think I read that document [17:23:19] Versageek had 'v5.8.8 built for sun4-solaris' [17:24:10] did you see the part about -C? [17:26:13] > "-C" on its own (not followed by any number or option list), or the empty string "" for the "PERL_UNICODE" environment variable, has the same effect as "-CSDL". In other words, the standard I/O handles and the default "open()" layer are UTF-8-fied but only if the locale environment variables indicate a UTF-8 locale. This behaviour follows the implicit (and problematic) UTF-8 behaviour of Perl 5.8.0. [17:27:46] Krinkle: Hi, you still around? [17:29:24] I am pushing date between bot-modules .. not even sure if these are files-handles or whatever, I just have to take good care [17:29:34] s/date/data/ [17:29:51] friggin' hotel network [17:30:06] * Beetstra goes for dinner, bots are running, lets see how the boxes cope with them [17:32:32] http://en.wikipedia.org/wiki/Template:WMFLabsIP [17:34:04] heh [17:34:11] We should really tag all WMF ips as WMF [17:34:27] and then more specific templates if it's more likely to be usd [17:34:41] Coren: I'm pretty sure we alreadu have a template on enwiki [17:34:46] true [17:35:02] Probably, but I'm just looking at the immediate problem atm [17:35:03] :-) [17:35:08] Beetstra: but eventually it has to get into the interpreter or out of it somehow? [17:35:14] http://en.wikipedia.org/wiki/User:127.0.0.1 is my homeboy [17:35:21] Beetstra: network, file, stdin/out/err? [17:36:10] Yeah, still didn't find any list of outgoing IPs. [17:36:48] https://labsconsole.wikimedia.org/wiki/Special:NovaInstance [17:36:57] Instance floating ip address [17:37:03] 208.80.153.211 [17:38:36] Except not. I see edits coming from all over 208.80.152.0/24 [17:40:09] oooh we have a full /22 http://toolserver.org/~chm/whois.php?ip=208.80.153.211 [17:40:38] And I though river wasn't around anymore [17:42:24] johnduhart: I'm guessing not all the /22 is assigned to egress NAT [17:42:33] Gah. I so hate IPv4 [17:42:42] Of course not [17:43:55] * Coren idly wonders why the labs don't deploy v6 [17:46:21] My own infrastructure is v6 throughout [18:13:31] Coren: do you come with geek code too? 
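The -C/PERL_UNICODE discussion at [17:21:18]-[17:26:13] comes down to telling the interpreter which handles should default to UTF-8. A sketch of the two equivalent ways to turn it on when launching a bot; the script name is made up:

    # UTF-8-ify STDIN/STDOUT/STDERR and the default open() layer, conditional on
    # a UTF-8 locale (what a bare -C or an empty PERL_UNICODE does, per the
    # perldoc excerpt quoted above)
    perl -CSDL xlinkbot.pl
    # the same thing via the environment, e.g. in whatever wrapper starts the bot
    PERL_UNICODE=SDL perl xlinkbot.pl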
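On the missing list of outgoing IPs ([17:36:10]-[17:42:24]): with NAT the address the wikis see is not necessarily the floating IP shown on Special:NovaInstance, so it is easiest to ask from the instance itself. The URL here is just one public echo-my-IP service; any equivalent works:

    # the address the outside world (and CheckUser) sees for this instance
    curl -s http://whatismyip.akamai.com/ ; echo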
[18:13:39] hi guys [18:13:56] who goes there [18:14:31] I've been trying to figure out why search wont work [18:14:53] you mean indexing? [18:14:58] * jeremyb has to go in 2 secs [18:15:03] there are some bugs in the parser yes [18:15:09] yes indexer [18:15:40] I've been trying to isolate and fix [18:16:04] but it's lots of work [18:16:56] php parser [18:17:12] you know it's being rewritten? [18:18:58] * jeremyb leaves [18:58:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [19:13:44] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 20% free memory [19:26:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [19:33:59] petan: Requests are waiting http://labs.wikimedia.beta.wmflabs.org/wiki/Global_Requests [20:12:22] petan: you fixed the XFF problem? [21:12:26] hey [21:12:27] hexmode: yes [21:12:37] hi johnduhart [21:13:03] johnduhart: maybe if you have a shell access you could handle them? [21:13:23] hyperon: around? [21:14:05] Oh, cool. My latest "bugfix" fixed a bug I had and had the happy bonus side effect of transforming my bot into a fork bomb. :-) [21:14:30] I love bombs of forks [21:14:32] Here's to keeping an eye on one's running code and cathing things like that in time. :-) [21:15:44] So, for future reference, if you have a loop that forks, and you have a case that exits without further processing, you want to test for that case /before/ the fork. :-) [22:32:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [22:42:44] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 20% free memory [23:07:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [23:12:24] PROBLEM Disk Space is now: WARNING on puppet-lucid puppet-lucid output: DISK WARNING - free space: / 50 MB (3% inode=35%): [23:17:24] PROBLEM Disk Space is now: CRITICAL on puppet-lucid puppet-lucid output: DISK CRITICAL - free space: / 35 MB (2% inode=35%):
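The "test before you fork" advice at [21:15:44], spelled out as a shell sketch; both function names are made up, the point is only where the guard sits:

    for article in "${queue[@]}"; do
        # do the nothing-to-do check in the parent: skipping only in the child,
        # after the fork, still leaves one process behind per skipped item
        if already_processed "$article"; then
            continue
        fi
        scan_article "$article" &   # one worker per remaining article
    done
    wait   # reap the workers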