[15:43:25] o/
[16:51:17] * Ironholds yawns
[16:51:19] morning all
[17:02:32] halfak, Ironholds: do you know how to let a Python script which deals with a big dump running on Labs?
[17:02:37] Specifically this one: https://gist.github.com/he7d3r/cbdffa019659f6f437ef
[17:02:51] How to let it what?
[17:03:23] ouch..
[17:03:34] let --> run
[17:04:00] I mean, the script will probably take a long time to run (more than 24 hours maybe)
[17:04:33] Gotcha. Regretfully, I don't use the tool labs infrastructure much for that kind of work.
[17:04:46] But I know that they have a job queue.
[17:04:49] Helder: are you looking to run a single script for >24 hrs, or can you split the job up?
[17:04:53] ^
[17:04:55] This guy
[17:04:59] :)
[17:05:05] hi!
[17:05:06] :)
[17:05:09] Hey Nettrom
[17:05:11] it is a single script
[17:05:48] Helder: Okay, that's not a problem. Two things I would keep in mind:
[17:06:09] 1: preferably make it save state so you can continue if it crashes or if the job is halted
[17:06:25] (e.g. pickle the state of things)
[17:06:37] 2: tell the grid that it'll run for long and submit it to the grid
[17:06:54] I use shell scripts to submit jobs, let me find one you can look at
[17:07:30] here
[17:07:37] is one you can use as a basis: https://bitbucket.org/grouplens/suggestbot/src/a16d511298b8cc32d8243222868cdabed0706f32/tool-labs/link-rec/update_linkcounts.sh?at=master
[17:08:12] change the name, expected runtime, output files, memory needs and point it to your script
[17:08:35] then submit it to the grid with 'qsub [filename]', just replace it with the filename of your shell script
[17:09:26] great! I'll take a look
[17:09:42] here's more info about the grid on Tool Labs: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Submitting.2C_managing_and_scheduling_jobs_on_the_grid
[17:10:05] note that they have a wrapper for qsub called jsub, but since I use shell scripts I tend to ignore jsub
[17:23:24] Nettrom, halfak: so... it seems I'll have other problem before being able to run that script on Labs
[17:23:24] $ pip install mediawiki-utilities
[17:23:25] The program 'pip' is currently not installed. To run 'pip' please ask your administrator to install the package 'python-pip'
[17:23:48] do we need to request extra permissions for these things?
[17:24:04] if you set up a virtual environment, you'll have pip available to install packages for that environment
[17:24:17] I've only done that for Python 2.7, though
[17:24:45] Helder: virtualenv -p $(which python3) some_folder/
[17:25:16] Helder: source some_folder/bin/activate
[17:25:57] or in my shell script, replace 'python' with '$HOME/some_folder/bin/python'
[17:26:16] I'll try these, thanks
[17:26:46] the latter only works when the virtualenv is set up, of course
[17:31:37] halfak: I got a "SyntaxError: invalid syntax" from "pymysql" during the installation: http://dpaste.com/1EDKBBK
[17:31:43] any ideas?
[17:32:24] Damn. That python version is old.
[17:32:32] Let me see what I can do.
[17:32:47] :-)
[17:33:12] (however, I was able to run my script with a small dump from ocwikibooks
[17:34:01] Hmm.. You don't need pymysql for the mw work you need to do.
[17:35:11] What are you supposed to SSH to for tool labs these days?
[17:35:19] I'm timing out on tools-login
[17:35:50] I used ssh USERNAME@tools-login.wmflabs.org
[17:35:57] tools-(login|dev).wmflabs.org
[17:36:04] Hmmm
[17:36:14] weird, I just sshed into tools-login without a problem from the U
[17:36:15] Nettrom: what is the dev version?
[17:36:57] Helder: both tools-login and tools-dev are running Ubuntu 12.04
[17:37:12] what should we (not) do in each version?
[17:37:27] they should be identical, except on tools-dev you can run interactive programs to test your code
[17:37:45] hmm... interesting
[17:39:40] there's a note about it here: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Developing_on_Tool_Labs
[17:42:06] I guess it is time for me to RTFM...
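
Note on the 17:06 advice ("make it save state ... e.g. pickle the state of things"): one way this could look is sketched below. It is only an illustration, not the code from the gist linked above. The function names (`load_state`, `save_state`, `process`), the state keys (`next_index`, `total`) and the per-item work are all hypothetical stand-ins, and it assumes the input can be re-read from the start in the same order on every run.

```python
import os
import pickle
import tempfile


def load_state(path):
    """Return the pickled checkpoint at `path`, or a fresh state if none exists."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    # Hypothetical state layout: how many items are done, plus a running result.
    return {"next_index": 0, "total": 0}


def save_state(state, path):
    """Write the checkpoint to a temp file, then rename it over the old one.

    The rename is atomic, so a crash mid-write cannot leave a truncated pickle.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)


def process(items, path, checkpoint_every=1000):
    """Process `items` in order, resuming from the checkpoint at `path` if present."""
    state = load_state(path)
    for i, item in enumerate(items):
        if i < state["next_index"]:
            continue  # already handled in a previous run; skip without redoing work
        state["total"] += len(item)  # stand-in for the real per-item work
        state["next_index"] = i + 1
        if state["next_index"] % checkpoint_every == 0:
            save_state(state, path)
    save_state(state, path)  # final checkpoint once the input is exhausted
    return state
```

If the job is killed and resubmitted, calling `process` again with the same checkpoint path picks up after the last saved item. Skipped items are still read from the input, so this saves processing time but not the time to scan back to the resume point.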