[17:20:12] zhuyifei1999_: Are you aware of the "Grid" in toolslab?
[17:20:23] yes
[17:20:42] So, I was asked to run my script on the grid as it was taking quite a long time to run
[17:20:48] yeah
[17:20:55] I made a bash script, which works when I run it directly.
[17:21:13] what's the error?
[17:21:19] On doing "jsub bash my_script.sh" the job doesn't seem to get submitted
[17:21:46] It does say "Your job 9145282 ("catfiles20160724") has been submitted"
[17:21:56] But "qstat" shows nothing
[17:22:07] do you have catfiles20160724.err and catfiles20160724.out?
[17:22:19] Would they be in the same directory I ran jsub in?
[17:22:32] no, in your home dir
[17:22:37] ~/catfiles20160724.err
[17:22:48] Ah, yes I do
[17:23:00] anything inside?
[17:23:11] Ok, so I see. The directory changes in jsub, so it says my_script.sh could not be found
[17:23:12] Thanks :)
[17:23:33] oh yeah, jsub by default uses your home dir as cwd
[17:23:35] np
[17:25:00] How long is the queue typically?
[17:25:20] zhuyifei1999_: My code should take 2 hrs to run - how long will it take including the grid's queue?
[17:25:41] the queue is usually seconds
[17:25:51] on average, as the queue would change depending on others...
[17:25:56] Ouh, cool.
[17:26:20] Nod, it is indeed in "r" state now.
[17:27:04] note that the .out and .err files are usually "late"
[17:30:08] zhuyifei1999_: Sorry for disturbing - but one more question. Do the tasks run on a different unix machine? Or the same one I've sshed into?
[17:30:26] they run on different machines
[17:30:30] Nice, ok
[17:31:29] it's okay, I'm trying to fix weird issues here, so feel free to disturb :P
[17:31:41] XD
[17:44:51] zhuyifei1999_: Unusually, my submitted script in Grid keeps getting a KeyboardInterrupt exception
[17:45:10] ran out of memory
[17:45:21] Nope, it didn't run out of memory - that gave a different error
[17:45:32] I then added "-mem 4g" - so, it's definitely not a memory issue
[17:45:41] hmm
[17:45:57] Can I create a "subprocess" in python in the Grid?
[17:46:03] yes
[17:46:30] Ok. I get the KeyboardInterrupt in the subprocess repeatedly, hence was wondering
[17:46:51] but it'll get SIGINT (KeyboardInterrupt) anyways
[17:47:31] As in? I don't follow
[17:47:35] https://tools.wmflabs.org/paste/view/1793238d
[17:48:00] I mean using subprocesses can't avoid SIGINT
[17:48:08] I've tried, failed
[17:48:30] But what gives the SIGINT?
[17:49:03] my testing says it happens usually when it runs out of memory
[17:49:07] Hmm
[17:49:11] Let me try 8g then :|
[17:49:25] so some of my memory-intensive tasks have 5g or 6g
[17:49:33] Just a note though
[17:49:40] memory (ab)use
[17:49:41] This subprocess calls a java class
[17:49:53] And earlier it used to give:
[17:49:54] ERROR:root:Error occurred during initialization of VM Could not reserve enough space for object heap Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit.
[17:50:27] :(
[17:50:51] ^ That was with 1g. Hence I increased it to 4g
[17:50:57] Lemme try 8g :P
[17:50:59] alternatively, poke Yuvipanda until k8s is ready for one-off jobs :(
[17:51:19] k8s is so much better with containers
[17:53:49] I want to complete this by end of week, so poking will be insufficient for my needs :P
[17:54:07] oh AbdealiJK do you know of an rsync compiled for windows? I don't use windows, but others use it
[17:55:16] zhuyifei1999_: Nope, sorry. I don't use windows either
[17:56:11] zhuyifei1999_: Ok, cool. So, 8g did the trick and 1 file got analyzed successfully at least. Here's me hoping no other file causes that issue later
[17:56:21] k
[17:56:29] Thanks :))
[17:57:45] rsync exists in cygwin FWIW - https://www.itefix.net/cwrsync
[17:57:53] np
[17:58:02] looking
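
On the working-directory issue from 17:23 (jsub starting the job in the home dir, so a relative my_script.sh was not found): a minimal Python sketch of one way to make a job independent of the cwd it is launched from, by resolving paths against the script's own location. The helper names script_dir and run_from_script_dir are hypothetical, made up for illustration; they are not part of jsub or any grid tooling, and the sketch assumes the job is itself a Python script.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def script_dir(script_file: str) -> Path:
    """Return the absolute directory that contains script_file."""
    return Path(script_file).resolve().parent


def run_from_script_dir(
    script_file: str, command: Sequence[str]
) -> subprocess.CompletedProcess:
    """Run command with its cwd set to the directory of script_file.

    The child process then sees the same relative paths no matter
    which directory the parent job was started in.
    """
    return subprocess.run(
        list(command),
        cwd=script_dir(script_file),
        capture_output=True,
        text=True,
    )
```

Inside a job script this could be called as run_from_script_dir(__file__, ["bash", "my_script.sh"]), assuming my_script.sh sits next to the Python file. Passing an absolute path to jsub in the first place would sidestep the problem as well.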