[08:48:07] Any tools webserver admin around here?
[09:15:01] is any direct communication between grid nodes possible?
[09:57:28] MartijnH: they can talk to each other over the network, I think
[09:57:41] MartijnH: but I think SGE also allows you to schedule two jobs on the same node
[09:57:55] ('schedule this job on the node where job X is running')
[09:58:31] if I can communicate over TCP that's fine
[09:58:49] but I'm getting the feeling I want to do things that I shouldn't want to do
[10:00:04] (I'm looking at category intersection at the moment; CATSCAN is horribly slow for large categories, and I'd like to be able to split up a category into bite-sized pieces over multiple nodes, but that would require some communication)
[10:00:26] I'm not at all sure what the etiquette for hogging nodes is, or how I can signal completion of a workload, though
[10:00:39] other than writing to Redis and polling periodically
[10:01:00] (which is probably the worst way of doing IPC possible)
[10:02:09] so since I'm getting the feeling I want to do things that I shouldn't want to do, I probably need subtle clue adjustments on what the underlying idea is, and not abuse it for the thing in my head
[10:06:00] MartijnH: I'm not really clear on what you want to do with the intermediate bits of data
[10:06:22] MartijnH: but as catscan is db-heavy and not cpu-heavy, you could just have it submitted as a single job, I'd think?
[10:08:03] valhallasw, I probably would want to send the results back as soon as I have them, and keep a long polling connection open
[10:09:03] MartijnH: this would be from a web app, then?
[10:09:20] MartijnH: in that case, you could just submit the queries from the web server - no need for SGE
[10:09:21] seems like a job for websockets, otherwise I'll have to resort to a longpolling cycle
[10:11:14] I'll try and set up that first then, probably a good idea to get my feet wet with simpler things before going to more complicated scenarios
[10:15:48] MartijnH: you don't need to 'poll' redis
[10:16:02] MartijnH: you can also just use tcp or zeromq
[10:16:24] I'll take tcp :D
[10:17:32] MartijnH: the only thing is finding out where the other processes are running
[10:17:34] which node and port
[10:17:41] MartijnH: you can use a well known path to write that
[10:18:32] as long as it is uniquely identified, it should be good. First act of a new actor system would be to send back its address
[10:19:17] the job itself should have sufficient knowledge of its own address that it can send that along for creation of a job starting an actor system
[11:12:20] Coren|Away: hm, lighttpd has again crashed without errors reported.
[12:21:26] Hi, something wrong with the take command? I always get "Invalid argument"?
[12:21:27] local-guc@tools-login:~/public_html$ ls
[12:21:29] app.php index.php lb settings.php
[12:21:30] local-guc@tools-login:~/public_html$ take index.php
[12:21:32] index.php: Invalid argument
[13:45:08] Coren|Away: any idea why "take" is not working for me? ↑
[13:45:30] chown -R local-guc:local-guc public_html/* isn't working either...
[13:45:52] Luxo: Known issue with the current NFS server and kernel version. It's going to be tricky to fix, but I can change file ownership for you manually in the meantime.
[13:46:31] {{done}}
[13:55:44] Coren|Away: if you have time to look at it -- gerrit-patch-uploader's lighttpd is again stopping/crashing without any info in the log
[13:56:34] valhallasw: Better yet, lemme patch the startup script so that its errors also get output there. :-)
[13:56:39] :-D
[14:03:17] merged. Tell me how it goes?
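Editor's note on the 10:17 to 10:19 exchange (finding "which node and port" via "a well known path"): the idea can be sketched roughly as below. This is a minimal illustration, not anything that was run on the grid. The function names, the `.addr` file layout and the `job-1` identifier are all hypothetical, and the sketch binds to 127.0.0.1 so it can run on one machine; on real grid nodes the worker would bind to an address the other nodes can reach, and the registry directory would have to be on a shared filesystem.

```python
import os
import socket
import tempfile
import threading


def start_worker(registry_dir, job_id):
    """Bind to a free TCP port and publish "host:port" under a well-known path."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the OS for any free port. 127.0.0.1 is an assumption for
    # this single-machine sketch; a grid job would use a reachable address.
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    host, port = server.getsockname()
    path = os.path.join(registry_dir, job_id + ".addr")
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        handle.write("%s:%d" % (host, port))
    # Rename into place so a reader never sees a half-written address file.
    os.replace(tmp_path, path)
    return server, path


def lookup_worker(registry_dir, job_id):
    """Read the published address of a worker back from the well-known path."""
    with open(os.path.join(registry_dir, job_id + ".addr")) as handle:
        host, port = handle.read().rsplit(":", 1)
    return host, int(port)


def serve_once(server):
    """Accept one connection, acknowledge the received chunk, then return."""
    conn, _ = server.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"done:" + data)


def demo():
    """Start a worker, discover it through the registry, and exchange one message."""
    with tempfile.TemporaryDirectory() as registry:
        server, _ = start_worker(registry, "job-1")
        thread = threading.Thread(target=serve_once, args=(server,))
        thread.start()
        host, port = lookup_worker(registry, "job-1")
        with socket.create_connection((host, port), timeout=5) as client:
            client.sendall(b"chunk-7")
            reply = client.recv(1024)
        thread.join()
        server.close()
        return reply
```

Because the worker replies over the open connection as soon as it has a result, the coordinator does not need to poll Redis to learn that a piece of work has finished.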
[14:03:38] ok, .STOPPED removed - let's see what error.log brings up.
[14:04:17] * Coren|Away needs to find a better way to manage this.
[14:05:00] If know smarts were like radiation and I could just soak it up by being in the same IRC channel as you guys...
[14:05:04] If only*
[14:06:06] Huh.
[14:06:34] Well, now it's obvious to *me* what's going on.
[14:07:10] port already in use. Er.
[14:07:17] valhallasw: Yeah, it's a bug.
[14:07:20] Too many people switched to lighttpd already? ;-)
[14:08:17] * Coren|Away goes and works on the bug now.
[14:15:32] valhallasw: pprt used on quick restart?
[14:15:39] port
[14:15:53] saper: ?
[14:16:02] lighttpd tries to bind to a port that's already in use
[14:16:53] But is it a port used by a former lighty instance?
[14:17:05] I don't know.
[14:20:09] in that case, setting SO_REUSEADDR/SO_REUSEPORT socket options might help
[14:21:15] netstat should tell if there are any TIME_WAIT TCP sockets
[14:21:38] Ah, found a nicer way of doing it that will in fact allow users to start and restart their server directly too.
[14:21:43] * Coren|Away goes and implements.
[15:52:43] where do i find the largest concentration of german speakers?
[15:53:34] giftpflanze sounds german
[15:53:35] http://lists.debconf.org/lurker/message/20131012.145837.126e53a7.en.html
[16:46:50] Coren|Away, http://tools.wmflabs.org seems down (but https:// is up)
[18:03:30] and it's up again. Weird.
[18:04:11] YuviPanda, around?
[18:04:21] sortof
[18:04:36] happen to run into sbt deployment?
:)
[18:04:52] you want sbt on toollabs?
[18:04:55] that can be arranged quickly
[18:05:03] goody :D
[18:05:14] web will be slightly harder, but sbt is super easy
[18:05:21] MartijnH: you might get only scala 2.9 tho
[18:05:28] meh
[18:05:35] sbt will pull the scala version I specify anyway
[18:05:39] ah
[18:05:40] right
[18:05:48] scala itself really is optional with sbt
[18:06:04] as long as there is a JVM
[18:06:47] MartijnH: sbt installed on -dev and -login
[18:06:51] cool
[18:06:51] puppetizing in a moment now
[18:06:57] MartijnH: also have 7 jdk and jre
[18:07:21] in case you're interested in what I'm playing with, https://github.com/martijnhoekstra/overmind should be a proof of concept of 3 grid nodes communicating through Akka actors
[18:07:43] I'm not 100% positive if I got the grid engine stuff right though
[18:07:57] MartijnH: do you need sbt on the execution hosts themselves?
[18:07:59] I guess not?
[18:08:01] they have JRE
[18:08:06] nah, JRE is enough
[18:08:09] sweet
[18:08:20] sbt would be easy, but it would be the quick and very dirty way out
[18:08:34] you don't want to run the actual production stuff with the overhead of sbt
[18:08:59] right
[18:09:11] MartijnH: can you verify you've sbt on -login and -dev?
[18:09:25] I got it on -login
[18:09:28] let me check -dev
[18:09:30] sweet
[18:10:30] -dev too
[18:11:43] sweet
[18:14:27] at least it seems to compile
[18:26:05] OK, the Gerrit Patch Uploader is finished except for two minor issues: the new oauth consumer needs approval and lighttpd should be fixed (although it works without)
[20:32:57] valhallasw: new web server scheme. All is now controlled through the 'webservice' command.
[21:18:30] Coren|Away: awesome
[21:43:10] now I really need someone to OK my OAuth consumer :-)
[21:56:58] robla, could you maybe OK my fourth-attempt-to-get-it-right OAuth consumer on mw.o?
https://www.mediawiki.org/wiki/Special:MWOAuthConsumerRegistration/update/f26337df9e626952ba65499be86f8e62
[21:59:42] valhallasw: I think I need to defer to csteipp unfortunately
[22:00:28] robla: sure. However, csteipp doesn't seem to have rights on mw.o :/
[22:00:42] I'm pretty sure he does
[22:01:40] I
[22:01:45] I'll send him an e-mail. Thanks.
[22:02:37] sorry I couldn't be more help. AaronSchulz and/or anomie might be able to do something too
[22:03:11] anomie OK'ed it last time, but I think he's only around during office hours
[22:05:54] valhallasw: it looks like your request is already approved
[22:06:39] robla: version 0.1.2, which has an incorrect callback url, has been approved, but 0.1.3, which has the correct url, has not been :-)
[22:08:24] ok...I think I can handle this
[22:09:22] valhallasw: approved
[22:18:15] robla: cool, thanks. It's working like a charm now: http://tools.wmflabs.org/gerrit-patch-uploader/
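Editor's note on the 14:20:09 suggestion (SO_REUSEADDR for a port that is "already in use" after a quick restart): the log does not say whether lingering TIME_WAIT sockets were the actual cause, and the eventual fix was a different one, so the following only illustrates what setting that socket option looks like for a listening server. The `make_listener` helper is a hypothetical name, and this is not lighttpd's code.

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(). On Linux this lets a restarted server bind
    # a port whose previous connections are still in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock
```

SO_REUSEADDR does not help when another live process is still listening on the same port; in that case bind() still fails with "address already in use", which `netstat` would show as a LISTEN entry rather than TIME_WAIT.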