[00:00:59] Coren, [00:00:59] db.setHostName("eswiki.labsdb.wmflabs.org"); [00:00:59] db.setDatabaseName("eswiki_d"); [00:01:21] _p not _d. :) [00:03:23] I caught the file "my.cnf" , my username and password [00:03:54] Harpagornis: what language are you coding in? [00:04:01] c++ [00:05:27] Why? [00:06:18] Harpagornis: If I was familiar with the language I would give you a hand [00:06:56] aah ok [00:07:07] !log deployment-prep Fixed puppet for deployment-jobrunner01 using https://gerrit.wikimedia.org/r/#/c/134519/2 [00:07:11] Logged the message, Master [00:08:06] How you make your connection? [00:08:09] Betacommand, [00:08:18] in your lenguage [00:09:06] Harpagornis: db = MySQLdb.connect(db=database, host=database2+".labsdb", read_default_file=os.path.expanduser("~/replica.my.cnf")) [00:09:53] where database and database2 are variables [00:11:13] Bgwhite: you around? [00:11:30] Yes [00:12:05] bgwhite_: you there? [00:12:17] I'm here.... I think [00:13:59] bgwhite_: Betacommand: o/ [00:14:16] a930913: he needs some help [00:14:33] Betacommand: "It has been 0 days since the last Labs incident." [00:14:50] a930913: dont get me started [00:17:10] Betacommand: I'm coining a new BOFH acronym for hammer - HRT, hardware reconfiguration tool. [00:17:33] BOFH ? [00:17:49] Betacommand: Really? O.O [00:18:11] a930913: Im doing about 8 things at once right now [00:18:28] Betacommand: Google. Read. Lose a few hours of your life :p [00:18:47] a930913: I dont have that right now [00:18:57] which is why I asked [00:19:56] Betacommand: I won't do it justice. Google it when you randomly remember me mentioning it. [00:20:13] a930913: that wont happen [00:22:44] Betacommand: Here's one. 
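Betacommand's MySQLdb snippet above follows the Tool Labs conventions of the time: replicas live at `<project>.labsdb`, the readable database is `<project>_p` (hence Coren's "_p not _d" correction), and credentials sit in `~/replica.my.cnf`. A hedged sketch of that pattern, with the parameter-building split out (the helper name `replica_params` is mine, not an existing API; the actual connect only works from inside labs):

```python
# Hedged sketch of the connection pattern described in the chat above.
# Assumes the Tool Labs conventions: host "<project>.labsdb", database
# "<project>_p", credentials in ~/replica.my.cnf. The helper is pure so it
# can be inspected anywhere; the connect itself needs MySQLdb and labs.
import os


def replica_params(project):
    """Build MySQLdb.connect() keyword arguments for a wiki's replica."""
    return {
        "db": project + "_p",                      # eswiki_p, not eswiki_d
        "host": project + ".labsdb",               # e.g. eswiki.labsdb
        "read_default_file": os.path.expanduser("~/replica.my.cnf"),
    }

# From inside labs you would then do:
#   import MySQLdb
#   conn = MySQLdb.connect(**replica_params("eswiki"))
```

The same host/database/credentials triple applies from any language; Harpagornis's C++ client would pass the equivalent values to its MySQL driver.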
http://www.theregister.co.uk/2012/07/27/bofh_2012_episode_7/ [00:22:51] ori: Any idea how to fix "Duplicate definition: Class[Mediawiki::Jobrunner] is already defined in file /etc/puppet/manifests/role/mediawiki.pp at line 225; cannot redefine at /etc/puppet/manifests/role/mediawiki.pp:259" for a node trying to apply role::mediawiki::videoscaler? [00:24:08] It almost looks like Puppet is deciding that '::mediawiki::jobrunner' resolves to '::role::mediawiki::jobrunner' [00:27:19] Coren: you around? [00:27:32] Betacommand: Mostly. [00:28:21] Coren: Bgwhite is having some issues since the cron change [00:28:44] Betacommand: "some issues"? What specifically? [00:28:48] Coren: the specific error message is: [00:28:50] DBI connect('s51080__checkwiki_p:host=tools-db.labsdb','s51080',...) failed: Unknown MySQL server host 'tools-db.labsdb' (2) at /data/project/checkwiki/bin/dump_dispatcher.pl line 95 [00:29:12] Coren: can I have a few seconds to dig up the message :P [00:30:26] Betacommand: 'tools-db.labsdb' doesn't exist; tools-db (no fqdn) is a compatibility alias for 'tools.labsdb' which is the real name. [00:30:58] bgwhite_: try that and let us know if there are still issues [00:33:18] bgwhite_: you there? [00:33:30] Coren: Done. Waiting for cron to run. The name in the program was origanally tools-db [00:33:40] !log deployment-prep Puppet failing on deployment-videoscaler01 with duplicate definition of Class[Mediawiki::Jobrunner] [00:33:43] Logged the message, Master [00:33:52] bgwhite_: just plain 'tools-db' also should work, actually. [00:34:09] DBI connect('s51080__checkwiki_p:host=tools.labsdb','s51080',...) failed: Unknown MySQL server host 'tools.labsdb' (2) at /data/project/checkwiki/bin/dump_dispatcher.pl line 95 [00:34:38] bgwhite_: Wait, are you running this in a crontab without sending it off to the grid? (Like, with jlocal)? [00:34:48] Coren: yeah [00:35:08] I know what's wrong then; gimme a sec. 
(But 'tools-db.labsdb' wouldn't have worked for the other reason) [00:35:36] Ah, that was it. tools-submit wasn't in the list of hosts that get the updated hosts file. Fixed. [00:35:57] Coren: the script is a wrapper that submits jsub queries [00:36:25] I presume it opens a db connection to fetch a list of things to submit against? [00:38:18] It worked. It connected to the database. [00:39:06] Bgwhite are you still getting the jsub errors? [00:47:28] bgwhite_: is it working? [00:47:54] Coren: Im surprised that the host file isnt a puppet config thing [00:48:43] Betacommand: It should be; it's just that it's a little complicated/annoying to put it there because it's generated outside labs by the db maintenance script. [01:03:04] 3Wikimedia Labs / 3Infrastructure: !add-labs-user gone, fix or add docs to link SVN users to labs/wikitech - 10https://bugzilla.wikimedia.org/64596#c4 (10John Mark Vandenberg) I dont have a wikitech wiki user. My username on wmf wikis is 'John Vandenberg'. Email is per bugzilla account. Preferred ldap accoun... [01:09:19] 3Wikimedia Labs / 3deployment-prep (beta): cannot sudo on deployment-bastion - 10https://bugzilla.wikimedia.org/65548#c3 (10Bryan Davis) (In reply to Daniel Zahn from comment #2) > Does this mean the users should be converted like in: > > https://bugzilla.wikimedia.org/show_bug.cgi?id=64596 > > (instead of... [01:10:19] 3Wikimedia Labs / 3deployment-prep (beta): Users with primary group of 550(svn) cannot sudo as mwdeploy on deployment-bastion - 10https://bugzilla.wikimedia.org/65548 (10Bryan Davis) [01:18:21] 3Wikimedia Labs / 3deployment-prep (beta): Can't apply Puppet class role::mediawiki::videoscaler in beta - 10https://bugzilla.wikimedia.org/65569 (10Bryan Davis) 3NEW p:3Unprio s:3major a:3None err: Could not retrieve catalog from remote server: Error 400 on SERVER: Duplicate definition: Class[Mediaw... 
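Coren's explanation above boils down to a small naming rule: `tools.labsdb` is the real host, `tools-db` (no fqdn) is a compatibility alias for it, and `tools-db.labsdb` never existed. Purely as an illustration (this lookup is mine, not any real resolver), the fix bgwhite_ needed can be written as:

```python
# Illustrative only: encodes Coren's explanation as a lookup table.
# "tools-db" is a compatibility alias for the real host "tools.labsdb";
# "tools-db.labsdb" (alias plus fqdn suffix) does not exist.
CANONICAL_HOST = "tools.labsdb"
_KNOWN_NAMES = {"tools-db", "tools-db.labsdb", CANONICAL_HOST}


def tools_db_host(name):
    """Return the canonical tools database host for a configured name."""
    if name in _KNOWN_NAMES:
        return CANONICAL_HOST
    raise ValueError("unknown tools database host: %r" % name)
```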
[01:19:04] 3Wikimedia Labs / 3deployment-prep (beta): Can't apply Puppet class role::mediawiki::videoscaler in beta - 10https://bugzilla.wikimedia.org/65569 (10Bryan Davis) p:5Unprio>3High [01:54:50] 3Wikimedia Labs / 3deployment-prep (beta): Can't apply Puppet class role::mediawiki::videoscaler in beta - 10https://bugzilla.wikimedia.org/65569 (10Ori Livneh) a:3Ori Livneh [06:13:17] i have a home page for my tool, but it is not linked from the Tools home page(http://tools.wmflabs.org/) automatically.. tool->http://tools.wmflabs.org/bub/ [06:13:57] can i manually edit the tools home page to add a link [07:59:07] rohit-dua: No, the script checks that there is a index.* in ~/public_html. [08:01:00] scfc_de: my is index.py inside a sub folder. i'm using url.rewrite-once [08:03:22] rohit-dua: I know, but that doesn't change that the script looks for index.* in ~/public_html :-). You should probably be okay by putting an empty index.html there. [08:04:49] scfc_de: ok [08:05:21] scfc_de: also how can i enable terminal colors in the shell.. [08:05:46] i looks kind of dull [08:07:31] rohit-dua: Hmmm. Don't know; I have colours in the shell on tools-login.wmflabs.org. [08:08:36] For me, $TERM is set to xterm-256color, but I don't know if that's figured out by something on tools-login or if my ssh client sets it. [08:21:56] scfc_de: i tried export TERM=screen-256color but still no colors [08:23:22] rohit-dua: Sorry, don't know and gotta go. [09:49:03] Hello [10:09:26] Please can I ask a question about Tool labs? [10:17:18] 3Tool Labs tools / 3[other]: debug feature does not work in orwell01 - 10https://bugzilla.wikimedia.org/52152#c2 (10Abshirdheere) I'm sorry, I don't know if this place I can ask this question Anyway I use python as pywikipedia and I have this problem import MySQLdb as mysqldb ImportError: No module named MyS... [10:59:09] Abshirdheere: Don't ask to ask, just ask, and then anybody who knows the answer can answer. 
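scfc_de's point above is that the front-page script only lists a tool if `~/public_html` contains an `index.*` file at its top level, so rohit-dua's `index.py` inside a subfolder (reached via `url.rewrite-once`) does not count. A sketch of that rule and the suggested workaround, assuming the described behaviour (the real listing script's code is not shown in this channel):

```python
# Sketch of the front-page listing rule scfc_de describes: a tool appears
# on tools.wmflabs.org only if its public_html has a top-level index.*.
# The workaround is to drop an empty index.html there.
import glob
import os


def is_listed(public_html):
    """True if the directory has an index.* file at its top level."""
    return bool(glob.glob(os.path.join(public_html, "index.*")))


def ensure_listed(public_html):
    """The suggested workaround: create an empty index.html if missing."""
    if not is_listed(public_html):
        open(os.path.join(public_html, "index.html"), "w").close()
```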
[11:04:04] !log integration deleted integration-composer instance. Archived /mnt/ in /data/project/ [11:04:06] Logged the message, Master [11:44:19] 3Wikimedia Labs / 3tools: Install python module matplotlib - 10https://bugzilla.wikimedia.org/61445#c3 (10Philippe Elie) 5RES/FIX>3REO p:5Unprio>3High It seems matplotlib is no longer installed tools.phetools@tools-login:~$ python -c "import matplotlib" Traceback (most recent call last): File " 3Wikimedia Labs / 3tools: install tesseract command line tool and the associated language package - 10https://bugzilla.wikimedia.org/65354 (10Philippe Elie) p:5Unprio>3High [11:52:33] 3Tool Labs tools / 3[other]: debug feature does not work in orwell01 - 10https://bugzilla.wikimedia.org/52152#c3 (10Andre Klapper) (In reply to Abshirdheere from comment #2) > Anyway I use python as pywikipedia and I have this problem Please see https://www.mediawiki.org/wiki/Manual:Pywikibot - this bug rep... [12:09:48] 3Tool Labs tools / 3[other]: debug feature does not work in orwell01 - 10https://bugzilla.wikimedia.org/52152#c4 (10Abshirdheere) Please clarify step bots connect to the database replica.my.cnf user-config.py db_username = "username" db_password = "******" Thanks. [12:39:04] 3Tool Labs tools / 3[other]: debug feature does not work in orwell01 - 10https://bugzilla.wikimedia.org/52152#c5 (10Andre Klapper) Abshirdheere: This is NOT a support forum so please stop commenting on this unrelated bug report. I have told you already where to ask your question instead. [12:51:38] hi all :) [12:52:09] i want to find all the instances in a given project which have a certain puppet role. [12:52:23] does anybody know of a way to do this programmatically, say from a python script? [12:54:34] JohannesK_WMDE: This is actually surprisingly hard as what instances get from the puppet master is the /complied/ manifest and not its source. [12:55:08] JohannesK_WMDE: As a rule, the only way to do this is by checking for side effects. 
[12:55:11] Coren: you mean compiled? like some form of bytecode? [12:55:40] JohannesK_WMDE: More like preprocessed text with all substitutions done. [12:56:10] hm, so i could have the role create files in a shared directory which can then be globbed or something... [12:56:11] JohannesK_WMDE: (It's still called "complied" in puppet parlance) [12:56:24] oh, ok, so it wasn't a typo :) [12:56:25] JohannesK_WMDE: Yes, that's one easy way to do it. [12:56:36] No, it was a typo I managed to make twice. [12:56:41] ComPILed. [12:56:45] :-P [12:56:55] ah. now i'm confused :p ok thanks Coren [14:14:48] Coren: good morning :-) [14:15:03] Coren: I was wondering whether I could extract a list of the IP used on a given project :-D [14:15:28] possibly via LDAP. The use case would be to have puppet to do a magic query to list the IP of my 'integration' project and use that to allow those hosts to rsync [14:16:37] hashar, you mean the IPs of the VMs in a project? [14:16:44] the instance IP yeah [14:17:00] I am trying to figure out differentiate scenario to be able to push data from instances to a production host :] [14:17:15] there are so many ways to handle it that I eventually get lost [14:17:20] You want to do it once, or you want an automated way? [14:17:41] It's pretty easy to do on virt1000, shouldn't be too hard from elsewhere in production either (an openstack query + some awk) [14:18:19] You could probably also do it via the mediawiki API [14:18:28] ah that is a good call [14:18:31] since wikitech displays all that… [14:18:45] gotta write down some possible scenarios now :] [14:19:38] I am more familiar with the MediaWiki API so that my be the best option for me [14:19:39] thx! [14:20:05] I'm not totally sure how to do this, but you can probably write some kind of targetted SMW query and pull it up via the API [14:20:14] since (in theory) SMW knows vital stats of all instances [14:20:58] can we enable color in shell terminal.. tools lab instance. 
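JohannesK_WMDE's idea above (checking for side effects, as Coren recommends) can be made concrete: have the puppet role drop a marker file named after the instance into a shared directory, then glob that directory. The marker-directory path here is made up for illustration:

```python
# Sketch of the "side effect" approach discussed above: the puppet role
# writes a marker file named after each instance into a shared directory
# (the path /data/project/<tool>/role-markers/ is a made-up example),
# and a script globs that directory to list instances carrying the role.
import glob
import os


def instances_with_role(marker_dir):
    """Return instance names that dropped a marker file in marker_dir."""
    return sorted(os.path.basename(p)
                  for p in glob.glob(os.path.join(marker_dir, "*")))
```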
[14:21:08] and last question, do we have some private spaces to put passwords in ? labs/private being public [14:21:13] I could use a place to store some password [14:21:39] andrewbogott: Coren proxy seems stable after 1.5.0 downgrad3e [14:21:46] still haven't found a way to notify me when it is down tho [14:21:49] yeah :( [14:22:38] YuviPanda: I'm pretty sure we have icinga running on labs instances, someplace? So you could write a custom test for that. [14:22:40] It's not too hard [14:22:52] !log integration migrated integration-publisher to use puppetmaster::self [14:22:53] Logged the message, Master [14:23:13] andrewbogott: there's a proxy instance on tools-proxy-test, so if I can somehow open up the redis port on tools-webproxy to that instance alone, I can set that to be a slave of this, and have two working proxies [14:23:19] so we can test things on one of them [14:33:46] YuviPanda: And there was much annoyance, and gnashing of teeth. [14:33:56] * Coren really wants to know why. [14:34:23] Coren: yeah, error logs were helpful the first time, since it just fully failed, but the intermittent issues I can't explain [14:34:59] Coren: the 499s perhaps were a red herring, since they still exist in about the same rate [14:35:01] andrewbogott: 6 [14:35:02] err [14:35:03] ^ [14:35:40] It does appear to be capacity related or load related though. Perhaps there is some tunable whose default value/behaviour changed between versions? [14:35:40] Coren: andrewbogott I definitely need to first do some instrumentation, record 499 / 500 / status code logs, and then track them to see if things are changing. 
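The instrumentation YuviPanda describes above (recording 499/500/status-code rates from the proxy to see whether they change between versions) could start as simply as tallying status codes from access-log lines. The parsing below assumes an nginx-style log where the status follows the quoted request line; the real log format may differ:

```python
# Minimal sketch of status-code instrumentation for the proxy logs:
# count HTTP status codes (499s, 500s, ...) per log line. Assumes an
# nginx-combined-style format where the 3-digit status immediately
# follows the closing quote of the request; adjust for the real format.
import collections
import re

# first standalone 3-digit field right after a closing quote
_STATUS_RE = re.compile(r'"\s+(\d{3})\s')


def status_counts(lines):
    counts = collections.Counter()
    for line in lines:
        m = _STATUS_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

Grouping the same counts by the tool name in the request path would give the per-tool stats mentioned below, making it easier to tell whether a handful of tools cause most of the 499s.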
[14:36:00] Coren: maybe, but I explicitly merged a patch later that tuned some variables that we were hitting limtis of (Open FDs, primarily) [14:36:18] 499 IMO are not that interesting except, perhaps, as a signal that things have gotten really bad (and that users are doing an unusual number of stop-and-reloads from their clients) [14:37:12] right [14:37:58] having per tool access stats would also be useful, I think. Not just for tool users, but for us too. If all 499s were isolated to 3 tools, it is easier to reason that they are being caused by the tools [14:39:20] Hi YuviPanda [14:39:25] hello nischayn22 [14:39:33] long time :) [14:39:35] where are you? [14:41:18] nischayn22: righ now? Scotland :) [14:41:19] nischayn22: you? [14:41:39] Bangalore :/ [14:42:02] nischayn22: I'll be back in a few weeks there, though. [14:42:32] YuviPanda: Why? You like power cuts? [14:42:37] nischayn22: haha :P [14:42:42] nischayn22: might move to Goa for a while after [14:43:08] YuviPanda: Interesting.. Bangalore, Delhi and now Goa (tech cities in India) [14:44:04] nischayn22: haha :P let's not even go there :P [14:44:13] nischayn22: heard you were going to help ocassi with some tool [14:44:26] YuviPanda: Yeah, I am here for that [14:44:41] I have a wikitech Labs account and got the access too, have uploaded the SSH key as well. But I don't seem to be able to login via SSH [14:45:41] nischayn22: what are you trying? [14:45:46] nischayn22: ssh tools-login.wmflabs.org [14:46:22] YuviPanda: Yeah, I tried that. Permission denied [14:46:42] nischayn22: try again? I'm tailing the auth log now [14:47:03] YuviPanda: Done [14:47:13] nischayn22: your shell username is wrong. ninahata does not exist as a user [14:47:17] nischayn22: do you remember what your shell name is? 
[14:47:18] YuviPanda: I hope I am not doing some newbie mistakes [14:47:27] YuviPanda: Ah, I see [14:47:54] nischayn22@tools-login.wmflabs.org [14:47:59] nischayn22: try sshing to that [14:48:32] !log integration applied role::ci::publisher::labs to integration-publisher to setup rsync {{gerrit|134608}}  [14:48:34] Logged the message, Master [14:48:50] nischayn22: I see that worked :) [14:49:00] YuviPanda: I had tried that and it didn't work either. I realized now that it was because of my .ssh/config [14:49:04] YuviPanda: Yes, worked finally [14:49:10] nischayn22: oh, what was in your config? [14:49:17] nischayn22: also I highly recommend mosh :) [14:49:46] YuviPanda: Something like this : Protocol 2 User ninahata [14:49:50] aaah [14:49:51] right [14:50:14] !log deployment-prep restarted logstash on deployment-logstash1; getting really tired of these soft crashes [14:50:16] Logged the message, Master [14:51:07] YuviPanda: 'replacement for SSH'.. sounds challenging but I will give it a try :) [15:04:00] andrewbogott: thanks for your +2 :-] [15:04:13] One of those won't merge, I'm not sure why [15:04:21] I can rebase it [15:06:41] I love rsync [15:07:38] andrewbogott: ah it does not merge because there is a dependency upon https://gerrit.wikimedia.org/r/#/c/127213/ [15:08:00] I do a bunch of atomic commits :-( [15:08:51] hm… that one I'd rather get a vote from someone besides me [15:08:52] hashar: you abandoned tons of things ... [15:09:09] just took too long? [15:09:36] that and I didn't bother to ping someone to get them merged [15:09:43] they are merely lints / whitespace cleanup though [15:09:49] so it is not a big loss and easy to redo [15:10:17] :( [15:10:42] andrewbogott: I rebased both https://gerrit.wikimedia.org/r/#/c/127213/ and https://gerrit.wikimedia.org/r/#/c/129687/ (linked changes) [15:16:05] YuviPanda: I need to use tools to proxy requests to an API and hide the API keys involved.
The URL seems to be http://tools.wmflabs.org/local-reference-api/ [15:16:57] YuviPanda: With this URL I might be facing CORS problem on the browser when accessing wikipedia.org right? [15:29:36] YuviPanda: Never mind, I was able to figure it out. And thanks for all the help [15:42:16] Hi [15:43:08] Coren.. [15:43:43] Are you available? [15:46:38] Harpagornis: What's up? [15:46:46] thank [15:47:25] yesterdat asking about the connections [15:47:37] yesterday* [15:47:54] hostname = "enwiki.labsdb" [15:48:05] dbname= "enwiki_p" [15:48:11] my question is [15:48:34] this connection is from the outside? [15:51:18] Coren , understand? [15:52:04] I need make the connection from the outside, in an application [15:52:14] Ah! You simply cannot do that. [15:52:25] Unless you are, for instance, tunnelling through SSH. [15:52:35] The databases are only accessible from within labs. [15:53:01] aaah ok [15:53:09] molti grazie [15:53:29] now understand,xd [15:54:19] Thank very much Coren [15:54:31] No problem. [16:24:39] JohannesK_WMDE: If you meant to query which instances have a puppet role assigned in the "Configure" dialog at wikitech, you should be able to do that with an SMW query. [16:39:12] andrewbogott: ping [16:39:34] hey! [16:39:37] Coren, interested in helping me troubleshoot mgrabovsky's login problems? [16:39:42] bad news, gerrit username still the same, ssh not working [16:40:04] andrewbogott: Heh. Sure, I'll give a hand. Where is he trying and failing? [16:40:05] Coren, it's either the case that I badly scrambled his account, or just a standard short-list login problem. [16:40:33] mgrabovsky: for starters why don't you try -vvv again and paste the whole output (including the original ssh command) to dpaste.org? [16:41:29] andrewbogott: Coren, i'd like to merge another lint change.. role/nova this time..i'll watch it [16:41:34] Coren, for background… mgrabovsky is one of the unfortunates for whom I attempted an account rename. So no amount of breakage should surprise you.
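Coren's answer above (the replica databases are reachable only from within labs, unless you tunnel through SSH) translates into a local port forward through a labs login host. A hedged sketch of that setup; the bastion host and ports follow the conventions mentioned in this channel but are assumptions, not a documented interface:

```python
# Sketch of the SSH-tunnel approach Coren mentions: forward a local port
# to the replica host through a labs login host, then point the outside
# application at 127.0.0.1. Host and port values (tools-login, 3306) are
# assumptions based on this channel, not a documented interface.
def tunnel_argv(local_port, db_host, user,
                bastion="tools-login.wmflabs.org", db_port=3306):
    """Build the ssh command that opens the forward, e.g.
    ssh -N -L 4711:enwiki.labsdb:3306 you@tools-login.wmflabs.org"""
    return ["ssh", "-N",
            "-L", "%d:%s:%d" % (local_port, db_host, db_port),
            "%s@%s" % (user, bastion)]

# With the tunnel up, the outside application connects to 127.0.0.1:4711
# as if it were enwiki.labsdb:3306, e.g.:
#   subprocess.Popen(tunnel_argv(4711, "enwiki.labsdb", "yourshellname"))
#   MySQLdb.connect(host="127.0.0.1", port=4711, db="enwiki_p", ...)
```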
[16:41:36] on like virt1001 [16:41:46] andrewbogott: here it is: https://dpaste.de/VDbz [16:41:47] restored hashar's change [16:41:55] andrewbogott: BTW, we have tools.wmflabs.org/paste/ [16:42:23] mgrabovsky: can you please include the commandline that results in the output? [16:42:38] it's just ssh -vvv mgrabovsky@bastion.wmflabs.org [16:43:13] * Coren looks [16:43:25] Looks to me like he does indeed have keys on bastion (which is what I expected to have broken) [16:44:37] Found the issue. [16:45:03] ...and? [16:45:16] gimme a sec [16:46:37] his /public/keys/mgrabovsky isn't owned by him. Fixing. :-) [16:46:40] mgrabovsky: Try again? [16:47:10] Hm, was it owned by grabovsky instead? [16:47:37] still the same [16:47:49] It was owned by a uid that had no associated username. [16:48:14] There may well be more than one issue. :-) Checking. [16:49:21] odd, it still complains about ownership or mode but they /seem/ okay. [16:49:25] I can connect now [16:49:32] but get Could not chdir to home directory /home/mgrabovsky: Permission denied [16:49:57] Ah! That's probably the same issue, but with your home. Simple enough to fix. [16:50:15] Fix't [16:50:35] * andrewbogott vows, yet again, to never ever try to rename a labs user [16:50:37] great! thank you very much [16:50:56] and gerrit? [16:50:56] mgrabovsky: ok, I'll look at the gerrit thing again. [16:51:01] I definitely renamed you. I wonder why it didn't take... [16:52:37] I see 'username:mgrabovsky' [16:52:44] but gerrit is reporting you as 'grabovsky'? [16:53:16] yes [16:53:47] Hm… try log out and back in? [16:54:21] nah, nothing changed [17:10:46] Change on 12mediawiki a page OAuth/For Developers was modified, changed by CSteipp link https://www.mediawiki.org/w/index.php?diff=1011896 edit summary: /* Intended Users */ update for /identity [17:15:06] nischayn22: ah, I think CORS is allowed for wikimedia.org by default? 
or something like that [17:20:43] CORS is allowed for a list of domains [17:20:50] (all the WMF ones) [17:24:56] nischayn22: ^ [17:33:50] !log deployment-prep converted deployment-pdf01 (i-00000396.eqiad.wmflabs) to use local puppet & salt master [17:33:52] Logged the message, Master [17:44:24] * bd808 logs into mwalker's new trusty server in labs to look around [17:50:50] Jeff_Green, how did you add the ocg roles to be available in labs? [17:51:06] that is an excellent question [17:51:20] (or rather; can you add the role role::ocg::production) [17:51:35] ha. i liked leaving it at a question, that was easier [17:51:45] hee [17:52:07] I like the question too; that way I can possibly do it later [17:55:03] I might have just found it, but the wiki foo is full of beelzebub [17:55:38] andrewbogott, Coren ^ how does one add production puppet roles to be available in labs [17:56:00] mwalker: https://wikitech.wikimedia.org/wiki/Special:NovaPuppetGroup [17:56:04] i found where you do it, i just can't get the stupid wiki to actually render what I need to make it happen [17:56:12] There's a "manage puppet groups" in the sidebar; you can make classes (and variables) available to projects from there. [17:56:25] Jeff_Green: Huh? How so? [17:56:46] afaik the class is role::ocg::something, and role doesn't show up on the projects dropdown [17:57:18] I get a brief flicker of the full list (which would be great) which disappears as soon as I'm sure some Javascript evilness realises I'm about to get to what I actually need to [17:57:23] Jeff_Green: The project dropdown is just to select projects to manage. :-) [17:57:30] Jeff_Green, I added it [17:57:38] *to the deployment-prep project [17:57:39] Coren: you add it by project? [17:58:05] oic. wacky [17:58:16] Jeff_Green: Yes. There's also a 'All projects' set of groups but IIRC only global labs admins (i.e.: me and Andrew) can edit those. 
[17:58:42] ya I just found "all projects" [17:58:52] i seem to have access to modify those fwiw, but I think we're good now [17:59:19] Jeff_Green, since I can add ocg stuff to individual projects; can you remove the ocg stuff from the global group? [17:59:41] mwalker: sure [18:01:41] odd. All Projects doesnt show up in teh projects dropdown, but it's tacked on at the bottom of the deployment-prep one [18:02:56] mwalker: actually I'm not going to touch this. it's not clear to me how it works. [18:03:17] fairnuf [18:18:50] andrewbogott: OK, I set up a Gerrit, created a user, renamed it in account_external_ids, and nothing [18:19:01] andrewbogott: restarted it and everything's OK [18:19:33] andrewbogott: that may not be a feasible option in this case, though, right? [18:19:54] What do you mean by 'and nothing'? [18:20:05] I already renamed you in account_external_ids, that's why I'm confused [18:20:42] the bottom line is that it needs a reboot [18:21:29] oh, /gerrit/ needs a reboot? Hm... [18:23:23] mgrabovsky: check now? [18:23:46] awesome [18:24:03] thank you very much and sorry for all the hassle [18:24:43] Coren: thank you, too [18:43:50] hedonil1, ping [18:51:13] Cyberpower678: Hi! [18:51:27] I'm back online. :D [18:51:32] long time, no see \o [18:51:39] With DSL however. :/ [18:52:11] But it's pretty fast, and I can stream Hulu and Netflix with it, so I'm not complaining. [18:52:26] Great [18:52:43] * Cyberpower678 goes to power wash the bird cage, and then he'll revive his bot. [18:53:18] Yeah, let's get things done :P [18:53:50] Apparently my bot half died during my absence. [22:44:03] hedonil, I made the switch to ar_user. Runs fast now. :-) [22:49:25] Cyberpower678: when are you going to disable the optin requirement? [22:51:20] Betacommand, a few users have dragged this out to the foundation, so it would be wrong to remove it before I get a response. However, due to a lack of a response, I sent Phillipe a ping. 
If he doesn't respond in a few days, I'll go ahead and remove it. I just want to make sure the foundation, which I feel won't, doesn't have any objections. [22:52:05] Cyberpower678: K, just saw the RfC close a while back [22:52:11] * CP678|food is away: This is a manual computer virus. Please copy paste me in your away message. I'm not here right now. [23:02:43] Q: Can a crontab invoked job submit more jobs? [23:05:35] hasteur: depends on how you do it, by default no [23:07:16] Hrmph... I have a driver file that I modify when certain jobs no longer have any results. Previously I just scheduled a crontab firing of submitting many jobs to the grid, but because the crontab now jsubs to the grid for me, I can't jsub from inside the simple bash driver. [23:09:01] ls [23:10:19] hasteur: for a simple driver like that use jlocal [23:11:31] The jlocal invocation will have jsub then running? [23:12:44] i.e. cd $HOME/bot_tool_dir && jsub -cwd -quiet -mem 512m -N bot_task python bot_script.py -from:SOMECATEGORY [23:13:13] No, in your crontab for the master program replace jsub with jlocal [23:14:04] But when I run the script from crontab with jlocal, it will allow me to execute commands inside the driver like that? [23:15:13] i.e. Crontab says "jlocal -cwd -quiet -mem 256m -N driver_script $HOME/bot_tool_dir/driver_script.sh" [23:15:20] it should, just test it and see if it works :P [23:17:41] Why can't jobs sumbit jobs? Anti forkbomb? [23:17:52] Yeah [23:19:06] Really? I thought that up on the spot as a joke. [23:19:48] a930913: for the most part why would grid jobs need to fork more grid jobs? [23:19:57] Surely a forkbomb is thwarted by limiting the number of jobs a tool can have? [23:20:30] a930913: number limiting isnt that effective [23:20:30] Betacommand: Because crontabs are basically a job now. [23:20:44] a930913: thats why you use jlocal [23:21:03] Betacommand: Yes, but why does that exist in the first place? 
[23:21:36] a930913: idiots overloaded -login and ran too many cron jobs on -login that should have been on the grid\ [23:22:07] Betacommand: I know that. Why does jlocal exist? [23:22:31] a930913: for the few cases where submitting to the grid isnt a good idea [23:23:00] IE I have a very small script that I run once a day, that I need the output emailed to me. which cannot be done via grid [23:23:36] Betacommand: It seems more straightforward to allow jobs to submit jobs, rather than make effectively a special case job that can submit jobs. [23:24:10] a930913: jlocal doesnt actually use the grid, its used to bypass it [23:24:30] Betacommand: "Effectively" [23:25:16] a930913: why should most jobs be able to submit more jobs? that is a recipe for fork bombing [23:25:34] which can crash the whole grid [23:25:51] Limiting doesn't work? [23:26:02] And there aren't other ways to crash the grid? [23:26:10] a930913: limit based off of what criteria? [23:26:26] a930913: there may be others, but thats the easiest to do and prevent [23:27:14] Actually, I suppose once you are on the grid, you don't need more jobs, just more processes. [23:27:55] a930913: correct, which that can be monitored and controled [23:28:11] * hasteur bangs head against desk. [23:28:17] hasteur: ?? [23:29:10] Having to re-design the crontab invocations for my bot processes. [23:29:11] hasteur: Head, meet desk. Desk, meet head. 
[23:29:53] hasteur: just for the few drivers replace jsub with jlocal [23:30:44] Crontab line: 35 23 * * * cd $HOME/g13bot_tools && jlocal -cwd -n g13_driver ./g13_nudge_driver3.sh [23:31:33] g13_nudge_driver3.sh: [23:31:35] #!/bin/bash [23:31:35] PATH=/usr/local/bin:/usr/bin:/bin [23:31:35] #2009 Driver [23:31:35] cd $HOME/g13bot_tools && jsub -cwd -quiet -mem 400m -N g13_nudge python g13_nudge_bot.py -from:AfC_submissions_by_date/16_November_2013 [23:31:38] cd $HOME/g13bot_tools && jsub -cwd -quiet -mem 400m -N g13_nudge python g13_nudge_bot.py -from:AfC_submissions_by_date/17_November_2013 [23:31:41] cd $HOME/g13bot_tools && jsub -cwd -quiet -mem 400m -N g13_nudge python g13_nudge_bot.py -from:AfC_submissions_by_date/18_November_2013 [23:31:44] cd $HOME/g13bot_tools && jsub -cwd -quiet -mem 400m -N g13_nudge python g13_nudge_bot.py -from:AfC_submissions_by_date/19_November_2013 [23:31:47] cd $HOME/g13bot_tools && jsub -cwd -quiet -mem 400m -N g13_nudge python g13_nudge_bot.py -from:AfC_submissions_by_date/20_November_2013 [23:31:52] Uh-oh. [23:32:07] hasteur: you shouldnt need to change g13_nudge_driver3.sh [23:32:34] * a930913 breaths a sigh of relief. [23:32:36] g13_nudge_driver3.sh is the very small test rig I'm attempting to use to get the line right [23:33:05] The main one has ~120ish jobs submitted from it [23:34:17] hasteur: I would really not use bash for that. I would use a python wrapper [23:35:39] How would you wrap it? [23:35:43] hasteur: I see several ways that could be optimized and made cleaner [23:36:15] hasteur: A for loop of subprocess.* [23:36:50] a930913: using subprocess wouldn't fork as needed [23:37:14] So have subprocess to the jsub? [23:37:31] Betacommand: It can Popen. [23:37:53] Ideally, I'd like to toss a arbitrary pack of jobs out at the execution cluster and let the cluster decide the ordering. [23:38:28] a930913: you wouldnt use subprocess.Popen you would use os.popen [23:39:19] hasteur: fairly easy to do. 
you just write the python script to invoke the needed shell commands that submit the jobs [23:39:46] you can even name the jobs specifically for the day its running [23:40:31] Hrm... I don't understand what I gain from going through annother layer of python instead of a simple shell script to submit? [23:40:58] hasteur: less headache, how do you create the sh file? [23:41:22] Copy the lines, paste the lines, change the year [23:41:38] ouch [23:41:48] could easily be automated [23:41:50] For loop, what? [23:42:09] ^Y, ^P, s/2013/2014/ [23:42:27] Still... [23:42:44] hasteur: is your code in a public repo somewhere? [23:43:10] I see a lot of help I could give you [23:45:34] Stand by, having to mask some pieces of data [23:47:07] Betacommand: https://github.com/hasteur/g13bot_tools/commit/96ce4a8f53d637832cce3737318d1d985139d0ed [23:47:19] That's the grand master driver that I use to submit like a fiend. [23:51:31] hasteur: Im seeing a lot of code that can be optimized. (I googled and found your repo) Give me a few weeks and Ill make your life a lot easier [23:52:55] I could invoke it directly from tools login, but I'd then be a hippocryte in terms of doing what I vented about on sunday. [23:54:15] I am perfectly happy with the way the code is working right now. The jobs are managable and have a specific way they work. I just need to get the crontab jsubbing to work. [23:55:16] hasteur: you should only need to modify the call to g13_nudge_drivers.sh [23:55:31] other than that all the calls should stay the same [23:56:03] And yet "35 23 * * * cd $HOME/g13bot_tools && jlocal -cwd -n g13_driver ./g13_nudge_driver3.sh [23:56:22] Coren: do you know how incoming mail to tools works? [23:56:22] did not work when I watched it go by. [23:56:32] hasteur: what happened? [23:56:34] might be broken atm, since wikibugs is dead [23:56:38] It executed nothing. [23:57:39] And I have no mail to show me what happened. [23:58:07] hasteur: what is the tool name? 
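Betacommand's suggestion above (replace the copy-pasted bash driver with a Python wrapper, essentially a930913's "for loop of subprocess") could look like the sketch below. The jsub flags and category names are copied from hasteur's pasted driver; the function names and the date-range interface are mine:

```python
# Sketch of the Python wrapper Betacommand suggests: a for loop of
# subprocess calls, one jsub per AfC date category, instead of hundreds
# of copy-pasted bash lines. jsub flags mirror hasteur's driver above.
import datetime
import subprocess


def jsub_command(day):
    """Build the jsub invocation for one AfC submissions-by-date category."""
    category = "AfC_submissions_by_date/%d_%s_%d" % (
        day.day, day.strftime("%B"), day.year)
    return ["jsub", "-cwd", "-quiet", "-mem", "400m", "-N", "g13_nudge",
            "python", "g13_nudge_bot.py", "-from:" + category]


def submit_range(start, days, run=subprocess.check_call):
    """Submit one grid job per day; `run` is injectable for dry runs."""
    for offset in range(days):
        run(jsub_command(start + datetime.timedelta(days=offset)))
```

Per the jlocal discussion above, the wrapper itself would be launched from the crontab with jlocal rather than jsub, since crontab entries now go through the grid and grid jobs cannot submit further jobs.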
[23:58:26] tools.hasteurbot [23:59:13] hasteur: try: [23:59:13] 35 23 * * * jlocal -cwd -n g13_driver bash /data/project/hasteurbot/g13bot_tools/g13_nudge_driver3.sh