[00:00:01] Right and can't change ownership with either user [00:00:02] Fun times [00:00:55] That's not really tenable for large numbers of files. Moving via /tmp is, but that's a security risk if it includes files with credentials. [00:01:31] Yeah, Unix file permissions are... sometimes a pain. Like I said, I'll code a 'chown to self' suid utility soon that'll allow to take ownership of files in a clean way. It's clearly urgent. [00:01:42] Yes :-) [00:02:04] So maintainers can just stuff them in the tool home, and the tool user can take its ownership from there. [00:03:15] * Coren goes to do that now. [00:54:00] Hi I wondered if someone has an ETA on when beta labs will be online again? [00:54:54] "beta labs"? [00:55:18] Done, but given the security implication this /really/ needs a review. [00:55:22] Dcoetzee: ^^ [00:55:34] http://en.m.wikipedia.beta.wmflabs.org/wiki/Main_Page [00:55:57] Wrong room? Should I go to ops? [00:56:11] So, this is probably the right room; I was just unaware of that project. [00:56:22] Which also means I cannot help you. :-( [00:56:31] Sorry Coren :) [00:56:36] Bet bet: ask Ryan_Lane. He knows all, sees all. [00:56:40] Thanks [00:56:44] Coren: K cool [00:57:04] Dcoetzee: /usr/local/bin/take [00:57:11] Dcoetzee: Assuming you grok perl. [00:57:35] It's not suid root until I get some review. [00:57:55] (i.e.: useless until then) :-) [01:16:40] Coren, security implications? am I missing something here? [01:17:07] Krenair: Anything that has suid to root, by definition, has security implications. [01:17:57] oh, I see [01:19:15] Guess I didn't read far enough up the chat log before [14:12:11] hello labs :) [14:12:44] Can anyone here change my labs username / create a new one? [14:39:25] !log deployment-prep I have setup a Jenkins job to automatically update mediawiki-config. 
Dashboard is https://integration.wikimedia.org/ci/job/beta-mediawiki-config-update/ [14:39:28] Logged the message, Master [14:39:49] !log deployment-prep Updated mediawiki/extensions.git which was lacking the Thanks extension {{gerrit|55263}} [14:39:51] Logged the message, Master [14:40:29] !log deployment-prep manually refreshing extensions on -bastion [14:40:31] Logged the message, Master [15:00:55] !log deployment-prep manually update puppet sources on -search01 and -searchidx01 [15:00:59] Logged the message, Master [15:52:20] "Ryan Lane deleted instance 'metavidwiki' in project Nova Resource:Metavidwiki" [15:52:41] GChriss: it's been disabled for two weeks now [15:52:41] across the board or something specific metavidwiki? [15:52:57] the instance was owned [15:53:01] oh. [15:53:04] ok [15:53:05] I sent an email about this when I disabled it [15:53:13] along with about 10 others [15:53:29] I deleted all of the owned ones a couple days ago [15:54:00] no one complained after being disabled for two weeks, so I assumed they weren't being used [15:55:24] what was the subject line? [15:55:26] * GChriss catches up [16:33:41] petan: would be great if you could try and delte those files, I seem to be having little sucess even after getting rid of 100,000 or so [16:33:50] LOL [16:33:59] you were doing that whole day? [16:34:06] I mean, I asked you yesterday [16:34:10] and today you replied :D [16:34:17] sounds like ur pc stucked on deleting them [16:34:20] nah I had to go to work :P [16:34:24] ah [16:34:28] ok I will delete them [16:34:32] :) [16:34:32] can you tell me WHICH? [16:34:34] <^demon> Ryan_Lane: https://gerrit.wikimedia.org/r/#/c/55271/ \o/ [16:34:39] wildcard or whatever [16:34:39] g.wd.* [16:34:42] ok [16:34:44] in /home/addshore [16:34:49] so /home/addshore/g.wd.* [16:34:52] only files [16:35:26] wait [16:35:39] /home/addshore/wd.g.* ;p [16:35:44] ^demon: ok. let me build this [16:35:46] and ya only files :) [16:35:57] addshore: 100,000 files? 
:) [16:36:10] what in the world are you doing? :D [16:36:13] grid engine has been spamming my home >.< [16:36:16] <^demon> Ryan_Lane: No rush, just now sitting down to lunch. [16:36:30] and I havent been active for about a week so I havn't noticed, intil I tried to do ls in my home directory... [16:36:31] easier if I do it now [16:36:35] hahaha [16:36:45] oh man, gluster must *love* that [16:37:01] Ryan_Lane: yes ;p [16:37:20] hi Ryan_Lane [16:37:20] * Damianz blames addshore for his poor filesystem performance [16:37:31] milimetric: howdy [16:37:35] I, unfortunately, needed my labs username changed [16:37:36] dont blame me, I didnt know oge would write all of these >.< [16:37:43] milimetric: why? [16:37:44] I have to go through you right? [16:38:01] I've exhausted all other options, it's the only way I can work with hue properly [16:38:12] what do you mean? [16:38:16] it authenticates based on the user you're signed in as [16:38:45] what's the actual problem? :) [16:39:02] renaming is…. hard to say the least [16:39:10] and possibly still impossible [16:39:14] Gerrit loves renaming users [16:39:14] ottomata can explain it better, and he knows it's hard, but apparently it's the only way [16:39:23] ^demon: can gerrit handle this properly? [16:39:24] right, off to work again :D [16:39:26] one sec, I'll get you an error message [16:39:32] milimetric: what's your username? [16:39:37] milimetric it is [16:39:44] and that fails because? [16:39:50] and how will renaming your user help? [16:39:55] <^demon> Ryan_Lane: We should be able to get by with just renaming the external account. [16:40:00] <^demon> But I haven't tested. [16:40:04] ^demon: ok [16:40:13] renaming it elsewhere is a pain, but doable [16:40:32] there's still a bug in LdapAuth and Special:RenameUser, but I can just disable LdapAuth while I rename the user [16:40:36] basically,  I can't access hdfs:///wmf/webrequest/raw [16:40:38] I get this: [16:40:43] Cannot access: /wmf/raw/webrequest. 
Note: you are a Hue admin but not a HDFS superuser (which is "hdfs").AccessControlException: Permission denied: user=milimetric, access=EXECUTE, inode="/wmf/raw":hdfs:stats:drwxr-x--- (error 403) [16:41:02] oh ok [16:41:09] ok, so how will renaming your user help? [16:41:09] Ryan_Lane: I'm ok with abandoning that account [16:41:15] and just starting fresh, so no rename [16:41:16] addshore running... [16:41:19] is that possible? [16:41:24] I'd prefer to avoid that [16:41:28] cheers petan :D [16:41:29] renaming makes my username sync up with the one in hue [16:41:36] guess I will have to make a cron to cleanup daily :/ [16:41:38] what's your username on hue? [16:41:42] dandreescu [16:41:49] i don't understand why that can't be renamed... [16:41:52] one sec, lemme get ottomata [16:42:10] is this because hue isn't using ldap? [16:42:18] or the system isn't or something like that? [16:42:29] it's not using LDAP, correct [16:42:40] er. [16:42:42] it has a crazy authentication scheme [16:42:43] but hue is? [16:42:50] Hold up, I think there might be confusion. [16:42:57] it seems to me that the authentication scheme is what's broken [16:42:59] not your username [16:43:03] Just make a new user in hue with your ldap username? [16:43:03] http://blog.cloudera.com/blog/2012/03/authorization-and-authentication-in-hadoop/ [16:43:08] I was about to say. [16:43:12] Just make a new Labs user. [16:43:20] that's possible too [16:43:27] i somewhat agree with that Ryan_Lane, but we probably stand little chance of upstreaming something in time :) [16:43:34] .... [16:43:34] in time for me to still be alive that is :) [16:43:37] that's not what I mean [16:43:50] I mean the way that *we* are doing something is screwed up [16:43:55] Ryan_Lane: I can explain the shit with Hue later if you want. 
[16:44:15] dschoon: well, now is better, not sure how available I'll be today [16:44:17] No, it's the fact that Hue uses some confusing words for two separate things [16:44:17] oh, no, as far as I can tell, we're not doing anything besides what we've all agreed is the best way forward with Hadoop at WMF [16:44:18] I'm not in the office [16:44:21] ah [16:44:28] <^demon> Ryan_Lane: I found a maven plugin to generate .debs. This sounds...evil. [16:44:36] <^demon> And yet likely to make our packaging easier :p [16:44:55] ^demon: probably better than what we currently do [16:45:00] Okay, so just to make sure: milimetric create a new labs account for your shell name [16:45:15] and let me know when you've done that, and tried to log into hue [16:45:18] I don't think we should ever expect hue to translate from one username to another :) [16:45:29] Ryan_Lane: okay, so backing up [16:45:34] I'm assuming hue is doing ldap auth [16:45:40] and the hdfs systems are not [16:45:42] forget everything you know so far :) [16:45:44] there are two sets of permissions: [16:45:49] - HDFS file permissions [16:45:58] - Hue web GUI permissions [16:46:15] hue permissions are simple to explain: they control what shit you can do through hue. [16:46:23] hue permissions are attached to hue users. [16:47:26] and hue users are currently created and auth'd through labs LDAP [16:47:26] Ryan_Lane, your assumption (currently) is correct [16:47:52] HDFS permissions and users are distinct. [16:48:03] addshore my script will keep deleting these files until some are in that folder, that means old and new [16:48:06] I hope it's ok [16:48:09] HDFS relies on unix permissions and users. [16:48:23] if you only need old I will change it [16:49:19] ottomata: that's what I thought :) [16:49:19] heh [16:49:23] (at least for now it does, ldap user group mapping is possible in hadoop, but we haven't set it up yet) [16:49:24] ottomata: because nss was removed [16:49:25] that make sense? 
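The 403 quoted earlier (user=milimetric, access=EXECUTE, inode="/wmf/raw":hdfs:stats:drwxr-x---) is what a plain Unix-style owner/group/other check would produce. A minimal Python sketch of that evaluation, assuming no ACLs and no superuser bypass; the function name is hypothetical, and the group memberships passed in are assumptions, not something stated in the log:

```python
def is_allowed(user, user_groups, owner, group, mode, access):
    """Classic owner/group/other check on a mode string like 'drwxr-x---'.

    Simplified on purpose: ignores ACLs and the superuser bypass.
    """
    bits = mode[1:]  # drop the leading file-type character
    if user == owner:
        triplet = bits[0:3]   # owner permissions
    elif group in user_groups:
        triplet = bits[3:6]   # group permissions
    else:
        triplet = bits[6:9]   # everyone else
    index = {"READ": 0, "WRITE": 1, "EXECUTE": 2}[access]
    return triplet[index] != "-"


# Owner, group and mode come from the error message; the empty group set
# (milimetric not being in "stats") is an assumption.
print(is_allowed("milimetric", set(), "hdfs", "stats", "drwxr-x---", "EXECUTE"))
```

With the "other" bits being `---`, a user who is neither `hdfs` nor in `stats` cannot traverse the directory, which matches the message. It does not matter that the same person is a Hue admin.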
[16:49:27] right [16:49:40] so milimetrics problem is that he logs into hue with his ldap username [16:49:48] but hadoop grants permission based on shell username [16:49:53] dschoon: so, it's exactly the situation that I wrote before you said "forget everyting you know" :D [16:49:53] and his ldap and shell usernames don't match [16:49:58] haha [16:49:58] * Ryan_Lane nods [16:50:09] dschoon: I helped set this up, you know, right? :) [16:50:23] I'll just shoosh :) [16:50:34] :D [16:50:40] ottomata: right [16:50:41] Ryan_Lane, I missed the beginning of the convo, you don't want mlimetric to create a new 'dandreescu' ldap account? [16:51:03] well, does he want his production and labs accounts to match? [16:51:03] ldap account? you mean labs account right? [16:51:08] milimetric: same sam [16:51:09] *same [16:51:13] labs = ldap [16:51:25] we don't have production ldap [16:51:38] we only have one ldap right? [16:51:46] so I already have a dandreescu ldap account [16:51:52] don't I? [16:52:00] that's your production account [16:52:10] managed by puppet [16:52:21] oh ok, is it easier for that one to change to milimetric? [16:52:30] neither one are easy to chaneg [16:52:32] *change [16:52:48] we can delete your production one and create a brand new production one [16:52:52] so... [16:53:01] is that best ottomata? [16:53:02] ok wait [16:53:05] in labs this is slightly easier, if we only change your shell account name [16:53:05] afaik [16:53:08] I totally don't care which name I end up with [16:53:11] right now [16:53:12] if we change your CN, this is hard [16:53:19] shell == dandreescu [16:53:19] ldap == milimetric [16:53:21] we want them to match [16:53:37] let's use the proper terminology here [16:53:44] ok [16:53:53] ? 
[16:54:01] production/labs, uid/cn [16:54:11] ha ok [16:54:13] that makes so much more sense :) [16:54:23] i have no idea what a shell account is [16:54:26] production uid = dandreescu [16:54:37] ok [16:54:39] got it sure [16:54:40] ok [16:54:41] labs uid = milimetric ? [16:54:44] yes [16:54:45] currently [16:54:50] labs ldap cn = milimetric [16:55:04] ok. so, are we discussing changing both the cn and the uid? [16:55:04] can he just create a new labs ldap account called dandreescu and then delete the other one? [16:55:08] no [16:55:09] one or the other [16:55:13] we can't delete the other one [16:55:18] we can never delete accounts [16:55:23] ok, well, we can just stop using it [16:55:25] he can never use it again [16:55:26] :) [16:55:34] but yes, that's doable [16:55:38] and the easiest option [16:55:44] whatever's easiest for you guys [16:55:48] but that also means he loses all his contribution history [16:55:50] which sucks [16:55:52] i just have to change a few ssh config lines, no big deal at all [16:55:52] ok, milimetric, do you have anything in your labs homedir that you need saved? [16:55:54] OH [16:55:55] in gerrit? [16:55:56] crap [16:55:59] yeah that sucks [16:56:08] meh, i'm not married to it [16:56:09] :) [16:56:10] in gerrit, wikitech, etc [16:56:13] we can change the production uid [16:56:20] but i think that gets annoying too, no? 
[16:56:28] we can't really change it [16:56:32] we can create a new one [16:56:34] (this has already gone back and forth with production uid by robh and maybe someone else too) [16:56:35] and delete the old one [16:56:37] right irhgt [16:56:45] puppet shits itself on renames [16:56:56] which is just… absurd [16:56:59] :) silly puppet [16:57:01] In labs you could copy the homedir over and re-chown it, but wiki wise you'll loose history unless you can manually re-name the account [16:57:15] i actually think thats better than changing in labs, because the produciton uid accounts are just per machine anyway [16:57:18] another option is to keep the cn and change the uid in labs [16:57:20] ok, so add new and delete old seems fine too, would that be a lot harder for you guys? [16:57:21] i can babysit those changes through [16:57:31] oh...? [16:57:31] hm [16:57:38] what would that entail Ryan_Lane? [16:57:54] but then you'd log into the web as milimetric, and via shell as dandreescu [16:58:15] hm. that also may be a pain in the ass with gerrit [16:58:24] hm, that seems like a hack that nobody would enjoy [16:59:01] which username do you prefer? [16:59:15] root [16:59:19] Damianz: ;) [16:59:24] * Ryan_Lane sharpens his knives [17:00:06] * Damianz tries not to choke on tea [17:00:23] ok, so removing me and adding me back to production as milimetric, does that seem like the best choice? [17:00:23] milimetric [17:00:23] not if it's hard on you guys though [17:00:35] well, then let's change production [17:00:51] push in a change to delete the old account and to add a new one [17:00:56] using the same key [17:00:58] but a new uid [17:01:09] if you use the same uid we have to do this in two stages [17:01:19] and it'll take an eternity [17:01:33] *uid number [17:01:33] ok cool [17:01:35] sounds good [17:01:36] that's totally fine [17:01:38] thanks Ryan_Lane! [17:01:41] yw [17:01:41] we will change prod uid to milimetric [17:01:42] thank you! 
[17:01:44] i'll do that one [17:02:00] if you would have preferred the other name, we would have gone the route of changing it in labs ;) [17:02:07] you chose wisely :D [17:03:41] * milimetric scurries back to the analytics cave, content with his wisdom [17:04:37] Change on mediawiki a page OAuth was modified, changed by Dantman link https://www.mediawiki.org/w/index.php?diff=662517 edit summary: [+1] /* Previous discussions */ Death not Dead [17:53:57] Coren: so, where does this hook exist? [17:54:02] [13:52:41] Extension -> create tool -> invoke the (create tool) hook -> hook returns with possible LDAP entries for the user -> extension finishes, adds entries, and claims success [17:54:02] [13:53:24] So all modification would /still/ be done by the extension, only the hook can add to the list. [17:54:02] [13:53:33] where does the hook exist? [17:54:07] heh [17:54:13] Give context. :-) [17:54:23] indeed [17:54:52] does the hook exist on an instance inside of labs? [17:55:17] I can see several plausible places for the hook to exist. (a) on a dedicated project that has all the right magic but where access is limited, (b) on a designated instance of the project itself are the two obvious ones. [17:55:41] but then we're breaking the logic into two separate systems [17:55:41] (a) is more secure, also less flexible and not in control of the actual project [17:56:15] Ryan_Lane: Well, I see it more as "delegating project-specific stuff to the project", with the Extension doing the actual global stuff. [17:56:15] What's an example of a project-specific thing we'd be doing? [17:56:29] a) splits the code into two places and doesn't give any more control than just doing this directly in mediawiki [17:56:35] andrewbogott: Well, in tools-, I want to create databases and add a couple grants. [17:56:52] Ryan_Lane: (b) does seem more sensical. 
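The quoted flow (extension creates the tool, invokes the hook, the hook returns possible extra entries, the extension adds them) could look roughly like the Python sketch below. This is an illustration only: the registry, function names and entry fields are all hypothetical and are not OpenStackManager's actual interface.

```python
# Hypothetical registry: project name -> list of hook callables.
_hooks = {}


def register_hook(project, hook):
    """Let a project contribute a create-tool hook."""
    _hooks.setdefault(project, []).append(hook)


def create_tool(project, tool):
    """Build the entries for a new tool; hooks may only add to the list."""
    entries = [{"type": "service-user", "name": tool}]
    for hook in _hooks.get(project, []):
        # A hook returns extra entries; it never modifies anything itself.
        entries.extend(hook(tool))
    # The extension alone would write `entries` out at this point.
    return entries
```

The point of the shape is that every modification still goes through one place, while project-specific logic stays in the hook.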
[17:57:08] except that b) also then requires each project owner to manage this stuff [17:57:24] which means other project owners don't benefit from the work of others [17:57:34] Ryan_Lane: The obvious solution is that if the project owner doesn't give a hook, then no hook is done. [17:58:03] Ryan_Lane: Why? What would prevent the project owner from sharing his work with others that could use it? [17:58:54] they could add the hooks to gerrit, I guess, but it seems like a lot more work [17:59:16] man I need to learn how to draw mockups [17:59:29] Ryan_Lane: Just fake it. :-) [17:59:36] Bah-dum tsh! [18:00:00] * Damianz gives Ryan_Lane a can of idea paint [18:00:06] one sec. I think I can do this in omnigraffle [18:00:28] If you fake drawing a mockup, are you then drawing the real thing? [18:01:07] :) [18:01:17] I'm not sure double negatives work that way :) [18:01:26] that would not be unawesome [18:05:07] <^demon> Ryan_Lane: That package up? [18:05:15] hahaha [18:05:16] I forgot [18:05:48] <^demon> ha. [18:06:32] ok. screw doing a mockup [18:06:38] ^demon: I'm making it now [18:06:45] <^demon> kthanks. [18:06:59] Coren: so, what I'm proposing is to add these project specific things as configurables per project [18:07:10] so that users can enable them via "configure projects" [18:07:52] we can have OpenStackManager call a hook, and have a separate extension that has all of these things [18:08:06] that way it doesn't clutter up the extension [18:08:34] I agree that we still need a mechanism for sending notifications [18:09:01] I actually don't want notifications, but instead want to use salt-api to actually call code modules [18:09:08] or runners [18:09:39] I want as little logic as possible to actually run on instances [18:10:59] runners could actually do a lions share of most of the work [18:11:48] Yeah, that sounds like a reasonable approach. [18:12:53] andrewbogott: any status update on salt-api? usable at all yet? 
:) [18:13:23] we may be able to use salt-api without keystone auth, if we only allow it from OpenStackManager [18:13:35] we could limit it to 127.0.0.1 [18:13:52] OSM could pass in project/user [18:14:09] we could wrap everything in a runner, and do the authn/z there [18:14:20] Ryan_Lane: No real progress :( [18:14:30] * Ryan_Lane nods [18:14:56] does it work at all? [18:15:09] we could do basic auth to it using a password.... [18:15:16] from OSM [18:15:51] I haven't updated recently, there may be upstream progress. But it does more than nothing... [18:15:57] Damianz can probably testify as to specifics. [18:16:00] heh [18:16:20] we may be able to use it in a minimal way [18:16:29] It can authenticate using an openstack username/password but not a token. I keep sitting down to add token support but then getting distracted. [18:16:29] I really just want to avoid shelling out from OSM to salt [18:16:30] it needs work to use it properly [18:16:57] I think we can actually ignore keystone auth for our initial use casew [18:16:58] if we don't care about decent group/acl support it's 'usable' atm [18:16:59] *case [18:17:11] Damianz: yeah< i think we can work around that [18:17:26] so, here's my current thinking: [18:17:35] have salt-api run on 127.0.0.1 [18:17:43] make it require a username/password [18:17:50] give OSM that username/password [18:18:05] only allow it to call a runner [18:18:23] the runner wraps all calls, and takes in user, project, and the call it wants to make [18:18:38] we do authn/z in the runner [18:19:08] basically we're offloading the auth to OSM [18:19:11] I do something like that at work - but I use a celery worker and put all authz logic in the application, then salt-api is just a dumb interface (since i need to open up peer runs for other stuff...) 
[18:19:11] which we're doing anyway [18:19:48] andrewbogott: Commented on the patchset [18:20:07] Damianz: this would be similar, except we'd do the logic in the runner [18:20:07] Coren: Thanks, I'll catch up after lunch. [18:20:17] salt-api would just be a dumb interface to avoid shelling out [18:21:01] It sort of seems silly - you're allready checking auth in osm, why check it again? You can never expose that runner to anyone else [18:21:14] hm. true [18:21:29] yeah. that's actually easier [18:21:51] Damianz: So turn off auth entirely because the invoker is trusted. [18:22:12] yeah - just say 'osm' can do 'everything' and let osm decide what it's calling to call on your behalf [18:22:34] we can still limit what's callable, we just don't check the user/project [18:22:39] since the caller is doing it anyway [18:22:44] mhm [18:22:55] good idea [18:23:02] the only thing we'd have to watch is if we make this extensable taht the auth steps will stand up to stupidity [18:23:06] It does have the elegance of simplicity. [18:23:16] (ie let people add 'project' specific commands to the interface) [18:24:25] you know, if we only exposed a runner from the api, then we could actually do all of the keystone stuff in the runner [18:24:27] and use it as a wrapper [18:24:34] Damianz: But then, the worse they could do is break their own toys since OSM won't call with a different project. [18:24:35] then we could open this up past localhost [18:24:40] but that's a future topic :) [18:25:00] Ryan_Lane: That all sounds doable. [18:25:07] great [18:25:19] Coren: Well they could do priv escalation within a project potentially [18:25:27] But since it's all reviewed code anyway, not hugly a problem [18:25:48] ok. I gotta do lunch before someone stabs me [18:25:54] Damianz: Don't we presume that only project admins would have the right to set project-specific commands anyways? They don't need to escalate. [18:26:03] Ryan_Lane: Go do the eating thing. 
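The arrangement settled on above (the caller is trusted and has already checked user and project, so the wrapper only limits what is callable) can be sketched in plain Python. This is not the Salt runner API; every name here is hypothetical and the single allowed call is a placeholder.

```python
def _create_home(project, name):
    # Placeholder action; a real runner would do the actual work here.
    return f"would create home for {name} in {project}"


# Only functions listed here are reachable through the wrapper.
ALLOWED_CALLS = {"create_home": _create_home}


def run(call, project, **kwargs):
    """Dispatch a call from the trusted caller; reject anything not listed."""
    try:
        func = ALLOWED_CALLS[call]
    except KeyError:
        raise PermissionError(f"call not permitted: {call}") from None
    return func(project, **kwargs)
```

Because the wrapper does no per-user checks, it is only safe while it stays unreachable by anyone except the trusted caller, which is the concern raised in the chat about never exposing the runner to anyone else.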
[18:26:17] I'll write up plans and send an email to labs-l [18:26:42] Depends... I was more thinking on the level of 'moving auth back a step', but in theory if they can do anything bad in the interface there's worse things they could do anyway [18:26:46] for both the salt stuff, and the project-specific sudo stuff [18:26:52] On the subject of eating..... [18:27:11] and we can continue the discussion through email where everyone can participate :) [18:27:19] Ryan_Lane: Heh. okay. [18:27:28] email is no fun [18:27:31] * Ryan_Lane should practice what he preaches [18:27:35] Ryan_Lane: BTW, I do have a workaround; I still have a project-local sudoers so I can cope regardless. [18:27:35] someone feed mailman cocaine so it's vaugly real time [18:27:43] Coren: indeed [18:28:04] Ryan_Lane: It's just ugly and I would love to be able to get rid of it. [18:30:24] if you applied that theory to it departments, you'd have no one in it #troll [18:34:01] Coren: Why do you restrict tools not to create new databases? [18:35:09] Jan_Luca: I haven't found a decent way of restricting database names, so I can't prevent a tool from interfering with another. [18:35:30] Jan_Luca: Hence create database will only be given on request. [18:35:49] Coren: GRANT ALL PRIVILEGES ON `_%` ? [18:35:58] that's horrid [18:36:11] That's... augh! [18:36:16] reminds me of cpanel... bad thoughts [18:37:10] It is a suggest, but I think we should not forbid creating new databases [18:37:43] maybe you could create a procedure which uses a nicer way [18:41:52] Coren: Have I scare you off :-) [18:42:20] No, sorry Jan_Luca, I'm doing about 7 things at once. :-) [18:42:58] Coren: No problem. We can speak about this some other time. It does not hurry [18:43:08] Jan_Luca: I'd rather not prevent it if possible, but most tools that need to create databases dynamically have to be able to do so within mysql itself; so an external tool won't help much. 
[18:43:42] Coren: No external tool, I mean "CREATE PROCEDURE" [18:44:44] Hm. That's a possibility; though it does mean having to modify the tools accordingly. It's worth looking into. [18:44:57] Can you create a bz for that request so that I can manage my time? :-) [18:45:10] Coren: OK :-) [18:48:55] Coren: https://bugzilla.wikimedia.org/show_bug.cgi?id=46460 [18:49:12] Jan_Luca: Thanks. [18:55:32] Balls [18:55:47] Downside of working in it - your tickets to the helpdesk get assigned back to you randomly [18:59:38] salt is also a solution to creating databases [19:00:03] well, kind of [19:00:13] depending on which domain the database systems are in [19:00:24] I think they are in production. that makes things…. harder [19:00:59] You could technically have a salt-api instance in prod that limits to 1 call which takes 1 arg (a db name) rofl [19:01:27] I have very, very serious doubts anyone would be OK with that [19:01:45] I don't want that either [19:02:21] I'm actually thinking the databases should be under the labs domain [19:02:28] the data that goes to them is already stripped [19:02:43] we should be network segregating them, as well [19:02:46] Since AFAIK we strip data per table/column that's eaaasy [19:02:58] yep [19:03:05] then we can manage them through salt [19:03:34] ok. food. food real this time [19:03:53] I got doritos... do they count as food, I hope so =D [19:04:35] * ^demon whimpers [19:05:13] * Damianz strokes ^demon's head softly [19:46:45] Coren why not to use the procedure we have on bots [19:46:55] everyone can create as many db's as they want [19:47:56] It'd need adaptation; you want distinct namespaces, since you want to different tools being able to create the same database. [19:48:09] I mean, being *un*able [19:48:22] But yeah, that's a good starting point. 
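One way to get the distinct namespaces mentioned above is to require every database name to carry the owning tool's name as a prefix and to reject anything else. A Python sketch of just that check, under the assumption (mine, for illustration) that the separator is an underscore and names are limited to letters, digits and underscores; in practice the check would live inside whatever stored procedure does the actual CREATE DATABASE.

```python
import re


def checked_db_name(tool, requested):
    """Return `requested` if it lies in the tool's namespace, else raise."""
    if not re.fullmatch(r"[A-Za-z0-9_]+", requested):
        raise ValueError("invalid characters in database name")
    if not requested.startswith(tool + "_"):
        raise ValueError(f"database name must start with '{tool}_'")
    return requested
```

A caveat with plain prefixes: a tool named `a` and a tool named `a_b` would both accept `a_b_x`, so tool names or the separator need to be chosen so that this cannot happen.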
[19:48:51] that's simple [19:49:01] if the tool is using own username [19:49:09] you can just prefix the db name with username [19:49:24] -.- [19:49:35] so that create_db("blah") creates botname_blah [19:50:27] I'd rather have create_db("notbotname_blah") fail. I would expect that tool maintainers reasonably expect that the argument to create_db is the actual database name. :-) [19:50:52] that's easy as well [19:51:15] you just add a condition that if db name doesn't start with proper name it throw error [19:51:30] Yep. Sounds like the simplest possible solution. [19:57:53] on other hand we could have separate db instances per tool [19:58:00] if it was running on dedicated hw [19:58:03] shouldn't be hard [19:58:12] .. [19:58:24] Damianz [19:58:25] ?? [19:58:48] why would you have a seperate db instance per tool [19:59:16] because it would a)be more secure b) more scalable c) easier to maintain [19:59:49] maintainers of bot / tool could make as many databases of any name they like [19:59:57] and as many users of any name they like [20:00:08] it's less scalable... and harder to maintain [20:00:13] nope [20:00:29] You'd have to guess at the buffers per db, you've got multiple processes, versions, tables etc [20:00:34] it could be somehow automated so that creation of db instance would be as hard as creation of db [20:01:05] multiple processes are consuming resources just as multiple threads in one process, except for memory [20:01:24] multiple versions? 
why [20:01:35] but yes u are probably right with memory [20:01:38] Process has overhead [20:01:40] buffers would be complicated [20:02:11] You'd end up with like 8k of mysql instances with 4 being used activly rather than like 1 with a few concurrent connections [20:02:30] mhm [20:03:06] Everytime I find a not-thread-safe lib and have to use multiprocessing for something, kittens die because of the overhead [20:03:36] * Damianz looks at pysnmp [20:03:50] Reminds me, need to remember c++ bbl [20:48:03] poor search is broken in beta :( [20:51:25] <^demon> What's up? [20:52:39] <^demon> Ryan_Lane must've gotten distracted again :\ [20:54:04] he's in NYC [20:56:19] ^demon: I think he went for food [20:56:34] <^demon> Yes, I know. [20:56:39] <^demon> I'm just complaining. [20:57:24] * Damianz gives ^demon a wine gum [21:00:50] ahhhh Damianz :-] [21:01:23] Damianz: I could use a new check in the labs icinga. Check that port 8123 is TCP ready on deployment-search01.pmtpa.wmflabs [21:01:28] Damianz: how would i proceed?:-] [21:01:31] haaai [21:01:48] Does the service that provides that port have a puppet class? [21:01:49] deployment-search01 is a Wikimedia Front end lucene search server (role::lucene::front-end). [21:03:01] That should be simple, give me a few to go back to my laptop and I'll make a change to show you how that's jiggled together =D [21:03:06] !log deployment-prep Starting lucene-search-2 on deployment-search [21:03:09] Logged the message, Master [21:03:52] Damianz: could also need a check_proc [21:03:58] the command that should be running is: /usr/bin/java -Xmx20000m -Dsun.rmi.transport.tcp.handshakeTimeout=10000 -Djava.rmi.server.codebase=file:///a/search/lucene-search/LuceneSearch.jar -Djava.rmi.server.hostname=deployment-search01 -classpath :/usr/share/java/udp2log-log4j.jar:/a/search/lucene-search/LuceneSearch.jar org.wikimedia.lsearch.config.StartupManager [21:04:15] Does that check exist in prod nagios currently? 
[21:04:18] the important parts are LuceneSearch.jar and the class " org.wikimedia.lsearch.config.StartupManager" [21:04:34] the TCP exist in prod, let me look it up [21:04:50] The process check is done via nrpe - so it needs a change into puppet if it's not there [21:05:08] check_lucene will look for port 8123 [21:05:26] apparently done by the icinga server directly [21:06:10] I can't find a nrpe check to verify whether the lucene process is running locally :/ [21:06:35] port 8123 would be enough, aka check_lucene whenever the puppet class role::lucene::front-end is applied [21:07:02] Damianz: also I was wondering if we could setup email notifications for errors [21:07:33] would be awesome - I was kinda waiting for per-project aliases email wise [21:08:02] is there a bug for it ? :-] [21:08:23] like deployment-prep@wmflabs.org set as an email alias that would spam all sysadmins of the project [21:10:47] I think, somewhere [21:11:56] !log deployment-prep Search is back! Turns out that lucene-search2 service was not running on deployment-search01 despite puppet ensure => running on the service :( See also {{bug|46459}} [21:11:59] Logged the message, Master [21:14:49] New patchset: DamianZaremba; "Splitting into classes so the config is cleaner, adding lucene check" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55400 [21:15:03] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55400 [21:16:08] derp [21:16:20] New patchset: DamianZaremba; "Wrong place.." [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55401 [21:16:43] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55401 [21:19:54] This needs unit tests... 
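What "port 8123 is TCP ready" amounts to is a connect attempt with a timeout. A minimal Python sketch of that idea; it is not the actual check_lucene command or a Nagios plugin, and the function name is made up for illustration.

```python
import socket


def tcp_ready(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For the host discussed above, the call would be `tcp_ready("deployment-search01.pmtpa.wmflabs", 8123)`. A successful connect only shows that something is listening on the port, not that search queries work.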
[21:23:03] seriously can't find anything on labsconsole now [21:33:09] hashar lies :( [21:33:55] New patchset: DamianZaremba; "Making brackets more sane, adding group for lucene and adding the class beta /really/ uses" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55403 [21:34:08] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55403 [21:36:34] New patchset: DamianZaremba; "I'm an idiot and remove the defintion" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55405 [21:36:45] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55405 [21:37:09] hashar: You should have monitoring now... btw beta uses role::lucene::front_end::poolbeta not role::lucene::front-end [21:38:09] oh [21:39:11] You know what... I might get rid of all those ifs... I can check if a role class exists then just pass that as an array [21:39:22] ah true [21:39:30] thanks for the verification :-] [21:39:41] That will make this sooo clean, just drop 1 file in and it will work hmm [21:50:42] andrewbogott: The point being, of course, that service users get a *different* home than normal users would in my example. /data/project as opposed to /home [21:50:58] Oh, of course. That makes sense. [21:51:14] * andrewbogott thinks about how to make it configurable per project [21:52:59] Why not provide a substitutable template? Something like /home/%p%u by default, but I could /data/project/%u for tools? [21:53:14] * Coren ponders. [21:53:34] There really isn't a place for project variables being set once the project exists, is there? [21:54:15] Or maybe simply in action=configureproject? [21:59:44] Coren: Yeah, I'm just thinking about the right place to do it in the gui. Probably here: https://wikitech-test.wmflabs.org/w/index.php?title=Special:NovaProject&action=configureproject&projectname=servicegrouptesting [22:00:17] Yeah, that's what I said just above. It would make 100% sense there. 
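The substitutable template floated above can be sketched in a few lines of Python. The log does not define the placeholders, so reading `%p` as the project name and `%u` as the user name is an assumption, as is the function name.

```python
import re


def expand_home(template, project, user):
    """Expand %p (project) and %u (user) in a home-directory template."""
    values = {"p": project, "u": user}
    # Single pass, so a substituted value is never re-expanded.
    return re.sub(r"%([pu])", lambda m: values[m.group(1)], template)
```

With a per-project setting of `/data/project/%u`, a service user named `wmfDbBot` in `tools` would get `/data/project/wmfDbBot`.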
[22:00:27] Ah, so it is :) [22:01:30] I mean, if you told me I could change that setting and let me loose at the interface, I'd ecpect the very first thing I'd try is that 'configure' link right there. :-) [22:01:43] expect* [22:02:51] Yep! That page is new and I forgot it was there :) [22:03:00] (Even though I made it. *shrug* ) [22:03:13] andrewbogott: Short attention span? :-P [22:03:47] Yeah, I'm a little too good at purging my cache after a patch lands. [22:07:05] Coren: I've got a bot I want to migrate from toolserver to labs. I did run a bot a few months ago in the bots projects but that was just a test and everything has changed since. [22:07:07] What do I do? [22:07:16] It is an irc bot that should run for ever and be ensured to stay running. [22:08:54] Krinkle: Well, if it's stable you want the tools project [22:09:09] Coren: define stable? [22:09:16] I can set you up in about 60 secs. What's your labs username? [22:09:21] It's all in version control [22:09:24] Coren: 'krinkle' [22:09:34] So yeah, pretty stable. Been running for almost 2 years on toolserver without issues [22:09:38] By stable, I mean "you already know what all the dependencies are and not so much experimenting as deploying" :-) [22:09:39] only down when they are having issues [22:09:39] * Damianz straightens Krinkle out [22:09:46] which is a lot lately, I've had enough. [22:10:06] Their mysql replication lag/http uptime is stupidly bad recently [22:10:08] Coren: Yep, only depends on php and bash. [22:10:26] Krinkle: http://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Help [22:10:38] krinkle last thing I need from you is a name for your tool. :-) [22:10:55] The bot is mostly affected by crashes of their sge system. I get like 10 mails a week of "Cron qcronsub" or another thing failing with a big java stack trace that is useless to me. 
[22:11:15] New patchset: DamianZaremba; "Improving readme" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55412 [22:11:16] New patchset: DamianZaremba; "Moving to ini file" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55413 [22:11:16] New patchset: DamianZaremba; "Adding autodiscovery of roles" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55414 [22:11:18] * Damianz crosses fingers [22:11:23] Coren: wmfDbBot [22:11:48] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55412 [22:11:51] Krinkle: Tool names are case significant. Want it that way? [22:12:00] Yes [22:12:07] https://github.com/Krinkle/ts-krinkle-wmfDbBot [22:12:27] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55413 [22:12:29] {{doing}} [22:12:41] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55414 [22:13:05] {{done}}. The help I pointed to should give you a good idea, but I'm online if you have any questions. [22:13:17] You can login to tools-login.wmflabs.org at your convenience. [22:14:15] New patchset: DamianZaremba; "use AD" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55415 [22:14:34] Krinkle: Biggest differences from TS: your tool has its own uid/gid, you can 'become wmfDbBot' as a shortcut [22:14:57] Krinkle: Second difference, you may want to look at jstart/jstop for a bot rather than mess with cron [22:15:01] Coren: lol, that is exactly what is the same as on Toolserver. [22:15:02] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55415 [22:15:14] Krinkle: o_O? Didn't use to be that way. [22:15:17] every tool has its own uid/gui, and one uses become to do shared stuff [22:15:25] Coren: has been for at least 5 years. [22:15:27] * Coren chuckles. [22:15:46] Coren: https://wiki.toolserver.org/view/Multi-maintainer_projects [22:15:47] Odd. 
Nobody ever complained about my running my bots from my own account! [22:15:54] Of course not everybody used that, but it was available [22:16:01] Aaah! I never /had/ a multi-maintainer project. [22:16:17] for those that don't want their stuff to be deleted then they forget about toolserver [22:16:35] For example https://toolserver.org/~intuition/ and https://toolserver.org/~cvn/ are MMT tools I maintain. [22:16:42] anyway [22:16:42] Aaaah. [22:16:56] Well, then, you should feel right at home! (no tildes in URLs though) [22:17:01] great [22:18:08] Coren: And I see it is properly enforced [22:18:15] I don't have a public_html in my personal home [22:18:19] perfect [22:18:39] Yes, tool account usage is not optional here. :-) [22:19:13] Do I have to use 'become' though? It has the downside of not leaving a track of who did it (if there are multiple people that maintain a tool) [22:19:14] New patchset: DamianZaremba; "Fix and making clear" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55417 [22:19:23] am I in the group of the tool account? [22:19:30] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55417 [22:19:32] If so, I should be able to access it without become, correct ? [22:19:50] You're in the group, but you'll run into problems if the actual phps aren't run by the tool user [22:20:04] So you'll have to sudo to start it, at least [22:20:28] Oh sure, the executable will run under the tool account [22:20:52] but I mean when I'm setting stuff up and such, I'd like it so say that I touched the file If I touched the file. [22:21:00] If it's a program meant to run and not stop, the very best way is to 'jstart the_bot' [22:21:16] Yeah, no need to become then -- the directory shares a group with its maintainers and is sgid [22:21:21] That's how it was at toolserver, but nobody used it because the problem was that stuff was 644 / 755 by default instead of 664/775 on the project storage. 
[22:21:56] which meant it was a pain to work with when someone else did something, you still had to act as the project to fix the permissions. [22:22:14] Are you planning to address that (or already done?) [22:22:16] New patchset: DamianZaremba; "Jinja won't take the full path" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55418 [22:22:37] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55418 [22:22:42] It's already done that way, but you may want to set your umask in your .profile [22:23:05] yeah, its 0022 by default. [22:23:12] Which makes sense when I'm in my home directory [22:23:24] The permissions on the tool home are set for sharing right, though, so just setting your umask should suffice [22:23:27] New patchset: DamianZaremba; "Coffee" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55420 [22:23:37] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55420 [22:23:53] drwxrwsr-x 4 local-wmfDbBot local-wmfDbBot 107 Mar 22 22:17 wmfDbBot [22:24:57] Coren: When I did become wmfDbBot, and `touch a` it was created as 644, umask is 0022 for the project account as well I see. [22:25:11] Ah, yes, that's a Ubuntu default. [22:25:14] -rw-r--r-- 1 local-wmfDbBot local-wmfDbBot 0 Mar 22 22:24 a [22:25:31] The tool user also has a .profile, though [22:25:36] In wmf production we set umask from /etc/profile.d (or something like that) [22:25:55] As to avoid having to create a .profile for every tool manually [22:25:57] Yeah, I'll see if it's puppetized yet [22:26:16] If not, I'll switch the default around; makes sense here. [22:26:23] * Coren makes a bz [22:26:48] Is it possible to set umask on a directory base? So that if I'm on my personal account and creating a file in /data/project/*/ it'll have group "local-project" instead of "svn" and writable by group. 
[22:27:05] I recall something like that existing, it's not umask but something else [22:27:20] recursive sticky bit? Not sure.. [22:27:55] ... not that I know of. Might be a solarisism? [22:28:09] Oh, wait [22:28:16] Yes, you mean the default _group_ [22:28:26] That's sgid on the directory. That's already on. :-) [22:28:44] New patchset: DamianZaremba; "POP path" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55422 [22:28:53] Yeah, so normally when I create a file it is "644 krinkle svn" that's fine. But when I create a file in /data/project/foo it'd be nice it was created as "664 krinkle local-foo" [22:28:55] But it won't override your umask [22:29:01] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55422 [22:29:03] okay [22:29:24] Ah, I see. nice, the group is fixed automatically [22:29:27] So if you create it in /data/project/foo it'll be krinkle:local-foo, but the mode will depend on your umask [22:29:35] okay [22:29:52] Just changing the default will automagically fix that though. :-) [22:30:20] New patchset: DamianZaremba; "Path fix" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55423 [22:30:30] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55423 [22:30:42] (Or setting your umask in .profile as a workaround for the moment) [22:31:14] Coren: Well, I'm not sure if changing the umask for everyone is desirable. If I create a file in /home/krinkle I don't want everybody else in any project (group "svn" = everybody) to be able to write it [22:31:25] which it would if umask is group writable for everybody everywhere [22:31:47] Your homes permissions have go= by default, though. [22:32:19] So unless you actually open 'em up, it's not a serious issue. 
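The split described above (the sgid bit on the directory decides the group, the umask decides the mode) can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how Tool Labs or the kernel implements it: the helper name `new_file_group_and_mode` is made up for illustration, and ACLs and mount options are ignored.

```python
import stat


def new_file_group_and_mode(primary_group, dir_group, dir_mode, umask, requested=0o666):
    """Model the group and mode a newly created regular file ends up with.

    Hypothetical helper for illustration only; ignores ACLs and mount options.
    """
    # A setgid directory hands its own group to files created inside it.
    group = dir_group if dir_mode & stat.S_ISGID else primary_group
    # The umask only clears permission bits; it never adds any.
    mode = requested & ~umask
    return group, mode


# Values from the conversation: the tool home is drwxrwsr-x (2775),
# the creator's primary group is "svn", and the default umask is 0022.
for umask in (0o022, 0o002):
    group, mode = new_file_group_and_mode("svn", "local-foo", 0o2775, umask)
    print(f"umask {umask:04o}: krinkle:{group} mode {mode:03o}")
# umask 0022: krinkle:local-foo mode 644
# umask 0002: krinkle:local-foo mode 664
```

The group comes out right in both cases; only the 0002 umask makes the file group-writable, which is why changing the default (or setting it in .profile) matters for shared tool directories.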
[22:32:45] New patchset: DamianZaremba; "Adding lucene monitoring to beta" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55424 [22:32:49] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55424 [22:32:49] Coren: What do you mean? [22:32:53] Given the general "shared dev" nature of tools, I'm not sure it's unreasonable to 002 by default. [22:33:05] drwx------ 5 krinkle svn 8192 Mar 22 22:14 krinkle [22:33:10] Coren: For the project account sure, but not for the personal accounts [22:33:30] if I create .bashrc in my home directory nobody should be able to write to that file. [22:33:40] home directory of me, not of the tool. [22:33:45] FINALLY [22:33:52] Sure, but if the users routinely write in the project home, /their/ accounts needs the 002 :-) [22:33:55] of course I could reset the chmod in that case. [22:33:57] !log nagios Made adding role checks simple - see https://gerrit.wikimedia.org/r/#/c/55424/ for basics [22:33:59] Logged the message, Master [22:34:11] * Damianz apologizes for the spam... *cough* [22:34:28] Coren: Oh well, I don't know. You'll do the right thing. I'll just observe. [22:34:48] Krinkle: Nobody can get at your personal files even if you put them 777, your actual home /itself/ is completely off by default to anyone but you. :-) [22:35:19] Although, if you do that, you still get trouted by your friendly neighborhood BOFH if you then mess up your stuff. :-) [22:35:24] Coren: What do you mean by completely off? Just because they can't list the contents doesn't mean they can't access a file [22:35:47] Krenair: No x means no access whatsoever. 
[22:36:05] marc@tools-login:~$ cat ~krinkle/.bash_history [22:36:05] cat: /home/krinkle/.bash_history: Permission denied [22:36:28] Coren: Bad example, that file is 600 [22:36:29] need moar executables [22:36:31] Try bash_profile [22:36:34] "execute", on a directory, means "traverse" [22:36:44] marc@tools-login:~$ cat ~krinkle/.bash_profile [22:36:44] cat: /home/krinkle/.bash_profile: Permission denied [22:37:03] It's more fun when you do that but setfacl extra acls on top [22:37:09] * Damianz thinks of his 'fav' cvs server [22:37:21] Coren: Hm.. interesting, how does that work? The file is 644. Some special feature in ubuntu? [22:37:36] To change into a dir you need executable rights on it [22:37:39] it's a linux thing [22:37:43] on toolserver one could always peek in other people's homes (same on every other linux machine I know of in labs and in wmf production) [22:37:47] No, the /directory/ is 700. You can't reach the file to try at all. :-) [22:38:05] Damianz: You don't need to change into a dir to access a file [22:38:06] Damianz: Not linux, that's been in since Unix V7. :-) [22:38:17] Krenair: yes you do.. sorta [22:38:20] Coren: meh [22:38:36] Damianz: In most linux systems I've been on one couldn't list files or cd into a person's home directory, but one can do cat /home/foo/.bash_profile [22:38:41] And if we were using selinux *hint* you could make everything 777 and you'd still be fine [22:38:46] Krinkle: The rule: if a directory does not have x permission for you, you cannot even /try/ to open a file in it. [22:38:54] Still need +x [22:39:04] I can set just +x so you can't list but can traverse [22:39:06] Krinkle: That's because the directory was +x, just not +r [22:39:20] Coren: Interesting [22:39:30] Coren: Which way is the default on linux? [22:39:42] Krinkle: Depends on the distro. [22:39:49] Ubuntu [22:40:22] Krinkle: Ubuntu has group access on by default, but the group owner is a group containing only yourself. Others don't have +x. 
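The rule stated above (no x on a directory means you cannot even try to open a file in it, while x without r lets you reach files but not list them) can be modelled with the mode bits alone. This is a rough sketch, assuming the caller is neither the owner nor in the owning group, and ignoring root, ACLs and SELinux; the function names are invented for the example.

```python
import stat


def other_can_list(dir_mode):
    """Listing a directory's entries needs the read bit on the directory."""
    return bool(dir_mode & stat.S_IROTH)


def other_can_read_file(dir_modes, file_mode):
    """True if an unrelated user could read a file below the given directories.

    dir_modes holds the mode of every directory on the path. Each one must
    grant execute ("traverse") before the file's own mode is even consulted.
    Hypothetical helper: ignores root, ACLs and SELinux.
    """
    can_traverse = all(mode & stat.S_IXOTH for mode in dir_modes)
    return can_traverse and bool(file_mode & stat.S_IROTH)


# /home is 755, the home directory is 700, .bash_profile is 644:
print(other_can_read_file([0o755, 0o700], 0o644))  # False
# Same file, but the home directory is 711 (traverse only):
print(other_can_read_file([0o755, 0o711], 0o644))  # True
print(other_can_list(0o711))                       # False
```

The 711 case matches the systems where `cat /home/foo/.bash_profile` worked even though the directory could not be listed.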
[22:40:30] ok [22:40:51] Our own kickstart here has no access to group since your primary group isn't a singleton. [22:40:59] s/kickstart/preseed/ [22:41:08] I just make puppet 711 /home and 700 /home/* cause I'm a little paranoid [22:42:05] Krinkle: Now stop chattering and go back to work installing that bot so I can make sure you're all set. :-P [22:44:10] On it [22:44:37] New patchset: DamianZaremba; "Better path" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55427 [22:44:53] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55427 [22:44:56] * Coren cracks his whip. [22:45:21] Right, un-fooooked groups [22:45:35] Anyone want anything doing with monitoring while I have my cat5 flogger out? [23:03:10] Coren: Errr.. I'm already a bit stuck with the permissions. I cloned various repositories into the data, then I did 'become' and tried git pull, but I can't because .git is owned by krinkle 755 [23:03:25] I should've cloned them under the account I guess [23:03:27] argh [23:03:32] Hang on. [23:03:50] * Coren has an experimental solution he'll enable briefly. [23:04:24] * Damianz wonders if Coren'e experimenting is going to require therapy for Krinkle afterwards [23:04:42] Krinkle: from the tool account, just "take the_files/dir" [23:06:54] That should have given the tool ownership of the files. :-) [23:07:32] Ryan_Lane doesn't like that tool, but you had the perfect use-case. Tell me when you're done so I can move it back out of the way. [23:08:57] Coren: Yeah, seems to do the job [23:09:23] Coren: It warned about not following a symlink but that was fine, the symlink was pointing to something within the same subtree so it did that anyway [23:09:38] Coren: open source? [23:10:00] Krinkle: Meh. So small as to be CC0-able. 
[23:10:08] I can't view it :) [23:10:47] It's not a script: /usr/src/take.cc [23:11:16] you can't setuid a script, evil bastards [23:12:21] Coren: Well, aside from that (cat `which take`, permission denied) [23:12:32] but yeah, I would've gotten binary [23:12:50] which naturally is my second language, not. [23:13:06] Krinkle: That's because its permissions were 711 :-) But it's already out of the path for now. [23:14:00] (Well, 4711, but who's counting) :-) [23:14:40] Testing now [23:14:53] Krinkle: please to use jstart [23:15:12] Krinkle: That's the happy fun method of indestructible joy. :-) [23:16:08] (you may have to jstart -cwd if the bot expects to start in your current directory) [23:17:12] Coren: What do you recommend, jsub or jstart? I wasn't using qsub for this at toolserver, so no need for qsub-like per se. [23:18:12] jstart is a happy fun wrapper that starts your code on the continuous queue and makes sure it /keeps/ running until (a) you jstop it or (b) it exits without error. [23:18:39] It'll keep running even if the execution node dies, so long as there is somewhere else to restart it. :-) [23:19:01] Coren: So jstart is more than just a shortcut for jsub with certain parameters? [23:19:37] Coren: The documentation says "to the files jobname.out and jobname.out in the tool account's home directory" It names the same file twice there. Not sure what that should be though. [23:19:41] Krinkle: No, jsub has the same magic if you do -once -continuous; but I may add some extra monitoring to jstart eventually [23:19:54] cool [23:19:57] Krinkle: Err, one of those should be .err [23:21:19] Change on mediawiki a page Wikimedia Labs/Tool Labs/Help was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=662618 edit summary: [+0] /* Simple utilities */ ce [23:24:12] Coren: Hm.. 
I'm doing something wrong I guess [23:24:23] local-wmfDbBot at tools-login in ~/apps/ts-krinkle-Kribo (master) [23:24:23] $ jstart -N dbbot-wm -- php Init.php [23:24:29] $ cat ~/dbbot-wm.err [23:24:30] [Fri Mar 22 23:23:17 2013] /data/project/wmfDbBot/apps/ts-krinkle-Kribo/dbbot-wm: not an executable file [23:24:57] ... [23:25:10] The -- isn't supported, but that shouldn't be it. [23:25:13] * Coren checks. [23:25:48] Coren: Did it without it, same error [23:26:04] Yeah, I see the bug. Something silly on my part [23:26:12] Coren: Is it by design that it appends to the .err/.out files instead of overwriting them each run? [23:26:26] Coren: Something I can do meanwhile? Or is this a quick fix? [23:26:35] Quick fix. 2 min or so [23:26:54] Yes, appending is by design; in case the job needs to be restarted on a different node. [23:27:03] How should I have used (I understand you'll make this way work, but was there a different way I could've made it work?) [23:27:24] okay, cool. [23:28:04] Not specifying a -N. I had a bug there. :-) [23:28:10] It's fixed now, though. [23:28:44] Hmm. You might have to be explicit about the php path though. [23:28:52] I think I don't trust the $PATH by default. [23:29:24] * Coren makes notes to document that. 
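The jstart behaviour described earlier (keep the job running until it exits without error) boils down to a restart loop. The sketch below only illustrates that idea and is not how jstart or the grid engine is implemented: `run_until_clean_exit` and the `max_restarts` cap are assumptions added for the example, and jstop and node failover are not modelled.

```python
def run_until_clean_exit(run_once, max_restarts=10):
    """Call run_once() until it returns exit status 0 or restarts run out.

    run_once is any callable returning an integer exit status.
    Returns the list of exit statuses observed, in order.
    Hypothetical helper; the restart cap is an assumption of this sketch.
    """
    statuses = []
    for _ in range(max_restarts + 1):
        status = run_once()
        statuses.append(status)
        if status == 0:
            # A clean exit means the job is done and must not be restarted.
            break
    return statuses


# A stand-in job that crashes twice, then exits cleanly.
attempts = iter([1, 1, 0])
print(run_until_clean_exit(lambda: next(attempts)))  # [1, 1, 0]
```

A real supervisor would also wait between restarts and reopen the .out/.err files in append mode, which is consistent with the appending behaviour mentioned above.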
[23:31:14] New patchset: DamianZaremba; "Fixing tests" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55436 [23:31:14] New patchset: DamianZaremba; "Moving ssh out" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55437 [23:31:14] New patchset: DamianZaremba; "Global" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55438 [23:31:28] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55436 [23:31:48] job-ID prior name user state submit/start at queue slots ja-task-ID [23:31:48] ----------------------------------------------------------------------------------------------------------------- [23:31:48] 132 0.25000 dbbot-wm local-wmfDbB r 03/22/2013 23:31:16 continuous@tools-exec-01.pmtpa 1 [23:31:51] Yeay! [23:32:02] Coren: Getting several 'Could not open input file: Init.php' entries in the .err file [23:32:05] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55437 [23:32:16] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/55438 [23:32:25] Ah. jstop it [23:32:31] Then restart with -cwd [23:32:32] :-) [23:32:45] and provide full path? [23:32:54] (for php the file or both?) [23:33:15] -cwd is probably more useful since I'm guessing your script actually expect to run in that directory [23:33:30] No, it can be run from anywhere [23:33:47] If anything, it expects to be run from the bot directory (~/apps/Kribo) not ~ [23:33:59] Trying now: [23:34:11] wmfDbBot in ~:$ jstart -N dbbot-wm php apps/ts-krinkle-Kribo/Init.php [23:34:59] Nice [23:35:01] It's running [23:35:03] and on irc [23:35:05] Great :) [23:35:09] Yeay! [23:35:51] Coren: Im trying to find in the documentation how often 'continuous' is enforced. [23:35:57] i.e. if it dies for some reason [23:36:26] It's SGE restarting it. 
If it dies because the /script/ dies, it's about 5s [23:36:39] impressive [23:36:44] * Krinkle is going to try that [23:36:57] If it dies because the infrastructure croaks, it may take 50 seconds for the gridmaster to reschedule it elsewhere [23:38:00] I'm used to 5 to 10 minutes. [23:38:26] Looks like (on the irc front) it came back within a minute. Which includes teardown and set up. [23:38:27] Great [23:39:19] You can use '/usr/local/bin/job' from a php webpage if you want to have a web status. [23:39:32] Coren: Only one thing I'm worried about. Appending instead of replacing the job output files. I understand why and I like it, but it does mean it'll grow fast. [23:40:19] Perhaps some built-in rotator (or easy hook up with logrotate) [23:40:59] * Coren ponders. [23:41:16] It's a bit hard to do since it is open on some random exec node... [23:41:45] Perhaps make logrotate look in /data/project/*/logrotate.d (in addition to /etc/logrotate.d) [23:42:09] logrotate pretty much covers all the edge cases of how to move those things when in execution [23:42:13] (copy vs. move etc.) [23:42:30] Sounds like a good plan. Do you mind opening a bz with the feature request? [23:42:31] Though I guess in this case it isn't my app writing to the file [23:42:36] it is wrapped by the grid [23:42:43] So at least that gives you consistency [23:42:52] in how the file is opened [23:42:52] Krinkle: It is, but I should be able to convince it to close-reopen. [23:44:01] Given that my app isn't writing to it directly, it wouldn't make sense for me to have to configure logrotate locally, you'll know best which settings work best with jstart (or perhaps not use logrotate at all and do it within job grid) [23:44:48] It may be a bit tricky since the file is actually being written on another host, but it should be doable. [23:44:51] I'm thinking something like jstart/jsub --log-maxage='1 week' --log-rotate='daily' [23:44:58] Ah, right. [23:45:06] but it's a mounted disk though, right? 
[23:45:39] * Krinkle files bug [23:46:08] Right now it's on *thunderclap* gluster. We're aiming to scrap that soon though. [23:50:16] Coren: https://bugzilla.wikimedia.org/show_bug.cgi?id=46471 [23:52:26] Great, thanks. [23:53:53] Hm.. looks like ganglia isn't responding http://ganglia.wmflabs.org/latest/ [23:53:55] Error 324 (net::ERR_EMPTY_RESPONSE): The server closed the connection without sending any data. [23:54:38] and back up [23:55:09] Coren: "PHP Fatal error: Call to undefined function curl_init() in " [23:55:17] My bot just died [23:55:31] Looks like the packages could use some work [23:55:43] Mind if I blow it up and bloat with moar crap? [23:55:59] php5-curl for starters [23:56:08] puppet? yes, right. On it. [23:56:17] No worries, I knew there were missing dependencies. :-) [23:56:55] http://ganglia.wmflabs.org/search/ 404 [23:57:07] php5-curl. Predict other missing stuff? [23:57:08] Ok, who's messing with ganglia [23:57:43] Change on mediawiki a page Wikimedia Labs/Tool Labs/Notepad was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=662624 edit summary: [+10] +missing