[00:21:39] BracketBot@itwiki?
[01:36:42] Ryan_Lane: Can you create a labs project called "splunk"?
[01:36:55] sure. what's it for?
[01:37:25] Testing log data
[01:37:50] is splunk open source?
[01:38:00] Oh crap, it's not.
[01:38:18] at least not entirely. There's a debian tgz for installation
[01:38:26] yeah. thought so
[01:38:27] there's a few precompiled binary files
[01:38:48] labs is open source only
[01:39:01] k
[01:40:11] so it's a piece of software they're willing to freely license to oss projects (similar to how we're using SauceLabs and Browserstack right now). But does require local installation. Though we haven't decided yet, I'd like to try it out and see how it works. Not sure how to do that.
[01:40:25] Or whether this would be a nogo in general regardless of labs specific policy.
[01:40:39] Krinkle: they've offered in the past to analyze our logs
[01:41:07] we let them for a while and decided that even if it was helpful, we'd rather use open source alternatives
[01:41:17] and yeah, it's a nogo in labs
[01:41:24] I have a fair amount of experience with it at jQuery, they use it for most of their logs.
[01:41:27] we've been pretty strict about open source only
[01:41:51] Krinkle: logstash is a relatively OK alternative
[01:41:51] I was going to propose to use it for our error logs. Right now we have no proper solution to see spikes in that or to be alerted of it.
[01:42:09] it uses elastic search
[01:42:39] Splunk is similar I think, though I haven't seen the term elastic used.
[01:42:54] basically you can make queries and get results right away
[01:43:01] yeah, logstash can do the same
[01:43:16] Krinkle: try out: http://logstash.openstack.org/
[01:43:46] I believe they use it mostly for jenkins logs
[01:43:54] wow, this is scarily similar
[01:43:57] like 99% UI
[01:44:48] we'd be really hesitant to use something proprietary in production, even if it has a free license
[01:45:10] sure, no worries. Just checking it out.
[01:45:17] * Ryan_Lane nods
[01:45:22] Anyway, we should start using something, anything.
[01:45:35] totally agree
[01:45:42] tool labs needs something too
[01:45:56] Automatically picking up on the various logs (even pushed or pulled from different machines to the indexer)
[01:45:58] they started writing something from scratch, which makes me want to stab
[01:46:11] who?
[01:46:20] tools project folk
[01:47:18] anyway, yeah, it would be nice to have some tool for logs
[01:47:36] where we could link to common queries, or have them as tags
[01:50:21] `"fatal" source=mw_errors type=php_error_log grouptrim=subject | timechart by count | earliest 1 week`
[01:50:25] those kind of queries :)
[01:51:22] yeah ;)
[01:52:52] (I'm being corrected, it'd be | timechart count by hour |
[01:53:06] or something like that
[01:53:10] heh
[05:56:24] New review: Petrb; "why don't you make a script similar to what I made, which just run make and then it create a package..." [labs/toollabs] (master) - https://gerrit.wikimedia.org/r/70771
[05:59:04] New review: Petrb; "on other hand I see nothing wrong on having ./configure && make && make install option as well :-) i..." [labs/toollabs] (master) - https://gerrit.wikimedia.org/r/70771
[08:30:49] crontab runs "normally"? Is not able to run a command that works in the CLI
[09:20:44] fale: you might have to set PATH manually
[09:20:52] (re: crontab)
[09:21:03] I usually have a small .bash script that sets PATH and does the commands I want
[11:24:18] YuviPanda: thanks :) The problem was that the script I run in crontab execs other scripts and there was a wrong path for the sub-scripts :)
[14:48:20] I think I need to delete some rrd files from ganglia.wmflabs.org - could someone help me with that?
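The crontab exchange above comes down to two points: cron starts jobs with a minimal environment, so PATH should be set explicitly, and sub-scripts should be located by absolute path rather than relative to whatever directory cron happens to start in. A minimal sketch of a wrapper along those lines follows. The directory list and the function names (`build_cron_env`, `resolve_sub_script`, `run_job`) are hypothetical and only illustrate the idea; this is not the script YuviPanda or fale actually used.

```python
import os
import subprocess
from pathlib import Path

# Hypothetical directories; adjust to wherever the commands you need live.
EXTRA_PATH_DIRS = ["/usr/local/bin", "/usr/bin", "/bin"]


def build_cron_env(base_env, extra_dirs):
    """Return a copy of base_env whose PATH starts with extra_dirs."""
    env = dict(base_env)
    current = [d for d in env.get("PATH", "").split(os.pathsep) if d]
    # Keep existing entries after the explicit ones, skipping duplicates.
    merged = list(extra_dirs) + [d for d in current if d not in extra_dirs]
    env["PATH"] = os.pathsep.join(merged)
    return env


def resolve_sub_script(name, script_dir):
    """Build an absolute path to a helper script inside script_dir."""
    return str(Path(script_dir).resolve() / name)


def run_job(sub_script_name):
    """Run a script that sits next to this wrapper, with an explicit PATH."""
    script_dir = Path(__file__).resolve().parent
    env = build_cron_env(os.environ, EXTRA_PATH_DIRS)
    target = resolve_sub_script(sub_script_name, script_dir)
    return subprocess.run([target], env=env, check=True)
```

Calling `run_job("helper.sh")` from the wrapper (with `helper.sh` standing in for a real, executable sub-script) would then behave the same under cron as in an interactive shell, as far as PATH and script location are concerned.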
[14:53:18] random spam :P http://petrbena.blogspot.cz/2013/06/handling-oom-issues-gracefully-on-linux.html
[14:53:26] basically alternative to slayerd on TS
[14:54:48] if anybody haz any kind of feedback tell me :o
[14:57:46] manybubbles yes I can help you
[14:57:53] which files you need to remove
[15:33:10] petan, you there?
[15:33:17] yeh
[15:34:23] petan, there's something funky going on with file ownership. Can you look at /data/project/xtools/Peachy/Includes/AutoUpdate.php/ I can't seem to ownership of the file.
[15:34:54] sure
[15:35:05] I uploaded the file like any other file.
[15:35:26] -rwxrwxr-x 1 cyberpower678 wikidev 11429 Jun 25 00:05 AutoUpdate.php
[15:35:52] I have no clue what wikidev is.
[15:36:09] it's your primary group
[15:36:26] I can't use the take command on it, and it's causing write access problems for the scripts.
[15:36:31] petan: Another option for managing the OOM killer is to set /proc//oom_score_adj to -1000 for processes you never want killed (Ubuntu apparently has some sort of config for upstart-managed daemons, see http://upstart.ubuntu.com/cookbook/#oom-score). Or you could set /proc/sys/vm/overcommit_memory to 2, but note that can cause failures if some big process tries to do a fork-and-exec and total VM (memory + swap) is already mostly committed.
[15:36:44] The group should be xtools petan
[15:36:54] Like all of the other files/
[15:37:33] petan, how do I set the group of the files to xtools?
[15:38:44] Cyberpower678 for that you would need my new take which others do not like
[15:38:53] scfc_de ^ here is example why -g is useful
[15:39:15] Coren ^
[15:39:15] petan, what's the new take?
[15:39:25] that is a version we don't use on tools project...
[15:39:33] there is a take utility which let you overtake the owner
[15:39:34] Cyberpower678: What error message do you get?
[15:39:37] but it can't change the group
[15:39:43] Lemme read scrollback.
[15:39:50] scfc_de, one sec...
[15:39:56] Coren basically you can see why -g is useful in take
[15:40:06] sometimes group need to be changed as well
[15:40:28] petan: No, it shows why it is useless: if your take had allowed taking that file over, then it would have been broken. You can take a file ONLY if you are part of the group.
[15:40:44] Coren well, it wouldn't work probably...
[15:40:52] Cyberpower678: Hang on.
[15:41:31] but I still can't understand why group must not be overtaken when owner can be
[15:41:39] it would be useful for cases like this
[15:41:43] Cyberpower678: Why is /data/project/xtools/Peachy not +s? Did you change the permissions on that directory?
[15:41:43] Cyberpower678: I see what your problem is.
[15:41:47] petan: No it wouldn't.
[15:42:06] howcome, he would just take -g and he would never need help from others
[15:42:09] Cyberpower678: You removed the g+s from /data/project/xtools/Peachy
[15:42:09] scfc_de, no. Just the files.
[15:42:34] petan: If take allowed him to take a file that was in a group not his own, it would be broken.
[15:42:40] crap I just executed take $HOME/public_html
[15:42:42] :/
[15:42:50] This could take awhile
[15:42:54] Cyberpower678: I didn't /ask/ if you removed g+s, I told you you /have/ :-)
[15:43:06] Coren: yes it would be broken by design, I don't argue about that, but I ask why is it shouldn't allow it by design?
[15:43:08] Cyberpower678: You did a chmod you shouldn't have. :-)
[15:43:24] petan: Are you *seriously* asking this question?
[15:43:33] like, if you are able to overtake owner, you can change the group anyway
[15:43:36] it's just complicated
[15:43:59] Incoming
[15:44:01] ...
[15:44:02] local-xtools@tools-login:~$ take $HOME/Peachy
[15:44:03] Update.log: You need to share a group with the file
[15:44:03] AutoUpdate.php: You need to share a group with the file
[15:44:03] acc.php: You need to share a group with the file
[15:44:03] LICENSE: You need to share a group with the file
[15:44:04] tmp: You need to share a group with the file
[15:44:04] petan: Yes, but *if you are not in the file's group, you shouldn't be **allowed** to take over ownership!*
[15:44:06] test.php: You need to share a group with the file
[15:44:41] Cyberpower678: That's because you removed g+s from
[15:44:46] * Cyberpower678 now sees what Coren is talking about. GID permission is missing.
[15:44:50] Coren why you shouldn't be allowed, when it's in your home and it's readable by you? :o
[15:45:11] * Cyberpower678 executes chmod to fix that.
[15:45:12] I still fail to see what the problem with that is
[15:45:15] Cyberpower678: That's because you removed g+s from /data/project/xtools/Peachy so the tool never got group ownership.
[15:45:47] Cyberpower678: Lemme fix it for you now. Don't do that again. :-)
[15:46:05] how could it ever happen that someone would upload files there which must not be ever overtaken... it doesn't make sense
[15:46:06] Coren, wait..
[16:03:59] * Coren wants to find how long ago that got added.
[16:04:35] Hard to google for something like that. 10:1 that got added to GNU find first. :-)
[16:05:11] Manpage says it was added to GNU find in 4.2.12, and 4.2.15 was in Debian in Jan 2005.
[16:05:13] brb 2h
[16:05:28] btw if someone wanted to give me feedback for my new linux idea :D I would be happy
[16:05:31] http://petrbena.blogspot.cz/2013/06/handling-oom-issues-gracefully-on-linux.html
[16:05:34] I was looking for similar tool for ages so I created it myself...
[16:05:38] I think that TS's slayerd was hand made as well, I couldn't google it
[16:11:59] petan: https://svn.toolserver.org/svnroot/toolserver/trunk/slayerd/
[16:12:09] aha!
I knew it :P
[16:12:11] it's hand made
[16:12:45] anomie: That doesn't make me any younger, does it?
[16:12:47] well, my tools is different a bit, but it serves similar purpose... terminatord is more like OOM killer, just it's more safe & friendly :P
[17:19:58] I need my gerrit password reset .. I was wondering how I might be able to do that.
[17:20:22] I've tried about 10 different passwords on 2 different possible user name's and no luck . and there is no 'forgot password' link either.
[17:25:25] Coren|AFK ping :o
[17:25:50] Coren|AFK what you think about running terminatord on -login, so that when it was nearly OOM it would survive
[17:53:52] Coren|AFK: I'm writing a blog post about the service groups feature
[17:53:59] Coren|AFK: wanted to clarify some things
[17:54:25] Coren|AFK: your original concept was to use two new OUs, ou=project-people and ou=project-groups or something along those lines, right?
[17:54:54] (this post is going on my personal blog, since it aggregates to OpenStack's planet and Wikimedia's planet)
[18:34:25] Coren|AFK: hm. looking at --manage-gids, how does it work properly with local-? they aren't defined on the NFS server and manage-gids implies that it ignores info sent from the client
[18:34:51] oh. right. primary gid
[18:36:50] and indeed, manage-gids only handles secondary groups
[19:03:57] I'm getting 'failed to create instance' when I try to build a new instance in labs but no indication of why. is there a good place for me to look?
[19:04:44] looks like I've taken all my allowed cores
[19:15:33] manybubbles, do you need a quota bump, or can you clean up some unused instances?
[19:16:39] ^demon has some instances that I'm not sure what he's doing with them. We could also rebuild some instances with fewer cpus but since we don't have puppet configs for those it'd be a pain. We're building the puppet stuff as quick as we can.
[19:19:42] <^demon> manybubbles: I just deleted my two instances I don't need.
[19:20:00] sweet!
now we don't need to quota bump
[20:41:19] Coren: http://ryandlane.com/blog/2013/06/27/per-project-users-and-groups-aka-service-groups/
[20:41:41] Coren: let me know if anything is inaccurate
[20:42:46] petan: Is there another possibility to connect with shell (only-pw-auth)?
[20:42:53] *ssh
[20:43:43] this is the first time that I have problems accessing a shell o_O
[20:43:56] password auth is not allowed
[20:44:15] what issue are you having? and connecting to which instance?
[20:46:14] Ooo. You have a blag!
[20:46:22] yep
[20:46:26] actually I have no Linux OS, and I use Putty on Win (I know, putty is not a real console)
[20:46:29] I haven't posted to it for a while
[20:49:30] strange: https://triton.aquila.uberspace.de/wp-wiki/oops/sshwm.JPG
[20:49:35] Ryan_Lane: That seems pretty much exactly what happened, and clearly explained too. :-)
[20:49:42] great
[20:49:52] I've been meaning to blog up some of the stuff we've been doing
[20:50:01] I guess I should put them on the wikimedia blog, though
[20:50:20] I need to add an OpenStack category and have it aggregated on the OpenStack planet, though
[20:50:28] * Coren nods.
[20:52:55] Ryan_Lane I actually started a blog too but nobody is reading :D
[20:53:07] bleh. I don't know what's happened to akismet lately, but my pending comments folder has 7,200 comments
[20:53:09] all spam
[20:53:22] o_O
[20:53:29] Coren did you get my message? :o
[20:53:30] don't ever get picked up by news sites, especially for openstack stuff
[20:53:47] it gets you an absurd number of views, but it also sics the spam bots on you
[20:54:07] petan: get aggregated on some planets
[20:54:15] what is that
[20:54:35] petan: Which one?
[20:54:37] aggregation services specific to communities
[20:55:24] Coren about enabling terminatord on -login so that it survives an eventual OOM, but it is just an idea
[20:55:34] that is something like slayerd on toolserver
[20:56:05] it kills problematic processes which eat too much ram when the system is out of memory, and unlike OOM killer it never kills the important ones :D
[20:56:09] petan: it's also possible to adjust OOM
[20:56:22] Ryan_Lane I was actually reading article on that... but I am still confused
[20:56:36] you can change some flag in /proc to prevent OOM killer killing some process
[20:56:38] you can make OOM target some processes before others and some processes less than others
[20:56:41] but it is only temporary :/
[20:56:45] petan: I'd rather see if simple OOM tweaks won't do the trick. I don't like relying on a daemon for something the kernel should be doing.
[20:56:52] you can make it permanent
[20:56:53] ok but you can do that for running processes only
[20:57:01] how?
[20:57:08] is there some guide for that?
[20:57:32] ok I am fine with tweaking OOM killer if it is possible
[20:57:40] I'd have to find the docs. we use it in production for some things, though
[20:58:10] I know you can like echo -17 > /proc/procname/oomsomething but that is only for a running process
[20:58:20] I'm too stupid for keys, try it tomorrow :)
[20:58:49] Steinsplitter: we have a guide
[20:58:58] the advantage of terminatord is that it can send the user whose process is down an e-mail with explanation :P
[20:58:59] i know
[20:59:10] which OOM killer can't do
[20:59:10] https://wikitech.wikimedia.org/wiki/Help:Putty
[20:59:12] Ryan_Lane: https://wikitech.wikimedia.org/wiki/Help:Access_to_instances_with_PuTTY_and_WinSC
[20:59:41] Ryan_Lane: aaahhhh :), Thy :):) !
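On the OOM tuning discussed above: the `/proc` knob being reached for is `/proc/<pid>/oom_score_adj`, which takes values from -1000 (never pick this process) to 1000 (pick it first); the `-17` mentioned in the channel belongs to the older `oom_adj` interface. The sketch below is an assumption-laden illustration, not what is used in production: the helper names are made up, lowering a score normally needs root, and the last helper relies on the adjustment being inherited by the exec'd program, which is one possible way around the "only for running processes" objection.

```python
import os

OOM_SCORE_ADJ_MIN = -1000  # process is never selected by the OOM killer
OOM_SCORE_ADJ_MAX = 1000   # process is the preferred victim


def oom_score_adj_path(pid):
    """Path of the per-process OOM adjustment file (Linux only)."""
    return f"/proc/{int(pid)}/oom_score_adj"


def clamp_oom_score_adj(value):
    """Force value into the range the kernel accepts."""
    return max(OOM_SCORE_ADJ_MIN, min(OOM_SCORE_ADJ_MAX, int(value)))


def set_oom_score_adj(pid, value):
    """Write the adjustment for pid; lowering it usually requires root."""
    with open(oom_score_adj_path(pid), "w") as handle:
        handle.write(f"{clamp_oom_score_adj(value)}\n")


def exec_with_oom_score_adj(value, argv):
    """Adjust the current process, then replace it with the command in argv.

    Hypothetical launcher: the new program starts with the adjusted score
    instead of being tuned after the fact.
    """
    set_oom_score_adj(os.getpid(), value)
    os.execvp(argv[0], argv)
```

For example, `set_oom_score_adj(os.getpid(), 500)` makes the calling process a more likely victim, which typically needs no special privileges; protecting a daemon with -1000 would have to be done as root or by whatever starts the daemon.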
[20:59:44] *thx
[20:59:46] yw
[20:59:57] Help:Putty links to the doc you were using :)
[21:08:47] Ryan_Lane I really can't find anything, all the pages are like OOM killer is evil, try not to ever interact with it XD
[21:09:13] delete from wp_comments where comment_approved=0;
[21:09:15] \o/
[21:09:30] petan: bleh
[21:09:50] by lots of linux noobs, most likely
[21:09:57] or: never change the default behavior, the default behaviour is always best :DD
[21:10:07] LMAO
[21:10:24] system knows it better than you what is good for it...
[21:11:23] http://lwn.net/Articles/317814/
[21:12:20] this is funny
[21:12:29] ignore the idiotic comments
[21:12:40] lwn comments are often as bad as slashdot comments
[21:12:40] it describes OOM as an ingenious thing which always kills the best process and never makes any mistake
[21:12:48] Define best
[21:12:54] in fact OOM killer always brings the system down by killing some super important service XD
[21:13:14] Damianz: This value is determined on the basis that the system loses the minimum amount of work done, recovers a large amount of memory, doesn't kill any innocent process eating tons of memory, and kills the minimum number of processes (if possible limited to one)
[21:13:18] definition
[21:13:19] petan: what describes it as that?
[21:13:28] the article I linked?
[21:13:34] this sentence kind of :P
[21:13:38] The most important process might have a memory leak
[21:13:42] the article is good
[21:13:58] I'd say the article mentions OOM's shortcoming pretty well ;)
[21:14:20] The longer a process is alive in the system, the smaller the score. <<< I want to turn this behaviour off
[21:14:30] it is evil thinking that long running processes are OK
[21:14:38] some daemon can be slowly leaking for months
[21:14:53] like gluster XD
[21:15:18] well, gluster doesn't actually do that anymore ;)
[21:15:26] but yeah, I understand your point
[21:15:28] Proactive monitoring should pick that up...
oomkiller is more like an oh shit process
[21:15:36] but long running processes are often important
[21:15:39] so it makes sense
[21:15:47] and yeah, proper monitoring should pick it up
[21:19:57] In my experience, the default OOM will almost unfailingly leave the system usable enough that a real sysadmin can come in and evaluate the situation.
[21:20:13] I.e.: human inspection beats automated processes 100% of the time.
[21:20:59] Unless you let a retarded monkey in
[21:21:05] Then it's like 80% of the time
[21:37:15] Ryan_Lane: Unsurprisingly, I note Nathan2055 hasn't substantiated or explained what "sketchy policy revamp" he was going on about. FUD.
[21:40:02] Hipster Ryan is out
[21:41:40] Hi, I want to add a member to project 'contributors' but I get this message: Failed to add jgbarah to contributors. This needs user jgbarah to have the "loginviashell" right.
[21:41:49] the user says he has uploaded the ssh key
[21:41:52] get them to request shell rights
[21:42:04] ok Damianz
[21:42:15] New review: coren; "I don't know whether the extraneous comma is actually harmful, but it doesn't hurt to remove it." [labs/toollabs] (master); V: 2 C: 2; - https://gerrit.wikimedia.org/r/70734
[22:12:34] Coren: yeah. no response. FUD.
[22:12:40] Damianz: hipster ryan? :)
[22:13:07] #Devops
[22:13:47] Damianz: still confused :)
[22:13:59] Last blog post tag :P
[22:14:34] I have that as a tag? on the blog itself? :)
[22:14:38] I used #openstack on twitter
[22:14:50] oohhhh
[22:14:51] Mhm
[22:15:01] in the categories
[22:15:18] I still don't understand the difference between categories and tags... so I just made them 1 thing on mine :P
[22:15:29] I don't use tags. just categories
[22:15:42] I should really remove the tag element from the skin
[22:15:52] I want to switch my skin at some point anyway
[22:15:55] I'm tired of the current one
[22:41:50] Does OSM live under extensions?
[23:17:39] Damianz: yes
[23:17:47] mediawiki/extensions/OpenStackManager
[23:17:52] :)
[23:18:15] I wish gerrit had a better 'project page' to link to =\
[23:20:40] Damianz: i think this is what they'd call the project page https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions/OpenStackManager
[23:21:11] Yeah... it's a bit cruddy.... so I linked to changeset search instead, which is just as crappy
[23:21:36] Damianz: gitblit?
[23:21:49] Hmm, that could work actually
[23:22:00] https://git.wikimedia.org/summary/mediawiki%2Fextensions%2FOpenStackManager
[23:23:20] * Damianz pats mutante