[01:37:51] Coren, petan: Could you look on tools-login how many files (inodes) a screen session takes? Basically "find /var/run/screen/S-gifti | wc -l".
[01:38:33] Two, plus the dir
[01:39:28] Thanks. So that's not what's eating inodes at the Toolserver ... Hmmm.
[04:10:03] switch to a better filesystem and you won't have problem with inodes :P
[04:12:25] Filesystem Inodes IUsed IFree IUse% Mounted on
[04:12:35] /dev/sda4 0 0 0 - /
[04:12:41] :P
[05:11:36] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 0 minutes)
[05:25:10] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 13 minutes)
[05:38:40] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 27 minutes)
[05:52:15] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 40 minutes)
[06:05:46] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 54 minutes)
[06:19:16] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 67 minutes)
[06:32:42] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 81 minutes)
[06:46:03] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 94 minutes)
[06:59:25] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 108 minutes)
[07:12:54] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 121 minutes)
[07:26:24] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 134 minutes)
[07:39:59] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 148 minutes)
[07:53:29] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 162 minutes)
[08:06:58] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 175 minutes)
[08:20:34] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 189 minutes)
[08:34:03] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 202 minutes)
[08:47:33] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 216 minutes)
[09:01:03] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 229
minutes)
[09:14:33] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 243 minutes)
[09:27:58] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 256 minutes)
[09:41:28] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 270 minutes)
[09:54:54] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 283 minutes)
[10:03:21] Hi Coren! How are you doing? :)
[10:03:38] awake?
[10:08:28] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 297 minutes)
[10:21:49] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 310 minutes)
[10:35:18] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 323 minutes)
[10:48:49] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 337 minutes)
[11:02:23] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 350 minutes)
[11:15:52] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 364 minutes)
[11:29:18] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 377 minutes)
[11:42:48] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 391 minutes)
[11:56:22] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 404 minutes)
[12:09:48] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 418 minutes)
[12:20:01] Cyberpower678: pong
[12:23:14] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 431 minutes)
[12:24:27] addshore hey :D
[12:24:34] addshore I created the extension for my irc client
[12:24:40] that does the irc cloud style part / join
[12:24:43] :D
[12:24:44] heh
[12:24:49] :D
[12:26:03] I would let you try it but you would need to compile it yourself...
[12:26:11] it requires latest version of pidgeon because of new hooks
[12:27:22] mhm... it still got some bugs
[12:29:47] ill git pull it some time soon :)
[12:34:56] right
[12:35:02] time to try and labify this IW bot
[12:35:51] Coren: did you get anywhere with that code?
:)
[12:35:55] addshore, ping
[12:36:01] Cyberpower678: pong
[12:36:06] addshore, ping
[12:36:21] what? :P
[12:36:30] Cyberpower678: pong
[12:36:39] I should be asking you.
[12:36:41] you pinged me on memoserv ;p
[12:36:44] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 445 minutes)
[12:36:52] addshore, that was way back.
[12:36:58] "labify" what a word!
[12:37:01] :)
[12:37:16] Replication!?!?!?
[12:37:26] I needs replication!
[12:37:31] :p
[12:37:44] So does everybody else
[12:37:52] I need it more. :p
[12:37:56] hehe
[12:39:02] Coren Are you here and can you tell us more about the status of replication?
[12:42:26] Silke_WMDE: it is half labified ;p just not quite ready for 'tools' yet
[12:42:46] :)
[12:50:09] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 458 minutes)
[13:03:35] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 472 minutes)
[13:17:00] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 485 minutes)
[13:30:26] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 499 minutes)
[13:33:17] hey all
[13:34:12] has anyone gotten python scripts using flup/fcgi to work? the flup package seems to be installed, but i'm getting a 500 internal server error response
[13:43:56] Warning: There is 1 user waiting for shell: Twotwotwo (waiting 512 minutes)
[13:54:17] [bz] (NEW - created by: Chris McMahon, priority: Unprioritized - major) [Bug 47893] fully support Wikilove on beta - https://bugzilla.wikimedia.org/show_bug.cgi?id=47893
[14:37:12] JohannesK_WMDE: where? :)
[14:37:43] http://tools.wmflabs.org/render-tests/erdbeer/hello2.py
[14:37:51] it is working now, thanks to help from coren.
[14:38:19] ahh :)
[14:38:28] Coren: are you around? ;p
[14:38:31] the problem was that the script and the directory it lives in must be owned by the project user
[14:39:03] addshore: I am; I'll get to your script today or die trying.
:-)
[14:39:06] and the hashbang using env which worked on the toolserver didn't work on labs, but it works without env
[14:39:11] :D
[14:39:28] Coren: Im going to work on fixing up a few other bits then :)
[14:40:09] ahh Coren one quick question, should my cron line look like the below to fix the issue I was having before?
[14:40:48] 0 9 * * * cd /data/project/addbot/bot/enwiki/ && jsub -mem 1Gusertalksubst.php ?? I'm not entirely sure that would work
[14:41:12] i may just go and try it a few times
[14:41:45] Well, you need a space after '1G', and you want to add -cwd so that it doesn't change the default to $HOME
[14:42:06] You might also want to add '-once' for safety; it won't start it again if it's already on the grid with -once
[14:42:48] 0 9 * * * cd /data/project/addbot/bot/enwiki/ && jsub -mem 1G -cwd -once usertalksubst.php
[14:43:35] I wonder if I should change the WD to /data/project/ and work all of the scripts from there changing their references
[14:43:50] I definitely need to write a nice bash script for all of this
[14:44:41] addshore: That's the best thing; have a script that does the jsub cleanly, etc, and call /that/ from cron
[14:45:36] another question Coren (or whoever else may know): in .htaccess, is 'AuthType Basic' with 'AuthBasicProvider file' supported?
[14:46:07] JohannesK_WMDE: It should be. If it doesn't work, it's a bug. :-)
[14:47:04] it doesn't work for me yet Coren, i get a 500.
[14:47:30] JohannesK_WMDE: Lemme see.
[14:48:03] JohannesK_WMDE: Remove the Option ExecCGI
[14:48:48] ok, thought it was not necessary but wouldn't hurt
[14:48:57] leftover from TS
[14:49:11] cool, works now
[14:55:48] didn't touch enough wood.
[14:56:08] Coren: Got another "Lost connection to MySQL server during query" error, probably shortly after 01:22:15 UTC this morning (certainly earlier than 03:39:23 UTC).
[14:56:46] browser asks for a password now, but then i get a 500 error, Coren.
i tried a password file generated with htpasswd -p, and one with a plaintext password for testing
[14:56:51] any ideas?
[14:57:50] JohannesK_WMDE: Grrr. I really need to upgrade to Apache 2.4 so endusers get error logs.
[14:57:56] JohannesK_WMDE: I go check.
[14:58:07] (13)Permission denied: Could not open password file: /data/project/render-tests/limes-upload-htpasswd
[14:58:13] yeah, error logs would be most helpful Coren :)
[14:58:42] hmmm, who is trying to open it? which user? o.O
[14:58:55] -rw-rw---- 1 local-render-tests local-render-tests 20 May 10 14:49 /data/project/render-tests/limes-upload-htpasswd
[14:59:01] Oh! How stupid! At the time Apache checks for the password, it hasn't yet switched to the tool account!
[14:59:22] oh.
[14:59:38] Ooo. This isn't going to be easy to get right. I need to make something to give ownership to www-data
[14:59:45] (group is okay)
[15:00:07] I'll do it manually for now, JohannesK_WMDE, but I'll file a bugzilla for a better fix
[15:00:21] Try it now?
[15:00:32] works :)
[15:00:55] Yeah, that'll need a fix. It's still secure this way because the tool group is limited to the maintainers.
[15:01:32] so what exactly did you do?
[15:01:45] -rw-rw---- 1 www-data local-render-tests 20 May 10 14:49 /data/project/render-tests/limes-upload-htpasswd
[15:01:52] ah, OK
[15:02:07] that's ok
[15:02:29] Yeah, it works, but it needs root to do, which sucks.
[15:02:56] Coren need a help with something... :P
[15:02:58] like addshore
[15:02:59] :D
[15:03:00] and btw: installing the htpasswd tool would be nice so people can generate those files themselves. should be in the apache2-utils package
[15:03:12] I mean, Coren, do you need a help with something? :D
[15:03:24] good, thanks again for now Coren. :)
[15:04:02] all day this channel is dead and within few minutes it became full of text
[15:04:39] petan: Dead is good, it means no problems.
:-)
[15:04:55] heh
[15:05:57] !log nagios petrb: restarting nagios bot
[15:05:59] Logged the message, Master
[15:06:46] petan: Actually, there is something you /could/ do. I think there is an apache2.4 deb in experimental Debian; I'd like you to check how difficult a backport would be?
[15:06:53] ok
[15:07:00] I actually run experimental debian on my PC :P
[15:08:23] !log tools create tools-webserver-02 for Apache 2.4 experimentation
[15:08:25] Logged the message, Master
[15:08:50] We'll keep -01 at 2.2 for the foreseeable future to support tools that break under 2.4
[15:09:14] Coren: 2.4.4-2 amd64?
[15:09:48] petan: Sounds right. The trick will be to backport it to precise.
[15:09:51] well, that requires some dependencies mhm
[15:10:04] it would be most simple to setup some testing server to try it
[15:10:16] [11:08:23] !log tools create tools-webserver-02 for Apache 2.4 experimentation
[15:10:20] aha
[15:10:26] so, can I break it? :)
[15:10:31] Yep.
[15:10:32] er. install it
[15:10:33] :D
[15:10:43] breaking is a side effect
[15:10:44] :P
[15:11:26] Coren what about creating some shared folder for everyone on nfs.. like /data/project/shared etc
[15:11:57] I might have stuff that I would like to share with others, like these packages for apache, but I don't know if my home is a best place :P
[15:12:31] or /data/project/.shared which is somewhat... hidden :P but won't collide with tool names
[15:12:48] _public would work
[15:12:57] I would myself go for /data/project/.shared and /shared symlink to it :P
[15:13:02] or _public
[15:13:12] .shared is even nicer.
[15:13:21] aaaaaand...
next thing Coren & all: having trouble creating sql databases:
[15:13:24] mysql> create database test123;
[15:13:24] ERROR 1044 (42000): Access denied for user 'rendertests'@'%' to database 'test123'
[15:13:32] symlink needs to be in puppet which I don't know how to make
[15:14:08] JohannesK_WMDE afaik it's not possible to create more databases atm using tool account :(
[15:14:14] ah, oh
[15:14:19] but you can probably request them? Coren?
[15:14:26] not sure about the rules here
[15:14:26] petan: He can indeed. :-0
[15:14:48] petan: There aren't hard rules yet because that's not the "real" DB. We'll get rules on the real DB next week.
[15:14:56] ok
[15:14:57] petan: drwxrwsr-x 2 root project-tools 6 May 10 15:13 /data/project/.shared/
[15:15:08] JohannesK_WMDE you want me to create it?
[15:15:10] well ok
[15:15:16] Coren what about chmod 1777 ;)
[15:15:29] Coren, can you create database u_jkroll_erdbeer_p; please ;)
[15:15:30] No need, everyone is part of project-tools
[15:15:34] ah ok
[15:16:35] LOL the new prompt is...... so compact :P
[15:16:41] petrb@:~$
[15:16:43] :D
[15:16:43] JohannesK_WMDE: We can create it with this name now, but in the future database names will need to be prefixed with your tool's name: rendertest_dbname
[15:17:05] well, ok... i see there is already a db named 'rendertests'
[15:17:09] i can use that Coren
[15:17:13] @notify Ryan_Lane
[15:17:13] I will notify you, when I see Ryan_Lane around here
[15:17:24] JohannesK_WMDE: Yeah, that's your default db. :-)
[15:17:26] I am wondering if Ryan did it purposefully or just the new images is borked
[15:17:35] I kind of like to see hostname in prompt
[15:17:36] that must have been created with the account, but i didn't know. OK
[15:18:59] petan: The instance isn't quite ready yet. Hang on. :-)
[15:19:04] aha ok
[15:19:29] petan: Initial puppet run is still in progress.
:_)
[15:19:58] that's why things are changing in front of me
[15:22:05] petan: And it's going to reboot at least once to switch to NFS
[15:27:27] hmm, anyone know if I can replace a tiny part of %{REQUEST_URI} when using htaccess rewrites?
[15:27:54] addshore: mod_rewrite is enabled on the webservers, so yeah.
[15:28:28] i.e. change /addbot/fun/text.php to /addshore/fun/text.php ? and Coren my point was Im not entirely sure how to make such a change ;p
[15:29:15] addshore I did this for wm-bot but I think it was using some redirect instead of mod_rewrite
[15:29:24] For all of /addbot or just fun/text.php?
[15:29:32] people from apache told me it's better
[15:29:32] all of /addbot
[15:30:15] Ah, yes, then you want just a redirect
[15:31:19] mhhm
[15:31:41] Redirect permanent /addbot /addshore
[15:31:58] In your .htaccess
[15:37:41] addshore: [Fri May 10 15:36:44 2013] [alert] [client 10.4.1.89] /data/project/addbot/public_html/.htaccess: RewriteCond: unknown flag 'AND'
[15:37:51] yep ignore that ;p
[15:38:10] i changed from or to and instead of just removing it ;p
[15:39:59] petan: -webserver-02 will be ready once it's done rebooting.
[15:41:12] ok
[15:41:22] Coren: Is the source for your index pages (?list, ?status & Co.) available somewhere?
[15:41:37] (Mostly interested in ?status.)
[15:44:35] Coren was apache2.2-common installed by puppet?
[15:44:44] Coren because I need to remove it
[15:45:02] if you applied some puppet class on it, we may need to disable it for a while
[15:45:51] petan: Can't you puppetmaster::self on that instance?
[15:49:16] petan: Listen to scfc_de, he speaks sooth. :-)
[15:50:13] JackPotte: /data/project/xtools/public_html/.htaccess: Option ExecCGI not allowed here
[15:50:34] Coren: you're watching the error.log?
[15:51:20] JackPotte: For a different reason, but I thought you'd like to know. :-)
[15:51:36] Coren: thanks a lot
[15:52:07] JackPotte: Also, last night, I changed your crontab for jackbot. Please go read the comments I put in it.
[15:52:22] now
[15:52:39] Whenever. I was going to email you about it today.
[15:53:04] So, no rush. :-)
[15:53:13] jsub makes me think of the qsub I used on the Toolserver
[15:59:25] It's a slightly friendlier equivalent for the TL; you can also use qsub directly if you want to, but most will prefer the defaults from jsub. You can 'man jsub'
[16:01:20] Coren: Source for http://tools.wmflabs.org/?status?
[16:02:01] scfc_de: / is in /data/project/.system/public_html
[16:02:41] scfc_de: You want /data/project/.system/public_html/content/status.php (invoked from index.php)
[16:04:08] Coren: Thanks. I'm considering porting it to the Toolserver, the design is nice. Are the web pages in a repo somewhere for change proposals?
[16:04:34] scfc_de: That's on my Copious Free Time TODO List™
[16:04:36] :-)
[16:05:49] Coren: Okay :-). You might also want to add to that: "Insert licence notes in every file." :-)
[16:05:59] (a.k.a.: after Hong Kong, probably)
[16:15:55] YAY, all of my files on tools and bots now update from my repo properly :D
[16:16:01] and automagically!
[16:20:49] It's automagicalicious!
[16:25:20] Coren: any chance you could put your bash script for submitting to the grid somewhere I can see? :)
[16:43:27] addshore: I don't have one, I just mentioned that this is how most people go about it.
[16:44:07] addshore: By the way, you have a redirect loop.
[16:44:15] i fixed it ;p
[16:44:39] I still get an Error 310 (net::ERR_TOO_MANY_REDIRECTS): There were too many redirects.
[16:45:00] where from?
[16:45:09] http://tools.wmflabs.org/addshore/
[16:45:18] ill check in 1 second
[16:48:07] hmm, indeed :P
[16:54:47] AHHH
[16:55:01] addshore: (36)File name too long: access to /addshore/tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01tools-webserver-01 failed
[16:55:04] :-)
[16:55:07] I am expecting %{HTTP_HOST} to be tools.wmflabs.org but it is tools-webserver-01
[16:55:47] hmmm
[16:56:16] tools.wmflabs.org is the proxy in front
[16:57:04] addshore: %{SERVER_NAME} is what you want.
[16:58:29] fixed ;p
[17:05:25] interesting, docs say php5.3 has __DIR__, we are on 5.3.1 and it doesnt seem to work
[17:06:58] hmm, but it does work :P
[17:07:29] It only seems not to? :-)
[17:07:43] something like that ;p
[17:08:02] oh no, its another small error somewhere ;p
[17:14:03] Coren: Interestingly, on Toolserver the XML for JB_hard_resource_list has another level (and of course h_vmem is named virtual_free), but otherwise it worked almost instantly: . I'll remove any mentioning of Labs, of course.
[17:14:51] scfc_de: Neat. Might want to switch the labs' logo to the toolserver's though. :-)
[17:15:16] Coren: *Anything* mentioning Labs :-).
[17:15:36] That too. Otherwise you'll give people heart attacks. :-)
[17:16:30] scfc_de: Since you have multi-architecture, you might want to add $h['arch'] (already there) to the hostline.
[17:16:39] Well, that would increase the conversion rate to Labs in an effective way :-). (Might even make a nice MediaWiki extension, but the wiki server isn't puppetized on Toolserver.)
[17:16:43] Alongside Load: and Memory:
[17:17:31] Heart attacks as a mediawiki extension? :-)
[17:19:27] Hmmm. Random redirects to goatse.cx? :-)
[17:28:45] Ah, TS uses virtual_free so there is no hard limit. I see many jobs over their allocation.
:-)
[17:29:01] 479M/101M
[17:29:06] tsk, tsk. :-)
[17:31:53] Not to make petan's point, but memory wasn't an issue till now :-).
[17:32:04] huh?
[17:32:45] hmm
[17:33:00] I don't see anything wrong on that as long as there is a lot of free /real/ memory
[17:33:32] I think, on Toolserver the slots are the limiting factor.
[17:33:37] I know all jobs could suddenly use 100% of what they asked for, but that is not going to happen. ever.
[17:33:55] Coren, if you want to do another user rights request on mediawikiwiki please start a new page and list it on Project:Requests
[17:34:02] Rather than appending to an old request
[17:34:35] Krenair: I was about to transclude it once I figured out whether it was kosher to prepend or not, I was looking at past practice as we speak. :-)
[17:35:00] Yeah we just add stuff like (2) to the title
[17:36:01] What sort of changes did you want to do which need sysop?
[17:37:10] Krenair: Delete some WIP pages that are no longer useful; I could find the RfD-equivalent, but I think I'm trustworthy with +sysop. :-)
[17:37:38] Plus I can give a hand since I now hang around mw.org a lot.
[17:39:07] I agree, {{done}}
[17:39:59] is jsub broken? O_o
[17:40:16] addshore: Not as far as I know. What occurs when you try?
[17:40:19] oh.. its not an executable file ;p
[17:41:55] Alchimista: FollowSymLinks is not allowed; SymLinksIfOwnerMatch is enabled, however.
[17:42:49] thanks Coren :D
[17:50:13] Coren: what is the default -m? :)
[17:50:30] (for jsub)
[17:51:09] 256m. (Well, 256 * 1000000 bytes -- gridengine doesn't know how to count bytes)
[17:52:08] 512 seems to be enough to start php :)
[17:52:37] Yeah, PHP is a little bit of a hog to start, but at least it doesn't keep growing like Python does.
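For reference, the arithmetic behind the "256m" answer above can be sketched in Python. This is only an illustration of the numbers quoted in the log: the function and constant names are made up for the example and are not part of jsub or gridengine. It contrasts the decimal megabyte described above (256 * 1000000 bytes) with what 256 binary mebibytes would be.

```python
# Illustrative only: the names below are hypothetical, not jsub or gridengine API.

DECIMAL_MEGA = 1_000_000   # multiplier quoted in the log for the "256m" default
BINARY_MEBI = 1024 * 1024  # conventional MiB, shown purely for contrast


def decimal_megabytes_to_bytes(megabytes: int) -> int:
    """Return the byte count for a decimal-megabyte value such as 256."""
    return megabytes * DECIMAL_MEGA


default_limit = decimal_megabytes_to_bytes(256)
shortfall = 256 * BINARY_MEBI - default_limit

print(default_limit)  # 256000000
print(shortfall)      # 12435456 bytes less than 256 MiB
```

The gap is roughly 12 MB, which may matter for a job that sits right at the edge of its allocation.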
[17:53:19] heh :)
[17:53:34] I also modified all of my scripts so cwd no longer matters making running them easier
[17:53:54] Coren apache 2.4 is up
[17:53:56] addshore: That's usually simpler, especially since gridengine set the cwd to home by default
[17:54:09] petan: Ooo! That fast? Nice.
[17:54:26] petrb@tools-webserver-02:/data/project/.shared/webserver$ dpkg -l | grep apache
[17:54:27] ii apache2 2.4.4-2 Apache HTTP Server
[17:54:38] is that the version you wanted? :P
[17:54:54] all dependencies in .share
[17:55:05] Yeays! /me dances.
[17:56:06] I'll configure later today so we can start having proper per-user error logging.
[17:56:17] ok
[17:56:32] I think we should turn off 01 and switch to 02 so that we can always switch it back
[17:58:00] python is using GC?
[17:58:13] I hate these GC languages, though I am using c# myself
[17:58:15] :P
[18:00:21] Silly IRC client
[18:00:39] petan: Sysadmin fun tip: Look at cat /data/project/.system/webservers and guess what it's for. :-)
[18:00:53] :o
[18:07:18] btw Coren I had to disable that puppet class because it keep removing this version
[18:08:44] petan: That's expected.
[18:09:07] addshore: Now the Help page lists 256 MByte three (!) times, that should work :-).
[18:09:16] ;p
[18:10:00] * addshore loves labs ;p
[18:11:18] * petan imagines addshore in a bad with a cute lab
[18:11:25] * bed
[18:11:26] lol
[18:16:17] O_o :P
[18:18:16] so with continuous jobs, I should only ever have to run them once..?
[18:18:24] no cron to make sure they are still there? they should just always be there?
[18:19:04] addshore: That's the whole point. It's actually been proven to work even through complete grid shutdown and restart, and even through switching the entire filesystem from under them. :-)
[18:19:27] ecmabot-wm has been running since 2013-03-27 01:39:02 :-)
[18:19:28] lovely :) and thats jstart with the same params as jsub?
[18:19:51] addshore: Basically.
jstart is just shorthand for jsub -once -continuous
[18:19:59] ahh okay :)
[18:20:56] If you want to be super-duper-extra paranoid, a daily jstart is a last-line defense against something going really wrong (like an admin qdel your job by accident)
[18:22:19] Otherwise, the job will run until (a) someone qdel it (including jstop), (b) it exits with a return status of 0.
[18:22:57] heh, or me qdeling it by accident ;p
[18:23:08] (a) someone qdel it
[18:23:10] :-)
[18:24:34] the gridengine might move it around from server to server if needed, though, so it needs to be able to survive being killed and restarted. If you want to be extra nice about it, gridengine will even send you a SIGUSR1 in advance in case you want to checkpoint and exist yourself.
[18:25:02] exiit*
[18:25:05] exit**
[18:26:34] Coren: For your Copious Free Time TODO List™: "jsub -continuous" fails for arguments to the bot with a "'", I believe.
[18:27:09] scfc_de: ... it shouldn't; that's an actual bug.
[18:27:11] hehe
[18:27:37] scfc_de: File a bz with example, pretty please, so it doesn't fall between the cracks?
[18:29:41] Coren: Wilco. At Toolserver, we had, well, have a similar bug since August 2012 (https://jira.toolserver.org/browse/TS-1479), but not causing much stir as probably most users will just have one "bot_main.sh" script or similar.
[18:31:10] addshore: Not allowed to use .. in require/include with safe mode PHP. You need to fiddle with the include_path.
[18:31:54] Coren: it seems to run :P
[18:32:13] addshore: Yeah, that's not what was happening:
[18:32:22] require(/data/project/addshore/config/database.cfg): failed to open stream: Permission denied in /data/project/addshore/public_html/addbot/iwlinks/index.html on line 28
[18:32:45] hmm
[18:32:45] Coren: SIGUSR2, not SIGUSR1, isn't it?
[18:33:09] anomie: You've managed to throw doubt in my mind. Looking it up.
:-)
[18:33:26] * anomie uses that feature, and is pretty sure it's USR2
[18:34:48] Coren: is that the only file it is hitting an error on?
[18:35:16] anomie: Ah. It's a configurable, actually, that defaults (on Linux, at least) to SIGUSR1
[18:35:44] anomie: I'll switch to SIGUSR2 to avoid causing problems with code that worked on the TS
[18:35:49] Coren: Oh. USR1 is for stop, USR2 is for kill
[18:35:51] Coren: Really? "man qsub" says SIGUSR1 for SIGSTOPs, SIGUSR2 for SIGKILLs.
[18:36:07] qconf -sckpt continuous
[18:36:16] shows signal to be USR1
[18:36:45] Should really be USR2 to be analogous to a kill
[18:37:06] {{done}}
[18:37:09] SIGUSR2 now.
[18:38:59] addshore: You got a couple; will pm you
[18:39:20] Coren: i found the problem, its the old 0660 or 0661 permissions, 0661 fixed it
[18:39:49] i.e. I have changed the permissions and now >> http://tools.wmflabs.org/addshore/addbot/iwlinks/raw.html
[18:40:08] addshore: Ooo.
[18:46:54] HAH, whats up with ganglia reporting the memory of the exec nodes?
[18:47:06] -20G swap
[18:47:20] http://ganglia.wmflabs.org/latest/?r=week&cs=&ce=&c=tools&h=tools-exec-01&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS
[18:50:16] addshore: Hah. It's clearly confused. :-)
[18:50:43] What's with Icinga still being down? ;P
[18:58:02] [bz] (NEW - created by: Tim Landscheidt, priority: Unprioritized - trivial) [Bug 48334] "jsub -continuous" fails on arguments with "'"s - https://bugzilla.wikimedia.org/show_bug.cgi?id=48334
[18:58:25] Coren: Could you install libstring-shellquote-perl somewhere for a quick test?
[18:59:50] scfc_de: On -login now
[18:59:58] Coren: Thanks.
[19:00:00] Tell me if you need this grid-wide
[19:00:29] Now, it should only be needed at the jsub stage.
[19:01:49] Still need to deploy it via puppet if you use it beyond experimentation.
[19:02:43] I'm familiar with the concept :-).
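The quoting problem reported in Bug 48334 above (arguments containing "'" breaking when a command line is rebuilt as a string) can be illustrated with a short Python sketch. This is not the jsub fix, which was tested with Perl's String::ShellQuote; it uses Python's standard `shlex` module as an analogous tool, and the script name `mybot.sh` and its arguments are hypothetical.

```python
import shlex


def build_command_line(program, args):
    """Join a program and its arguments into one shell-safe string."""
    return " ".join(shlex.quote(part) for part in [program, *args])


# Hypothetical bot invocation with an apostrophe in one argument.
args = ["--summary", "it's a test", "plain"]
command_line = build_command_line("mybot.sh", args)
print(command_line)

# Round trip: splitting the quoted string the way a POSIX shell would
# gives back the original words, apostrophe included.
assert shlex.split(command_line) == ["mybot.sh", *args]
```

Naively wrapping each argument in single quotes would end the quoted string at the apostrophe; quoting each word through a dedicated helper avoids that.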
[19:05:58] :/
[19:14:58] (Just wondered why String::ShellQuote is very gracious towards !s, till I figured out that the latter has only special meaning in interactive shells.)
[19:19:50] Coren when are you going to switch the apache?
[19:20:08] petan: In a meeting atm. I'll look into it later.
[19:20:14] (Probably today)
[19:24:03] scfc_de, Coren, addshore if u had idea you can update https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Tips
[19:34:39] New patchset: Tim Landscheidt; "Fix "jsub -continuous" quoting issues." [labs/toollabs] (master) - https://gerrit.wikimedia.org/r/63165
[19:35:46] New review: coren; "Is teh better." [labs/toollabs] (master); V: 2 C: 2; - https://gerrit.wikimedia.org/r/63165
[19:35:47] Change merged: coren; [labs/toollabs] (master) - https://gerrit.wikimedia.org/r/63165
[19:39:13] scfc_de: merged.
[19:40:12] Well, too late, I just wanted to add a space after "Bug:" :-).
[19:43:23] Coren: Which manifests list packages grid-wide?
[19:45:50] scfc_de: modules/toollabs/manifests/dev_environ.pp or modules/toollabs/manifests/exec_environ.pp
[19:46:01] (in operations/puppet)
[19:46:26] dev_environ is for bastions and such; you normally want exec_environ
[19:46:50] And exec_environ is used on -login?
[19:47:37] scfc_de: Also -dev
[19:48:00] exec_environ everywhere user code can/should run
[19:48:14] Okay, i'll submit against exec_environ.
[19:49:16] dev_environ is for, like, compiler tool chains, dev tools, etc that you want to be able to use on -login and -dev but don't need to be put on exec nodes. Most -dev packages for libraries (include files, etc) should go there.
[19:50:30] k
[20:06:27] New patchset: Tim Landscheidt; "Be more liberal with invocations of jsub & Co."
[labs/toollabs] (master) - https://gerrit.wikimedia.org/r/63212
[20:55:03] hmm, i wonder if it would be able to make a script that acted like cron for me, enabling me to only have 1 job on grid for many scheduled tasks
[20:57:47] addshore: It would, and you would have to spend the amount of effort that the authors of cron have already invested for you (for free!) :-). So I'd use cron for scheduling.
[20:58:10] mhhm, but then I would spawn a job every 30 mins :O
[20:58:15] it would be so tidy not to ;p
[20:58:38] but I dont see much point in making it a cont job that sleep for 29mins and 30 seconds ;p
[20:59:32] oh well :P
[20:59:40] Spawn a job from cron or your cron replacement?
[20:59:53] from cron :P
[21:00:03] just means job ids would go up slightly more quickly :P
[21:00:06] and no one likes that!
[21:04:01] Anyone have any suggestions for a good place on Labs to put some jsduck documentation?
[21:04:03] Job IDs = SGE job ids? No, if the job already exists under the same name, no new job ID is allocated. But process IDs are used, of course, but if someone is concerned about that, they probably shouldn't be working with computers :-).
[21:04:13] It needs to be publicly accessible.
[21:04:19] superm401: What's jsduck?
[21:04:32] Basically, a web page with documentation.
[21:04:52] It's static content
[21:05:45] If you have a tool in Tools, you could of course put it there.
[21:06:07] It's for a MW extension, so it's not really a tool.
[21:06:30] See https://www.mediawiki.org/wiki/Extension:GuidedTour
[21:08:50] Do you have an example of such a documentation page?
[21:09:00] https://doc.wikimedia.org/mediawiki-core/master/js/ is for core.
[21:13:20] Alright, I'm just going to use a separate documentation instance for now.
[21:13:28] I'm getting "Failed to create instance." though?
[21:13:57] Do I need to request permission to create a new instance under the editor-engagement project?
[21:14:01] I'm already a projectadmin.
[21:17:04] superm401: Sorry, don't know.
andrewbogott?
[21:17:26] superm401: If you're project admin then you're free to create instances.
[21:17:39] Projects have instance quotas to keep you from getting too crazy :)
[21:18:01] andrewbogott, how do I check the quota?
[21:18:22] It isn't easy to check at the moment… it requires me to run a command-line query.
[21:18:32] In the future we hope to have it displayed on a page.
[21:18:35] Maybe it's the checkboxes on the add instance screen.
[21:18:45] But I think the quota is something like 8 per project, so you should have some headroom.
[21:18:46] superm401: Re documentation, you might want to speak with krinkle and hashar if extensions can be added to doc.wikimedia.org.
[21:19:01] scfc_de, already filed a bug. Will make sure they're CCed.
[21:19:53] andrewbogott, we're already over that, which would explain it.
[21:20:04] ok, let me check...
[21:20:06] But this project is really shared between (at least two) teams, E2 and E3.
[21:21:10] Looks like the quota is 10.
[21:21:37] And that's what we have.
[21:21:44] Can we get an 11th?
[21:21:48] I raised it to 12 -- but 10 is already a ton; probably you should see if some cleanup or consolidation can happen.
[21:22:25] Alright, fair enough. Some are for E2, but I'll email them.
[21:23:39] Thank you for raising the quota. I'll see if we can get rid of a couple.
[21:24:36] np
[21:27:55] andrewbogott, is it possible to rename an instance?
[21:28:58] superm401: I'm not sure. I see a command to do it but I don't trust it not to scramble things.
[21:29:07] Ryan_Lane might have a better idea
[21:29:49] andrewbogott, okay, having multiple host names on a box should be fine, though, right?
[21:30:20] You mean assigning multiple DNS names to one instance? That's fine.
[21:30:37] Right.
[21:30:49] I'm going to try doing that to avoid making another instance.
[21:56:42] *sigh toolserver
[21:58:04] Cyberpower678: Wrong channel :-).
[22:03:07] scfc_de, no it wasn't.
[22:03:30] I no longer want to run my stuff on toolserver.
[22:08:12] Cyberpower678 I no longer do
[22:08:31] Cyberpower678: Well, lucky you if you don't need database replication :-). I don't *want* to either.
[22:09:44] I do need replication.
[22:09:59] How did you manage that then on Tools?
[22:10:43] I can't get toolserver's damn phpMyAdmin to work. It's 504ing and 502ing at the same time somehow.
[22:11:44] New patchset: Andrew Bogott; "Added labs user/password for RT" [labs/private] (master) - https://gerrit.wikimedia.org/r/63234
[22:13:33] Change merged: Andrew Bogott; [labs/private] (master) - https://gerrit.wikimedia.org/r/63234
[22:25:12] oh balls, qstat doesnt work on exec nodes does it? xD
[22:26:39] addshore: No, it doesn't. Exec nodes aren't submit hosts.
[22:27:02] I was trying to use qstat to count the job number ;p
[22:29:42] Coren: So jobs can't schedule other jobs?
[22:31:25] jobs can be scheduled on -dev, -login, and the webservers
[22:31:48] scfc_de: It's an atrocious model even when it works at all, and is not really supported by gridengine. The "canonical" method of doing it is to have a dispatcher running on a submit host, or dependent jobs (you can queue a job "start when x is over")
[22:32:34] scfc_de: If you really need that functionality, I would recommend a work queue that is read by a cron job, for instance.
[22:32:38] Coren: I don't have a specific use case in mind, just wanted to make a mental note :-).
[22:44:43] hehe http://tools.wmflabs.org/addshore/toolslab/
[22:51:46] addshore: Care to turn that into something ganglia-ic? :-)
[22:52:06] If I know how to do such things with ganglia then sure ;p
[22:52:11] *knew
[22:52:50] (Do we actually use Ganglia on Labs?)
[22:53:04] ganglia.wmflabs.org/latest/?c=tools
[23:06:35] addshore: http://comments.gmane.org/gmane.comp.monitoring.ganglia.general/1920 has some code, and the "official" contrib repo at https://github.com/ganglia/gmetric/tree/master/hpc/sge_jobs some more.
As this probably requires admin action on ganglia.wmflabs.org, and Ryan isn't around, let's wait till next week.
[23:18:03] * addshore adds it to his sort of todo list ;p
[23:20:11] [bz] (NEW - created by: Tim Landscheidt, priority: Unprioritized - normal) [Bug 48338] Show grid status in Ganglia - https://bugzilla.wikimedia.org/show_bug.cgi?id=48338
[23:21:36] addshore: ^
[23:22:13] clever you ;p
[23:22:47] i feel like I have had a rather productive evening with all my scripts and such
[23:24:27] did you see my last message? :P
[23:25:49] addshore: I did, and I saw the work that you did today :-).
[23:26:10] heh, i hope not, it was all ugly ;p