[00:10:06] is anyone else getting $ ssh wmflabs ssh_exchange_identification: Connection closed by remote host? [00:10:29] oh nvm, i see the topic, labs down for maintenance [00:18:11] Coren: Wow, this is taking a long time. [00:19:40] anomie|away: Yeah. :-( [00:20:14] At least we finally end on new NFS hardware. [00:20:59] Is the new hardware "hopefully no more problems", or "definitely no more problems"? [00:23:33] Krinkle: you were looking for me yesterday? [00:27:27] YuviPanda: Yeah, I was looking for someone who can tell me where the php.ini is for tool labs web servers. [00:27:46] I couldn't figure out where it was in puppet. [00:29:37] anomie|away: "hopefully" [00:29:56] anomie|away: we'll know soonish I'd imagine [00:30:09] because we currently have a very silly configuration (as php 5.4 does by default) that makes it impossible (or at least hard) to output gzip pages (e.g. in mediawiki or some other php script) [00:30:15] Krinkle: ah [00:30:33] Krinkle: in toollabs? I thought we had 5.3? [00:30:41] yeah, 5.3 too [00:31:03] is tool labs down? [00:31:08] http://tools.wmflabs.org/ [00:31:09] Krinkle: nfs [00:31:11] replacement [00:31:14] see /topic [00:31:15] ERR_CONNECTION_REFUSED [00:31:21] ah [00:31:23] k [00:31:34] YuviPanda: - https://bugzilla.wikimedia.org/show_bug.cgi?id=55547 [00:32:03] Krinkle: i'm not sure if it is puppetized at all, I bet it's just the default [00:33:03] k, we'll need to either replace the default or add something in conf.d [00:33:20] Krinkle: yeah, should be able to do that easily. [00:33:25] can play around when toollabs is back up [00:33:42] I have no idea where it should go. I know where it could go but not where it should. [00:33:54] Coren: ^ [00:34:11] Krinkle: have you played around with SPDY at all? [00:34:36] YuviPanda: ? [00:34:54] Coren: Krinkle is trying to figure out where in puppet to put php.ini confs for toollabs [00:36:25] It's not puppetized yet, AFAIK. 
[00:36:31] yeah, what i thought [00:36:56] Coren: I suppose we can make it a conf.d and put it in modules/toollabs/files and add the appropriate rule in modules/toollabs/manifests/webserver.pp [00:36:59] Krinkle: ^ [00:37:16] Sounds about right to me. [00:37:33] for a second i read "trolllabs" [01:08:51] Coren: things up yet? [01:09:17] No. I despair, but I shan't give up. Then I'm going to beat up people who have 20G logs. [01:10:45] Coren: my whole / is less than 1g [01:10:59] That wasn't directed at you. :-) [01:11:08] (Nor at anyone specific, really) [01:11:26] Coren: whomever has 20g log files has something broken [01:11:35] Clearly. [01:11:57] log files shouldnt often exceed 500mb [01:13:51] we should build a central logging system (with logstash) [01:14:04] but then everyone'll bitch and moan about having to modify their code, so why bother? :) [01:14:23] Coren: or they should be stored in /temp and subject to random deletions :P [01:22:07] Hi. [01:22:21] you're here [01:22:23] LABS DOWN FOR PLANNED MAINTENANCE [01:22:25] Yes, I'm here. [01:22:27] Hello. [01:22:28] I'm here. [01:22:33] I checked the mailing list. [01:22:41] Coren: Is there an outage list or something? [01:22:45] You can't make me subscribe to labs-l. [01:22:52] But I might subscribe to actually important shit. [01:23:05] Elsie: hire a secretary [01:23:08] I checked the archives earlier and it said NFS for like 30 minutes to an hour I think. [01:23:11] YuviPanda: I have Coren. [01:23:12] (o; [01:23:30] I was on the server when it apparently went away. [01:23:33] Not a clean break. [01:23:34] Elsie: The advantage of being subscribed is that you'd also have gotten the regular updates. :-) [01:23:35] Coren said there was some unexpected data transfers that are slowing the patching down. [01:23:53] Coren: I get about a 100 wiki e-mails a day as it is. :-( [01:24:03] I'm apparently now on design, in addition to wikimedia-l and wikitech-l. [01:24:07] Have some empathy! 
[01:24:17] * YuviPanda unsubscribes Elsie from wikimedia-l [01:24:20] that should help [01:24:23] And the two Toolserver lists, of course. [01:24:32] Plus nagging e-mails from the Toolserver. [01:24:51] "Your code sucks. Sincerely, ts-nag-bot" [01:24:55] Every day. [01:26:24] Coren: What about a status site? :-) [01:26:32] http://status.wmflabs.org/ [01:27:10] http://lists.wikimedia.org/pipermail/labs-l/2013-October/001750.html [01:27:21] I wonder if I have gigs of logs. [01:27:58] http://lists.wikimedia.org/pipermail/labs-l/2013-October/date.html#1750 [01:28:02] Oh, I already pasted a link. [01:30:13] See, I planned ahead. I already did a rsync yesterday, thinking that copying "just the touched files" since last night would be swift. [01:30:55] * Coren stares malevolently at nearly 300G of logs, all told, not counting git pulls touching hundreds of thousands of files. [01:31:01] And dumps. [01:31:26] * Elsie takes a dump on the channel floor. [01:31:33] Elsie: You're not helping. [01:31:36] Coren, copy copy copy faster [01:31:38] :P [01:32:37] And the copy is already annoyingly slow because it has to be over the network to avoid triggering the controller stalls. [01:33:06] percentage? [01:33:37] MRX: Not knowable. rsync copies in arbitrary order; but I think I've seen most tool names scroll by by now. [01:33:51] It's currently stuck on a particularily... beefy one. [01:34:00] heh [01:34:52] Not sure what Cyberbot's datefix /does/, but it produced 16G of logs. [01:35:26] rfx-report is "more reasonable" at only 8G of logs. [01:35:44] * Coren makes a note to have a chat with Cyberbot. [01:36:18] haha [01:36:52] The sad thing, of course, is that much of the stuff that takes so long could very well have been nuked. [01:37:45] But like I said earlier, the version of rsync packaged for Precise doesn't support the --ignore-crap option. :-) [01:38:29] Good thing too, I might have an account after it's done. 
[01:38:33] might not* [01:48:48] Coren: there was a reason I had a quote per user :) [01:49:11] quite a few users were writing out 10G log files because they never truncated them [01:49:15] in the bots project [01:49:18] Ryan_Lane: I have a LART. [01:49:23] LART? [01:49:30] Look it up. :-) [01:49:49] ah, I never use the term luser [01:50:19] It doesn't need to go with luser specifically; it's a BOFH thing. :-) [01:50:51] * Ryan_Lane nods [01:51:49] At one of my past jobs, I had two LARTS on my desk in my office: LART-1, a long stretch of packing foam (think Nerf); and LART-2, a crowbar. Whenever someone was at my desk asking me something silly or because they goofed, I'd make a show of hesitating over which to grab. (I always wacked with LART-1) [01:53:44] The only time I "used" LART-2 was when one of the dev managed to bring down the build box with a fork bomb. I was already hearing them tease the poor dude who had goofed, so I stamped into the dev room with a grimace brandishing LART-2. [01:54:04] It took them about 2 hours to recover from the severe laughter. :-) [01:56:02] (Oh, yeah, I burst in the dev room bellowing "Who was it?" in my best impression of an insane lumberjack on a rampage) [01:56:23] I read that as 'insurance salesman on a rampage' [01:56:28] i've no idea why [01:56:41] * YuviPanda imagines Coren as an Insurance salesman [01:56:46] Are they known for attacking devs with crowbars? :-) [01:57:13] printers maybe [01:57:44] * YuviPanda imagines Coren doing an 'Office' style hardware bashing, but with labstore3 than a printer :) [01:58:03] heh. [01:59:25] * Coren misses LART-2 now. [01:59:27] -rw-rw---- 1 50404 local-50404 282G Oct 10 17:59 /mnt/srv/tools/project/cyberbot/CyberbotII/PCbot.out [01:59:33] Ima kill 'im! 
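The hunt for oversized logs discussed above can be sketched as a short script. This is a minimal sketch under assumptions, not what was actually run during the migration: the function name `find_large_files`, the 1 GiB default threshold and the `.out`/`.err`/`.log` suffixes are all illustrative.

```python
import os


def find_large_files(root, min_bytes=1 << 30, suffixes=(".out", ".err", ".log")):
    """Return (size, path) pairs for matching files of at least min_bytes, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are measured themselves, not followed.
                size = os.lstat(path).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_bytes:
                hits.append((size, path))
    return sorted(hits, reverse=True)
```

Calling `find_large_files("/data/project/sometool")` (a hypothetical path) would list that tool's job output files over 1 GiB, which is the kind of report a maintainer could check before a copy like this one.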
[01:59:53] 282G [01:59:56] :| [01:59:58] right [02:03:12] Coren, please don't delete PCbot.out, I've been reading it over the last few weeks and am on the 254th gig [02:03:18] rm $(find /srv/tools/project/cyberbot -name '*.out') [02:03:48] spagewmf: I have no choice; this file alone will take hours to copy. [02:04:43] * Coren feels like a sucker now. [02:04:43] oh nooes, it started to get really gripping around the 247th gig. Did Alice and Bob get back together? [02:04:50] * spagewmf trolling [02:05:13] spagewmf: Non, Eve screwed the relationship up anyways by stealing Alice's love letters. [02:05:32] In the end, it turned out that Alice was actually the US government [02:05:34] and Bob was Elsie [02:05:35] so [02:06:14] spagewmf: hey! you use a standing desk, right? [02:06:25] spagewmf: any resources on how to properly position oneself when using it, etc? [02:07:45] YuviPanda: http://www.amazon.com/Ergotron-WorkFit-C-Sit-Stand-Workstation-24-213-085/dp/B005K96QJU/ Now everyone's using the http://www.focaluprightfurniture.com/ [02:08:09] spagewmf: ah [02:08:18] spagewmf: mine is a stack of books on a dining table :| [02:10:36] YuviPanda: briefly, everyone is different. I like the top of monitor _above_eye level, high above the keyboard, so I like the Ergotron [02:13:29] spagewmf: hmm, right [02:18:58] I sit down. [02:35:40] Sooo tired. Be done already!!! 
[05:23:55] [bz] (NEW - created by: MZMcBride, priority: Unprioritized - normal) [Bug 55603] Impose per-user home directory quota in Tool Labs - https://bugzilla.wikimedia.org/show_bug.cgi?id=55603 [09:58:45] [bz] (NEW - created by: Tim Landscheidt, priority: Unprioritized - trivial) [Bug 48625] Provide namespace IDs and names in the databases similar to toolserver.namespace - https://bugzilla.wikimedia.org/show_bug.cgi?id=48625 [10:38:05] [bz] (NEW - created by: Matthew Flaschen, priority: Unprioritized - normal) [Bug 55612] mediawiki_singlenode can not run installer due to missing /usr/bin/php - https://bugzilla.wikimedia.org/show_bug.cgi?id=55612 [10:48:16] !log deployment-prep beta is back up :-] [10:48:17] Logged the message, Master [11:01:11] valhallasw: reviewing your change [11:01:14] valhallasw: and amending it :-] [11:01:21] ok :-) [11:01:31] it was almost correct [11:01:39] you just forgot to apply the new template to the project [11:02:27] and had to change the Git command to make sure it is not fetching submodules [11:03:22] submodules should be exempted anyway, as externals is not listed in the pyflakes command, right? [11:03:26] !jenkins pywikibot-core-pyflakes [11:03:26] https://integration.wikimedia.org/ci/job/pywikibot-core-pyflakes [11:03:28] nothing wrong with not cloning submodules, though [11:03:54] valhallasw: example build https://integration.wikimedia.org/ci/job/pywikibot-core-pyflakes/652/console [11:04:01] 11:03:41 + pyflakes pywikibot scripts tests [11:04:01] 11:03:42 pywikibot/__init__.py:15: 'logging' imported but unused [11:04:02] 11:03:42 pywikibot/__init__.py:18: 'sys' imported but unused [11:04:03] .. [11:04:08] looks good to me [11:04:14] :-) [11:05:06] what was the issue with pyflakes . already ? [11:05:37] https://integration.wikimedia.org/ci/job/pywikibot-core-pyflakes/651/console [11:05:40] that one has pyflakes .
[11:09:43] copy pasted question on https://gerrit.wikimedia.org/r/#/c/86729/ [11:13:21] hashar: it takes ez_setup into account, but it indeed already ignores externals [11:16:37] okkk [11:17:08] valhallasw: job is already updated, I am merging the change [11:23:45] petan, [11:29:33] labs corrupted my bot. :-( [11:35:30] Coren|Sleep, for when you wake up. PLEASE PLEASE shut down the spambot task first. It relies on so many files, that the maintenance almost sent it on a rampage to falsely tag 30000 articles on WIkipedia. It also corrupted the bot. I took 45 minutes to fix itself since the files where providing conflicting data, essentially the bot hung up duiring the recovery process. [11:37:52] Cyberpower678: jstop? [11:38:15] or qdel [11:38:25] I was asleep when this happened. Someone on Wikipedia disabled the runpage thankfully. [11:40:30] Spambot which has probably become very well known by content editors relies on several files during it's run. If it goes missing. The bot may go haywire. [11:54:21] Cyberpower678 [11:54:44] Will Coren|Sleep get that message I wrote above? [11:54:56] which one [11:55:12] The one right above valhallasw [11:55:21] A few lines up [11:55:23] this is all I received in notification window [11:55:25] (11:23:45) petan, [11:55:46] PLEASE PLEASE shut down the spambot task first. It relies on so many files, that the maintenance almost sent it on a rampage to falsely tag 30000 articles on WIkipedia. It also corrupted the bot. I took 45 minutes to fix itself since the files where providing conflicting data, essentially the bot hung up duiring the recovery process. [11:56:06] ok so you need to shutdown some task? [11:56:14] No. [11:56:34] Before ever doing maintenance again. Shut down that task first. [11:56:40] Should be easy to find. [11:56:50] why it happened? [11:56:55] you could also fix your bot [11:56:58] your task should be optimized for such events [11:57:02] giftpflanze +1 [11:57:45] The maintenance caused some files to get damaged. 
They were outdated which fed the bot wrong information. [11:57:53] There's no defense against that. [11:57:56] but I agree with Cyberpower678 that all tasks should be probably stopped before maintenance [11:58:06] I thought Coren|Sleep did tha [11:58:07] t [11:58:33] I found out because someone filed a bug report on my talk page. [11:58:58] This bot is optimized for crash recover, but is powerless in detecting faulty data. [11:59:25] It does have a runpage though. [11:59:41] So the user disabled my bot while I was asleep. [11:59:47] yes, this is fucked up maintenance I guess [11:59:57] he should have stop all tasks before messing up the fs [12:00:14] maybe just send him a mail [12:00:25] that is more reliable than "Coren: ping" [12:00:26] My bot's the winnner of large log files. :p [12:00:45] howcome [12:00:53] isn't that 990GB logfile guy a winner [12:01:24] Coren|Sleep, said that Cyberbot is generally the winner through it's high volume of large files. [12:02:13] 05:04 < wm-bot3> False [12:02:28] wonders why the bot said that without an obvious reason :) [12:02:50] !false is False [12:02:50] Key was added [12:02:54] !falser [12:02:58] !false [12:02:58] False [12:03:10] petan, wm-bot> False [12:03:30] what [12:03:36] ?? [12:03:41] oh wait mutante False [12:03:49] wtf is going on now [12:15:16] petan: Cyberpower678 tried to suggest a reason for mutante's observation of wm-bot saying 'false' [12:15:41] valhallasw: does it do that? :P [12:15:43] @rss-off [12:15:43] Rss feed has been disabled on channel [12:15:57] ^^^ [12:16:17] That is what causes "False" [12:18:58] petan: it's the rss "False" bug I mentionefd a while ago. [12:19:21] is it reported? [12:20:31] You can reset the rss-style to something usable, but when it false to retrieve, it sets the text to false instead of turning that feed off. [12:20:52] And I'm not sure if I put in a ticket or not. 
[12:20:54] oh, so it told me "couldnt read RSS feed" [12:20:55] gotcha [12:21:12] that makes sense, i use it on another channel to read various feeds [12:22:08] If you do @info you can see which feed and fix it or disable it correctly. [12:22:48] It'll be the one with false in the text column. [12:33:56] thanks [12:38:49] Cyberpower if a tools needs a clean shutdown before a mainetance operation that is advertized well in advance, you should at /least/ have told me (and ideally stopped it yourself) :-) [12:40:46] Coren|Sleep: there is a file that can't be changed owner of [12:41:10] petrb@tools-dev:/data/project/gabrielchihonglee-bot/public_html/w$ sudo chown local-gabrielchihonglee-bot fuckup [12:41:11] chown: changing ownership of `fuckup': Invalid argument [12:41:33] petan: Is that a new tool? [12:41:45] not sure, it's not mine [12:41:49] zhuyifei1999? [12:46:34] petan: It's just owned by him; he can use take. [12:46:59] Coren: he told me he couldn't? nor sudo chown worked [12:47:26] sudo chown is deprecated anyways [12:48:03] petan: Have you seen https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb ? [12:48:17] not yet [12:50:28] Coren: When I tried touching ~/lighttpd.RESTART just now, it dumped some errors about being unable to unlink the pid file. And I'm not sure if it actually restarted. [12:51:07] It should have. The bit about unlinking the pid file is a known issue that has no effect. [12:51:26] Coren: I'm just fixing Gabrielchihonglee's bot test wiki [12:51:26] if the file is gone, and hasn't been replaced by a .STOPPED then it's restarted. [12:52:03] he gave me the right to deal with them [12:52:33] zhuyifei1999: You can use take to chown the file; have you tried it? [12:52:47] anomie: What tool? [12:52:52] already [12:52:54] Coren: File is not gone. 'anomiebot' tool. [12:53:33] anomie: The file's existence is checked every 120s or so. Lemme check. 
[12:53:39] Coren: already done long ago [12:53:58] zhuyifei1999: Can you try it and tell me the output, please? [12:54:13] Invalid argument [12:54:48] just like petan's [12:54:55] Just "Invalid argument"? [12:55:27] more, but I forgot [12:55:46] I'd appreciate it if you could try it now and paste the result. [12:55:49] &very hard connect today [12:56:31] anomie: Odd; it's really not starting for you. Lemme try and figure out why. [12:57:40] zhuyifei1999@tools-dev:~$ cd /data/project/yifeibot [12:57:50] zhuyifei1999@tools-dev:/data/project/yifeibot$ touch testing [12:58:01] zhuyifei1999@tools-dev:/data/project/yifeibot$ become yifeibot [12:58:14] local-yifeibot@tools-dev:~$ take testing [12:58:22] testing: Invalid argument [12:58:44] Coren: ^^ [12:59:11] Odd. I'll look into it [13:03:19] anomie: Oh, I see why. There's a bug in the RESTART logic. [13:04:52] fwiw, wmbot read RSS feeds just fine yesterday, it's just today that it switched to all "False" [13:07:23] anomie: Fix't [13:07:34] Pro tip: qstat -u root -j httpd-anomiebot [13:08:42] Coren: auto restart failed again with about 4 of my bots [13:09:27] Betacommand: That's probably my bad; I was so tired when I finally could restart things that some exec nodes were rebooted early with only partial filesystem access. [13:09:40] * Betacommand grumbles [13:09:40] Betacommand: I thought I had caught all, but possibly not. [13:09:57] those bots have yet to survive a reboot [13:09:59] Coren: BTW, any interest in including that rule in ~local-anomiebot/.lighttpd.conf in the default config? If not, I'll put it on the webpage under "recipes for common configuration tasks". [13:11:25] anomie: It'd make sense (to make it default). After I caught up on sleep. The only reason I'm here at all is for some emergency support. :-) [13:36:21] (CR) Yuvipanda: "Ow, okay :D Can we make it to be Draft1, to be consistent with PS1, PS2, etc?"
[labs/tools/grrrit] - https://gerrit.wikimedia.org/r/88037 (owner: Hashar) [14:09:20] YuviPanda: yes [14:09:54] (CR) Hashar: "Yeah go for Draft1 :-] Can you amend it yourself? I am too busy right now sorry. Else will do next week." [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/88037 (owner: Hashar) [14:11:07] I am off [14:18:06] (CR) Yuvipanda: "Sure! Just didn't want to step on toes :P I'll amend it later tonight" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/88037 (owner: Hashar) [14:20:50] YuviPanda: if awake, can you restart/check on the unicorn? I'm getting 500s now. [14:42:48] andrewbogott: ah, okay [14:42:50] moment [14:43:14] thx [14:43:51] andrewbogott: I guess it needs a restart? [14:43:56] Killed by signal 1. [14:45:02] (PS3) Yuvipanda: draft exposed as 'draft-1' instead of 'PD1' [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/88037 (owner: Hashar) [14:45:43] (PS4) Yuvipanda: draft exposed as 'Draft1' instead of 'PD1' [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/88037 (owner: Hashar) [14:46:42] (CR) Yuvipanda: [C: 2 V: 2] draft exposed as 'Draft1' instead of 'PD1' [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/88037 (owner: Hashar) [14:47:49] andrewbogott: rebooting [14:49:23] (CR) Yuvipanda: "Merged and Deployed. Would be awesome if you could test it at some point." [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/88037 (owner: Hashar) [14:50:10] andrewbogott: still can't ssh in. can you take a look?
[14:50:17] yep [14:51:10] andrewbogott: machine seems dead [14:51:10] andrewbogott: https://metrics.wmflabs.org/ doesn't work either [14:51:10] responds to ping tho [14:51:10] * YuviPanda awards Coren|AFK a Barnstar of Staying Up Watching Things Copy [14:51:26] YuviPanda: I can log into proxy-dammit both as root and as Andrew [14:51:48] andrewbogott: ah, i can now too [14:51:55] So just a delay in reboot [14:51:58] probably [14:52:05] oh, no screen [14:52:07] right [14:52:36] andrewbogott: should be running on 5000 now [14:53:16] Yep, looks better. Thanks! [14:53:38] yw [14:54:11] YuviPanda: did you get anywhere with checking code into gerrit? [14:54:19] andrewbogott: nope. needs push right [14:58:11] YuviPanda: Can you push now? I added group lolrrit since that's basically just you anyway [14:58:18] trying [14:59:18] andrewbogott: woo, that worked! :D https://github.com/wikimedia/labs-invisible-unicorn [14:59:51] Cool. [15:00:14] I'm going to be afk off and on today, but I'll think about making a deb out of this... [15:00:58] YuviPanda: Now that things are imported do you want to keep push rights? [15:01:07] ah, no. take it off! :) [15:01:12] 'k [15:01:22] andrewbogott: it might need a newer version of python-redis and python-flask :( [15:01:24] not sure about flask [15:01:28] but probably redis [15:01:40] Yeah, and that special nginx package [15:01:43] Will be a mess [15:01:55] andrewbogott: i can rewrite it to use the system packages [15:02:07] andrewbogott: for flask and redis at least [15:02:13] If that's not a huge amount of work, that'd be really good. [15:02:16] it's not [15:02:23] let me know when you stop testing so I can take a look [15:02:44] Redis might be *slightly* slow on system package, but since this is just the API shouldn't be a problem [15:03:27] I'm done testing for the moment. 
[15:03:53] andrewbogott: grr, it uses flask SQLalchemy, which has no package [15:03:54] :( [15:05:54] Everything you're doing is similar to things that Openstack services do (using wsgi and sqlalchemy) so if you look at OS source maybe it'll be obvious how to adapt? [15:06:02] Since presumably we already have all the packaging we need for Openstack APIs [15:06:25] - back soon - [15:06:48] andrewbogott: hmm, okay [15:06:51] will do! [15:34:06] YuviPanda: Um… not that I want to necessarily encourage you to rewrite everything, but the OS services use a shared core called 'Oslo' that might magically do all the things you want. [15:34:33] I haven't looked at it in a while, though, it might have grown to be excessively complex. [16:00:38] andrewbogott: right. I think Ryan_Lane linked me to it at some point. [16:41:23] Coren|AFK: err, why are all the projects home directories in / ? [16:49:18] Coren|AFK: and / and /data/project/ are not the same... [16:50:52] Coren|AFK, petan, and I can't write to /data/project/gerrit-patch-uploader/ as it's owned by root... [17:11:44] a bunch of my jobs are in error state [17:11:45] error reason 1: 10/11/2013 06:54:12 [40004:29226]: can't stat() "/data/project/legobot//GA_bot.out" as stdout_path: [17:12:43] i killed all of them except one [17:13:18] my logs are just under 5M! [17:13:19] nice [17:13:42] legoktm: there's something wrong with the project mount, I think [17:13:52] :/ [17:15:46] legoktm: /data/project *and* / contain home directories for projects, but they are not the same [17:15:51] hm, tried activating lighttpd by creating .lighttpd.conf (https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb), but I'm not sure how to configure it to work with python.. Anyone know? [17:16:09] danmichaelo: http://flask.pocoo.org/docs/deploying/fastcgi/ [17:16:18] I haven't tried it myself yet, though. 
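For the Python-under-lighttpd question above, one low-effort route is plain CGI rather than FastCGI. The following is a minimal sketch using only the standard library's `wsgiref`; the greeting is a placeholder, and whether the Tool Labs lighttpd setup will execute such a script depends on its CGI configuration, which this log does not show.

```python
#!/usr/bin/env python3
import os
from wsgiref.handlers import CGIHandler


def application(environ, start_response):
    """Tiny WSGI app: reply with the requested path as plain text."""
    path = environ.get("PATH_INFO") or "/"
    body = ("Hello from " + path + "\n").encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# A CGI-capable server sets GATEWAY_INTERFACE; only then act as a CGI script.
# CGIHandler reads the request from the environment and writes the response
# to stdout. Run by hand (no CGI environment), the script does nothing.
if __name__ == "__main__" and "GATEWAY_INTERFACE" in os.environ:
    CGIHandler().run(application)
```

Because `application` is an ordinary WSGI callable, the same function could later be served over FastCGI by a WSGI-to-FastCGI adapter without changing the application code.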
[17:17:02] https://panel.djangoeurope.com/support/manual-setup/ under Set up your Django site in lighttpd might also help [17:17:13] I was planning to work on it tonight, but my projects' home folder is broken [17:18:47] danmichaelo: you could probably also just use cgi scripts, though. [17:19:19] yeah, that's what I did with apache [17:19:40] I'm working on the remaining kinks. Something went broky with chown when root on the new NFS server I'm trying to debug (hence inability to change ownership) [17:21:32] Find it a bit weird that the python script itself is specified in "bin-path". Seems like you have to add a "fastcgi.server" block for each script then. Not that I have that many scripts, just seems a bit redundant [17:23:31] danmichaelo: the fastcgi.server maps an http address ('/yourapplication.fcgi') to an application (/var/www/yourapplication/yourapplication.fcgi) [17:23:42] danmichaelo: the application doesn't have to be (and shouldn't be, really) in public_html [17:24:22] Sorry about the (brief) filesystem stall, changing a setting. [17:28:37] danmichaelo & valhallasw : looks like you can either map every single application, or create one socket and map processes to that, e.g: http://home.badc.rl.ac.uk/lawrence/blog/2006/10/26/exploring_web_server_backends_-_installing_fastcgi_and_lighttpd ? [17:28:54] sorry, tell your application to connect to the socket that's available [17:30:04] hmm [17:47:59] The first time I touched lighttpd.RESTART the server restarted and lighttpd.RESTART was deleted, but I also got the lighttpd.STOPPED file that I cannot delete (owned by root, rw-r-r) and the second time I touched lighttpd.RESTART nothing happened (it's eight minutes ago). I wonder if the presence of lighttpd.STOPPED is blocking the restart [17:48:32] also, nothing is written to error.log... [17:50:12] danmichaelo: It is. [17:50:27] danmichaelo: But you should be able to remove the lighttpd.STOPPED file [17:50:47] Coren|AFK: liar! 
:P (AFK) [17:52:01] Coren: any idea why /data/project/gerrit-patch-uploader is owned by root? [17:52:26] and, more importantly, could you chown it to local-gerrit-patch-uploader? :-) [17:52:34] valhallasw: Yes, because there is a problem with permissions atm prevent root from chowning stuff. [17:53:04] Hey all, quick question regarding the javascript API [17:53:12] what would be the time when mw.user is ready within the mw.loader? [17:53:28] I don't understand why, but yes, I could remove it 8) [17:53:44] Ryan_Lane: thinking of any timeline for migration from public IPs to proxy? [17:53:58] Ryan_Lane: asking to make sure I can be available around that time [17:55:30] Coren: ahhhh. OK [17:57:04] YuviPanda: whenever we have the interface merged and the api packaged [18:03:12] valhallasw: I've chowned it manually from the server. [18:07:15] Coren: cool, thanks. [18:08:29] Coren, now that the hardware upgrade is complete, how about expanding my instance? :-) [18:08:58] Cyberpower678: so you can make more 1T log files? :) [18:09:26] None of my files are no where near that big. [18:09:51] Read the email again. :p [18:10:02] it said 2G by mistake [18:10:03] he meant 2T [18:13:04] Can you verify that? [18:13:42] !logs [18:13:42] raw text: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-labs/ cute html: http://tools.wmflabs.org/wm-bot/logs/index.php?display=%23wikimedia-labs [18:13:46] hmm [18:13:48] there won't be logs for y'day [18:13:52] because labs was down [18:14:02] Cyberpower678: coren can verify, I gues [18:14:06] i was here when he reported it [18:14:21] Cyberpower678: also, if one of the other bots had a log file that was 1T, how can *you* win it overall with 2G? [18:14:24] I meant 2T Cyberpower678 [18:14:36] Cyberbot I runs three different bots taken over fro users, soon to be 4 bots. [18:14:51] Cyberbot II runs my own stuff and one of them is resource heavy. [18:15:14] They generate quite a bunch of output. 
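One way to keep a bot's output bounded is to let the logging layer rotate files itself. This is a minimal sketch that assumes a bot written in Python and logging through the standard `logging` module, which may well not match the bots discussed here; the logger name and size limits are illustrative.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_capped_logger(path, name="bot", max_bytes=50 * 1024 * 1024, backups=3):
    """Return a logger whose files on disk stay within about (backups + 1) * max_bytes."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger's handlers
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

With these defaults the handler keeps the current file plus three older ones (`.1` to `.3`) and discards anything older, so the total stays near 200 MB instead of growing without limit.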
[18:15:20] [21:59:26] -rw-rw---- 1 50404 local-50404 282G Oct 10 17:59 /mnt/srv/tools/project/cyberbot/CyberbotII/PCbot.out [18:15:38] Cyberpower678: That's one of many. You *need* to clean your logs. [18:16:14] Yes I do. I can't even access them anymore. They're way too big. I'm going to develop a monthly log cleaner script. [18:16:53] And if I delete them, the job starts dumping them into wierdly named files. [18:17:55] Coren, so about my question? [18:18:41] You're now #1 on my list, for next week. Today, I'm taking half-day off fixing minor kinks left in the new setup. [18:18:55] :-)))))))))))))))))))))) [18:19:07] Coren, I love you so much. [18:20:04] Well I gotta go. [18:21:54] Coren, Internal Server Error [18:22:02] What's going on? [18:22:41] valhallasw: thanks for helping. After restarting it turned out CGI worked just like with the old apache server (index.py file in public_html is executed correctly with an empty .lighttpd.conf file). I might look into fastcgi at some point, but cgi works fine for now since my web interface is very simple and not much used [18:23:04] There are going to be regular brief nfs stalls as I try to fix the permission problem. [18:23:15] Ok [18:23:16] Bye [18:23:50] Although tools-webproxy seems to have died entirely (for an unrelated reason) [18:24:21] Ah, not, it just stalled a little harder [18:34:01] Coren: looks like webservers might be down again [18:34:34] [14:23:03] There are going to be regular brief nfs stalls as I try to fix the permission problem. [18:34:40] ah [18:34:43] missed that [18:35:12] By, literally, minutes. And said stall is over for now. :-) [18:43:05] hello what was the table logging_ts_alternative on toolserver? [18:43:39] it is not avaible on labs. Can it be replaced with table logging on labs? [18:43:53] or is on labs also a table like this one [18:43:54] ? [18:45:13] pyfisch: Hm, I expect I have one like it. Which extra index is on logging_ts_alternative that isn't on logging? 
[18:46:01] I don't know. I am just trying to get a ts query working [18:46:16] pyfisch: If you show me the query, I'll tell you what view to use. :-) [18:46:38] one moment [18:48:22] Coren: ok it worked just with "logging" [18:48:36] It might be very slow depending on the where clauses, though. [18:48:54] Which is why the alternatives exist. :-) [18:52:07] it was fast, to fast. The query should take on the ts +1 hour here it only need 4:31 min [18:52:11] SELECT STRAIGHT_JOIN CONCAT("[[:File:", REPLACE(img_name, "_", " "), "]]") AS "File", IF(img_media_type IN ("AUDIO", "VIDEO"), "—",CONCAT("[[{{ns:6}}:", REPLACE(img_name, "_", " "), "|1x1px|alt=404|Removed]]")) AS "Status", CONCAT("{{subst:#time:Y-m-d|", img_timestamp, "}}") AS "Timestamp", CONCAT("{{user|", REPLACE(img_user_text, "_", " "), "}}") AS "Uploader", IFNULL(GROUP_CONCAT(log_action ORDER BY log_timestamp SEPARATOR [18:52:11] (SELECT COUNT(*) FROM globalimagelinks WHERE gil_to=img_name) AS "Usage" FROM image LEFT JOIN page ON page_namespace=6 AND page_title=img_name LEFT JOIN logging ON log_namespace=6 AND log_title=img_name AND log_action NOT IN ("patrol") WHERE page_title IS NULL GROUP BY img_name; [18:52:57] does anyone know if it is possible (yet) to have a tool labs tool to function as a web server? [18:53:01] this is the query and this one the original: User:Ilmari Karonen/Queries/Zombie images [18:53:28] coren: https://commons.wikimedia.org/wiki/User:Ilmari_Karonen/Queries/Zombie_images [18:53:29] MartijnH: I'm not sure what you mean with that. There's public_html, and you can run lighttpd [18:53:50] MartijnH: scala? [18:53:54] yup [18:54:01] spray if at all possible [18:54:01] i can set you up with one for it as a one off for now. [18:54:04] spray? [18:54:06] what's spray? [18:54:07] valhallasw: It'll be even more efficient if you use 'logging_logindex' rather than 'logging' [18:54:16] yeah, it self serves (binds to a port) [18:54:18] Oh, no. [18:54:20] MartijnH: http://spray.io/ [18:54:21] ? 
[18:54:27] yes YuviPanda [18:54:32] MartijnH: when do you want it? [18:54:33] A feature I need requires a >3.2 kernel. [18:54:43] * Coren facedesks. [18:55:18] YuviPanda, no rush (whatsovever) [18:55:23] hah, okay :) [18:55:27] it'd be nice to have [18:55:43] if it were there I'd try to set up a slightly faster CATSCAN replacement with limited functionality [18:55:49] if not, not big deal [18:55:59] mwalker: Gerrit Patch Uploader doesn't need shell access - it only connects to Gerrit. Thanks for checking. [18:56:35] *thumbsup* [18:58:43] YuviPanda, Petan earlet set up sbt for me, but that seems gone now too. I'm hella busy at work at the moment now though, and can't get myself to coding in my off hours. And I cut my 20% time at work too [18:59:04] MartijnH: he mustn't have puppetized it, i saw no scala stuff on puppet [18:59:17] Coren: I wrote a quick wrapper script to restart lighttpd - see /data/project/gerrit-patch-uploader/bin/restart_lighttpd [18:59:56] I probably should have it working locally anyway before talking about deployment woes. Folks are generally pretty fast if you need something [19:00:03] Coren: touches lighttpd.RESTART, then waits until lighttpd.RESTART disappears, and shows if lighttp.STOPPED exists afterwards [19:01:57] Coren: svn: Can't read directory 'old_junk/w': Input/output error [19:02:13] Betacommand: where is that old junk? [19:02:15] MartijnH: heh, yeah [19:02:26] MartijnH: I bet scala doesnt' do well with CGI :P [19:02:36] Coren: svn_copy [19:03:21] YuviPanda, the only way I can think of that working is spawning a JVM per request [19:03:27] indeed [19:03:29] YuviPanda: we have fcgi now! [19:03:30] terrible terrible :D [19:03:31] that doesn't make me a happy puppy [19:03:33] ... I see nothing wrong with the directory. 
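The wrapper described above (touch lighttpd.RESTART, wait for it to disappear, then report whether lighttpd.STOPPED exists) might look roughly like this. It is a sketch of the described behavior, not the actual restart_lighttpd script; the function name, the return values, the timeout and the polling interval are assumptions. The default timeout is generous because the flag file is reportedly only checked every 120 s or so.

```python
import os
import time
from pathlib import Path


def restart_lighttpd(home, timeout=300.0, poll=1.0):
    """Request a restart; return 'restarted', 'stopped' or 'timeout'."""
    restart_flag = Path(home) / "lighttpd.RESTART"
    stopped_flag = Path(home) / "lighttpd.STOPPED"

    # Same effect as `touch`: create the flag file or refresh its mtime.
    restart_flag.touch()

    # The server side removes the flag once it has acted on it.
    deadline = time.monotonic() + timeout
    while restart_flag.exists():
        if time.monotonic() >= deadline:
            return "timeout"
        time.sleep(poll)

    # A leftover STOPPED file suggests the web server did not come back up.
    return "stopped" if stopped_flag.exists() else "restarted"
```

A caller could run `restart_lighttpd(os.path.expanduser("~"))` as the tool user and check `error.log` whenever the result is `"stopped"`.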
[19:03:39] valhallasw: yeah, need to read up :) [19:03:53] Coren: trying to do an svn update [19:03:56] MartijnH: shouldn't be too hard to set it up, tho [19:04:25] svn_copy$ svn update [19:04:26] svn: Can't read directory 'old_junk/w': Input/output error [19:05:30] * Betacommand grumbles [19:06:37] That's odd indeed. [19:06:48] YuviPanda, I don't really know anything about CGI. But I'm sure people here are kind enough to set me up with a Java application server if needed [19:08:06] Betacommand: It's the same on the server side too. [19:08:09] * Coren checks [19:09:46] Aaah, I see what happened. That directory had some minor corruption from way back when the server went boom; doing the copy basically broke it for good. [19:10:18] MartijnH: any suggestions on Java app servers? Hopefully ones that are packaged on Ubuntu precise? :) [19:10:38] Coren: do you know what is affected? [19:11:40] Betacommand: Setting the directory aside server-side allows the svn up to restore everything; I've renamed it broken__w for now; if you have nothing in it you want to retain I'll delete it server side. [19:11:44] yeah sure. I'll go with "whatever" with a preference for Jetty I suppose [19:11:58] no serious preference either way [19:12:34] Betacommand: No, anything visible was already ferreted out; that bit was left in a broken dentry for . in that directory. [19:12:46] Coren: I've got symlinks in there that may be broken [19:13:05] Hi all [19:13:11] I have a Q [19:13:21] Coren: I think I have a lighttpd config problem (changed config and I get lighttpd.STOPPED), but I see nothing in my error.log [19:13:32] Coren, what's broken? [19:13:59] Betacommand: You've got a fresh, unbroken w in old_junk now [19:14:06] when i run a bot on labs i face"""WARNING: Token not found on wikipedia:ar. You will not be able to edit any page. [19:14:07] Received incomplete XML data. Sleeping for 15 seconds...""" [19:14:25] Whats the problem [19:14:40] Coren: thanks [19:14:56] Coren: did you do an update?
[19:15:11] no [19:15:13] Betacommand: Yes, to make sure it worked. [19:15:23] Coren, what's broken? [19:15:29] Elph: you are using a REALLY old version of pywikipedia [19:15:31] Cyberpower678: What are you talking about? [19:15:40] Aaah, I see what happened. That directory had some minor corruption from way back when the server went boom; doing the copy basically broke it for good. [19:15:53] That's the first line that showed up. [19:16:14] How can I update it? [19:16:15] Cyberpower678: I don't get your question; that's the answer. [19:16:51] valhallasw: Gimme a minute. [19:16:53] Coren, you made that statement. But I don't know what about. [19:17:07] Cyberpower678: That's because it wasn't addressed to you. [19:17:44] Elph: you will need a fresh checkout, pywiki moved to git [19:18:01] Elph: https://www.mediawiki.org/wiki/Manual:Pywikibot/Installation [19:18:19] Coren: it can be deleted [19:18:29] thank you. I'll try [19:18:54] Betacommand: *poof* [19:19:14] Coren: I got my directories mixed up :P [19:19:29] "mixed up?" [19:19:43] /w and /Watchbot [19:19:53] valhallasw: The only thing I have server side is an entirely uninformative "Session terminated, terminating shell... ...terminated." [19:20:12] valhallasw: I can try to start it by hand see what gives. What's the tool name? [19:20:28] /w has my generic configs for multi-language watchbots while the other has my current active bots [19:20:42] Coren: gerrit-patch-uploader [19:21:15] the unlink failed for: /var/run/lighttpd/gerrit-patch-uploader.pid 2 No such file or directory lines seem to be unrelated [19:21:27] (or rather, related to restarting the server) [19:21:39] I now have an empty .lighttpd.conf, which works. [19:21:49] Yeah, that's a noise I haven't found how to prevent but is harmless. [19:22:07] I was about to say that it was already running.
:-) [19:22:09] Coren is soarin' [19:22:24] :D [19:23:06] And will soon be sore-in [19:24:10] valhallasw: Want me to test it manually with your non-empty .lighttpd.conf? [19:24:20] Or is this in hand atm? [19:24:22] Coren: I can put mine back, just a sec. [19:24:56] Coren: ok, replaced, and restarting lighttpd... [19:27:12] Coren: OK, restarted, and it's not running anymore. no .STOPPED yet, though. [19:30:52] and now lighttpd.STOPPED is also there :-) [19:36:49] valhallasw: Gimme a sec [19:38:03] valhallasw: Ah, I see the error now. [19:38:14] * Coren wonders why it doesn't end up in your error.log [19:38:28] 2013-10-11 19:37:10: (configfile.c.912) source: /var/run/lighttpd/gerrit-patch-uploader.conf line: 571 pos: 17 parser failed somehow near here: (EOL) [19:39:15] Which isn't useful for you. :-) [19:39:40] Should be line 14, pos 17 [19:39:42] for you [19:40:38] It's the url.rewrite-once [19:42:07] Hmmm. [19:42:26] Coren: hum, maybe it's the comments that break parsing. Let me try. [19:42:39] lemme start it by hand [19:42:46] (once you're done) [19:42:53] I'm done :-) [19:43:23] and it's stopped. [19:44:05] possibly it's your fastcgi.server directive that's broken, it just gets to the url.rewrite-once with a bad state [19:44:58] * Coren sees nothing wrong with it. [19:45:31] I added a comma that another example does have... [19:45:38] let's see if that helps :/ [19:45:58] If there is an easy way to get the full config file, then it's easy to test it locally for errors [19:46:06] basically lighttpd -t -f lighttpd.conf [19:46:31] Yes and no. This is basically what I'm doing now, but the system isn't quite flexible enough for it yet. [19:49:01] Lemme try something; don't fiddle with your config file for a minute [19:49:09] k [19:58:30] Oh! [19:58:44] You got duplicates with the defaults. [19:59:43] doh! [19:59:47] And that happens before the error.log is set. [19:59:58] The version I modified starts long enough for you to get errors.
:-) [20:00:45] (And I'm guessing you need to make your backend executable) [20:02:06] I need to enter the right path to my backend :-) [20:19:30] Coren: the wait time between editing something and having lighttpd restarted is somewhat annoying - it seems to be 2x 2 minutes: 2 minutes to stop the server then 2 minutes before it is started again [20:20:02] valhallasw: I know. I'll have a nicer mechanism; remember this is a first draft. Think of it as beta. :-) [20:20:44] ok :-) [20:22:26] Coren: Are you too busy on labs stuff to advise me on some Arbcom procedures [20:22:26] ? [20:22:50] I'm mostly lacking sleep atm, sorry. [20:22:58] np [20:58:07] WHOOOOOOOOOOOOO [20:58:13] it's working \o/ [20:58:33] so: enough coding for today =) [23:17:50] petan: i won't touch c# and maybe won't even do upstart. but i will at least continue to boot the bot once in a while. but i can't if i don't have access to the box anymore. it moved from bots, right? [23:24:53] Hi [23:25:01] Anyone know why catscan2 is down? [23:25:16] http://tools.wmflabs.org/catscan2/ [23:29:12] $ less /data/project/catscan2/php_error.log [23:29:12] /data/project/catscan2/php_error.log: Permission denied [23:29:21] Qcoder00: idk :) [23:29:34] Hmm [23:29:40] Not a major issue [23:29:51] I can do something else, but it would be nice to have it fixed [23:29:51] Qcoder00, https://wikitech.wikimedia.org/wiki/User_talk:Magnus_Manske [23:29:56] Server has a bad day... [23:30:08] OK