[00:43:27] Wikimedia Labs: /etc/mailname is set to "labs-vmbuilder-precise.eqiad.wmflabs" - https://bugzilla.wikimedia.org/64962#c1 (Tim Landscheidt) And just to clarify: While this happens on freshly created instances, it is of course also an issue on instances that have been created some time ago :-). So /etc/mai...
[01:27:58] Wikimedia Labs / tools: Install pdftk - https://bugzilla.wikimedia.org/65048 (Tim Landscheidt) PAT>RES/FIX
[01:28:12] Wikimedia Labs / tools: Install jq - https://bugzilla.wikimedia.org/65049 (Tim Landscheidt) PAT>RES/FIX
[06:26:39] is there a best practice recommendation for group granularity on Tool Labs?
[06:27:14] should every bot/tool have its own service group, or is it fine to have an "xxwiki" service group for all the tools of that wiki?
[07:10:43] Wikimedia Labs / tools: Install jq - https://bugzilla.wikimedia.org/65049#c3 (Alessandro Brollo) Thanks (for pdftk too)! :-)
[07:22:43] Hmm, I didn't think this BBC TV cite tool through. The labs are not in the UK :/
[07:26:12] Wikimedia Labs / tools: querycache and querycachetwo tables aren't available on labs sql dbs - https://bugzilla.wikimedia.org/63782#c5 (Bawolff (Brian Wolff)) (In reply to Luis Villa (WMF Legal) from comment #4) > You mean, we're still showing suppressed stuff on the main site in some > cases? Or that...
[09:48:44] !mysql
[09:48:45] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Shared_Resources/MySQL_queries
[10:20:58] https://tools.wmflabs.org/catscan2/ morning
[10:21:07] It's down again
[10:21:12] I'll leave a note
[10:25:07] Qcoder00: please contact the tool's owner
[10:25:15] I tried to
[10:25:34] The talk page that is redirected to I can't leave a message on as it's protected
[10:39:32] Qcoder00: https://en.wikipedia.org/wiki/User_talk:Magnus_Manske ?
[10:40:10] Coren: scfc_de: very strange. Something sent catscan2 webservice a SIGTERM signal out of the blue.
[10:40:18] scfc_de: I restarted it
[10:41:23] end_time Fri May 9 06:43:51 2014, exit_status 143
[10:41:25] 2014-05-09 06:43:51: (server.c.1512) server stopped by UID = 0 PID = 23350
[10:42:06] very strange. who might UID = 0 be ? ;)
[10:42:49] or is this the new OOM signal?
[11:08:36] hedonil: Don't know. Coren wrote recently that he switched to SIGTERM for qdel, but I assume OOM will still get SIGKILL. ~/.bash_history has the last "webservice stop" at May 4th (and qdel May 2nd), so can't be that.
[11:10:39] scfc_de: Hm. Just looked at $qacct -j lighty-catscan2 again. At >>end_time Fri May 9 06:43:51 2014 we had: >> maxvmem: 4.083G
[11:16:00] scfc_de: Do you happen to be at the Hackathon?
[11:16:22] scfc_de: this UID = 0 thing in error.log seems to be ambiguous, as it's always UID = 0, even if issued by the user
[11:17:36] but something definitely stopped it at Fri May 9 06:43:51 2014
[11:26:20] Silke_WMDE: No.
[11:27:01] hedonil: Yeah, but the qdel/webservice stop/jstop command should show up in ~/.bash_history (unless someone used sudo).
[11:27:47] Coren or anybody: I’m re-setting up a labs instance that didn’t survive transition to eqiad; it’s got an external IP which I think I’ve re-assigned to the new instance but I can’t seem to reach it
[11:27:50] sudo make me_a_sandwich
[11:27:57] Wikimedia Labs / tools: Copy contents of https://svn.toolserver.org/ to Wikimedia git - https://bugzilla.wikimedia.org/58801#c5 (Silke Meyer (WMDE)) Proposal (from discussion at the Zürich Hackathon with Marc-André Pelletier): Labs will host a backup of the svn repos on Toolserver. If somebody wants t...
[11:28:07] any ideas if i’ve misconfigured it?
[11:29:43] Wikimedia Labs / tools: Copy contents of https://svn.toolserver.org/ to Wikimedia git - https://bugzilla.wikimedia.org/58801#c6 (Silke Meyer (WMDE)) (In reply to Silke Meyer (WMDE) from comment #5) > Proposal (from discussion at the Zürich Hackathon with Marc-André Pelletier): > > Labs will host a bac...
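(Annotation on the `exit_status 143` reported at 10:41:23: by the usual shell convention, a process killed by signal N is reported with status 128 + N, and 143 - 128 = 15, which is SIGTERM on Linux. That fits the SIGTERM observation at 10:40:10. A minimal Python sketch of the decoding follows; `describe_exit_status` is a hypothetical helper name, not part of Grid Engine or any Tool Labs tooling.)

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (128 + N means 'killed by signal N')."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited normally with code {status}"


# Status taken from the qacct output quoted in the log.
print(describe_exit_status(143))
```

(The same convention would report a SIGKILL, which the log assumes is what an OOM kill sends, as status 137.)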
[11:31:23] brion: Have you looked at the security group?
[11:31:31] scfc_de: i have not, lemme check
[11:31:40] 'default'
[11:32:18] ugh i have some dpkg issue somehow?
[11:33:25] brion: In most projects, "default" only allows ssh. Usually, you define a security group for the role of the instance, assign the security group on instance creation and allow the port in the security group.
[11:34:00] yeah…… looks like 80 is null-routed
[11:34:01] lemme adjust that
[11:36:38] hmm do i have to pick security group at instance creation time?
[11:36:58] brion: you have to specify group at creation time
[11:37:04] but you can create the rules /in/ a group at any time
[11:37:05] ok delete and recreate time :D
[11:37:08] great
[11:37:21] So if you have a one-instance project you can just throw new rules into the default group rather than rebuild
[11:37:32] ah too late ;)
[11:37:56] let’s try the 14.04 image this time, see if it works ;)
[11:38:04] brion: I can also have a go at rescuing the old instance (or the data on the old instance) if there's anything important there
[11:38:16] Um… I would not advise that
[11:38:17] andrewbogott: nah i just have to reclone from git
[11:38:23] ok 12.04 it is then ;)
[11:38:45] ok, now i should *wait until it’s done* before i add puppet groups right?
[11:38:50] We're running into lots of dependency messes on trusty, it's still unclear what will happen.
[11:38:52] that might have been one of the things that tripped me up before
[11:39:02] yep, wait until the box is up and has finished a puppet run
[11:39:11] excellent
[11:40:08] ok build is done, puppet status is unknown….. so i’ll let it sit for a while until it says it’s run
[11:40:47] i gotta say this beats begging for a small server to get physically installed, even with its oddities ;)
[11:40:49] that'll work
[11:57:33] sorry brion, I'm opening and closing my laptop a lot. Your VM coming up ok?
[12:02:58] Wikimedia Labs / deployment-prep (beta): support dvwiki in beta labs - https://bugzilla.wikimedia.org/50335#c7 (Antoine "hashar" Musso) NEW>ASS a:Antoine "hashar" Musso Finally giving a poke at it :-)
[12:06:29] !log deployment-prep Creating en_rtlwiki wiki {{bug|50335}}
[12:06:32] Logged the message, Master
[12:11:57] Wikimedia Labs / deployment-prep (beta): support dvwiki in beta labs - https://bugzilla.wikimedia.org/50335#c10 (Antoine "hashar" Musso) mwdeploy@deployment-bastion:~$ mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki en wikipedia en_rtlwiki en-rtl.wikipedia.beta.wmflabs.org Creating d...
[12:15:57] Wikimedia Labs / deployment-prep (beta): Create an en_rtl wiki in beta labs - https://bugzilla.wikimedia.org/50335 (James Forrester)
[12:16:12] Wikimedia Labs / deployment-prep (beta): Create an en_rtl wiki in beta labs - https://bugzilla.wikimedia.org/50335 (James Forrester) PAT>RES/FIX
[12:19:07] Cyberpower678: o/
[12:19:31] * Cyberpower678 waves
[12:20:06] Cyberpower678: http://tools.wmflabs.org/bbc-tv-cite/search?q=tonight
[12:23:12] Coren: Used to have a website at http://maps-warper.instance-proxy.wmflabs.org/maps/
[12:23:26] What would the current url be? I remember something about proxy changes
[12:26:08] multichill|hacki: Probably easiest to just set up a new web proxy at https://wikitech.wikimedia.org/wiki/Special:NovaProxy
[12:27:55] scfc_de: Thank you, found it at warper.wmflabs.org
[12:34:12] Wikimedia Labs / deployment-prep (beta): Create an en_rtl wiki in beta labs - https://bugzilla.wikimedia.org/50335 (James Forrester) PAT>RES/FIX
[12:34:57] Wikimedia Labs / tools: Move wiki.toolserver.org to WMF - https://bugzilla.wikimedia.org/60220#c18 (Silke Meyer (WMDE)) Ok, I see we don't have a volunteer for this. So - same as for svn - a backup copy / xml dump will be kept in Labs. Whoever wants to resurrect the toolserver wiki as a "tool" in Tool...
[12:36:42] Wikimedia Labs / tools: Provide namespace IDs and names in the databases similar to toolserver.namespace - https://bugzilla.wikimedia.org/48625#c35 (Silke Meyer (WMDE)) Little status update... According to Nosy, this is not fully done, yet.
[12:51:02] Your webservice is scheduled:
[12:51:04] queue instance "continuous@tools-exec-06.eqiad.wmflabs" dropped because it is temporarily not available
[12:52:32] Coren: ^?
[12:52:55] multichill|hacki, if you need some help with the warper, let me know :-) I'm not at the hackathon in person this weekend, but am hacking on it remotely
[12:57:57] chippy: Awesome!
[13:08:28] Wikimedia Labs / tools: Move wiki.toolserver.org to WMF - https://bugzilla.wikimedia.org/60220#c19 (Tim Landscheidt) The point of moving wiki.toolserver.org to WMF *is* to have it be part of production so that (security) updates are done once and neither waste multiple people's time nor leave the wiki...
[13:12:28] a930913: Some job failed ("qstat -f -explain E"). Let me take a look.
[13:13:43] a930913: Possibly a glitch or so, I'll clear the queues' states.
[13:14:23] Coren: why is http://wmde.wmflabs.org/ redirecting me to https://www.wikimedia.ch/ on the conf wiki? fab.wmflabs.org works.
[13:14:31] s/wiki/wifi/
[13:15:36] Coren: you around?
[13:16:44] !log tools Cleared error state of queues {continuous,mailq,task}@tools-exec-06 and webgrid-lighttpd; no obvious or persistent causes
[13:16:48] Logged the message, Master
[13:17:36] scfc_de: Yeah, fixed :)
[13:17:43] * a930913 goes back to fixing his errors.
[13:29:53] is there any way to force a puppet run on my instance?
or do i just wait
[13:30:18] brion: sudo puppetd --test --verbose
[13:30:33] thx
[13:30:58] * brion cross fingers and hopes apache installs
[13:31:10] hah i have duplicate definitions ok lemme fix that
[13:31:27] Wikimedia Labs / deployment-prep (beta): Create an en_rtl wiki in beta labs - https://bugzilla.wikimedia.org/50335#c13 (Antoine "hashar" Musso) So the localization cache for en-rtl was not being generated because that language is not listed in Names.php Sam figured out we can inject new language names...
[13:36:42] Wikimedia Labs / tools: Move wiki.toolserver.org to WMF - https://bugzilla.wikimedia.org/60220#c20 (Sam Reed (reedy)) Putting it on production as a "standalone" wiki (like the private wikis etc) would be fairly easy. The "most difficult" part of it would presumably be postgres -> mysql. Of course,...
[13:42:27] wooo i got httpd working
[13:46:31] d’oh
[13:46:43] https://embed-sandbox.wmflabs.org/ <- https no happy because cert is self-signed
[13:51:12] Wikimedia Labs / tools: Move wiki.toolserver.org to WMF - https://bugzilla.wikimedia.org/60220#c21 (Silke Meyer (WMDE)) Yay, cool, Reedy! I can poke Nosy to help you.
[13:52:43] Wikimedia Labs / tools: Move wiki.toolserver.org to WMF - https://bugzilla.wikimedia.org/60220#c22 (Sam Reed (reedy)) (In reply to Silke Meyer (WMDE) from comment #21) > Yay, cool, Reedy! I can poke Nosy to help you. I've got root ;)
[14:08:22] hey is anybody familiar with the star-wmflabs-org cert option for labs instances?
[14:08:36] it seems to give me a self-signed cert
[14:08:46] should i use some proxy setup instead?
[14:11:20] That cert is self signed on purpose IIRC (it's pointless to give a signed cert to everyone). Use the proxy to get a real ssl cert.
[14:11:28] true :D
[14:11:32] ok lemme find the info on that
[14:14:28] Wikimedia Labs / tools: Add support for Java Servlets on Tool Labs - https://bugzilla.wikimedia.org/54845 (Marc A.
Pelletier) PAT>RES/FIX
[14:14:28] Wikimedia Labs / tools: Move wiki.toolserver.org to WMF - https://bugzilla.wikimedia.org/60220#c23 (Sam Reed (reedy)) (In reply to Tim Landscheidt from comment #19) > The point of moving wiki.toolserver.org to WMF *is* to have it be part of > production so that (security) updates are done once and neit...
[14:44:58] Wikimedia Labs / tools: Allow jvm non-cgi webapps - https://bugzilla.wikimedia.org/50453#c2 (Marc A. Pelletier) Don't use the normal webgrid for that, but use the tomcat queue. The normal lighttpd queues are /severely/ overcommitted under the presumption that nothing but lighttpd runs there. A (possibly...
[15:27:47] https://embed-sandbox.wmflabs.org/ <- now with ssl proxy!
[15:33:52] a930913, supercount works for me.
[15:43:04] Cyberpower678: Yeah, but it wasn't then.
[15:43:54] :p
[15:56:15] @info
[15:56:15] http://bots.wmflabs.org/~wm-bot/dump/%23wikimedia-labs.htm
[16:41:41] Your webservice is scheduled:
[16:41:42] np_load_avg=2.892500 (= 2.892500 + 0.50 * 0.000000 with nproc=8) >= 2.75
[16:50:46] https://bugzilla.wikimedia.org/show_bug.cgi?id=65067 Enable VisualEditor on commons as BetaFeature
[16:50:49] hmm..... :/:/
[16:51:35] wrong window, sorry.
[18:13:01] Coren: I need a hand
[18:14:48] Betacommand: New tool bbc-tv-cite :
[18:14:53] :) *
[18:18:04] hey, tool labs users
[18:18:21] can you check something on an exec node really quick?
[18:18:36] what?
[18:18:47] are these packages installed:
[18:18:53] pdftk
[18:19:13] jq
[18:19:15] mutante: are they installed on the login server?
[18:19:25] i don't know
[18:19:42] i want to know if a gerrit change did what it was supposed to do
[18:19:51] and that would be installing these on exec nodes
[18:20:23] what i didn't know right away was the names of those nodes
[18:20:30] to connect their and look
[18:20:32] there
[18:20:46] mutante: you really can't connect directly
[18:21:17] do you know the instance names?
[18:21:32] well, my key is in root auth keys
[18:22:36] Betacommand: you can, actually.
[18:22:50] well, directly from within tool labs, anyway
[18:23:08] just ssh tools-exec-XX from tools-login works
[18:23:40] mutante: jq is installed on tools-exec-03, at least, as is pdftk
[18:24:02] mutante: also on -01 and -10
[18:24:55] anyone has an idea why trying to ssh into an instance i created gives me?: channel 0: open failed: administratively prohibited: open failed
[18:25:57] valhallasw: thanks! i'm on it now :)
[18:26:08] i just needed the instance name actually and fix my ssh config
[18:26:22] and yea, that looks like all is fine , can close some BZ
[18:32:42] Wikimedia Labs / tools: Install pdftk - https://bugzilla.wikimedia.org/65048#c3 (Daniel Zahn) RES/?>VER @tools-exec-03:~# ii pdftk 1.44-4build1 tool for manipulating PDF documents
[18:33:27] Wikimedia Labs / tools: Install jq - https://bugzilla.wikimedia.org/65049#c4 (Daniel Zahn) RES/?>VER @tools-exec-03: ii jq 1.2-8~precise1 lightweight and flexible command-line JSON processor
[18:39:11] Coren: I'm about to take a shotgun or C4 to the lighttpd webserver
[18:39:44] Betacommand: what did you do this time :-p
[18:40:18] valhallasw: Not me, the piece of shit keeps crashing with zero useful information
[18:40:55] One minute things will be working the next "No webservice"
[18:41:45] I've had to restart it a dozen times today
[18:41:54] no info in the error.log? or maybe lighttpd.err?
[18:42:07] oh, there's no lighttpd.err
[18:42:08] :/
[18:42:23] stderr goes to ~/error.log
[18:42:35] error.log has nothing
[18:42:53] which is why I'm about to take a shotgun to it
[18:43:20] I've been trying to find coren for several days with zero response
[18:44:01] Hell, a microsoft webserver has fewer issues
[18:44:29] Betacommand: try running qstat -s z next time, then qacct -j to see if there's any info
[18:44:32] which project is this?
[18:44:45] valhallasw: Mine
[18:44:57] betacommand-dev?
[18:45:03] yeah
[18:48:02] Betacommand: so the webserver that's down again :-p what does qstat -s z show?
[18:52:28] valhallasw does lighttpd not spawn requests off correctly?
[18:53:02] Betacommand: I'm not sure what lighttpd does exactly
[18:53:13] It's showing a 4.5GB vmem usage which is way too high
[18:53:18] right
[18:53:43] That does make sense with the messages in the error.log that seem to come from cgi-bin scripts
[18:55:06] Betacommand: would there be any reason for a cgi script to use such massive amounts of memory?
[18:56:02] valhallasw No single one is, it just lumps them together
[18:56:39] so 5 x 1GB means server crash
[18:56:53] Betacommand: it's retarded to have cgi-bin scripts use 1GB each, tbh, but OK
[18:57:07] it's also retarded that this crashes lighttpd
[18:57:18] valhallasw I was using a random easy value
[18:57:52] Betacommand: are they long-running?
[18:58:06] some can be
[18:58:26] well, move those to the grid, and let them output a static file that you redirect the user to
[18:59:12] which, I think, should be done for anything that takes longer than maybe 10 seconds or so
[18:59:37] I'm not sure how to easily diagnose which scripts are the main culprits, though.
[19:01:02] Betacommand: oh, and do any of them return large files?
[19:01:07] http://redmine.lighttpd.net/issues/2102
[19:53:15] valhallasw I'm doing testing and the script is using 30MB, nowhere close to the 5GB that caused it to crash
[19:53:30] :/
[19:53:40] and the output size?
[19:54:18] less than 1MB
[19:54:59] 1MB could still be an issue if I can believe http://redmine.lighttpd.net/issues/2102
[19:56:39] valhallasw the webservers use 1.1 not 1.0
[19:57:40] HTTP/1.1 you mean? That depends on the client, I'd think, or maybe on the YuviProxy
[20:16:40] valhallasw part of the issue is that a new webservice uses 40% of the vmem just to exist
[20:19:08] Hm.
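(Annotation on the "it just lumps them together" point at 18:56:02: if the grid accounts for the lighttpd process and all of its CGI children as one job, the job can exceed its memory limit even when no single script is large. The sketch below illustrates that arithmetic only. The 4096 MB limit and the 1638 MB baseline, roughly 40% of that limit, are hypothetical figures chosen for the example; the log does not state the real limit. The function names are likewise made up.)

```python
def total_vmem_mb(server_mb: int, cgi_mb: list[int]) -> int:
    """Memory charged to the job: the web server plus every CGI child."""
    return server_mb + sum(cgi_mb)


def exceeds_limit(server_mb: int, cgi_mb: list[int], limit_mb: int) -> bool:
    """True if the combined usage is over the job's (hypothetical) limit."""
    return total_vmem_mb(server_mb, cgi_mb) > limit_mb


LIMIT_MB = 4096   # hypothetical job limit, not taken from the log
SERVER_MB = 1638  # hypothetical baseline, about 40% of LIMIT_MB

# Five concurrent scripts at 1 GB each, the "random easy value" from the log.
print(exceeds_limit(SERVER_MB, [1024] * 5, LIMIT_MB))

# A single 30 MB script, as measured at 19:53:15, stays well under the limit.
print(exceeds_limit(SERVER_MB, [30], LIMIT_MB))
```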
[21:27:12] Wikimedia Labs / tools: Puppet is stuck due to openjdk-7-jre-headless - https://bugzilla.wikimedia.org/63823#c1 (Daniel Zahn) just confirmed on tools-exec-03, i was about to create this as a duplicate, fortunately Bugzilla is smart :) it's still an issue, but the puppet run continues anyways, so it wa...
[21:32:27] Wikimedia Labs / tools: Puppet is stuck due to openjdk-7-jre-headless - https://bugzilla.wikimedia.org/63823#c2 (Daniel Zahn) circular dependency! baah openjdk-7-jre-headless Depends: openjdk-7-jre-lib openjdk-7-jre-lib Depends: openjdk-7-jre-headless this must be https://bugs.debian.org/cgi-bin/...
[21:42:27] Wikimedia Labs / tools: Puppet is stuck due to openjdk-7-jre-headless - https://bugzilla.wikimedia.org/63823#c3 (Daniel Zahn) i think i could just fix it manually, but didn't want to just mess with it before another proper attempt via a puppet change (try to install -lib instead of -headless?)
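(Annotation on the circular dependency reported at 21:32:27: the two `Depends:` lines quoted there form a two-node cycle. A minimal sketch of detecting such a cycle with a depth-first search is shown below. The `depends` mapping only restates the two lines from the log, and `find_cycle` is a hypothetical helper; this is not how dpkg or Puppet resolves dependencies.)

```python
def find_cycle(depends):
    """Return one dependency cycle as a list of package names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    path = []

    def visit(pkg):
        state[pkg] = GREY
        path.append(pkg)
        for dep in depends.get(pkg, ()):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so we closed a loop.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[pkg] = BLACK
        return None

    for pkg in depends:
        if state.get(pkg, WHITE) == WHITE:
            found = visit(pkg)
            if found:
                return found
    return None


# The two Depends: relations quoted in the bug comment.
depends = {
    "openjdk-7-jre-headless": ["openjdk-7-jre-lib"],
    "openjdk-7-jre-lib": ["openjdk-7-jre-headless"],
}

cycle = find_cycle(depends)
print(" -> ".join(cycle))
```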