[00:00:19] scfc_de: Ah, well, Windows is part of the problem. PSFTP (part of PuTTY) apparently doesn't like that. [00:00:37] Matthew_: Make sure it too is pointed at tools-login [00:01:07] It is [00:01:49] Lemme look at the logs in a min [00:02:57] Kk [00:03:45] Matthew_: I only see successes in the log regarding you. Did your psftp give you any error messages? [00:04:09] Yes, "Unknown command become" [00:04:21] I assume because PSFTP doesn't support issuing SSH commands... [00:04:50] Well, yeah. It's a File Transfer Protocol. :-) [00:05:04] Figured :P [00:05:10] Do I need to become? Or can I just put? [00:05:23] (cd-ing to the proper place, of course :) ) [00:05:47] You can just put, your user account is in the tool's group. [00:05:59] OK, thanks :) [00:06:33] You might need to retake ownership of the files once you're done, if you intend to run them as web CGIs (there is a command for that, 'take') [00:07:02] OK [00:07:38] Coren: You convinced Ryan? :-) [00:08:14] scfc_de: Executive decision. The sudo mess doesn't work right, and would be a pain even if it did. It's my project, I get to keep both pieces if I break it. :-) [00:09:09] Coren: You have my support :-). [00:09:42] I'll rope (the other) Tim into an in-depth security review of my code to help assuage their worries. [00:29:58] OK, one more question, then I should be done. Is there a Powered by Wikimedia Labs button? [00:32:19] Matthew_: Not that I know of (nor "Powered by Tools"). But a nice idea. [00:32:57] OK, thanks anyway. (TS had one.. that's why I'm asking. Maybe I'll ps one) [00:39:00] ... I broke it. [00:39:55] Broke what? [00:40:02] My tools. [00:40:16] Is it a web tool? [00:40:36] http://tools.wmflabs.org/matthewrbowker <-- Stylesheets (which were in cgi-bin but are not in styles) are 500-ing [00:40:38] Yes [00:40:46] Well, a static landing page until I get things sorted... [00:41:09] * Coren doesn't see a 500 [00:41:35] Yeah, but can you see stylesheets? [00:42:18] Ah, it's a php. 
Is it owned by the tool account? [00:42:29] I ran take, unless it needs arguments... [00:42:44] It needs the name of the files/dirs you want to take. :-) [00:43:01] Ah [00:43:49] Nope... [00:44:53] Matthew_: Your css.php isn't executable. :-) [00:45:20] Matthew_: But a big problem is that it's also world writable(!) [00:45:36] o.O [00:45:42] I definitely didn't set it that way [00:45:46] You really shouldn't have anything in your tool's home that is world writable. [00:45:58] Maybe SFTP has been overly generous [00:46:23] Coren: Do PHP files need to be executable?! [00:46:26] chmod -R o-w /data/project/matthewrbowker/ [00:46:50] scfc_de: For suphp to recognize them as scripts, yes. It's a security measure against upload dir abuse and such. [00:47:21] Matthew_: Want me to fix your permissions? [00:47:45] Coren: http://tools.wmflabs.org/wikilint/test.php works for me with 644. I think suphp was complaining about the world-writability. [00:47:47] Coren: I just fixed it via CHMOD, but I'm now going to have to figure out why Cyberduck did it that way. [00:48:13] ... what's a cyberduck? [00:48:37] It's my SFTP software :P [00:48:54] scfc_de: Ah, right, phps don't actually need +x, only cgi-like. [00:49:39] Matthew_: Unless it is really braindead, you should have an option to determine what permissions you want. Normally, you want group write but never world write. [00:50:27] Matthew_: I'm not sure the red is an improvement :-). [00:50:46] scfc_de: Yeah... maybe a nice blue. [00:51:00] Coren: Yes, i just found it. It was configured that way out of the box. [00:51:25] * Coren hates software that has insecure defaults. [00:54:21] We need to document PSFTP's settings and the other pitfalls somewhere. I made a note to update /Help tomorrow. [00:54:35] scfc_de: That would be helpful, thank you! [00:54:38] Coren: Agreed [00:55:33] Well, with the Toolserver still down: Good night! [00:55:50] scfc_de: 'night [00:55:54] Night, scfc_de [00:56:11] scfc_de: Thanks for the help! 
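(Editor's sketch, not part of the log.) The world-writable problem discussed above, and the `chmod -R o-w` fix, can be approximated in Python for anyone who wants to audit a tool directory before changing anything. This is a minimal sketch under assumptions: the function names `find_world_writable` and `strip_world_write` are hypothetical helpers invented for illustration, and symlinks are skipped rather than followed.

```python
import os
import stat


def find_world_writable(root):
    """Return paths under root (root included) whose mode has the o+w bit set."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself plus every file directly inside it.
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # symlink permissions are not meaningful here
            if mode & stat.S_IWOTH:
                hits.append(path)
    return sorted(hits)


def strip_world_write(root):
    """Clear the o+w bit on everything reported by find_world_writable."""
    changed = find_world_writable(root)
    for path in changed:
        mode = stat.S_IMODE(os.lstat(path).st_mode)
        os.chmod(path, mode & ~stat.S_IWOTH)
    return changed
```

Calling `find_world_writable("/data/project/matthewrbowker")` first would list the offending files without touching them; `strip_world_write` then roughly mirrors the recursive `o-w` change, leaving group write intact as recommended in the conversation.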
[01:50:26] bloop [02:42:02] buh: bloop? [02:42:26] I thought a bot was about to message something, but apparently not :P [04:52:00] Warning: There is 1 user waiting for shell: MichaelBillington (waiting 0 minutes) [05:05:30] Warning: There is 1 user waiting for shell: MichaelBillington (waiting 13 minutes) [05:19:04] Warning: There are 2 users waiting for shell, displaying last 2: MichaelBillington (waiting 27 minutes) Ladsgroup (waiting 10 minutes) [05:32:29] Warning: There are 2 users waiting for shell, displaying last 2: MichaelBillington (waiting 40 minutes) Ladsgroup (waiting 24 minutes) [06:05:12] Warning: There is 1 user waiting for shell: Krd (waiting 0 minutes) [08:45:54] @requests [08:45:54] There are no shell requests waiting [08:45:54] There are no shell requests waiting [09:15:40] . [09:15:46] test [09:16:04] @logon [09:16:05] Channel is already logged [09:17:20] is there no other backup of it? :/ [09:33:14] buh I am happy there is at least this one [09:33:23] I don't usually backup configs [09:33:23] true [09:33:41] but worst shit is that infobot db is part of configs [09:33:49] this backup is from february [09:34:11] fortunately I have html dumps that contain all the new keys [09:34:18] but someone need to convert them from html to xml :/ [09:34:26] I probably need to write a tool for that [09:34:26] :/ [09:36:33] well, I guess you have learnt from your mistake? ;p [11:19:39] Warning: There is 1 user waiting for shell: Moustapha (waiting 0 minutes) [11:32:59] Warning: There is 1 user waiting for shell: Moustapha (waiting 14 minutes) [13:06:51] petan: Timetravel! We have timetravel! [13:07:04] petan: You lost nothing. [13:07:42] huh? [13:09:33] In your home or /data/project? [13:24:23] petan: ls /data/project/.snapshot/20130512.0017/ [13:24:56] petan: Timetravel FTW! [13:25:40] the way to be productive with timetravel: [13:25:50] cp -vr /data/project/.snapshot/20130513.0017/myproject/ . [13:26:38] I'm afraid I haven't yet perfected that direction. 
:-) [13:26:44] :( [13:28:25] is bastion currently having problems? I can't use scp and a touch in my home directory is answered with "Read-only file system" :/ [13:29:00] Pyfisch: Bastion is still using gluster. Gluster doesn't /have/ problems, gluster /is/ the problem. [13:29:30] gluster works fine until it breaks, then it's like sitting on a cacti [13:30:37] ok, very interesting but what I have to avoid this problem [13:31:19] Pyfisch: You can't. It's a bug that will require Ryan's intervention to fix. [13:31:55] Pyfisch: The NFS server is ready, though, I think Ryan plans to migrate stuff away from it shortly. [13:32:10] s/from it/from gluster/ [13:32:55] ok, how can I upload a file to shared file system? [13:33:44] Pyfisch: Well, your best bet is to copy it directly to the instance you need the file on with a tunnel. [13:34:16] Check the ProxyCommand section of https://wikitech.wikimedia.org/wiki/Access#Accessing_public_and_private_instances [13:43:13] petan: Did you save poor wm-bot? [13:44:21] * Damianz looks at the toolserver and sighs [13:44:38] So need mysql replication so I don't have to rely on ts http/mysql being up [13:48:03] Damianz: Real Soon Now™ [13:48:05] Damianz: don't we all. [13:48:20] Coren: That's what the roadmap said last year :P [13:48:25] i had to do this today https://en.wikipedia.org/w/index.php?title=Template:Dupdet&diff=prev&oldid=554709118 [13:48:34] Damianz: It didn't have /me/ dedicated to it then. :-) [13:49:02] Coren: I vote you tie asher up and cover him in bannanas then let some chimps into the room [13:49:31] Damianz: No, I need him. :-) [13:50:25] So does everyone else :P [13:51:01] Also yay, tools has like a real ssl cert [13:51:14] Well... sorta [13:51:56] It's real. Why only 'sorta'? [13:52:32] petan: ? [13:54:25] !petan [13:54:41] petan|ic? [13:58:59] RapidSSL :P [14:00:01] * Damianz wonders if Coren is going to install mod_spdy ftw [14:01:17] Damianz: We iz switching to apache 2.4 soon. 
[14:01:43] going to package it for ubuntu? [14:01:51] * Coren nodsnods. [14:02:03] We already got it, working on modules. [14:02:04] * Damianz feels your pain [14:02:27] Hate packaging debs... though probably lack of understanding on my part as I normally use rpms [14:04:07] hmm actually that reminds me [14:04:22] * Damianz goes to look at where he got to with fpm and jenkins behaving with zuul and gerrit [14:08:25] Coren; do you know if anyone is working on improving the ping between europe and labs? [14:08:59] valhallasw: It shouldn't be any worse than usual transatlantic. What kind of RTT are you getting? [14:10:03] A nice thing we might be able to do is have a tool-eu bastion in AMS. Interactive performance with the grid isn't an issue. [14:10:20] that would be useful, yes [14:10:28] let me get the exact stats for you [14:10:41] from my home it was a median of 400ms, at the top of my head [14:11:11] It's like ~170ms for me, which is kinda normal [14:11:12] Homes are on NFS, though, which means that while interactive RTT will be good, filesystem speed will be slow on the bastion (but won't affect actual jobs) [14:11:33] 400ms is teh suck. I'd appreciate a tracroute so we can see where your bottleneck is. [14:12:14] sure. just a sec [14:12:17] 400ms is like aws on a bad day :P [14:12:35] it's better today, though. 140-170ms [14:12:52] still ssh feels laggy as hell [14:13:04] laggy as hell could be packet loss still [14:13:16] What we really need is mosh on bastion :D [14:13:22] yes, that would be helpful [14:14:53] http://pastebin.com/xZWMSC12 traceroute [14:15:06] Though with that said, I'm actually pretty impressed with mobile internet these days.... had a rdp session to hongkong over vpn, over a ssh tunnel, over a wifi hotspot to my phone on h+ and it worked fine [14:16:31] valhallasw: That's kinda reasonable for cross-atlantic -- but we live in spoiled time and 200ms feels like ages. :-) [14:16:41] You want mosh? Mosh you get. 
[14:16:56] Coren: cool [14:17:14] Coren: I'm also making some stats to see median/worst case response [14:17:21] Look on the bright side - you don't have to wait 5min for an image to download these days :P [14:18:11] If it's slow often, just leave like mtr running and it will probably just be one hop that's dodgy [14:21:31] valhallasw: {{done}} U can haz mosh. [14:22:14] Coren: note that mosh also needs udp ports forwarded [14:23:03] valhallasw: Yeah, I was looking into which port I was going to dedicate to it. [14:23:10] :-) [14:23:32] it's really silly - they don't want UDP hole punching because that would break roaming... [14:23:55] Meh, I'll just open 60000-61000 for UDP [14:24:03] It /is/ a bastion host. [14:25:11] UDP is a special sort of creature [14:26:20] * Coren tests. [14:27:40] Hm. Fail. [14:27:44] * Coren debugs. [14:28:35] Aha. Was just delay in pushing the filtering rules. [14:29:27] Booo! mosh doesn't show the motd [14:29:34] you mean yay [14:29:35] :P [14:29:51] My motd is teh pretties, and doesn't have sysadmin spam in it. [14:30:06] It has a unicorn! [14:30:11] :D [14:30:13] IPU? [14:30:13] Coren: from my home connection, it was median 245ms, 95% percentile 425ms (i.e. 5% of packages took longer than 425ms). Packet loss was .2% [14:30:40] valhallasw: A bit flaky indeed. Perhaps mosh will help. [14:30:47] * Coren wishes issh had mosh support. [14:30:49] from my university, it was 105ms median, 110ms 95% percentile - but that's pretty much a 100mbit backbone straight into ams-ix [14:31:17] only 100mbit for a uni? [14:31:27] oh, the backbone is more than 100mbit [14:31:33] but my network connection is 100mbit [14:31:42] 110 ms is... pretty damn close to theorical minimum for EMEA <-> NCSA [14:32:28] Yeah, this annoying thing called the speed of light [14:32:52] Speed of light is rather slow sadly [14:33:10] meh, mtr doesn't want to work in my virtualbox [14:33:12] oh crap, my macbook is compiling qt.... we could be here a while [14:35:23] Oooo! 
issh has mosh compatibility as DLC! [14:38:15] pretty graphs anyone? http://i.imgur.com/PGLWMix.png [14:39:05] (this is for 1000 pings at different locations; packet loss is always <.2%) [14:39:47] iSSH is the best money I've ever spent on my ipad. [14:41:05] mosh is much better [14:41:06] thank you [14:41:21] valhallasw: iSSH speaks mosh. :-) [14:41:31] * Coren just tested it. [14:41:46] :-) [14:41:52] now I just have to buy an ipad ;-) [14:42:03] * valhallasw is pretty happy with his lenovo convertible tablet [14:42:16] It's also a nice one. [14:45:09] Coren: BTW, if you need alpha testers for DB replication, ping :-). (Cross-database or not.) [14:50:05] ooh, and with --predict=always it's even better [14:51:17] scfc_de: Noted. :-) [14:51:53] * valhallasw hands Coren brownies [15:04:55] !log tools Opened UDP 60000-61000 for mosh [15:05:19] * Coren forwns at the conspicuous absence of morebots. [15:09:22] BTW, is the automatic backup (every hour/every day/etc.) limited to Tools? [15:09:31] (Re petan's mishap.) [15:09:50] scfc_de: No, anything that uses the NFS server gets it for free. [15:10:02] And Bots uses gluster? [15:10:05] I know there's at least one other project that switched already. [15:10:18] scfc_de: I believe petan is planning an outage to switch. [15:10:32] Okay. [15:11:15] scfc_de: If you're curious, you can look at /home/.snaplist and /data/project/.snaplist to see the active snapshots. [15:17:11] Coren: 20130512.1317 and 20130512.1417 are in .snaplist, but don't exist in .snapshot. [15:18:18] scfc_de: Odd. Usually, one is generated from the other. Lemme check what be up. [15:20:26] And now 20130512.1517 is gone :-). [15:20:37] for DIR in /data/project /home; do while read SNAPSHOT; do [ -d "$DIR/.snapshot/$SNAPSHOT" ] || echo "$SNAPSHOT doesn't exist for $DIR."; done < "$DIR"/.snaplist; done [15:21:04] (Or rather, 20130512.1517 isn't there.) [15:21:48] Odd. It's /there/, it just doesn't get mounted right. 
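(Editor's sketch, not part of the log.) The shell loop posted at 15:20:37 compares `.snaplist` against `.snapshot`. A rough Python equivalent follows, assuming `.snaplist` holds one snapshot name per line (which the `while read SNAPSHOT` loop implies). The names `missing_snapshots` and `report` are hypothetical helpers, not existing tools.

```python
import os


def missing_snapshots(base_dir):
    """Return names listed in base_dir/.snaplist with no directory in base_dir/.snapshot."""
    with open(os.path.join(base_dir, ".snaplist")) as fh:
        names = [line.strip() for line in fh if line.strip()]
    return [
        name
        for name in names
        if not os.path.isdir(os.path.join(base_dir, ".snapshot", name))
    ]


def report(base_dirs):
    """Print one line per listed snapshot that cannot be found on disk."""
    for base in base_dirs:
        for name in missing_snapshots(base):
            print(f"{name} doesn't exist for {base}.")
```

On the hosts discussed here this would be invoked as `report(["/data/project", "/home"])`. Like the shell loop, it touches every snapshot path, so it may trigger the same automount behaviour.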
[15:22:13] That might be autofs being too smart with caching for its own good. [15:22:29] One doesn't usually automount all the snapshots in a loop. :-) [15:22:41] Could be. Well, I'm strange :-). [15:23:33] scfc_de: Yeah, I think it's being silly and caching negative hits. [15:23:34] (Though I don't like time-based backups vs. a proper VCS very much; I can usually type "rm -Rf" faster than any fs snapshot is taken.) [15:23:50] scfc_de: It's actual snapshots. [15:23:59] scfc_de: time-consistent. [15:24:59] Coren: But at most hourly. So any change done in the past seven minutes would be gone. I usually prefer "git commit && git push somewhere-offsite". [15:26:15] scfc_de: Well, yeah. Timetravel is a last resort. [15:44:53] How does mosh reduce lag? Is it just the pre-echo of the keystrokes? [15:46:05] scfc_de: it predicts the echo, yes [15:46:10] Coren: neither [15:46:16] it's basically a smart form of local echo [15:46:21] Coren on vdb1 [15:46:34] valhallasw: Okay, thanks. [15:46:40] petan: Ah, then there is nothing timetravel can do for you. [15:46:47] petan: You should switch to NFS soon. [15:46:57] scfc_de: mosh -a tools-login.wmflabs.org is even better, because it forces prediction on [15:47:01] I am quite busy fixing bot atm [15:47:07] otherwise you keep the ~100ms lag before the prediction turns on [15:47:27] scfc_de: mosh also does screen painting rather than stream-of-text. [15:47:34] (a la screen) [15:48:28] Must admit, never used screen :-). What's screen painting? (And my brain is pretty good at predicting echo :-).) [15:49:05] scfc_de: It synchronizes what a terminal should look like after the tty I/O rather than actually send the I/O. [15:49:39] Ah, okay. [16:04:00] Coren: Could it be that the gluster -> NFS switch mangled hard links? (Nothing too bad, just noticing that one of my local git clone is corrupt.) [16:06:34] Look at ~scfc/src/wikilint/.git/objects/71/63e75db5125e984c33e3bcf771177a0c4e50a4 for example. 
[16:06:44] "---------T 1 root root 0 Apr 9 20:36" [16:07:02] * addshore waves [16:09:27] Hello people, just checking if there are any active issues with Lab's bot's project that any of you know about before I start moving continuitybot on to labs [16:10:11] JasonDC|BNC: "Bots" project or "Tools" project? If you don't know the difference, you want "Tools". Do you need replicated databases? [16:10:14] scfc_de: No, but it won't have fixed existing corruption on gluster though. [16:10:15] move it to tools not bots [16:10:31] Bots, I don't need db's [16:10:54] we really need to rename it :/ [16:11:05] JasonDC|BNC: That's not the difference :-). [16:11:07] JasonDC|BNC: Is that a tool that is in active use? If so, you almost certainly want tools. [16:11:11] oh... [16:11:38] bots is the project for experiments and Mad Science. :-) [16:11:40] (haven't worked on labs for a few months :P) [16:11:58] We need to update https://wikitech.wikimedia.org/wiki/Help:Move_your_bot_to_Labs. [16:11:59] JasonDC|BNC: What's your wikitech username? I'll add you to the project. [16:12:03] Jasonspriggs [16:12:44] scfc_de: That this is desperately out of date it's not even funny. [16:12:57] Help:Move your bot to Labs should just point to https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help [16:14:17] buh: It does now. [16:14:24] :) [16:14:34] should probably put a note on https://wikitech.wikimedia.org/wiki/Nova_Resource:Bots also [16:14:54] Coren: any joy with that php script ? ;p [16:16:00] is having puppet configured for the tool required yet or just reccomended (better question actually, is puppet fixed?) [16:16:05] buh: Very little. PHP is not friendly to worker threads, and my understanding of how it manages process is - at best - flawed. [16:16:30] JasonDC|BNC: Puppet is fixed, but given the setup on tools is not useful or necessary as a way to set it up. 
[16:17:05] alrighty [16:17:12] JasonDC|BNC: Put your tool on project storage and its code under source control; the project storage works from the entire project. [16:17:36] hehe, I have never managed to make anything pretty and threaded with workers [16:17:39] (I.e.: tools aren't set up with new instances from scratch -- it's a much easier model to work with) [16:17:46] will do once added to the project [16:18:15] buh: It's about 20 lines of perl. Perl rules. :-) [16:18:36] JasonDC|BNC: {{done}} [16:19:00] and are cronjobs allowed or does this new grid engine work? [16:19:01] thanks :) [16:19:13] JasonDC|BNC: Cheat sheet -> https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help [16:19:30] JasonDC|BNC: Grid engine works: http://tools.wmflabs.org/?status [16:19:55] alright, awesome, ill start the port later today [16:20:16] JasonDC|BNC: I'm idling here almost all the time, don't hesitate to poke me if you need help. [16:20:29] will do :P [16:26:27] is there a quick way to view puppet classes that are applied to a host? [16:26:27] im going to try and butcher something together for my code now Coren :) [16:27:21] valhallasw: Not really; the catalog is built partly from manifests, partly from ldap. I can probably tell you if you ask me which host though. [16:28:14] I was generally browsing the tools* configs [16:28:25] in this specific case, I was wondering about role::labs::tools::webproxy [16:30:10] role::labs::tools::webproxy -> toollabs::webproxy, but I can't find that one [16:30:29] valhallasw: Ah. What you want is the toolabs module. [16:30:41] valhallasw: modules/toollabs/manifests/* [16:30:56] ah, under modules. I see :-) [16:31:06] great, thanks [16:32:48] valhallasw: It's partial, though, I'm still catching puppet up to the actual config. [16:34:32] Right :-) Basically, I'd like to move the pywikipedia website & nightly generation to labs at some point, so I was wondering about the vhost config. 
[16:35:38] Coren: perfect, I will execute a deply script which will create a thread for every lang [16:35:40] valhallasw: That one's not there yet. :-) But http://tools.wmflabs.org/peachy/wiki/index.php?title=Main_Page so we know it works. [16:36:01] it will wait for that lang to finish and close the thread, this total process might last a day or so depending on how many there are for the lang [16:36:14] BRB [16:36:20] Coren|Food: heh, cool. [16:45:39] is it just me or is everything going reallllllly slowly :/ [17:04:46] * Damianz pushes the turbo button [17:09:57] :/ [17:41:04] Coren: Re per-tool descriptions on wiki, Nova_Resource:Tools/Tools/$TOOL would probably be "logical", but looks repetitive. Do you have another proposal, or should I go ahead with that and later move my page when another structure is put in place? [17:41:44] scfc_de: It looks a little funky, but it seems to be the most sensical. We'll move if we get some brilliant rename idea later. [17:41:54] Coren: k [17:45:14] It would be funky to have redirects from tools.wmflabs.org/$TOOL to Nova_Resource:Tools/Tools/$TOOL by default [17:45:51] but that's funky in the music sense [17:47:11] valhallasw: And how do you reach the tool's page then?! :-) [17:48:04] scfc_de: only if that page doesn't exist yet :-) [17:48:48] I'd rather let the tool maintainers choose to make redirects or put a link in their .description as appropriate. [17:49:25] a default index.php with header("location: ...") would be my suggestion [17:49:49] but it's not something I feel strongly about :-) [17:50:03] I'd rather there not be indices for tools that have no web component so they stand out in the list. [17:50:29] Hm, yes. Maybe a second column 'wiki page'? [17:50:49] but indeed, then just adding it to the .description is a sensible way [17:50:57] And more flexible. [17:51:22] But what about mine, where there are a whole bunch of small tools clustered together? 
[17:57:12] Matthew_: Well, you can link to a wiki page enumerating them from your .description [17:57:43] Huh, OK. Or in the tools themselves. [18:22:05] Coren: is there an official process to get packages installed, or is the process 'nagging you'? [18:22:26] In the latter case, I offer a brownie for installing tig on tools-login :-) [18:22:48] valhallasw: Me or petan, but if we're not arround opening a bugzilla for it will also do in a pinch. :-) [18:27:16] valhallasw: {{done}} [18:27:24] Coren: thanks! [18:48:00] hm, shouldnt SGE .eXXXX files be deleted automatically? [18:49:21] .eXXX? [18:49:39] Ah, default output files. No. [18:49:59] But you probably want to use jsub on tools rather than qsub directly. It has saner defaults. [18:50:01] man jsub [18:51:36] Coren: I'm porting tools from the toolserver, so qsub was the logical choice [18:51:55] does jsub use the same config-in-the-script-file system? [18:52:32] well, let's just try [18:52:38] It's a strict superset of qsub (in fact, it's just a wrapper that provides some appropriate defaults for most common cases) with functionality for -continuous and -once [18:53:28] except it seems to ignore #$ lines [18:53:49] Ah, it might because it wraps things around itself as well. [18:54:09] So yeah, if you've already tweaked for qsub, you might stick with it. [18:54:25] ok [18:54:46] Caveat: memory request is done with h_vmem as a resource [18:54:57] And the queues are 'continuous' and 'tasks' [18:55:02] instead of virtual_free [18:55:27] Right. Tools has hard allocation. [18:55:51] OK. If I specify them both, it should work on both the TS and labs, I think [18:56:05] I don't need 256M anyway :-) [18:56:27] If you use qsub directly, there is no default so you have to specify one. [18:56:56] that's strange. 
The job completed successfully [18:56:59] except for the stray file [18:57:38] AFAIK, gridengine never deletes output files by default; that's something they added on the TS [18:57:50] ah, that would explain [18:58:01] ... which would be very nice to have on Tools as well. [18:58:32] Personally, I'd be very much annoyed by this - if I have a job that failed, I want to keep the output around until I get to see it. [18:59:28] Coren: stdout should be retained, stderr should be removed unless it exits with a non-zero return code or has output there [18:59:42] I'm talking about *empty* output of *successful* jobs. [18:59:42] that's at least the behavior I would expect [18:59:56] and the same for empty output, indeed :-) [19:00:09] File a bugzilla, I'll look into it. [19:00:12] k [19:00:22] It's not a bad idea, but tbh not a priority. :-) [19:01:35] There should be nothing on the table between you and replicated databases ATM :-). [19:03:55] Coren: *grin* if I don't set any memory limits, the script runs, if I do, it gets killed. [19:04:09] o_O [19:04:41] Oh! I knew I forgot something. I disabled the default some time back to test something. I'll have to reenable it. [19:05:10] Set it to 1G and look at the actual usage. It uses hard /vmem/ allocation, so it's probably higher than you think. [19:05:52] Coren: as in: this takes shared libraries into account, while the toolserver one doesn't? [19:06:11] valhallasw: Right, to prevent even the possibility of an overcommit. [19:07:45] wm-bot, help [19:08:46] @add #wikimedia-affcom [19:09:18] Thehelpfulone: Isn't petan still working on resurrecting wm-bot? [19:09:33] scfc_de: He is, last I heard. [19:09:44] yes petan is [19:09:45] hmm, should I not use it whilst he does? [19:09:50] hey petan :) [19:10:12] it doesn't work so I don't really care [19:10:17] !ping [19:10:22] it has only log module loaded [19:10:29] it is logging channels that is all it can do now [19:12:05] That's pretty much :-). 
[19:13:00] I thought wm-bot had moved to tools [19:13:16] nope [19:26:59] hm, is there a way I can check why my job is not being run? [19:27:21] valhallasw: It cancelled out or is it just sitting in the queue? [19:27:40] just sitting [19:28:07] I can see the resources I requested using qstat -r, but that doesn't show me if the resources are the issue [19:28:25] That might be me, I'm in qconf atm. It will unwedge in a minute if I'm right. [19:28:55] aaah, OK, no problem. [19:33:37] Coren: is the suggested SGE submit cron-host tools-login? [19:33:54] i.e. the host that should be used to submit SGE jobs via cron [19:34:06] valhallasw: ATM, yes. [19:34:13] OK. [19:34:14] valhallasw: Yes. I'll eventually put a distributed cron in place for redundancy, but you'd still edit the crontabs from -login [19:34:31] OK, cool. [19:34:51] then my bot is now running redundantly on both Tools Labs /and/ the toolserver :-) [19:35:15] valhallasw: Does your bot actually run at interval or is it meant to run continuously? [19:35:41] at midnight every day [19:35:45] * Coren nods. [19:36:12] it updates the dutch to-be-deleted-pages-page with a new daily sheet [19:36:26] WP:AFD is the enwiki term [19:37:17] If you need to launch on midnight CE(S)T, there's a snippet somewhere ... [19:37:51] Incidentally, if you managed to tweak your script to work on both the TS and TL, you might want to put notes down somewhere to help others who may want to do the same? [19:38:49] Coren: *nod* [19:39:39] valhallasw: You know you can add the other maintainers to the service group on wikitech? [19:40:18] there's a button somewhere there, yes [19:40:20] If I read "Op dit moment hebben Erwin, Akoopal, Multichill, Valhallasw en Siebrand toegang tot het account." right. 
:-) [19:40:37] none of the other ones have a labs account yet, afaik [19:40:58] although siebrand probably has one [19:41:02] I'm pretty sure Siebrand has one, I'll add him to tools [19:41:45] valhallasw: '[ "$(TZ=:Europe/Amsterdam date +%H)" = "00" ] && echo Do something on midnight CEST.' was what I was looking for (scheduled at 22 and 23 hours UTC). [19:42:40] scfc_de: ah! I just run the job hourly and do that in python :-) [19:42:42] valhallasw: I've added Siebrand, so if you add him to the service group you'll have redundancy. [19:42:46] Coren: cool, thanks [19:43:11] Will you be at the Hackaton in Amsterdam? [19:43:23] valhallasw: There are many ways :-). [19:44:04] Coren: yep [19:49:50] ok, I have cleaned up https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools somewhat [20:22:24] valhallasw: Much cleaner [20:24:18] I like the grid status page, but /status/ would be much cleaner than ?status [20:26:18] Damianz: "system" parts of the server don't use /foo/ because that's /$tool/ :-) [20:26:39] status could totally be a tool for tool status [20:26:44] yay small modular part unix design [20:30:12] * Coren fiercely hates Impress. [20:31:01] Impress is a true POS [20:31:14] Our CTO has a major hard on for Libre too, sadly [20:31:39] I don't mind Libre in general; but I need something less sucky for slides. [20:33:08] It's usable but not slick imo and doing complex stuff like macros/pivot tables/nice slides is hard and the compatibility is stupid, totally screws most office docs [20:33:17] "Semantic MediaWiki includes an easy-to-use query language" <-- lies. 
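(Editor's sketch, not part of the log.) The 19:41:45 shell snippet gates a job on local midnight in Amsterdam, and the 19:42:40 reply mentions doing the same check in Python from an hourly job. One way that check might look is shown below. This is an assumption about the approach, not the bot's actual code; `is_local_midnight_hour` is a hypothetical name, and the sketch needs Python 3.9+ with time zone data available.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def is_local_midnight_hour(now_utc, tz_name="Europe/Amsterdam"):
    """True if the timezone-aware datetime now_utc falls in hour 00 local time."""
    local = now_utc.astimezone(ZoneInfo(tz_name))
    return local.hour == 0


def run_if_midnight(job, now_utc=None):
    """Call job() only during the local midnight hour; return whether it ran."""
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    if is_local_midnight_hour(now_utc):
        job()
        return True
    return False
```

An hourly cron entry could call `run_if_midnight(main)`; the check passes at 22:xx UTC during CEST and at 23:xx UTC during CET, matching the two UTC hours mentioned in the shell variant.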
[20:33:23] Pfft [20:33:32] On another note, since when did Libre use gerrit [20:33:45] valhallasw: Everytime I try and write a Semantic related page, I want to stab people [20:33:54] Damianz: oh, good, so it's not just me [20:34:00] I like SMW [20:34:12] I like the *idea* [20:34:19] Coren: You are an evil, masochistic bastard [20:34:22] I *loathe* the search functionality [20:34:33] I'm trying to get a list of projects I'm a member of [20:34:38] that should be a *trivial* query [20:34:43] Last place I was in, my whole park self-updated an SMW with data like power, use, etc. VMs categorized in their clusters, and so on. It was *great* [20:34:57] The principle of SMW is good, but why the hell use a wiki to do that functionality... it's like as insane as creating an openstack manager in mediawiki [20:35:10] Damianz: if all you know is a hammer... [20:35:13] Damianz: No comment on that latter analogy. [20:37:22] Having the ability to do queries on wiki like (how much power is used in the left strip) and (what VMs are hosted on the servers) on the Rack's page was... fun. [20:38:17] I can see how that could be useful, but the query to do that sort of thing (esp in templates) gets insane [20:38:29] Now if only SMW did SQL ;-) [20:38:33] And for that sort of data, relational databases ftw just because they're truly relational [20:39:50] True, but having it in the wiki means it can live cleanly alongside notes, documentation, diagrams, what-have-you. It took a while to get {{rackinfo}}, {{serverinfo}}, etc right but it payed off once they were tuned. [20:40:08] oh, I need a DOUBLE COLON. of course. [20:40:20] >_< [20:44:18] service groups are not SMW'able? [20:46:01] Huh. I don't think Andrew thought of that. They should be. [20:51:30] ah, ok. [21:01:09] Coren: I added some notes at https://wikitech.wikimedia.org/wiki/User:Magnus_Manske/Migrating_from_toolserver#qsub_et_al [21:01:44] why isnt the help page in the topic? 
[21:02:01] Because you haven't added it yet Betacommand [21:02:15] Krenair: I cant find it [21:02:16] Betacommand: Well, this is a channel for the whole Labs, not just the Tool Labs. It might actually be worthwhile to make one, in fact. [21:02:41] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help [21:02:57] Coren: not sure how helpful splitting it up further is, tbh [21:03:07] Yeah, I suppose you're right. [21:04:10] Maybe putting it even more strongly: I think it might make sense to explicitly name tools labs, as it will probably be the project with the broadest user base [21:04:52] Coren: how does one find what jobs that are active under the longrun queue? [21:05:07] In general or just yours? [21:05:41] mine [21:05:48] i tried qstat [21:05:50] qstat -q continuous [21:05:59] thanks [21:06:09] Oh, you want your tool's perhaps. [21:06:19] Doh [21:06:20] Betacommand: you can also check http://tools.wmflabs.org/?status [21:06:29] qstat -u local-betacommand-dev -q continuous [21:06:46] Coren: I forgot the become :P [21:06:57] :-) [21:09:16] Coren: its always the small stuff [21:09:30] Betacommand: Stalkbot is impressively consistent in its memory usage. Were all tools that well behaved. :-) [21:10:06] Coren: that bot has been almost unchanged since 2006 [21:10:41] Coren: how is replication comming? [21:10:57] still on track for the end of the month? [21:10:57] Next week's sprint is the last step. [21:11:22] So we might even have the luxury of a week of shakedown. :-) [21:11:36] then I might even be able to start the migration process [21:20:15] * Damianz wonders if Coren is a fan of scrum or kanban methods [21:56:16] Damianz: I'm a dinosaur. Pretty much by definition I have a reflexive dislike of management buzzword fads. [21:56:32] lol [21:57:03] I like the idea of working off boards, but sadly people use them to micro manage teams which is just annoying [22:00:56] I survived six sigma, matrix management, XP, a bastard North-American cousin of 5S, TQM. 
I expect I'll weather Agile and kanban like the others. :-) [22:01:28] I'm a sysadmin. The rule is, "Don't try to manage me, and the job will get done." :-) [22:02:04] heh [22:03:38] Personally I try to drive flat management in engineering orgs as the best ideas and work tend to come from the bottom up, dictatorial management in it never ends well [22:04:47] The solution in engineering is sooo simple. Make sure everyone that does work has the authority to match their responsibilities. Being tasked to "fix X" if you don't have the resources or authority to actually fix it is the death of any engineering department. [22:05:17] If you tell someone "Make X" then they need to be allowed to decide how X is to be done. [22:05:57] IMO, Erik did exactly the right things with the Labs+me. He said: "Here are objectives, here are resources. Make it work." [22:06:17] Of course, you can't do this with someone too junior. [22:07:21] Well, you can - and should - but not at project scale. [22:08:32] The senior delegates junior-sized pieces, and assigns. But -- and there's the trick -- the junior has to be allowed to reach their own solution. (Good management skill is finding the right size pieces, not in telling how they are to be dealt with) [22:09:07] But that also needs a culture where going back up the chain and saying "I don't know how to do this" is encouraged, not punished. [22:09:45] Hah! Ima start a new management fad! "Divide and Conquer". [22:09:59] * Coren googles it. Something like this probably already exists in a bad form. [22:11:28] Hm. The term is used, but as a weapon. :-) [22:12:01] lol [22:12:38] Silly MBAs. You divide and conquer the /problems/, not your /staff/! [22:12:43] Junior people defo need to make their own solutions, let them fail in isolation, learn and they will progress themselves [22:13:35] Yes, but you have to encourage reaching upwards when they hit a snag.
There is no shame in "I ran into a problem, do you have an idea?" [22:15:21] indeed [22:15:47] I love watercooler conversations for that though... even if the inspiration is not always tied to your immediate problem [22:16:13] Depends on culture I guess.... which sucks in a lot of places [22:16:25] I've seen both extremes. [22:16:55] I haven't actually worked /at/ the office, but even then it's pretty clear WMF is on the good side of that slider by a bit. [22:17:42] I'm of the opinion you should be able to work anywhere and the experience is the same - I think GitHub are doing great things in the space of reducing friction and improving lifestyle as well as productivity [22:20:36] Perhaps, but the "feel" isn't quite the same either. [22:20:57] There are pros and cons to being an extrinsic opsoid contractor. :-) [22:23:12] pros like getting to work from bed and crons like lack of banter? ;) [22:24:35] Pretty much it. I'm a smoker, so being able to smoke at work is a big plus. :-) [22:25:16] And the schedule is hard to beat. [22:26:43] Yeah still smoking would suck at work... though I maintain my caffeine intake so spend half the day grinding coffee instead [23:04:57] Coren, you here? [23:07:37] JasonDC: Ayup. [23:08:25] so im getting this cool error [23:08:25] http://fpaste.org/11743/83999091/ [23:08:49] Did you just create the service group? [23:08:54] like 2 hours ago [23:08:57] o_O [23:09:07] * Coren checks. [23:09:43] Balls, so to use an external ca for puppet I've got to offload the cert checking :( [23:10:09] Gah. That bug again? There is something funky in the OpenStack wiki extension that occasionally creates the service group with the wrong home. [23:10:17] I thought that was gone. [23:10:25] well... i think its in the preform thing [23:10:32] Delete and recreate the group; in a minute or two it'll be all fixed. [23:10:39] preform?
[23:11:00] ya, because its looking for /data/project/local-continuitybot [23:11:13] and NONE of the directories in /data/project start with local- [23:11:37] I know, $HOME is set wrong. That will have pretty much broken the service group when creating the auxiliary stuff. [23:12:22] It happens about 1:10 right now. I need to push andrew in a corner until he tracks down that bug and fixes it. :-) [23:12:37] kk, recreating group [23:13:48] "projectadmin role required" [23:13:58] Ah, to delete. Right. Gimme a sec. [23:13:59] to delete the group [23:14:02] thx :) [23:14:14] Coren: Btw, what do you think of using salt for real time creating -> ok/failed rather than cron or such? [23:14:23] * Damianz notes we don't actually use salt at all, but it would be cool [23:14:48] JasonDC: All clear. [23:15:06] Damianz: It's a good idea in principle, but it needs a bit of work to do which nobody has the time to do right now. :-) [23:15:16] Damianz: It's the intent in the medium term [23:15:29] Same problem as there's always been with salt sadly [23:15:57] But medium is good, because that would be cool and is better for like not getting into weird diverged states [23:18:03] JasonDC: I see it got created right this time. [23:18:18] indeed and i was able to preform up to it :) [23:18:35] will report if any further issues ensue [23:18:53] oh! does the cluster have the latest java jdk on it? [23:21:39] It's a Java 7, dunno about latest. [23:22:01] 7u21-2.3.9-0ubuntu0.12.04.1 [23:29:52] Coren, slight other issue, the /data/project/continuitybot/ directory is owned by root now so i can't quite do the file things :P [23:32:55] JasonDC: Should be all fixed now. Silly caching of uid. :-) [23:33:21] JasonDC: You'll have to exit any shell you might have had with the tool user and become again though.
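(Editor's sketch, not part of the log.) The two symptoms JasonDC hit, a $HOME that keeps the local- prefix and a tool directory owned by root, could be checked with something like the Python below. It is a rough illustration only: the /data/project root and the local- prefix are taken from the conversation, while the function names and the idea of passing the owner in as a string are hypothetical and do not describe any actual Labs tooling.

```python
import posixpath

# Both values come from the conversation above; everything else is assumed.
PROJECT_ROOT = "/data/project"
GROUP_PREFIX = "local-"


def expected_home(service_group: str) -> str:
    """Return the home directory a service group would be expected to use.

    Assumption drawn from the log: directories under /data/project do not
    carry the "local-" prefix that the service group name has.
    """
    name = service_group
    if name.startswith(GROUP_PREFIX):
        name = name[len(GROUP_PREFIX):]
    return posixpath.join(PROJECT_ROOT, name)


def home_problems(service_group: str, actual_home: str, owner: str) -> list:
    """List the problems seen in the log for a given (hypothetical) state.

    actual_home is what $HOME is set to; owner is the user name that owns
    the tool directory. Both are passed in so the check stays a pure function.
    """
    problems = []
    want = expected_home(service_group)
    if posixpath.normpath(actual_home) != want:
        problems.append("HOME is %s, expected %s" % (actual_home, want))
    if owner == "root":
        problems.append("%s is owned by root, not the tool user" % want)
    return problems


if __name__ == "__main__":
    # The broken state described by JasonDC, before the group was recreated.
    for line in home_problems(
        "local-continuitybot", "/data/project/local-continuitybot", "root"
    ):
        print(line)
```

Keeping the check free of filesystem calls means it can be tried anywhere; on a real host the two inputs would have to be read from the environment and the directory's metadata.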