[05:55:23] Hey! [05:55:40] Anyone around know how to work out a tools-login error on PuTTY/WinSCP? [05:55:48] hi TCN7JM [05:55:55] Hey. [05:56:01] I don't, sorry, but petan might [05:56:31] or the labs-l mailing list might. https://lists.wikimedia.org/mailman/listinfo/labs-l [05:58:16] or Coren|Away might, but ... still away [06:46:48] hi I am here :o [06:46:54] ew [06:47:23] o_O [07:17:52] https://wikitech.wikimedia.org/wiki/User:Legoktm/wmflib [07:18:02] any thoughts? [08:09:12] legoktm: cool, but I already use bash script to do these things :D [09:37:44] TCN7JM oh your here ;p [09:37:48] petan: around? [09:37:49] lol yeah. [09:38:14] hi [09:38:17] TCN7JM: whats your labs name? [09:38:30] TCN7JM / shell: tcn7jm [09:38:57] petan: can you make sure the above users key is everywhere it should be? :D [09:39:03] ok [09:39:07] having troubles getting to tool-login [09:39:13] what kind of troubles [09:39:20] key troubles :D [09:39:47] I see 1 public key there for him [09:39:53] Yup. [09:40:02] hmm [09:40:12] right im guessing your public key is in the wrong format then :> [09:40:23] seems correct to me [09:40:29] hmm [09:40:32] It's in SSH-2 format. [09:40:36] TCN7JM: http://askubuntu.com/questions/204400/ssh-public-key-no-supported-authentication-methods-available-server-sent-publ [09:40:47] TCN7JM can you try ssh -vvvv [09:40:54] are you on linux right? [09:40:57] petan: putty ;p [09:40:59] * TCN7JM cries. [09:41:04] * TCN7JM is on Windows 8. [09:41:05] ok [09:41:08] I had a similar problem and I think it was because of the format of my key ;p [09:41:17] are you sure you are providing key to putty? private one [09:41:21] all because of silly puttgen [09:41:30] you need to also convert it to putty format [09:41:42] If I'm hearing you right, I need two keys. One public, one private. [09:41:43] Right? [09:41:49] yes [09:41:56] public go on server [09:42:00] private is on your pc [09:42:08] I'll quick check to see if I didn't accidentally use the same key. 
[09:42:33] * TCN7JM puts on Daft Punk. [09:42:34] that wouldn't work [09:42:42] :> [09:44:51] hehe [09:45:07] Hey, Daft Punk is awesome. [09:45:27] Anyway, I switched to a completely different private key just in case I was using the same key for public and private, and it did not work. [09:46:02] if you switch to different private key you need to update the public one [09:46:10] these keys are creating a pair [09:46:26] that is how it work [09:46:40] + putty requires some special version of private key [09:46:52] it's quite complicated on windows [09:46:55] if pagent accepts the key it should be fine :0 [09:46:56] :) [09:47:06] I don't know windows very much [09:47:12] IT SUCKS! [09:47:26] no I don't think so, it's just not well supported on them [09:47:29] But I couldn't order my $300-off Dell without Windows. [09:48:28] petan: Generating new pair. [09:49:58] No dice. [09:51:00] I have no idea wtf is wrong. [09:51:49] Waaaait, lemme check something. [09:53:12] Never mind. [09:53:15] That did nothing. [09:53:44] Hold on... [09:54:35] I keep thinking I have wrong but it does nothing. [09:57:25] I got it!!! [09:57:30] For some stupid-ass reason! [09:57:34] I forgot to save my public key! [09:57:36] I'm a dumbass! [09:57:48] [09:57:52] Alright, done with that. [10:01:00] :> [10:01:15] * TCN7JM slaps TCN7JM around a bit with a large trout. [10:09:48] @notify Coren|Away [10:09:48] You've already asked me to watch this user [11:24:04] I asked this on wikitech wiki as well, but I'll probably get an answer faster here :). In Amsterdam it was mentioned that if you want to use a GUI for interfacing with the database you should run one on your own machine [11:24:15] what settings should I use to connect to the database? 
[11:27:44] henna I managed to do that once, but it was very complicated [11:27:58] which database you talk about [11:28:40] it involved a lot of port forwarding, other option would be to make mysql publicly accessible but that isn't secure at all [11:37:43] I still think we should just have something like phpmyadmin [11:38:16] even if you have to forward port 80 for that, it would be an improvement [11:54:19] projectdatabase [11:59:04] select/insert/delete/update is no problem from commandline for me but CREATE I tend to mess up :) [12:01:01] henna you mean like enwp? [12:01:10] in that case it goes beyond me [12:01:21] I am just a labs guy [12:01:52] petan, wasn't S7 supposed to be ready by now. I'm getting killed with comments. [12:02:00] again [12:02:01] I am just a labs guy [12:02:14] I can't do anything about S* servers [12:02:26] petan, you are now Coren|Away's replacement. :p [12:02:36] for labs [12:02:41] for database you need asher ;) [12:02:53] Who is never here. [12:03:04] @seen binasher [12:03:05] petan: Last time I saw binasher they were quitting the network with reason: Quit: binasher N/A at 6/22/2013 12:47:09 AM (1.11:15:55.6251110 ago) [12:03:15] he was on friday [12:05:12] oh this is nice [12:05:21] SQL Manager Lite for MySQL understands SSH tunneling [12:05:50] just not in the way we need :-( [12:10:29] henna: https://wikitech.wikimedia.org/wiki/File:Mysql_from_windows_to_tools.pdf [12:12:11] valhallasw: Does tunneling not work at all? I tried to set up one (on Linux) that gets forwarded, but I get access denied. [12:12:23] scfc_de: it works for me [12:12:59] valhallasw: "ssh -v -L 3307:enwiki.labsdb:3306 tools-dev.wmflabs.org" + "mysql --host=localhost --port=3307 --user=u2267 --password=verysecret enwiki_p" = "ERROR 1045 (28000): Access denied for user 'u2267'@'localhost' (using password: YES)" [12:13:37] oh, I tried toolsdb [12:13:45] but your username seems wrong [12:13:57] that should be some weird 15-character string, I think? 
[12:15:19] scfc_de: try adding --protocol=TCP [12:15:26] That's the one for the replicas. It works on tools-dev: "mysql --host=enwiki.labsdb --port=3306 --user=u2267 --password" = "Welcome to the MariaDB monitor." [12:15:28] otherwise it will connect to your local mysql server [12:15:49] port number is ignored if protocol=sockets [12:16:36] valhallasw: *Argl*. That is just so wrong. Thanks, it works now. [12:17:03] scfc_de: I only noticed it because I had no mysql server running [12:17:17] so I got ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) [12:18:31] petan: hm, you created [[User:Lfaraone]] and despite my username being [[User:LFaraone [12:18:45] ]], I got an echo "you have a new talkpage message" email [12:20:27] lfaraone I noticed [12:20:45] it was User_talk :P [12:20:56] yes, sorry. I didn't know echo was case-insensitive. [12:21:01] no problem [12:21:36] this is a stupid feture of mediawiki [12:21:54] you can't have user bob and Bob but it is case sensitive for user pages anyway [12:27:26] all I want is [[User:lfaraone]]. But noooo. [13:51:30] Coren|Away can we deploy new take? :o [13:51:42] I already fixed all the issues you found [13:52:15] petan: why are you in such a hurry O_o [13:52:28] because I want to see it in action :P [13:52:44] you want to see how quickly tools labs can be taken over? :P [13:53:02] that's really not the right attitude for security engineering :p [13:53:09] not possible, my take is even more secure :3 [13:53:17] try it out on toolsbeta [13:54:40] Has Coren done a full security review yet? 
[13:56:39] no idea [13:56:45] but I did :3 [13:57:01] it's cute and secure [13:57:36] and four times more code running as root for any user [13:59:12] why don't you try to hack it, if you are so suspicious [13:59:42] petan: again "you can't hack it" is not a sensible approach to security engineering :P [14:00:48] and I happily follow the 'don't write suid programs yourself' approach [14:00:50] read the source code? find a bug? :P [14:01:04] lol so you let other people write them? [14:01:28] don't trust anybody FTW [14:01:28] so I use stuff where I can be certain people who are more knowledgeable than I am about security engineering have reviewed it, yes [14:01:29] :) [14:01:54] yes like NSA precompiled version of kernel with lot of selinux and surveillance code :P [14:04:21] petan: but seriously, just have some patience until coren has done a security review. If he finds no issues, at least two people could not think of anything ;-) [14:04:45] that's why I pinged him while aho [14:04:46] ago [14:07:17] petan, Coren|Away, couldn't take run with sgid instead of suid permission? after all, it just needs to run as the relevant user group [14:12:06] valhallasw: Doesn't it need to set the owner? [14:12:54] ^ [14:13:06] it must run as superuser, in order to change the owner [14:13:07] petan: Why should Coren review code if we already have a tool that has been reviewed by him? Just put your additions in a patch so he doesn't have unnecessary work. [14:13:23] my patch is that new take [14:13:39] I basically took his code and wrapped it into this bigger project [14:13:49] just check both source codes, you would understand [14:16:21] my continuous job died :( [14:16:37] petan: I did. [14:16:58] started with `jstart -N php_dispatchRC_zhwiki -mem 1g $HOME/mw/labsDispatchRC.sh` in liangent-php tool initially [14:17:15] liangent that happens. 
[14:17:29] that is why I recommend to cron auto-check if it's running and restart when necessary [14:17:56] liangent: Do you have a job ID for that? Then petan could look at the exit status. [14:18:08] petan: what's the entire point of jstart/jstop then?! [14:18:18] valhallasw idk :o [14:18:28] we have a job cluster with rescheduling to make stuff like that work [14:18:31] scfc_de: no job it but has job name [14:18:32] valhallasw just kidding.. [14:18:41] jstart jstop is like service xx start / stop [14:18:45] it start / stop a service [14:18:53] it doesn't guarantee it will never die [14:19:07] linux service commands don't do it either [14:19:14] petan: so I should do what I did on toolserver? [14:19:23] liangent: What was the job name? [14:19:28] liangent I have no idea what you did on toolserver [14:19:34] scfc_de: php_dispatchRC_zhwiki [14:19:46] petan: ^ [14:19:53] sec [14:20:13] petan: keep submitting the same jobs and rely on SGE to screen duplicated (running) ones [14:20:16] petan: err, no? it should only stop when the process exits with a 0 exit status [14:20:56] petan: it's wrapped in a shell script to keep it running, and SGE *should* restart the job if an execution server goes down [14:21:15] valhallasw: I think the current wrapper doesn't deal well when the client script runs out of memory. But let's see the exit status. [14:21:26] my internet is very poor (mobile connection like 5kb/s) my responses will take a bit [14:21:29] scfc_de: ah, yes, that sounds reasonable [14:21:47] scfc_de: if SGE kill the job, it will indeed not restart [14:21:50] +s [14:22:12] liangent that is not a best idea. I do that I create a script which just check if job is running on sge and if it's not it submit it [14:22:38] petan: we have the -once parameter to de exactly that, right? [14:22:45] valhallasw that isn't true. It will stop the process if it exceeds its limits [14:22:48] petan: You don't even trust Coren to do jsub? 
:-) [14:22:55] no :P [14:22:59] I did and it didn't work [14:23:44] I mean jsub is a cool thing, but I was told it will keep my bot alive and it didn't keep it alive, so I had to switch back to my working code... [14:24:15] scfc_de: and my script has a large chance to run out of memory [14:24:24] mediawiki eats a lot of ram [14:24:43] petan: The script will, indeed, not survive and out-of-memory kill. [14:24:58] yes, that is what I meant valhallasw ^ [14:25:43] It could have been worked around but I saw very little point to it: if the script was killed because it hit the limit then it's clearly broken; restarting it might just make things worse. [14:26:14] indeed [14:26:32] Coren for some reason I can't ssh to toolsbeta-exec-01 from any other host than -login o.O [14:26:35] toolsbeta-login [14:27:00] petan: You should be able to from -master too. [14:27:07] Then we need to signal that to the user (qsub -m e?). [14:27:12] Coren: no sometimes it's just due to memory leak, and restarting it works [14:27:25] especially when the previous job has been running for a long time [14:27:33] liangent: By definition, "memory leak" is a bug. :-) [14:27:51] Coren: but not one that, engineering wise, has to be fixed. [14:28:15] 'just restart it every few days' is a reasonable engineering approach in a lot of cases ;-) [14:28:18] Coren: is it possible to change this? I would like to be able to ssh there using tools-login instead toolsbeta [14:28:19] liangent: If you want to be uber paranoid, have your script exit /itself/ if it gets bigger than some limit (under the absolute max, of course) [14:28:27] so that I don't need to ssh to 3 boxes [14:28:33] (through) [14:29:24] Coren: anyway I'm now putting * * * * * jsub -once -N php_dispatchRC_zhwiki -mem 1g $HOME/mw/labsDispatchRC.sh in my crontab [14:29:34] petan: Add the key of tools-login to the known hosts in /data/project/.system/store/ [14:29:43] aah [14:29:44] ok [14:29:44] liangent: That's one per minute. Don't do that. 
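The "exit /itself/ if it gets bigger than some limit" suggestion above could look roughly like the following in Python. This is only a sketch under stated assumptions: it reads the Linux-specific `/proc/self/status` file, the function names are made up for illustration, and the idea that a non-zero exit makes the wrapper restart the job is taken from the discussion rather than verified here.

```python
import sys

def vmsize_bytes(status_text):
    """Extract VmSize (virtual memory size) in bytes from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmSize:"):
            # The kernel formats this line as "VmSize:   123456 kB".
            return int(line.split()[1]) * 1024
    return None

def over_limit(status_text, limit_bytes):
    """True once the reported virtual size has reached the soft limit."""
    size = vmsize_bytes(status_text)
    return size is not None and size >= limit_bytes

def check_and_exit(limit_bytes, status_path="/proc/self/status"):
    """Call between batches of work; ends the process once the limit is crossed."""
    with open(status_path) as fh:
        text = fh.read()
    if over_limit(text, limit_bytes):
        # Assumption: a non-zero status is what tells the wrapper to restart us.
        sys.exit(1)
```

The soft limit passed to `check_and_exit` would have to sit below whatever was requested with `-mem`, so the script leaves on its own terms before the grid engine kills it.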
[14:30:41] well this doesn't even work [14:30:42] liangent: Better yet would be to fix your tool so that it doesn't grow unbounded or, at least, that it self-restarts if it grows too big. :-) [14:31:21] Coren: then go to fix mediawiki itself [14:31:30] liangent: The default cron path is very restricted, you probably want to add a PATH=/bin:/usr/bin:/usr/local/bin directive to your crontab. [14:31:50] liangent: Wait, you're running mediawiki in a cron job?! o_O [14:31:53] my script is a long-running mediawiki maintenance script [14:31:58] Ah. [14:32:12] liangent: But it's restartable? [14:32:32] petan: Did you reach tools-master? :-) [14:32:56] Coren: php $dir/extensions/Labs/dispatchRecentChanges.php --wiki=zhwiki --from=`cat $dir/rc.state` --state=$dir/rc.state [14:33:11] scfc_de sorry I am incredibly busy and my connection suck, what should I do there? [14:33:29] just send me the commands to shorten it :> [14:33:41] liangent: if you set ini_set('memory_limit', '$somevalue') with a suitably limited $somevalue, php with end the script before the job runs out of memory. [14:33:57] liangent: Which means jstart will be able to restart it without difficulty. [14:34:17] Coren: Could you look up the exit status of the last job "php_dispatchRC_zhwiki" on tools-master so that we can confirm that it was an OOM issue? [14:36:07] Coren: mediawiki maintenance scripts have a builtin --memory-limit param. does it work? [14:36:29] liangent: It should. [14:36:37] scfc_de: Good idea. [14:37:13] scfc_de: exit code 137 -> sigkill. [14:37:24] Coren: which run? [14:37:25] scfc_de: Yep. Got OOMed. [14:37:39] I manually killed one recently [14:37:42] check job start time [14:37:56] liangent: All of them since 119712 [14:38:08] ok [14:38:29] That means it got killed hard by SGE; althoug you're right that you asking also gives that result. 
:-) [14:39:08] liangent: But yeat, --memory-limit should do the trick if you find the right value (the mapping between --memory-limit and h_vmem will not be 1:1) [14:40:00] liangent: Start it with a bigger -mem and a --memory-limit to the script then monitor its usage; you should see exactly how much vmem that means quickly enough. [14:40:37] Coren: how can I monitor it? [14:40:40] Filed https://bugzilla.wikimedia.org/show_bug.cgi?id=50053 so that we don't lose this thought. [14:41:49] liangent: http://tools.wmflabs.org/?status for manual inspection; you can also use qstat -j to do it from the command line (look at maxvmem) [14:43:44] Neat. There are three continuous jobs that have been running sinc job IDs <1400. That means they survived complete cluster reboots, two filesystem switches, and at least one NFS hiccup. :-) [14:45:50] Coren: I don't see a maxvmem [14:47:00] Coren: how can I read this report? http://pastebin.com/zym7Pncv [14:47:03] liangent: vmem is the current usage, maxvmem is the maximum to date. If you're using the web tool, it only shows maxvmem separately (as "peak") if it's higher than the current value [14:47:48] liangent: Oh. You want the 'usage 1' line which will only appear once the job is actually running. :-) [14:48:17] liangent: Looks like: [14:48:24] usage 1: cpu=00:01:05, mem=15.07724 GBs, io=0.12783, vmem=264.086M, maxvmem=705.160M [14:49:02] liangent: In your case, the "interesting" values are "vmem" and "maxvmem" [14:49:57] Coren did you see that bug about showing used / available vmem on ?status [14:50:03] it would be so useful :3 [14:50:30] petan: Should be trivial enough to add by just summing up the vmems from the jobs. [14:50:38] currently it display hmem that is useful just as if we could see if fan is running... [14:51:04] Coren yes but the webpage is not in any repository, so you are only one who can implement it [14:51:38] petan: Fail. 
It's in toollabs under www :-) [14:51:44] aha [14:51:55] Coren: well I can't confirm php memory limit until the script died and at this time qstat doesn't work [14:52:00] ok I will try that on beta [14:52:27] and I want to get vmem value seen by qstat *at the time* it's killed by php for memory limit=1g [14:53:28] liangent: you can use '-s z' with qstat to also see recently ended jobs [14:54:51] I can always see it even without -s z (maybe I'm typing it fast enough) but there's never a usage line [14:54:59] liangent: And once https://bugzilla.wikimedia.org/show_bug.cgi?id=48696 is resolved, you can also use qacct without nagging petan or Coren :-). [14:56:47] Coren: what can I do now? [14:57:28] liangent: That's odd. What tool is this running under? [14:57:45] What tool? you mean my script [14:57:59] /data/project/liangent-php/mw/memtest.sh [14:58:06] liangent-php. :-) [14:59:58] liangent: Ah. I see what went wrong: the job never actually ran. [15:00:06] liangent: Check the .err file. :-) [15:00:50] Coren: BTW, shouldn't the HTML not rather go to operations/puppet's toollabs::webserver? [15:00:54] Could not open input file: maintenance/eval.php [15:01:01] Coren: got it. should I file a bug that jsub doesn't guarantee working directory? [15:01:39] liangent: It's not a bug, it's normal qsub behaviour; there is a parameter, '-cwd', to request the wd. :-) [15:02:07] scfc_de: It's cluster-wide, not on the webservers themselves. [15:02:33] Coren: ok and now it's working but still no usage line [15:03:03] I'm not seeing it run. What job number? [15:03:39] 442252 [15:03:54] already died or should I start a new one? [15:04:17] liangent: maxvmem 1.242G [15:04:35] ok and let me try 2g in php [15:04:38] It died so fast gridengine never got to collect running stats. :-) [15:04:49] (every 5s, IIRC) [15:05:02] Coren: Ah, yeah, /data/project/.system/public_html? Hmmm. 
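For reading the `usage 1:` line that `qstat -j` prints once a job is running, a small parser along these lines may help. It is a sketch only: the field layout is copied from the single sample line quoted in this log, the helper names are invented, and treating the K/M/G suffixes as binary multiples is an assumption.

```python
def parse_usage(line):
    """Split a qstat -j 'usage' line into a dict of field name -> raw string."""
    # Everything after the first colon is a comma-separated key=value list.
    _, _, fields = line.partition(":")
    result = {}
    for field in fields.split(","):
        key, _, value = field.partition("=")
        result[key.strip()] = value.strip()
    return result

def to_megabytes(size):
    """Convert a size such as '705.160M' or '1.242G' to megabytes (float)."""
    # Assumption: suffixes are binary multiples (1G = 1024M).
    factors = {"K": 1.0 / 1024, "M": 1.0, "G": 1024.0}
    return float(size[:-1]) * factors[size[-1].upper()]

sample = ("usage    1: cpu=00:01:05, mem=15.07724 GBs, io=0.12783, "
          "vmem=264.086M, maxvmem=705.160M")
usage = parse_usage(sample)
print(usage["vmem"], usage["maxvmem"])
```

Comparing `maxvmem` from such a line against the `-mem` request is one way to see how much headroom a job actually has.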
[15:06:23] Coren: so how can I slow down the line for(;;)$a[]=1; [15:06:49] liangent: Interestingly enough, that shows --memory-limit does work though: the job ended because *php* decided there was too much memory. [15:07:43] What --memory-limit did you put in? [15:07:58] I am back for a bit [15:08:33] liangent: I'd add in groups of, say, 10000 and sleep between them. [15:09:19] Coren: --memory-limit=1g [15:13:48] liangent: Ah, nice. --memory-limit=1g matches a vmem of 1.242G apparently. It's /likely/ to be a fixed overhead. [15:15:09] liangent: So if you run the maintenance script with --mem 1g, a --memory-limit of ~600m should do just the trick. [15:15:22] * Coren needs food, badly! [15:15:59] * Steinsplitter givs coren food [15:16:09] Coren: and maxvmem=266.738M for echo 'for(;;);' | php $HOME/mw/maintenance/eval.php --wiki=zhwiki --memory-limit=1g [15:18:07] Coren: to be safe I'm now using php $dir/extensions/Labs/dispatchRecentChanges.php --memory-limit=4g and jstart -N php_dispatchRC_zhwiki -mem 4.5g $HOME/mw/labsDispatchRC.sh [15:18:26] in the past it was -mem 1g and no --memory-limit [15:18:30] Coren 1) can you review the take source code I fixed all the reported issues [15:18:37] petan: Monday. [15:18:41] Coren 2) can you check why toolsbeta can't display ?status [15:18:56] ouch it accepts integers only [15:19:45] liangent: Yeah, silly qsub. You can ask for 4608M though [15:21:07] Coren|Food for some reason qstat display nonsense on beta [15:21:17] hmm I'm using the longest job name on http://tools.wmflabs.org/?status [15:21:28] HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS [15:21:29] toolsbeta-exec-01.pmtpa.wmflabs lx26-amd64 2 0.02 3.9G 134.8M 976.0M 0.0 [15:21:45] in fact load is 0 memuse 100mb and swapus 0 [15:22:33] * WearyPanda apologizes for being snappy yesterday [15:23:28] petan, you haven't changed the basics substantially [15:23:45] what [15:23:50] can you be more specific? 
[15:24:01] it's still vulnerable [15:24:05] yesterday you told me "it's still vulnerable" [15:24:17] you didn't say a single bit of information why and how [15:24:35] (and you didn't change the FD into FD&, so it still closes descriptors twice) [15:24:42] well, I didn't want to go into detail [15:24:47] where [15:25:06] but the basic issue is what Coren told you about using the full path instead of the folder fd [15:25:07] petan: is there docs on experimenting with puppet on beta? [15:25:14] Commons Delinker is not running for weeks, Maby labs can run the Delinker script on the wmf server? [15:25:26] if you aren't going to go in details you can't expect anyone to solve the problems... neither take them seriously [15:25:45] WearyPanda what kind of docs you mean? [15:25:50] remove all stat and open calls :P [15:26:04] WearyPanda tbh I think we don't have any docs on puppet so far to have some project specific :D [15:26:06] petan: for instance, I'm not even sure where to log in :P [15:26:17] or if i'll have root to run puppet [15:26:18] WearyPanda toolsbeta-login [15:26:30] !toolsadmin | WearyPanda [15:26:30] WearyPanda: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Documentation/Admin [15:26:41] we have these docs for admins :> [15:26:42] petan: Commons Delinker Bot is down vor weeks (removig delete immage across al wikis and replacin renamed fiels aross all wikis). Bryan dos not respond and kr inkle dos not hav the passwort... 
[15:26:44] see, if I call take Makefile, it looks like this:
[15:26:44] lstat("/tmp/take/Makefile", {st_mode=S_IFREG|0644, st_size=3748, ...}) = 0
[15:26:44] open("/tmp/take/Makefile", O_RDONLY|O_NOFOLLOW) = 4
[15:26:45] stat("/tmp/take", {st_mode=S_IFDIR|0755, st_size=260, ...}) = 0
[15:26:45] stat("/tmp/take/Makefile", {st_mode=S_IFREG|0644, st_size=3748, ...}) = 0
[15:26:45] fchown(4, 1000, 4294967295) = 0
[15:26:57] * petan reminds his internet is slow
[15:27:10] I could change /tmp/take/Makefile into a symlink to /etc/passwd just before you opened it
[15:27:11] petan: ah, ok
[15:27:25] s/you/take/
[15:27:47] Steinsplitter you are welcome to run it on labs
[15:27:52] Coren, where's the code for the take used in labs?
[15:28:02] Platonides now I am reading your backlog, WearyPanda hold on, getting to you soon :P
[15:28:11] petan: i am not a tech/programmer :(
[15:28:13] heh
[15:28:15] I think it may be simpler to improve that one than reimplementing it, petan
[15:28:45] Platonides did you actually debug it or is that what you /think/
[15:28:57] Platonides because I don't believe you are correct
[15:29:14] Platonides: https://git.wikimedia.org/blob/labs%2Ftoollabs.git/2888b0f122c507cdd13a928054dd2ea51dac0b8c/src%2Ftake.cc
[15:29:17] petan, the above lines are strace output
[15:29:28] from 061aa632 commit
[15:29:43] if you call take Makefile, it will open the FD, then it will /only/ work with this very file using its FD and only thing which may eventually be lstated (which I am not sure about either) would be parent folder
[15:29:57] look at the strace lines
[15:30:03] you are working with the paths
[15:30:15] Steinsplitter: commonsdelinker runs under the commonsdelinker account on the toolserver AFAIK. Krinkle could just ask for group access.
[15:30:25] in fact, you open Makefile _twice_
[15:30:50] valhallasw: i dos not know was rong. ... every day users ar asking on commons about this bot....
[15:31:01] I see how it could be fixed, but it looks too much like rewriting your rewrite
[15:31:06] Platonides are you sure that lstat isn't from FTW
[15:31:10] valhallasw: krinkle dos not hav access to the pw, only Bryan
[15:31:18] Steinsplitter: oh, the bot password you mean
[15:31:19] that is possible and is irrelevant and not vulnerable to anything
[15:31:24] Bryan dos not respond (died o.O:?)
[15:31:34] valhallasw: jeah
[15:31:38] oh wait
[15:31:45] petan, your code is vulnerable
[15:31:59] it may be a bit hard to exploit the race, but it's certainly there
[15:32:00] thousends of not replaced immages (loooonggg backlog...)
[15:32:08] Platonides lol, you keep repeating that, but maybe we should find first "how it is vulnerable"
[15:32:27] Steinsplitter: I see.
[15:32:30] if it's FTW's lstat it has nothing to do with vulnerability
[15:32:31] petan, I explained aboce
[15:32:48] *above
[15:32:51] Platonides you explained it opens a file twice, which may be a problem but not necessarily vulnerability
[15:32:58] no, I mentioned that later
[15:33:05] that's silly but not a vulnerability per se
[15:33:15] the problem is that you open /tmp/take/Makefile
[15:33:42] https://github.com/benapetr/take/blob/master/src/Take.cpp#L39
[15:33:43] and I could change it into eg. a symlink to another file I shouldn't be allowed to change
[15:33:50] this is where it starts taking a file
[15:33:52] change = take
[15:33:59] petan: https://developer.apple.com/library/mac/#documentation/Security/Conceptual/SecureCodingGuide/Articles/RaceConditions.html
[15:34:18] interesting source, but still :)
[15:34:29] petan: see 'time of check vs time of open', which I think is what Platonides is talking about
[15:34:38] petan, the open you are doing there (and fstating) is not the one which is chowned
[15:34:39] (at least that was what Coren|Food was talking about y'day)
[15:34:44] Platonides I found it, stupid bug, not vulnerable at all
[15:34:57] Platonides it's the device look up which calls the second open of file
[15:35:01] yes, WearyPanda
[15:35:07] take itself uses 1 fd only...
[15:35:09] let me fix it
[15:35:29] petan, you are vulnerable...
[15:35:35] lol
[15:35:41] WearyPanda, perhaps you can explain him better than me?
[15:35:45] it's also common enough that there exist labs for it :) http://www.cis.syr.edu/~wedu/seed/Labs/Vulnerability/Race_Condition/Race_Condition.pdf
[15:35:49] open("Makefile", O_RDONLY|O_NOFOLLOW) = 3
[15:35:49] fstat(3, {st_mode=S_IFREG|0644, st_size=3748, ...}) = 0
[15:35:49] getcwd("/tmp/take", 4096) = 10
[15:35:49] lstat("/tmp/take/Makefile", {st_mode=S_IFREG|0644, st_size=3748, ...}) = 0
[15:35:49] open("/tmp/take/Makefile", O_RDONLY|O_NOFOLLOW) = 4
[15:35:50] stat("/tmp/take", {st_mode=S_IFDIR|0755, st_size=260, ...}) = 0
[15:35:50] stat("/tmp/take/Makefile", {st_mode=S_IFREG|0644, st_size=3748, ...}) = 0
[15:35:51] fchown(4, 1000, 4294967295) = 0
[15:35:58] Platonides: i've read backlog from yours and Coren|Food, not sure I can do a better job
[15:36:23] petan: check the code in that lab exercise, perhaps? It makes this particular race issue rather obvious
[15:36:52] can you at least agree with me that there are issues in his code?
[15:37:16] oh yes, completely.
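The race being argued about here is that the strace shows one descriptor being opened and checked (fd 3) while a second one, opened later by full path (fd 4), is the one passed to fchown. A minimal Python sketch of the pattern the reviewers are asking for, where the check and the ownership change both act on one and the same descriptor, might look like this. It is an illustration only, not the code of either take implementation; the function name and the regular-file check are assumptions.

```python
import os
import stat

def take_file(path, new_uid):
    """Change a file's owner, checking and chowning the same open descriptor."""
    # O_NOFOLLOW makes open() fail if the final path component is a symlink.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        # fstat() inspects the inode we actually opened, not whatever the
        # path happens to point at by the time of the call.
        info = os.fstat(fd)
        if not stat.S_ISREG(info.st_mode):
            raise PermissionError("refusing to take a non-regular file")
        # fchown() on the same fd; -1 leaves the group unchanged, matching
        # the 4294967295 seen in the strace output.
        os.fchown(fd, new_uid, -1)
    finally:
        os.close(fd)
```

Note that `O_NOFOLLOW` only guards the last path component. Symlinks in intermediate directories are a separate concern, which is presumably why working relative to a directory descriptor was suggested instead of a full path.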
[15:37:25] ok, thanks [15:37:47] at least I won't look like being stubbornly wrong :) [15:38:14] Platonides: I'm basing it on the strace output, not the code (which I've not read) [15:38:23] obvious enough, though. [15:39:23] petan: re: code readability, Coren|Food's version is far more readable than your version, I think. [15:39:31] it's a straightforward single function [15:39:34] I don't [15:39:46] yes it's huge complicated function not structured at all [15:39:54] and hard to extend [15:40:01] what exactly would you want to extend it to? [15:40:03] you have problems with reading because you prefer K&R over ANSI / Allman [15:40:25] I would at least prefer to extend it with these functions I pointer many times in my email [15:40:32] --help? [15:40:36] --log? [15:40:51] petan: as far as I have seen your functions do not need suid privileges [15:40:56] GNU compliant (--help /-h --version etc / usage on no parameter, --recursive -r,) + -g for group [15:41:01] and --verbose [15:41:06] except for the group thing [15:41:24] valhallasw and except for Take::ChangeOwner thing [15:41:28] which is quite crucial [15:41:38] valhallasw, he is calling chown() [15:41:45] * fchown [15:41:49] WearyPanda, where's Coren version? 
[15:41:51] Platonides: I mean for the extensions to the existing take [15:41:56] Platonides: https://git.wikimedia.org/blob/labs%2Ftoollabs.git/2888b0f122c507cdd13a928054dd2ea51dac0b8c/src%2Ftake.cc [15:42:00] Platonides: https://git.wikimedia.org/blob/labs%2Ftoollabs.git/2888b0f122c507cdd13a928054dd2ea51dac0b8c/src%2Ftake.cc [15:42:02] heh [15:42:07] :) [15:42:22] valhallasw I didn't say I extended it with more code which requires root, I just extended it with more code :) [15:42:26] pretty straightforward though I don't like the indentation :) [15:42:45] WearyPanda yes, his version is more simple, because IT IS more simple [15:42:51] it makes perfect sense [15:42:55] it also contains far less features [15:43:05] WearyPanda ssh to toolsbeta-login [15:43:18] try /home/petrb/take * and take -v * [15:43:22] you will see the difference [15:43:27] or take -vvvvrg * [15:43:53] my version is newbie friendly, non-experts will appreciate helpful hints and explanation how to use the tool [15:44:00] petan: all of the things in your first email can be done with a wrapper that is NOT suid root [15:44:08] as well as debug informations of what is wrong and how they can fix it [15:44:12] and the people trying to look for security exploits will just have to find IRC logs? :) [15:44:34] valhallasw but why in the world would you create a wrapper when you can change the core? [15:44:35] valhallasw: or, as patches to take.cc :) [15:44:42] you would spawn 2 processes just to do a simple thing? [15:44:45] petan: because then it doesn't have to run as root [15:44:51] so what... [15:44:59] is that advantage or something? [15:45:07] why increase the attack surface? [15:45:07] keep performance in mind [15:45:07] I'm seeing a problem with Coren version [15:45:08] ... [15:45:26] performance, for a tool that will be seldomly used? [15:45:29] Platonides: security issue or otherwise? 
[15:45:45] I hate wrappers for simple things which don't need them [15:45:45] otherwise, for now [15:46:12] Platonides: hmm, okay. file bugs? I'm adding usage info as a patch [15:46:40] * WearyPanda reads up getopt [15:46:56] Steinsplitter: I did reach Bryan by mail some time ago. Have you tried that? [15:47:03] oh, it wasn't a bug [15:47:10] just a funny way of recursing [15:47:11] WearyPanda: You might also want to look at argp. [15:47:16] scfc_de: jepp [15:47:22] and a lot of other users [15:47:31] that was what I thought, too [15:47:34] scfc_de: all my C before had used glib, so thanks for the pointer :) [15:47:41] if you want gnu options, use GNU getopt :P [15:48:01] *I* am happy with take as is ;) [15:48:01] getopt_long in this case [15:48:52] petan: did you read the links I presented? [15:49:08] Oh, petan did reimplement that as well. [15:49:09] WearyPanda hold on 1 link is loading 20 mintes on my internet [15:49:10] so petan, have you bought the cookies yet? =p [15:49:11] 5kb/s [15:49:23] scfc_de, he did [15:49:27] petan: :( want me to give you a plaintext version? [15:49:35] valhallasw yes, but for now I am eating them alone, because no one did find a real vulnerability so far [15:49:42] Platonides: GNU getopt = argp, more or less :-). [15:49:44] just some theories which were invalid so far [15:50:03] scfc_de: yeah, except seems to do auto --version and --help [15:50:04] * WearyPanda reads more [15:50:05] argp? [15:50:10] Platonides indeed found several bugs, but no vulnerabilities... :P [15:50:32] Platonides: https://www.gnu.org/software/libc/manual/html_node/Argp.html (argp) [15:50:38] scfc_de: only krinkle resond to my questions. Krinkle do a great job on Wikimedia. :) [15:50:41] scfc_de: any idea what the 'p' stands for? :) [15:51:25] I don't seem to have it installed [15:51:33] WearyPanda: Panda? :-) [15:51:34] Argument Parsing? [15:51:39] haha [15:51:47] parsing makes sense [15:51:49] Platonides: It's part of GNU libc. What OS are you using? 
[15:52:16] Platonides btw I fixed that double file openning [15:52:20] petan: https://dpaste.de/P3PHn/raw/ plaintext version [15:52:23] Steinsplitter: If it's only PW, can't someone set up another bot account and use its credentials? [15:52:25] of that apple article [15:52:27] should be faster :) [15:52:47] scfc_de: but needs aprovall on meta? [15:52:48] WearyPanda ok cool, but I am handling this [15:53:07] scfc_de, Steinsplitter or just a global password reset... [15:53:09] just pointing out the vulnerability everyone else has been trying to :) It's explained fairly nicely there [15:53:29] valhallasw: ab, but is bryans account? [15:54:01] Steinsplitter: ha! the mail address is delinker@toolserver.org [15:54:11] so Krinkle|detached should be able to reset it [15:54:25] okay, thx [15:54:26] as well as multichill and siebrand [15:54:27] :) [15:54:50] the password may be stored at the account, too [15:54:57] WearyPanda you being happy with take as is, is just implying you are not a newbie and you don't need extra features. However I know people who were fine with ms-dos as it was and didn't need features of other OSes so I am quite happy you and these people are not deciding about the future of development [15:55:04] Platonides: then krinkle would have been able to find it, I think [15:55:20] petan: well, good luck convincing Coren|Food about suid :) [15:55:27] ? [15:55:36] his version already HAS suid [15:55:48] my version doesn't change it [15:55:53] petan: seriously, what's the problem with a wrapper script? performance *is not an issue* [15:56:07] it's not like it's being started 200 times per second [15:56:24] and if it is, /then/ you can think of optimizing it [15:56:45] valhallasw it's ugly and crappy... it's like... building a robot which go and empty your dustbin instead of moving your ass and doing it yourself :P [15:56:48] can I lookup what people has entered as signature via public db? 
[15:57:08] AzaToth, I don't think so [15:57:13] valhallasw my version is already working and is wrapper free :P [15:57:18] where is it saved Platonides ? [15:57:24] user table [15:57:28] it even has automatic script which make a .deb packages [15:57:36] deployment is about 1 command [15:57:48] why you people like to do simple stuff in a complicated way [15:57:52] Platonides: where in user table? [15:57:56] the options blob? [15:58:23] petan: are you still going to keep saying there is no security vulnerability despite the fact that many of us are pointing out something that is commonplace enough to find its place in university lab exercises? [15:58:36] it's probably migrated to user_properties [15:58:37] petan: "why you people like to do simple stuff in a complicated way": Have you really compared your and Coren's sources?! :-) [15:58:48] WearyPanda where is that vulnerability you talk about? [15:58:54] * WearyPanda gives up [15:59:00] * Platonides shuffles petan [15:59:13] scfc_de: implement the features to Coren's version, then compare them [15:59:26] my version is indeed more complicated, but it also has far more features [15:59:28] petan, what features are you missing? [15:59:42] Platonides I still have it in my history... sec [15:59:47] Platonides: and that table isn't replicated I assume? 
[15:59:50] at all [15:59:55] GNU compliant (--help /-h --version etc / usage on no parameter, --recursive -r,) + -g for group [15:59:58] and --verbose [16:00:01] Platonides ^ [16:00:07] AzaToth, it is replicated, but not allowed by the views [16:00:11] for example -r I am missing most [16:00:13] ok [16:00:22] current version can't take stuff non-recursively [16:00:27] petan, his version does allow -r [16:00:33] no, it doesn't [16:00:36] oh, you miss to run it non-recursively [16:00:37] it enforces it [16:00:57] that's a bit odd, but ok [16:01:00] doing take * can be dangerous in some cases [16:01:03] now *that* would be something useful to add [16:01:06] and very slow [16:01:17] or rather, to make it non-recursive by default and implementing recursion in a wrapper script ;-) [16:01:19] petan: Why would you want to own a directory, but not the files in it? [16:01:24] Platonides: is the limit of signatures still 255 characters? [16:01:41] scfc_de imagine you have a folder with another folder with 1 million of files, which you already own [16:01:53] scfc_de then you copy 3 more files in that folder [16:02:02] scfc_de you are lazy to type long command so you just type take * to match all stuff in that folder [16:02:13] scfc_de and you just started it on million+ files [16:02:27] petan: And? [16:02:40] WearyPanda, where's your patch? [16:02:42] and you have to wait 2 weeks for something that could be done in 2 miliseconds [16:02:54] Platonides: i'm busy being outraged, aaarrrr :P [16:02:59] Platonides: but no, i'm writing it now. 
[16:03:14] it's just that adding the -r option would be trivial on top of it [16:03:23] Platonides: just doing --help and --version, since I've no idea why you would want to not take a directory recursively [16:03:56] Platonides don't forget to add -vhgr as well as --version --help --group --recursive GNU alternatives, tyvm [16:03:58] I see it appropiate for consistency [16:04:02] Platonides and pls counters for -v [16:04:10] rm folder/ doesn't remove the whole subtree [16:04:16] and usage if there are no commands [16:04:21] or zip folder/ [16:05:03] petan: If you're lazy, you certainly will enjoy a two-week break. [16:05:27] I am happy other programers don't follow that philosphy, or computers would really suck :P [16:05:56] like why should you copy paste when you can retype the text [16:06:42] Eh, what? [16:06:47] Never mind. [16:07:00] you say that people being lazy is no argument for improving software [16:07:23] which I disagree with [16:07:27] Ah now I see, did wc -c on one persons signatrue and got 271, but that's bytes, was only 254 "characters" [16:07:30] there are lot of things that were invented just because people are lazy [16:09:11] Platonides: i'm not going to write that patch. Going to do puppet stuff for Redis instead. [16:09:13] feel free to patch it if you want, but I don't think adding minor 'features' to take is 'improving software features and taking development forward' :) [16:09:53] petan: Who besides you has complained about Coren's take's recursive behaviour so far? 
[16:10:16] Oren did request a feature which it is missing, that started my development of new version [16:10:19] he wanted -g parameter [16:11:04] then I realized how complicated it is to implement it into his single-function version, and decided to rewrite that simple thing from a scratch into something more structured and more documented and more flexible [16:11:28] however it's still built on top of his version at some point [16:11:31] just heavily extended [16:11:41] petan: So: Noone? [16:12:21] no one did complain publicly (not counting newbies who came here and had to ask how to use take because of no --help and no man page) but someone definitely wanted to improve it [16:12:55] just not having people who publicly complain about software isn't a reason not to improve it [16:14:41] what is mine worse in that you still prefer the old version so much? [16:16:58] I think that nobody did test it on beta but yet so many people keep telling me how vulnerable it is and how much it suck :P [16:17:58] petan: Eh, Platonides *did* test it. [16:18:55] yes and reported stuff I fixed, and which in fact was rather minor issue [16:19:34] petan: !"/§(/!("§!/! [16:19:38] Okay, I give up. [16:20:08] * YuviPanda gives scfc_de ice cream [16:20:52] * scfc_de prefers cholocate :-). [16:21:52] * YuviPanda give scfc_de chocolate ice cream :) [16:22:53] * scfc_de thanks YuviPanda. [16:23:06] :) [16:31:43] petan, I'm writing the take usage... [16:31:46] what should -g do? [16:38:01] found [16:38:23] Platonides: *grin* I just wrote a wrapper function [16:38:49] http://pastebin.com/FAKmVYp4 [16:38:50] tadaaaaa [16:39:04] (not fully tested, though) [16:43:04] New patchset: Platonides; "Add usage() to take(1)" [labs/toollabs] (master) - https://gerrit.wikimedia.org/r/70058 [16:43:13] valhallasw, my version ^ [16:43:38] hmm, I can't login to toolsbeta-login [16:43:50] * YuviPanda checks if he is on that project [16:44:17] Platonides: does verbose actually do something? 
[16:45:47] valhallasw: not that I can see [16:46:20] I'm actually also not sure of the -g use case [16:46:29] service groups only have a single group [16:48:23] New patchset: Platonides; "Make take non-recursive by default." [labs/toollabs] (master) - https://gerrit.wikimedia.org/r/70059 [16:48:43] valhallasw, not for now [16:48:55] "The options added by this commit are not operative yet [16:48:55] (well, except --help) [16:49:06] oh, right [16:49:51] http://pastebin.com/NaHcwAcx < this looks about right [16:51:31] Platonides: I guess you could do the same in take.cc, after dropping suid [16:51:37] Coren|Food: petan can someone add me to the toolsbeta project? [16:51:45] or... wherever I can test redis puppet? [16:52:43] is that a suggestion for -v ? [16:53:01] I'm not sure what you are proposing [16:53:48] Platonides: sorry. for -g [16:54:49] Platonides: although shelling out is maybe not the best option when you're already in C >_< [16:57:46] YuviPanda sure, even scfc_de can do that [16:57:53] ah, good to know, petan. [16:58:15] petan: can you add me now? [16:58:48] doe [16:58:49] done [16:58:58] and beware [16:59:12] my version of take is there :DD [16:59:40] YuviPanda if you were bored you can fix ?status [17:00:06] petan: will put it on my todo list [17:00:50] Platonides good, when you finish with usage(), continue with help() recursive() verbose() and group(), let me know when it's finished I will review [17:01:11] don't forget to make it accept the -- arguments [17:01:13] like -- -r [17:01:21] if you wanted to take file called '-r' [17:01:27] :) [17:01:53] also, valhallasw, Platonides you realize that current version of take is written in c++ even if it may look to you as c [17:01:55] it's not [17:04:12] petan: Was a bug report for websockets submitted? [17:04:24] which one [17:04:31] I have no idea what you talk about [17:04:43] petan: To use websockets from the labs. 
[17:04:46] my memory is bad [17:04:57] not that I know of [17:05:48] YuviPanda document all changes on beta using log or just write it somewhere, try to keep both environments aligned [17:08:15] petan: will do [17:08:41] petan: hmm, I can't login to toolsbeta-mc? [17:08:52] hold on [17:09:00] maybe you need to be member of local-admin [17:09:12] oh [17:09:14] can I bea? [17:09:15] *be? [17:09:19] tr now [17:09:20] * try [17:10:30] * YuviPanda does [17:10:38] petan: nope [17:10:52] petan: am I to try logging in from toolsbeta-login? [17:10:54] or directly? [17:32:57] What is the cgi engine used? [17:33:52] a930913: Apache + suphp. [17:38:29] hmm, so I can login to toolsbeta-mc [17:52:15] New review: Tim Landscheidt; "I don't see a reason to change the default behaviour. If you want to "take DIR", recursion is IMHO ..." [labs/toollabs] (master) C: -1; - https://gerrit.wikimedia.org/r/70059 [18:16:42] How do I make a py3 venv? [18:17:55] a930913: http://docs.python.org/dev/library/venv.html [18:18:34] valhallasw: That wasn't helping. Just found the -p flag on virtualenv. [18:22:33] ?? virtualenv is *part of python 3* [18:22:35] at least in 3.3+ [18:26:04] Apparently there should be a Python.h in /usr/include/python3.2mu/ ? [18:26:55] Is dev installed for py3? [20:04:05] addrest, ? [20:04:07] ping [20:14:11] petan: started puppetizing Redis :) https://gerrit.wikimedia.org/r/#/c/70064/ is the required change to our redis config for renaming commands [21:06:57] petan: https://gerrit.wikimedia.org/r/#/c/70103/ [21:13:10] YuviPanda: You can (and probably should) add petrb as a reviewer. Then the commit is listed in his dashboard and he also gets mailed about it. [21:13:50] good call, scfc_de [21:13:50] done [21:13:57] coren is auto-added [21:15:39] valhallasw: addrest did you guys do anything about dumpscan after that? 
[21:15:53] scfc_de: the logging system is still pending, I got distracted by Redis [21:16:44] YuviPanda: I think Coren's using the auto-add-as-reviewer bot run by ... valhallasw? multichill? Can't find the page at the moment. [21:17:57] YuviPanda: Ah, there it is: http://www.mediawiki.org/wiki/Git/Reviewers [21:18:37] hmm, clicking Edit on that section puts me in VE [21:18:44] * YuviPanda doesn't risk it, does edit source [21:19:25] Wasn't the plan not to force VE only on new editors? [21:20:06] scfc_de: it's been forced on mw.org for a while now [21:20:11] well, you can turn it off [21:20:13] if you like [21:21:25] We need global preferences :-). Wasn't the issue with VE that it can only edit whole articles (= not sections)? Then the "[Edit]" links on sections should disappear as well. Well, whatever. [21:22:39] scfc_de: yeah [21:22:50] scfc_de: that is one issue, but this has templates and i'm not sure if they'll retain well [21:22:51] so [21:23:03] scfc_de: there's a bug about that somewhere :) [21:23:05] scfc_de: anyway, do you know if there's a way I can test puppet configs locally? [21:23:29] cloning operations/puppet complains about missing 'private' repos [21:25:54] YuviPanda: Never done that. (There is a labs/private repo, but I don't if it is helpful.) I think the recommended approach is to set up a puppetmaster::self, then test the changes there and submit (the successful ones) back to Gerrit. [21:26:30] hmm, I suppose I need to read more docs now then :D [21:27:15] Are you root on toolsbeta, BTW? Otherwise that's gonna be painful :-). [21:27:27] New review: coren; "I agree that the default (recursive) behavior is what would be expected from 'take some_directory'." [labs/toollabs] (master) C: -1; - https://gerrit.wikimedia.org/r/70059 [21:27:37] scfc_de: I need to be... :D [21:28:42] I. e., you would set up "toolsbeta-puppet" as puppetmaster, then have "toolsbeta-redis"'s puppet::self::client pointing to toolsbeta-puppet. But as I said, never tried that. 
[21:28:42] scfc_de: what do I need to do to be root on toolsbeta? [21:29:05] scfc_de: hmm, I thought that was already sortof setup, and that is th epoint of toolsbeta? [21:29:31] New review: coren; "Minor issues (inline)" [labs/toollabs] (master) C: -1; - https://gerrit.wikimedia.org/r/70058 [21:30:40] YuviPanda: You need to ask petan for root. I'm not sure how far toolsbeta has advanced as a training ground, so may it is already set up that way. [21:30:50] *maybe [21:31:13] hmm, I'll do that [21:36:53] scfc_de: it's mine, yes :-) [21:37:10] and I think VE would be a bad idea on that page, indeed [21:37:30] YuviPanda: haven't worked on it [21:37:37] hmm, alright [21:37:42] valhallasw: where's that bot located? [21:37:46] is it running on tools labs? [21:37:51] source? [21:38:03] YuviPanda: on translatewiki.net at the moment [21:38:09] oh! [21:38:25] Siebrand was not too happy when the toolserver broke down again at some point [21:38:37] source is here: https://github.com/valhallasw/gerrit-reviewer-bot [21:38:39] heh. plan on moving it here at some point? [21:38:46] pop3bot.py is what you're looking for [21:39:08] once there is some nice logging infrastructure: yes [21:39:25] valhallasw: wait, this is running by reading the emails? [21:39:26] yep [21:39:37] ssh gerrit stream-events was too unstable [21:39:43] plus this gives a backlog if something breaks [21:39:47] hmm, true [21:40:10] so that if it breaks down, it can just resume where it left off [21:40:21] (skipping patchsets that have already been merged) [21:40:29] yeah, true. [21:41:34] I have a gerrit-to-redis thing running on tool-labs now, though. 
Does nothing but read from streams and puts it into different redis queues, [21:41:38] for different readers [21:41:48] hasn't crashed for the last 6 days, according to logs [21:42:02] (which is when I started it) [21:42:42] YuviPanda: the problem was that the ssh stream sometimes hangs - so the connection doesn't die but also doesn't show any new patchsets [21:42:48] true [21:42:55] hmm [21:42:58] i've not encountered that tho [21:43:15] it's not that it happened often [21:43:17] it happened once [21:43:20] ah [21:43:23] and then I was done with ssh :P [21:43:53] also because email had the advantage of having a backlog [21:45:51] valhallasw: true [21:45:54] it is known-reliable [21:46:14] valhallasw: how much latency are you currently experiencing, with email? [21:47:18] YuviPanda: I don't know exactly - I retrieve mails in five minute intervals. [21:47:23] oh, ok [21:47:39] Real-time was not a design goal ;-) [21:47:43] :) [22:02:49] @replag [23:12:42] Coren|Busy, replay on enwiki_p [23:13:10] *replag [23:16:42] petan, who's to blame for the replag on enwiki.labsdb? [23:16:43] :p [23:25:06] legoktm: How do I use flask on the labs cgi? [23:27:18] a930913: I think zz_YuviPanda figured it out. I forget how he did it [23:44:57] New patchset: Platonides; "Implementing verbose messages." [labs/toollabs] (master) - https://gerrit.wikimedia.org/r/70107
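
Editor's note: the option set debated above for take (a counted -v, -r/--recursive, -g/--group, --help, --version, and "--" so that a file literally named '-r' can be passed) can be illustrated with a small sketch. This is not the code in take.cc or in either Gerrit patchset. It is a hypothetical Python argparse mock-up of the interface only; the names build_parser and parse, the help texts, and the meaning given to -g are assumptions. Recursion is opt-in here purely for illustration, since the default was still disputed in review.

```python
import argparse


def build_parser():
    # Hypothetical mock-up of the interface discussed in the log;
    # it parses options only and never changes file ownership.
    parser = argparse.ArgumentParser(
        prog="take",
        description="Sketch of the option set discussed for take.",
    )
    # Each -v raises the verbosity level by one (-vv gives 2, and so on).
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="print more detail; may be repeated")
    # Assumption: recursion is opt-in in this sketch only.
    parser.add_argument("-r", "--recursive", action="store_true",
                        help="descend into directories")
    # Assumption: the exact semantics of -g were still being asked about.
    parser.add_argument("-g", "--group", action="store_true",
                        help="also act on group ownership (assumed meaning)")
    parser.add_argument("--version", action="version",
                        version="take (interface sketch)")
    # -h/--help is added by argparse automatically; calling the tool with
    # no FILE at all makes argparse print a usage message and exit.
    parser.add_argument("paths", nargs="+", metavar="FILE",
                        help="files or directories to take")
    return parser


def parse(argv):
    """Parse a list of command-line arguments and return the namespace."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    import sys
    print(parse(sys.argv[1:]))
```

With this sketch, parse(["-vvvvrg", "a"]) sets the verbosity to 4 and enables both flags, while parse(["--", "-r"]) treats '-r' as a file name rather than an option, which is the case raised at 17:01.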