[00:59:27] Coren: Poke? [04:56:09] Coren: awesome, thanks :) [06:57:17] o_O Why can globally locked accounts edit o_O [06:57:28] http://deployment.wikimedia.beta.wmflabs.org/w/index.php?title=Special:AbuseLog&wpSearchFilter=12 o_O [06:59:48] 01:29, 21 June 2013: User: (en.wikipedia.beta.wmflabs.org) triggered filter 12, performing the action "edit" on User:9h1k8do4b5. Actions taken: Disallow; Filter description: More Chinese spambots (details | examine | adjust visibility)' [06:59:49] And this seems to be a bug. "User:" o-O [10:07:41] Coren: is it on purpose that the superglobal $_SERVER does not contain "HTTPS"? [10:08:02] or petan maybe? [11:03:19] Kai_WMDE hi [11:03:27] hi petan [11:03:51] tbh I don't know :/ this is undocumented and it was set up by Coren [11:04:19] oh, i see [11:04:43] but thanks for answering :) [11:29:27] JohannesK_WMDE ping [11:29:34] JohannesK_WMDE you are running local-render ? [11:29:54] petan, yes we are, why? [11:30:07] there is an instance of tlgbe/tlgwsgi.py on tools-webserver-01 [11:30:20] it eats 1600mb of resident memory [11:30:30] possible [11:30:34] 2gb of virtual memory [11:30:51] server has only 2gb of ram [11:30:54] that is a bit much but still possible [11:31:00] that's bad [11:31:13] is that normal? why isn't it optimized a little bit... better? :p [11:31:25] optimized? [11:31:36] yes, for example it could be split into multiple processes [11:31:44] or use some storage other than ram [11:31:51] why does it eat so much operating memory? [11:32:04] so the 2GB of ram would be split into separate processes? what would that gain? [11:32:11] * Cyberpower678 is eating buttered petan. :D [11:32:12] we make huge db queries [11:32:17] it could be distributed to multiple servers [11:32:28] do we *have* multiple web servers? [11:32:31] yes [11:32:49] huge db queries do not necessarily need to use huge ram [11:32:52] i only know of 2. and i don't know of a way to distribute them.
[11:33:34] you can't distribute 1 webserver session to multiple servers anyway... but I fail to see why it should eat so much, what kind of db queries are they? [11:34:05] if 1 web user is able to run a session which eats 1.6gb of ram, multiple web users could bring the whole webserver cluster down [11:34:25] angelika fired off a query for just about every person in dewiki. that's probably what was so large. [11:34:53] you realize that running an application which allows this can lead to abuses of servers? [11:35:19] imagine if wikipedia allowed people to run expensive queries... like filter last 99999999999999999 contributions etc [11:35:28] it would bring all of wikipedia down in minutes [11:36:10] such an application if it's publicly accessible can be very dangerous [11:36:32] normally queries wouldn't be so large. but seriously, 2GB for a web server is too small. our smartphones have more. [11:36:44] 2gb for 1 session? [11:36:49] are you serious? o.o [11:36:54] JohannesK_WMDE: complicated processing should not be done on a web server in the first place ;-) [11:36:59] 2gb for the whole web server, petan! [11:37:11] I don't think production servers have significantly more [11:37:17] valhallasw: it's not complicated, it's just a large result set. [11:37:30] well, even ortelius on toolserver had 4 GB. [11:37:40] JohannesK_WMDE: as in: you're sending 2GB of data to the client? [11:38:41] valhallasw: i'm not sure. i am not allowed to log in to the webserver so i can't see what's happening. we never had 2 gb in one session allocated before. (btw: that's total apparently; rss is 160MB) [11:39:13] what do you mean by rss [11:39:26] if you mean resident memory, it's 1540mb atm [11:40:05] petan: ah. you said 1600MB rss, not 160. that's too much. still, i can't see what's happening. [11:40:45] how would you see it? [11:40:59] i can't because i can't log in to the webserver. [11:41:11] i asked coren yesterday if i could but he didn't reply.
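Editor's note: the rss/virtual-memory numbers being argued about here can be read for any process on Linux from the standard /proc filesystem, no special tooling required. A stdlib-only sketch (the pid would be whatever `ps` reports for `tlgwsgi.py`):

```python
# Read a process's resident set size (VmRSS) from the Linux /proc
# filesystem; this is the "resident memory" number debated above, as
# opposed to the (much larger) virtual size, VmSize.
import os

def rss_kb(pid):
    """Return VmRSS in kB for `pid`, or None if the process is gone
    or /proc is unavailable."""
    try:
        with open("/proc/%d/status" % pid) as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])  # line is e.g. "VmRSS:  1546000 kB"
    except OSError:
        return None
    return None

if __name__ == "__main__":
    print("own RSS: %s kB" % rss_kb(os.getpid()))
```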
[11:41:13] it's a running process, you can't just see what's happening even if you had access to that server [11:41:29] ah. i can't but you can. ok. [11:41:57] I mean... how would you debug a running python process? [11:42:14] i could at least verify how much memory it uses, cpu time, etc. [11:42:56] it uses 3:43.56 cpu time 1546mb of resident memory and 2011mb of virtual memory [11:43:31] constantly on 25% cpu usage (1 cpu system) [11:45:43] petan: it's a 2GB per process limit, right? [11:45:48] or does the entire server have only 2G? [11:45:58] entire server has only 2g... [11:46:22] ... [11:46:22] ok [11:46:30] I guess the exec nodes have a lot more [11:46:55] petan: how much do the exec nodes have? [11:47:15] 8g [11:47:21] hmm [11:47:22] ok [11:50:09] petan: yes, this is a runaway thing, it is not intended to eat that much ram. we can do 2 things: first, restrict queries if we figure that the initial catgraph result set is too large. second, we can install a web server on sylvester and run the backend there. [11:50:34] petan: but you still want to put more ram into that webserver. 2gb for the entire tools webserver? you can't be serious. [11:51:08] I don't know why Coren decided to use only 2gb per webserver, but keep in mind there is going to be multiple webservers [11:52:09] petan: time to install slayerd ;-) [11:52:30] :/ [11:53:13] (which kills processes if a user uses more than a given amount of memory) [11:53:33] JohannesK_WMDE: is there any reason why catgraph cannot be submitted as SGE job? [11:53:52] or is it just very memory-intensive, while returning within ~10s for most queries? [11:54:25] yeah, I'm guessing Web Servers aren't supposed to do long running processes [11:54:25] valhallasw: it's an on-demand service [11:54:55] valhallasw: I suppose we don't really have proper precedents for doing this kinda 'background job' stuff. [11:55:32] Kai_WMDE: I understand that. 
But even then: you could send the user to a static html page that refreshes every 30-or-so seconds, which will be overwritten by the result from the query. [11:56:01] valhallasw: no, you misunderstand. catgraph isn't running on the webserver. [11:56:02] or use something like http://www.celeryproject.org/ [11:57:09] YuviPanda: I think the main problem lies in communicating to the user 'we are working on your job!', and then refreshing once the result is there [11:57:18] but this sounds like a generic problem [11:57:26] valhallasw: IIRC you can handle that with a little bit of ajax. [11:57:38] valhallasw: indeed. [11:57:59] addshore and I are working on a dumpscanner that would need something like this, too. [11:58:17] celery + redis sounds like a very nice solution to this [11:58:21] that would need something like? :D [11:58:23] and I think I'd want in to help :D link? [11:58:26] * addshore looks up [11:58:43] YuviPanda: http://tools.wmflabs.org/dumpscan/ [11:58:48] not working yet, though ;-) [11:58:53] its near working :P [11:59:01] YuviPanda: want to make it look pretty ? :P [11:59:02] but we are using a poor-mans alternative to celery + redis: the filesystem ;-) [11:59:10] so [11:59:18] it lets you scan the dump for files that match a specific regex? [11:59:26] and you're grepping through the logs everytime people run this [11:59:44] valhallasw: and no, catgraph can't be submitted as an SGE job, in the same way that, say, apache can't be submitted as an SGE job [11:59:46] addshore: i'm no designer, but can throw bootstrap on top of it pretty easily [11:59:52] hehe ;p [11:59:54] cu later, eat [12:00:09] JohannesK_WMDE: hmm, perhaps it can then run on its own labs instance? should be pretty easy to pick that up that way [12:00:58] addshore: valhallasw is this python or php? 
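valhallasw's refresh-page idea above fits in a few lines of stdlib Python: the web frontend hands the user a placeholder page, and the backend job later overwrites the same file with the real result. The markup and function name are illustrative:

```python
# Sketch of the "static html page that refreshes every 30-or-so seconds"
# approach: serve this placeholder until the backend job overwrites the
# same file with the query result.
import os
import tempfile

PLACEHOLDER = """<!DOCTYPE html>
<html><head><meta http-equiv="refresh" content="{seconds}">
<title>job {job_id}</title></head>
<body><p>We are working on your job; this page reloads every {seconds}s.</p>
</body></html>
"""

def write_placeholder(path, job_id, seconds=30):
    # The web server keeps serving this file; when the backend finishes,
    # it simply overwrites `path` with the result page.
    with open(path, "w") as f:
        f.write(PLACEHOLDER.format(job_id=job_id, seconds=seconds))

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "42.html")
    write_placeholder(path, 42)
```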
[12:00:59] YuviPanda: added you to the project [12:01:08] php frontend, python backend [12:01:08] hmm [12:01:14] YuviPanda: the html pages are html and a bit of php [12:01:17] backend is python [12:01:24] is it version controlled? :D [12:01:25] * YuviPanda logs in [12:01:34] also: https://github.com/addshore/dumpscan [12:01:47] i'm interested in setting up celery + redis for something, so this might be a good candidate [12:01:49] * YuviPanda checks [12:02:25] basically, the job is now stored as a json file and php (in the future) will submit an sge job to parse it [12:02:28] I imagine it could be incorporated into the labs infrastructure [12:02:33] yeah, I think so too [12:02:47] valhallasw: addshore mind if I convert the PHP script to python? Will send a pull request... [12:03:00] YuviPanda, go for it :P [12:03:04] sweet [12:03:09] * YuviPanda uses Flask [12:03:16] it is super effective [12:03:18] * addshore really needs to step up his python-ness [12:03:22] flask? [12:03:35] yes, frees you from the pain of doing CGI by hand [12:05:50] addshore: http://flask.pocoo.org/ [12:06:22] :> [12:09:37] addshore: I'm just going to replace the current php code with python (writing it now) [12:09:45] addshore: soon we can move it to celery, etc. [12:09:48] this is a nice use case! [12:12:00] :P [12:13:58] * addshore needs to find a way to scan wikidata dumps easily :D [12:17:52] addshore: one of the things we can do later on is to invent nice readonly fast indexing of the dumps [12:17:57] on something that isn't mysql [12:18:04] addshore: labs is a lot of fun :D [12:18:08] YuviPanda: yes, i was also thinking that [12:18:12] and YuviPanda yes :) [12:18:21] addshore: we can also do something like have a neo4j instance [12:18:22] it has so much potential and easy ways to expand in all directions ;p [12:18:24] with all the graphs [12:18:29] and then we can do fast queries [12:18:34] that's one of the things that bugged me about ts [12:18:39] it was... technically limited.
[12:21:59] can a mysql query be run using sge? does it make sense? [12:22:38] it doesn't really need to. [12:23:54] YuviPanda: because they are run on different servers? [12:24:02] yeah, they're run on the db servers [12:24:08] the client is doing negligible work [12:24:52] YuviPanda: so I can even run them from the webserver without any problem, right? [12:25:06] should be, but why not just run it from tools-login? [12:25:11] assuming you are talking about the command line client [12:25:22] and not a separate program [12:25:48] YuviPanda: I'm writing a program that runs a query (on crontab) [12:25:58] ah, then that should be on SGE fale [12:26:12] because programs take up memory, CPU on both the machine they run on and also on the mysql server [12:26:30] YuviPanda: the only thing the crontabbed program does is run the query, nothing more ;) [12:26:41] what language is it in? [12:26:42] python? [12:26:44] php [12:26:53] so loading the php interpreter, etc takes memory :D [12:26:59] :D [12:27:03] fale: make the crontab start the job on SGE, should be simple enough, no? [12:27:37] YuviPanda: yeah :) I used the same technique for a different program on TS :) [12:27:43] :) [12:28:01] sge has full db and (/home) fs capability? [12:28:35] yes [12:28:38] well [12:28:40] not /home [12:28:44] it has /data/project [12:28:48] and you should be running it with a tool account [12:29:04] fale: after reading up, yes just do it on whatever server you're on (so I'm guessing you're querying from a webserver / webpage) :) [12:29:25] YuviPanda: I do have a tool account, but I thought that tool accounts have their home folder in /home :D [12:29:44] fale: a tool account's home is in /data/project/toolname [12:29:49] :) [12:29:52] addshore: :D thanks [12:29:54] yeah, do a pwd [12:30:07] addshore: even with crontab stuff?
[12:30:22] yup [12:30:27] cd ~ [12:30:57] addshore: I was speaking about your phrase "yes just do it on whatever server you're on (so I'm guessing you're querying from a webserver / webpage) :)" [12:31:10] :> [12:31:59] is it better to keep the queries synchronous to use less memory, or to make them async (and parallelized) to use resources for less time? [12:32:45] legoktm: do you remember which project we set up flask for? [12:33:42] addshore: i've written it, testing now [12:37:34] petan: around? [12:37:40] sure [12:39:13] petan: can you grep for errors from dumpscan.py? [12:39:22] in apache logs? [12:39:22] yes [12:40:22] petan: okay, just the last couple of exceptions? [12:42:26] YuviPanda: pastebin.com/vSWniRXU [12:43:03] errm, permission denied? [12:43:06] * YuviPanda scratches head [12:43:12] hold on o.O [12:43:25] try now [12:43:29] it wanted captcha :D [12:43:43] petan: no, i gave it captcha :) [12:43:48] petan: the error itself is confusing me [12:43:52] ah ok [12:47:31] petan: there should be a new error now, can you paste it to me? [12:47:39] all this should hopefully go away once we have wsgi [12:48:15] no new error :( [12:48:27] maybe try to check your php error log file [12:48:47] this error logfile shows only errors which were not caught by another logger [12:48:50] this is python [12:49:04] last error I have is from 12:37:16 [12:49:13] what is the value of date? [12:49:23] ? [12:49:37] wait, I'll start an irc client there [12:50:03] hmm? [12:50:21] I am on many computers now [12:50:25] (physical) [12:50:33] ah [12:50:34] ok [12:50:35] * YuviPanda waits [12:50:44] ok here [12:51:00] [Fri Jun 21 12:37:16 2013] [error] [client 10.4.1.89] Premature end of script headers: dumpscan.py [12:51:06] this is latest [12:52:22] are you sure?
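YuviPanda's earlier suggestion ("make the crontab start the job on SGE") boils down to having cron run `qsub` instead of the query itself. A small sketch; the `-l h_vmem` resource request is a common Grid Engine knob, but the exact flags for Tool Labs are an assumption, so check the local docs:

```python
# Build the qsub invocation a cron entry could run, so the actual work
# happens on an exec node rather than tools-login. The h_vmem request is
# a common Grid Engine option (an assumption here, not from the log).
def qsub_command(script, memory="256M"):
    return ["qsub", "-l", "h_vmem=%s" % memory, script]

if __name__ == "__main__":
    # crontab-line equivalent: 0 * * * * qsub -l h_vmem=256M run_query.sh
    print(" ".join(qsub_command("run_query.sh")))
```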
[12:52:26] http://tools.wmflabs.org/dumpscan/cgi-bin/dumpscan.py/test is giving me a 500 now [12:52:34] now not [12:52:42] [Fri Jun 21 12:49:53 2013] [error] [client 10.4.1.89] File "/data/project/dumpscan/cgi-bin/dumpscan.py", line 7, in [12:52:43] [Fri Jun 21 12:49:53 2013] [error] [client 10.4.1.89] /data/project/dumpscan/cgi-bin/app.py [12:52:45] [Fri Jun 21 12:49:53 2013] [error] [client 10.4.1.89] Premature end of script headers: dumpscan.py [12:52:46] 3 more [12:52:54] maybe apache is writing with delay [12:53:20] is that all it is giving you? no traceback, anything? [12:57:05] this is all I see in log :/ [12:59:19] fucking CGI [13:02:12] :D [13:03:54] err don't break the friendly channel policy :-) [13:04:19] it'll be easier to not do that once we have WSGI on :) [13:04:28] addshore: so it is currently writing folders into ~/scans [13:04:46] addshore: let me commit it and send a pull request :) [13:04:59] YuviPanda: mehehee [13:05:11] link me to it when your done :) [13:05:18] addshore: it is also using redis to find next-id [13:05:22] no race conditions :) [13:05:28] xd [13:05:35] addshore: it's already in the tool, checkout cgi-bin [13:05:45] mhhhm, cba to login atm ;p [13:05:45] am developing there, then copying to local for committing :) [13:05:47] h all [13:05:48] wiritng something ;p [13:05:49] *hi all [13:05:52] hi nutz_ [13:05:54] cba? [13:05:56] ah [13:05:56] ok [13:07:29] i'm on tool labs right now and am wondering: is it okay if i created a table which will probably have a bunch of GB? namely i want to run a CREATE TABLE x_revision AS (SELECT * FROM revision WHERE ... ); [13:10:02] in the documentation (https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Creating_databases) there is no limit written down, but i thought i'd better ask first [13:12:18] petan: Coren ^ [13:13:19] nutz: We're on a "as reasonable" basis atm with no hard limits, but the example you give doesn't seem like the best idea. 
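The "redis to find next-id / no race conditions" remark above works because redis's INCR is atomic: two concurrent web requests can never be handed the same id. With a live server this is just `redis.StrictRedis().incr("next-id")`; the tiny stand-in below implements only `incr` so the sketch runs anywhere, and the key name is illustrative:

```python
# Atomic id allocation a la redis INCR. FakeRedis is an in-memory
# stand-in so the example runs without a redis server; a real client
# exposes the same incr() interface.
class FakeRedis(object):
    """In-memory stand-in for the one redis command we use."""
    def __init__(self):
        self.store = {}

    def incr(self, key):
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]

def next_job_id(r, key="dumpscan:next-id"):
    # `r` can be a real redis client or the stand-in; same interface.
    return r.incr(key)

if __name__ == "__main__":
    r = FakeRedis()
    print(next_job_id(r), next_job_id(r))  # 1 2
```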
[13:13:32] nutz: (For one, you won't get the SSD performance that way) [13:13:51] nutz: Why do you want to make a copy of revision? [13:17:01] Coren can you have a look why nfs is borked on toolsbeta [13:17:03] Coren: i have a bunch of requests that all query the same subset of revision with rev_user_text in (.... ) [13:17:29] and i figured that this way the queries would be faster since i dont have to run it through the whole revision all the time [13:17:47] nutz: Given the SSD and indexes, you're almost certainly better off trusting the optimizer [13:17:56] petan: Did you see my comment on your bug? [13:18:03] no [13:18:04] :D [13:18:24] (02:00:09 PM) YuviPanda: "JohannesK_WMDE: hmm, perhaps it can then run on its own labs instance? should be pretty easy to pick that up that way" [13:18:25] Coren: okiedokie :) [13:18:39] can't even find it :( [13:18:39] YuviPanda: catgraph *is* running on its own labs instance [13:18:45] what did you do to it... [13:18:50] Coren: thanks [13:18:55] hmm? why is it taking up memory on tools-labs? [13:19:24] petan: https://bugzilla.wikimedia.org/show_bug.cgi?id=49946 [13:19:52] Coren I did that... [13:20:03] Coren the bug contains 2 issues, you answered just to one of them [13:20:12] ... [13:20:13] ah one more question, whilst i'm here: on the old toolserver one had to specify a runtime-limit for a task, i didnt see anything on that for the new tool labs. so i just qsub all my tasks like so "qsub foo.sh" no matter the runtime? [13:20:18] oh. Didn't notice. Lemme go check. [13:20:29] nutz: Right. No such distinction here. [13:20:33] Coren: neat :) [13:20:35] thanks [13:20:35] petan: I go see [13:21:54] YuviPanda: what you want to do with graphs and neo4j may be already covered by catgraph. did you check it out? [13:22:03] ah, no. let me do that. [13:22:21] I've Caught Not Reading, will do :) [13:22:37] neo4j was just a random idea tho, currently doing something else [13:22:39] YuviPanda: catgraph isn't taking memory on tool labs. 
the article list generator backend is. :) [13:23:15] YuviPanda: http://tools.wmflabs.org/render-tests/catgraph/ (partly outdated... the instances on toolserver are still there, but the new stuff is on sylvester in labs) [13:23:48] heh, at this point I should just read up more [13:23:49] :) [13:25:03] * nutz waves at lbenedix [13:25:07] * nutz waves at lbenedix1 [13:28:47] valhallasw: addshore behold, stacktraces :) http://tools.wmflabs.org/dumpscan/cgi-bin/dumpscan.py [13:29:00] :D [13:29:18] petan: How odd. Something broke in puppet and the upstart_job resource doesn't seem to be working anymore; that left your boxen with a missing file. Patching around it. [13:29:29] ok [13:31:55] addshore [13:32:04] yus [13:32:06] bug #11 in pidgeon partially resolved [13:32:11] what dcc do you need? :P [13:32:19] pidgeon now supports ssl chat and regular chat in dcc [13:32:25] * addshore can't remember what 11 was ;p [13:32:47] pidgeonclient.org/bugzilla/show_bug.cgi?id=11 [13:32:51] http://pidgeonclient.org/bugzilla/show_bug.cgi?id=11 [13:33:08] :D [13:33:18] :o [13:33:20] i can't even remember when / why I needed it :> [13:33:27] but I didn't have a chance to test it :P [13:33:34] it's useful since the NSA is scanning everything [13:33:51] SSL DCC is IMHO one of the most secure means of communication on the internet these days [13:33:52] is it possible to read the my.cnf easily from php? [13:34:04] fale: no [13:34:14] petan: awesome :D thanks [13:34:17] my.cnf is not distributed by mysql server [13:34:46] fale, wait you mean your own .my.cnf?
[13:34:47] petan: Fix't; but you'll need to run puppet and reboot [13:34:52] petan: I meant the replica.my.cnf ;) [13:34:54] Coren ok [13:35:00] fale oh yes that is possible and very easy [13:35:09] fale: I thought you meant /etc/mysql/my.cnf [13:35:18] my.cnf != .my.cnf [13:35:23] petan: sorry [13:35:28] :) [13:36:25] addshore: http://tools.wmflabs.org/dumpscan/cgi-bin/dumpscan.py/index [13:36:49] > [13:36:50] :> [13:37:02] YuviPanda: if you're good with python you could probably finish the scanning script also ;p [13:37:09] scroll to the bottom and you'll see everything :. [13:37:24] then where would you get practice from? :P [13:37:35] I think I can consider this 'done', let me pull request [13:37:43] addshore: isn't this going to be super slow? [13:37:51] addshore: going through dumps instead of doing a mysql query? [13:37:58] addshore: does this tool already exist elsewhere? what is the purpose of this? [13:37:58] addshore come on, buglist is empty now :/ go fill in some bug don't tell me pidgeon is so perfect it doesn't have bugs :P [13:38:35] Coren I have to reboot all servers? [13:38:38] Coren: Poke? [13:41:21] Coren can I create a symlink in puppet git if I want 2 files to have the same content? [13:41:29] I don't even know if git supports symlinks [13:42:21] petan: git always follows symlinks... no way to turn that off afaik [13:43:51] YuviPanda: it would be interesting to combine that fulltext search with the deep, practically-instant category search of catgraph :) [13:43:54] JohannesK_WMDE: nonsense [13:43:59] git supports symlinks [13:44:04] valhallasw: how? [13:44:52] valhallasw: addshore https://github.com/addshore/dumpscan/pull/3 [13:45:01] let me know when you pull it, i'll set it up on tools [13:45:10] (requires virtualenv, which I've already setup) [13:45:12] http://bpaste.net/show/tWRFeeRl8ExkxdaxKZbW/ [13:45:17] JohannesK_WMDE, petan ^ [13:45:30] JohannesK_WMDE: true! I think some sort of pre-indexing for fulltext search is essential, though.
Running through dumps would be madness [13:45:37] (I dunno what the current tool does, haven't looked at those yet) [13:45:56] merged [13:46:01] JohannesK_WMDE: although I'm not sure what happens when you check it out on windows [13:46:07] !log toolsbeta petrb: rebooting all servers [13:46:09] addshore: that was fast :) [13:46:09] Logged the message, Master [13:46:19] Can I virtualenv from cgi? [13:46:35] definitely [13:46:43] virtualenv from everywhere \o/ [13:46:43] :) [13:46:46] a930913: just use #!/path/to/virtualenv/bin/python [13:46:49] valhallasw: yes, you can check in a symlink, but this is what happens to it: http://stackoverflow.com/questions/954560/what-does-git-do-to-files-that-are-a-symbolic-link (stores contents, not link) [13:47:03] addshore: look at https://github.com/addshore/dumpscan/blob/master/dumpscan.py for CGI + flask + virtualenv [13:47:06] valhallasw: Ta. [13:47:09] or just the first 3 lines for CGI + virtualenv [13:47:15] petan: Yes, reboot ALL THE SERVERS!!! [13:47:15] JohannesK_WMDE: http://bpaste.net/show/eHrJOArx54skn5xSmmGF/ [13:47:16] valhallasw: don't you need to add the site path too? [13:47:21] a930913: poke back! [13:47:29] YuviPanda: nope, that magically works as far as I know [13:47:33] I've created a tool that needs a mysql db (for its data). Does this db exist? Do I have to create it? On which hostname will it respond? [13:47:38] hmm, it didn't work for me when I was doing it for legoktm [13:47:49] JohannesK_WMDE: I have no clue what symlink support would mean other than "when I clone the repo I get a symlink" [13:47:59] fale: You can create it, on 'tools-db'. [13:48:07] Coren: thanks [13:48:09] YuviPanda: valhallasw i still barely understand py ;p [13:48:14] YuviPanda: hm, strange. Maybe there is something different with CGI, but I wouldn't know why [13:48:46] JohannesK_WMDE: also: we are in 2013 now, not 2009 ;-) [13:48:48] addshore: is this going to essentially 'grep' through the dumps?
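The `#!/path/to/virtualenv/bin/python` trick discussed above is the whole answer to "Can I virtualenv from cgi?": point the script at the virtualenv's interpreter and its site-packages come along automatically. A Flask app is just a WSGI callable, so the stdlib's CGIHandler can serve it; the sketch below uses a bare WSGI app to stay dependency-free, and the shebang path is hypothetical:

```python
#!/data/project/dumpscan/virtualenv/bin/python
# Shebang points at the virtualenv's interpreter (path is hypothetical);
# that alone makes the venv's installed packages importable under CGI.
import os

def app(environ, start_response):
    # Minimal WSGI callable; a Flask app object works the same way here.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello from CGI\n"]

if __name__ == "__main__" and "GATEWAY_INTERFACE" in os.environ:
    # Only reached when actually invoked as a CGI script by the web server.
    from wsgiref.handlers import CGIHandler
    CGIHandler().run(app)
```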
[13:48:59] JohannesK_WMDE: so it might just be that it has been added in the last four years [13:49:25] JohannesK_WMDE: eh, and your link actually states it just works?! [13:49:45] JohannesK_WMDE: "It then stores the name, mode and type (including the fact that it is a symlink) in the tree object that represents its containing directory." [13:49:55] valhallasw: the link says: "git just stores the contents of the link (i.e. the path of the file system object that it links to) in a 'blob' just like it would for a normal file. It then stores the name, mode and type (including the fact that it is a symlink) in the tree object that represents its containing directory." [13:50:12] valhallasw: which is exactly what i said before. [13:50:26] JohannesK_WMDE: yes. The contents of the *link*, not the contents of the *file* [13:50:45] that's not what 'following symlinks' means. [13:53:15] valhallasw: yeah, OK [13:53:40] addshore: also merge https://github.com/addshore/dumpscan/pull/4 [13:53:41] ? [13:54:42] petan: anyway, conclusion: you can make a symlink, and it will be cloned as a symlink [13:55:21] valhallasw, petan: it is ... different for symlinked directories though [13:56:17] valhallasw: That hasn't worked :( [13:56:34] JohannesK_WMDE: "It's worth noting that Pythonic's warnings about symlinked directories do not apply to versioned symlinks. The major edge case in question was that of folks symlinking some or all of the working tree into a different path (say onto a different partition with more disk space) and expecting git to check out code through the existing symlink." [13:56:56] valhallasw: do you have merge rights on addshore's dumpscan? [13:57:12] a930913: hm, sorry, I thought that should work. You can try the first three lines from https://github.com/addshore/dumpscan/blob/master/dumpscan.py , though [13:57:18] Coren: Might not need you if virtualenv can do what valhallasw said it can. [13:57:32] valhallasw: yes, i read that. see answer #54, same link. 
[13:57:52] "Any git pull with an update removes the link and makes it a normal directory" [13:58:18] YuviPanda: valhallasw closed the pull request just now [13:58:24] ty [13:58:34] sorry my internet just dropped :P [13:59:17] addshore, I'm teaching myself Objective-C [13:59:29] Hi all. Anyone know how to connect from tools-login to the tools web server? curl 'http://localhost/...' only says "couldn't connect to host". Tried "--noproxy localhost", or connection to the "full" URL, no joy. [13:59:49] magnus__: try tools-webserver-01 [14:00:25] magnus__: Johannes is hypothetically correct, but enduser access to the actual webservers isn't allowed for privacy policy reasons. [14:01:16] JohannesK_WMDE thanks, that works! [14:01:26] Oh. Through HTTP. [14:01:37] * Coren pretends he didn't say anything. [14:01:47] addshore, any updates with peachy software. [14:01:51] :p [14:01:51] JohannesK_WMDE: http://bpaste.net/show/BRsDU2rt23QYlRa34PgX/ < works for me. [14:03:33] JohannesK_WMDE: but if you have a test case that shows what happens, I'd be interested [14:03:55] Coren for some reason suphp doesn't work with apache 2.4 :/ [14:04:03] I will downgrade toolsbeta to 2.2.2 [14:04:26] Is there any rule about not draining bandwidth? [14:04:41] a930913: If you overdo it, you will get LARTed. [14:05:05] !lart [14:05:10] what is that [14:05:12] valhallasw: then maybe something changed in git in the last year or so. i remember having problems with symlinks in git before. the "fatal: 'dir_symlink/file' is beyond a symbolic link" git error is new, see http://git.661346.n2.nabble.com/Not-going-beyond-symbolic-links-td667979.html [14:05:21] Coren: How much is overdo? [14:05:22] addshore: valhallasw https://github.com/addshore/dumpscan/pull/5 [14:05:25] merge? [14:05:32] petan: Luser Attitude Readjustment Tool. [14:05:33] addshore, you there? [14:05:40] a930913: Make ops notice. 
:-) [14:06:14] Cyberpower678: he said his internet is choppy [14:06:17] so he may or may not be here [14:06:26] a930913: We don't have firm rules on resource use; we prefer to only put rules in place if there is abuse that needs curbing. If you need a lot of bandwidth for a while, it's probably okay. [14:06:28] YuviPanda, ah, [14:06:42] a930913: If you abuse the freedom, ops will abuse /you/ :-) [14:06:54] * Cyberpower678 understands and wants to learn objective-c [14:06:56] JohannesK_WMDE: ah, cool. That explains it. Hurray for development :-) [14:07:23] Coren: Does the importance of the tool/bot allow it to consume more resources, and vice versa? [14:08:16] addshore: valhallasw okay, http://tools.wmflabs.org/dumpscan/cgi-bin/dumpscan.py/index now works, is fully python, and puts out files in the same format as before; the path is configurable in the config.py file [14:08:29] addshore: code lives at ~/dumpscan/dumpscan, required file is symlinked to cgi-bin [14:08:36] feel free to change the path :) [14:08:47] however, redis exists, so it may be easier to use that as a queue :) [14:08:59] sorry, currently trying to work out why this doesn't work: Fatal error: Class 'Wikibase\EntityId' not found in C:\xampp\htdocs\mediawiki\extensions\Wikibase\lib\WikibaseLib.php on line 164 [14:09:01] see https://github.com/yuvipanda/SuchABot for a gerrit queue implementation [14:09:06] a930913: Not in formal terms, no. It's a collaborative environment; we're expecting everyone to do their best and we'll help as needed. [14:09:18] addshore: valhallasw yeah, just doing a dump of what I did so far. Need to go do 'work' stuff now [14:09:53] addshore: You're clearly trying to do stuff under Windows. That's probably your problem right there. :-P [14:10:13] haha Coren maybe ;p [14:10:34] addshore: vagrant :) no looking back from there, methinks [14:10:43] YuviPanda: yee [14:10:47] i really should..
[14:10:49] yes [14:11:04] anyway, look at the dumpscan code and let me know if some code isn't clear :) [14:11:08] hopefully all of it is straightforward [14:11:15] * YuviPanda goes back to doing 'work' work [14:22:08] Coren more [14:22:10] [Fri Jun 21 14:21:08 2013] [error] [client 10.4.0.153] SoftException in Application.cpp:299: Script "/data/project/.system/public_html/index.php" resolving to "/data/project/.system/public_html/index.php" not within configured docroot [14:22:11] [Fri Jun 21 14:21:08 2013] [error] [client 10.4.0.153] Premature end of script headers: index.php [14:22:18] why is that :o [14:22:31] that is from toolsbeta [14:23:38] did you adapt the /etc/suphp/suphp.conf like tools's? [14:24:01] In particular, docroot=/data/project [14:24:03] :-) [14:25:38] Coren: btw: what are the privacy policy reasons you mentioned that prohibit access to the webserver? [14:26:01] aha [14:26:44] JohannesK_WMDE: There is unavoidable information leak from web users on a webserver; through netstat, ps, etc. [14:28:18] Coren I can't find it anywhere, in tools we don't have such an option in suphp.conf [14:28:19] Coren: hmmm, i don't see what private data you could gather from that, that you couldn't get from the same commands on -login, for instance. i thought the webserver didn't even log the remote client IPs [14:28:32] ah /etc/suphp [14:28:33] nvm [14:28:57] this needs to go to puppet [14:29:22] YuviPanda, addshore: do you know whether fcgi works in werkzeug? [14:29:25] JohannesK_WMDE: Anonymous users won't follow links from the projects to -login (which isn't protected by the privacy policy); it's the web services we are worried about. [14:29:56] petan: Yes it does. I never got around to puppetizing the webservers yet. [14:30:29] what is php-wrapper [14:30:43] that is some of your handmade tools isn't it :P [14:31:01] HA [14:31:02] it is [14:31:03] :) [14:31:04] petan: Oh! Yes, yes it is and I forgot about it.
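The "not within configured docroot" SoftException above is suphp's path check, and Coren's hint names the fix. A minimal fragment of `/etc/suphp/suphp.conf`; the section and key names are the stock suphp ones, the value comes from the discussion:

```ini
; /etc/suphp/suphp.conf (fragment)
[global]
; Scripts must resolve to a path under docroot, otherwise suphp refuses
; to run them with "not within configured docroot".
docroot=/data/project
```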
[14:31:13] !toolsadmin [14:31:13] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Documentation/Admin [14:31:52] I added that so that error /starting/ the PHP script would be reported [14:31:56] Coren: i don't understand what you mean by "Anonymous users following links from the projects to -login" [14:32:10] JohannesK_WMDE: Take, for instance, geohack. [14:32:56] yes...? [14:32:58] JohannesK_WMDE: There are links in infoboxes from projects to geohack; we don't want people following those links to have less privacy. [14:33:15] JohannesK_WMDE: The alternative is to have an ugly-ass intersitial with a disclaimer. [14:33:20] Wheeeeeeeheheeeeee [14:33:26] tools-beta.wmflabs.org [14:33:30] it's up [14:34:05] YuviPanda place for you to play with ^ [14:34:20] Might want to change the text and links to point to toolsbeta though [14:34:27] Coren: i understand, but i don't see how the data on the webservers is indivitual-related. [14:34:36] *individual [14:34:40] it does [14:34:47] JohannesK_WMDE: Even just being able to correlate visits to the tools with a user is verbotten. [14:34:59] ah these links.. [14:35:10] i thought it was all blanked out in the log files, e.g. ip addresses replaced by 127.0.0.1 [14:35:30] hey everybody, is there a way to accurately identify reverts (, except for taking ALL texts from a page, md5-ing and comparing them)? there's no flag being set, right? [14:35:41] Coren I don't want to change anything, we should have 1 package for both projects, do changes on beta and once they are stable, push them to tools [14:35:59] there should be some change like if ( $project == beta ) ... blah [14:36:59] JohannesK_WMDE: Most of it is, yes. I do my best to limit the exposure to the tools themselves. No redaction is perfect, though, and there is no defense against correlation attacks. 
[14:37:17] which reminds me, Coren, that toolsbeta-login is looking for tools-master instead of toolsbeta-master [14:37:35] I am pretty sure it's in puppet somewhere [14:37:56] petan: Of course it is; that's the primary parameter to the gridengine class. [14:38:10] can't we make it like motd? [14:38:15] change parameter per project name [14:38:25] petan: Wait, did you use the role::labs::tools roles directly? [14:38:35] yes [14:38:36] * Coren facepalms. [14:39:00] You're supposed to make a copy to role::labs::toolsbeta and change the settings accordingly! :-) [14:39:05] we want testing environment that is same as production, not testing environment which is pretending to be same [14:39:23] The environment is defined in the module; your server roles aren't. :-) [14:39:34] if I had to do copy of every single change that would be pain in the ass [14:39:44] even Ryan doesn't do that in production [14:39:51] he has some puppet variable realm [14:40:05] and he just use same classes in labs as in production [14:40:10] where this realm just contain different value [14:40:13] petan: Yes, and that's a bad thing. Make ::toolsbeta. [14:40:27] what's bad on it? [14:40:52] petan: For one, with that scheme you won't be able to test changes in the _roles_; that kinda defeats the purpose of a beta project. [14:40:59] if you do a change to any tools class how it get merged to toolsbeta automagically? [14:41:24] petan: You didn't actually read the class, have you? All the roles to is include stuff from the toollabs module. [14:41:33] petan: do* [14:41:55] petan: manifests/role/labs.pp [14:42:15] how does it answer my question :o [14:42:36] you mean create a new class but use same module? [14:43:14] petan: Yes! Just copy the ::tools class into a ::toolsbeta class; change the parameters as appropriate. Both will still include the toollabs module.
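[The role-copying scheme Coren describes — thin roles over a shared toollabs module — might look something like this in manifests/role/labs.pp. The class and parameter names below are illustrative guesses, not the actual interface of the toollabs module.]

```puppet
# Hypothetical sketch: a toolsbeta role reusing the shared toollabs module
# but pointing at the beta grid master instead of the production one.
class role::labs::toolsbeta::login {
    class { 'toollabs::login':
        # Only the parameters differ between the tools and toolsbeta roles;
        # all the actual configuration lives in the toollabs module.
        gridmaster => 'toolsbeta-master.pmtpa.wmflabs',
    }
}
```

Because the roles are just parameterized wrappers, changes to a role can be tried in toolsbeta without touching tools, which is the point Coren makes about testing changes in the _roles_ themselves.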
[14:43:17] role != module [14:43:23] ok [14:44:40] So what you're making, basically, is a set of roles for "toolsbeta" which are defined as "just like tools, basically" :-) [14:44:59] But where you can change parameters like $grid_master [14:45:00] :-) [14:45:41] Which means, also, that you can try new roles or changes to roles without breaking tools. [15:16:07] Coren I updated the classes [15:16:08] err: Could not retrieve catalog from remote server: Error 400 on SERVER: Duplicate definition: File[/etc/ssh/ssh_known_hosts] is already defined in file /etc/puppet/manifests/ssh.pp at line 70; cannot redefine at /etc/puppet/modules/toollabs/manifests/init.pp:58 on node i-000007d6.pmtpa.wmflabs [15:16:09] warning: Not using cache on failed catalog [15:16:10] err: Could not retrieve catalog; skipping run [15:16:36] petan, slice [15:16:44] what [15:16:50] petan, sudo make me a sandwich, [15:17:03] :D [15:17:06] Error: Cyberpower678 is not in sudoers list [15:17:22] :P [15:18:30] petan: role::labs::tools::* should be the only included class in your project. [15:18:42] it is afaik [15:18:46] petan: Did you add the roles to the puppet groups config and remove the other ones? [15:18:52] what wait [15:18:58] yes I did [15:19:10] petan: Did you /remove/ the class checkmark before you removed them from the groups? :-) [15:19:12] you say petan: role::labs::tools::* s [15:19:17] you meant toollabs? [15:19:22] I mean ::toollabs:: yes [15:19:36] yes I did uncheck them [15:19:39] petan, sudo add Cyberpower678 to sudoers list. [15:19:44] after that I removed them entirely [15:19:48] petan: Hm. 
[15:20:11] Cyberpower678 bash: add: command not found [15:21:36] petan: You're including class ssh from /somewhere/ [15:22:19] :/ [15:22:25] I just switched tools to toolsbeta [15:22:27] before it did work [15:22:39] I hate puppet [15:22:51] that thing is so ugly and creepy [15:23:01] the idea is great, the implementation suck [15:23:25] Coren, add me to the sudoers list so I can have petan sudo make me a sandwich. [15:23:57] make: *** No rule to make target `me'. Stop. [15:24:03] Cyberpower678 ^ doesn't work [15:25:54] :D [15:26:04] D [15:26:04] D [15:26:04] D [15:26:04] D [15:26:04] D [15:26:04] D [15:26:06] D [15:26:08] D [15:26:10] D [15:26:12] D [15:26:18] D [15:26:20] D [15:26:22] D [15:26:24] D [15:26:26] WTF? [15:26:32] I did not just do that. [15:26:52] Stupid XChat [15:28:05] ... flooding [15:28:13] xChat :P [15:28:34] lol [15:28:44] sudo rm Cyberpower678 [15:29:03] Cyberpower678 get a proper irc client [15:29:12] rm o_O [15:35:08] Coren it started to work [15:36:09] Coren what is that: [15:36:10] err: /Stage[main]/Toollabs::Bastion/File[/usr/bin/sql]: Could not evaluate: getaddrinfo: Name or service not known Could not retrieve file metadata for puppet://modules/toollabs/sql: getaddrinfo: Name or service not known at /etc/puppet/modules/toollabs/manifests/bastion.pp:46 [15:36:13] :/ [15:36:19] it doesn't want to install sql [15:38:21] Typo in the module. Odd that I never noticed. [15:40:15] Fixed [15:40:45] (Probably didn't notice because sql is already installed on tools-login [16:05:15] I can't connect to gerrit from labs: [16:05:16] git review -s [16:05:16] Could not connect to gerrit. [16:06:31] any help please? [16:08:26] ebraminio, are you able to ssh to other labs instances from that same session? 
[16:09:01] git review requires your ssh private key to be forwarded if you are doing git review from a labs instance [16:41:52] andrewbogott: thanks, nvm, i did it locally [16:42:10] ebraminio: OK, that's probably the right solution anyway :) [18:14:39] nutz: huhu [18:58:02] huh. i'm trying to create a table now but i do get: ERROR 1142 (42000): CREATE command denied to user 'u3356'@'10.4.0.220' for table 'u3356__mytable' . am i missing something? i'm on dewiki_p [19:09:14] nutz: I think you want CREATE DATABASE u3356__mydb, then you can create the table inside it (e.g. CREATE TABLE u3356__mydb.mytable (...)) [19:12:03] Coren: Hmm. I tried "drop database p50380g40030__test" on enwiki.labsdb, and it complains at me "ERROR 145 (HY000): Table './mysql/proc' is marked as crashed and should be repaired". [19:12:30] anomie: oooooh okay, thanks :) [19:12:57] anomie: o_O That sounds ominous. [19:15:16] anomie: Crashed client leaving it in an odd state. Try again? [19:16:38] Coren: Worked, thanks [20:02:09] hey [20:02:42] hey [20:02:44] :D [20:06:29] :o [20:09:58] Coren ping [20:10:08] pong [20:10:10] Coren I am still having troubles on toolsbeta-login [20:10:19] it still try to connect to tools-master [20:10:19] Such as? [20:11:05] Oy. Yeah, the parameter is a pressed; it's only taken into account when gridengine is installed. purge it, then run puppet again. [20:11:13] k [20:11:20] also: [20:11:23] (You'll need to do it on all of the grid) [20:11:47] notice: /Stage[main]/Base::Puppet/Exec[puppet snmp trap]/returns: executed successfully [20:11:52] err: /Stage[main]/Toollabs::Bastion/File[/etc/update-motd.d/40-bastion-banner]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/toollabs/40-toolsbeta-bastion-banner at /etc/puppet/modules/toollabs/manifests/bastion.pp:29 [20:11:53] notice: /Stage[main]/Toollabs/Exec[make_known_hosts]/returns: executed successfully [20:13:50] LOL [20:13:51] nevermind [20:14:04] ...
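[anomie's advice to nutz above, spelled out. The database prefix (u3356) comes from the log; the column definitions are purely illustrative, not anything nutz actually ran.]

```sql
-- On the labs replica servers, users cannot create tables inside the
-- replicated wiki databases (hence the ERROR 1142 on dewiki_p).
-- Instead, create a user database named <credential user>__<name>:
CREATE DATABASE u3356__mydb;

-- Tables then live inside that database, qualified by its name
-- (columns here are made up for illustration):
CREATE TABLE u3356__mydb.mytable (
    id INT UNSIGNED NOT NULL PRIMARY KEY,
    title VARBINARY(255)
);
```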
did you put a 40-toolsbeta-bastion-banner in puppet? :-) [20:15:20] no :P [20:15:25] I can't read [20:23:40] Coren I can't ssh to toolsbeta-exec-01 :/ [20:23:47] permission denied [20:24:12] ... odd. [20:24:42] Are you sure puppet had enough time to run twice so that the key exchange takes place? [20:26:49] not sure... it was a while I last sshed there [20:41:50] . [20:41:53] o.O [20:42:06] I just disconnected my modem and connection is still alive :O [20:45:08] Yeay TCP when your IP remains. [20:47:16] I didn't know tcp can do this [20:48:14] this place has almost no signal coverage, internet is like 5kbs :/ [20:48:35] I am downloading some packages to my laptop and I never saw apt so slow [21:00:49] hi I've a service group in tools i ran parsoid (a code) but it seems they're not in a same host [21:01:39] how can i run a code in the same host [21:10:53] Amir1: Which "same host"? If you want to run two arbitrary things on the same host, it's generally as simple as starting both from the same script. [21:11:29] Amir1: If you mean "run it from the -login host" then the answer is http://i2.kym-cdn.com/entries/icons/original/000/007/423/untitle.JPG [21:11:58] :))) [21:12:30] Coren, haha [21:13:03] Amir1, I used to run my bot on -login. The results were, no one could login. :p [21:13:25] I don't where is the problem [21:13:34] but my wiki is http://tools.wmflabs.org/wikitest-rtl/w/index.php [21:13:41] Coren yelled at me and thoroughly trouted me. [21:13:59] That's using the webserver. [21:14:01] I run node server.js on my account or in wikitest-rtl account [21:14:09] they can't connect to each other [21:14:31] Why are you trying to connect them>? [21:14:43] VE [21:15:06] http://www.mediawiki.org/wiki/VE [21:15:12] http://www.mediawiki.org/wiki/Parsoid [21:16:16] I don't get it. [21:16:41] Why don't you load them into the same account? [21:16:44] I want to test VE on the wiki [21:16:57] so i have to run parsoid [21:17:15] Give me the two file paths? 
[21:17:50] I ran them [21:17:56] in the same account [21:18:02] but still they can't connect [21:18:23] Can you give me the file paths? [21:18:36] I want to have a look. [21:18:51] * addshore points at https://bugzilla.wikimedia.org/show_bug.cgi?id=48897 for anyone that can magic it up :) [21:19:16] oh ok [21:19:32] the wiki /data/project/wikitest-rtl/public_html/w [21:20:05] parsoid is /data/project/wikitest-rtl/public_html/w/extensions/Parsoid/api/js/server.js or /data/project/wikitest-rtl/public_html/w/extensions/Parsoid/js/apiserver.js [21:20:56] Amir1 try "take $HOME/public_html" [21:21:41] I don't get [21:21:56] Run that command in bash/ [21:23:04] local-wikitest-rtl@tools-login:~$ take $HOME/public_html [21:23:05] README.mediawiki: will not follow or touch symlinks [21:23:07] pegjs: will not follow or touch symlinks [21:23:08] express: will not follow or touch symlinks [21:23:17] (sorry for flooding but it's just four lines) [21:23:39] Try running it. [21:23:57] It's done [21:24:04] the command ends [21:24:16] No I meant try to connect. [21:24:38] ok [21:25:50] I ran but still can't connect [21:26:46] hmm. [21:27:23] try "chmod -R 775 $HOME/public_html" [21:29:31] still same [21:29:51] hmm... [21:29:57] Coren, your turn [21:30:18] My experience isn't good enough yet here on labs. [21:30:37] * addshore pokes Coren to https://bugzilla.wikimedia.org/show_bug.cgi?id=48897 ;p [21:32:35] addshore, i don't get it. [21:33:35] addshore: I'll ask Asher to give me the views and I'll add the other table to it before deploying. [21:38:26] Coren: can you move this service group to the beta cluster ? [21:39:11] Amir1: Not really, but you can just create it there anew. [21:40:02] I don't think i can even create a new [21:40:14] I'm not authorized [21:46:28] Coren: could I also get libqt4-core and libqt4-dev on tools-dev? [21:46:54] valhalla1w: Now you're beginning to worry me. I *like* Qt a lot, but why on tools? 
:-) [21:47:24] Coren: because the KDE svn to git tool is written in Qt4 ;-) [21:47:36] and it actually makes C++ development quite pleasant [21:47:46] ...which is not what I'm used to from C++ [21:48:50] if it's too much effort, I can also compile stuff locally [21:49:08] ... I didn't expect -login to be running X clients; performance would be shitty at best. [21:49:10] cheers Coren :) [21:49:38] Coren: it's not an X client, it's purely command line [21:49:44] but it uses stuff like QProcess, QRegExp, etc [22:09:46] * fale thought that was weird to create bots in C++... but valhalla1w beats me :D [22:15:45] Coren: is possible to publish http pages from the user account? [22:35:03] fale: yes but no, a user account on tools can't have its own webspace, but it can publish into a tool's webspace [22:35:34] in other words, the only webspace is /data/project/*/public_html/ :) [23:23:49] @requests [23:23:57] * addshore goes somewhere else :>
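[The webspace rule addshore gives fale maps tool names to paths and URLs one-to-one. A tiny shell sketch of that mapping; "mytool" is a hypothetical tool name, not one from this log.]

```shell
# Sketch of the tools webspace mapping described above.
# "mytool" is a made-up tool name for illustration.
tool="mytool"
webroot="/data/project/$tool/public_html"
url="http://tools.wmflabs.org/$tool/"
echo "files in $webroot are served at $url"
```

A user account has no webspace of its own; anything it wants to serve has to be placed under some tool's public_html, e.g. via the `take` workflow mentioned earlier in the log.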