[12:24:33] hey raylton
[14:46:20] hey jonas_agx
[14:46:27] q c manda?
[14:46:53] i mean... what's up?
[15:08:45] o/ raylton
[15:11:51] hey halfak :D
[15:11:57] made it safely back home, I presume :)
[15:12:14] Hey YuviPanda.
[15:12:14] \o
[15:12:14] Nope.
[15:12:14] I'm out in Copenhagen
[15:12:17] haha
[15:12:19] Made it safely here though
[15:12:21] :S
[15:12:22] nice
[15:12:25] *:D
[15:12:27] * YuviPanda is in cold Glasgow now
[15:13:06] halfak: I encountered the nltk problem you encountered and found a fix and filed a bug. I am going to submit a pull request soon
[15:13:17] I'm taking this week easy, so doing ores puppet stuff
[15:13:22] I don't expect you to make time for it
[15:13:24] so don't worry :D
[15:13:34] Cool. Something that won't require us to create the symlink in /var/www?
[15:13:42] halfak: yes
[15:13:48] YuviPanda, you want to work on ores, you won't be able to hold me back :)
[15:13:52] :P
[15:14:02] I'm writing a patch atm
[15:14:13] I'll write the wsgi file after that. should be trivial
[15:14:41] BTW, when I fixed the enchant dependency issue, I didn't use pip freeze because that listed every dependency -- not just ores/revscoring dependencies.
[15:15:01] Sorry, not every dependency, every package
[15:15:03] The whole tree.
[15:15:17] is it called ORES because you're *mining* data?
[15:15:17] I thought requirements.txt should only contain top-level ones.
[15:15:22] harej, yes.
[15:15:26] :)
[15:15:37] whee
[15:15:46] IMO it should be the set of minimal requirements out of pip freeze required to run the app
[15:16:11] OK. That's what I thought and what I did.
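A minimal sketch of the requirements.txt approach discussed above: keep only the top-level packages from `pip freeze` output, together with their pinned versions. The helper name `pin_top_level`, the package list, and the version numbers are illustrative assumptions, not taken from the ORES or revscoring repositories.

```python
from typing import Iterable, List


def pin_top_level(freeze_output: str, top_level: Iterable[str]) -> List[str]:
    """Return the `name==version` lines of `pip freeze` output that belong
    to the given top-level packages (hypothetical helper, not part of ORES)."""
    wanted = {name.lower() for name in top_level}
    pinned = []
    for line in freeze_output.splitlines():
        line = line.strip()
        if "==" not in line:
            continue  # skip blank lines and anything that is not pinned
        name = line.partition("==")[0].strip()
        if name.lower() in wanted:
            pinned.append(line)
    return sorted(pinned, key=str.lower)


# Illustrative freeze output; the version numbers are made up.
freeze = """nltk==3.0.2
numpy==1.9.2
scikit-learn==0.16.1
six==1.9.0
"""

# Print the lines that would go into requirements.txt.
for requirement in pin_top_level(freeze, ["scikit-learn", "nltk"]):
    print(requirement)
```

Because every kept line carries a version, any change to the file signals that `pip install -r requirements.txt` needs to run again.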
[15:16:16] :)
[15:16:25] halfak: yeah but you're missing versions
[15:16:27] hmm
[15:16:30] Not anymore
[15:16:34] aaah
[15:16:36] I didn't see your patch :D
[15:16:36] :D
[15:16:37] then all good
[15:16:42] I'm lazy and just do pip freeze
[15:17:01] versions are important because then you know when to run pip install again - every time you touch that file
[15:17:41] +1
[15:19:45] halfak: you should also put the model files in a separate repo
[15:20:20] YuviPanda, was thinking about that.
[15:21:06] I need to sync scikit-learn versions with models. I was trying to think of a good way of doing that.
[15:22:00] YuviPanda, do you think that ores.wmflabs config should be in that repo too?
[15:22:10] hmmm, no.
[15:22:40] but you can maybe put config about the models in that repo
[15:22:48] and then do a config merge
[15:23:00] (it's usually nice to have config be mergeable across different files)
[15:23:05] and then just have a check in your loader :)
[15:23:16] halfak: hmm, actually.
[15:23:25] halfak: I think the check should be performed by the *deploy* script
[15:23:27] at deploy time
[15:23:29] rather than at runtime
[15:23:48] so what you should do is just have some way of specifying in the repo what version of scikit it should be
[15:23:54] and I can make sure that the deploy script verifies that
[15:26:24] * halfak has never merged a config file before.
[15:30:30] halfak: heh, we won't have to do it this time :)
[15:31:01] YuviPanda, you might need to hand-hold me a bit on what you imagine will live in this new repo
[15:31:25] halfak: so the model files are just blobs, right?
[15:31:29] yes
[15:31:32] halfak: can you point me to where they live now?
[15:32:09] ores-main:/home/revscoring.ores/projects/ores/models/
[15:32:24] ok
[15:33:35] halfak: let me just make the repo
[15:33:59] kk. I can put them where you point me.
[15:34:27] * halfak aggressively tries to get VE work done so he can switch models.
[15:34:29] *modes
[15:35:07] halfak: https://github.com/yuvipanda/ores-models
[15:35:11] that should be it
[15:35:17] sorry to distract yhou
[15:35:18] *you
[15:36:06] halfak: I'll let you go on with VE work now, let me know when you want to take a brether :)
[15:36:09] breather
[15:36:34] Oh no. I'm just about to start a long running job. Then I'll be able to start moving over the model files. :)
[15:36:41] ah :D
[15:36:41] cool
[15:36:55] I see.. just at the root.
[15:37:05] halfak: yeah, we can organize it however we want, doesn't quite matter
[15:37:17] halfak: I added a file for the scipy-version
[15:37:39] and if we put this in git, we can just use the commit sha of HEAD as the model's version :)
[15:49:07] YuviPanda, not all models will change at the same time.
[15:49:24] We'll more often add models than change them
[15:49:58] halfak: that's fine. we can just autogenerate the commit id of the last commit to touch them.
[15:54:33] halfak: hmm, ValueError: Model language revscoring.languages.persian does not match extractor language revscoring.languages.persian
[15:55:38] wut
[15:55:56] before that it said french not found but I guess that's because you added the french one recently
[15:56:09] this is using the model files on the repo I pointed you to
[15:56:23] Oh! So there's an open pull request that I have installed on Ores.
[15:56:50] Oh wait... no there isn't
[15:56:50] aaah :)
[15:56:53] oooh
[15:56:54] :P
[15:56:59] maybe I need to update this to master
[15:57:00] let me try
[15:58:11] nope
[15:58:14] I'm at master
[15:59:02] * halfak looks.
[15:59:13] You were using the fawiki model file, right?
[15:59:32] halfak: I just copied them from the path you gave me
[15:59:37] halfak: they're the files on the git repo
[16:00:01] * halfak downloads and tests
[16:01:25] * halfak accidentally downloads an html file and gets a fun error.
[16:02:45] Hmm... Works for me.
[16:03:10] * halfak re-installs things from pip to try again.
[16:03:58] Yup... works.
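A minimal sketch of the deploy-time check discussed above: the models repo carries a file naming the scikit-learn version the models were built with, and the deploy script refuses to continue when the installed version differs. The file name `scikit-learn-version` and all function names here are hypothetical; the actual file in the ores-models repo may be named and formatted differently.

```python
from importlib import metadata
from pathlib import Path


def read_pinned_version(version_file: str) -> str:
    """Read the expected version from a one-line text file."""
    return Path(version_file).read_text().strip()


def installed_version(package: str = "scikit-learn") -> str:
    """Look up the version of a package installed in the current environment."""
    return metadata.version(package)


def verify_version(pinned: str, installed: str) -> None:
    """Raise if the installed version differs from the pinned one."""
    if pinned != installed:
        raise RuntimeError(
            f"models expect scikit-learn {pinned}, but {installed} is installed"
        )


def check_before_deploy(version_file: str = "scikit-learn-version") -> None:
    """Meant to be called from the deploy script, not at service start-up."""
    verify_version(read_pinned_version(version_file), installed_version())
```

Keeping the comparison in `verify_version` separate from the file and environment lookups makes the check easy to exercise without a real deployment.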
[16:04:07] You get that when you try to start up the server?
[16:04:52] halfak: yeah.
[16:05:10] halfak: this is on ores-sigh host, /srv/ores/src
[16:05:53] morning
[16:07:29] afternoon, Ironholds. What's good?
[16:07:46] hey J-Mo :)
[16:08:18] YuviPanda! And how are you these days, sir?
[16:08:44] J-Mo: I'm great :D helping halfak with ORES
[16:09:10] sounds like fun
[16:09:12] yeah
[16:09:20] YuviPanda, what's "sigh" mean?
[16:09:21] halfak I'm really enjoying your WWW post-mortem
[16:09:25] :)
[16:09:39] what's 'www post-mortem'?
[16:09:46] * halfak will forward
[16:09:57] an email I sent out to the research list about my experience of WWW
[16:10:06] (the conference)
[16:10:11] TL;DR he went to a conference in Florence and all we got was this lousy email about corporaty-ness
[16:10:13] :P
[16:10:16] halfak: sigh because I created an instance earlier but fucked it up and had to recreate it and it was 1AM :)
[16:11:03] favorite line so far "Jeanette's call for "multistakeholder" discussions was followed by a panel of corporate speakers who were on various sides of the continuum of being able to actually speak their thoughts."
[16:11:20] hahaha
[16:11:47] J-Mo, ehh. Grumpy.
[16:12:17] YuviPanda, lol makes sense
[16:12:22] halfak: the log file is in /var/log/upstart/uwsgi_app-ores-web.log
[16:12:26] grumpy happens.
[16:12:29] :(
[16:13:03] YuviPanda, OK if I put this on the stack for now? I have VE data and I need to get it processed.
[16:13:08] halfak: sure!
[16:13:10] kk
[16:13:12] halfak: I'll debug and stuff :)
[16:13:14] To the R's
[16:13:25] YuviPanda, note that revscoring is a utility itself.
[16:13:37] "revscoring -h"
[16:13:40] aaah
[16:13:42] cool :)
[16:13:44] i'm back halfak, sorry for the delay
[16:13:51] so... what's up
[16:13:56] :)
[16:14:11] you'll find that "revscoring score ... --api=https://fa.wiki... is nice
[16:14:25] Hi raylton :)
[16:14:48] I heard from He7d3r that you might be interested in joining our machine learning revolution.
[16:14:49] :)
[16:15:42] well i want to try to know the project better ;)
[16:15:53] can you help me?
[16:16:20] yes. :)
[16:16:31] Want to think about the social context of it or the technical?
[16:17:23] Or the socio-technical: https://www.youtube.com/watch?v=Hj7o5d-OEis#t=3m11s
[16:17:49] (That's a talk I gave a couple of weeks ago)
[16:19:28] both, i think mainly social
[16:19:36] let me see the video
[16:19:56] Video will be a good starting point.
[16:20:12] I apologize in advance for the speed of my speech.
[16:22:50] it's ok... when can i meet you here in irc?
[16:26:33] raylton, I mostly live in this channel, but there's also #wikimedia-research-ORES too.
[16:26:38] That's more specific to the project.
[16:29:12] ok great. let me finish a thing, then i will watch the video.
[16:30:18] halfak: out of curiosity, why the decision to give ORES its own channel?
[16:30:40] Not mine :) I conceded to popular opinion
[16:30:54] I would actually like if ores chatter was in -research
[16:31:19] But it is kind of nice to have a clean log of conversations -- except when someone's connection is acting up :S
[16:37:30] * halfak loads up some R to do the things that python is bad at.
[17:16:19] Installing scipy takes fucking forever!
[17:16:27] but I don't want to mix deb packages and virtualenvs
[17:18:33] YuviPanda, use R
[17:18:51] Ironholds: I'm setting up halfa.k's ORES service, so you should talk to him :P
[17:19:12] also, I don't think I'd have offered to help scale ORES if it was in R. It's an Ops nightmare, IIRC
[17:31:44] ToAruShiroiNeko: hey! if you're around can you give me a test url to test ORES?
[17:32:05] halfak: ^
[17:35:36] sure
[17:35:40] one minute
[17:37:22] http://ores.wmflabs.org/scores/enwiki/?revids=34854345&models=reverted
[17:37:30] YuviPanda there is that
[17:38:21] Is this what you are looking for?
[17:40:46] ToAruShiroiNeko: yes thanks
[17:42:14] YuviPanda NLTK requires you to pay attention when installing.
I wish I could just apt-get install it and be done with it :p
[17:42:40] ToAruShiroiNeko: don't worry, I've abstracted it away enough, I think :)
[17:42:43] YuviPanda may I also suggest mirroring the Ores machine so that we can restore it if it dies. :p
[17:43:29] I am unsure how realistically such a thing can be done
[17:43:40] I would also suggest a FGS model for backup
[17:43:53] But then again, ops isn't my area :)
[17:44:04] Please take my suggestions with a grain of salt.
[17:48:36] ToAruShiroiNeko: where does it do this by default?
[17:48:43] root@ores-sigh:/srv/ores/src# curl 'localhost:5000/scores/enwiki/?revids=34854345&models=reverted'
[17:48:46] that returns 404
[17:49:24] do what exactly?
[17:50:22] ToAruShiroiNeko: I just ran the default on localhost
[17:50:25] and that URL is giving me a 404
[17:58:25] ToAruShiroiNeko: best way to do backup is to not need them :D
[17:58:29] and have all your services be stateless
[18:00:10] well
[18:00:23] maybe you need to use port 80?
[18:00:33] I don't have my setup running to connect to the server
[18:00:48] We would be saving a massive amount of data over time so corruption is a big fear.
[18:01:30] connect via ssh I mean
[18:08:16] ToAruShiroiNeko: data for what? are you talking about labels or just ORES?
[18:14:48] ToAruShiroiNeko: aha! found it
[18:14:49] http://localhost/scores/scores/enwiki/?revids=34854345&models=reverted works
[18:14:50] a bug!
[19:10:10] Bugs on our system? No way! :p
[19:10:28] YuviPanda Ores and labels both have different data
[19:10:45] I'd like to keep both if I can
[19:10:57] yeah, don't worry :) I'm doing only ORES right now
[19:11:10] but labels will be where our handcoding data gathered from volunteers lives
[19:11:16] it's vital that the data there is not lost
[19:11:27] we had a small DB hiccup a little while ago
[19:45:02] hey halfak, Ironholds: I can't seem to commit changes to Github from a repo on stat1003. connection times out.
it's been a while since I've tried, but I've definitely been able to do this in months past. any idea what the issue might be?
[19:45:13] http_proxy
[19:45:14] !
[19:45:24] * halfak gets a thingie for you to copy-paste
[19:48:32] thanks dood
[19:51:44] Did I mention I found a precursor to WMF Analytics? https://en.wikipedia.org/wiki/Wikipedia:Statistics_Department
[20:08:46] you did!
[20:08:50] heheh
[20:08:56] so, Google just released an Open Location Code standard
[20:09:09] basically it's a really simple way of consistently identifying a location down to ~10m, in 8 chars.
[20:09:45] Rich Iannone and I are writing a library to handle them. "I bet we can produce something performant in base R" "okay, here's some C++ I wrote and benchmarked, write me R that validates/invalidates a million codes in 0.25 seconds. I'll be impressed"
[20:48:44] * Ironholds looks at top, blinks
[20:48:48] "15346 ezachte 20 0 740320 34244 4148 R 57.6 0.1 0:01.74 R"
[20:48:53] ...did I have a stroke or is that R
[20:49:53] I can't cohere that line.
[20:51:41] harej, ErikZ, who I've only ever seen use perl, is using R
[21:32:01] quarry is working nicely
[21:32:14] Ironholds I just got a German translation started
[21:32:16] I am excited