[00:13:54] halfak: btw, revscoring / sklearn-nn are now installed
[00:14:01] pipi will have to kill and restart his container to get it
[01:03:46] (PS1) Ladsgroup: [WIP] Make "r" flag red [extensions/ORES] - https://gerrit.wikimedia.org/r/266167 (https://phabricator.wikimedia.org/T124616)
[01:04:35] (CR) jenkins-bot: [V: -1] [WIP] Make "r" flag red [extensions/ORES] - https://gerrit.wikimedia.org/r/266167 (https://phabricator.wikimedia.org/T124616) (owner: Ladsgroup)
[01:07:19] (PS2) Ladsgroup: [WIP] Make "r" flag red [extensions/ORES] - https://gerrit.wikimedia.org/r/266167 (https://phabricator.wikimedia.org/T124616)
[01:10:15] (CR) Ladsgroup: "A technical note: Performance of this patch is not what I wanted but there several strange things happening: I wanted to use FetchChangesL" [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) (owner: Ladsgroup)
[01:11:52] (CR) Ladsgroup: "It doesn't work in my local host but maybe it is caching issue." [extensions/ORES] - https://gerrit.wikimedia.org/r/266167 (https://phabricator.wikimedia.org/T124616) (owner: Ladsgroup)
[04:08:37] (PS3) Ladsgroup: Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443)
[04:10:48] (CR) Ladsgroup: "If you think these changes are good now. Tell me. I need to rewrite some scripts." [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup)
[08:06:36] Hello!
[08:07:14] YuviPanda, I suppose scipy and all is still building.
[08:18:04] You mentioned I have to kill and restart my container. Guess I have to figure out how to do that - simply shutting down my terminals and notebooks didn't do the job.
[08:42:11] (CR) Hoo man: [C: -1] "Only looked at the sql schemas."
(3 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup)
[08:52:45] Heh, a simple kill of the ps process. Gtg now. CU L8er. Bye
[08:56:41] (PS4) Ladsgroup: Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443)
[09:02:47] (CR) Hoo man: Some minor improvements to the database schema (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup)
[09:32:24] (PS5) Ladsgroup: Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443)
[09:35:27] (CR) Hoo man: Some minor improvements to the database schema (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup)
[14:29:18] o/
[14:29:20] Hey Amir1
[14:29:31] sorry to drop off yesterday. I'm picking up where I left off.
[14:30:04] Was hoping that I could give you an example of how drastically our scoring will change with these Gradient Boosting models if I don't balance the input sets.
[14:30:28] This is one of the examples that we use often: http://ores.wmflabs.org/scores/enwiki/damaging/642215410/
[14:30:36] It scores 0.9366 with the current model.
[14:30:51] It scores 0.3385 with the new model.
[14:32:35] This 0.3385 is a pretty high score for the new model.
[14:36:19] hey halfak, I'm in the middle of something really good
[14:36:24] kk
[14:36:26] I'll tell you once it's finished
[14:36:29] :)
[14:36:31] * halfak looks into sample weights.
[14:48:17] So, there's a sample_weight param for sklearn's fit methods, but it's only available for new sklearn (0.17+)
[14:48:59] * halfak starts compiling 0.17
[15:01:30] OK!
It looks like I can use the sample_weight param to make the scoring model behave as though it were given a balanced training set. :)
[15:02:13] Looks like it doesn't affect fitness
[15:03:07] I hope it's not going to be a pain to switch to using sklearn 0.17
[15:09:42] ORES is down and I can't ssh to web-01
[16:53:04] BTW downtime was related to a labs-wide DNS issue and a restart of the DNS solved the issues for ORES
[16:53:17] This was resolved a while ago, but I forgot to make note of it.
[17:07:32] Arg! And now I'm running into a scipy bug.
[17:07:42] I wonder if it was fixed in a later version of scipy
[17:08:52] Ha! Got it!
[17:29:24] I think we'll need to update some version dependencies with the release.
[17:29:29] YuviPanda, ^
[17:29:41] I'd like to bump the sklearn dependency from 0.15 to 0.17
[17:29:44] How much pain is that?
[17:30:06] BTW, sklearn was a fast compile on top of numpy and scipy. And we can stick with the old version of those.
[17:30:27] we already install sklearn with pip right?
[17:30:30] if so it should be good
[17:30:37] OK. Yeah. that's right.
[17:30:41] OK. Bumping!
[17:30:56] our actual deployment might end up using pip in the end
[17:30:58] we'll see
[17:31:03] * halfak gets into scaling and balancing and reporting and evaluation metrics this morning.
[17:31:30] YuviPanda, now that I have gotten better at understanding pip's behavior, that doesn't sound *absolutely insane* anymore.
[17:31:50] I haven't had a surprise compile in a while.
[17:31:53] heh
[21:35:20] halfak: Conversion of a generic PCFG into one in Chomsky normal form has arrived.
[21:35:21] https://github.com/aetilley/pcfg
[21:36:05] (or at least is now part of this package)
[21:36:17] I think you'll dig, if you get a chance to check it out.
[21:36:39] Now I'm going to get some sunshine.
[21:38:25] aetilley, cool. Will try to have a look soon. Digging into evaluation metrics today.
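[editor's note] The sample_weight idea discussed between 14:48 and 15:01 can be sketched roughly as below. This is a minimal illustration, not the code halfak actually wrote: the helper name `balanced_sample_weights` and the toy labels are assumptions made for the example. It weights each observation inversely to its class frequency, so every class contributes the same total weight during fitting, as if the training set had been balanced.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Return one weight per observation, inversely proportional to the
    frequency of that observation's class (hypothetical helper)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # Each class ends up with a total weight of n_samples / n_classes.
    return [n_samples / (n_classes * counts[label]) for label in labels]


# Toy, made-up label set: 2 damaging edits vs. 8 good edits.
labels = [True] * 2 + [False] * 8
weights = balanced_sample_weights(labels)

damaging_total = sum(w for w, l in zip(weights, labels) if l)
good_total = sum(w for w, l in zip(weights, labels) if not l)
print(damaging_total, good_total)
```

With scikit-learn, a list like `weights` would be passed to an estimator's fit call, e.g. `model.fit(X, y, sample_weight=weights)`; whether a given estimator accepts that parameter depends on the estimator and the installed version.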
[23:03:30] ok, I'm back now
[23:03:34] finally
[23:03:42] halfak: hey
[23:03:48] o/
[23:03:56] Amir1, I
[23:03:57] 'm
[23:04:21] I was in the middle of a meeting with the CEO of one of the biggest tech companies of Iran and also one of the founders of fa.wp
[23:04:31] Heh! Cool!
[23:04:38] he is trying to give away some resources for fa.wp
[23:04:54] I was showing him cool things we do
[23:05:18] +other cool things of wikipedia
[23:05:34] anyway. I sent some notes in the work log
[23:05:39] if you noticed
[23:05:58] halfak: I just read ORES is down, is it OK now?
[23:05:58] I did see that. :)
[23:06:02] It is.
[23:06:10] Was a labs problem that was very temporary.
[23:06:34] Had a quick chat with Petan about updating the ORES model so that its scores operate in a different range.
[23:06:55] It turns out that the thresholds are all set on the client(!!!) so he won't be able to update them in prep for a new deploy.
[23:07:06] It will be up to the users of Huggle to download a new version.
[23:07:17] That has an updated config or to manually change it.
[23:07:34] This is a bummer because I think that this new probability range makes more sense than the balanced style.
[23:08:31] is there a way to get a score for an older version of a model?
[23:23:50] Amir1, we'll need to match the revscoring version and that old model file.
[23:23:57] But we have a history of model files.
[23:24:19] Oh! Through ORES? No.
[23:25:32] ok
[23:25:59] I'm working on the extension too
[23:26:29] It seems the database schema needs to change and almost everything needs to be rewritten to match the new db schema
[23:28:46] halfak: do you think we should provide ores scores for old models too?
[23:29:03] in case of keeping backward compatibility
[23:34:01] Amir1, that's tricky for a lot of reasons.
[23:34:09] We'd need to have historical feature definitions too.
[23:34:23] We'd really want to have historical ORES deployments.
[23:34:26] I get it :)
[23:35:05] I don't know much about huggle but it just sorts based on ORES score
[23:35:07] Not impossible. I've thought about it a lot, but I haven't figured out how it could work without lots of pain/resource consumption.
[23:35:12] not using thresholds
[23:35:19] Amir1, they have colors too
[23:35:35] but maybe I'm waaay wrong, I've never used it
[23:35:35] But you're right that it wouldn't totally break.
[23:36:00] ok
[23:36:05] petan says that they have a centering and scaling strategy built into Huggle to make our predictions look like the old Huggle scores.
[23:36:26] So when we change the probabilities, we'll want to adjust that centering and maybe the scaling too.
[23:37:08] Speaking of which, I implemented centering and scaling of our features as part of the scorer_models.
[23:37:12] I'll have a PR soon.
[23:37:19] awesome
[23:37:24] what should I do now
[23:37:31] regarding the paper or anything
[23:37:47] (the extension demands some work)
[23:38:01] Amir1, I know it's kind of lame, but how about a trip through the phab board.
[23:38:21] Actually wait... I take it back.
[23:38:49] I think more writing would be good. You made a post about realtimeness. What other bits would you like to discuss about our work or past literature?
[23:39:00] Maybe our pattern of working with people on false-positives.
[23:39:40] community interaction and how we improved our models and features
[23:39:58] Yeah. That sounds good :)
[23:40:45] and FWIW we didn't only take false positives into consideration, we also took false negatives (things ores should score high but didn't) and we improved our model based on that too
[23:41:01] +1
[23:47:06] I'm gonna start writing in a bit
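[editor's note] For readers unfamiliar with the "centering and scaling of our features" mentioned at 23:37:08, here is a minimal sketch of the general technique (standardization: subtract each feature's mean and divide by its standard deviation). It is not the revscoring implementation from the pending PR; the function names `fit_scaler` and `transform` and the sample rows are assumptions made for illustration.

```python
from statistics import mean, pstdev


def fit_scaler(rows):
    """Learn a per-feature mean and standard deviation from training rows."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant feature has zero deviation; fall back to 1.0 to avoid
    # dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Center each feature on zero and scale it to unit deviation."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


# Made-up feature rows: the second feature is constant.
rows = [[1.0, 10.0], [3.0, 10.0]]
means, stds = fit_scaler(rows)
scaled = transform(rows, means, stds)
print(scaled)
```

The statistics are learned once from the training data and then reused for every observation scored later, so training and scoring see features on the same scale.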