[13:13:24] halfak: o/ https://meta.wikimedia.org/wiki/Research_talk:Revision_scoring_as_a_service/Work_log/2016-05-02#Summary
[13:13:32] That's what I've got until page 22
[13:13:39] before the conclusion
[13:27:46] o/
[13:27:50] Hey Amir1
[13:29:24] Damn. Looks like I was away for pipijov's question. I think a vector is what is called for.
[13:29:46] A numpy array will work nicely as a vector for performing operations on the entire set
[13:33:20] :)
[13:33:28] I was afk for coffee
[13:34:12] halfak: Anyway, this paper is really long. 33 pages.
[13:34:18] I've gotten tired
[13:34:28] I want to do other stuff
[13:34:34] let's start with the weekly update
[13:34:52] Any thoughts about quantifying the workload of reviewing all edits to Wikimedia Projects?
[13:35:26] I just started thinking about it
[13:35:40] two parts: 1- number of unpatrolled edits
[13:35:46] 2- number of human edits
[13:36:22] Oh yeah. So autopatrolled edits -- or edits by trusted users -- might be worth excluding.
[13:37:28] yup
[13:37:51] we actually don't show the flag in the extension when an edit is patrolled
[13:41:29] So I was thinking that we might want to give a sense of what the reviewing workload looks like and how it has changed over time. I think we'll want to have a substantial section devoted to "filter rate at high recall"
[13:41:45] To discuss how efficiency improvements are made by the scoring system.
[13:44:09] +1
[13:44:15] We can quantify the speed at which people can review an edit as damaging using what we know from Wiki labels.
[13:44:27] Essentially, you can label 50 edits in 5 minutes if you are fully engaged.
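As a back-of-the-envelope sketch of that arithmetic -- the daily edit volume and the filter rate below are made-up placeholders; only the 50-edits-in-5-minutes rate comes from the discussion above:

    # Rough reviewing-workload estimate. The edit volume and the ORES
    # filter rate are illustrative assumptions; the review rate is the
    # 50-edits-in-5-minutes figure from the discussion above.
    EDITS_PER_DAY = 500000          # hypothetical unpatrolled human edits/day
    REVIEW_RATE = 50 / (5 * 60.0)   # edits reviewed per second, fully engaged
    FILTER_RATE = 0.90              # hypothetical: ORES discards 90% at high recall

    def review_hours(n_edits):
        """Person-hours needed to review n_edits at REVIEW_RATE."""
        return n_edits / REVIEW_RATE / 3600

    print("without ORES: %d person-hours/day" % review_hours(EDITS_PER_DAY))
    print("with ORES:    %d person-hours/day"
          % review_hours(EDITS_PER_DAY * (1 - FILTER_RATE)))

The "filter rate at high recall" section would be about exactly this ratio: how much of the review queue the model can safely discard while still catching nearly all damage.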
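Separately, to flesh out the answer to pipijov's vector question from earlier (13:29): a numpy array behaves as a vector, so a single expression operates on the entire set at once. A minimal sketch with made-up values:

    import numpy as np

    # Hypothetical per-edit scores; any numeric set works the same way.
    scores = np.array([0.12, 0.87, 0.45, 0.93])

    print(scores.mean())     # 0.5925 -- one call over the whole set
    print(scores * 100)      # [ 12.  87.  45.  93.] -- vectorized arithmetic
    print(scores > 0.5)      # [False  True False  True] -- vectorized comparison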
[13:56:24] halfak: https://etherpad.wikimedia.org/p/ores_weekly_update
[13:56:59] I used referencing because sending it to mailing lists messes with links
[13:58:18] +1
[13:59:13] Revision-Scoring-As-A-Service, wikilabels: Deploy updates to wikilabels - https://phabricator.wikimedia.org/T134173#2256645 (Halfak)
[13:59:55] Revision-Scoring-As-A-Service, ORES, rsaas-editquality: Deploy updates to ORES - https://phabricator.wikimedia.org/T134174#2256660 (Halfak)
[14:00:57] halfak: I actually made this: https://phabricator.wikimedia.org/T134032
[14:01:32] Revision-Scoring-As-A-Service, wikilabels: Deploy updates to wikilabels - https://phabricator.wikimedia.org/T134173#2256675 (Halfak)
[14:01:33] Merged
[14:01:34] Revision-Scoring-As-A-Service, wikilabels: Deploy updates for Wikilabels - https://phabricator.wikimedia.org/T134032#2256677 (Halfak)
[14:01:54] thanks :)
[14:02:16] Revision-Scoring-As-A-Service, rsaas-editquality: Deploy edit quality models for wikidatawiki - https://phabricator.wikimedia.org/T130301#2256683 (Halfak)
[14:02:19] nice
[14:03:54] okay, let's publish it halfak :)
[14:04:17] Think we should point out fixing the wikidata everything-gets-flagged issue?
[14:06:53] not a development
[14:06:57] AFAIK
[14:07:24] just a quick bug, fixed ASAP
[14:09:46] Sure. Nice to report though. :)
[14:13:47] YuviPanda, trying to remember -- did you advise that we send our updates to wikitech-l each week? E.g. https://etherpad.wikimedia.org/p/ores_weekly_update
[14:13:54] yes
[14:13:59] kk thanks :)
[14:14:02] Amir1, ^
[14:14:27] sure
[14:15:08] halfak YuviPanda: Just sent
[14:15:16] \o/
[15:09:00] halfak: what do you think of this? https://github.com/wiki-ai/wikilabels/pull/117/files
[15:09:33] Oh! Was looking at this earlier. I'm a fan.
[15:09:45] I'm a little confused about the == to === when matching a string.
[15:10:10] Can't think of a situation where == doesn't match the behavior of ===, but it certainly can't hurt to use ===
[15:10:20] ^ when matching a non-empty string
[15:10:42] I don't think it's a matter of bugs
[15:10:43] Obv. "" == false
[15:10:54] it's just better for readability, etc.
[15:10:59] that's jshint
[15:11:08] Sure +1 :)
[15:11:28] awesome
[15:11:37] so I need to run a test and then we are good to go
[15:12:10] I'm a little bit afraid the "super" thing might cause trouble
[15:12:56] yeah. Not sure what the best practice is there
[15:13:26] And I'm not totally familiar with the behavior of super() vs. CLASSNAME.super.call
[15:13:38] halfak: amazingly we are supporting "en-gb" now!
[15:13:41] Although now that I look at it, when you pass the args as an array you need function.apply -- function.call takes them individually, I think
[15:13:49] lol @ that
[15:13:57] Oh! I need to update the javascript
[15:13:58] !
[15:14:02] (I was updating wikilabels and saw the i18n file)
[15:15:40] https://meta.wikimedia.org/w/index.php?title=MediaWiki:Gadget-WikiLabels.css&action=history
[15:15:46] Woops
[15:15:52] https://meta.wikimedia.org/w/index.php?title=MediaWiki%3AGadget-WikiLabels.css&type=revision&diff=15573822&oldid=15477132
[15:16:02] https://meta.wikimedia.org/w/index.php?title=MediaWiki%3AGadget-WikiLabels.js&type=revision&diff=15573820&oldid=15498850
[15:18:05] "Uncaught SyntaxError: 'super' keyword unexpected here"
[15:18:11] I need to fix it
[15:35:31] lots and lots of errors
[15:35:39] I need to check them in more depth
[15:41:34] Amir1, how come you closed the branch?
[15:41:55] o/ sabya
[15:41:56] I checked and it seems lots of errors were happening
[15:41:59] Just saw your update email
[15:42:10] Amir1, gotcha. That super switch might not have been right?
[15:42:13] I need to check it later
[15:42:16] kk
[15:42:20] halfak: o/
[15:42:25] I removed the super part
[15:42:34] sabya, the out of memory errors are troubling.
[15:42:35] and also several other errors
[15:42:46] But there should be some models that can take a sparse matrix.
[15:42:58] I think the sklearn gradient boosting model can.
[15:43:32] but in the real world, we don't need to predict a lot of revids together, right?
[15:43:43] That's right.
[15:43:50] But we will need to train the model on a big set
[15:44:12] In practice we only score one revid at a time as far as the model is concerned.
[15:44:19] that I did -- the fit method takes a sparse matrix
[15:44:28] Great
[15:44:41] So, 83% accuracy is a weird figure to think about.
[15:44:43] btw, i was using sklearn gbc only
[15:44:56] Since if you predicted false all the time, you'd get 95% accuracy
[15:45:16] I want to see ROC-AUC and PR-AUC
[15:45:59] Average precision == PR-AUC. See http://scikit-learn.org/stable/modules/generated/sklearn.metrics.average_precision_score.html
[15:46:19] will look those up.
[15:46:28] For ROC-AUC, see http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html
[15:46:40] also, in your first mail, what did you mean by hyperparameter tuning?
[15:48:58] In a general sense: https://en.wikipedia.org/wiki/Hyperparameter_optimization
[15:49:21] The most basic strategy (which is the one we employ) looks like this: http://scikit-learn.org/stable/modules/grid_search.html
[15:49:53] ok
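To ground the sparse-matrix and metrics exchange above (15:44-15:46), here is a minimal, self-contained sketch with synthetic data -- not the real revert labels or feature set: sklearn's gradient boosting classifier accepting a sparse matrix in fit(), and accuracy compared against ROC-AUC and PR-AUC on an imbalanced label set.

    import numpy as np
    from scipy.sparse import csr_matrix
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import (accuracy_score, average_precision_score,
                                 roc_auc_score)

    # Synthetic stand-in data: 20 features, ~5% positive ("damaging") labels,
    # mirroring the imbalance behind the 95%-accuracy remark above.
    rng = np.random.RandomState(0)
    X = csr_matrix(rng.rand(2000, 20))        # fit() accepts sparse matrices
    y = (rng.rand(2000) < 0.05).astype(int)

    gbc = GradientBoostingClassifier()
    gbc.fit(X[:1500], y[:1500])

    pred = gbc.predict(X[1500:])
    proba = gbc.predict_proba(X[1500:])[:, 1]

    print("accuracy:", accuracy_score(y[1500:], pred))
    print("ROC-AUC:", roc_auc_score(y[1500:], proba))
    # average precision is the PR-AUC halfak refers to
    print("PR-AUC:", average_precision_score(y[1500:], proba))

Because the synthetic labels are pure noise, accuracy still lands near 95% (the model mostly predicts the majority class) while both AUC scores hover near chance -- which is exactly why accuracy alone is "a weird figure to think about".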
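And a minimal sketch of the grid-search strategy linked above -- the parameter grid is a hypothetical example, not the project's actual tuning configuration, and the data is the same synthetic stand-in as in the previous sketch:

    import numpy as np
    from scipy.sparse import csr_matrix
    from sklearn.ensemble import GradientBoostingClassifier
    # GridSearchCV lived in sklearn.grid_search in the 0.17-era releases
    # discussed here; sklearn.model_selection in 0.18 and later.
    from sklearn.model_selection import GridSearchCV

    # Same synthetic stand-in data as the previous sketch.
    rng = np.random.RandomState(0)
    X = csr_matrix(rng.rand(2000, 20))
    y = (rng.rand(2000) < 0.05).astype(int)

    # Every combination in the (hypothetical) grid is cross-validated;
    # the best combination by ROC-AUC wins. That is hyperparameter tuning.
    search = GridSearchCV(
        GradientBoostingClassifier(),
        param_grid={"n_estimators": [100, 300],
                    "learning_rate": [0.01, 0.1],
                    "max_depth": [3, 5]},
        scoring="roc_auc",
        cv=5)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)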
[16:22:15] Revision-Scoring-As-A-Service, ORES, Patch-For-Review, Scap3 (Scap3-Adoption-Phase1): Move to using scap3 for deployment for ORES service - https://phabricator.wikimedia.org/T128670#2257094 (mmodell)
[16:52:03] Revision-Scoring-As-A-Service-Backlog, ORES: [Spike] Explore ORES handling of *in-process* scorings - https://phabricator.wikimedia.org/T134064#2257229 (Halfak)
[16:52:44] Revision-Scoring-As-A-Service-Backlog, wikilabels, JavaScript: When Chrome is preventing the pop-out window, the code is stuck! - https://phabricator.wikimedia.org/T131667#2257233 (Halfak)
[16:53:29] Revision-Scoring-As-A-Service-Backlog, revscoring, rsaas-editquality: [Spike] Explore using PR-AUC to score when tuning - https://phabricator.wikimedia.org/T133698#2257235 (Halfak)
[17:13:09] halfak: are u there?
[18:36:34] o/ Oscar_
[18:36:36] what's up?
[22:32:02] halfak: I'm making some progress on the regexes
[22:36:41] Awesome. Any evidence of a performance increase yet?
[22:43:03] yup
[22:43:06] I've got 5%
[22:43:27] I'm working on the other two poor regexes
[22:43:29] halfak: this is fun
[23:23:37] wiki-ai/revscoring#693 (regex_opt - 3f38c13 : amir): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/127367441
[23:23:42] halfak: https://github.com/wiki-ai/revscoring/pull/269/files
[23:23:47] if you are around
[23:23:51] it's 20% faster
[23:25:01] don't merge
[23:25:08] let me add another thing
[23:25:11] and test
[23:30:10] aaaand another 2% improvement
[23:30:13] \o/
[23:30:17] I'll go get some sleep
[23:30:22] o/
[23:47:23] \o.
[23:47:25] Awesome!
[23:47:34] o/ Amir1
[23:47:37] sorry was AFK.
[23:47:40] Checking out the PR
[23:52:53] {{merged}}
[23:52:53] [1] https://meta.wikimedia.org/wiki/Template:merged
[23:52:55] Awesome!
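The merged diff itself isn't quoted in the log, so as a hedged illustration only: a common source of this kind of regex speedup is replacing one-pattern-per-word scanning with a single pre-compiled alternation, something like the sketch below. The word list is made up, and whether PR #269 used exactly this technique isn't shown here.

    import re

    # Made-up word list; revscoring's real badword lists are much longer.
    words = ["butt", "butts", "poop", "poopy", "stupid"]

    # Slow: one full pass over the text per word.
    slow = [re.compile(r"\b%s\b" % re.escape(w)) for w in words]

    def match_slow(text):
        return [w for w, r in zip(words, slow) if r.search(text)]

    # Faster: one combined alternation, so the text is scanned once.
    # Longest-first ordering tries longer words before their prefixes.
    fast = re.compile(r"\b(?:%s)\b" % "|".join(
        sorted((re.escape(w) for w in words), key=len, reverse=True)))

    def match_fast(text):
        return fast.findall(text)

    print(match_slow("what a stupid, poopy edit"))  # ['poopy', 'stupid']
    print(match_fast("what a stupid, poopy edit"))  # ['stupid', 'poopy']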