[14:13:57] halfak: hey
[14:14:24] I wrote a script that gives us reverting comment for any given set of tsv-based revs
[14:14:51] I want to run it but I need our labeled data (reverted edits). I couldn't find it
[14:15:05] halfak: please send me those rev ids
[15:39:14] halfak: around?
[15:39:30] Yup. Doing my morning meetings.
[15:39:35] Eek! Just saw your messages.
[15:39:39] Sorry I missed them.
[15:39:47] it's ok :)
[15:39:54] I'm training the general model
[15:39:55] Amir1, labeled data is checked into the repo
[15:40:11] It's taking strangely big time
[15:40:18] but It'll finish soon
[15:40:22] oh, thanks
[15:40:27] Yeah. Testing takes a long time.
[15:40:39] Could be that the Random Forest is just slow to make a prediction.
[15:41:49] exactly, training was fast, testing is taking long itme
[15:41:51] *time
[15:42:05] anyway, I'm on my way to get you results ASAP
[15:42:20] I think it won't take more than an hour :)
[15:42:25] Just finishing the model for general_and_user
[15:43:12] on the bright side I learned how "Makefile" works, and how to extract features and train using revscoring
[15:43:30] :D Makefiles are awesome if not totally basic in their capabilities.
[15:43:32] It looks dummy but I wasn't very well aware of them
[15:43:49] Na. that's how tech works.
[15:43:59] 100% opaque until you pick them up.
[15:51:48] halfak: https://meta.wikimedia.org/wiki/Research_talk:Revision_scoring_as_a_service/Work_log/2016-02-09
[15:53:22] Amir1, something is weird.
[15:53:32] Oh! No it isn't
[15:53:33] My mistake.
[15:53:45] Was looking at your ~True column and thinking there were too many.
[15:54:00] But I should have been looking at the True (no ~) row.
[15:54:29] I'm still waiting on the general_and_user to finish testing.
[16:00:17] halfak: I want you to confirm, in our 500K edits we have 700 reverted edits?
[16:00:28] 698 to be exact
[16:01:37] Yes
[16:01:40] That's right
[16:20:51] halfak: what is radius and window?
[16:21:23] default is 15 and 48 hours
[16:31:21] Amir1, see Makefile
[16:31:37] oh okay
[16:39:26] https://www.irccloud.com/pastebin/9Ifg8kl5/
[16:39:30] halfak: ^
[16:39:36] self-explanatory :)
[16:41:44] afk for several min.
[16:42:04] Nice.
[16:42:07] afk here too
[16:48:04] back
[16:48:39] took me 7 min. to go to wc, wc of the whole department is closed
[16:48:55] I should find a better place to work
[16:52:39] I try to write it
[17:21:11] halfak: "However, our classifier showed out of sample 698 reverted edits due to being vandalism, 63% (439) edits were reverted using rollback, 15% (104) were reverted using restore and 22% (155) were reverted using other methods. Thus relying on rollback as a classifier to train a classifier is not feasible due to high error and low recall."
[17:21:53] halfak: Also I'm done with analysing this and training and testing the general method, I'm done with work now, give me something else to do :)
[17:22:02] Amir1, looks great.
[17:22:26] I'd call into question any classifier that is tested for its ability to detect edits that would be reverted using rollback.
[17:22:41] Such a dataset would include a large set of positive observations that were labeled negatively.
[17:23:23] Just got the results of "general_and_user". Looks like our fitness stats are basically the same for that as "all"
[17:23:40] Amir1, have you started on "general_context_and_type"?
[17:23:48] halfak: no
[17:23:59] OK. I'll do that.
[17:24:04] awesome
[17:24:09] what else I need to do
[17:24:20] Hmm... Maybe you should do that :)
[17:24:22] actually I don't get the question :D
[17:24:49] I meant "I'd call into question any classifier..."
[17:26:03] Oh. Just that they might be able to train a useful classifier on rollback'd edits, but I don't see how they would be able to use that to test their classifier fairly.
[17:26:21] They'll have a lot of vandalism edits reverted using Undo and other methods.
[17:26:38] Their classifier may be totally unable to catch those edits, or it may be great at it.
[17:27:03] If their classifier is good at detecting vandalism generally, their fitness metrics will be punished by a set of false-false-positives!
[17:27:32] I think I see
[17:27:53] If their classifier is no food at detecting vandalism generally -- just the vandalism that tends to be reverted with "rollback", then the fitness metrics will look good, but the classifier is less useful.
[17:28:08] *food->good
[17:28:11] * halfak needs lunch
[17:28:21] Yay! And I have time to make it!
[17:28:57] Amir1, if you get the model built for general_context_and_type, I'll start working on plotting the curves.
[17:29:08] You could spend the time writing more about rollback too.
[17:29:14] ok, Do I need to write these into the draft?
[17:29:26] I do the general_context_and_type
[17:31:47] Amir1, if you focus on general_context_and_type, we'll get the draft squared away later.
[17:31:52] Kicked the feature extractor
[17:31:52] Let's get this analysis done first :)
[17:31:56] Woot!
[17:33:05] legoktm: hey, I hope you're around
[17:34:25] okay halfak, I try to write your notes into the draft
[17:34:35] Heh. It's probably due to the randomness of the test set, but it looks like we actually get a *better* model with general_and_user than with "all"
[17:34:52] * Amir1 needs another coffee
[17:35:11] let me check
[17:35:17] user is highly predictive
[17:35:29] yeah.
[17:35:29] maybe we accidentally deleted user
[17:36:28] I tested everything
[17:36:40] https://gist.github.com/halfak/d840ba961bc39bc323d1
[17:36:45] Nope. User features are there.
[17:36:59] (They are at the very end of the feature list)
[17:37:38] OK. Time for lunch. Back in a bit.
[18:25:57] halAFK: If a classifier would be trained using solely rollbacks two scenarios are possible. First, the classifier can predict all cases of vandalism regardless of reverting method.
In that case the tester is punished by mistakenly classifying edits that are reverted using other methods as "good edit". In the second scenario, the classifier is good at
[18:25:57] classifying edits that are reverted using rollback but not other methods. In that case the classifier is useless since significant proportion of vandalism edits would be misclassified. Thus, building a classifier solely based on rollback edits is problematic.
[18:26:06] I got to go
[18:26:27] I'll be back soon-ish
[18:31:43] Amir1, text looks awesome.
[18:44:25] I hate it, I think it's too informal
[18:44:28] :D
[19:01:06] Informal is OK
[19:09:45] halfak: feature extraction is 20% completed :)
[19:09:58] Great!
[19:10:20] I'm working on a new strategy for splitting the training and test set, so we might have to re-do a bit, but it won't be much.
[19:10:34] cool
[19:17:26] * halfak works on pull request for updated train/test
[23:13:05] (PS1) Reedy: Fix spaces to tabs [extensions/ORES] - https://gerrit.wikimedia.org/r/269552
[23:14:12] (CR) jenkins-bot: [V: -1] Fix spaces to tabs [extensions/ORES] - https://gerrit.wikimedia.org/r/269552 (owner: Reedy)
[23:17:28] (PS1) Reedy: pngcrush pngs [extensions/ORES] - https://gerrit.wikimedia.org/r/269554
[23:18:13] (CR) jenkins-bot: [V: -1] pngcrush pngs [extensions/ORES] - https://gerrit.wikimedia.org/r/269554 (owner: Reedy)
[23:19:24] (PS1) Reedy: Add \ to global classes/functions [extensions/ORES] - https://gerrit.wikimedia.org/r/269555
[23:19:43] (CR) Hoo man: "You probably want to batch by rc_id."
[extensions/ORES] - https://gerrit.wikimedia.org/r/268874 (https://phabricator.wikimedia.org/T123795) (owner: Ladsgroup)
[23:21:09] (CR) jenkins-bot: [V: -1] Add \ to global classes/functions [extensions/ORES] - https://gerrit.wikimedia.org/r/269555 (owner: Reedy)
[23:24:27] (PS2) Reedy: Add \ to global classes/functions [extensions/ORES] - https://gerrit.wikimedia.org/r/269555
[23:25:23] (CR) jenkins-bot: [V: -1] Add \ to global classes/functions [extensions/ORES] - https://gerrit.wikimedia.org/r/269555 (owner: Reedy)
[23:26:44] (PS3) Reedy: Add \ to global classes/functions [extensions/ORES] - https://gerrit.wikimedia.org/r/269555
[23:27:54] (CR) jenkins-bot: [V: -1] Add \ to global classes/functions [extensions/ORES] - https://gerrit.wikimedia.org/r/269555 (owner: Reedy)
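
Note on the rollback discussion (17:21 to 18:25): the two scenarios described in the log can be worked through with the counts quoted there (439 rollback, 104 restore, 155 other, 698 in total). The sketch below is only an illustration, not code from the revscoring repo. The function names and the detection rates passed in are hypothetical, and it assumes good edits are never flagged so that only the labeling effect shows up.

```python
# Revert-method counts quoted in the log (698 vandalism reverts in total).
REVERT_COUNTS = {"rollback": 439, "restore": 104, "other": 155}


def shares(counts):
    """Return each revert method's share of all vandalism reverts."""
    total = sum(counts.values())
    return {method: n / total for method, n in counts.items()}


def rollback_only_metrics(counts, recall_rollback, recall_other):
    """Compare measured and true fitness when only rollbacks are labeled positive.

    recall_rollback and recall_other are HYPOTHETICAL detection rates for
    vandalism reverted by rollback and by any other method. Good edits are
    assumed never to be flagged, which isolates the labeling effect.
    """
    rollback = counts["rollback"]
    non_rollback = sum(counts.values()) - rollback
    hits_rollback = recall_rollback * rollback  # flagged and labeled positive
    hits_other = recall_other * non_rollback    # flagged but labeled negative
    flagged = hits_rollback + hits_other
    if flagged == 0:
        raise ValueError("at least one detection rate must be nonzero")
    return {
        # Vandalism caught outside rollback is scored as a false positive.
        "measured_precision": hits_rollback / flagged,
        "measured_recall": hits_rollback / rollback,
        "true_recall": flagged / (rollback + non_rollback),
    }


if __name__ == "__main__":
    print({m: round(100 * s) for m, s in shares(REVERT_COUNTS).items()})
    # Scenario 1: catches all vandalism, so measured precision is dragged down.
    print(rollback_only_metrics(REVERT_COUNTS, 1.0, 1.0))
    # Scenario 2: catches only rollback-style vandalism, so metrics look perfect.
    print(rollback_only_metrics(REVERT_COUNTS, 1.0, 0.0))
```

Under these assumptions, a classifier that catches every vandalism edit is scored at a precision of 439/698, while one that only catches rollback-style vandalism is scored as perfect even though it misses 259 of the 698 edits.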