[14:42:59] I just finished training the trwiki models. It looks like we're getting ~90 AUC for all three: reverted, damaging & goodfaith.
[23:23:40] Wooo! We get ~.90 AUC on dewiki's "reverted" model
[23:24:05] I'm rebuilding all the models we have language features for. That means I'm doing models for 13 languages.
[23:24:21] Some languages have several models, so it's a total of 23 models.
[23:24:44] I should probably pick up some of the cards that awight|afk made for me too.
[23:24:48] * halfak looks into that
[23:34:48] neat
[23:34:54] germans are disciplined
[23:35:02] so it kind of isn't that surprising
[23:35:24] I am hoping for your feedback for tr.wiki
[23:35:38] I am curious how much of an improvement handcoding brings compared to the revdel
[23:42:20] Oh! I have results for that. I posted them in here yesterday.
[23:42:30] I just finished training the trwiki models. It looks like we're getting ~90 AUC for all three: reverted, damaging & goodfaith.
[23:42:32] ^
[23:42:51] wiki-ai/revscoring#312 (travis_shuddup - 1af675f : halfak): The build has errored. https://travis-ci.org/wiki-ai/revscoring/builds/90209179
[23:44:40] So we didn't see an AUC improvement, but that doesn't mean we won't see better performance in trwiki. :)
[23:47:53] without language features it's impressive
[23:48:06] Oh. It has language features.
[23:48:12] Just no dictionary.
[23:48:25] or stemmer
[23:48:27] or bad words
[23:48:29] So no misspellings. I'm not sure that helps so much since proper nouns are "misspelled"
[23:48:34] It has badwords.
[23:48:41] bad words are mostly useless
[23:48:45] We don't use the stemmer for badword detection anymore.
[23:48:48] because they aren't truly bad words
[23:48:57] Well apparently they are useful since we went from .76 AUC to .90
[23:49:19] the problem is the same can be achieved through a more general rule
[23:49:25] * .67 AUC to .90 AUC
[23:49:31] I don't expect cursewords to make a difference
[23:49:41] since tr.wiki heavily filters them