[09:44:58] Revision-Scoring-As-A-Service, Edit-Review-Improvements, rsaas-editquality, Collab-Team-Q1-July-Sep-2016: Research how to present ORES scores to users in a way that is understandable and meets their reviewing goals - https://phabricator.wikimedia.org/T146333#2679461 (Trizek-WMF) >>! In T146333#26...
[11:28:04] Revision-Scoring-As-A-Service, MediaWiki-extensions-ORES: hidenondamaging=1 on Special:Contributions fails to filter out Flow contributions - https://phabricator.wikimedia.org/T146851#2679585 (Ladsgroup)
[11:28:55] Revision-Scoring-As-A-Service, MediaWiki-extensions-ORES: hidenondamaging=1 on Special:Contributions fails to filter out Flow contributions - https://phabricator.wikimedia.org/T146851#2672810 (Ladsgroup) Does your Gerrit change fix this?
[13:38:38] (CR) Ladsgroup: [C: 2] Refactor and simplify changeslist/contribs queries a bit [extensions/ORES] - https://gerrit.wikimedia.org/r/311652 (owner: Catrope)
[13:39:27] (Merged) jenkins-bot: Refactor and simplify changeslist/contribs queries a bit [extensions/ORES] - https://gerrit.wikimedia.org/r/311652 (owner: Catrope)
[13:41:22] (CR) Ladsgroup: [C: 2] Use aliases with "damaging" in them so we can add other ones (e.g. "goodfaith") [extensions/ORES] - https://gerrit.wikimedia.org/r/312164 (owner: Catrope)
[13:44:23] (Merged) jenkins-bot: Use aliases with "damaging" in them so we can add other ones (e.g. "goodfaith") [extensions/ORES] - https://gerrit.wikimedia.org/r/312164 (owner: Catrope)
[13:45:18] (CR) Ladsgroup: Only pull in damaging scores when damaging model is enabled (2 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/312165 (owner: Catrope)
[18:39:12] halfak: BTW, https://phabricator.wikimedia.org/T146972#2679473
[18:40:21] (CR) Catrope: Only pull in damaging scores when damaging model is enabled (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/312165 (owner: Catrope)
[18:45:36] Gotcha. Thanks, RoanKattouw
[19:16:59] I'm trying to install the Wikilabels dev server locally and I'm having trouble.
[19:17:20] The README says I have to run "wikilabels dev_server --config config-localdev.yaml".
[19:17:40] But 1) there is no file named wikilabels, and 2) there is no file named config-localdev.yaml.
[19:18:09] I found that maybe I should run "./utility dev_server", since it accepts that as a parameter, but that then throws ImportError: No module named 'docopt'.
[19:18:19] Running ./setup.py also errors:
[19:19:09] ./setup.py: line 6: syntax error near unexpected token `('
[19:19:13] ./setup.py: line 6: `def read(fname):'
[22:18:14] Revision-Scoring-As-A-Service, rsaas-editquality: Implement new json-lines pattern in editquality - https://phabricator.wikimedia.org/T146410#2681246 (Halfak) Process is on wikidatawiki! Almost there! I'm really hoping that this will be settled and a pull request can be submitted tomorrow.
[22:22:33] \o/ We got a massive performance boost by training the viwiki reverted model on 100k observations instead of just 20k.
[22:22:34] \o/
[22:24:58] ROC-AUC went from 92.9 to 95.9.
[22:25:23] PR-AUC went from a dismal 9.3 to 45.9!
[22:25:57] This is suddenly one of our best models!
[22:30:06] PR-AUC for enwiktionary is 0.739!
[22:30:08] Woah
[22:30:43] I didn't realize this was such a good model.
[22:32:24] Oh!
We re-scaled the positive examples for enwiktionary
[22:32:28] That's why.
[22:33:32] I think we should scale up the number of observations for idwiki too.
[22:34:11] Revision-Scoring-As-A-Service-Backlog, rsaas-editquality: Scale up the number of observations for idwiki to 100k - https://phabricator.wikimedia.org/T147107#2681280 (Halfak)
[22:34:26] ^ BAM
[22:36:14] Wow... We can auto-revert 30% of vandalism in nlwiki if we're OK with a 6% false-positive rate.
[22:37:20] Woah... We can't be this good in plwiki.
[22:37:36] We get a PR-AUC of 92.5%.
[22:37:55] We can auto-revert 86.6% of damage with an 8% false-positive rate.
[22:37:58] Hmmm....
[22:38:50] Can this be true? It doesn't make sense.
[22:40:00] o/ TarLocesilion
[22:40:16] what's up, halfak? :)
[22:40:27] My metrics say that the plwiki damage detection models are phenomenal. What's your experience with the predictions?
[22:44:25] Our record for predicting reverted edits is pretty bad, but for predicting damage, it seems like we're almost perfect.
[22:44:31] That's hard to believe.
[22:44:35] So I wanted to check up.
[22:45:00] honestly, I patrol RC quite rarely, too rarely to have an opinion. have you published a list or a table with such data?
[22:45:44] predicting damage, you say
[22:46:24] TarLocesilion, the quickest test I could imagine is enabling the ORES beta features and scanning RecentChanges for what gets flagged.
[22:49:54] sounds OK, if that randomness could be reliable :) in Poland, it's gonna be 1 a.m., but "tomorrow" I can ask patrollers about their feelings.
[22:50:52] anyway, I've experienced many false positives, that's what I can say for certain.
[23:02:28] TarLocesilion, ok. Any sort of meaningful rate of false positives isn't captured in my metrics. But if it seems OK, then I don't feel a strong sense of urgency.
[23:02:50] If you could ask for some feedback from patrollers and point me to it, I'd really appreciate that.
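Statements above like "auto-revert 86.6% of damage with an 8% false-positive rate" describe single operating points on a model's ROC curve: pick a score threshold, flag everything above it, and you trade the fraction of damage caught against the fraction of good edits wrongly flagged. A minimal sketch of that calculation, using made-up scores and labels rather than real ORES output:

```python
def operating_point(scores, labels, threshold):
    """Return (tpr, fpr) for flagging every edit that scores >= threshold.

    tpr: fraction of damaging edits caught (e.g. "auto-revert 30% of vandalism")
    fpr: fraction of good edits wrongly flagged (e.g. "a 6% false-positive rate")
    """
    flagged = [s >= threshold for s in scores]
    positives = sum(labels)
    negatives = len(labels) - positives
    tp = sum(1 for f, y in zip(flagged, labels) if f and y)
    fp = sum(1 for f, y in zip(flagged, labels) if f and not y)
    return tp / positives, fp / negatives

# Toy data: label 1 = damaging edit, 0 = good edit; scores from a
# hypothetical damage model.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.2, 0.1, 0.1]

tpr, fpr = operating_point(scores, labels, threshold=0.5)
print(f"catch {tpr:.0%} of damage at a {fpr:.0%} false-positive rate")
# -> catch 75% of damage at a 17% false-positive rate
```

Raising the threshold slides along the curve toward fewer false positives but less damage caught; summary numbers like ROC-AUC and PR-AUC integrate over all such thresholds rather than describing any one of them.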