[09:44:58] Revision-Scoring-As-A-Service, Edit-Review-Improvements, rsaas-editquality, Collab-Team-Q1-July-Sep-2016: Research how to present ORES scores to users in a way that is understandable and meets their reviewing goals - https://phabricator.wikimedia.org/T146333#2679461 (Trizek-WMF) >>! In T146333#26...
[11:28:04] Revision-Scoring-As-A-Service, MediaWiki-extensions-ORES: hidenondamaging=1 on Special:Contributions fails to filter out Flow contributions - https://phabricator.wikimedia.org/T146851#2679585 (Ladsgroup)
[11:28:55] Revision-Scoring-As-A-Service, MediaWiki-extensions-ORES: hidenondamaging=1 on Special:Contributions fails to filter out Flow contributions - https://phabricator.wikimedia.org/T146851#2672810 (Ladsgroup) Does your Gerrit change fix this?
[13:38:38] (CR) Ladsgroup: [C: 2] Refactor and simplify changeslist/contribs queries a bit [extensions/ORES] - https://gerrit.wikimedia.org/r/311652 (owner: Catrope)
[13:39:27] (Merged) jenkins-bot: Refactor and simplify changeslist/contribs queries a bit [extensions/ORES] - https://gerrit.wikimedia.org/r/311652 (owner: Catrope)
[13:41:22] (CR) Ladsgroup: [C: 2] Use aliases with "damaging" in them so we can add other ones (e.g. "goodfaith") [extensions/ORES] - https://gerrit.wikimedia.org/r/312164 (owner: Catrope)
[13:44:23] (Merged) jenkins-bot: Use aliases with "damaging" in them so we can add other ones (e.g. "goodfaith") [extensions/ORES] - https://gerrit.wikimedia.org/r/312164 (owner: Catrope)
[13:45:18] (CR) Ladsgroup: Only pull in damaging scores when damaging model is enabled (2 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/312165 (owner: Catrope)
[18:39:12] halfak: BTW, https://phabricator.wikimedia.org/T146972#2679473
[18:40:21] (CR) Catrope: Only pull in damaging scores when damaging model is enabled (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/312165 (owner: Catrope)
[18:45:36] Gotcha. Thanks, RoanKattouw
[19:16:59] I'm trying to install the Wikilabels dev server locally and I'm having trouble.
[19:17:20] The README says I have to run "wikilabels dev_server --config config-localdev.yaml".
[19:17:40] But 1) there is no file named wikilabels, and 2) there is no file named config-localdev.yaml.
[19:18:09] I found that maybe I should run "./utility dev_server", since it accepts that as a parameter, but that then throws ImportError: No module named 'docopt'.
[19:18:19] Running ./setup.py also errors:
[19:19:09] ./setup.py: line 6: syntax error near unexpected token `('
[19:19:13] ./setup.py: line 6: `def read(fname):'
[22:18:14] Revision-Scoring-As-A-Service, rsaas-editquality: Implement new json-lines pattern in editquality - https://phabricator.wikimedia.org/T146410#2681246 (Halfak) Process is on wikidatawiki! Almost there! I'm really hoping that this will be settled and a pull request can be submitted tomorrow.
[22:22:33] \o/ We got a massive performance boost by training the viwiki reverted model on 100k observations instead of just 20k.
[22:22:34] \o/
[22:24:58] ROC-AUC went from 92.9 to 95.9.
[22:25:23] PR-AUC went from a dismal 9.3 to 45.9!
[22:25:57] This is suddenly one of our best models!
[22:30:06] PR-AUC for enwiktionary is 0.739!
[22:30:08] Woah
[22:30:43] I didn't realize this was such a good model.
[22:32:24] Oh!
We re-scaled the positive examples for enwiktionary
[22:32:28] That's why.
[22:33:32] I think we should scale up the number of observations for idwiki too.
[22:34:11] Revision-Scoring-As-A-Service-Backlog, rsaas-editquality: Scale up the number of observations for idwiki to 100k - https://phabricator.wikimedia.org/T147107#2681280 (Halfak)
[22:34:26] ^ BAM
[22:36:14] Wow... We can auto-revert 30% of vandalism in nlwiki if we're OK with a 6% false-positive rate.
[22:37:20] Woah... We can't be this good in plwiki.
[22:37:36] We get a PR-AUC of 92.5%.
[22:37:55] We can auto-revert 86.6% of damage with an 8% false-positive rate.
[22:37:58] Hmmm....
[22:38:50] Can this be true? It doesn't make sense.
[22:40:00] o/ TarLocesilion
[22:40:16] what's up, halfak? :)
[22:40:27] My metrics say that the plwiki damage detection models are phenomenal. What's your experience with the predictions?
[22:44:25] Our record for predicting reverted edits is pretty bad, but for predicting damage, it seems like we're almost perfect.
[22:44:31] That's hard to believe.
[22:44:35] So I wanted to check up.
[22:45:00] honestly, I patrol RC quite rarely, too rarely to have an opinion. have you published a list or a table with such data?
[22:45:44] predicting damage, you say
[22:46:24] TarLocesilion, the quickest test I could imagine is enabling the ORES beta features and scanning RecentChanges for what gets flagged.
[22:49:54] sounds OK, if that randomness could be reliable :) in Poland, it's gonna be 1 a.m., but "tomorrow" I can ask patrollers about their feelings.
[22:50:52] anyway, I've experienced many false positives, that's what I can say for certain.
[23:02:28] TarLocesilion, ok. Any sort of meaningful rate of false positives isn't captured in my metrics. But if it seems OK, then I don't feel a strong sense of urgency.
[23:02:50] If you could ask for some feedback from patrollers and point me to it, I'd really appreciate that.
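Statements above like "auto-revert 86.6% of damage with an 8% false-positive rate" describe single operating points on a model's ROC curve: pick a score threshold, flag everything above it, and you trade the fraction of damage caught against the fraction of good edits wrongly flagged. A minimal sketch of that calculation, using made-up scores and labels rather than real ORES output:

```python
def operating_point(scores, labels, threshold):
    """Return (tpr, fpr) for flagging every edit that scores >= threshold.

    tpr: fraction of damaging edits caught (e.g. "auto-revert 30% of vandalism")
    fpr: fraction of good edits wrongly flagged (e.g. "a 6% false-positive rate")
    """
    flagged = [s >= threshold for s in scores]
    positives = sum(labels)
    negatives = len(labels) - positives
    tp = sum(1 for f, y in zip(flagged, labels) if f and y)
    fp = sum(1 for f, y in zip(flagged, labels) if f and not y)
    return tp / positives, fp / negatives

# Toy data: label 1 = damaging edit, 0 = good edit; scores from a
# hypothetical damage model.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.2, 0.1, 0.1]

tpr, fpr = operating_point(scores, labels, threshold=0.5)
print(f"catch {tpr:.0%} of damage at a {fpr:.0%} false-positive rate")
# -> catch 75% of damage at a 17% false-positive rate
```

Raising the threshold slides along the curve toward fewer false positives but less damage caught; summary numbers like ROC-AUC and PR-AUC integrate over all such thresholds rather than describing any one of them.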