[00:07:03] Amir1: halfak: Hi! I was wondering if the datasets for edit quality scoring were available? I'm a PhD student interested in deep learning, and I was wondering if I could help develop some models and contribute in some way! :)
[00:08:10] hi moosd, I can help with that
[00:08:35] I'm also a PhD student, and I'm doing a project about ORES.
[00:08:44] Are you looking for the training data?
[00:09:52] you can get it by following the Makefile in the repo
[00:09:53] https://github.com/wikimedia/editquality
[00:10:03] groceryheist: lovely to meet you! And yep, any already-collected training data would be great. I found a repo on GitHub that said it was deprecated, so I thought I'd ask on here!
[00:10:43] I wrote a script to get the labels
[00:10:45] https://github.com/groceryheist/editquality/blob/master/bias_analysis/get_labels.py
[00:10:51] not all the features, though
[00:12:23] but maybe you can extend my script to also build the features following the Makefile
[00:13:40] what's the repo you found that seems deprecated?
[00:14:38] oh awesome, thank you! I'll have a play around with that! I see the models are there too, to compare against!
[00:15:05] yeah, totally
[00:15:08] this one - https://github.com/wiki-ai/wb-vandalism
[00:15:20] oh, I haven't seen that
[00:17:07] glad I could help! And welcome :)
[03:18:43] moosd: by the way, my script only gets the human-labeled edits
[22:53:24] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [extensions/ORES] - https://gerrit.wikimedia.org/r/509715 (owner: L10n-bot)
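The labeled training data discussed above is produced by the editquality repo's Makefile as JSON-lines files of human-labeled revisions. A minimal sketch of loading such a file, purely for illustration: the field names (`rev_id`, `damaging`, `goodfaith`) are assumptions about the repo's label format, not confirmed in the conversation, and the inline sample stands in for a real downloaded file.

```python
import json
from io import StringIO

# Hypothetical sample standing in for a labeled-revisions JSON-lines file;
# the field names are assumed, not taken from the actual editquality output.
sample = StringIO("\n".join([
    json.dumps({"rev_id": 123, "damaging": False, "goodfaith": True}),
    json.dumps({"rev_id": 456, "damaging": True, "goodfaith": False}),
]))

# Parse one labeled revision per line, skipping blank lines.
labels = [json.loads(line) for line in sample if line.strip()]

# Collect the revision IDs labeled as damaging.
damaging_ids = [r["rev_id"] for r in labels if r["damaging"]]
print(damaging_ids)
```

As noted in the chat, files like this carry only the labels; the feature vectors would still need to be built separately by following the Makefile targets.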