[19:46:21] ragesoss, hey dude. Around?
[19:46:35] halfak: howdy
[19:46:44] I'm looking for a phab task for that work you were doing with an intern (?) to implement recommendations for new article drafts using the wp10 model.
[19:46:58] Also, I'ma talk about you in a research paper :)
[19:47:19] I'll find it...
[19:47:20] Re-purposing and re-interpreting the wp10 model :)
[19:48:00] You're a good example of someone looking at ORES differently than the engineers who built it.
[19:48:11] https://phabricator.wikimedia.org/T160840
[19:48:28] https://phabricator.wikimedia.org/T164627
[19:49:04] Thanks!
[19:49:25] Are you using this BTW?
[19:49:43] that project never really got into significant work with ORES. we ended with the framework for it, and some fairly trivial recommendations.
[19:49:57] Gotcha. Understood.
[19:50:03] it's in use, but not yet really useful.
[19:50:04] "in progress" still :)
[19:50:35] ragesoss, can I try it out?
[19:50:48] yeah. I'm optimistic that this summer will see some major progress, while we do the related 'article finder' project
[19:52:49] ragesoss, any way I could try out what you have?
[19:52:54] Or maybe just see it in action?
[19:53:33] halfak: go here and sign in: https://dashboard-testing.wikiedu.org/courses/test/ORES_playground/home
[19:54:08] Am I an instructor?
[19:55:06] no, you're a student
[19:55:32] cool
[19:55:34] * halfak continues
[19:55:37] you can assign yourself an article, then visit https://dashboard-testing.wikiedu.org/courses/test/ORES_playground/manual_update
[19:55:57] (or assign yourself several)
[19:56:51] * halfak assigns himself Anachronism
[19:57:49] Whoa. The pageviews for this article just went through the roof!
[19:57:52] https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&range=latest-90&pages=Anachronism
[19:58:12] It's not clear to me how I get recommendations
[19:58:13] whoa
[19:58:46] that's because there probably are not any.
[19:59:13] Oh! I see. Let me try something else.
[19:59:17] clicking 'Feedback' would show them.
[19:59:30] but typically, you'll only see that if it's really undeveloped.
[19:59:53] rather, you'll only see ORES-derived suggestions if it's really undeveloped.
[20:01:42] https://en.wikipedia.org/wiki/User:EpochFail/sandbox
[20:01:46] * halfak looks for suggestions
[20:03:33] I expect to see "Add citations" or "Add a header" somewhere...
[20:06:23] halfak: does ORES work in userspace?
[20:06:27] I added a stub
[20:06:40] ORES will score whatever you like :)
[20:06:42] check out Charles Berkeley for a working example
[20:07:28] http://ores.wmflabs.org/v3/scores/enwiki/804138720/wp10
[20:07:36] Your user page is a B class :)
[20:07:43] noice
[20:07:57] Actually probably a C given the distribution. The model is very confused.
[20:08:21] the dashboard doesn't really have a concept of assignments outside of mainspace
[20:08:52] Ahh. Gotcha.
[20:09:10] it will, eventually.
[20:09:36] Aha!
[20:09:37] hopefully in FY 2018-19. we're planning to hire a dev at last!
[20:09:38] I see it.
[20:10:02] https://github.com/WikiEducationFoundation/WikiEduDashboard/blob/master/lib/revision_feedback_service.rb
[20:10:32] as you see, we've done *very* little with it so far.
[20:11:00] Gotcha. Still very interesting from a theoretical level :)
[20:11:14] Your system interrogates the model to help humans replicate what the model has learned.
[20:11:36] I was hoping to get further with it during Outreachy, but that project went somewhat slower than I expected.
[20:11:55] SOFTWARE!
[20:12:17] i know right. between computers and people, i tell ya.
[20:12:30] haha :)
[20:19:55] One thing I was curious about, and maybe this is more a question for nettrom: would stock text analysis tools like reading level scores be useful things to add to the wp10 model? I ask because I was looking at one of the research papers for the signpost report that does article quality prediction, allegedly better than previous results...
[20:20:34] and that was one of the 'manually engineered' features they used
[20:20:58] and it seems like it would be relatively easy to implement, as I'm sure there are libraries for calculating such things.
[20:33:02] ...libraries which usually have wildly inaccurate vocabularies
[20:38:10] Nemo_bis, ragesoss, usually reading level comes from word/sentence length.
[20:38:15] So vocab isn't involved.
[20:38:32] https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests
[20:38:41] Whoops. Looks like there's a syllable count too.
[20:39:23] But to answer your question ragesoss, we could implement that for sure. We have ways to sentence-ify the article in ORES.
[20:39:37] And it works pretty well for most Latin-ish languages.
[20:39:49] We can implement other sentence delimiters for use elsewhere.
[20:48:13] I prefer https://en.wikipedia.org/wiki/Dale%E2%80%93Chall_readability_formula
[20:48:34] and similar
[20:49:27] need to update https://meta.wikimedia.org/wiki/Writing_clearly#English , bitrotting
[21:18:21] Hmm. Nemo_bis, we could probably just assume that the 3000 most common words are familiar.
[21:18:40] We could essentially process a set of articles to arrive at a good enough vocabulary for any language where we can delimit words.
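
The idea in the last two messages (treat the most common words in a set of articles as "familiar", then score text against that list) can be sketched roughly as below. This is only an illustration under stated assumptions, not ORES or revscoring code: the function names, the regex tokenizer, and the punctuation-based sentence splitter are all hypothetical stand-ins. The constants are those of the published Dale-Chall formula, which was calibrated against a curated English word list, so scores computed from a frequency-derived vocabulary would not be directly comparable to standard Dale-Chall grades.

```python
import re
from collections import Counter

# Hypothetical tokenizer: runs of Unicode letters, optionally with one
# internal apostrophe (e.g. "it's"). Real use would need per-language rules.
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")
# Hypothetical sentence delimiter for "Latin-ish" punctuation only.
SENTENCE_END_RE = re.compile(r"[.!?]+")


def tokenize(text):
    """Return the lowercased words found in text."""
    return [w.lower() for w in WORD_RE.findall(text)]


def build_familiar_vocabulary(articles, size=3000):
    """Assume the `size` most frequent words across the articles are familiar."""
    counts = Counter()
    for text in articles:
        counts.update(tokenize(text))
    return {word for word, _ in counts.most_common(size)}


def count_sentences(text):
    """Count non-empty chunks between sentence-ending punctuation (at least 1)."""
    parts = [p for p in SENTENCE_END_RE.split(text) if p.strip()]
    return max(1, len(parts))


def dale_chall_style_score(text, familiar):
    """Dale-Chall formula applied with a frequency-derived vocabulary."""
    words = tokenize(text)
    if not words:
        return 0.0
    difficult = sum(1 for w in words if w not in familiar)
    pct_difficult = 100.0 * difficult / len(words)
    avg_sentence_len = len(words) / count_sentences(text)
    score = 0.1579 * pct_difficult + 0.0496 * avg_sentence_len
    # The published formula adds a constant once more than 5% of the
    # words fall outside the familiar list.
    if pct_difficult > 5.0:
        score += 3.6365
    return score
```

Usage would look like `familiar = build_familiar_vocabulary(list_of_article_texts)` followed by `dale_chall_style_score(revision_text, familiar)`. For languages that do not delimit words with spaces or sentences with `.`, `!`, `?`, both the tokenizer and the splitter would have to be swapped out.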