[15:51:50] o/ J-Mo
[15:52:00] howdy halfak
[15:52:07] so, non-engineering support...
[15:52:09] ;)
[15:52:29] I just replied there, but we should continue here to be faster :D
[15:52:36] "Yes! I'd like to get this type of system well documented before-hand. We might even start imagining where we'd put a proof of concept refutation UI."
[15:52:52] DarTar told me that my help might be appreciated at some point this/next quarter working on the WikiLabels productization. Seems like I could get my feet wet with this.
[15:52:56] Once we have figured out where a UI should live, we could start imagining functionality.
[15:53:31] sounds good to me. I can start putting together some notes and tentative requirements that we can talk about. Probably have something by mid-next-week
[15:53:51] Re. Wikilabels, I've been thinking about this: Maybe labels saved in Wikilabels are "refutations"
[15:54:01] yeah… similar systems ;)
[15:54:16] In that case "refutation" is the wrong word.
[15:54:19] might make sense to take an all in one approach
[15:54:20] And "annotation" is better.
[15:54:28] At least on the back-end
[15:54:28] it's the same activity, in many ways
[15:54:41] "characterize an edit"
[15:55:01] Yeah... We'd have some interesting dimensions available. E.g. key-value pairs.
[15:55:13] not sure I follow
[15:55:26] damaging-(true/false), goodfaith-(true/false), wp10-(Stub/Start/C/B/GA/FA)
[15:55:39] And a freeform text field for an explanation
[15:55:44] yes
[15:56:16] Somehow the base system should be able to recommend potential values for an annotation
[15:56:35] Or "label"
[15:56:45] for the initial labeling task, or the "refutation" task? or both?
[15:56:55] Both
[15:57:07] Right now, we capture potential values in a WikiLabels "form"
[15:57:14] doesn't incorporating machine recommendations into training data bias the training data?
[15:57:25] Sorry, not that type of recommendation
[15:57:53] We recommend that you provide a 'true/false' value to "damaging" when labeling
[15:58:20] isn't it a forced choice UI?
[15:58:27] * J-Mo has not used WikiLabels really
[15:58:36] Yeah. Currently. Refutations will need to be a forced-choice UI too
[15:58:40] yes
[15:58:50] But probably can't just use the "form" from wikilabels
[15:58:56] So we'll need something else.
[15:58:59] you can't just say "this isn't FOO"
[15:59:15] Yeah. You should say what it *is* if it isn't FOO
[15:59:20] yeah
[15:59:36] E.g. if it isn't "damaging", it is "not damaging"
[15:59:52] But if this article isn't a "stub", it is a "B" class.
[15:59:53] does it go back into a labeling (training) queue if it is disputed?
[16:00:08] "disputed labels"
[16:00:10] Right. We'll need review and suppression for any text field anyway
[16:00:24] so might as well have the meta-reviewers attempt to re-classify
[16:00:48] Essentially, all classifications without an annotation are "unpatrolled"
[16:01:25] and the results of patrolling an edit should be captured as an annotation that can be refuted.
[16:01:53] E.g. if ORES says "damaging" and a patroller decides that it's not, there should be an inexpensive refutation recorded
[16:02:04] Probably with a blank summary
[16:02:17] ad nauseam? can a refutation be re-refuted?
[16:02:33] Yeah, but they'll just pile up on a rev_id -- not refute each other directly.
[16:03:06] hmmm. how to prevent refutation trolling?
[16:03:14] Same meta-review
[16:03:25] And deal with coordinated attacks ad-hoc.
[16:03:41] will WikiLabels always be a gadget/extension?
[16:03:44] * halfak imagines training a meta-model on refutation comments.
[16:04:11] J-Mo, Catrope suggested we consider moving it to a pure labs-based UI.
[16:04:25] that would likely be much simpler and more flexible
[16:04:25] Dropping the gadget and extension in favor of a stand-alone web page.
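Editor's sketch (not from the log): the annotation idea discussed above could be modeled roughly as follows. It assumes key-value annotations with a forced choice among known values, an optional free-text summary, and annotations that pile up per rev_id rather than refuting each other. The names `ALLOWED_VALUES`, `Annotation`, and `AnnotationStore` are hypothetical and do not come from WikiLabels or ORES.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical value sets, copied from the examples given in the chat.
ALLOWED_VALUES = {
    "damaging": (True, False),
    "goodfaith": (True, False),
    "wp10": ("Stub", "Start", "C", "B", "GA", "FA"),
}


def is_allowed(key, value):
    """Forced choice: the value must be one of the known options for the key."""
    options = ALLOWED_VALUES.get(key, ())
    # Compare types too, so that 1 is not silently accepted as True.
    return any(type(value) is type(opt) and value == opt for opt in options)


@dataclass(frozen=True)
class Annotation:
    key: str
    value: object
    author: str
    summary: str = ""  # a cheap refutation may leave this blank

    def __post_init__(self):
        if not is_allowed(self.key, self.value):
            raise ValueError(f"{self.value!r} is not a valid value for {self.key!r}")


class AnnotationStore:
    """Annotations accumulate per rev_id; none of them replaces another."""

    def __init__(self):
        self._by_rev = defaultdict(list)

    def add(self, rev_id, annotation):
        self._by_rev[rev_id].append(annotation)

    def for_rev(self, rev_id):
        return list(self._by_rev.get(rev_id, []))

    def is_patrolled(self, rev_id):
        # Assumption from the chat: no annotation means "unpatrolled".
        return bool(self._by_rev.get(rev_id))
```

For example, a patroller disagreeing with an ORES "damaging" prediction would call `store.add(rev_id, Annotation("damaging", False, "some_patroller"))`, leaving the summary blank.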
[16:04:54] I guess I would prefer that option as well, based on the kind of branching and intersecting workflows we're discussing
[16:05:26] currently, does performing WikiLabelling create a revision?
[16:05:42] J-Mo, I imagine that a campaign will be "Make sure this set of edits has at least N label/annotation/refutation(s)"
[16:06:39] And an ID for the object being labeled/refuted/annotated will be a pair of (, "revision", ), (, "page", ), etc.
[16:07:01] So we can have arbitrary things from a wiki being classified & labeled.
[16:07:22] do you multi-code labels in campaigns, and then use majority rule to resolve differences?
[16:07:31] sorry "multi-code revisions"
[16:07:37] J-Mo, depends on the campaign.
[16:07:38] or articles, or arbitrary things
[16:07:44] For edit quality, we just use one label.
[16:07:53] For article topic, we used 3 labels and majority
[16:08:13] For edit type, we use 4 labels and merged edit type classes that appear at least once.
[16:08:40] what about article quality?
[16:09:00] or, is that not based on WikiLabels?
[16:09:42] We haven't used wikilabels for that yet
[16:09:50] Might end up doing that as we add support for new wikis.
[16:10:05] We've been basing those models on templated assessments.
[16:11:34] cool. well halfak thanks for mulling this over with me. feel free to ping me when you want to pick the discussion up again.
[16:12:31] Any time! Will you be taking a look in the meantime?
[16:12:42] Or should I ping when I have something to report?
[16:12:47] yep. any particular campaign I can participate in right now, to get a sense of the workflow?
[16:13:14] even a test campaign
[16:13:52] I think you sent me a link to one of those before, months ago, but I'm not sure I ever got around to doing more than looking at the interface
[16:15:04] J-Mo, there's a campaign live on enwiki for supplementing our editquality models with more data.
[16:15:21] See https://en.wikipedia.org/wiki/Wikipedia:Labels "Edit quality (20k 2016 sample)"
[16:15:37] sweet. I
[16:15:41] 'll get on it
[16:15:50] also, this may be out of scope, but a stand-alone arbitrary content labeling web app could potentially ALSO be used for evaluating other types of machine recommendations (like Article Recc's)… just sayin' ;)
[16:17:26] Right
[16:17:27] Totally independent of ORES
[16:17:27] Just like WikiLabels :)
[17:08:09] halfak: did you want me to include the need for code reviews in the SoS?
[17:08:35] Yes please!
[17:14:13] halfak|Lunch: have you seen this? https://www.mediawiki.org/wiki/Code_Review/Office_Hours
[17:17:12] hmm...looks like it may not be active anymore
[18:11:43] schana, yes. When you linked to -codereview, I did read a bit about that.
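Editor's sketch (not from the log): the three per-campaign ways of resolving multiple labels mentioned earlier in the conversation (one label for edit quality, majority of 3 for article topic, merged classes from 4 labelers for edit type) could look roughly like this. The function names are hypothetical, and returning `None` when there is no strict majority is an assumption; the chat does not say how ties were handled.

```python
from collections import Counter


def single_label(labels):
    """Edit quality style: one labeler per item, so take that label as-is."""
    if len(labels) != 1:
        raise ValueError("expected exactly one label")
    return labels[0]


def majority_label(labels):
    """Article topic style: the label chosen by more than half of the labelers.

    Returns None when no label has a strict majority (an assumed policy).
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def merged_classes(label_sets):
    """Edit type style: keep every class that at least one labeler chose."""
    merged = set()
    for chosen in label_sets:
        merged |= set(chosen)
    return merged
```

With three topic labels `["sports", "sports", "history"]`, `majority_label` picks `"sports"`; with four edit-type labelers, `merged_classes` returns the union of everything any of them selected.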