[15:15:37] halfak: o/
[15:16:31] o/
[15:21:08] Thinking about your note on https://github.com/wiki-ai/ores/pull/135
[15:21:33] So, I'm not sure about the separate structure for feature values.
[15:21:43] But this matches our notes from the v2 spec.
[15:22:19] See https://etherpad.wikimedia.org/p/ores_response_structure
[15:22:33] Note lines 73-75
[15:22:59] Fun story -- I highlighted that when we were last talking about the schema :)
[15:23:44] Still, I don't think it's a problem to change things up.
[15:24:02] We could, for example, make it so that "features" were part of the "score" itself.
[15:24:29] I don't like that because we want to cache the "score" as a JSON blob without considering what is in it.
[15:24:36] And it is wasteful to cache the feature values.
[15:26:39] hmm
[15:26:42] But structurally, it seems to make more sense to have it live within the score.
[15:26:44] let me read those
[15:27:11] Also, I found an issue with the ORES PR, so don't merge yet
[15:30:19] halfak: I think the first "scores" (line 69) should be named something else, like "result", "query" or something else,
[15:33:56] Amir1, it's nice that our document structure reflects the path structure though, right?
[15:34:24] yeah <3 swagger
[15:34:24] Note that, the changes made in this PR finish the implementation of the v2 schema.
[15:34:25] :)
[15:34:37] Right now, we have a weird hybrid deployed.
[15:34:53] (which is OK since we didn't really announce the v2 schema yet)
[15:35:19] I think this feature API will be a good incentive to move over to the new spec.
[15:37:06] +1
[15:37:21] I need to be afk for a while but I'll be back soon
[15:37:28] and I guess you finished the PR by tehn
[15:37:30] *then
[15:37:39] send me an email or telegram
[15:37:45] I check that
[15:37:50] E.g. when you request "/scores//" you get {"scores": {"": {": {"version": ...}, "": {"version: ...}}}
[15:37:52] kk
[15:37:54] Will do
[20:19:17] hey halfak ! :-)
[20:19:23] quick question: https://phabricator.wikimedia.org/T129230#2154064
[20:19:25] Hey Helder !
[20:20:02] No time to respond right now, but I will soon.
[20:20:16] For the time being, know that these statistics are intended to be *guiding*
[20:20:36] ok
[20:20:39] Since our test set is imperfect, there are some issues with these stats that need to be vetted qualitatively
[20:21:01] So, I'd be very interested to know how they work out *in practice*
[20:21:27] But know that we're planning to look into changing the *range* of the probability predictions soon.
[20:22:02] So vetting the probabilities output by the linear models is less useful than vetting what we are deploying next (GradientBoosting & RandomForest models)
[20:22:58] We'll be making an announcement soon about the switch, then I'd like to have people vet the statistics we provide so that automated tools (like ScoredRevisions) might soon use these stats to automatically set thresholds.
[20:23:19] I expect some iteration will be required for that
[20:23:34] So I've planned to work on the test statistics a bit once feedback starts rolling in.
[20:23:52] that would be my second question: if there was anything to do in the gadget already, related to these thresholds
[20:24:18] Helder, once we make this announcement, there will be. The useful thresholds will change.
[20:24:30] I'm not 100% clear that this metadata will work for you yet.
[20:24:39] (You == scoredrevisions)
[20:24:39] right
[20:25:04] Right now, I suspect that low quality in our test set makes these thresholds a bit conservative
[20:25:18] This is based on analysis of the english and wikidata models.
[20:25:48] Would be very happy to set up a test instance of ORES if you wanted to experiment on ptwiki
[20:25:49] :)
[20:26:47] The real response to your phab question would involve a proper writeup about what we mean by "recall_at_fpr"
[20:27:07] For the time being, the best reference is https://meta.wikimedia.org/wiki/Research:Building_automated_vandalism_detection_tool_for_Wikidata
[20:28:38] * Helder will read that
[20:33:49] \o/
[20:34:12] we need to make some small modifications to the page
[20:34:13] halfak, should I be using ores.wmflabs.org/v2 instead of ores.wmflabs.org?
[20:34:26] Helder, almost
[20:34:32] We're still getting v2 up to spec
[20:34:48] It is totally useful, but the JSON structure is going to change
[20:35:06] In fact, we have already merged the changes and just need to deploy.
[20:35:20] I'll be doing that deploy tomorrow morning UTC-6
[20:35:46] Woops in daylight savings now, so that'
[20:35:50] s UTC -5
[20:41:31] ok
[20:41:46] any relevant links?
[20:41:54] docs? patches?
[20:45:00] Helder: hey, https://etherpad.wikimedia.org/p/ores_response_structure
[20:45:06] this is one of them
[20:45:39] halfak: see the new commit, please merge the PR related to scap
[20:47:42] thanks
[20:59:11] (PS6) Ladsgroup: Integrate with Special:Contributions [extensions/ORES] - https://gerrit.wikimedia.org/r/264608 (https://phabricator.wikimedia.org/T122537) (owner: Awight)
[21:00:26] (CR) jenkins-bot: [V: -1] Integrate with Special:Contributions [extensions/ORES] - https://gerrit.wikimedia.org/r/264608 (https://phabricator.wikimedia.org/T122537) (owner: Awight)
[21:04:38] (CR) Ladsgroup: "PS6 is just rebase. Thou shall not merge this until 1- some upstream patches get merged 2- passes composer test 3- lots of updates have be" [extensions/ORES] - https://gerrit.wikimedia.org/r/264608 (https://phabricator.wikimedia.org/T122537) (owner: Awight)
[21:54:00] (PS7) Ladsgroup: Integrate with Special:Contributions [extensions/ORES] - https://gerrit.wikimedia.org/r/264608 (https://phabricator.wikimedia.org/T122537) (owner: Awight)
[23:58:06] (PS8) Ladsgroup: Integrate with Special:Contributions [extensions/ORES] - https://gerrit.wikimedia.org/r/264608 (https://phabricator.wikimedia.org/T122537) (owner: Awight)
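
The schema discussion earlier in this log (15:24 to 15:37) turns on two ideas: the response document should nest the same way the request path does, and feature values should sit next to the "score" rather than inside it, so the score can be cached as an opaque JSON blob. A minimal Python sketch of that arrangement follows. Everything in it is an assumption made for illustration: `nest`, `compute_score`, `build_response`, `score_cache`, the `examplewiki`/`examplemodel` names, and the feature name are hypothetical and not taken from ORES, and the exact v2 nesting is not recoverable from the log.

```python
import json

# Hypothetical in-memory cache: (context, model, rev_id) -> serialized score.
# Only the score is stored, so the blob never needs to be inspected.
score_cache = {}


def nest(segments, leaf):
    """Wrap `leaf` in one dict level per path segment, outermost first."""
    doc = leaf
    for segment in reversed(segments):
        doc = {segment: doc}
    return doc


def compute_score(context, model, rev_id):
    """Stand-in for real scoring; returns (score, feature_values)."""
    score = {"prediction": False, "probability": {"true": 0.1, "false": 0.9}}
    features = {"example.feature": 42}  # made-up feature name and value
    return score, features


def build_response(context, model, rev_id, include_features=False):
    """Build a response whose nesting mirrors a /scores/... style path."""
    key = (context, model, rev_id)
    features = None
    # Features are not cached, so asking for them forces a recomputation.
    if include_features or key not in score_cache:
        score, features = compute_score(context, model, rev_id)
        score_cache[key] = json.dumps(score)
    entry = {"score": json.loads(score_cache[key])}
    if include_features:
        entry["features"] = features  # sibling of "score", not inside it
    return nest(["scores", context, model, str(rev_id)], entry)
```

Keeping "features" as a sibling key means a cached score can be returned verbatim, at the cost of recomputing whenever features are requested.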