[00:25:09] Revision-Scoring-As-A-Service, MediaWiki-API: Forward request data in proxied Action API modules - https://phabricator.wikimedia.org/T161029#3123872 (Tgr)
[14:02:23] halfak: o/
[14:06:03] Revision-Scoring-As-A-Service-Backlog, Wikilabels: Hidden revisions should be skipped - https://phabricator.wikimedia.org/T161104#3121644 (Halfak) Hi @Strainu. Currently, we don't do a good job of reporting to the user when a revision has been deleted/suppressed. We can certainly improve that. Skippin...
[14:06:23] Revision-Scoring-As-A-Service-Backlog, Wikilabels: Hidden revisions should be skipped - https://phabricator.wikimedia.org/T161104#3121644 (Halfak) p:Triage>Low
[14:06:30] Revision-Scoring-As-A-Service-Backlog, Wikilabels: Hidden revisions should be skipped - https://phabricator.wikimedia.org/T161104#3121644 (Halfak) p:Low>Normal
[14:07:22] Revision-Scoring-As-A-Service, ORES, Wikimedia-Logstash, Patch-For-Review, User-Ladsgroup: Send ORES logs to logstash - https://phabricator.wikimedia.org/T149010#3125096 (akosiaris) Actually it was not that. I had to remove the `log-route` directives to make logging (neither logstash nor loca...
[14:07:44] Revision-Scoring-As-A-Service-Backlog, Wikilabels: Hidden revisions should be skipped - https://phabricator.wikimedia.org/T161104#3121644 (Halfak)
[14:07:46] Revision-Scoring-As-A-Service-Backlog, Wikilabels: Revisions that cannot be retrieved should be skipped - https://phabricator.wikimedia.org/T161102#3125099 (Halfak)
[14:09:07] Revision-Scoring-As-A-Service-Backlog, Wikilabels: Revisions that cannot be retrieved should be skipped - https://phabricator.wikimedia.org/T161102#3121544 (Halfak) Merging in T161104 here since they are essentially the same problem from a labeling perspective. I like the title of this task because "rev...
[14:09:21] Revision-Scoring-As-A-Service-Backlog, Wikilabels: Revisions that cannot be retrieved should be skipped - https://phabricator.wikimedia.org/T161102#3125104 (Halfak) Copying in a message from T161104: Hi @Strainu. Currently, we don't do a good job of reporting to the user when a revision has been deleted...
[14:09:41] Revision-Scoring-As-A-Service-Backlog, Wikilabels: Revisions that cannot be retrieved should be skipped - https://phabricator.wikimedia.org/T161102#3121544 (Halfak) p:Triage>Normal
[14:10:06] Revision-Scoring-As-A-Service, Education-Program-Dashboard, ORES, Outreach-Programs-Projects, and 4 others: Add automatic article feedback feature to Wiki Ed Dashboard / Programs & Events Dashboard - https://phabricator.wikimedia.org/T158679#3125109 (Halfak)
[14:30:29] o/
[14:41:30] o/
[14:45:06] Revision-Scoring-As-A-Service-Backlog, Bad-Words-Detection-System, revscoring: Add language support for Korean - https://phabricator.wikimedia.org/T160757#3125254 (Halfak) @revi, I almost pulled this to our main workboard, but I realized that we still need a list of "informals". @Ladsgroup said that...
[14:46:03] halfak: that's what I'm going to work on tomorrow
[14:46:14] (real life sucks (tm))
[14:47:33] that=informals list
[14:48:56] Revision-Scoring-As-A-Service-Backlog, Bad-Words-Detection-System, revscoring: Add language support for Korean - https://phabricator.wikimedia.org/T160757#3125275 (revi) Unfortunately I have to say the updated version of the BWDS run is still meaningless except for one entry. Also, the informals list is what I was...
[14:50:30] Revision-Scoring-As-A-Service-Backlog, Bad-Words-Detection-System, revscoring: Add language support for Korean - https://phabricator.wikimedia.org/T160757#3125279 (Halfak) Gotcha. Sounds good. Sorry for the BWDS issues for Korean. I've been working on that a lot in the last week.
[14:51:03] :(
[14:51:42] that might be 'some words in frequently reverted pages' but they weren't bad words
[14:53:43] Revision-Scoring-As-A-Service-Backlog, Bad-Words-Detection-System, revscoring: Add language support for Korean - https://phabricator.wikimedia.org/T160757#3109896 (Halfak) p:Triage>High
[14:54:19] Revision-Scoring-As-A-Service, MediaWiki-extensions-ORES, Patch-For-Review: Make it possible for ORES to defer changes for review - https://phabricator.wikimedia.org/T150593#3125290 (Halfak)
[14:54:34] Revision-Scoring-As-A-Service-Backlog, ORES: Use redis to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3125291 (Halfak)
[14:55:53] Revision-Scoring-As-A-Service-Backlog, ORES: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3107881 (Halfak)
[14:59:52] Revision-Scoring-As-A-Service-Backlog, ORES: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3125301 (Halfak) I've boldly changed the title and description because poolcounter seems to get a strong recommendation. I'll look into that.
[15:06:29] Revision-Scoring-As-A-Service-Backlog, ORES: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3125329 (Halfak) Just found out that poolcounter was developed in-house and has some pretty poor documentation. It might not even be in version control. @...
[15:06:56] Revision-Scoring-As-A-Service-Backlog, ORES: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3125330 (Halfak) Aha!
Looks like it exists here: https://github.com/wikimedia/mediawiki-extensions-PoolCounter/tree/master/daemon
[15:24:52] Revision-Scoring-As-A-Service-Backlog, ORES: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3125360 (Halfak) Where would we run the daemon? Maybe we could use one of the redis nodes. What is the memory performance of poolcounter like? Do you th...
[17:34:04] Revision-Scoring-As-A-Service-Backlog, ORES: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3107881 (Legoktm) I think PoolCounter is reasonably well documented at https://wikitech.wikimedia.org/wiki/PoolCounter - is there something specific you se...
[17:46:33] Revision-Scoring-As-A-Service-Backlog, ORES: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3125754 (Halfak) Yeah, the link to the server code is broken (just fixed it). No description of how to build and invoke the daemon (I'd have to dig through pu...
[17:54:44] Revision-Scoring-As-A-Service-Backlog, ORES: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3125766 (Halfak) I should mention that I'm also looking at this as a SOFIXIT thing. If we're going to use this, we should FIXIT the documentation.
[18:00:22] Revision-Scoring-As-A-Service-Backlog, ORES: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3125773 (Legoktm) >>! In T160692#3125754, @Halfak wrote: > Yeah, the link to the server code is broken (just fixed it). No description of how to build and inv...
[20:55:57] halfak: I think I've found a bug in the goodfaith model stats, unless I'm misunderstanding them.
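[Editor's aside on the PoolCounter thread above. PoolCounter itself is a shared network daemon with its own wire protocol (see the wikitech page Legoktm links); the load-limiting idea being discussed — a fixed number of worker slots, a bounded queue, and a timeout — can be sketched in-process. Every name and parameter below is hypothetical illustration, not ORES or PoolCounter code:]

```python
import threading
from contextlib import contextmanager

class PoolFullError(Exception):
    """Too many requests already in flight; shed load instead of queueing."""

class ConnectionLimiter:
    """Allow `workers` concurrent jobs plus `max_queue` waiters; reject the rest."""

    def __init__(self, workers, max_queue):
        self.workers = workers        # concurrent slots
        self.max_queue = max_queue    # extra requests allowed to wait
        self._slots = threading.Semaphore(workers)
        self._inflight = 0
        self._lock = threading.Lock()

    @contextmanager
    def slot(self, timeout=5.0):
        # Reject immediately if workers + queue are both saturated.
        with self._lock:
            if self._inflight >= self.workers + self.max_queue:
                raise PoolFullError("queue full")
            self._inflight += 1
        try:
            # Otherwise wait (up to `timeout`) for a worker slot.
            if not self._slots.acquire(timeout=timeout):
                raise TimeoutError("no free slot within timeout")
            try:
                yield
            finally:
                self._slots.release()
        finally:
            with self._lock:
                self._inflight -= 1
```

A caller would wrap each scoring request in `with limiter.slot(): ...` and turn `PoolFullError`/`TimeoutError` into an HTTP 503. The real PoolCounter does the same bookkeeping centrally so the cap holds across all uwsgi processes, which a process-local semaphore like this cannot.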
The crux of the issue is that our code is computing "maybe bad faith" as score<0.317 and "likely bad faith" as score<0.812, which is obviously wrong because "likely" should be stricter than "maybe"
[20:56:23] I'm confused.
[20:56:24] They're pegged to recall_at_precision(min_precision=0.15,false) and recall_at_precision(min_precision=0.45,false) respectively
[20:56:42] And I think the ORES API is returning strange results there
[20:56:51] So looking at https://ores.wikimedia.org/v2/scores/enwiki/goodfaith?model_info=test_stats
[20:57:37] For recall_at_precision(min_precision=0.15) it says "false": { "precision": 0.156, "recall": 0.785, "threshold": 0.317 }
[20:58:12] Right
[20:58:19] Which I interpret as "if you set your threshold as score<0.317, you will get false (bad faith) with precision 0.156 and recall 0.785"
[20:58:34] Right
[20:58:47] false = 1-true
[20:59:02] But then for recall_at_precision(min_precision=0.45) it claims that if I set threshold score<0.812, I will get false with precision 0.464 and recall 0.366
[20:59:11] But this sounds impossible
[20:59:12] threshold for which score?
[20:59:17] Ooooh
[20:59:20] :)
[20:59:24] The thresholds are also one-minused?
[20:59:26] for false?
[20:59:28] Nope.
[20:59:37] But the ores extension only stores a score for true
[20:59:42] Right
[20:59:47] So in order to get the score for false, use 1-true :)
[20:59:51] Oh, but are the thresholds for the false scores?
[20:59:54] That makes some sense
[21:00:01] We were using thresholds on the false scores but applying them to true scores
[21:00:04] E.g. https://ores.wikimedia.org/v2/scores/enwiki/goodfaith/23847523
[21:00:07] So we need to one-minus all the thresholds for false
[21:00:13] Both false and true have their own score.
[21:00:14] Riiiight
[21:00:15] This makes sense
[21:00:19] This is important for multi-class.
[21:00:27] And binary classes just magically inherit that
[21:00:41] Yeah, I understand
[21:00:49] But we optimize by storing only one score, and that's biting us here
[21:01:03] OK, I'll fix the code to apply 1- to all the false-based thresholds
[21:02:40] Sorry for the confusion. Opposing metrics and fitness are weird.
[21:06:12] Revision-Scoring-As-A-Service-Backlog, ORES: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692#3126539 (akosiaris) >>! In T160692#3125360, @Halfak wrote: > Where would we run the daemon? Maybe we could use one of the redis nodes. What is the memory...
[21:25:29] Revision-Scoring-As-A-Service, ORES: Exclude precaching requests from cache_miss/cache_hit metrics - https://phabricator.wikimedia.org/T159502#3126677 (Halfak) https://github.com/wiki-ai/ores/pull/191
[21:25:39] Revision-Scoring-As-A-Service, ORES: Exclude precaching requests from cache_miss/cache_hit metrics - https://phabricator.wikimedia.org/T159502#3126682 (Halfak) Still a WIP
[22:14:08] (PS1) Catrope: Stats: Invert "false" thresholds so they're correct [extensions/ORES] - https://gerrit.wikimedia.org/r/344553 (https://phabricator.wikimedia.org/T161250)
[22:16:18] (CR) jerkins-bot: [V: -1] Stats: Invert "false" thresholds so they're correct [extensions/ORES] - https://gerrit.wikimedia.org/r/344553 (https://phabricator.wikimedia.org/T161250) (owner: Catrope)
[22:18:32] (PS2) Catrope: Stats: Invert "false" thresholds so they're correct [extensions/ORES] - https://gerrit.wikimedia.org/r/344553 (https://phabricator.wikimedia.org/T161250)
[22:25:53] (CR) Mattflaschen: [C: 2] Stats: Invert "false" thresholds so they're correct [extensions/ORES] - https://gerrit.wikimedia.org/r/344553 (https://phabricator.wikimedia.org/T161250) (owner: Catrope)
[22:28:08] (Merged) jenkins-bot: Stats: Invert "false" thresholds so they're correct [extensions/ORES] - https://gerrit.wikimedia.org/r/344553 (https://phabricator.wikimedia.org/T161250) (owner: Catrope)
[22:32:40] (PS1) Catrope: Stats: Invert "false" thresholds so they're correct [extensions/ORES] (wmf/1.29.0-wmf.17) - https://gerrit.wikimedia.org/r/344555 (https://phabricator.wikimedia.org/T161250)
[22:49:45] halfak: Thanks for your help, I fixed this in our threshold extraction code; SWATting the fix in 10 mins
[23:04:03] RoanKattouw: I've found one bug; I'm going to make a phab card
[23:04:20] but one thing. Is there a way to make it remember our choices?
[23:04:37] (CR) Thcipriani: [C: 2] "SWAT" [extensions/ORES] (wmf/1.29.0-wmf.17) - https://gerrit.wikimedia.org/r/344555 (https://phabricator.wikimedia.org/T161250) (owner: Catrope)
[23:06:20] Amir1: Not in the UI, but you could bookmark the URL
[23:06:52] (Merged) jenkins-bot: Stats: Invert "false" thresholds so they're correct [extensions/ORES] (wmf/1.29.0-wmf.17) - https://gerrit.wikimedia.org/r/344555 (https://phabricator.wikimedia.org/T161250) (owner: Catrope)
[23:06:57] yeah, I guess it would be really complex to have a preference for it
[23:07:46] We could set a cookie if we wanted, but yeah, we decided to go for bookmarkability instead
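[Editor's note: the threshold bug discussed above boils down to a one-minus. The goodfaith model reports per-class stats, but the ORES extension stores only the "true" probability, so thresholds fit on the "false" score (which is 1-true) must be inverted before comparing. A sketch using the numbers quoted in the log; the function name is hypothetical, and treating each threshold as an inclusive lower bound on the false score is an assumption:]

```python
# Thresholds from the test_stats output quoted above. They are cut-offs
# on the "false" (bad faith) score, which equals 1 - true.
MAYBE_BADFAITH_FALSE_THRESHOLD = 0.317   # precision 0.156, recall 0.785
LIKELY_BADFAITH_FALSE_THRESHOLD = 0.812  # precision 0.464, recall 0.366

def badfaith_flags(true_score):
    """Flag an edit given only the stored goodfaith 'true' probability.

    The buggy code compared true_score itself against these thresholds;
    the fix is to compare the derived false score (1 - true), which is
    equivalent to testing true_score <= 1 - threshold.
    """
    false_score = 1 - true_score
    return {
        "maybe_badfaith": false_score >= MAYBE_BADFAITH_FALSE_THRESHOLD,
        "likely_badfaith": false_score >= LIKELY_BADFAITH_FALSE_THRESHOLD,
    }
```

After the inversion, "likely" (true <= 0.188) is strictly tighter than "maybe" (true <= 0.683), resolving the apparent contradiction that started the conversation.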