[00:37:30] Revision-Scoring-As-A-Service: Implement "thresholds", deprecate "pile of tests_stats" - https://phabricator.wikimedia.org/T162217#3155892 (Halfak)
[00:37:57] Revision-Scoring-As-A-Service, revscoring: Implement "thresholds", deprecate "pile of tests_stats" - https://phabricator.wikimedia.org/T162217#3155907 (Halfak)
[00:46:23] RoanKattouw, models are building
[00:46:48] Looks like they'll likely finish by tomorrow morning.
[01:18:13] Awesome, thanks
[03:54:36] Revision-Scoring-As-A-Service-Backlog, Deployment-Systems, Wikilabels, Chinese-Sites, I18n: Deploy latest zh-hans/hant translations for Wikilabels on wmflabs - https://phabricator.wikimedia.org/T162108#3156050 (Ladsgroup) I checked OAuth again (if it's broken, it would be a serious problem) a...
[03:59:22] Revision-Scoring-As-A-Service-Backlog, Deployment-Systems, Wikilabels, Chinese-Sites, I18n: Deploy latest zh-hans/hant translations for Wikilabels on wmflabs - https://phabricator.wikimedia.org/T162108#3156051 (Ladsgroup) The changes are deployed now, please check and tell me.
[04:37:53] Revision-Scoring-As-A-Service-Backlog, Deployment-Systems, Wikilabels, Chinese-Sites, I18n: Deploy latest zh-hans/hant translations for Wikilabels on wmflabs - https://phabricator.wikimedia.org/T162108#3156077 (Arthur2e5) Open>Resolved Looks like plain old bad luck as it is working no...
[04:39:25] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, Wikilabels, Chinese-Sites: Edit quality campaign for Chinese Wikipedia - https://phabricator.wikimedia.org/T116474#1750473 (Arthur2e5) A minor question: how is the "trusted user groups" part going? I've recently been reviewing...
[14:10:01] Revision-Scoring-As-A-Service, Wikilabels: Manage wikilabels for labsdb1004 maintenance - https://phabricator.wikimedia.org/T162265#3157320 (Halfak)
[14:38:49] Revision-Scoring-As-A-Service-Backlog, Bad-Words-Detection-System, revscoring: Add language support for Swahili (sw) - https://phabricator.wikimedia.org/T162271#3157434 (Baba_Tabita)
[15:53:02] Revision-Scoring-As-A-Service, Wikilabels: Manage wikilabels for labsdb1004 maintenance - https://phabricator.wikimedia.org/T162265#3157736 (Halfak)
[15:53:18] Revision-Scoring-As-A-Service, Wikilabels: Manage wikilabels for labsdb1004 maintenance - https://phabricator.wikimedia.org/T162265#3157320 (Halfak) It looks like this maintenance is part of T155401.
[17:19:39] Revision-Scoring-As-A-Service, Wikilabels: Manage wikilabels for labsdb1004 maintenance - https://phabricator.wikimedia.org/T162265#3158004 (MoritzMuehlenhoff) I'm flexible with the upgrade, just propose a time that works for you (it's also not time-critical/urgent).
[17:35:54] biking to the university. Back online in about 45 minutes
[19:58:07] Revision-Scoring-As-A-Service, Wikilabels: Manage wikilabels for labsdb1004 maintenance - https://phabricator.wikimedia.org/T162265#3158649 (Halfak) Great. I'm hoping to do the scheduling next week. Maybe we could do this in two weeks? Does that sound OK?
[22:35:11] where can one read more about this? "Our tests have revealed that the actual performance of the ORES models on your wiki is different from our assumptions. There is a high risk to have the same edit marked as "bad" and "good" at the same time"
[22:35:14] (from https://www.wikidata.org/wiki/Wikidata:Project_chat#New_filters_for_Recent_Changes_-_Beta_deployment_rescheduled )
[22:35:30] (asking as a heavy user of the beta feature on wikidata ;)
[22:41:20] HaeB, the ERI filtering thresholds are defined in such a way that it assumes the same level of fitness as English Wikipedia across the board.
[22:41:32] This results in a ton of inconsistencies from a user perspective.
[22:41:47] It turns out that the Wikidata model has much more fitness than English Wikipedia :)
[22:41:51] Happy problems
[22:42:23] There are safe ways to define these thresholds, but those were not explored.
[22:44:04] It turns out that a fitness distribution is not all that intuitive.
[23:01:27] halfak: Speaking of, did the model rebuild finish?
[23:01:37] Oh shoot. Good Q.
[23:01:38] * halfak looks
[23:01:53] * halfak had a giant meeting day
[23:02:08] yes! It finished.
[23:02:24] I'll need a couple of hours tomorrow to review the test stats and then we can prepare a deployment.
[23:02:32] If I'm fast enough, it'll get deployed tomorrow.
[23:02:39] If not, I don't think there's a window on Friday.
[23:02:50] Regretfully, tomorrow is also a mega-meeting day :(
[23:03:13] :/
[23:03:33] halfak: In the meantime, can you send me the stats?
[23:04:46] (If that is feasible/convenient)
[23:05:16] At present I'm more interested in being able to view those stats than getting them deployed
[23:05:17] RoanKattouw, all of them?
[23:05:32] If possible, yes. I like my data in large quantities
[23:07:50] If it's more convenient to get just a few wikis or just a few stats then I can give you a list of my preferred ones
[23:12:38] A list would help.
[23:12:44] It's highly manual at this point.
[23:12:48] Hmm... wait. I have an idea.
[23:20:41] I care most about plwiki, ptwiki, ruwiki, trwiki and fawiki for the time being, and primarily about precision 0.98, 0.99 and 0.995
[23:22:04] I'm going to have it all in a mega-json-blob in a bit
[23:23:15] Awesome
[23:28:41] RoanKattouw, https://gist.github.com/halfak/ee27abd719744456b5b5fc48d9dd9127
[23:29:03] gotta run. I'll do my best to have something useful for you tomorrow.
[23:29:07] o/
[23:30:46] That's already plenty useful, thanks!
[23:37:46] Hmm the results for plwiki.damaging seem a bit fantastical, 99.6% precision and 97.7% recall?!
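
Note on the precision targets discussed above (0.98, 0.99, 0.995): the following is a minimal sketch of how a score threshold could be derived per model from labeled test data, rather than reusing one wiki's threshold everywhere. It is not the revscoring or ORES implementation; the function names `precision_recall_at` and `threshold_for_precision` and the toy data are hypothetical, chosen only for illustration.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when every score >= threshold is flagged positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


def threshold_for_precision(scores, labels, target):
    """Return (threshold, precision, recall) for the lowest threshold that
    meets the target precision, or None if no threshold does.

    Recall never increases as the threshold rises, so the lowest qualifying
    threshold is also the one with the highest recall.
    """
    for t in sorted(set(scores)):
        precision, recall = precision_recall_at(scores, labels, t)
        if precision is not None and precision >= target:
            return t, precision, recall
    return None


# Toy data (hypothetical): model scores and whether each edit was truly damaging.
scores = [0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95]
labels = [False, False, False, True, False, True, True, True]

for target in (0.98, 0.99, 0.995):
    print(target, threshold_for_precision(scores, labels, target))
```

Because the threshold is computed from each model's own test statistics, a model with a very different score distribution (as described for Wikidata versus English Wikipedia) gets its own cut-off instead of inheriting one that was tuned elsewhere.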