[16:47:41] o/ Amir1
[17:23:21] halfak: hey
[17:23:34] halfak: I just got to work
[17:23:43] what's the plan for today
[17:25:55] I'm doing a bit to support fajne, but otherwise, I think I'll be leaving early.
[17:26:24] I'm still pretty exhausted. My flight from SF landed at 1AM local time and Isla (dog) needed me to get up at 7AM
[17:26:36] Anything you want me to look at quickly?
[17:26:38] Amir1, ^
[17:27:01] halfak: not yet
[17:27:10] mostly to see what we can do about celery 4 thing
[17:29:52] Scoring-platform-team (Current), articlequality-modeling, User-Ladsgroup, artificial-intelligence: Article quality campaign for Persian Wikipedia - https://phabricator.wikimedia.org/T174684#3772302 (Halfak) I think the next step is to analyze the length distribution of articles and compare that t...
[17:29:52] https://phabricator.wikimedia.org/T174684#3772302
[17:29:56] Amir1, ^
[17:30:37] halfak: I found something that I need you to take a look at: https://phabricator.wikimedia.org/T180450 and https://phabricator.wikimedia.org/T180686
[17:30:44] The thresholds are messed up in wikidata
[17:31:10] you set it too high, it doesn't catch anything, you set it small, it catches everything
[17:31:25] I can't find the middle ground
[17:32:03] Anything wrong with the model building or configuration?
[17:32:31] I don't know, it can be
[17:32:41] because after the new deployment it stopped working properly
[17:33:06] Do you have a set of false positives we can use to test?
[17:35:18] Amir1, I need to step away now. But I can help look at this tomorrow. it seems urgent.
[17:35:43] My plan was to use the "revscoring score" utility to make sure the predictions matched ores and then move forward from there.
[17:35:45] halfak: finding those won't be hard
[17:36:23] Amir1, it's possible that the real problem is in the population re-scaling that we're doing. If that's the case, then we can drop it and re-train the models.
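Note on the population re-scaling suspicion at 17:36:23: the sketch below shows one way a prior-shift correction can move scores so far that a previously useful threshold stops catching anything. It is only an illustration under stated assumptions. The function names, the 50% training rate, the 1% population rate and the example scores are all hypothetical, and this is the textbook prior-shift formula, not necessarily what revscoring or ORES does.

```python
def rescale_probability(p, sample_rate, population_rate):
    """Adjust a probability learned on a sample whose positive rate is
    sample_rate so that it reflects a population whose positive rate is
    population_rate (standard prior-shift correction; an assumption here)."""
    pos = p * population_rate / sample_rate
    neg = (1 - p) * (1 - population_rate) / (1 - sample_rate)
    return pos / (pos + neg)


def flagged(scores, threshold):
    """Return the scores that meet or exceed the threshold."""
    return [s for s in scores if s >= threshold]


if __name__ == "__main__":
    # Hypothetical raw scores from a model trained on a balanced sample.
    raw = [0.05, 0.2, 0.5, 0.8, 0.95]
    # Hypothetical rates: 50% positives in training, 1% in the population.
    scaled = [rescale_probability(p, 0.5, 0.01) for p in raw]
    for threshold in (0.5, 0.1, 0.001):
        print(threshold, len(flagged(raw, threshold)), len(flagged(scaled, threshold)))
```

With a rare positive class the rescaled scores are compressed toward zero, so the usable thresholds sit in a narrow band near the bottom of the range. That is at least consistent with the "too high catches nothing, too low catches everything" behaviour described at 17:31:10.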
[17:37:05] halfak: That's exactly my suspicion, nothing has changed beside that
[17:37:24] halfak: before you go, https://phabricator.wikimedia.org/T174684#3772302
[17:37:34] how many articles to sample?
[17:37:40] (for distribution of the whole)
[17:37:59] bring each target quality class up to 250 observations
[17:38:13] is our goal.
[17:38:32] But honestly, I think you might just target 150 and not supplement the GA/FA classes.
[17:38:41] 150 would be a fine place to start
[17:57:00] Scoring-platform-team (Current), MediaWiki-extensions-ORES, Beta-Cluster-reproducible: ORES RC filters missing in beta cluster, fetching thresholds fails - https://phabricator.wikimedia.org/T180633#3764583 (Ladsgroup) Now it works just fine: ``` ladsgroup@deployment-tin:~$ mwscript eval.php enwiki >...
[20:02:33] Amir1: on travis I see this error with flake8 but its difficult to fix since locally flake8 doesn't complain - https://travis-ci.org/wiki-ai/drafttopic/builds/304070050?utm_source=github_status&utm_medium=notification
[20:02:36] any workarounds?
[20:02:45] i don't have any predefined configs
[20:06:51] Amir1: nevermind got it, flake8 has a peculiar thing to hide some errors in presence of others, it showed up on using ignore=E722
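Note on the sampling target from 17:37:59 to 17:38:41: the arithmetic of "bring each class up to N observations, but do not supplement GA/FA" can be sketched roughly as below. The helper name, the class labels other than GA/FA, and the counts are hypothetical placeholders, not figures from the Persian Wikipedia campaign.

```python
def extra_needed(current_counts, target, skip=()):
    """For each quality class not in skip, return how many more
    observations are needed to reach target (never negative)."""
    return {
        label: max(0, target - count)
        for label, count in current_counts.items()
        if label not in skip
    }


if __name__ == "__main__":
    # Hypothetical labelled counts per quality class.
    counts = {"Stub": 400, "Start": 90, "GA": 30, "FA": 12}
    # Target of 150 per class, leaving GA/FA as they are.
    print(extra_needed(counts, 150, skip=("GA", "FA")))
    # The fuller goal of 250 per class, supplementing every class.
    print(extra_needed(counts, 250))
```

Classes that already exceed the target come back as 0 rather than a negative number, so the result can be used directly as a per-class sampling quota.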