[08:43:36] Amir1: do you have any debugging tips for T185903?
[08:43:36] T185903: Train/test damaging and goodfaith model for Hungarian Wikipedia - https://phabricator.wikimedia.org/T185903
[12:59:46] tgr|away: hey, tell me when you are around
[13:05:21] Scoring-platform-team (Current), User-Ladsgroup: Edit quality campaign for Catalan Wikipedia - https://phabricator.wikimedia.org/T187771#4010106 (Ladsgroup) That's amazing, I get to it today
[13:15:51] halAFK: For when you're around ^
[15:37:40] PROBLEM - https://grafana.wikimedia.org/dashboard/db/ores grafana alert on einsteinium is CRITICAL: CRITICAL: https://grafana.wikimedia.org/dashboard/db/ores is alerting: 5xx rate (Change prop) alert.
[15:49:40] RECOVERY - https://grafana.wikimedia.org/dashboard/db/ores grafana alert on einsteinium is OK: OK: https://grafana.wikimedia.org/dashboard/db/ores is not alerting.
[15:57:45] Amir1, codezee: I'm going to switch locations. I'll be offline for a bit.
[15:58:59] Amir1, I see codezee is offline. Could you let him know when he gets back
[15:59:01] ?
[17:33:59] Amir1: o/
[18:03:53] o/ Amir1
[18:04:03] hey
[18:04:05] Looking at fawiki's article quality thingie.
[18:04:17] my mind is blurry about it
[18:04:25] Me too :) I see you did some normal distribution analysis.
[18:05:01] I think we should probably just make some arbitrary decisions and move forward from there.
[18:05:24] I'm imagining a pilot labeling campaign that will help us refine our strata and then a larger labeling campaign based on what we learn from the first one.
[18:05:47] I think we can get pretty far if we have people label 100 pages.
[18:09:25] hmm, understandable
[18:10:37] OK I think I'll work from https://quarry.wmflabs.org/query/25137 to propose some preliminary strata.
[18:10:52] Can you link me to the table defining the quality classes for fawiki?
[18:10:56] Amir1, ^
[18:11:32] halfak: it should be the same as English Wikipedia
[18:11:44] OK. Even "Stub" and "Start"?
[18:12:00] yup
[18:16:27] Cool. I'll try to target that in the strata and use a bit of eye'ballin'
[18:16:31] :D
[18:16:33] Lunch!
[19:04:04] Amir1: I'm around now if that works
[19:05:54] tgr: yeah I'm too, just grab some water and I will be back
[19:06:13] great, thx
[19:09:04] tgr: so,
[19:09:28] I just templated huwiki several days ago, actually it was an edge case because of type of the model (random forest)
[19:09:35] but we handled that before
[19:12:02] how do you choose between the models? when I tried the tuning seemed to give superior results for gradient boosting: https://github.com/wiki-ai/editquality/compare/master...tgr:huwiki-damaging-goodfaith-T185903-v2?expand=1#diff-487b14f7e3629b1f795deeccfb5ce65a
[19:12:02] T185903: Train/test damaging and goodfaith model for Hungarian Wikipedia - https://phabricator.wikimedia.org/T185903
[19:12:03] tgr: so now adding damaging models using templates will be easy
[19:12:26] you need to choose the top from the result of the tuning report
[19:12:47] so you need to build the tuning report first and then tune it after that
[19:13:51] yeah, I got that far - did the templating, did the tuning, picked the top parameters (although it was gradient boosting for me) and then got the JSON error when actually trying to build the model
[19:15:02] tgr: funnily enough, that's very new and actually a bug (that I probably introduced) because I'm getting the same for cawiki
[19:15:10] probably did some stupid mistake but not sure where to start looking - the only JSON file included is the labeling dataset and that one seems OK
[19:15:47] what I would suggest is to make a model for english wikipedia and see the differences between datasets
[19:15:56] that might give us a clue
[19:16:18] will do, thanks
[19:17:00] I'm doing the same here
[19:19:20] tgr: I can reproduce the bug everywhere
[19:22:16] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: JSON error in buildign models - https://phabricator.wikimedia.org/T188535#4011465 (Ladsgroup)
[19:22:31] tgr: I found the issue
[19:23:10] during the templating, one of the things I changed was to change -p 'max_features="log2"' to without double quotes
[19:23:22] I thought it doesn't affect us at all
[19:23:34] but by adding them back it just works
[19:23:51] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: JSON error in building models - https://phabricator.wikimedia.org/T188535#4011479 (Ladsgroup)
[19:25:21] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: JSON error in building models - https://phabricator.wikimedia.org/T188535#4011465 (Ladsgroup) Changing `-p 'max_features=log2' \` to `-p 'max_features="log2"' \` fixes it. (Mind blown). This was one of changes I did while...
[19:26:15] huh
[19:26:36] thanks, I'll try to build the models again after working hours
[19:34:31] have fun, I'm done for the day but might still pick this up at home as it's so stupid
[20:31:04] Scoring-platform-team (Current), articlequality-modeling, User-Ladsgroup, artificial-intelligence: Article quality campaign for Persian Wikipedia - https://phabricator.wikimedia.org/T174684#4011724 (Halfak) OK so it looks like we have the following strata: |ROUND(LOG10(page_len))|COUNT(*) |1|325...
[20:47:23] Scoring-platform-team, Research, Research-outreach: Organize a technical workshop for ORES - https://phabricator.wikimedia.org/T141310#2493997 (leila) @Halfak @DarTar can you help us find the right place for this task? Are we planning to do this in Research? In Scoring Platform team? ... :)
[21:07:25] (PS1) Reedy: Member variables with type hints [extensions/JADE] - https://gerrit.wikimedia.org/r/415390
[21:12:51] Scoring-platform-team (Current), articlequality-modeling, User-Ladsgroup, artificial-intelligence: Article quality campaign for Persian Wikipedia - https://phabricator.wikimedia.org/T174684#4011898 (Halfak) Using: https://quarry.wmflabs.org/query/25153 * 10-100 bytes: Articles consist of a singl...
[21:13:29] Alright! We have a pilot sample
[21:13:30] WOO
[21:13:41] Now to get it stitched into wikiclass.
[21:16:22] Scoring-platform-team (Current), articlequality-modeling, User-Ladsgroup, artificial-intelligence: Article quality campaign for Persian Wikipedia - https://phabricator.wikimedia.org/T174684#4011906 (Halfak) @Ladsgroup, can you put together a "feature_list" module for fawiki. Reference https://gi...
[21:32:50] wiki-ai/wikiclass#44 (fawiki - 411b90b : halfak): The build passed. https://travis-ci.org/wiki-ai/wikiclass/builds/347480013
[21:48:32] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [extensions/JADE] - https://gerrit.wikimedia.org/r/415420 (owner: L10n-bot)
[21:52:33] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [extensions/ORES] - https://gerrit.wikimedia.org/r/415431 (owner: L10n-bot)
[22:37:31] PROBLEM - https://grafana.wikimedia.org/dashboard/db/ores-extension grafana alert on einsteinium is CRITICAL: CRITICAL: https://grafana.wikimedia.org/dashboard/db/ores-extension is alerting: Failure rate alert.
[22:38:31] RECOVERY - https://grafana.wikimedia.org/dashboard/db/ores-extension grafana alert on einsteinium is OK: OK: https://grafana.wikimedia.org/dashboard/db/ores-extension is not alerting.
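
Note on the 19:23 fix for T188535: the log does not show how the build tool reads `-p` arguments, but the symptom (a JSON error that disappears once the double quotes are restored) is what one would expect if each value after the `=` were decoded as JSON. The sketch below only illustrates that assumption. `parse_param` is a hypothetical helper, not the tool's actual code, and `n_estimators=320` is a made-up argument included for contrast.

```python
import json


def parse_param(arg):
    """Split 'name=value' and decode the value as JSON.

    Hypothetical helper: it mimics the behaviour assumed above,
    not the real command-line parser.
    """
    name, _, raw = arg.partition("=")
    return name, json.loads(raw)


if __name__ == "__main__":
    examples = [
        'max_features="log2"',  # quoted: a valid JSON string
        "n_estimators=320",     # made-up argument: a valid JSON number
        "max_features=log2",    # bare word: not valid JSON
    ]
    for arg in examples:
        try:
            print(arg, "->", parse_param(arg))
        except json.JSONDecodeError as err:
            print(arg, "-> JSON error:", err)
```

Under that assumption, numbers can be passed bare, while string values such as `log2` need their double quotes to survive shell quoting, which matches the wrapping in single quotes seen in the log.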