[08:49:34] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Implement hunspell dictionary for euwiki article quality model - https://phabricator.wikimedia.org/T223788 (Theklan) How could we train this? I'm adding @Ksarasola to this topic; maybe he has some great ideas.
[12:49:41] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Implement hunspell dictionary for euwiki article quality model - https://phabricator.wikimedia.org/T223788 (Halfak) @Theklan! We've got training in progress right now. But, I'm very interested in working with new volunt...
[15:02:57] Technical Advice IRC meeting starting in 60 minutes in channel #wikimedia-tech, hosts: - all questions welcome, more info: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[15:52:43] Technical Advice IRC meeting starting in 10 minutes in channel #wikimedia-tech, hosts: - all questions welcome, more info: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[16:50:14] halfak: I moved our meeting to Friday.
[16:50:38] OK, that works for me.
[16:51:00] If you wanna check in now, that's OK, but I don't have the threshold analysis results with the default thresholds yet.
[16:51:07] Suddenly I have more time for my (embarrassingly late) GROUP reviews.
[16:51:11] Na, it's cool :)
[16:51:16] Pretty close, though; I'm just waiting for code to run.
[16:51:53] When I just excluded the missing thresholds, it seemed clear that (1) thresholds mattered a lot and (2) rolling out RC filters made a difference.
[16:52:35] So on Friday we'll talk about that and about reworking the outline.
[16:56:30] halfak: just saw that you're coming to give a DUB talk at UW :)
[16:57:09] Oh yes :) I'm quite behind on putting it together. I don't want to recycle too much.
[16:57:19] But \o/ that we can see an effect around thresholds!
[16:57:22] That's super cool!
[16:57:53] But also a bit depressing!
[16:59:38] https://teblunthuis.cc/outgoing/download.png
[16:59:56] This is super preliminary,
[17:00:07] not the correct data,
[17:00:26] but the first jump is the "likelybad" cutoff and the second is the "verylikelybad" cutoff.
[17:00:56] pre.cutoff.fact is true before RCFilters was enabled.
[17:01:08] But don't put much stock into this, because it's the wrong data.
[17:01:20] I'm going to bike in now; talk to you Friday.
[17:01:58] The error ribbon is the standard error of model predictions.
[17:03:09] Thanks to both of you, halfak and accraze; adding the SSH key to Gerrit actually removed the blocker.
[17:03:16] YES!
[17:03:30] halfak: I succeeded in cloning the puppet repo from Gerrit. I'll carry on from here tomorrow.
[17:04:09] Good day to you all.
[17:04:18] Have a good night, kevinbazira :)
[17:04:55] groceryheist, am I reading this right, that the cutoffs were clear even before the RCFilters deployment?
[17:05:25] I wouldn't put much stock in that part.
[17:06:29] Missing data isn't balanced pre/post cutoff;
[17:06:38] like, some wikis are in the post-cutoff data that aren't in the pre-cutoff data.
[17:08:55] There's a huge amount of missing data, so just take it as an illustration of what I'm working on, with a hint that we'll see threshold effects.
[17:10:53] Gotcha. Makes sense.
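The exchange above concerns a regression discontinuity around the ORES "likelybad" and "verylikelybad" score cutoffs, and groceryheist later asks how to model multiple discontinuities. One common approach is to encode each cutoff as a step indicator and interact the steps with the deployment period. Below is a minimal sketch in Python with statsmodels; the column names (damaging_score, reverted, post_rcfilters) and the cutoff values are hypothetical placeholders, since the real thresholds are per-wiki, and this is not the actual analysis code:

```python
import statsmodels.formula.api as smf

# Placeholder cutoffs -- the real "likelybad"/"verylikelybad" thresholds
# are wiki-specific and come from ORES threshold optimizations.
LIKELYBAD = 0.60
VERYLIKELYBAD = 0.90

def fit_discontinuities(df):
    """Linear probability model with a step at each score cutoff.

    Assumes df has columns:
      damaging_score -- ORES damaging probability for the edit
      reverted       -- 1 if the edit was reverted, else 0
      post_rcfilters -- 1 if the edit happened after RCFilters shipped
    """
    df = df.copy()
    df["above_lb"] = (df["damaging_score"] >= LIKELYBAD).astype(int)
    df["above_vlb"] = (df["damaging_score"] >= VERYLIKELYBAD).astype(int)
    # A slope in the score plus a jump at each threshold; interacting the
    # jumps with post_rcfilters tests whether deployment changed the step
    # sizes, which is the effect discussed above.
    formula = ("reverted ~ damaging_score"
               " + above_lb * post_rcfilters"
               " + above_vlb * post_rcfilters")
    return smf.ols(formula, data=df).fit()
```

In this setup, the coefficients on the above_lb:post_rcfilters and above_vlb:post_rcfilters interactions measure how much each step changed when RCFilters was rolled out.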
[18:43:56] halfak|Lunch: https://teblunthuis.cc/outgoing/threshold_analysis_less_missingness.png
[18:44:02] ^ this is with the defaults
[18:44:17] There are still a few missing observations that I have to track down,
[18:44:36] but this has the vast majority, so I don't think results will change much unless I change the model specification.
[18:45:13] Releasing RCFilters increased reversion probability at the "likelybad" level and decreased it at the "verylikelybad" level.
[18:47:15] I'm still not 100% sure about how to model multiple discontinuities.
[18:47:27] If you have any thoughts about that, it would be helpful.
[19:09:03] It's very weird that the steps exist before and after RCFilters.
[19:09:12] I don't know of any other tools that use the same thresholds.
[19:09:23] groceryheist, ^
[19:09:50] I wonder if something else happens at those thresholds that is important.
[19:09:59] Do we see the steps well before ORES was deployed?
[19:18:45] halfak: yeah, that is really weird.
[19:18:59] I'll find out.
[19:19:16] Awesome :)
[19:21:43] It could be people using the feature when it was in beta.
[19:22:39] To go very far back, I might need you to get some thresholds manually.
[20:01:34] Just got the puppet changes pushed. Hopefully akosi.aris can review in the morning :)
[20:07:12] Scoring-platform-team (Current), editquality-modeling, Patch-For-Review, artificial-intelligence: Implement hunspell dictionary for euwiki article quality model - https://phabricator.wikimedia.org/T223788 (Halfak) @kevinbazira, thanks to @Dzahn, you should be unblocked. Please try to rebuild the...
[20:07:53] Na, got mutante to do it. akosi.aris won't have to deal with the urgency in his morning. I think that's better. :)
[20:19:51] * halfak starts thinking through the size of edit sessions.
[20:20:29] I just did a quick analysis of the size of sessions (in number of edits).
[20:24:55] Looks like 1% of edit sessions have more than 50 edits.
[20:25:18] I found an edit session with 27846 edits.
[20:25:23] Almost certainly a bot.
[20:25:47] "Conversion script"
[20:31:20] Get an edit token, run until done? Probably some worse cases in the past *cough* elwiktionary.
[20:32:49] :)
[20:33:34] It looks like it makes sense to limit our code at 50 edits when it comes to feature engineering.
[20:33:46] More than 50% of edit sessions are 1-2 edits.
[20:34:05] 80% are less than 10 edits.
[20:35:12] So I think the logic will be: (1) get the last two edits; if the session doesn't end there, get 8 more; if the session doesn't end, try to get 40 more. If the session is still going that far back, declare victory and set some field to "too many edits" so the model can get signal for that.
[20:35:25] Sounds about right.
[20:35:35] In my analysis, the average edit takes about 7 minutes (including browsing around looking for something to edit).
[20:35:48] So 50 edits should take a human 350 minutes,
[20:36:02] yeah, but there will be AWB-assisted edits and such
[20:36:09] 5.8 hours of editing or so.
[20:36:15] Right. There's gonna be a lot of that :|
[20:36:17] and those can fly right by.
[20:36:28] Or rather, rarely, there will be some of that.
[20:36:43] Consider Wikidata edits too; probably most will be via some tool (but human in the loop, as they say).
[20:36:58] Games and such.
[20:37:01] Oh, good point. I was looking at enwiki. Wikidata is probably crazy pants.
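halfak's staged fetching plan above (get 2 edits, then 8 more, then 40 more, capped at 50) can be sketched as follows. This is a minimal illustration, not the actual implementation: the edit-dict shape, the timestamp field, and the one-hour session gap are all assumptions.

```python
SESSION_GAP_SECS = 60 * 60   # assumed inactivity cutoff between session edits
BATCH_SIZES = (2, 8, 40)     # 2 edits, then 8 more, then 40 more: 50 max

def collect_session(edits):
    """Walk backwards through a user's edit history in staged batches.

    `edits` is assumed to be newest-first, each edit a dict with a unix
    `timestamp`. Returns the edits in the current session plus a
    too_many_edits flag the model can use as signal when the session
    runs past the 50-edit cap.
    """
    session = []
    cursor = 0
    for batch in BATCH_SIZES:
        for edit in edits[cursor:cursor + batch]:
            gap_found = (session and
                         session[-1]["timestamp"] - edit["timestamp"]
                         > SESSION_GAP_SECS)
            if gap_found:  # session ended within this batch
                return {"edits": session, "too_many_edits": False}
            session.append(edit)
        cursor += batch
        if cursor >= len(edits):  # ran out of edit history
            return {"edits": session, "too_many_edits": False}
    # Still in the same session 50 edits back: stop and flag it.
    return {"edits": session, "too_many_edits": True}
```

The staging matches the distribution quoted above: since more than 50% of sessions are 1-2 edits and 80% are under 10, the first two small batches resolve most sessions cheaply, and only the rare long session pays for the third fetch.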
[20:37:10] 90% of edits are already bots :|
[20:37:20] (or something like that)
[20:37:27] Still leaves 10% of really different behavior.
[20:37:40] Ask muta... hm, he's not in here.
[20:37:46] mutante, about Wikidata-based games.
[20:38:19] Basically, tools for making edits via the API by prompting the human to decide this or that, or identify something.
[20:38:33] Anyways, lots to explore...
[20:40:17] Indeed. This problem is a weird collection of strange and interesting domain knowledge, human behavior, augmented human behavior, and engineering constraints.
[20:40:27] That's where I like to chill.
[20:42:24] Enjoy :-)
[21:04:01] Uh oh! New problem. The functions in DependentSets need to be converted too. Now how to do that?
[21:04:16] I shall dub thee The Listy Functiony Problem.
[21:30:24] I'm out of here. Have a good one!
[21:30:26] o/
[21:30:32] Later, halfak.