[13:33:12] o/
[13:33:18] "morning"
[13:50:01] :) Good timezone-appropriate greeting
[13:51:44] how are your moste potente potions today?
[13:52:04] although I think you called them the listy functiony problem...
[14:03:04] Ha! I think I have an angle into it that will make it worthwhile to engineer and maintain. Luckily Python has a lot of functionality for inspection.
[14:03:27] So my plan is to dig into that functionality and see if I can essentially apply some decorators to "rewrite" the old functions.
[14:04:29] The problem with session extraction (e.g. I don't want to use the system resources to process 20k revisions, and neither does anyone else) still seems the same -- limiting to 50 and optimizing for the common cases makes sense.
[14:05:18] I've been thinking about how some sort of feature store (or metrics cache) could enable large-scale processing of sessions and more flexibility, but I think that would primarily have academic uses.
[14:05:44] Sleeping on things is good.
[14:06:05] Yesterday's "good ideas" look different when they are less fresh.
[14:08:55] heh indeed
[14:09:11] and yesterday's hard things can look easier in the light of day too
[14:10:14] Right :)
[14:10:59] and remember you can always tweak the number if 50 doesn't seem to catch enough of the cases
[14:11:23] Right. The question I have now is whether different models have different needs.
[14:11:36] E.g. right now, I'm targeting making predictions about newcomers.
[14:11:45] But if we're trying to detect unflagged bots, the needs may change.
[14:12:11] These newcomer predictions are going to be really fun.
[14:12:11] I mean the question probably should be "what are the different needs" rather than "are there different needs"
[14:12:17] Right.
[14:12:59] what sort of predictions are you looking to make for the newcomers case?
[14:13:32] The most basic first step is "Is this newcomer a vandal? Or are they editing in good faith?" Which is something the WP:Teahouse needs.
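[Editor's note: a minimal sketch of the plan described above — using Python's `inspect` and `functools` modules to wrap "old" functions via a decorator, and capping session extraction at 50 revisions. All names here (`adapt_legacy`, `process_session`, `MAX_REVISIONS`) are illustrative assumptions, not from any real codebase.]

```python
import functools
import inspect

# Illustrative cap from the conversation: optimize for the common case;
# the number can be tweaked if 50 doesn't catch enough of the cases.
MAX_REVISIONS = 50

def adapt_legacy(func):
    """Wrap an old function, preserving its signature metadata via inspect."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Bind against the original signature so bad calls fail with
        # the old function's argument names in the error message.
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        return func(*bound.args, **bound.kwargs)

    return wrapper

@adapt_legacy
def process_session(revisions):
    """Hypothetical extractor: process at most MAX_REVISIONS revisions."""
    return [r.upper() for r in revisions[:MAX_REVISIONS]]
```

`functools.wraps` keeps the wrapped function's name and docstring, and `inspect.signature` lets the wrapper validate calls against the original signature, which is one way decorators can "rewrite" old functions without changing their call sites.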
[14:13:55] The second is scary, but I think it'll be awesome with good governance -- "How productive is this newcomer likely to be?"
[14:14:13] I'll train the second based on newcomers who do not receive strong negative attention right away.
[14:14:34] So hopefully I can make the prediction be "How productive would this person be if we are generally nice to them?"
[14:16:44] that's...
[14:16:53] a whole can of worms isn't it
[14:17:33] if there's already bias in which new users get more engagement and more help (as opposed to less support) and therefore stick around
[14:17:47] that will be replicated in the predictions
[14:19:28] Indeed.
[14:19:30] It's hairy.
[14:19:53] the question that's being answered really has to do with the combination of the community + the user (even assuming that we take only users who don't get strong negative attention immediately)
[14:20:13] sure will be interesting!
[14:20:35] Right! I think we'll learn a lot about our community dynamics through inspecting the predictions this model makes.
[14:21:06] One of the really cool things about deploying these things and encouraging inspection is that it gives a foothold for social change.
[14:21:26] We've seen that a lot around the damage detection models. I think we'll see it more with these models that hit on key social issues.
[14:21:36] That's why Jade is so important.
[14:21:38] I hope you do!
[14:21:45] It's a basic mechanism for inspecting these models.
[14:21:58] First, it'll help us make sure they aren't too stupid.
[14:22:13] selling social change within a community is often tough, but if you have a self-proclaimed group of 'show me the data' addicts, perhaps they can be persuaded
[14:22:25] But ultimately, I hope Jade becomes a bit of a battleground/discussion forum for how we work.
[14:22:26] depending on what the changes are and what the predictions are too
[14:22:40] Right.
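[Editor's note: a sketch of the training-set construction described above — keeping only newcomers who received no strong negative attention right away, then pairing their first-session features with a later productivity measure. The field names and thresholds are hypothetical, chosen only to illustrate the filter.]

```python
def build_training_set(newcomers):
    """Filter to newcomers without strong early negative attention.

    Each newcomer is a dict with illustrative fields:
      early_warnings, early_reverts   -- negative attention in first day
      first_session_features          -- inputs for the model
      edits_in_6_months               -- the productivity target
    """
    rows = []
    for user in newcomers:
        # Exclude anyone who got strong negative attention right away,
        # so the label approximates "productivity if we are nice to them".
        if user["early_warnings"] > 0 or user["early_reverts"] > 2:
            continue
        rows.append((user["first_session_features"],
                     user["edits_in_6_months"]))
    return rows
```

As the conversation notes, any bias in who receives engagement and support will be replicated by a model trained this way; the filter reduces, but cannot remove, that confound.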
[14:23:22] I'm curious about what some of the features are that will be extracted, that are more about community behavior
[14:23:50] J-Mo gave me an interesting prediction for the kind of fights we'll likely see around "Is this newcomer working in goodfaith?" Imagine you're a grumpy patroller getting into an argument with a newcomer socializer about the quality of a newcomer. *mwahahaha*
[14:24:50] oh sure
[14:25:09] I think for right now, I'll minimize or entirely remove the community response features so that it's more a prediction of potential. I might train the model only on examples lacking negative/positive responses, under the hypothesis that goodfaith newcomers with negative experiences are a lot like goodfaith newcomers with neutral/positive experiences.
[14:25:48] My assumption is that a lot of the negativity is random chance. E.g. you stumbled onto an article subject that someone is feeling WP:Ownership over.
[14:25:49] let me make an analogy: cops have a certain view of the average citizen, colored by their interactions with a higher-than-average number of folks that involve criminal behavior; this view then feeds into their future interactions, etc
[14:26:35] I don't know if that's a good assumption (random chance) or what some other possible good assumptions are
[14:26:42] that would take some thought
[14:27:16] Right. Agreed. I think seeing how *wrong* the model is and *where* it is wrong will be a critical component of the Product(TM) work around it.
[14:27:22] yup
[14:27:26] No good model without good Product(TM) work.
[14:27:38] and another piece of the puzzle is who is attracted to patrolling work
[14:27:48] (who wants to be a cop) for what reasons
[14:28:03] Right. Oh! I have a fun use-case for this for you.
[14:28:06] who is attracted to teahouse work
[14:28:07] etc
[14:28:16] * apergos is all ears
[14:29:00] Imagine we have a good/useful/effective model for predicting future productivity.
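[Editor's note: a sketch of the feature-filtering idea above — dropping features that measure the community's response to a newcomer so the model predicts the newcomer's own potential. The feature names and prefix convention are hypothetical, purely for illustration.]

```python
# Hypothetical naming convention: community-response features share
# a recognizable prefix so they can be stripped in one pass.
COMMUNITY_RESPONSE_PREFIXES = ("thanks_", "warnings_", "reverts_by_others_")

def strip_community_features(features):
    """Drop features measuring community response rather than the
    newcomer's own behavior, leaving a 'prediction of potential'."""
    return {name: value for name, value in features.items()
            if not name.startswith(COMMUNITY_RESPONSE_PREFIXES)}
```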
We could use that model to find out who is undermining the most future productivity and put... pressure on them somehow. Maybe it's just a UI cue that suggests they should be nice. Maybe it's a leaderboard of shame ;)
[14:29:12] wait stop
[14:29:17] the first sentence is already suspect
[14:29:21] what do you mean "good"
[14:29:25] I don't know what the right intervention would be, but damn, it would be crazy to be able to do something like that, right?
[14:29:28] Good question!
[14:29:33] ahahahaha
[14:30:08] I'd say the model is good if, when someone *would have* received a negative response but they *don't* because of intervention, the model's predictions are accurate.
[14:30:20] But I think there are other versions of "good" to consider.
[14:30:39] I feel like the word "good" should always be in quotes.
[14:30:53] so what you might want is a model that predicts what sort of intervention by a community member at phase X will lead to the greatest productivity of the new user
[14:31:09] (GoodLuckWithThat)
[14:31:54] "good" sure is subjective, imo
[14:32:08] Right. "good" is a placeholder for a set of values.
[14:32:43] "whatever helps us build an encyclopedia faster" (one set of values. is it my set? mmmm)
[14:37:16] You know, I'm not sure I've really gotten articulate about a general set of values I bring to this work.
[14:37:40] I've got some very specific values around building up tech infra in a way that empowers others to control and govern its use.
[14:37:44] But that's not general enough.
[14:46:13] not for this, no
[14:46:36] there are layers of mission to peel away
[14:47:07] build an encyclopedia? build a community contributing to open knowledge? make all knowledge open and available to all?
[14:47:09] etc
[14:52:31] I guess my values are more operational. I'm not sure I know where Wikipedia should go. But I'd like the distributed cognition system that makes decisions about where Wikipedia goes to be healthy and effective.
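[Editor's note: the "good model" criterion at 14:30:08 amounts to checking prediction accuracy on the subgroup that received an intervention instead of a negative response. A minimal sketch of that check, with entirely hypothetical record fields:]

```python
def intervened_subgroup_error(examples):
    """Mean absolute error of productivity predictions, restricted to
    users who got an intervention (and so avoided a negative response).

    Each example is a dict with illustrative fields:
      intervened  -- True if an intervention replaced a negative response
      predicted   -- the model's productivity prediction
      observed    -- the productivity actually observed afterward
    """
    subgroup = [e for e in examples if e["intervened"]]
    if not subgroup:
        return None  # no intervened users: the criterion is untestable
    total = sum(abs(e["predicted"] - e["observed"]) for e in subgroup)
    return total / len(subgroup)
```

A low error here supports (but does not prove) the criterion; as the conversation notes, other versions of "good" would need other checks.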
[14:52:56] Personally, I think whatever decisions we make today, we will need to adjust tomorrow/next month/in 10 years.
[14:54:24] 💯💯💯
[14:54:51] we'll make bad guesses, tactics will change, strategies will change, our understanding will change, etc
[14:56:39] Right, so the ability to make good-enough decisions and make good pivots on those decisions is what I value. :)
[14:56:53] Notice I use "good"
[14:57:07] I don't think I should be the one to define the "good"
[14:57:54] There's a collective process for identifying the "good" and some partial metrics we can use. E.g. did any one change produce the desired effect?
[15:03:40] right
[15:04:00] I guess another question will be, how can these models be used for not the greatest purposes by a community
[15:04:26] to exclude certain categories of users, for example
[15:05:30] I think we need to split real nefarious actors from simple self-interest.
[15:05:43] Simple self-interest leads people to design systems that work for them but not for others.
[15:06:12] Nefarious actors I like to think about in terms of threat models.
[15:06:17] yep, there will be people with an agenda, people with a really bad agenda, people who have good intentions that lead you-know-where, etc
[15:06:26] Self-interest, I like to think about in terms of power imbalances.
[15:06:56] and given that "the community" (tm) acts as gatekeeper for new folks
[15:07:02] that's already a power imbalance right there
[15:08:16] there's a thesis waiting to be written in the aftermath of framban and the community consultation around office actions, too
[16:19:12] apergos, it's a hard sell to get people to pick up these staff/community politics situations, but I think they would make for a fascinating study. I was trying to get some researchers to pick up the superprotect situation under the same observation.
[16:19:23] There's something really similar happening at Stack Overflow right now.
[16:28:31] I heard vague mutterings about SO but I haven't got the spare cycles to follow it
[17:05:31] Yeah. Same here, except I got 10 minutes once to take a look. :D
[17:44:46] I was already bankrupt of spare ten minutes by overspending on framban and related
[19:56:01] AFK for a bit.
[22:50:34] I'm off for the evening. Have a good one, folks!
[22:55:58] later halfak
[23:07:26] goin to the post office real quick, brb