[13:33:12] <halfak>	 o/
[13:33:18] <apergos>	 "morning"
[13:50:01] <halfak>	 :)  Good timezone appropriate greeting 
[13:51:44] <apergos>	 how are your moste potente potions today? 
[13:52:04] <apergos>	 although I think you called them listy functiony problem...
[14:03:04] <halfak>	 Ha!  I think I have an angle into it that will make it worthwhile to engineer and maintain.  Luckily python has a lot of functionality for inspection. 
[14:03:27] <halfak>	 So my plan is to dig into that functionality and see if I can essentially apply some decorators to "rewrite" the old functions. 
[14:04:29] <halfak>	 The problem with session extraction (e.g. I don't want to use the system resources to process 20k revisions and neither does anyone else) still seems the same -- limiting to 50 and optimizing for the common cases makes sense. 
[14:05:18] <halfak>	 I've been thinking about how some sort of feature store (or metrics cache) could enable large scale processing of sessions and more flexibility but I think that would primarily have academic uses. 
[14:05:44] <halfak>	 Sleeping on things is good. 
[14:06:05] <halfak>	 Yesterday's "good ideas" look different when they are less fresh. 
[14:08:55] <apergos>	 heh indeed
[14:09:11] <apergos>	 and yesterday's hard things can look easier in the light of day too
[14:10:14] <halfak>	 Right :) 
[14:10:59] <apergos>	 and remember you can always tweak the number if 50 doesn't seem to catch enough of the cases
[14:11:23] <halfak>	 Right.  The question I have now is if different models have different needs. 
[14:11:36] <halfak>	 E.g. right now, I'm targeting making predictions about newcomers. 
[14:11:45] <halfak>	 But if we're trying to detect unflagged bots, the needs may change. 
[14:12:11] <halfak>	 These newcomer predictions are going to be really fun. 
[14:12:11] <apergos>	 I mean the question probably should be "what are the different needs" rather than "are there different needs"
[14:12:17] <halfak>	 Roght. 
[14:12:19] <halfak>	 *i
[14:12:59] <apergos>	 what sort of predictions are you looking to make for the newcomers case?
[14:13:32] <halfak>	 The most basic, first step is "Is this newcomer a vandal?  Or are they editing in good-faith?"  Which is something the WP:Teahouse needs.  
[14:13:55] <halfak>	 The second is scary but I think it'll be awesome with good governance -- "How productive is this newcomer likely to be?"
[14:14:13] <halfak>	 I'll train the second based on newcomers who do not receive strong negative attention right away. 
[14:14:34] <halfak>	 So hopefully I can make the prediction to be "How productive would this person be if we are generally nice to them" 
[14:16:44] <apergos>	 that's... 
[14:16:53] <apergos>	 a whole can of worms isn't it
[14:17:33] <apergos>	 if there's already bias in which new users get more engagement and more help (as opposed to less support) and therefore stick around
[14:17:47] <apergos>	 that will be replicated in the predictions
[14:19:28] <halfak>	 Indeed. 
[14:19:30] <halfak>	 It' hairy. 
[14:19:53] <apergos>	 the question that's being answered really has to do with the combination of the community + the user (even assuming that we take only users who don't get strong negative attention immediately)
[14:20:13] <apergos>	 sure will be interesting!
[14:20:35] <halfak>	 Right!  I think we'll learn a lot about our community dynamics through inspecting the predictions this model makes. 
[14:21:06] <halfak>	 One of the really cool things about deploying these things and encouraging inspection is that it give a foothold for social change. 
[14:21:26] <halfak>	 We've seen that a lot around the damage detection models.  I think we'll see it more with these models that hit on key social issues. 
[14:21:36] <halfak>	 That's why Jade is so important. 
[14:21:38] <apergos>	 I hope you do! 
[14:21:45] <halfak>	 It's a basic mechanism for inspecting these models.  
[14:21:58] <halfak>	 First, it'll help us make sure they aren
[14:22:01] <halfak>	 't too stupid
[14:22:13] <apergos>	 selling social change within a community is often tough, but if you have a self-proclaimed group of 'show me the data' addicts, perhaps they can be persuaded
[14:22:25] <halfak>	 But ultimately, I hope Jade becomes a bit of a battleground/discussion forum for how we work. 
[14:22:26] <apergos>	 depending on what the changes are and what the predictions are too
[14:22:40] <halfak>	 Right. 
[14:23:22] <apergos>	 I'm curious about what some of the features are that will be extracted, that are more about community behavior 
[14:23:50] <halfak>	 J-Mo gave me an interesting prediction for the kind of fights we'll likely see around "Is this newcomer working in goodfaith?"  Imagine you're a grumpy patroller getting into an argument with a newcomer socializer about the quality of a newcomer.  *mwahahaha*
[14:24:50] <apergos>	 oh sure
[14:25:09] <halfak>	 I think for right now, I'll minimize or remove entirely the community response features so that it's more a prediction of potential.  I might only train the model in a lack of negative/positive examples under the hypothesis that goodfaith newcomers with negative experiences are a lot like goodfaith newcomers with neutral/positive experiences. 
[14:25:48] <halfak>	 My assumption is a lot of the negativity is random chance.  E.g. you stumbled on an article subject that someone is feeling WP:Ownership over. 
[14:25:49] <apergos>	 let me make an analogy: cops have a certain view of the average citizen colored from their interactions with a higher than average number of folks that involve criminal behavior; this view then fees into their future interactions, etc
[14:26:35] <apergos>	 I don't know if that's a good assumption (random chance) or what some other possible good assumptions are
[14:26:42] <apergos>	 that would take some thought
[14:26:58] <apergos>	 s/fees into/feeds into/
[14:27:16] <halfak>	 Right.  Agreed.  I think seeing how *wrong* the model is and *where* it is wrong will be a critical component of the Product(TM) work around it. 
[14:27:22] <apergos>	 yup
[14:27:26] <halfak>	 No good model without good Product(TM) work. 
[14:27:38] <apergos>	 and another piece of the puzzle is who is attracted to patrolling work
[14:27:48] <apergos>	 (who wants to be a cop) for what reasons
[14:28:03] <halfak>	 Right.  Oh!  I have a fun use-case for this for you. 
[14:28:06] <apergos>	 who is attracted to teahouse work
[14:28:07] <apergos>	 etc
[14:28:16] * apergos is all ears
[14:29:00] <halfak>	 Imagine we have a good/useful/effective model for predicting future productivity.  We could use that model to find out who is undermining the most future productivity and put... pressure on them somehow.  Maybe it's just a UI cue that suggests they should be nice.  Maybe it's a leaderboard of shame ;) 
[14:29:12] <apergos>	 wait stop
[14:29:17] <apergos>	 the first sentence is already suspect
[14:29:21] <apergos>	 what do you mean "good"
[14:29:25] <halfak>	 I don't know what the right intervention would be, but damn it would be crazy to be able to do something like that right.
[14:29:28] <halfak>	 Good question!
[14:29:33] <apergos>	 ahahahaha
[14:30:08] <halfak>	 I'd say the model is good if when someone *would have* received a negative response but they *don't* because of intervention, the model's predictions are accurate. 
[14:30:20] <halfak>	 But I think there are other versions of "good" to consider. 
[14:30:39] <halfak>	 I feel like the word "good" should always be in quotes. 
[14:30:53] <apergos>	 so what you might want is a model that predicts what sort of intervention by a community member at phase X will lead to greatest productiviy of the new user
[14:31:09] <apergos>	 (GoodLuckWithThat)
[14:31:54] <apergos>	 "good" sure is subjective, imo
[14:32:08] <halfak>	 Right.  "good" is a placeholder for a set of values. 
[14:32:43] <apergos>	 "whatever helps us build an encyclopedia faster"  (one set of values. is it my set? mmmm)
[14:37:16] <halfak>	 You know.  I'm not sure I've really gotten articulate on a general set of values I bring to this work. 
[14:37:40] <halfak>	 I've got some very specific values around building up tech infra in a way that empowers other to control and govern its use. 
[14:37:44] <halfak>	 But that's not general enough. 
[14:46:13] <apergos>	 not for this, no
[14:46:36] <apergos>	 there's layers of mission to peel away
[14:47:07] <apergos>	 build an encyclopedia? build a community contributing to open knowledge? make all knowledge open and available to all?
[14:47:09] <apergos>	 etc
[14:52:31] <halfak>	 I guess my values are more operational.  I'm not sure I know where Wikipedia should go.  But I'd like the distributed cognition system that makes decisions about where Wikipedia goes to be healthy and effective.  
[14:52:56] <halfak>	 Personally, I think whatever decisions we make today, we will need to adjust tomorrow/next month/in 10 years. 
[14:54:24] <apergos>	 💯💯💯
[14:54:51] <apergos>	 we'll make bad guesses, tactics will change, strategies will change, our understanding will change, etc
[14:56:39] <halfak>	 Right so the ability to make good enough decisions and make good pivots on those decisions is what I value. :) 
[14:56:53] <halfak>	 Notice I use "good"
[14:57:07] <halfak>	 I don't think I should be the one to define the "good" 
[14:57:54] <halfak>	 There's a collective process for identifying the "good" and some partial metrics we can use.  E.g. did any one change produce the desired effect. 
[15:03:40] <apergos>	 right
[15:04:00] <apergos>	 I guess another question will be, how can these models be used for not the greatest purpose by a comminuty
[15:04:03] <apergos>	 *community
[15:04:26] <apergos>	 to exclude certain categories of users, for example
[15:05:30] <halfak>	 I think we need to split real nefarious actors from simple self-interest. 
[15:05:43] <halfak>	 Simple self-interest leads people to design systems that work for them but not for others. 
[15:06:12] <halfak>	 Nefarious actors I like to think about in terms of threat models. 
[15:06:17] <apergos>	 yep, there will be people with an agenda, people with a really bad agenda, people who have good intentions that lead you know where, etc
[15:06:26] <halfak>	 Self-interest, I like to think about in terms of power imbalances. 
[15:06:56] <apergos>	 and given that "the community" (tm) acts as gatekeeper for new folks
[15:07:02] <apergos>	 that's already a power imbalance right there
[15:08:16] <apergos>	 there's a thesis waiting to be written in the aftermath of framban and the community consultation around office actions, too
[16:19:12] <halfak>	 apergos, it's a hard sell to get people to pick up these staff/community politics situation but I think they would make for a fascinating study.  I was trying to get some researchers to pick up the superprotect situation under the same observation. 
[16:19:23] <halfak>	 There's something really similar happening at Stack Overflow right now. 
[16:28:31] <apergos>	 I heard vague mutterings about SO but I've not got the spare cycles to follow it
[17:05:31] <halfak>	 Yeah.  Same here except I got 10 minutes once to take a look. :D 
[17:44:46] <apergos>	 I was already bankrupt of spare ten minutes by overspending on framban and related
[19:56:01] <halfak>	 AFK for a bit. 
[22:50:34] <halfak>	 I'm off for the evening.  Have a good one, folks!
[22:55:58] <accraze>	 later halfak
[23:07:26] <accraze>	 goin to the post office real quick, brb