[16:02:33] <halfak>	 \o/
[16:02:41] <halfak>	 Dario sent me a photo :D 
[16:03:58] <harej>	 I wonder what the ORES/Wikicite connection is, though.
[16:41:21] <notconfusing>	 hello everyone, can anyone point me to a script that could be a start to help me scrape every user that's posted to a flow discussion page?
[16:43:28] <leila>	 harej: It was in the context of "what kind of systems I like"
[16:45:32] * leila has a short day ahead of her. focus time. ;)
[17:06:19] <elukey>	 hello people!
[17:06:36] <elukey>	 in db1108 (analytics-slave) there is a 'research_prod' user
[17:06:50] <elukey>	 that I am planning to drop since it shouldn't be used
[17:07:05] <elukey>	 I am wondering if any of you use it for some reason (rather than the usual 'research')
[17:09:31] <bmansurov>	 elukey: o/ let me check, it may be mine 
[17:10:52] <bmansurov>	 elukey: not mine, I don't think I've used db1108 according to my configs
[17:13:34] <elukey>	 bmansurov: it is where the log database is, you might have used it via 'analytics-slave.eqiad.wmnet'
[17:13:39] <elukey>	 if not, no problem :)
[17:14:01] <elukey>	 I'll drop it tomorrow, if anybody wants to stop me for some good reason lemme know via IRC :)
[17:24:29] <bmansurov>	 elukey: oh that's right, let me check again
[17:26:22] <bmansurov>	 elukey: Turns out I've only used the research user. And none of the repos I maintain use that user. +1 for deletion ;))
[17:26:49] <elukey>	 ack! thanks :)
[18:01:52] <leila>	 elukey: I checked on my end and haven't used it. fine with me to drop.
[18:02:04] <leila>	 dsaez ^
[18:02:25] <leila>	 nuria: can I move our meeting tomorrow to early afternoon?
[18:02:36] <nuria>	 yes, by all means
[18:02:38] <nuria>	 leila: 
[18:02:39] <leila>	 nuria: I have some up-in-the-air meetings in the morning at Stanford. 
[18:02:42] <leila>	 nuria: thanks.
[18:03:03] <leila>	 bmansurov: I'll be right there. 
[18:03:08] <bmansurov>	 leila: sure
[18:17:15] <leila>	 nuria: it doesn't let me move the event. ;)
[18:17:22] <leila>	 nuria: can you move it to 14:00 PT on Friday?
[18:17:38] <nuria>	 leila: please try now
[18:18:54] <leila>	 nuria: thanks. I moved it (to 13:00 actually)
[19:10:19] <leila>	 bmansurov: you may want to have a look at ACM Queue https://queue.acm.org/app/ and see if you want to subscribe to it.
[19:10:55] <leila>	 bmansurov: they publish software engineering case studies that can be interesting to you.
[19:11:31] <leila>	 bmansurov: The recent edition's case study title is "CodeFlow: Improving the Code Review Process at Microsoft".
[19:12:55] <bmansurov>	 leila: thanks, I'll take a look.
[19:13:00] <leila>	 sounds good.
[19:20:17] <leila>	 dsaez: lol on your 11K comment in T210433 . 
[19:20:18] <stashbot>	 T210433: Identify and release data on similar Wikidata items - https://phabricator.wikimedia.org/T210433
[19:20:26] * leila opens her homework on that task
[19:22:11] * leila wonders why her assignment isn't in fa. ;)
[19:23:31] <bmansurov>	 leila: sorry, spark doesn't come with stopwords for farsi
[19:23:47] <bmansurov>	 leila: we'll have to create those before training models in farsi and other languages.
[19:23:59] <leila>	 bmansurov: hmmm. who should I complain to? :D
[19:24:10] <bmansurov>	 leila: hehe, spark developers
[19:24:33] <leila>	 bmansurov: too hard. ok. I'll do en for now, but happy to help with ar and fa stopwords
[19:24:58] <bmansurov>	 leila: in fact we are planning on creating our stop words and possibly upstreaming those to spark
[19:25:06] <bmansurov>	 leila: your help would be invaluable
[19:25:30] <leila>	 bmansurov: that's great. let me know when the time comes. and please create a task for it or we will forget it. It's not a priority, I would say Normal is good for that.
[19:25:41] <bmansurov>	 leila: will do
[19:34:36] <dsaez>	 bmansurov, leila, why not just remove the most frequent words? 
[19:35:10] <bmansurov>	 dsaez: some stopwords don't appear that frequently
[19:35:13] <dsaez>	 it's usual to remove the top 10% and the bottom 10%
[19:35:51] <dsaez>	 I thought the definition of stopword was to be very frequent 
[19:36:10] <dsaez>	 Considering bigrams could be another solution 
[19:37:05] <dsaez>	 Sorry, I'm chiming in without context 
[19:40:34] <bmansurov>	 dsaez: I think stopwords are those that don't add meaning by themselves
[19:41:04] <bmansurov>	 whence would be one of the less frequently (i think) appearing stopwords
[19:42:03] <bmansurov>	 leila: re subscription, do we have membership?
[19:43:02] <dsaez>	 Maybe both your definition and mine are ok 
[19:43:05] <dsaez>	 https://en.m.wikipedia.org/wiki/Stop_words
[19:44:13] <bmansurov>	 Yeah, in our case removing the most frequent words wouldn't make sense.
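The difference between the two definitions above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the toy corpus, the curated list, and the helper names are hypothetical and are not the project's Spark pipeline. It shows that cutting the most frequent 10% of the vocabulary leaves a rare stopword such as "whence" in place, while a curated list removes it.

```python
from collections import Counter

# Hypothetical toy corpus; not data from the discussion.
DOCS = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "whence came the cat and the dog",
]

# Hypothetical hand-curated stopword list (illustrative only).
CURATED_STOPWORDS = {"the", "on", "and", "whence"}


def top_fraction_words(docs, fraction):
    """Return the most frequent `fraction` of the vocabulary (at least one word)."""
    counts = Counter(word for doc in docs for word in doc.split())
    k = max(1, int(len(counts) * fraction))
    return {word for word, _ in counts.most_common(k)}


def remove_words(doc, words):
    """Drop every token of `doc` that appears in `words`."""
    return [tok for tok in doc.split() if tok not in words]


# Frequency-based cut: only the top 10% of the vocabulary is removed.
frequent = top_fraction_words(DOCS, 0.10)
by_frequency = remove_words(DOCS[2], frequent)

# List-based cut: every curated stopword is removed, however rare it is.
by_list = remove_words(DOCS[2], CURATED_STOPWORDS)
```

In this toy setting the frequency cut removes only "the", so "whence" and "and" survive in by_frequency, whereas by_list keeps just the content words.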
[20:42:32] <leila>	 J-Mo: 'saw your comment on T210607. Thanks for the pointer. 
[20:42:33] <stashbot>	 T210607: How to best capture user feedback for recommender systems output - https://phabricator.wikimedia.org/T210607
[20:43:40] <J-Mo>	 thanks leila! sounds like an interesting problem :)
[20:43:53] <leila>	 J-Mo: brainstorming point. I expect the API itself to be used by folks with purposes we may not be able to imagine right now. I wonder, if there is a way to not have a UI for feedback, but instead, just collect the feedback directly via the same/different API.
[20:44:21] <leila>	 J-Mo: in that task, Marko and Baha suggest that there is a way, which would be great, I /think/, for this case.
[20:44:31] <leila>	 J-Mo: or do you think we will always need the UI?
[20:47:47] <J-Mo>	 leila: I think it will depend on who we want to receive feedback from (Marko said this as well, I think). If we're deploying recommendations as a product feature, there will be UI for that feature, and we should accept feedback via that UI. But we'll need to have a general solution within the API that someone who wants to build a recommendation-powered feature/app/bot whatever can hook into. And ideally that API hook should provide some structure that makes it easier for the system builder to understand what feedback UI to design. 
[20:48:55] <J-Mo>	 so if we want to encourage free-text feedback, we should make sure that the API request schema supports this (w/ all the considerations about privacy and security)
[20:49:11] <J-Mo>	 (I think I'm agreeing with you, but lmk if I'm not being clear)
[20:50:48] <J-Mo>	 if we want feedback from people who develop the systems, then we probably don't need a UI. But providing options to specify other "context" characteristics in a feedback API request would be helpful (it looks like this is what Perspective does with their SuggestScoreFeedback method)
[21:19:54] * leila reads the channel
[21:25:06] <leila>	 J-Mo: what do you mean when you say API structure? (in the context of helping system builders design feedback UI on their end?)
[21:25:25] <leila>	 J-Mo: do you mean a set of API parameters that are clearly documented? or something else?
[21:26:38] <J-Mo>	 leila: yep, which API parameters are available, and how they are documented
[21:27:11] <leila>	 J-Mo: 'got you. makes sense.
[21:28:18] <J-Mo>	 for example, if we decided we wanted to encourage anyone who built an app on top of our API to elicit usefulness feedback in a 5-point Likert scale ("very useful" —> "not useful at all"), then we provide a specific parameter in the feedback method that accepts numeric values between 1 and 5. 
[21:28:38] <J-Mo>	 ^leila
[21:29:05] <leila>	 J-Mo: yeah. makes sense. 
[21:29:10] <J-Mo>	 cool cool
[21:29:19] <leila>	 ALL GOOD then. :D
[21:29:26] <J-Mo>	 \o/
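The Likert example above could look roughly like the following Python sketch of a feedback payload. Everything here is an assumption for illustration: the class name, the field names, and the optional free-text and context fields are hypothetical and do not describe an existing Wikimedia API. The discussion also does not say which end of the scale maps to 1, so that is left open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeedbackRequest:
    """Hypothetical payload for a recommendation-feedback API method."""

    recommendation_id: str          # hypothetical identifier of the rated item
    usefulness: int                 # 5-point Likert value, 1..5 inclusive
    comment: Optional[str] = None   # optional free text (privacy review needed)
    context: Optional[str] = None   # optional caller-supplied context label

    def __post_init__(self):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.usefulness, bool) or not isinstance(self.usefulness, int):
            raise TypeError("usefulness must be an integer")
        if not 1 <= self.usefulness <= 5:
            raise ValueError("usefulness must be between 1 and 5")


def is_valid_usefulness(value):
    """Return True if `value` would be accepted as a usefulness score."""
    try:
        FeedbackRequest(recommendation_id="example", usefulness=value)
    except (TypeError, ValueError):
        return False
    return True
```

Validating the range in the schema itself is one way to signal to app builders which feedback UI to design, without the API shipping any UI of its own.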
[21:30:03] * leila officially switches to volunteer time to work on the web conference stuff
[21:34:14] <dsaez>	 bmansurov yt?
[21:34:23] <bmansurov>	 dsaez: yo
[21:35:01] <dsaez>	 qq: do you know if the hover on Wikipedia links can be used as a service? I mean, embed those popups in your own code?
[21:35:13] <dsaez>	 (outside wikimedia.org)
[21:35:50] <bmansurov>	 dsaez: as is no, but the popups use something that's already a service: https://en.wikipedia.org/api/rest_v1/#!/Page_content/get_page_summary_title
[21:36:15] <dsaez>	 got you
[21:36:16] <dsaez>	 great
[21:37:50] <dsaez>	 thanks
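For reference, the page summary service linked above is reachable at /api/rest_v1/page/summary/{title}. A minimal Python sketch of calling it from outside wikimedia.org follows. The helper names and the User-Agent string are hypothetical, and the fetch function is not exercised here because it needs network access.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def summary_url(title, lang="en"):
    """Build the REST summary URL for a page title on a given language wiki."""
    # Titles use underscores instead of spaces; everything else is percent-encoded.
    normalized = title.strip().replace(" ", "_")
    return (
        f"https://{lang}.wikipedia.org/api/rest_v1/page/summary/"
        f"{quote(normalized, safe='')}"
    )


def fetch_summary(title, lang="en"):
    """Fetch and decode the summary JSON (requires network access)."""
    request = Request(
        summary_url(title, lang),
        # Hypothetical User-Agent; replace with one that identifies your tool.
        headers={"User-Agent": "example-popup-sketch/0.1"},
    )
    with urlopen(request, timeout=10) as response:
        return json.load(response)
```

A popup could then be rendered from the returned JSON, for example fetch_summary("Stop words") for the article linked earlier in the channel.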