[16:02:33] \o/
[16:02:41] Dario sent me a photo :D
[16:03:58] I wonder what the ORES/Wikicite connection is, though.
[16:41:21] hello everyone, can anyone point me to a script that could be a start to help me scrape every user that's posted to a flow discussion page?
[16:43:28] harej: It was in the context of "what kind of systems I like"
[16:45:32] * leila has a short day ahead of her. focus time. ;)
[17:06:19] hello people!
[17:06:36] in db1108 (analytics-slave) there is a 'research_prod' user
[17:06:50] that I am planning to drop since it shouldn't be used
[17:07:05] I am wondering if any of you use it for some reason (rather than the usual 'research')
[17:09:31] elukey: o/ let me check, it may be mine
[17:10:52] elukey: not mine, I don't think I've used db1108 according to my configs
[17:13:34] bmansurov: it is where the log database is, you might have used it via 'analytics-slave.eqiad.wmnet'
[17:13:39] if not, no problem :)
[17:14:01] I'll drop it tomorrow, if anybody wants to stop me for some good reason lemme know via IRC :)
[17:24:29] elukey: oh that's right, let me check again
[17:26:22] elukey: Turns out I've only used the research user. And none of the repos I maintain use that user. +1 for deletion ;))
[17:26:49] ack! thanks :)
[18:01:52] elukey: I checked on my end and haven't used it. fine with me to drop.
[18:02:04] dsaez ^
[18:02:25] nuria: can I move our meeting tomorrow to early afternoon?
[18:02:36] yes, by all means
[18:02:38] leila:
[18:02:39] nuria: I have some up-in-the-air meetings in the morning at Stanford.
[18:02:42] nuria: thanks.
[18:03:03] bmansurov: I'll be right there.
[18:03:08] leila: sure
[18:17:15] nuria: it doesn't let me move the event. ;)
[18:17:22] nuria: can you move it to 14:00 PT on Friday?
[18:17:38] leila: please try now
[18:18:54] nuria: thanks. I moved it (to 13:00 actually)
[19:10:19] bmansurov: you may want to have a look at ACM Queue https://queue.acm.org/app/ and see if you want to subscribe to it.
[19:10:55] bmansurov: they publish software engineering case studies that can be interesting to you.
[19:11:31] bmansurov: The recent edition's case study title is "CodeFlow: Improving the Code Review Process at Microsoft".
[19:12:55] leila: thanks, I'll take a look.
[19:13:00] sounds good.
[19:20:17] dsaez: lol on your 11K comment in T210433.
[19:20:18] T210433: Identify and release data on similar Wikidata items - https://phabricator.wikimedia.org/T210433
[19:20:26] * leila opens her homework on that task
[19:22:11] * leila wonders why her assignment isn't in fa. ;)
[19:23:31] leila: sorry, spark doesn't come with stopwords for farsi
[19:23:47] leila: we'll have to create those before training models in farsi and other languages.
[19:23:59] bmansurov: hmmm. who should I complain to? :D
[19:24:10] leila: hehe, spark developers
[19:24:33] bmansurov: too hard. ok. I'll do en for now, but happy to help with ar and fa stopwords
[19:24:58] leila: in fact we are planning on creating our stop words and possibly upstreaming those to spark
[19:25:06] leila: your help would be invaluable
[19:25:30] bmansurov: that's great. let me know when the time comes. and please create a task for it or we will forget it. It's not a priority, I would say Normal is good for that.
[19:25:41] leila: will do
[19:34:36] bmansurov, leila, why not just remove the most frequent words?
[19:35:10] dsaez: some stopwords don't appear that frequently
[19:35:13] it's usual to remove the top 10% and the bottom 10%
[19:35:51] I thought the definition of stopword was to be very frequent
[19:36:10] Considering bigrams could. Be
[19:36:27] *could be another solution
[19:37:05] Sorry, I'm chiming in without context
[19:40:34] dsaez: I think stopwords are those that don't add meaning by themselves
[19:41:04] "whence" would be one of the less frequently (i think) appearing stopwords
[19:42:03] leila: re subscription, do we have membership?
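The stopword exchange above contrasts two approaches: dropping the most frequent words versus using a curated list. A minimal sketch of the difference in plain Python follows. The function names, the toy corpus, and the curated list are all hypothetical and are not taken from the T210433 work or from Spark. Only the "top" half of the top/bottom 10% cut is shown.

```python
from collections import Counter


def frequency_stopwords(docs, top_fraction=0.10):
    """Return the most frequent `top_fraction` of distinct words in `docs`.

    This is the frequency-based heuristic: it knows nothing about meaning,
    it only looks at counts. At least one word is always returned.
    """
    counts = Counter(word for doc in docs for word in doc.lower().split())
    n_drop = max(1, int(len(counts) * top_fraction))
    return {word for word, _ in counts.most_common(n_drop)}


def remove_stopwords(tokens, stopwords):
    """Keep only the tokens that are not in the given stopword set."""
    return [t for t in tokens if t not in stopwords]


# Toy corpus (illustrative only).
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "whence came the dog",
]

# Frequency cut: with 9 distinct words, 10% rounds down to 0, so 1 word is dropped.
by_frequency = frequency_stopwords(docs)

# Hypothetical hand-curated list, the kind that would need to exist per language.
curated = {"the", "on", "whence"}

tokens = docs[2].split()
print(remove_stopwords(tokens, by_frequency))  # "whence" survives the frequency cut
print(remove_stopwords(tokens, curated))       # "whence" is removed by the curated list
```

On this corpus the frequency cut only catches "the", while a rare function word such as "whence" stays in, which is the objection raised in the discussion.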
[19:43:02] Maybe yours and my definition are ok
[19:43:05] https://en.m.wikipedia.org/wiki/Stop_words
[19:44:13] Yeah, in our case removing the most frequent words wouldn't make sense.
[20:42:32] J-Mo: 'saw your comment on T210607. Thanks for the pointer.
[20:42:33] T210607: How to best capture user feedback for recommender systems output - https://phabricator.wikimedia.org/T210607
[20:43:40] thanks leila! sounds like an interesting problem :)
[20:43:53] J-Mo: brainstorming point. I expect the API itself to be used by folks with purposes we may not be able to imagine right now. I wonder if there is a way to not have a UI for feedback, but instead just collect the feedback directly via the same/different API.
[20:44:21] J-Mo: in that task, Marko and Baha suggest that there is a way, which would be great, I /think/, for this case.
[20:44:31] J-Mo: or do you think we will always need the UI?
[20:47:47] leila: I think it will depend on who we want to receive feedback from (Marko said this as well, I think). If we're deploying recommendations as a product feature, there will be UI for that feature, and we should accept feedback via that UI. But we'll need to have a general solution within the API that someone who wants to build a recommendation-powered feature/app/bot/whatever can hook into. And ideally that API hook should provide some structure that makes it easier for the system builder to understand what feedback UI to design.
[20:48:55] so if we want to encourage free-text feedback, we should make sure that the API request schema supports this (w/ all the considerations about privacy and security)
[20:49:11] (I think I'm agreeing with you, but lmk if I'm not being clear)
[20:50:48] if we want feedback from people who develop the systems, then we probably don't need a UI. But providing options to specify other "context" characteristics in a feedback API request would be helpful (it looks like this is what Perspective does with their SuggestScoreFeedback method)
[21:19:54] * leila reads the channel
[21:25:06] J-Mo: what do you mean when you say API structure? (in the context of helping system builders design feedback UI on their end?)
[21:25:25] J-Mo: do you mean a set of API parameters that are clearly documented? or something else?
[21:26:38] leila: yep, which API parameters are available, and how they are documented
[21:27:11] J-Mo: 'got you. makes sense.
[21:28:18] for example, if we decided we wanted to encourage anyone who built an app on top of our API to elicit usefulness feedback on a 5-point Likert scale ("very useful" -> "not useful at all"), then we provide a specific parameter in the feedback method that accepts numeric values between 1 and 5.
[21:28:38] ^leila
[21:29:05] J-Mo: yeah. makes sense.
[21:29:10] cool cool
[21:29:19] ALL GOOD then. :D
[21:29:26] \o/
[21:30:03] * leila officially switches to volunteer time to work on the web conference stuff
[21:34:14] bmansurov yt?
[21:34:23] dsaez: yo
[21:35:01] qq: do you know if the hover on Wikipedia links can be used as a service? I mean, embed those popups in your own code?
[21:35:13] (outside wikimedia.org)
[21:35:50] dsaez: as is no, but the popups use something that's already a service: https://en.wikipedia.org/api/rest_v1/#!/Page_content/get_page_summary_title
[21:36:15] got you
[21:36:16] great
[21:37:50] thanks
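For the popup question at the end of the log, here is a rough sketch of calling the page summary service from outside wikimedia.org, using only the Python standard library. The helper names and the User-Agent string are hypothetical. The `/page/summary/{title}` path is the REST route behind the linked `get_page_summary_title` documentation entry. The shape of the returned JSON is not checked here and should be confirmed against the API docs.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# REST route for page summaries on English Wikipedia.
SUMMARY_BASE = "https://en.wikipedia.org/api/rest_v1/page/summary/"


def summary_url(title: str) -> str:
    """Build the summary URL for a page title.

    Spaces become underscores (wiki title convention) and everything else,
    including "/", is percent-encoded so the title stays one path segment.
    """
    return SUMMARY_BASE + quote(title.replace(" ", "_"), safe="")


def fetch_summary(title: str, timeout: float = 10.0) -> dict:
    """Fetch the summary JSON for `title` (performs a network request).

    The User-Agent value is a placeholder; replace it with something that
    identifies your own tool and a way to contact you.
    """
    request = Request(
        summary_url(title),
        headers={"User-Agent": "popup-sketch/0.1 (hypothetical example)"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_summary("Stop words")` would request the same summary data the hover popups are built on; rendering it as a popup is left to the embedding page.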