[19:14:31] o/
[19:14:44] This is the first research showcase I have been present for in like 3 months ^_^
[19:15:58] halfak, nice :)
[19:16:15] Who is on IRC duty today?
[19:16:22] me
[19:16:30] cool :)
[19:16:53] I'm trying to change the channel title, but I'm not op
[19:17:38] dsaez_, ^
[19:17:55] You'll need to use your "dseaz" account though.
[19:17:56] ups
[19:18:05] right
[19:18:38] now?
[19:18:39] sooo, bmansurov. I thought a bit more about the languages we're choosing and the discussion around https://en.wikipedia.org/wiki/Template%3ABabel
[19:18:40] yeah
[19:18:56] quassel bouncer
[19:19:05] yes
[19:19:16] bmansurov: How much work will it be if we want to fetch babel template usage and its corresponding information from user pages?
[19:19:34] We can limit the search to the top 10 languages we're considering, bmansurov.
[19:19:45] leila: can't estimate reliably off the top of my head, but I can look into it.
[19:19:50] bmansurov: having enough potential people who can label is definitely key per dsaez's comment.
[19:20:01] For lazy folks. This is the stream URL: https://www.youtube.com/watch?v=L-1uzYYneUo
[19:20:21] We'll kick off in 10 minutes if my calendar is right.
[19:20:31] bmansurov: yeah, please do that. I think this can save us some time down the line. (and this is generally a good statistic to keep track of)
[19:20:49] halfak: thanks for feeding the lazy. :D The title has too much text.
[19:20:56] :D
[19:21:10] leila: OK. And what's the associated task? Can you link?
[19:21:24] yeah, let me create a subtask and add you to it.
[19:21:29] ta
[19:21:30] bmansurov: ^
[19:21:39] dsaez, I usually delete everything in the topic before the "| Channel is publicly logged ..." when adding the stream URL. No strong preference though.
[19:21:59] It's important to keep the publicly logged notice in there either way.
[19:22:26] halfak, good, I'll do, it is my first time as host, trying to read former logs to know exactly what to do
[19:26:01] Nice :)
[19:28:46] Hello everyone, the Research Showcase will be starting in around 2 minutes
[19:29:20] hey all
[19:29:38] I'm the IRC host, so please, let me know any question or comment that you have for the speakers
[19:30:45] @dsaez: Thanks!
[19:30:55] the youtube stream hasn't started yet, fyi
[19:31:07] yeah. I guess we're waiting for the room.
[19:31:14] thanks bmansurov
[19:31:16] now
[19:31:22] yeah. working.
[19:31:25] started
[19:31:32] Yep. Can see it
[19:32:51] the wikimedia research newsletter review of an earlier preprint from yan and her collaborators: https://meta.wikimedia.org/wiki/Research:Newsletter/2017/February#"ExpertIdeas:_Incentivizing_Domain_Experts_to_Contribute_to_Wikipedia"
[19:33:55] brendan_campbell: any chance we can go to full screen (or is it already full screen)?
[19:34:15] leila: we already are fullscreen. yan's slides are 4:3
[19:34:20] rather than 16:9
[19:35:52] I see the bar saying "Yan Chen (presenting)" at the bottom
[19:35:59] It doesn't take up much space though.
[19:36:54] halfak: yes, unfortunately that's on yan's end. If she clicks "hide" it will go away, but doesn't seem to be obstructing anything
[19:37:30] Oh no. I see a bar beneath that that literally says "Yan Chen (presenting)"
[19:37:42] And "dtaraborelli" and "8 participants"
[19:37:50] Still it's very narrow
[19:37:58] I'm looking at the youtube stream
[19:38:00] FWIW
[19:38:01] Oh okay. Yes, that's always there. It's just a function of hangouts
[19:38:06] gotcha
[19:38:08] :)
[19:39:22] greetings everybody!
[19:39:35] hey mako1
[19:39:54] o/ mako1
[19:39:59] https://www.youtube.com/watch?v=L-1uzYYneUo|
[19:40:05] Woops https://www.youtube.com/watch?v=L-1uzYYneUo
[19:44:36] dsaez: are you monitoring the youtube live chat as well? we may get some questions from there, as well.
[19:44:55] oh!
no, going there, thanks
[19:45:54] "Create a channel to join the chat" ...doing that.
[19:47:24] thanks!
[19:49:18] Suddenly the screen sharing notice is in the way :|
[19:49:23] Should we interrupt?
[19:50:19] halfak, It looks good for me.
[19:50:23] in youtube
[19:50:35] Huh. How is that possible
[19:50:38] I still see it.
[19:50:56] I see it in YouTube as well
[19:51:11] oh
[19:51:14] "Google Hangouts is sharing your screen with hangouts.google.com [Stop sharing] Hide"
[19:51:25] yes
[19:51:28] you are right
[19:51:50] dsaez: I'm going to throw questions at you. ;)
[19:52:08] good
[19:52:40] DarTar: please can you tell Yan to remove the "Google Hangouts is sharing your screen with hangouts.google.com [Stop sharing] Hide" message
[19:53:47] dsaez: In slide 21 and in the models in that page, n is the number of consumers. n is assumed to be the number of pageviews, while pageview count can be from a smaller set of people, m, who are the consumers. Have the researchers thought about this, and the potential implications on the theory side of the research? (for example, do these models require independence between consumers, cuz that will be violated in this case).
[19:54:18] https://en.wikipedia.org/w/index.php?title=Traveler%27s_dilemma&diff=754617503&oldid=742878975
[19:54:48] leila: ok
[19:58:09] dsaez: my second question if there ends up being time and after others. did the researchers keep out a set of articles that were not shown to any researcher from this study and check later how much improvement is made in the article (or its talk page) organically and without this kind of recommendation?
[19:58:53] ok
[19:59:00] thank you! :)
[20:00:33] dsaez: Did all phase 2 emails have the actual number of page views for each page in them?
[20:02:05] it does. thanks.
[20:02:52] It's fine, we can move on.
I'm thinking the emails that were sent to the experts
[20:02:59] with the list of articles they might edit
[20:03:25] The example shown had the number of pageviews over the last month, but I wasn't sure if that was just the high pageview condition or both conditions
[20:04:33] sorry leila!
[20:05:10] no worries, dsaez. thanks for trying. :)
[20:07:19] does someone know if Yan Chen et al.'s work is published? sorry I missed this if it was mentioned.
[20:09:05] leila: IDK, we can ask at the end, I want to know too
[20:09:10] There was something about it on Twitter in December which is what led to this presentation
[20:09:17] (I think)
[20:09:26] leila: https://meta.wikimedia.org/wiki/Research:Newsletter/2017/February has a link to a preprint version from last year, but i understand it may differ from the new paper
[20:09:51] DarTar: if the paper is not out yet, can you ask Yan Chen to say what the name of the theory in slide 21 is? I want to read up on it. :D
[20:10:05] will do
[20:10:12] thanks DarTar and HaeB
[20:10:18] thx HaeB
[20:12:55] https://meta.wikimedia.org/wiki/Community_health_initiative/Interaction_Timeline
[20:12:55] This tool is super cool
[20:13:37] dsaez, Q for Caroline: Is there anything in the literature or anything you can think of that *looks* like this tool in other contexts? E.g. in twitter, facebook, etc.?
[20:13:48] this tool = user interaction tool
[20:14:56] halfak: ok!
[20:16:04] https://en.wikipedia.org/wiki/Sea_lioning (seems there's no article on dogpiling yet)
[20:17:39] Somewhat related: https://en.wikipedia.org/wiki/Vote_brigading
[20:18:18] https://en.wikipedia.org/wiki/Wikipedia:Harassment#Wikihounding
[20:19:47] dsaez: makes me think, did we explore the infringement of the 3-revert rule in the revert analysis?
[20:21:04] DarTar, based on that rule, I'm considering that there is an 'edit-war' every time that a user reverts another 3 or more times, but not just in one page
[20:21:20] right right
[20:22:29] We should explicitly cite that policy in the report
[20:22:43] dsaez: question on my end. Based on the presentation, it seems that the "intent" of person x in doing what they're doing has impact on whether a case is wikihounding or not. If this is correct, how are we planning to capture intent (which may or may not have been said in words on wiki pages.)
[20:23:40] leila, do you mean, how is it possible to guess the user intent?
[20:24:39] dsaez: guess in the scientific sense, yes. How can we predict user intent, if that's something that is important in defining what constitutes wikihounding.
[20:27:51] https://splinternews.com/an-illustrated-guide-to-trolling-1793854790
[20:28:00] HaeB: I checked the preprint article. It doesn't have the theory side spelled out as in the slides. but thanks for pointing it out still.
[20:29:36] dsaez: adding myself to the QA queue for csinders
[20:30:17] * halfak curses the industrial brain drain
[20:30:36] Hear that Facebook/Google/Twitter/etc.?
[20:30:51] Curse your black hole of experimental design insights.
[20:31:08] halfak, didn't get you
[20:31:22] dsaez, you got the Q and I got the answer :)
[20:31:24] we should pick up that conversation at town halls at CSCW/WWW, started last year with some promising ideas but I don’t think it went anywhere
[20:31:35] +1 DarTar
[20:34:57] halfak: I think for many platforms detecting illegal items is highest priority, for example, hate speech. And those models won't be necessarily applicable to harassment case.
[20:35:11] other questions for CS? if not we should open up the queue for YC
[20:35:51] +1 leila. I've had an opportunity to talk to some people from Facebook/Twitter/Google/Quora who have things that look like Huggle/ClueBot NG for illegal stuff and spam.
[20:36:14] yeah
[20:36:27] But I've not learned anything about what they do (if anything) for harassment. it seems like twitter needs it because they do actively block people for nuanced reasons.
[20:36:56] sorry dsaez i thought that was the end of the irc question queue already
[20:37:37] https://tools.wmflabs.org/meta/stalktoy/
[20:37:37] np
[20:37:53] halfak: btw, for the harassment case, the work by Cristian et al. is definitely something we should be building on. wikihounding is more complicated because we don't know how to define it yet.
[20:38:31] “stalktoy”, good grief
[20:38:41] leila, right. I think there's a lot of potential for data visualization tools like the interaction tool for these nuanced cases. If nothing else, they'll help us gather labeled data >:)
[20:39:07] DarTar: do we have time for one more question to yan?
[20:39:08] * halfak stalks himself.
[20:39:19] halfak: haha
[20:39:28] yes, we do have still plenty of time, HaeB. I too have a couple
[20:39:46] plenty = 20 more mins max
[20:40:14] Who is the person who is sitting by the door in the SF office? -- she recently asked a question.
[20:40:29] looking to see what it does; is it different than "show edits of a user across all wikis"? actually it seems like a useful tool for stewards, though I don't love the name
[20:40:38] good question, halfak.
[20:40:44] where blocked, who by, number of edits etc
[20:40:55] dsaez: do you want to ask halfak's question? :D
[20:41:03] * leila hides while pulling dsaez' leg
[20:41:04] halfak: I don’t know her. apergos, maybe this is the last question
[20:41:16] not asking q
[20:41:19] just commenting
[20:41:21] Status "Okay" :)
[20:41:23] oh okay
[20:41:27] someone else please take the q slot!
[20:41:29] The stalk tool likes me.
[20:41:46] I’m on it :)
[20:43:08] I tried it on the suggested user (steward who apparently wrote it),
[20:43:11] no question! no no
[20:43:12] sorry
[20:43:15] I'm lost
[20:43:18] Wait. I have a question?
[20:43:19] I was reading the youtube trend
[20:43:21] I don't think I do
[20:43:27] I had one but it was answered.
[20:43:30] ok
[20:43:40] anyways, halfak, it timed out for me... booo
[20:43:42] I don’t think you do either, you had a rant about industrial brain drain :p
[20:43:43] I misinterpreted leila's comment
[20:43:55] apergos is too awesome for stalking.
[20:43:56] oh there, a retry got it. nice charts
[20:44:12] I didn't try to stalk me, but the suggested user (pathoschild)
[20:44:21] dsaez: I cracked up for a second on my end thinking that you will actually ask the question. :D
[20:44:37] I was assuming that everyone is serious here :P
[20:44:40] yeah
[20:44:43] dsaez: always
[20:45:08] https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Economics/ExpertIdeas
[20:45:35] leila, I think this is harassment ... :P
[20:46:03] dsaez: it actually may really be. :D I apologize. :)
[20:46:05] Only if it happens three times :-)
[20:46:17] hahaha
[20:46:22] :P
[20:52:09] sounds like one for you halfak
[20:52:20] when do we know an article is about to pass to a new quality class
[20:52:37] (aside from formal nominations, that is)
[20:52:37] DarTar, we should be able to do that with wp10
[20:52:51] Using the weighted sum, you can see when an article is borderline.
[20:52:53] yep
[20:52:56] my thinking too
[20:53:54] last year's discussion with one of the coauthors https://en.wikipedia.org/wiki/User_talk:I.yeckehzaare#re:_Appreciation_for_promoting_our_study
[20:54:02] It would be interesting to find out if the numbers play out that way in practice. There's been lots of real-life use of the model with good response, so I expect that it'd work for looking for borderline pages too.
[20:54:24] Theoretically, the model could also suggest what types of changes a page needs to cross the boundary.
[20:54:29] ragesoss, ^
[20:54:32] halfak, DarTar, maybe getting the delta (derivative)
[20:54:35] ragesoss has been working on that.
[20:55:13] considering the last revisions
[20:55:57] \o/
[20:56:00] <3 DarTar
[20:56:28] Nettrom did some work to model the shape of Featured Articles so that he could turn that into suggestions too.
[20:56:33] SuggestBot uses that now.
[20:56:38] yeah, took some first steps last summer, but it's been on the backburner since the end of that GSoC project.
[20:56:45] but hoping to do some related work this summer.
[20:56:46] the 2011 survey by dario et al https://meta.wikimedia.org/wiki/Research:Expert_participation_survey
[20:57:16] ragesoss, I have some ideas about improving the performance of that kind of model-interrogation. I think we have some good options.
[20:57:30] :-)
[20:57:39] I'll need to implement a new output format for ORES though :/
[20:59:10] I need to step out, people, to get ready for the next meetings. Thank you all for organizing this session, and to our speakers.
[20:59:48] oh, hmm, guess I should watch this video.
[21:00:13] ragesoss, yan talked about working with WikiEd (I think) in her studies.
[21:00:25] And yeah, article quality changes are relevant to her work with experts.
[21:00:28] That's the first talk.
[21:00:42] The second was Caroline talking about harassment & the User Interaction Timeline tool.
[21:01:34] sounds like two things I want to hear about!
[21:01:51] thanks for joining, y'all
[21:02:13] Thank you
[21:02:23] o/
[21:04:24] bye!
[21:09:37] dsaez, don't forget to put the topic back
[21:10:01] here's what it was:
[21:10:04] Welcome to the Wikimedia Research channel -- a space for discussion of wiki research | https://meta.wikimedia.org/wiki/Research - https://www.mediawiki.org/wiki/Wikimedia_Research | Channel is publicly logged @ https://wm-bot.wmflabs.org/logs/%23wikimedia-research/
[21:24:04] see folks later!
[21:28:57] {{done}}
[22:14:10] halfak: hey I know you're busy but maybe we should talk about the summit?
[22:14:24] milimetric, I was thinking that too.
[22:14:29] :)
[22:14:45] Seems leila isn't online
[22:14:45] is leila gone for the day?
[22:16:02] not sure.
[22:16:17] milimetric, just started my last meeting. I'll be fully available again in 45 mins.
[22:16:22] Will that work for you?
[22:16:30] yes, no prob
[22:17:07] great
[22:44:01] Quarry: Quarry runs thousands times slower in last months - https://phabricator.wikimedia.org/T160188#3907835 (zhuyifei1999) Query 4835 works in around 6 seconds for me. What's wrong?
[23:03:38] milimetric, I'
[23:03:41] m FREE
[23:03:47] o/ leila
[23:03:55] milimetric, and I want to talk devsummit
[23:04:09] cool, let's see, batcave?
[23:04:14] https://plus.google.com/hangouts/_/wikimedia.org/a-batcave
[23:07:24] _a_ batcave? Does that mean there are multiple batcaves?
[23:17:15] bmansurov: task for language pair distributions at https://phabricator.wikimedia.org/T185160
[23:17:24] let me know if we should make it more specific.
[23:17:31] * leila reads what halfak says.
[23:18:11] halfak milimetric: I'm in the middle of something I can't drop right now. How about the two of you iterate on it based on the notes in the etherpad and I catch up with you tomorrow?
[23:18:31] halfak milimetric: note to self. Plan for next summit in February 2018! :D
[23:18:34] I'm on vacation tomorrow and Friday :|
[23:18:38] lol +1
[23:18:50] halfak: I envy you. really! :D
[23:19:01] I can hang out tomorrow, leila
[23:19:24] great. halfak: get ready for an upside down session on Monday when you check emails. :D
[23:19:39] everyone will walk in on their hands. ;)
[23:28:27] Quarry: Quarry runs thousands times slower in last months - https://phabricator.wikimedia.org/T160188#3908049 (IKhitron) See five top queries on my profile. They usually run 1-5 second. Now half of them run dozens of seconds.