[19:34:44] We will be starting the research showcase shortly
[19:40:06] We are experiencing some technical difficulties…
[19:40:54] I’ll be sending out a new youtube link shortly
[19:41:28] while we're waiting for IT to fix the streaming issue, it's worth noting that this is my third broad talk in the Foundation, and all these times, technology has disappointed me. :D
[19:41:35] it's good that this is an hour session. ;)
[19:42:37] :)
[19:46:31] Can you guys paste the new link here too?
[19:47:17] Here is the new youtube link: https://www.youtube.com/watch?v=xIaMuWA84bY
[19:48:15] Thank you! No audio here
[19:48:38] We have not started streaming yet
[19:49:10] Starting now
[19:49:15] Thanks!!
[19:51:06] A small strange echo. Sounds almost like a compression artifact.
[19:51:57] perfect storm of technical difficulties today. dealing with feedback issue. please hang in there
[19:52:21] my apologies...
[19:53:16] works quite well
[19:53:35] i'm curious
[19:54:38] There will be Q&A sessions dispersed throughout the talk. I will relay your questions to Leila at those times.
[19:55:03] Audio is gone
[19:55:13] back
[19:55:14] and back
[19:55:20] Yep, audio was cut for a few seconds
[20:06:53] :-)
[20:07:27] ellery: from youtube chat: why persian and spanish?
[20:08:00] please don't say "users" - be more specific, readers or contributors
[20:08:09] @schana got it, I’ll relay the question at the next Q&A part
[20:11:13] Ellery: Giovanni here, can you ask Leila how did they choose the weights for the propensity scores?
[20:12:24] Will do
[20:12:29] Thanks!
[20:13:36] what were the auxiliary variables in the first place? did i miss something? (getting small delays at times)
[20:14:24] for the days of the week graph, does this include persian data?
[20:14:24] pajz: Could you clarify what you mean by auxiliary variables?
[20:14:33] or enwiki only?
[20:14:42] (given the shift in workweek)
[20:14:47] Afaik, Leila will discuss the features used for correcting a bit later. But basically the propensity score is determined against a random user population drawn from the complete Wikipedia log population.
[20:14:57] enwiki only
[20:15:56] "current event" is about something in the news?
[20:16:51] Ziko: That's most likely, but could also be something like a holiday, natural disaster
[20:17:10] ok thanks - and media, in contrast?
[20:17:20] ellery, by which variables she's weighting
[20:17:25] Thank you Phillip!
[20:17:35] Ziko: Multiple selections were allowed, so you could choose media and current event
[20:17:54] ok
[20:18:03] thanks psinger
[20:18:04] Pajz: By a larger set of features, she will mention those in a second
[20:19:59] alright
[20:21:38] Ziko: media could also include TV show, sports event. I agree that media and current event have some overlap
[20:21:47] thanks
[20:22:28] One more question: what was the overall response rate in the surveys?
[20:24:09] We know, a bit below 50%.
[20:24:26] I think ;)
[20:24:38] But can check for that.
[20:25:31] Or at least that is the ratio of people clicking yes vs. no
[20:25:39] there might be also users ignoring it, that might not be captured
[20:26:32] OK that makes sense... because 1:50 x 200M requests a day would be quite a lot more!
[20:26:51] it is on a user base though
[20:26:53] the sampling rate
[20:27:03] Ahhh, makes sense!
[20:27:06] Thanks!!
[20:28:40] We got a full response for roughly 1:600 survey widget impressions
[20:29:41] Thx ellery!
[20:30:09] ellery: impression meaning they see the question whether they want to answer a question?
[20:30:21] Yes
[20:31:40] Roughly 60% of users who agree to answer the survey actually submit the survey.
[20:32:29] And do you know the yes/no click ratio?
[20:33:52] roughly 13:33
[20:34:27] ok, thx, that's what I meant before
[20:39:48] would it be possible to do this research on a set of selected articles, and repeat that over time - to see what could change the percentages?
[20:40:00] i wonder how happy are readers - e.g., is wp good for an overview, is it useful for work etc.
[20:40:29] I'd be curious what could trigger people to do more in-depth reading :)
[20:40:33] (eventually)
[20:41:51] ziko: yeah, that would be interesting too, whether that happiness would be different for people who read in-depth, or only fact lookup
[20:42:43] yes, is wp as good for the different purposes as we hope it is
[20:43:20] but it is already interesting that the answers are so diverse. wp is not simply the quick'n'dirty-look-it-up-site
[20:43:44] Ellery: another question. First, thanks for performing such an interesting study and for sharing the details! I have a question about prediction: do you have any idea on what kind of features could make prediction feasible?
[20:44:37] Donald Trump
[20:44:52] The Donald!
[20:45:16] BTW: For Donald Trump the distribution was super uniform, we were expecting more diversity there.
[20:45:45] leila, it is an old dream of encyclopedists :-) that readers don't look something up quickly, but learn more (18th century)
[20:49:47] I think predictions on the level of individual users will remain difficult, but maybe we can make progress on predicting what the distribution of answers would be for one article
[20:50:43] re mouse tracking, see e.g. https://en.wikipedia.org/wiki/Bounce_Exchange#Software ....
[20:51:07] FLemmerich: yes, pretty much what I was getting at
[20:51:57] From the youtube chat: "hey, our software predicts you're just bored. Bother doing some edits?"
[20:52:04] great Q
[20:52:14] (from the audience in the room)
[20:52:27] Will think about it Leila :-)
[20:52:32] btw: awesome job in keeping the number of questions so low
[20:53:13] Effeietsanders: I think we had 4-5 rounds of refinements on the survey and individual question design
[20:54:04] painful but successful process, I too am curious to look at the response rates but from what I recall they were pretty high (partly, thanks to the compact design of the survey)
[20:54:36] Great presentation and great research. Thank you!
[20:54:42] thanks all
[20:54:44] thanks
[20:54:52] Detailed *and* understandable :)
[20:55:05] Thank you for a great presentation of some really interesting research!
[20:55:23] +1
[20:56:11] A suggestion that came up in the Google chat: if we have a reason to think that the user is browsing Wikipedia when bored (for example, randomly clicking at pages or because they say so in a survey), say something like, "Our software thinks that you're bored. Would you like to help us learn by playing the Wikidata game?"
[20:56:59] I'm not sure telling any reader "Our software thinks that you're bored" is a great idea :)
[20:57:03] i am doing a speed view of the talk since i came late ... and i finished the first segment ... so far, it is excellent. enjoying it.
[20:57:05] Even if they said so.
[20:57:24] Wording can be adjusted.
[20:57:31] "Would you like to play the Wikidata game?"
[20:57:39] Try some A/B tests.
[20:58:08] guillom: that suggestion was made somewhat as a joke, to be honest :)
[20:58:13] You can find more info here: https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Reader_Behaviour
[20:58:23] And if you have more questions just ping one of the collaborators
[20:58:33] Effeietsanders: I like the idea of inviting people to contribute.
[20:58:40] I didn't hear anything that suggested that we can predict it to sufficient certainty anyway
[20:59:28] Even 10% of people opting into the Wikidata game could have a significant net positive effect.
[20:59:34] pine: me too, but i don't know if this research is granular enough yet to predict with enough certainty to be helpful in that targeting process.
[20:59:44] I would much rather that people play the Wikidata game than mindless pinball games on their phones.
[20:59:50] IP-selection may be more effective at this point
[21:00:01] Ziko: what do we know about 18th century users?
[21:00:06] but, that is me speculating :)
[21:00:21] Nemo_bis: we know one thing
[21:00:31] But would that sort of invitation to contribute really be useful only for bored people? Or is that something that should be tried regardless?
[21:00:32] The Diderot encyclopedia cost a lot, people teamed up to buy one copy
[21:00:34] Nemo_bis: that they're most likely dead
[21:00:38] heh.
[21:01:00] I was about to say, "18th century users is a null set"
[21:01:20] pine: i don't want to exclude the occasional dracula
[21:01:30] but for all intents and purposes, yes
[21:01:44] Death is irrelevant to me
[21:01:47] * Pine waves to halfak
[21:01:57] Nemo_bis: it has a pretty big impact on response rate
[21:03:21] That depends on the kind of interaction
[21:04:36] Nemo_bis: ok, let's stop here. this gets too detailed for me to handle :P
[21:06:05] ok, signing off here
[21:06:07] ciao folks!
[21:08:59] Nemo_bis : the makers of 18th century encyclopedias complained that the readers were not interested in learning much
[21:09:17] Look it up at de.wp:Enzyklopädie
[21:15:05] Ziko: I don't find much in https://de.wikipedia.org/wiki/Enzyklop%C3%A4die#Bis_zum_18..C2.A0Jahrhundert and https://de.wikipedia.org/wiki/Enzyklop%C3%A4die#Zeitalter_der_Aufkl.C3.A4rung
[21:15:51] under criticism https://de.wikipedia.org/wiki/Enzyklop%C3%A4die#Oberfl.C3.A4chliches_Wissen
[21:16:31] Goethe: here anybody can pay by penny for his needs alphabetwise
[21:21:01] Hm dunno, that sounds just like the perennial critique of unsystematic/popularised education
[21:21:48] I'm pretty sure Seneca already said something similar
[21:25:11] Can't find it right now but there are a couple related quotations in https://it.wikiquote.org/wiki/Lucio_Anneo_Seneca#Lettere_a_Lucilio
[22:58:24] * Emufarmers enjoyed the scrollback