[17:29:59] Hello, the research showcase is starting soon! Please send your questions to me!
[17:30:56] Hellooo
[17:32:18] feedback survey: https://docs.google.com/forms/d/e/1FAIpQLSecgn8cMu5IfTYRgn93bfOiJVEIL09RRf_WV0dVr6ZnJ8UU_w/viewform
[17:33:04] wiki workshop website: http://wikiworkshop.org/2020
[17:50:00] woot woot workshop! :)
[17:50:09] ;)
[17:50:23] Nettrom: Are you watching it now?
[17:50:29] leila: yes
[17:51:09] Nettrom: I wonder if the model can be used to solve the same problem but on Wikipedia, improving the current plagiarism detection models that some bots operate based on.
[17:51:36] Nettrom: workshooop!
[17:52:06] leila: ooh, that's a great question! I did a rudimentary analysis of those bots for the Growth team last year
[17:52:34] or more precisely, the two approaches we were considering for making page patrolling easier
[17:52:35] Nettrom: are there specific ones you'd recommend martin look into?
[17:55:19] leila: digging up the names of them now
[17:57:21] leila: the two approaches that Growth were looking at were EranBot (which uses Turnitin, see en:User:EranBot) and Earwig's copyvio detection tool: https://tools.wmflabs.org/copyvios/
[17:57:58] Nettrom: thanks. I'll look at them after the showcase. (feel free to direct martin to them if you'd like to)
[17:58:24] leila: no, having you following up with him sounds great, I have another question for Martin :)
[17:58:47] ok. ;)
[17:58:59] J-Mo: will you give us some time for questions?
[17:59:28] we decided to have Isaac start promptly at 1800 UTC, correct?
[17:59:36] I'll ask people to save their questions until after Isaac
[17:59:52] a few min later is fine, however you want to manage it though. questions at the end is fine on my end, too
[18:00:08] I'll ask people to save 'em, so we have room to stretch out a bit
[18:00:19] J-Mo: sounds good. thanks!
[18:00:20] i'd rather hear Martin than myself :)
[18:00:28] I wonder what to make of the huge numer of copyright violations in the reused Wikipedia pages? How can we understand that? Is it too complicated to use CC licences properly? Should we adapt our licence accordingly? Or should we rather try to make people better comply with CC?
[18:00:45] numer -> number, of course, sry
[18:01:13] miriam_: ^ (are you taking IRC questions?)
[18:01:21] yes leila, got that aschmidt_
[18:01:22] Dariooooooooo :`-(
[18:01:42] thx, miriam_ !
[18:02:05] One more thing: Will the presentation be available after the showcase?
[18:02:15] as pdf, I mean...
[18:02:15] I've definitely spoken with people about creating discrete statements of knowledge
[18:02:56] aschmidt_ we can ask the speaker to share them on Commons!
[18:03:23] miriam_: thx, again! I've found it here: https://webis.de/downloads/publications/slides/stein_2019c.pdf
[18:03:28] miriam_: Q for Martin: in our 2012 paper ("In Search of the Ur-Wikipedia") we made a first attempt at looking at the extent of translations between languages based on templates, meaning a user explicitly marked an article as a translation of one in another language. Based on the results, we think this misses lots of translations. What are your thoughts about whether your approach for identifying text reuse can also work across languages to identify translations?
[18:05:03] aschmidt_ here you have a bunch of links related to the presentation: https://webis.de/publications.html#?q=wikipedia%20ecir%202019
[18:05:55] leila: do you have questions for the first speaker? You'll ask them directly on hangouts?
[18:06:29] miriam_: I do have general remarks. do you want me to say them here for you to relay, or should I say them in the hangout over voice?
[18:07:23] sure, feel free to ask them on hangouts over voice :)
[18:07:27] thx, again, miriam_ -- looks rather interesting
[18:07:34] leila^
[18:07:53] miriam_: ok. thanks.
[18:16:42] Your findings match what I see in our fundraising survey data
[18:17:13] Seddon: can you link to it?
[18:18:04] "younger readers are more balanced in gender" -- I think that's good news! :)
[18:18:41] aschmidt_: it may be. we can know in 10 years when we run the survey again. My guess is that the divide will happen again, as people age and the roles change.
[18:19:26] aschmidt_: the reason for it being, a language such as Norwegian is still showing numbers away from parity, while we know Norway is one of the top countries in terms of parity on many fronts (access to technology, internet, salary, education, etc.)
[18:19:31] norwegian looks like an outlier
[18:19:50] leila: Not at the moment, I need to make some changes to the tool to let me link to specific data selections. I'll let you know what I sort that
[18:19:54] when*
[18:20:01] subbu: in what sense?
[18:20:05] leila: that's a good point! But Wikipedia content also changes, so I think we cannot tell... we'll have to wait and see
[18:20:20] aschmidt_: I highly recommend we don't wait. ;)
[18:20:27] oh, men reading about men and women reading about women ... looks like it was not true for norwegian?
[18:20:30] :)
[18:20:58] subbu: the topic differences I would interpret very gently due to large error bars, etc.
[18:21:15] ok.
[18:23:05] re Norwegian age graph, IIRC this ran during the summer vacation, meaning we'd expect to see a larger younger audience if we reran this during the school year
[18:36:48] I blogged about the worth of WP in 2009: https://schneeschmelze.wordpress.com/2009/11/23/wikipedia-means-business/
[18:38:39] aschmidt_: nice.
[18:38:42] * leila makes a note to read
[18:45:22] thanks to you all! interesting session!
[18:45:39] thanks everyone!
[18:45:49] I'm sure I did some back of the napkin calculations for both how much it would cost to write all of wikipedia and how much there is in revenue potential
[18:46:10] Combining the work of Halfaker and others
[18:46:14] thank you isaacj and J-Mo
[18:46:22] and all of you for attending!
[18:46:37] thank you all for making it happen. :)
[18:46:44] great showcase, thanks for the awesome presentations and discussion! :)
[18:46:49] Seddon: definitely let me know when/if you get the demographics data. i would love to be able to compare the high-level numbers
[18:47:47] Nettrom: yeah, you're right regarding summer vacation. i would have loved to launch about a week or two earlier for that reason but :/
[18:48:17] we wrapped up monthlong surveys recently though and for Russian/English we now have June data and Sept/Oct data, so i'm curious to see how much it shifts
[18:48:32] preliminary results weren't suggesting by much but we shall see
[18:58:02] isaacj: You can play around with the data here: https://josephseddon.github.io/DataVisProject/ I built it for my MSc project but it's missing the ability to retain selections
[18:58:43] oh awesome, i can screenshot :)
[18:58:48] thanks!!
[19:30:21] djellel: I asked Amanda Bittaker to add you to a meeting about microcontributions on 2019-11-22. Your past experience can be helpful in that conversation, and the broader topic is also relevant for the work you are doing with the growth team. If you can't attend, let me know and I will. (more on it in our 1:1 tomorrow, I figured I'd put a note here for you to know what that invitation is about in the meantime.;)
[19:33:23] mgerlach: I also had a conversation about Wikidata being a prioritized key deliverable for WMF in FY20 and how we can work more closely on that. Amanda Bittaker suggests that you/we directly coordinate with the Wikidata team (Lydia_WMDE and Franziska). So please reach out to them and bring the options to our 1:1s to see where we can help. (If we define a project with the wikidata team, let's keep Amanda in the loop in some form, monthly/quarterly email updates, etc.)
[19:33:43] mgerlach: we can also talk about this tomorrow in our 1:1, especially in the context of your next project.
[19:35:55] leila: Alright, I will look out for it.
[19:36:27] miriam: (for when you're back) I talked with Amanda Bittaker about images and the work you're starting to understand the value of images in Wikipedia. She is very interested in this project. One point that she mentioned that may be of interest to you: there are discussions in the (primarily) enwiki community about what is the right number of images for a WP article? basically, there are those who have concerns about using too many images in an article (both in the article body and in the gallery). As you learn about how the images are being used now and by whom, you will likely be able to inform some of these conversations.
[19:36:32] djellel: thanks!
[19:37:39] miriam: for example, if we see evidence of visual learners, and I'll go back to my favorite example of monuments, the question becomes: are images there to aid the text or do we want to make a case for learning primarily based on the images, in which case, the level of visual documentation we need for a monument differs significantly from what we have in place today.
[20:37:32] Distraction time: check out the top 10 nominations of the participating countries for Wiki Loves Monuments international competition: https://commons.wikimedia.org/wiki/Wiki_Loves_Monuments_2019_winners#Finalists . Specific to the Research team: check out Algeria, France, Germany, Spain, Iran, and the U.S. (Italy hasn't announced publicly, yet.)
[21:13:07] sigh....
[21:14:28] what happened?
[21:52:53] oh that was a sigh of contentment/happiness looking at the photos
[21:52:58] sorry!
[22:15:31] uhu