[16:47:42] Office hours starting in 15 mins
[17:00:26] Hello everyone, welcome to Research/Analytics Office hours
[17:00:50] Who is here for the meeting? ( if you can give me a brief o/ )
[17:01:35] I’m here for now
[17:01:54] o/
[17:02:09] Hi everyone.
[17:02:28] Ok, let's get started... please do not hesitate to ask any questions on research, data, projects, ideas, analysis, ...
[17:02:32] I will try my best to answer directly or relay them to someone who can
[17:03:07] aloha research :)
[17:05:13] I have some gender gap questions. I've read some research that says female editors are not more likely than male editors to write biographies of women. Have there been any studies supporting or challenging this finding?
[17:06:09] mgerlach: I'll be around for the next 40 min and happy to help.
[17:06:49] Hello! My name is nadyah hani, my team and I are currently running a project to develop a web application which visualizes Wikidata’s entity completeness. The app can be accessed through prowd.id . With this app, we hope we can help people gain insights about knowledge imbalances on the web (from wikidata), and we hope this app can also spark
[17:06:50] more questions and research in the social / political science field on why and how knowledge imbalances occur in the web, and what it implies about society. From looking into wikimedia’s white paper on knowledge gaps (https://meta.wikimedia.org/wiki/File:Knowledge_Gaps_%E2%80%93_Wikimedia_Research_2030.pdf), we believe ProWD can be one of those
[17:06:50] tools the paper needs. The research we are conducting is focused on UI/UX analysis of ProWD and data completeness analysis on wikidata. We would love to hear your thoughts on our project and our web app (prowd.id), and are happy to talk about the project in more detail (nadyah.hani@ui.ac.id). If there are any references, resources, or anything at
[17:06:51] all which might be useful for our research, we would love to be informed about them. Thank you.
[17:07:16] Hello everyone, My name is Refo and I'm part of nadyah's team
[17:08:46] isaacj might know something about the gender gap?
[17:09:18] * leila reads nadyah's question.
[17:10:50] isaacj: ^
[17:10:59] * isaacj reads
[17:11:11] clayoquot: WikiProject Women in Red would seem to be a direct refutation of that claim
[17:12:33] nadyah: I'm playing with the app. please stay tuned.
[17:12:41] We would like to improve the analytics part of ProWD by using the Gini index for inequality analysis and association rules to learn the relations between properties, and we would like to ask your opinion about those approaches (Gini index and association rules)
[17:13:05] the only research I know regarding gender + contribution behavior was in the domain of OpenStreetMap: http://brenthecht.com/publications/chi2019_genderbias.pdf
[17:13:07] refo: acknowledged.
[17:13:22] Clayoquot: ^^
[17:13:34] well, there's the Tony Lam WP:Clubhouse paper
[17:13:51] nadyah: regarding completeness, see also the December-2019 showcase https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#2019
[17:13:57] but that uses a pretty skewed sample, since they're only looking at people who self-disclose gender
[17:14:23] https://meta.wikimedia.org/wiki/Research:The_role_of_citations_in_how_readers_evaluate_Wikipedia_articles/Trust_taxonomy
[17:14:35] whoops, I mean http://files.grouplens.org/papers/wp-gender-wikisym2011.pdf
[17:14:44] J-Mo, is Women in Red a refutation of the claim that women are more likely to write biographies about women?
[17:14:51] leila: will do!
[17:16:09] Clayoquot: sorry I wasn't more clear; many female-identified editors participate in WiR; WiR was started by a female-identified editor; therefore, it seems likely that women ARE more likely to edit female biographies. See also WikiProject Women Scientists.
[17:16:58] J-Mo, actually Rosie Stephenson-Goodknight says most of the people writing biographies of women are men.
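As a rough illustration of the Gini index idea refo raised at 17:12:41, here is a minimal Python sketch. It assumes completeness is summarized as one non-negative number per entity (for example, a count of filled properties). The function name `gini` and the sample counts are hypothetical and are not taken from ProWD; association rules are not covered here.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0.0 means every entity has the same value (perfect equality);
    values approaching 1.0 mean a few entities hold almost everything.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        # Convention chosen for this sketch: empty or all-zero input is "equal".
        return 0.0
    # Sorted-rank formula: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n,
    # where i runs from 1 to n over the values in ascending order.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical number of filled properties for five entities in one class.
filled_property_counts = [2, 5, 5, 8, 30]
print(round(gini(filled_property_counts), 3))
```

A single coefficient per class makes classes comparable, but it hides which properties are missing, so it would likely need to sit next to a per-property breakdown.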
[17:17:00] mgerlach: reading it now, thank you!
[17:17:30] J-Mo, the WP:Clubhouse paper doesn't mention biographies as far as I can tell.
[17:17:36] by the numbers, of course! but you'd expect that if 80-90% of editors are men ;) that doesn't address the question of whether women are more likely to edit female biographies
[17:18:03] clayoquot: right, it is just the only paper I could think of that looks explicitly at editing behavior and self-disclosed gender in wikipedia
[17:18:45] there's Menking's "The Heart Work of Wikipedia", but that mostly describes women's experience of Wikipedia culture
[17:19:42] J-Mo, right. So does any research address the question of whether women are more likely to edit female biographies? It sounds as if people have looked for research on this question and not found it, which would indicate that it probably doesn't exist.
[17:19:46] isaacj's recent research suggests that women are more likely than men to read female biographies (I believe?)
[17:20:37] clayoquot, I unfortunately cannot recall any project that addresses that specific question
[17:20:38] yes -- definitely evidence that women are relatively more likely to read biographies of women than men are (slide 16 here: https://figshare.com/articles/Reader_Demographics_November_2019_Wikimedia_Research_Showcase_Presentation/10565882)
[17:21:47] someone could use the user_properties table (https://www.mediawiki.org/wiki/Manual:User_properties_table) to repeat the WP:Clubhouse work and focus on edits to biographies of men/women
[17:22:04] ^this
[17:23:47] that would be a great OpenSym paper... nice partial replication of existing impactful work, and it focuses on a still-relevant question
[17:24:00] * J-Mo has his head stuck in paper review mode this week
[17:24:03] isaacj, great idea!
[17:24:54] nadyah and refo : a few early pointers that may be helpful for you to develop your project further. 1.
(Disclaimer: I haven't read the thesis behind ProWD) Make sure you're aware of RECOIN: http://www.simonrazniewski.com/wp-content/uploads/2018_Wiki-workshop.pdf . 2. In our team (Research), djellel can be a good person to talk with about property recommendation in WD. Lydia_WMDE is the product manager for WD.
[17:24:54] I highly recommend you keep her and the Wikidata community updated via wikidata public mailing list https://lists.wikimedia.org/mailman/listinfo/wikidata about your project. 3. GoranSM (currently not logged in here; I'll try to find him after this message) is the data scientist working in Wikimedia Deutschland on Wikidata. He knows a lot about WD, I suggest being in touch with him.
[17:25:50] J-Mo, thanks for answering. It takes a lot of research to confirm that a gap in research exists, and you've made that much easier for me.
[17:26:02] leila: thank you!
[17:26:21] Clayoquot, I know exactly what you mean. Glad to help!
[17:27:00] milimetric: do you want to pose your question you sent on the mailing list here and collect input, too? :) Now that we're in the office hour, why not?! ;)
[17:27:00] nadyah and refo: also I would add wikidata concept monitor (by GoranSM) http://wdcm.wmflabs.org/
[17:28:47] Has anyone thought of adding an optional "gender" field to the user registration form? It would make it so much easier to track the gap between male/female participation, albeit only among registered users.
[17:30:04] this may be something that the Growth team is doing.
[17:30:18] mgerlach leila : thank you for the feedback. Will look into it. =D (y)
[17:31:22] leila: I think it's a bit complicated and more geared towards wmf-folks in its current form. But I will definitely ask it externally at some point
[17:31:49] Clayoquot: it has been discussed but so far it has not been implemented (largely for privacy reasons).
Similar surveys have been done occasionally over the years though (https://meta.wikimedia.org/wiki/Research:Gender_micro-survey and https://meta.wikimedia.org/wiki/Research:Surveys_on_the_gender_of_editors)
[17:32:12] (to Clayoquot): however, I have some ethical concerns about this kind of approach overall: why would we ask for personal information about users unless we are going to use that information to directly benefit their experience of Wikipedia? Every bit of personal data we collect and store creates a measurable risk. We need to have a rock-solid justification every time we decide to collect it, whether we make that public or not.
[17:32:37] milimetric: interesting. cuz you started receiving response(s) from outside of WMF folks I thought it's geared to the world. :) sounds good to skip then.
[17:32:42] (make the data public, not the reason; we should always make the reason public)
[17:32:43] The Growth team has done some good survey work with new editors though which you can find here (and in adjacent pages): https://www.mediawiki.org/wiki/Growth/Personalized_first_day/Welcome_survey
[17:36:39] For wikidata profiling projects such as ProWD, RECOIN, and GoranSM, what kind of analytic insights would be most useful?
[17:37:17] *GoranSM's project
[17:38:18] (To J-Mo), the ethical questions you're raising are issues that need to be explored, for sure, and probably warrant a community discussion. Personally I'd rather reframe the question to ask, "how could we use this information to improve the experience of Wikipedia for an under-served population, i.e. women?" This is a collective benefit rather
[17:38:18] than an individual benefit.
[17:41:01] Clayoquot: agree that the potential collective benefit is huge. but the risk is individual, and so I (personally) would still be uncomfortable with asking for gender by default unless we were using that information for potential individual benefit.
so for example if we were going to use the information to connect new woman editors to experienced woman mentors, then there'd be an individual benefit too, and I'd be all for it.
[17:41:18] refo: this is best discussed on the wikidata@ mailing list. In the meantime, what I've heard repeatedly from Lydia is that any analytics effort that helps the community visualize the gaps and prioritize work on specific gaps is helpful. more broadly speaking, on bias in content: the more we can surface different biases the better.
[17:41:20] as LONG as we were clear to the users what we were collecting the information for, and were clear that it was not required
[17:44:30] Clayoquot: potentially related research project proposal (aimed at editor retention for underrepresented groups) https://meta.wikimedia.org/wiki/Grants:Project/Community_Health_Metrics:_Understanding_Editor_Drop-off
[17:46:47] leila: Thank you for the input and we will discuss this in wikidata@ mailing list in the future
[17:47:02] ^Ooh, that looks like a great proposal.
[17:49:40] 10 minutes left for office hour
[17:59:44] thanks everyone for joining
[18:00:18] that is it for today, next office hour will be in a month
[18:00:35] 25th March
[18:02:45] leila: we're very glad for outside input, I just haven't thought through how to work that in the more complicated internal-focused architecture. Because the majority of the infrastructure is stuff that the outside world doesn't care about as far as I know. Like cache invalidation based on the dependency graph. But maybe you and I should talk about it more
[21:02:04] o/ J-Mo
[21:02:10] Google meet is slow. Trying to join
[23:24:43] milimetric: no rush. we can talk about the outside part whenever you have time. just ping. (and some of it will naturally come through the public list you have in that thread.)