[17:56:24] J-Mo: yo
[17:56:25] I'm here but multitasking so please excuse me if I wander off
[17:56:28] Ironholds: ping
[17:56:38] hey Pine
[17:56:44] Hi DarTar
[17:56:47] Pine: no problem. We'll start in just a few minutes.
[17:56:58] Can someone change the chan topic?
[17:57:08] I was just pinging halfak to discuss the agenda before we get started
[17:57:32] other than having individual presentations of stuff we're working on I think there's some other stuff we wanted to announce/discuss
[17:57:41] ok
[17:58:28] hey there stranger!
[17:58:51] How are you Siko?
[17:58:58] Hi Siko!
[17:59:20] Hey folks!
[17:59:54] we're just getting some last minute stuff coordinated offline… just a couple more minutes ;)
[18:00:17] Ok. While that's happening I'll send an email reminder.
[18:00:24] good call
[18:02:15] Pine: pong
[18:02:42] Pine: There's no +t in here, so anyone can change the topic
[18:02:47] Ironholds, thanks. marktraceur changed the topic
[18:02:50] oh
[18:02:52] ok everybody, let's get started
[18:02:52] ok
[18:03:02] quick round of intros for those of you who don't know who's attending
[18:03:02] Hilariously it winds up being me about 50% of the time anyway. :P
[18:03:17] I'll go first
[18:04:08] I'm Dario, sr research analyst, worked with Product for a while and now I'm part of the Research & Data team in Analytics
[18:04:57] I'm Henrique, data analyst and experiments consultant in the Brazil Catalyst Program
[18:05:09] I'm Jonathan Morgan, Learning Strategist in Grantmaking. Been contracting for a while, and just joined as full-time staff.
[18:05:19] I'm Aaron Halfaker. I've been doing research work for the Foundation as a contractor for a couple of years, but I just transitioned to full-time research-analyst last week. I'm currently working with the Growth team (formerly Editor Engagement Experiments) to model new editor behavior.
[18:05:51] …and that's it for the researchers, I believe. Anyone else want to introduce themselves?
[18:06:14] We have a professor here also right?
[18:06:15] * marktraceur is the topic change robot
[18:06:29] Pine: no, Aaron Shaw wasn't able to make it.
[18:06:35] ok
[18:06:42] but hopefully at future office hours!
[18:06:43] I'm Haitham. Learning Strategist, works with the grantmaking team on global south and grant program evaluation.
[18:08:28] I think that's it for intros. DarTar there was something you wanted to announce?
[18:08:57] yeah
[18:09:11] a couple of service announcements before we get into ask-me-anything mode
[18:09:50] 1) we're organizing a Wikimedia research hackathon/meetup some time in late October/early November
[18:10:14] the date is still to be defined but I think J-Mo and halfak should give a quick pitch
[18:11:13] so, basically, we're looking to assemble a bunch of folks who are interested in research related to Wikis
[18:11:14] ^^ If you'd like to get updates, add your name here: https://meta.wikimedia.org/wiki/Grants:IdeaLab/Labs2#Interested_participants
[18:11:59] …academics, foundation folks, and volunteers.
[18:12:35] to answer research questions, build tools, and perform analyses that people are interested in.
[18:12:49] and evangelize our data sources
[18:12:56] Ironholds: Can you give us a quick example by talking about your project with block reasons?
[18:13:08] the idea would be to have different local hackathons on the same day, and connect them to one another using the magic of THE INTERNET
[18:13:59] Ironholds: ?
[18:14:10] I can give a quick overview too.
[18:14:13] OK.
[18:14:25] So, we're looking at how blocking behavior has changed over time in English Wikipedia.
[18:14:43] Specifically, we're looking for evidence that abuse filter is reducing the rate and type of blocks that happened.
[18:16:00] cool
[18:16:20] I just wanted to give you guys that so you'd know what we were talking about. This event could involve performing this type of analysis, building tools, etc.
[18:16:23] Any questions?
[18:16:39] Is there any coordination between that research and what's happening with Snuggle?
[18:17:21] Not that research specifically, but I'm using a similar set of strategies to add better block detection support into Snuggle.
[18:17:39] ok
[18:18:35] DarTar was there another announcement?
[18:18:39] yes
[18:19:12] so we've been working for a while with OKFN to experiment with the use of CKAN as a data repository
[18:19:31] acronym expansion:
[18:19:35] hey ashaw
[18:19:38] greetings!
[18:19:44] sorry to be a bit late
[18:19:45] Welcome ashaw.
[18:19:51] OKFN: Open Knowledge Foundation
[18:20:42] CKAN: Comprehensive Knowledge Archive Network (it's the open source software currently used by large open data repos like the US and UK gov data hubs)
[18:21:38] we've used for a while http://datahub.io/group/wikimedia as our semi-official data sharing solution
[18:22:02] and they recently went through a major upgrade/refresh
[18:22:28] they reenabled the data store which allows researchers/community members to upload and share datasets in the Wikimedia group
[18:22:58] so help us spread the word and get more people to share their data, when they run research on our projects
[18:23:21] and ping me if you have any issues using the datahub
[18:25:21] last service announcement
[18:25:51] we're still looking for contributors for the next issue of the research newsletter (Sep 2013), due for publication on Wednesday
[18:26:28] Research newsletter: https://meta.wikimedia.org/wiki/Research:Newsletter
[18:26:59] I can give some help. I'm going to relaunch the French version in a short while.
[18:27:06] hello
[18:27:12] At this point, we're going to switch to an AMA style format. So anyone in this chan with a BURNING question about research should chime in at any time.
[18:27:13] if you've read any of the papers listed in this etherpad: https://etherpad.wikimedia.org/p/WRN201309 we'd love to have your notes and short summaries
[18:27:22] PcLanglais: that'd be grand
[18:28:05] I've gone through Gellart's chapter
[18:28:15] @J-Mo: could you say more about what you do as a "Learning Strategist"?
[18:28:15] not sure if someone else is covering it
[18:28:58] It's a deal :)
[18:29:23] heh. that's a good question jcorneli.
[18:29:57] OrenBochman: Are you referring to one of the open papers for the Research Newsletter?
[18:29:59] basically, at this point, Haitham and I are working on setting up an infrastructure to help grantees measure the impact of the grant-funded activities they perform
[18:30:04] yes
[18:30:16] and also developing tools to help them measure impact.
[18:30:27] (in that capacity, we work closely with Analytics)
[18:30:29] So, and grants are related to learning activities?
[18:31:15] it's the one called Reliable Sources for Indigenous Knowledge: Dissecting Wikipedia’s Catch–22
[18:31:29] yeah, the word "learning" is kind of strange. My former title "research strategist" made more sense, to me anyway. But as part of the Learning and Evaluation team within Grantmaking, we're s'posed to help make sure that we get the best results for the grant money we spend.
[18:31:36] OrenBochman: make sure you add your name to the etherpad
[18:31:40] J-Mo: how is your job different from the Program Evaluation Specialist?
[18:31:42] if you haven't yet
[18:31:57] OrenBochman: It looks like that one is available to me. You should feel free to claim it.
[18:32:31] done
[18:32:41] another good question, Pine. The Program Evaluation Specialist (janstee) works primarily with Chapters and large-scale funded programs. Haitham and I tend to work more with smaller grants.
[18:32:44] OrenBochman: Thanks!
[18:33:00] Haitham, janstee, feel free to chime in with more/better explanation ;)
[18:33:02] @J-Mo: OK, I think I understand better - you're *learning* how to develop more effective grant programs
[18:33:05] J-Mo: ok. Is jwild supervising both?
[18:33:11] jcorneli: YES
[18:33:18] Sounds good
[18:33:32] no, jwild is Haitham and my boss, but not janstee's
[18:33:39] she works with Frank Schulenberg
[18:34:11] @J-Mo: here's a short "use case" for peer learning techniques that's related to your job (I think): http://peeragogy.org/practice/use-cases/improving-the-efficacy-of-research-funding/
[18:34:18] I'd be curious to know what you think of that!
[18:34:49] is Aaron Shaw around ?
[18:34:51] J-Mo: it's actually Schulenburg (it usually takes a few months for people to get it right) ;)
[18:35:04] OrenBochman: He's having technical difficulties at the moment.
[18:35:14] too bad
[18:35:18] J-Mo: how is this different from wikimetrics?
[18:35:38] Doc. Shaw may still make it.
[18:35:38] ooh, I like this jcorneli. I'm working on a similar format for capturing key tips and considerations for evaluation, that uses Design patterns.
[18:35:54] tnegrin: how is what different?
[18:35:58] i'm around!
[18:36:03] anyone here looking into gendergap these days ?
[18:36:03] just not on the video at the moment
[18:36:04] @J-Mo: yeah, we have a lot of those too... http://peeragogy.org/practice/
[18:36:08] oh yes
[18:36:14] is there video too ?
[18:36:27] OrenBochman: sikob may be able to answer the gender gap question. Siko?
[18:36:41] OrenBochman: did you see the gender microsurveys?
[18:36:51] don't think so
[18:36:55] J-Mo: sorry, let's do this offline
[18:36:58] I'll reach out
[18:37:03] jcorneli, I clearly need to follow up with you more. Do you have time this week to teach me a little more about what you're doing with peeragogy?
[18:37:11] tnegrin: k
[18:37:30] OrenBochman: here's some high-level description of what the project is about https://meta.wikimedia.org/wiki/Research:Gender_micro-survey
[18:37:40] DarTar: Can you link to info about those micro-surveys?
[18:37:46] I am looking at neutrality and gendered language on en.wikipedia.
[18:37:52] OrenBochman: we're aware of gender gap and related issues in IEGCom. I'm not sure what's happening in the rest of WMF these days.
[18:38:11] nothing specifically aimed at gender gap wmf-wide
[18:38:14] I think that was deprioritized as a part of "narrowing focus" but not sure.
[18:38:14] what is IEDCom
[18:38:22] IEG, not IED.
[18:38:28] Individual Engagement Grants.
[18:38:28] but always interested in research on the topic
[18:38:36] IED is a way more exciting and shorter conference.
[18:38:47] @J-Mo: Yeah - generally free around this time of day (UK evenings), if that works well. If you want to browse through my thesis, it's online. I see that we're on about the same schedule, PhD-wise :-) http://metameso.org/~joe/thesis-outline.html
[18:39:03] I've done some work using game theory to look at how the gender gap arises
[18:39:12] I'll send you an email to facilitate follow-up
[18:39:13] * sumanah looks at logs http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/20130923.txt
[18:39:13] wikimedia-office needs an acronym-solving bot
[18:39:44] OrenBochman: that would be interesting for the research newsletter! Can you work that in with DarTar?
[18:39:49] nice! Want to meet here tomorrow at 11 PDT and I can pick your brian?
[18:39:52] *brain
[18:39:58] I've found a family of models which model some of the qualitative issues
[18:40:08] it's too early for this one
[18:40:38] anyhow the most interesting results so far come from three areas
[18:41:24] I looked at gatekeeping and the model indicates a propensity for aggression by gatekeepers
[18:41:48] jcorneli: sweeeet
[18:42:03] for context, the micro-survey approach is designed to measure gender breakdown in different parts of the reader -> new contributor funnel, we have only rolled out one for account registrations for now, but I expect we'll have more in the future
[18:42:19] a second model needs a very long survey to check
[18:43:21] it looks at alternatives to Wikipedia in terms of Cognitive surplus
[18:44:45] DarTar: What I want to check is if there is gender asymmetric - hyperbolic discounting
[18:45:05] problem is a typical survey has around 480 questions
[18:45:57] that's one of the reasons why I believe we're moving away from surveys for reader/editor demographics :)
[18:46:12] OrenBochman: do you have a short project description in the Research index?
[18:46:16] or is it too premature?
[18:46:45] I'm looking to create an adaptive survey to reduce it to around 20 questions
[18:47:08] I planned to present this at the diversity conference.
[18:47:14] For those who don't know, the "Research Index" is a place where we list out all of our internal, WMF projects and ask external researchers to list their projects as well. See http://meta.wikimedia.org/wiki/Research:Index
[18:47:18] but I don
[18:47:21] looks like things are slowing down. anyone have any more questions for researchers?
[18:47:27] I don't think I'll be going
[18:48:08] anyhow I am currently looking at gendered bias in language
[18:48:24] Haitham and HenriqueCrang: can you talk about what you've learned about Brazil and Global South in general? Has there been any major improvement since WMF first started looking into improving Global South participation?
[18:48:37] I'm creating a tool that works like sentiment analysis but for gendered language
[18:49:28] I'll be putting that up in a few days on the web
[18:49:41] Pine, we can't yet point to any improvements directly linked to the WMF action here in Brazil
[18:49:57] halfak: is there a researcher assigned to what was E2?
[18:50:09] I'm curious about who's using WikiMetrics, but I know that is less a researcher question and more a general analytics question
[18:50:18] Pine: there are some high priority parts of E2 that we're supporting
[18:50:24] Sadly there wasn't any significant improvement, but what we have now is a better understanding about where the key regions are, and this helped forming our new Global South strategy. https://commons.wikimedia.org/wiki/File:WMF%27s_New_Global_South_Strategy.pdf
[18:50:25] I've used it once or twice
[18:50:37] Beta and Flow in particular
[18:50:41] We have seen some rates stop going down, but we can't affirm that there was an explicit connection between this data and the catalyst program
[18:51:07] ok. Haitham what about India?
[18:51:31] as compared to Brazil?
[18:52:01] I have to run, thanks folks (and thanks Pine for putting this together)
[18:52:09] Thanks for participating!
[18:52:20] but we can really see some rates improving after the beginning of the program, like in: https://meta.wikimedia.org/wiki/Programa_Catalisador_do_Brasil/Annual_Stats#Analysis
[18:52:31] bye DarTar !
[18:52:34] ciao
[18:52:36] not sure why WikiMetrics requires cohorts to be from one wiki
[18:53:03] our events often have people editing multiple projects
[18:53:23] OrenBochman: we have that request in our backlog
[18:53:29] the last one was commons, en.wiki and he.wiki
[18:53:32] Pine, Most of the work in India is being supervised now by CIS, and we believe that they are able to do a better job as they have a better understanding of the local communities. I believe the same model can be replicated in other places too.
[18:54:14] Haitham: ok but who is gathering data on their outcomes and is it being compared with data from Brazil?
[18:54:17] Pine, we hope to run some controlled experiments this season to be able to connect the actions performed with changes, since historically ptwiki numbers are pretty unstable
[18:55:32] DarTar: regarding https://meta.wikimedia.org/wiki/Research:Index - do I just add a new page under projects ?
[18:55:55] Pine, I started conversations with the guys from CIS in India and we really want to share learnings and analysis
[18:56:12] CIS does the data gathering on its own, you may have an updated version at http://geohacker.github.io/indicwiki/. Henrique might be the best to tell about Brazil.
[18:56:24] OrenBochman: DarTar's gone, but yes, you can just create a new research project page.
[18:56:25] OrenBochman: I can field that. Yes. The best way is to use the input box on the lower left of Research:Index.
[18:56:41] but now we don't have any comparative data, and I don't know if we are going to, since we have such different realities in both countries
[18:57:07] see y'all later!
[18:57:10] ok
[18:57:10] OrenBochman: On https://meta.wikimedia.org/wiki/Research:Index, there's a header in the lower left for "Create a new project page"
[18:57:17] I saw that
[18:57:21] Awesome.
[18:57:22] I'll add it
[18:57:49] hasta sumanah :)
[18:57:51] halfak: are you still doing the survey on your tool
[18:57:53] Woops. too late
[18:58:00] HenriqueCrang: if WMF is going to put a bunch of money into a Brazil consultant program I think it should have a baseline out of India and do comparisons as Brazil moves forward.
[18:58:07] OrenBochman: Finished a couple of weeks ago.
[18:58:27] alright everyone, I hereby call this meeting to a close. We hope to have another next month, so keep your eyes on the wikiresearch-l mailing list!
[18:58:28] I was moving again
[18:58:40] and thanks again to Pine for organizing this.
[18:58:42] J-Mo: you are two minutes early!
[18:58:47] but ok
[18:58:54] Two minutes for goodbyes.
[18:59:01] lol.
[18:59:04] And yeah, cheers to Pine.
[18:59:10] ashaw: I'd like to chat about your paper on the gender gap surveys
[18:59:10] thanks everyone!
[18:59:20] I look forward to the research newsletter
[18:59:26] ;-)
[18:59:28] Thank you Pine for organizing this.
[18:59:32] J-Mo: will you post logs please?
[18:59:45] mmm. where?
[18:59:50] Hi Pine, comparing Brazil and India is possible, but very challenging
[18:59:52] never done that.
[19:00:17] J-Mo: https://meta.wikimedia.org/wiki/Office_hours#Office_hour_logs
[19:00:18] (but yes, I will do so!)
[19:00:26] perf. thanks
[19:00:34] challenging not because of technical reasons, but because of different context and multiple aspects that should be taken into account
[19:00:36] I made a copy of the logs, will see where other folk post them and do the same.
[19:00:47] Haitham: https://meta.wikimedia.org/wiki/Office_hours#Office_hour_logs
[19:00:57] ok, see you all.
[19:00:58] one more question - anyone working with R
[19:01:06] J-Mo, if you don't have a complete log to post, let me know. I have one.
[19:01:09] See you all! Sorry I was unable to join video this time around.
[19:01:20] I'm going offline also. Bye!
[19:01:27] OrenBochman, I have a couple minutes. What's up?
[19:01:31] I'm hacking a Wikipedia API library
[19:01:31] OrenBochman: I know that IronHolds and Maryana work with R
[19:01:42] * halfak runs for the hills
[19:01:45] what I would agree with is for us to have a conversation with CIS (also a WMF grantee) and discuss with them how we can set common baselines and methodologies to have better understandings about the big picture
[19:01:49] Pine, we are willing to compare Brazil's evolution to Brazil itself, isolating experiments and looking for results with metrics based on its own community
[19:02:32] cool
[19:02:47] I'm making the basic stuff that would be needed by a robot
[19:03:04] and for building corpuses
[19:03:43] if you are interested in following the evolution of ptwiki numbers and events that happened in the past you can check out our timeline (sorry, only in Portuguese): https://tools.wmflabs.org/ptwikis/Linha_do_tempo
[19:04:03] and later I'll be adding support for work directly with the DB as well
[19:04:10] Why access the API directly from R?
[19:04:36] simplifies the app
[19:04:51] Gotcha. Did you have a question about R?
[19:05:00] not really
[19:05:20] someone said R
[19:05:21] just wondering if other people are interested in or doing the same thing
[19:05:24] I did
[19:05:26] * Ironholds surfaces in a flash of scan()
[19:05:45] so, R and MW's API are....interesting bedfellows.
[19:05:48] Ironholds is your best bet for R interest.
[19:05:52] just in terms of finding something both of them like.
[19:06:00] R has support for twitter API
[19:06:05] totally
[19:06:07] but, ferinstance, have you tried to use R's json library?
[19:06:16] Yeah. My thoughts exactly. Tables lend themselves to R data.frames.
[19:06:17] teeeerrible unicode support.
[19:06:21] I've used it with twitter
[19:07:01] Ironholds: could be the reason for some issue I've seen today
[19:07:05] With MW I've found it works ambivalently or terribly depending on project. You don't want to go near Wikidata with R's JSON support
[19:07:06] ooh
[19:07:39] there are a couple of JSON libs as it is
[19:08:12] I've been getting data with PHP from the DB and converting to JSON to display with D3.JS
[19:08:13] really? I've only encountered rjson or jsonr or whatever it is
[19:08:29] hah! that's actually my trick too; writing a PHP parser in R is trivial.
[19:08:47] scan(), set of regexes, then read the resulting object in as [whatever object type you want], specifying the newly-inserted delimiters.
[19:08:58] anyway, we may be taking the conversation on a tangent :)
[19:09:10] * OrenBochman is shocked
[19:09:17] but, if you're doing cool MW-related things with R, please do drop me a note; it's nice to know there's more than one of us
[19:09:56] Ironholds: you are not alone, R rules
[19:10:08] I'm working on a neutrality lab using shiny
[19:10:24] programmRs unite! *shakes fist at halfak and other python-centric people*
[19:10:25] it still has some bugs
[19:10:29] jonas_agx: awesome! Email me too :D
[19:10:52] one issue I've noticed
[19:10:53] Ironholds: do you publish things somewhere?
[19:10:56] OrenBochman: I tried out shiny a while back when it was pretty unstable; seemed fun as a proof of concept, but not there yet. revisiting it is on my to-do list.
[19:11:05] jonas_agx: code, datasets or interpretations of datasets?
[19:11:20] It is much more stable now
[19:11:49] neat :D
[19:11:54] Cool, Ironholds -- I use R Python and JS --- what is your email?
[19:12:43] guys, Homem is a Brazilian who is also willing to perform research using statistical tools like R
[19:12:53] btw anyone tried using Concerto for making an adaptive survey ?
[19:13:04] jonas_agx: okeyes@wikimedia.org
[19:13:21] that looks familiar
[19:14:24] Okay!
[19:14:28] Anyhow I do figure out how to make a short survey and put it online using concerto + shiny
[19:14:41] Homem: bem-vindo
[19:14:57] how can I get editors to it ?
[19:15:29] jonas_agx: thanks. :)
[19:15:34] particularly women editors ?
[19:15:49] i have to leave now guys
[19:15:50] bye!
[19:15:53] bye
[19:15:56] bye
[19:15:57] HenriqueCrang: bye
[19:16:10] Maryana_lunch: ping
[19:16:23] jonas_agx: do you use R for research?
[19:17:18] Homem: yes, I do -- mainly academic things
[19:17:35] * OrenBochman HenriqueCrang: were you at Hong Kong this year ?
[19:18:29] jonas_agx: what type ?
[19:18:30] jonas_agx: R is well accepted in the academic environment
[19:19:00] jonas_agx: this is why I'll try to learn R
[19:19:51] OrenBochman: I've worked on predictive models for oil extraction -- mostly on statistical distributions
[19:19:56] Homem: There are courses on Coursera opening soon
[19:20:03] neat
[19:20:19] Homem: not so accepted -- people prefer pretty GUIs
[19:20:39] R has some nice GUIs as well
[19:20:47] Yeah, Coursera is an awesome place to begin
[19:20:47] GGobi, Rattle
[19:21:16] jonas_agx: yes, in Economics we have gretl
[19:22:03] (I do my research in Economics)
[19:22:19] Yes OrenBochman but things like SAS are much more "view friendly" -- but it depends from field to field
[19:22:54] Homem: cool gretl is pretty easy and powerful
[19:22:55] jonas_agx: SAS, SPSS...
[19:22:58] I've been doing some exploratory analysis with R
[19:23:13] and I have not used some of the stronger tools yet
[19:24:03] @J-Mo: sorry I disappeared, I was writing you an email
[19:24:06] I don't think anyone teaches SAS or SPSS anymore
[19:24:07] Talk to you tomorrow
[19:24:58] so how useful is a micro-survey
[19:25:04] Just one question ...
[19:25:09] yes, but a lot of people from health sciences use SAS -- it's like matlab
[19:26:01] There are some new textbooks about health science using R
[19:26:14] and on most other topics too
[19:26:24] jonas_agx: I used SAS at Ipea (a Brazilian think tank)
[19:27:30] Ipea is a huge place for research, mostly covering social stuff, isn't it, Homem?
[19:28:42] yes, it is a gov. think tank.
[19:32:28] I've noticed the term genderqueer being used by WMF staff as well as being referenced on the micro-survey page. Is that a more politically correct term
[19:33:10] OrenBochman: what *exactly* is genderqueer?
[19:34:04] as I understand it so far it refers to people who don't define themselves as either male or female
[19:34:16] But I'm not 100% sure
[19:34:29] okay, thanks anyway
[19:35:08] hmm I got it wrong
[19:35:37] make that not man or woman
[19:35:46] ;-)
[19:37:27] * jonas_agx viewing a gender deadlock
[19:41:44] OrenBochman, it's complicated, but it is a popular word in some areas to describe our gender identity. I use the word myself
[19:49:29] well, bye folks -- it's cool.
[19:56:32] https://en.wikipedia.org/wiki/Genderqueer natch