[14:32:15] o/ dsaez
[14:32:24] Good afternoon!
[14:32:51] I'm wondering if I can help you get set up with https://meta.wikimedia.org/wiki/IRC/Cloaks
[14:32:52] :D
[14:32:58] good morning
[14:33:03] Let me see...
[14:33:24] https://docs.google.com/forms/d/e/1FAIpQLSc9f95s72cfLD-pTZrTxCD-Kyb9umm5XVz_JOWtazADYkjpvA/viewform#start=openform
[14:33:30] Looks like you need to submit this.
[14:33:46] As part of the process, you'll need to edit your own user page to demonstrate ownership of your Wikimedia account.
[14:35:26] On the third screen it will ask for a confirmation diff.
[14:35:30] * halfak looks for his own diff.
[14:37:44] Weird. I have no idea where mine is. But I'll just make a new example. :)
[14:38:14] ok
[14:39:42] https://meta.wikimedia.org/w/index.php?title=User:EpochFail/sandbox&diff=17126612&oldid=17126611
[14:39:51] Whoops. "Clock" ha
[14:40:07] https://meta.wikimedia.org/w/index.php?title=User:EpochFail/sandbox&diff=17126616&oldid=17126612
[14:40:28] perfect
[14:41:07] It should take a few days to set up. One thing I forgot to ask about: is 'dsaez' registered to you?
[14:41:53] You can submit this request, but we'll want to get freenode to recognize you too asap. :)
[14:42:07] That's just a couple of commands in IRC.
[14:42:28] Yep, I just sent the msg to NickServ
[14:42:33] great!
[14:46:14] ok, done, thx!
[14:48:41] \o/
[14:48:56] My client still says you haven't identified with freenode.
[14:49:35] You can probably configure your client to auto-identify on connection, but for this session you might have to identify manually.
[14:51:59] let me see, I'll try with Pidgin
[14:53:12] o/ leila
[14:54:01] is Adrian on IRC?
[14:55:06] halfak: how do you check whether someone is identified or not? /whois doesn't show it
[14:55:48] My whois has "[halfak] is logged in as halfak"
[14:56:11] got it
[14:57:57] done
[14:58:04] hey halfak
[14:58:31] \o/ dsaez, looks good.
[14:58:38] halfak: I /think/ he's not, but I can ping him and tell him to join. Is it related to his question?
[14:59:07] Yeah. Wanted to formally invite him to a biweekly meeting I have with the analytics folks to talk about engineering data-intensive systems.
[14:59:11] ottomata suggested it.
[14:59:42] ottomata thought of it because of the stat1005 capacity discussion on the research list.
[15:01:25] halfak: sure. in that case, I guess the best option is you emailing him. I'd suggest keeping Markus K. and me cc'd.
[15:02:07] Bummer. Oh well. I suppose he'll get an email for the meeting invite :)
[16:21:56] hey dsaez. two questions for you: shall we connect in 10 min or so re expenses? also, I added you to an event on 8/30 as optional. Can you check if you've received the invitation? (it's an external invitation and calendar doesn't behave as expected)
[16:22:31] 1) ok
[16:23:00] 2) no, I haven't received it
[16:23:08] hmm. ok.
[16:27:21] dsaez: I asked Naroa from UNICEF to add you to it. quick summary of the event: we're starting to discuss with UNICEF whether there is an opportunity for collaboration on the topic of information poverty. On our end, we are interested in working on the problem of what information people are seeking that is not available to them (in their language or at all) and how/if Wikipedia/media should contain and surface that information. You may
[16:27:52] cool!
[18:09:51] o/ lzia. Can you tell me about the status of our infrastructure for randomly sampling readers for a survey? Were you able to generate a nice sample for the readers survey, and how much of a pain in the butt was it?
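
For reference, the registration and identification steps in the NickServ exchange above come down to a couple of standard freenode NickServ commands. This is only a sketch; the password and email are placeholders, and the exact syntax your network's services expect may differ:

    /msg NickServ REGISTER <password> <email>
    /msg NickServ IDENTIFY <password>

Once a nick is identified, /whois <nick> includes an "is logged in as" line, which is the check halfak mentions above.
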
[18:10:22] I'm hoping to get a sense of how much difficulty I should plan for if I go in that direction.
[18:10:23] :)
[18:12:39] heading to lunch
[18:12:41] back in a bit
[19:57:32] Forgot to say I'm back now :)
[20:33:37] hey halfak. there is infrastructure in place for doing random sampling; how random the sampling is in practice is something I would check for the specific data you want to collect.
[20:33:49] is there a specific problem you're looking at, halfak?
[20:34:18] How difficult was it to work with?
[20:34:41] I've got some potential collaborators who want to use survey methods to explore the economic value of Wikipedia.
[20:35:11] They'll use a calibrated set of questions and a short-term experiment to figure out how much people are willing to *pay* to read Wikipedia.
[20:35:29] Mostly by asking them how much they'd need to be paid in order to *not* read Wikipedia.
[20:36:01] I figure it's got some pretty high PR value, so I'm trying to pitch it to comms, but if it's going to be really painful to set up, then I won't waste people's time.
[20:52:13] leila, ^
[20:52:27] * leila reads
[20:53:09] how much control do they need over sampling?
[20:53:37] if you need *some* sampling approach and you can live with that, the infrastructure is in place and you can run it relatively easily if the reading team is on board.
[20:54:08] you also need a privacy statement from legal for this, halfak
[20:54:39] In general, I would set aside 1-2 weeks to set up the survey if you already know your questions, halfak.
[20:54:54] leila, thanks. that's good to know.
[20:55:31] It sounds like the major problems are sampling constraints and legal review. If that's right, it'll help me try to convince Comms that they want to help drive this >:)
[20:55:39] halfak: re this specific question, there is another way to approach it which doesn't require running surveys, and that is by estimating how much companies are paying for SEO and for Wikipedia in that context.
[20:56:24] leila, they have an argument that that approach drastically underestimates the value, which I don't know how to recapitulate.
[20:56:26] basically, how much people are willing to pay now is a good measure of the value of Wikipedia
[20:56:36] right.
[20:56:51] leila, so the reason, I guess, is that the current actualization of a market doesn't actually represent the value well.
[20:58:47] halfak: I think more than the sampling and legal resources, the research resources on your end will be very costly for this research. :)
[20:59:21] halfak: as you know, this can turn into a very tricky topic. It may attempt to put monetary value on a subset of pages, which in turn can put monetary value on the edit contributions.
[20:59:27] Surely. But it doesn't sound like there's much engineering cost from what you said. Right?
[20:59:44] * halfak is not ready to defend someone else's methodology
[20:59:52] but I'd be happy to invite them to an RG so you could review
[20:59:57] Again, if you just want to use the current QuickSurveys infrastructure (with all its limitations), the eng part is not expensive.
[21:00:08] I think you should probably understand their methods before raising feedback too :)
[21:00:50] Thanks. That's exactly what I need right now. More plans, if there are any, will be on r-internal + the RG meeting :)
[21:01:02] halfak: I'd say let's do this in the second half of the year. :D there is a LOT going on right now (we have 4 programs, and already a ton of distractions for the work I do ;)
[21:01:19] yeah. but happy to discuss in RG and share high-level input.
[21:02:02] good luck with it, it's a good question. I had some discussions with Al Roth about it some months ago (this is not within his expertise, but he was happy to chat about it). generally the topic is dear to my heart. ;)
[21:03:53] :) Good to hear. No worries on capacity. I know how that goes. We're on a big timescale here.
[21:04:06] Happily, these researchers came to us when this was just an idea.
[21:04:18] They didn't bat an eye when I told them I couldn't meet with them again for more than a month :D
[21:06:07] that's always reassuring to hear, halfak. :D
[21:07:14] halfak: on the topic of sampling, I think we should generally be much more careful. the system we're using now is doing an okay job in a very broad sense, but depending on the population we want to target, we should be more careful.
[21:07:55] Basically, if the distribution you're interested in is power-law and the current approach is choosing uniformly at random, the data we get is pretty problematic, halfak.
[21:08:20] Oh! It's sampling per-pageload?
[21:08:25] Not per-device?
[21:08:27] I would want to do a lot of checks on the data sanity and sampling before doing anything with that data or running experiments. That takes the longest time in my experience.
[21:08:44] * leila pulls up her notes to answer the question
[21:09:20] so, some notion of device is sampled (there are details to this). then for each page-load, you have a chance to see the widget if you're in the preselect pool.
[21:10:47] Are these notes on wiki?
[21:10:53] * halfak does the wikimedian switcheroo
[21:10:56] :D
[21:10:58] they're not. I'm digging in my papers.
[21:12:18] there are nuances. So if the skin is Minerva, you only see the survey under certain conditions. if it's not Minerva, you see it on the main page, and when DNT is off.
[21:13:21] * halfak doesn't even know what Minerva is.
[21:13:35] To the preferences!
[21:13:59] Oh! It's mobile?
[21:15:30] mobile and desktop.
[21:18:31] Gotcha. Just looks like the mobile web UI.
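
To make the two-stage scheme leila describes concrete, here is a minimal Python sketch: a device-level preselect pool, then a per-pageload chance to show the widget, gated by the skin/main-page/DNT rules mentioned above. It only illustrates the structure as described in the log and is not the actual QuickSurveys code; the function names, rates, and eligibility details are placeholders.

    import hashlib
    import random

    PRESELECT_RATE = 0.001    # hypothetical fraction of devices in the preselect pool
    PER_PAGELOAD_RATE = 0.1   # hypothetical chance to show the widget on each pageload

    def in_preselect_pool(device_token):
        """Hash the device token into [0, 1) and compare against the pool rate."""
        h = int(hashlib.sha256(device_token.encode()).hexdigest(), 16)
        return (h % 10**6) / 10**6 < PRESELECT_RATE

    def pageload_eligible(skin, is_main_page, dnt_enabled):
        """Stand-in for the eligibility rules mentioned in the log: Minerva only
        under certain conditions; otherwise the main page, and only when DNT is off."""
        if dnt_enabled:
            return False
        if skin == "minerva":
            return True  # placeholder for the "certain conditions" on Minerva
        return is_main_page

    def show_survey(device_token, skin, is_main_page, dnt_enabled):
        """Two-stage sampling: device-level preselection, then a per-pageload draw."""
        return (in_preselect_pool(device_token)
                and pageload_eligible(skin, is_main_page, dnt_enabled)
                and random.random() < PER_PAGELOAD_RATE)

One consequence worth noting, per halfak's per-pageload vs. per-device question: responses gathered this way skew toward heavy readers, so if reading activity is power-law distributed, treating the responses as a uniform sample of readers can be misleading.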