[13:07:35] morning all!
[13:10:16] hey Ironholds
[13:10:54] hey halfak :). How goes?
[13:11:38] Not too bad. Just scored some government funded travel and registration for a conf. in Nov.
[13:11:49] * halfak exploits loopholes in R&D travel policy.
[13:13:46] awesome!
[13:14:33] oh, halfak; Maggie is telling me there's a MIT conference in NY and can I recommend any researchers to talk about strategies for identifying gender. Is Nate into that? It rings a bell but I don't know him that well
[13:14:40] (obviously I'm gonna suggest Aaron S. Because duh.)
[13:16:10] and, is it one of your interests? Because I know how much you like travelling :D
[13:19:01] Not something I'd be able to speak on.
[13:19:17] *nods*.
[13:19:19] Any recommendations?
[13:19:25] If you're looking for "identifying gender" that might be different from what ashaw does.
[13:20:00] "identifying gender or strategies for using gender information" is the desc I've been given.
[13:20:08] He trends more towards the latter I think, yeah.
[13:21:09] Gotcha. Yeah. I think that ashaw would be good for that.
[13:21:20] Not 100% on Nate, but it's always good to ask.
[13:21:47] *nods*
[13:22:26] it's been a weird, collaborative few days. I'm a fan.
[13:24:00] how's your morning going?
[13:25:05] Not bad. I had a rough weekend. Was sick as soon as I landed on Friday. I only got about 7 hours in over the three days when I needed more like 20. :(
[13:25:16] Gotta figure out how to make up 13 hours of work this week.
[13:26:57] aw man :(. *hugs*
[13:27:26] honestly your 27 hours are equal to most peoples' 40, so I wouldn't sweat it.
[13:29:01] heh. If only I could get that time machine working.
[13:29:29] Say, jschneider sent me an email asking about accessibility of Wikipedia to blind readers.
[13:29:43] Do we have someone responsible for accessibility issues?
[13:31:56] not around blindness, unfortunately :(.
Mostly we rely on people like quiddity to scold the features teams, and Graham/Courcelles to report screen reader issues :(
[13:33:49] Accessibility seems like a bit of an issue for us. Hmm.
[13:34:10] ...given that our missing is about the accessibility of information.
[13:35:22] Oh well. Now to figure out what the hell I am going to present tomorrow.
[13:36:09] I haven't been able to finish off a project in months. One of 'em is getting some polish.
[13:36:37] oh man, I know that feel
[13:36:51] I've got my circadian rhythms rewrite sat here with most of the code finished and no idea where I'm going
[13:37:05] I need to take a few hours to just go: okay. What story do I want to tell? What data would validate or invalidate that story?
[13:37:15] I find that thought process really helpful for working out where I need to take something.
[13:56:38] halfak: when they ask about accessibility point them to thedj
[13:57:01] Thanks Nemo_bis
[13:57:52] halfak: sorry for interrupting right now, but would you recommend pandas to work on CSVs in python?
[13:58:27] Nemo_bis, yes. If you are doing tabley work in python, pandas is the ticket.
[13:58:39] ok thanks
[14:01:24] oh man. I keep meaning to check pandas out but every time I do pandas.io.sql is broken in some way.
[14:01:41] but we've upgraded stat1002-3 so it might be fixed now. I think it was fixed one version upstream from what we used to have.
[14:04:17] Ironholds, :P just use MySQL's default output format.
[14:04:29] They are clean, nice TSVs
[14:04:45] I've been meaning to write a trivial wrapper for MySQL's TSVs.
[14:04:57] fair point!
[14:05:06] (Python's CSV library is finicky and complex to configure)
[14:05:21] *finicky for large datasets
[14:08:55] And to get reasonable UTF-8 one uses https://pypi.python.org/pypi/unicodecsv/0.9.4
[14:10:40] Nemo_bis, not sure. I've never had trouble with python's CSV with regards to Unicode.
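The trivial wrapper for MySQL's TSV output mentioned above could look roughly like the sketch below. It assumes the file came from the mysql client in batch mode, where (per the MySQL client documentation) a newline, tab, NUL, or backslash inside a value is written as \n, \t, \0, or \\, and a SQL NULL is printed as the bare word NULL. The names `unescape` and `read_mysql_tsv` are made up for illustration and do not belong to any library discussed in the log.

```python
import io

# Escape sequences assumed to be used by mysql batch-mode output.
_ESCAPES = {"n": "\n", "t": "\t", "0": "\0", "\\": "\\"}


def unescape(field):
    """Turn backslash escapes in one TSV field back into the real characters."""
    out = []
    chars = iter(field)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            # Unknown escapes are kept verbatim rather than guessed at.
            out.append(_ESCAPES.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


def read_mysql_tsv(lines, null="NULL"):
    """Yield one dict per data row, keyed by the header row's column names.

    Caveat: a genuine string value "NULL" cannot be told apart from SQL NULL.
    """
    lines = iter(lines)
    header = next(lines).rstrip("\n").split("\t")
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        yield {
            name: (None if raw == null else unescape(raw))
            for name, raw in zip(header, fields)
        }


if __name__ == "__main__":
    sample = io.StringIO("page_id\ttitle\n1\tFoo\n2\tNULL\n")
    for row in read_mysql_tsv(sample):
        print(row)
```

Every value comes back as a string (or None), so type conversion is left to the caller. For table-style analysis the same file can also be loaded with pandas.read_csv(path, sep="\t"), though by default that call does not undo the backslash escapes.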
[14:10:56] I might just be doing a work-around without realizing that there is a better way.
[14:11:27] The finicky bits of CSV reader in my experience are column and row limits imposed by C types of the underling C-based implementation.
[14:11:33] (I thinK)
[14:16:28] I mostly have troubles fitting an algebric view of matrices into python's view of the world
[14:17:09] I see. That's something I don't do much of in python, but I suspect that numpy and scipy will have a lot for you.
[14:28:22] Ironholds, you're going to like what I'm going to talk about at the showcase.
[14:28:35] It's more survey than reporting on data anlysis.
[14:29:23] Basically, it's going to be an argument for how I think that things like Wikipedia's communities should be seen by us and technologists.
[14:29:43] awesome!
[14:29:57] Is it the sort of thing you should write up as an essay on your site so I can link people to it? ;)
[14:44:36] It is.
[14:44:53] So... Not sure when I'll be able to get that in. Maybe I can polish it off tonight. :\
[14:45:04] psht. Give the talk, write it up after.
[14:45:14] that way it'll have feedback included
[14:45:24] +1
[14:52:12] Ironholds, do you know if the CLs have a channel?
[14:52:22] they don't, to my knowledge
[14:52:36] but if you wanna chat to them they have a general coworking hangout and suchlike a couple of times a week
[14:53:41] It's surprising that the CLs aren't generally findable on IRC.
[14:55:04] well, quiddity, but he's asleep
[14:55:08] elitre is around sometimes
[14:58:28] Ooh. This Q could go to design too.
[14:58:33] * halfak hops over to -design
[15:32:16] good morning Nettrom, leila :)
[15:32:28] morning Ironholds
[15:32:37] good morning, Ironholds, Nettrom. What about halfak?
[15:32:47] he was already here!
[15:32:48] good morning halfak
[15:32:50] :D
[15:32:59] :P
[17:08:44] halfak: ping me when you have a couple of minutes to chat about WIkiGrok schemas
[17:08:54] ping DarTar
[17:09:01] cool, hangout?
[17:09:03] anyone want to volunteer for the SSL Timings, btw?
[17:09:15] DarTar, sure. Want to call when you find some space?
[17:09:31] Ironholds, queued
[17:09:34] halfak: I’ll do this from my desk, the office is still pretty empty
[17:09:37] danke!
[17:10:06] kk
[17:10:08] Ironholds: yeah, I was telling Leila we have more performance related stuff in this quarter and I’d be worried if we spent too much time on projects from the last quarter
[17:10:31] DarTar, then you should've chipped in on the thread ;p
[17:10:38] but halfak is the new head of performance research
[17:10:55] I think he prefers the title "Speed King"
[17:10:59] Ironholds: I have probably 25 threads I’m supposed to chime in, seriously :(
[17:11:05] I'll accept the title of "Speed King"
[17:11:10] amen to that
[17:11:23] Yeah. I think I'm two pings deep on a couple of threads with DarTar
[17:11:23] Aaron Halfaker, Lord of Lightspeed, Duke of Dashing, Earl of E
[17:11:35] well, e, I guess
[17:11:44] DarTar, why won't you hang out with me.
[17:11:52] Expediency
[17:11:54] and in the red corner, Dario Taraborelli, Marquess of Multitasking ;p
[17:11:58] because I’m talking nonsense on IRC, coming
[17:38:51] halfak: I’m bringing up the page_id issue with the mobile engineers this afternoon, thanks for the input
[17:40:07] :)
[17:41:41] halfak, Ironholds, the CLs do actually have an IRC channel! #wikimedia-cep (It's been silent for weeks though)
[17:42:12] * halfak joins up
[17:45:56] DarTar, fyi, NSF funding came through for GROUP 2014. I'll be attending Nov. 10-12.
[17:46:00] \o/
[17:46:10] nice
[17:47:24] At no cost to the wikipedia tax-payer -- erm... donor.
[17:47:32] (thank you US taxpayers)
[17:48:54] nobody funds me for NUTHIN.
[17:49:00] say – halfak, unrelated: I got a confirmation from david for the showcase, they can do 30 or 40 mins including QA and they already added an abstract to https://www.mediawiki.org/wiki/Analytics/Research_and_Data/Showcase#October_2014 Can you do the same when you have a moment today so I can send out the announcement?
[17:49:06] on that note I'm seriously debating applying for a Berkman fellowship this year.
[17:49:17] I almost certainly won't get it but it'd be interesting as a process (and interesting if I did)
[17:49:34] Ironholds: I saw the call, you should talk to some of our former Berkers
[17:49:42] DarTar: Will do. Still putting the final touches on my story.
[17:49:45] yeah, I'm gonna grab lunch with SJ
[17:49:56] halfak: thx
[17:50:08] DarTar: I'll be presenting more position-paper than data-based-research-report.
[17:50:21] Ironholds: the irony is that the other night I was considering applying too
[17:50:34] DarTar, wouldn't that require us to occasionally let you do research though?
[17:50:40] Heh. Are we all trying to become fellows now. :P
[17:50:42] I thought we'd agreed you were a professional catherder now? ;p
[17:50:49] sorry: SENIOR professional catherder :D
[17:50:51] Ironholds: that would require taking an unpaid leave
[17:50:54] yeek
[17:51:00] if you do that...hang on, I have a gif for this.
[17:51:07] http://workingatanonprofit.tumblr.com/post/96538987431/when-you-return-from-vacation-of-any-length
[17:51:11] which is why I am not even considering it
[17:51:23] lulz yeah
[17:51:41] ^ that gif
[17:52:02] halfak: I have a project I care a lot about that I’ll never get a chance to even get started in my current position
[17:52:17] sounds familiart
[17:52:25] right?
[17:54:08] see, I just wanna meet new people who are better at thinking about my problems than I am
[17:54:33] I don't really get projects. I get "I wonder what would happen if I looked at X?" and it turns out the answer is sometimes interesting and then I have a project.
[17:54:50] (and then I get someone who can write LaTeX to paperfy it, and it gets published, and I feel mildly fraudulent to have my name on it ;p)
[17:55:14] Ironholds. Sounds like you are looking for an academic advisor.
[17:55:47] It's common for labs to set up formal mentorship.
[17:55:51] that would require me to go into academia, and they tend to not let you do that for free and on a part-time basis, as I understand things
[17:55:59] you will know this much better than I, mind!
[18:01:52] Indeed. Ironholds, having an advisor is sort of like being an indentured servant.
[18:02:22] The advisor only wants to work with you if you'll help them forward their agenda.
[18:02:28] But you get training out of it.
[18:02:30] do you need an indentured servant?
[18:02:34] I like your agenda
[18:02:43] it mostly consists of deep-fried beer and chi-squared tests.
[18:02:47] well, and biking.
[18:02:49] A good advisor will side with what benefits you the most -- at the cost of their agenda.
[18:02:50] :)
[18:21:50] * Nemo_bis notes there is still no sign of henna
[18:26:25] hey halfak, have you committed to do any volunteer dev work for https://meta.wikimedia.org/wiki/Grants:IEG/Reimagining_Wikipedia_Mentorship ?
[18:27:02] Yup. I'll be helping them with data collection when we get there.
[18:27:05] J-Mo, ^
[18:27:41] cool. also: halfak is the data from that project part of Gabe's diss?
[18:30:31] I'm asking these questions mostly for the sake of coming up to speed on the project. I'm going to be donating some HostBot cycles, and am curious generally about timeline and division of labor. But I'm meeting with Jethro later this week, so I can get the deets from him.
[18:42:50] J-Mo: Not sure on the situation with Gabe's PhD and what data is relevant.
[18:43:05] I imagine that this data will be relevant.
[18:43:11] But I'm just dreamin'.
[19:15:25] DarTar, just posted summary. Will you take a quick look and talk to me about any concerns?
[19:15:25] https://www.mediawiki.org/wiki/Analytics/Research_and_Data/Showcase#October_2014
[19:20:03] halfak, amusingly, David's work?
[19:20:17] yeah, that idea came to me in a dream last night. Looking at communcations differences on-wiki by gender or associated values
[19:20:38] ....oooooh. I wonder if he'd be interested in looking at interactions between people from different /countries/.
[19:20:48] * Ironholds will watch that attentively
[19:21:03] +1. That's what the editor interaction dataset stuff is all about.
[19:21:05] * halfak gets links.
[19:21:18] https://meta.wikimedia.org/wiki/Grants:IEG/Editor_Interaction_Data_Extraction_and_Visualization
[19:21:50] https://meta.wikimedia.org/wiki/Research:Ideas/Editor_profiles_and_interactions
[19:22:33] His stuff is on talk page. What if we include reverts and content disputes that are only evident by the conflicting changes made to an article.
[19:22:35] :)
[19:24:10] halfak: thanks – I need to grab lunch, I’ll look into this in the afternoon (I just finished talking to Protonk about his microcontrib IEG proposal, so happy he’s going to work on this)
[19:24:23] lots of researchy IEGs!
[19:25:11] YuviPanda: totally, it’s mostly halfak’s fault
[19:25:19] yeah, I could imagine :)D
[19:25:42] but now that I know that it’s ok to point academics to IEG I’ll open up the floodgates on my end too ;)
[19:25:44] almost 700 different queries have been tried on Quarry!
[19:25:58] yeah, but I wonder if IEG will be enough funding for 'em...
[19:26:08] but then I've no idea about the scale of IEG funding vs 'regular' academic funding...
[19:26:42] depends, I’ve seen academic grants go from a couple of 10Ks to several mln grants
[19:26:52] gotta run, bbl
[19:26:55] YuviPanda, one of the IEG's this round is a couple of profs.
[19:27:04] nice!
[19:27:12] https://meta.wikimedia.org/wiki/Grants:IEG/WikiBrainTools
[19:28:05] The prof who is running it will be on sabbatical, so this is supplementary income for him.
[19:28:07] wait that sounds suspiciously familiar... :)
[19:28:53] Either IEG or find a company in town (who will pay a boatload more) I think we have a closet Wikipedian in charge of the project.
[19:28:55] http://www.theguardian.com/us-news/2014/oct/14/zombie-santa-claus-st-paul-bathroom-girl-minnesota two-rood, you people are weird.
[19:29:17] Doesn't everyone do a zombie pub crawl?
[19:29:33] I've never done any kind of pub crawl, actually
[19:30:00] Nonsense. We have been to more than one pub in the same evening.
[19:31:20] hmn. I guess. I dunno, I sort of see pub crawls as a formalised thing.
[19:31:33] actually by that standard I went to a pub crawl in January. It ended....worst thing ever.
[19:31:45] * YuviPanda went to the one in Hong Kong after Wikimania there
[19:31:54] started at 7PM and for us ended at 11am the next day...
[19:35:46] we had the step crawl in London
[19:35:53] we didn't actually move but by the end you were basically crawling ;p
[19:35:58] that was a really, REALLY weird night.
[19:36:28] "step" crawl?
[19:37:17] so, I come out of a friend's shindig at 1am, find like 20 staff and volunteers sat on the steps in front of the hotel
[19:37:25] and am handed a glass of blended whisky and told "DRINK THIS"
[19:37:30] it ends after breakfast the next day.
[19:38:05] aah, that
[19:38:16] didn't we end up walking in pouring rain to get some alcohol?
[19:38:19] or was that a different day?
[19:39:31] Ironholds, ahh. I see.
[19:39:41] nope, same day
[19:39:45] I barely remember anything outside of the barbican.
[19:39:45] you, me, Siebrand, Keegan(?)
[19:39:56] Except that I owed YuviPanda a beer and I delivered.
[19:39:58] halfak, that was an exhausting weekend. Those evenings + my presentation..bleh.
[19:40:13] halfak: INDEED YOU DID! that was a nice night as well
[19:40:22] I've decided that my rule for Mexico, if I go, is: if I don't have my presentation finished by the submission date for the pitch, don't submit.
[19:40:50] Ironholds, just need to get better at writing a presentation the day before. :P
[19:41:03] halfak, dude, my presentation was AWESOME.
[19:41:16] Yes, but Ironholds brain tool the punishment.
[19:41:21] *k
[19:41:27] being awake for 26 hours? ...fair.
[20:40:47] alright, halfak: your concerns were well received and we’ll be working towards a different implementation (pending feedback from Maryana)
[20:40:54] re: WikiGrok
[20:41:03] woot
[20:41:16] * halfak feels fulfilled in this feedback experience.
[20:41:20] :)
[20:41:32] we’re also planning to brutally abuse EL for storing relational data
[20:41:53] which sounds like the only way to store the results, since people are not excited about building a MW table
[20:42:17] I might pick your brain for feedback once more once the new proposal is up
[20:43:19] +1 for relational schemas.
[20:43:41] That was going to be my nit-pick (that I decided to de-priopritize)
[20:43:48] *prioritize
[20:43:51] yikes
[20:44:00] in fact, ori had a preliminary implementation of nested schemas
[20:44:11] like, built into EL
[20:44:19] I wonder what the rates of EL data drop is...
[20:44:25] Now that would be interesting.
[20:44:33] * YuviPanda feels slightly icky storing *results* in EL
[20:44:39] YuviPanda, low so long as we don't suddenly get hammered with EL.
[20:44:41] YuviPanda: totally,
[20:45:12] in fact I’ve been pointing at the fact that we may still have to generate plain SQL tables from the results down the line
[20:45:27] if we want people (Product, Wikidatans, etc) to review the results
[20:46:09] but I’m not worried about losing data for now (and assuming EL as a whole doesn’t drops tons of events), given that we’re talking of a microscopic rollout
[20:46:16] also note that EL isn't 'security proof' - it's a simple GET with no auth, so anyone can put anything in there...
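A small sketch of the concern raised above: checking an event against a schema verifies its shape, not where it came from. The schema fields and the `is_valid` helper below are hypothetical; they are not the WikiGrok schema and not EventLogging's actual validation code.

```python
# Hypothetical schema: field name -> required Python type.
SCHEMA = {"page_id": int, "user_id": int, "claim": str, "response": bool}


def is_valid(event, schema=SCHEMA):
    """Return True if the event has exactly the schema's fields, correctly typed."""
    if set(event) != set(schema):
        return False
    # type(...) is ... (rather than isinstance) stops a bool passing as an int.
    return all(type(event[name]) is expected for name, expected in schema.items())


# An event produced by a real user interaction.
genuine = {"page_id": 12, "user_id": 345, "claim": "is a human", "response": True}

# An event made up by someone who has only read the schema.
forged = {"page_id": 99, "user_id": 1, "claim": "anything", "response": False}

if __name__ == "__main__":
    # Both events pass: validation alone cannot tell them apart.
    print(is_valid(genuine), is_valid(forged))
```

Telling the two apart would need something the sender cannot fake, such as a check performed on the server against the logged-in session.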
[20:46:48] YuviPanda: true, but again – given that this is for logged in users – we can use validation to deal with that
[20:47:16] only on the clientside! there's no serverside validation, and the schema is public, so it would be trivial for someone else to just randomly generate valid but bogus events.
[20:47:38] that’s right
[20:47:48] I’m not a fan of this pure EL solution, but it looks like this is the only way for me to collect data
[20:48:10] as folks are not excited about writing throw-away mediawiki tables
[20:50:08] * halfak wonders if we can have throw-away mediawiki tables stored in the EventLogging DB
[20:50:25] E.g. eventlogging with an internal call on the server-side.
[20:50:54] hmm, that makes me think that there’s one option we didn’t consider: the current throw-away labsdb tables
[20:51:52] aren't labsdb machines heavily firewalled away from prod?
[20:52:54] WIkiGrok is currently writing data into a custom DB on labsdb
[20:52:59] YuviPanda: ^
[20:53:09] oh lol
[20:53:20] Did they clear that with ops/sean?
[20:53:26] if not they probably should...
[20:53:36] I have no idea, that’s a question for max and kaldari
[20:53:40] let me go hit 'em up
[20:53:56] YuviPanda: wanna ask in #wikimedia-mobile first?
[20:54:18] DarTar: just did
[20:54:28] kk cool
[21:40:43] halfak: have a minute?
[21:40:52] Hi helder. What's up?
[21:41:00] A user tried to run this on Windows: http://dpaste.com/3696MY8.txt
[21:41:05] and got this: http://dpaste.com/0KPWFWR.txt
[21:41:34] is it something to report against Mediawiki-Utilities?
[21:42:19] (the user is still on #wikipedia-pt if more info is needed)
[21:44:00] yeah... that library doesn't really work on windows.
[21:44:01] :(
[21:44:24] that library = Mediawiki-Utilities?
[21:44:29] You could report it. Specifically mention "windows support for decompressing dumps."
[21:44:34] Yup
[21:46:29] halfak: done at https://github.com/halfak/Mediawiki-Utilities/issues/13
[21:46:52] Thanks Helder.
Sorry for focusing on my own environment. :\
[21:47:04] * halfak shakes fist at windows and feels shame at the same time
[21:47:18] no problem :-)
[23:01:11] tnegrin, I'm running 3 min late. sorry
[23:01:19] k
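On the Windows report filed earlier in the log (issue 13): the log does not say why dump decompression fails there. If the cause is a dependence on external decompression programs (an assumption, not something confirmed above), one portable alternative is Python's standard-library bz2 and gzip modules, which behave the same on Windows. `open_dump` is a hypothetical helper, not a Mediawiki-Utilities function.

```python
import bz2
import gzip

# File extension -> standard-library opener. Other formats (e.g. 7z) are not
# covered by the standard library and are left out of this sketch.
_OPENERS = {".bz2": bz2.open, ".gz": gzip.open}


def open_dump(path, encoding="utf-8"):
    """Open a dump file as text, decompressing it if the extension calls for it."""
    lowered = path.lower()
    for ext, opener in _OPENERS.items():
        if lowered.endswith(ext):
            return opener(path, mode="rt", encoding=encoding)
    # Anything else is treated as an uncompressed text file.
    return open(path, mode="r", encoding=encoding)
```

The returned object can be iterated line by line, so a large dump never has to be held in memory at once.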