[13:15:26] @notify DarTar
[13:15:26] I'll let you know when I see DarTar around here
[13:15:32] @notify J-Mo
[13:15:32] I'll let you know when I see J-Mo around here
[13:27:51] @notify yo mama
[13:27:51] I doubt that anyone could have such a nick 'yo mama'
[13:27:57] @notify yo_mama
[13:27:57] I'll let you know when I see yo_mama around here
[13:28:03] no you won't because SHE'S DEAD
[13:28:06] * Ironholds bursts into tears
[13:28:15] alright, done with being silly for this morning. Off apartment-hunting I go.
[13:28:19] YuviPanda, Boston says hi.
[13:28:28] Ironholds: say hi to boston
[13:28:37] I won't repeat its reply
[13:28:45] boston has a filthy mouth
[13:29:09] Ironholds: heh
[16:04:36] J-Mo: hey!
[16:04:57] hey Yuvi
[16:05:42] J-Mo: I should be around for the webinar, yeah :) Although I'd probably not be too helpful with mysql questions
[16:05:56] sweet
[16:10:21] J-Mo: I'll probably give the landing page some love before that. let me know anything you want me to fix before that
[16:11:15] awesome. will do, YuviPanda
[16:39:04] hey DarTar
[16:39:56] hey YuviPanda
[16:40:09] in standup now, chat in a moment?
[16:40:36] DarTar: sure
[17:29:56] Switching locations. Back online in an hour.
[17:29:57] o/
[17:31:04] * YuviPanda pokes DarTar
[17:31:11] yo YuviPanda
[17:31:18] DarTar: wanna hang or IR?C
[17:31:20] ready now, wanna hangout?
[17:31:20] *IRC
[17:31:40] DarTar: ah sure
[17:31:58] k hang on
[17:32:06] DarTar: calling you now
[18:10:31] yo yo yo
[18:12:02] leila, DarTar, how's SF?
[18:13:01] Ironholds: how's boston?
[18:14:05] YuviPanda, warm
[18:14:15] awwwarrr
[18:14:17] also someone appears to be rolling a barrel full of loose change down the sidewalk
[18:14:32] but I went and saw a ton of apartments this morning and put in an offer on one! :D
[18:15:07] and hopefully I'll get it.
[18:15:13] because it looks great
[18:15:18] and all the other places I saw...eh.
[18:15:22] Ironholds: wow that was quick
[18:15:26] also, did you know americans make the renter pay the brokerage fee?
[18:15:41] in the UK the landlord has to do it. because fucking duh.
[18:15:49] Ironholds: heh, better than India, where *both* parties pay
[18:15:52] apparently in boston at least, it is not that case
[18:15:59] YuviPanda, is the fee equivalent to 1 months rent?
[18:16:08] Ironholds: yes
[18:16:35] Ironholds: my old roommates moved in to a house, and then moved out one week later when the landlord did a 'oh, one more thing' after they moved in 'no women in the house, unless they are your mothers'
[18:16:40] so you end up with deposit + broker + first month + last month?
[18:16:44] hahah
[18:17:06] inorite
[18:32:36] DarTar, I'd like to shuffle the order of some of our metrics for standardization in order to meet some of leila's data needs. Do you have some time to chat?
[18:32:54] ha, I’m actually responding by email on this :)
[18:33:03] Hokay
[18:33:04] give me 30 secs
[18:33:05] :)
[18:34:41] hey halfak!
[18:34:51] o/ Ironholds
[18:34:51] I'm not dead! And I was conscious during the flight to boston! The entire thing!
[18:35:02] ...I don't actually remember how many pills I took during the trip in total
[18:35:11] but I'm going to go with 3.5-4.
[18:35:24] Woot. Did oyu find a place?
[18:35:32] Or is it a temp location?
[18:35:39] my friend mollicent's sofa
[18:35:49] but I put down a deposit on a gorgeous little place just this morning
[18:36:00] assuming nobody steals it out from under me, my luck seems to be holding :)
[18:36:54] Cool!
[18:39:13] halfak: I'll start work on 'number of edits per country per project over 'duration' sometime later today'
[18:39:33] err
[18:39:34] Ironholds: ^
[18:39:43] You hadoopin' YuviPanda?
[18:40:20] halfak: not yet!
This is all python
[18:40:30] halfak, we're doing a research project looking at the population - internet population - reader - edit attempt - edit funnels on a global scale
[18:40:35] there will be hadoop also
[18:40:39] http://etherpad.wikimedia.org/flagellating-funnel
[18:40:46] Ironholds: I should poke around and get hadoop access
[18:40:46] it's called WP:DMZ, fool, I tole you.
[18:41:03] amusingly it's disturbingly similar to a project I'm doing with Han-Teng and so I hope you're okay with maybe getting a paper citation
[18:41:13] halfak, oh, also! You know that work I did for heather and brent?
[18:41:22] They offered me co-authorship. I like these people.
[18:41:34] Makes sense. It would be unethical if they didn't
[18:41:48] But I'm actually not familiar with the work, no.
[18:41:50] wait, you get coauthorship out of dataset preparation here
[18:41:50] Documented?
[18:41:56] ...
[18:42:00] Yeah totally. 80% of the work
[18:42:02] brb, got to break the world citation record
[18:42:20] (what IS the world citation record?)
[18:42:50] https://en.wikipedia.org/wiki/Erd%C5%91s_number
[18:44:13] heh
[18:44:37] Ironholds: wait, looks like I still have hadoop access
[18:44:50] check your permissions, dude
[18:44:56] some queries require write access on the nodes
[18:45:06] Ironholds: how do I test it? I'm on stat1002
[18:45:39] well, or you could just look at puppet and see if the manifests have you in the analytics usergroup ;p
[18:45:45] Ironholds: I did, and I am :P
[18:45:54] then yes, you have access ;p
[18:46:01] Ironholds: simple test HIVE query?
[18:47:33] SELECT DISTINCT(cs.content_type) FROM wmf_raw.webrequest TABLESAMPLE(10 ROWS) s WHERE s.year = 2014 AND s.month = 07 AND s.webrequest_source = 'mobile';
[18:49:09] Ironholds: seems to be running. sweet
[18:49:46] Ironholds: oh wow, this is pretty coo
[18:49:47] l
[18:51:34] Ironholds: OH GOD THIS IS EXCITING
[18:56:50] Ironholds: is there a nice 'learn HQL' article around?
[18:57:21] * YuviPanda googles
[19:01:38] YuviPanda, yeah
[19:01:44] "type SQL and, when it fucks up, google it"
[19:02:06] "and when you've done that enough to know what you're doing, or hit an error, poke me and I'll teach you how to write HQL that runs fast"
[19:02:18] this being how everyone has learned every programming or query language since 1979 ;p
[19:02:53] true, true
[19:04:13] Ironholds: I'm actually doing some pageview analysis for mobile apps / web
[19:05:00] oh?
[19:06:08] Ironholds: yeah, as in 'how many views did the app get at all this month?' which IIRC we don't have?
[19:06:27] select count(*) from webrequest where user_agent like 'WikipediaApp%' and year = 2014 and month = 8 and day < 15;
[19:07:25] Ironholds: hmm, 'GC Overhead Limit exceeded'
[19:07:38] * Ironholds winces
[19:07:51] that's going to come up with the wrong number
[19:07:55] indeed
[19:07:59] for several reasons
[19:08:05] like, double counting
[19:08:07] is it hitting the mobile API AND the desktop API?
[19:08:08] and not filtering by wiki
[19:08:09] well, that too
[19:08:13] and also it doesn't factor in the old apps
[19:08:17] Ironholds: it could hit the desktop API for apps from China
[19:08:26] Ironholds: I personally don't care about the old ones. They can rot in hell :P
[19:11:05] hey tnegrin :)
[19:11:10] YuviPanda, too bad, you wanted a number
[19:11:12] and yeah, it could
[19:11:25] hey Ironholds
[19:11:31] how's beantown?
[19:11:38] full of beaaanssss
[19:12:04] so you want SELECT COUNT(*) FROM wmf_raw.webrequest WHERE user_agent LIKE 'WikipediaApp%' AND uri_query LIKE "%section=0%" AND webrequest_source IN ('text','mobile');
[19:12:08] and if it still errors out, pick a smaller date range
[19:12:20] tnegrin, Boston is good! Really sunny. I found an apartment, I think.
[19:12:29] I'm in a race condition with another tenant.
[19:12:32] congrats!
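The filters in the query suggested at 19:12:04 can be read as a per-request predicate. Below is a minimal Python sketch of that reading, assuming each request is a dict keyed by the column names used in the query (`user_agent`, `uri_query`, `webrequest_source`). The dict shape and the names `is_app_pageview` and `count_app_pageviews` are hypothetical and not part of any Wikimedia tooling.

```python
def is_app_pageview(request):
    """Mirror the WHERE clause: app user agent, section=0 fetch, text or mobile source."""
    return (
        request.get("user_agent", "").startswith("WikipediaApp")  # LIKE 'WikipediaApp%'
        and "section=0" in request.get("uri_query", "")  # LIKE "%section=0%"
        and request.get("webrequest_source") in ("text", "mobile")
    )


def count_app_pageviews(requests):
    """Count the requests that pass the predicate; each matching row counts once."""
    return sum(1 for request in requests if is_app_pageview(request))
```

Keeping only the `section=0` request is presumably what addresses the double counting mentioned at 19:08:05, since a single page view can issue several section requests. Filtering by wiki and by date range, both raised in the discussion, is left out of this sketch.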
[19:12:34] welcome to SF
[19:12:38] nope
[19:12:45] see, in SF, there'd never have been a free apartment ;p
[19:12:56] true that
[19:13:23] Ironholds: ok, running
[19:37:00] Ironholds: thoughts on using sqlite as the default 'intermediary data storage' engine for all research for DMZ?
[19:37:35] makes it easy to have things that pause, resume, etc
[19:40:21] YuviPanda, yeah, but I have zero experience with it
[19:40:26] also, why would we need an intermediary?
[19:40:35] we shouldn't be dealing with datasets big enough that a big-ass pickle file isn't enough
[19:40:54] Ironholds: uh, so you can do things iteratively instead of having to run bigass queries all the time?
[19:41:51] you run bigass query once
[19:41:57] you parse results down into format you need, which is small
[19:42:01] thus is the zen of data science.
[19:47:20] Ironholds: not Agile enough :P
[19:47:47] who the hell cares, it works.
[19:48:05] Ironholds: well, more like, if I get one tiny part of the bigass query wrong, I've to go fix it and run it again
[19:48:26] Ironholds: but yeah, you're kindof right
[19:48:54] yes,but if you write hive right it should not take that long
[19:49:07] I'm writing SQL :D
[19:49:45] if you write SQL right it should not take that long.
[19:52:10] Ironholds: heh, ok. Also, pygeoip, etc aren't installed on stat1003?
[19:52:33] nope, or 1002 officially
[19:52:41] Ironholds: let's fix that then.
[19:52:44] * YuviPanda goes to write patch
[19:52:44] if you want to do another good deed, debianise those and convince people to put them on the analytics cluster
[19:52:53] Ironholds: pygeoip is already debianized
[19:52:56] oooh
[19:52:58] then yeah, please
[19:53:00] Ironholds: I can do ua_parser later
[19:54:30] Ironholds: aah, I see. the python GeoIP package is debianized, but pygeoip isn't
[19:55:16] Ironholds: I'll get python3 and 2 versions of these debianized and put into apt. in a few hours.
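The sqlite idea from 19:37:00 (intermediate storage so that a job can pause and resume) could look roughly like the following. This is a sketch only, using Python's standard `sqlite3` module. The table name `results`, the `run_resumable` helper and the `compute` callback are hypothetical names chosen for illustration.

```python
import sqlite3


def run_resumable(db_path, keys, compute):
    """Compute a value per key, skipping keys already stored by an earlier run."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS results (key TEXT PRIMARY KEY, value INTEGER)"
        )
        # Keys finished by a previous (possibly interrupted) run.
        done = {row[0] for row in conn.execute("SELECT key FROM results")}
        for key in keys:
            if key in done:
                continue
            conn.execute(
                "INSERT INTO results (key, value) VALUES (?, ?)", (key, compute(key))
            )
            conn.commit()  # commit per key so an interruption loses at most one step
        return dict(conn.execute("SELECT key, value FROM results"))
    finally:
        conn.close()
```

Calling `run_resumable` a second time with the same database file only computes the keys that are still missing. The pickle approach argued for at 19:40:35 trades that resumability for a single write at the end, which is simpler when the parsed results are small.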
[20:19:13] leila, if you can persuade toby to pay for the tickets and junk, happy to come up to NY any time from the 23rd to the 29th
[20:19:26] having today SIGNED A LEASE ON MY IDEAL APARTMENT THUS BUCKING THE BOSTON TREND
[20:31:31] DarTar, halfak: I'm unable to join the Research staff meeting hangout. It says "this party is over".
[20:33:18] J-Mo, I just joined
[20:33:18] lemme try again...
[20:33:18] I'm in!
[20:59:13] Ironholds: halfak python-pygeoip and python-ua-parser packages built, puppet patch submitted, should be merged shortly
[20:59:25] Nice!
[20:59:26] you are lovely
[20:59:28] I like you.
[20:59:30] you can stay.
[20:59:42] You have justified getting gloop all over my jacket that one night in London
[20:59:55] ...I should probably be specific and say: not like that, you weirdos.
[21:53:23] Ironholds, I'm reading your 01:19 message just now. I'll leave it to you to convince him. I'll be working with ottomata 28 and 29. The other days are conference, early morning til late night.
[21:54:17] Ironholds: It turned out to be slightly more complicated than expected, but all done now
[21:54:24] now to wait for someone to upload and merge
[21:59:33] YuviPanda: you’re da man (and I love your girlfriend’s wallpaper)
[21:59:46] DarTar: hahaha :D
[21:59:55] DarTar: she says thanks :)
[22:46:19] Ironholds: hmm, does maxmind just return '' for some IPs if it can't figure out where they're from?
[22:49:52] Ironholds: also interesting. India has 4th highest number of edits this month, to enwiki, 2x over de.
[22:49:54] * YuviPanda is surprised
[22:51:08] halfak: do you know if there's been visualizations / etc on where our edits are coming from?
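For the per-country edit counts and the empty-string question at 22:46:19, one cautious approach is to bucket blank lookups explicitly rather than drop them. The sketch below assumes `lookup` is any callable that maps an IP string to a country code (for instance a thin wrapper around a GeoIP database). The function name `tally_edits_by_country` and the `"UNKNOWN"` label are made up for illustration, and whether the real database returns `''` for unresolvable IPs is left as the open question it is in the log.

```python
from collections import Counter


def tally_edits_by_country(ips, lookup, unknown="UNKNOWN"):
    """Count edits per country code, bucketing empty or missing lookups together."""
    counts = Counter()
    for ip in ips:
        # '' and None both fall into the explicit unknown bucket.
        code = lookup(ip) or unknown
        counts[code] += 1
    return counts
```

Keeping an explicit unknown bucket makes it visible how much of a ranking, such as the India versus de comparison, could shift if the unresolved addresses were attributed.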
[22:55:29] YuviPanda: yes, we presented this at the last monthly metrics meeting
[22:55:55] (to be more precise, there are plenty of external visualizations of anon edits, the best ones coming from OII)
[22:57:02] YuviPanda: for OII’s work on visualizing anon edits http://geography.oii.ox.ac.uk/?page=home
[22:58:01] our own presentation was here: https://www.mediawiki.org/wiki/File:Wikimedia_Mobile_Trends.pdf
[22:58:33] DarTar: aaaah, not just anons
[22:58:54] background reading on our own preso is here: https://meta.wikimedia.org/wiki/Research:Mobile_trends
[22:59:19] YuviPanda: it’s called mobile trends but it’s really about mobile vs anything
[22:59:28] leila ^
[23:00:03] DarTar: cool :)
[23:24:26] leila: have you seen etherpad.wikimedia.org/flagellating-funnel
[23:26:15] lemme check it
[23:26:50] okay. will read on the train, in 1.5 hour, will get back to you tonight?
[23:29:49] leila: sure :) me and oliver are having fun with it
[23:30:07] k. seems related to mobile trends?
[23:30:40] leila: kinda but not really
[23:30:48] k. will read and message you
[23:30:53] leila: cool :)
[23:30:55] hopefully you see it next time you get up