[00:08:27] DarTar, what did you think of the dynamic in the last health check, ooi?
[00:08:31] I thought it was great :)
[00:08:37] oh god, now we're doing a health check of our health check
[00:10:03] yeah I liked it
[00:11:56] yay!
[01:58:57] Alright, enough Java
[01:59:01] time to write presentations at 9pm
[15:19:37] o/
[15:19:39] Ironholds, ^
[15:20:02] I decree that today be yak-shaving day.
[15:20:11] At least until the stat boxes are back online.
[15:20:18] they're not offline yet!
[15:20:42] I was thinking of today as finish-my-presentation-publish-those-datasets-and-write-Java day
[15:20:47] what kinda yak shaving do you want to do?
[15:20:48] halfak: ja going to do 1002 last too, so you can use it for now
[15:20:52] i can ping you when we want to take it off
[15:21:26] ottomata cool! thanks.
[15:21:46] Ironholds, I'm still debating. I've been meaning to release a mysql_tsv library for python
[15:21:53] ooh
[15:21:58] I make so many TSVs in the MySQL dialect that R likes...
[15:21:59] I'm working on a fun project with Hadley
[15:22:22] I go on twitter "I don't have any packages to write, who has a problem?" and he pops up with "Amazon web access logs, they're a pain to deal with"
[15:22:34] "...you know what I do as a day job, right? Send me an example dataset and some use cases."
[15:22:43] ottomata, do I have hours or minutes on stat1002?
[15:23:04] and a professor at Iowa State would like me to de-stupid a JSON converter he wrote
[15:23:22] (still not an engineer kthx)
[15:23:32] probably at least an hour halfak
[15:23:44] kk thanks ottomata
[15:23:54] i have to get bblack to respond to ping first... :)
[15:23:56] * halfak aims to get a hadoop job kicked off.
[15:24:01] before we actually start!
[15:26:20] goddammit
[15:26:27] halfak, have you been sending my email address around?
[15:26:33] I just got another total rando emailing me for advice
[15:26:37] lol not exactly
[15:26:44] "not exactly"?
;p
[15:27:20] Usually when I give someone your email address, it's in the form of, "Hey Foo! I'd like to introduce you to Ironholds. He's really interested in ..."
[15:27:27] fair
[15:27:29] And you are CC'd
[15:27:34] that reminds me, I need to grab lunch with Nate on Friday
[15:28:52] Ironholds, I need a solid ambiguous project name.
[15:30:01] what's the project?
[15:31:09] The revscoring one.
[15:31:24] * halfak considers Project Skynet
[15:32:42] Case Nightmare Green
[15:32:49] actually, if you want it to be really ambiguous...
[15:33:10] halfak, call it Hewitt
[15:33:18] Project Turnip
[15:33:40] that's my metasyntactic variable! Hewitt is more in-jokey
[15:33:56] I don't get the Hewitt reference.
[15:33:56] Foster Hewitt was a Canadian sports broadcaster who introduced a phrase that became a famous idiom throughout the commonwealth
[15:33:59] He shoots
[15:34:01] He scores!
[15:34:02] :D
[15:34:13] Wow. Nice work.
[15:34:39] * Ironholds bows
[15:34:42] ambiguous AND punny.
[15:34:49] I have a limited skillset, but within it, I am the best at what I do.
[15:35:43] also, dear god, halfak, thank you so much for data.table
[15:35:48] :D
[15:35:54] I just help'd ya find it
[15:35:54] I can stratify-sample a dataset with data[,j={(.SD[sample(1:.N,100000),])},by = "webrequest_source"]
[15:36:01] just that.
[15:36:37] I do not understand the "j=..." part of that.
[15:40:34] oh, that's "the returned value should be 100000 randomly-selected rows from the subset"
[15:40:49] but sampling data.frames is done by generating random indices and then subsetting the df/dt to those indices
[15:40:59] hence 1:.N, where .N is a reserved word that means nrow(.SD)
[15:42:16] Ironholds' data.table fu is beyond mine.
[15:42:58] Is .SD a reserved word for the subset based on the "by="?
[15:43:17] yup!
[15:43:45] you also have .I for "which subset it is". I've never found a use for that but it's nice to have.
Same concept as i in for(i in seq_along(foo))
[16:12:09] halfak, Ironholds: trying to come up with other names for the project, I'm wondering if it is possible to fill in the blanks on any of these options with English words:
[16:12:10] "[S]ervice [C]ompounded [O]f [R]evision [E]_______ Scores"
[16:12:10] "[S]ervice for [C]onsulting [O]_______ [R]evision [E] Scores"
[16:12:23] which would result in an acronym SCORES
[16:14:00] Service for Calculating Overall Revision Significance
[16:14:09] SCORS
[16:14:17] REvision
[16:15:01] YES
[16:15:09] Service for Calculating Overall REvision Significance
[16:15:11] SCORES
[16:15:15] :-)
[16:15:20] Helder wins a point
[16:15:38] we can also call it Altmetrics As A Service
[16:15:42] I hear everything is As A Service these days
[16:15:44] * Helder scores?
[16:15:52] halfak, can I make a request?
[16:15:59] lool Helder :)
[16:16:01] if this thing ends up with an API, can you follow best practises, and...
[16:16:04] *holds up pun card*
[16:16:08] make it a REVful API?
[16:16:11] *puts down pun card*
[16:16:37] * halfak grumbles.
[16:19:19] Service for Calculating Optimum REvision Statistics
[16:19:52] ...or other alternatives to "Significance" / "Scores"
[16:20:57] Score Calculation and Objective Revision Evaluation System
[16:21:57] that last one you could probably sell to IBM
[16:22:28] I like the ORES part the best.
[16:22:36] they'd rename it "IBM HeliosScore", write it in poorly-optimised C++, sell it for 30k to universities and have them throw IBM a few hundred bucks every time they want a preference changed.
[16:22:49] yeah, ditto
[16:22:54] We could call ourselves Project Boat and we're building ORES -- the Object Revision Evaluation System
[16:23:16] or the Shit Creek Expeditionary Force
[16:23:20] you need ores up shit creek
[16:23:25] lol
[16:23:43] actually, Shit Creek OREs does fit the acronym..
[16:23:43] * Helder didn't get that one Ironholds
[16:24:11] Helder, there is an idiom in English, "up shit creek without a [paddle/oar]", referring to being in an unpleasant situation and, worse, not having the tools to get yourself out.
[16:24:31] hmm
[16:25:02] I just like Objective Revision Evaluation System, personally. I mean, we don't have to bacronym it
[16:25:29] +1 for Objective Revision Evaluation System
[16:25:32] :)
[16:27:53] ToAruShiroiNeko: ^
[16:27:58] halfak: we are about to do things to stat boxes
[16:28:03] you ok for stat1002 to maybe go away?
[16:28:19] Yup
[16:28:28] Thanks ottomata
[16:29:59] aw, my query will die
[16:30:09] ah well, it'd run for 8 hours anyway. Something is wrong with it, I think.
[16:32:34] Ironholds: your query might not die.
[16:32:40] wait, what kind of query?
[16:32:46] MySQL, in a screen session
[16:32:47] it's gonna die ;p
[16:33:00] maybe not....maybe so though. network addy just changed
[16:33:02] but we haven't rebooted
[16:33:15] sure, but you will be, and it ain't finishing any time soon
[16:33:41] we might not be rebooting, not sure yet!
[16:38:12] * halfak wonders if there's a good way to split a repo into parts while preserving the history.
[16:38:28] with git
[16:38:50] yay! https://help.github.com/articles/splitting-a-subfolder-out-into-a-new-repository/
[16:38:55] thanks Internet
[17:12:00] yay, we made Dario happy
[17:12:19] We did?
[17:12:29] desktop requests will come tagged with https=1 when appropriate, now
[17:12:35] WOOO!
[17:12:41] How'd you do it?
[17:13:16] oh, Adam made an unrelated change for the app UUIDs that involved inserting the logic into the text varnish caches
[17:13:22] because app requests can go through there, too
[17:20:36] So we get https for free now?
[17:20:41] No looking up IPs and stuff?
[17:20:49] yup!
[17:21:05] so, uh. ottomata. I can't connect to analytics-store any more. I assume this is part of the network shift, but, just in case it's unexpected, letting you know.
[17:23:08] fixing.
[17:23:09] i know
[17:23:10] thanks
[17:24:09] ta :)
[17:24:12] DarTar, I got you a present!
[17:24:27] desktop requests that are HTTPS will be tagged in x_analytics now ;)
[17:27:28] Ironholds: is it just analytics-store?
[17:27:33] do the other research slaves work?
[17:27:54] no idea, my query setup is only designed to hit analytics-store
[17:27:57] I'll try connecting manually
[17:28:15] s1 works fine
[17:35:44] Ironholds: sweeeeet
[17:39:27] ottomata, appears up now, possibly
[17:39:36] Ironholds: analytics-store?
[17:40:49] yup
[17:42:44] ottomata, I lied, it failed again
[17:42:52] ha, ok, i think bblack is working on it now
[17:43:20] halfak: stat1002 should be back. I will still be working on it, so I can't guarantee that I won't break it or need to reboot it
[17:43:28] but it should be back up with a new IP inside of the analytics vlan
[17:56:16] ta
[19:47:51] ottomata, any updates on stat1003?
[19:49:34] yes, it is back, i'm getting puppet and jobs and rsync happier
[19:49:48] it is still in the 'i don't guarantee it' yet stage
[19:49:52] but it is usable
[19:50:35] kk. that
[19:50:38] s fine for now
[19:50:44] Just need to reference some files :)
[19:50:49] thanks
[21:51:03] Ironholds: running a bit late with Grace, I guess I’ll find you directly in 40-45 mins for the a/v setup?
[21:51:18] ...sure.
[21:56:56] hi ggellerman
[21:57:13] Hi, Research & Data!
[22:02:50] \o/ ggellerman welcome!
[22:34:18] ewulczyn, are you in the office?
[22:36:36] leila: no I'm at home
[22:37:20] I see. in this case, one of us should get on the Hangout to collect questions from IRC and read back to the room, ewulczyn.
[22:37:46] halfak, ewulczyn, does one of you want to monitor IRC and get on the Hangout to say what the questions are?
[22:37:58] I'm in :)
[22:38:14] thanks Aaron
[22:38:28] ooki, halfak. I'll keep an eye on IRC, too, but I'm working from home so it's good if one of us be on the Hangout.. Thanks!
[22:38:29] Or rather, I'd like to.
Someone will still need to invite me to the hangout.
[22:38:40] DarTar, can you get me an invite?
[22:38:49] hey
[22:38:51] adding you now
[22:38:56] Thanks
[22:39:08] * halfak is stoked for this showcase
[22:40:28] same here
[22:40:32] halfak: invite sent
[22:41:49] halfak: can you send a reminder to analytics-l / wiki-research-l that we’re starting in 20?
[22:42:26] DarTar, yes
[22:44:23] DarTar, I don't see invite.
[22:50:11] Ironholds: we're rooting for you.
[22:50:45] I'd be more happy if the team was rooting for the hangout to boot up, given that I've been sat here for ten minutes staring at my avatar in total silence.
[22:51:47] there's always some problem with the Hangout, which is stressful for in-office folks as well as remotees
[22:53:02] * halfak roots for technology to start behaving
[22:53:19] google changes the interface every week
[22:53:27] yeah, :-(
[22:55:58] alright, well, I'm going to get a glass of water
[22:56:03] hopefully it's working by the time I'm back
[23:00:07] YuviPanda, I added a graphic describing the event system. https://meta.wikimedia.org/wiki/Research:MediaWiki_events:_a_generalized_public_event_datasource
[23:00:33] hey guys
[23:00:53] we have a hangout meltdown, I won’t be able to use my laptop since we’ll be hosting from it
[23:00:54] hello DarTar_clone
[23:01:15] I’ll invite both leila and halfak so you guys can help moderate / relay any questions from IRC, ok?
[23:01:19] kk
[23:01:21] kk
[23:01:27] Is the old streaming link still good?
[23:01:34] I hope so
[23:06:14] For those who just joined, we're having some technical difficulties. We should have the stream up shortly
[23:12:03] Excited I didn't miss the start :-)
[23:12:24] yeah computermacgyver, it doesn't happen very often. ;-)
[23:12:36] computermacgyver, Yup.
Still working out the last few bits
[23:12:45] This is a good time for people in Asia.....I've been missing the recent ones because of the time difference
[23:12:53] All fine by me....cheers
[23:13:01] That's great. I was just about to commend you on being up so late.
[23:13:07] :P
[23:13:13] 8:00AM :-)
[23:14:57] do give a shout when the stream should be up, sometimes we have to refresh the page to see it
[23:15:12] chrismcmahon: sure
[23:16:32] It's starting for me
[23:16:41] While we are waiting, you might like to look at the Reddit AMA that Reid's team did: http://www.reddit.com/r/science/comments/2o5h9i/science_ama_series_we_are_scientists_at_los/
[23:17:19] chrismcmahon: the youtube link is up now but we haven't started the presentations yet
[23:17:34] yep, had to reload to see it, thanks leila
[23:19:23] np chrismcmahon
[23:21:18] Still working on technical stuff. Regretfully the vacations around the holiday mean we're lacking in the support we usually have to put on these sort of events.
[23:21:33] We'll hit play soon. Stay tuned.
[23:22:17] Here we go!
[23:22:24] gotcha
[23:22:42] computermacgyver, ^
[23:23:51] halfak: Thanks...
[23:24:44] Yes
[23:24:58] Link to stream: https://www.youtube.com/watch?v=xPO8XhmeUAU
[23:26:27] BTW, if you have any questions, ping me with them and I'll make sure that they get asked.
[23:35:16] Questions!
[23:36:14] I can't remember: do we have pageview data by namespace within a WMF wiki?
[23:36:17] just a verification. mobile is native app + website
[23:36:21] or website only?
[23:37:52] computermacgyver: we can ask Ironholds to confirm, my understanding was that when he says mobile, he means mobile device and that can go to Desktop, mobile web, App
[23:38:18] Paul__: what do you mean by WMF wiki?
[23:39:15] I mean by namespace within each wiki, for example Template or Category in enwiki
[23:39:26] halfak just asked it Paul__
[23:39:38] Will ask about device vs. website next
[23:39:48] Dario just asked.
[23:39:58] great!
[23:40:03] Ahh.. I should have been paying attention
[23:41:20] halfak, Dario Thanks.
[23:42:26] Where will you be posting draft of your definition of pageview?
[23:42:47] https://meta.wikimedia.org/wiki/Research:Page_view
[23:42:50] Paul__, ^
[23:43:18] great, thanks
[23:44:03] Oliver, you did look at wikiprojects associated with mobile/desktop edits in enwiki and there wasn't a clear difference
[23:44:19] Ironholds, ^
[23:44:55] computermacgyver, sorry?
[23:45:08] oh, yes, my Wikimania presentation
[23:45:30] Ironholds: Yes. Just thought that in regards to the final question
[23:45:36] computermacgyver, mobile is website only
[23:45:46] computermacgyver, yeah, ditto, but they were asking about readership rather than editorship
[23:45:54] that graph I made waybackwhen still gives me nightmares
[23:46:01] I had to bootstrap through the revision table *shudders*
[23:46:24] Ironholds: Yeah good point. Thanks for the clarification on app/mobile. I thought so
[23:46:44] Ironholds: Indeed. *shudder*
[23:58:29] how did they find haiti visitors from french wiki page count files?
[23:58:46] will ask pubsez
[23:59:44] FYI: Stream is 30-60 seconds behind.