[14:33:38] morning!
[14:52:42] Hey Ironholds
[14:53:24] hey halfak :)
[14:53:29] :)
[14:53:57] writing up trello cards for the session/uniques stuff btw
[14:55:17] woot. Do we have a proposal for reader session metrics?
[14:57:15] not yet! Just writing down "and here is what we'd need to do" so we keep it in mind
[14:57:25] it's in "staged" fer a reason. We'll hit it after LUCIDs, I figure.
[15:14:29] halfak, what would you call a meta page to act as a hub for the LUCID proposal that would be descriptive, findable, and not terrify the shit out of our germanic colleagues? ;p
[15:15:35] Hmm... I think it'd be better to name it after what we are getting at since it sounds like LUCID isn't part of the winning proposals.
[15:15:52] e.g. R:Reader_behavior_metrics
[15:16:17] Or just R:Consumption_patterns
[15:16:43] Then list out the need for unique counts next to the need for session metrics.
[15:16:59] consumption patterns sounds like a meta study of how many packets of chips people eat...
[15:17:24] chips don't come in packets, crisps do
[15:17:25] Or about the rise and fall of tuberculosis.
[15:17:29] * Ironholds hits YuviPanda
[15:17:34] halfak, that's where my mind went too! snap.
[15:24:08] leila, so did everything work? *knocks wood, crosses toes, throws salt over shoulder*
[15:24:35] mornin' Ironholds. I think so. :-)
[15:25:49] YAY
[15:25:58] that was an exhausting and unexpected day but goddamn was it worthwhile
[15:26:10] can I just say, btw, that that friday was exactly why we need buffer room in our schedules?
[15:26:30] I put in 3-4 hours of work and in exchange we remove 41 hours of waiting and worrying. That's a good return on an investment.
[15:26:39] It's silly we don't have that buffer explicitly built in for everyone.
[15:26:45] +1
[15:26:48] you and halfak have saved my bacon tons of times. And in halfak's case, candied it.
[15:27:03] * halfak wants some candied bacon now
[15:27:10] so make some?
[15:27:43] No bacon. I have some breakfast sausage though. that would probably be pretty good.
[15:27:48] I'm gonna eat leftover pizza. Went to my first ren faire yesterday, came back, spent the evening talking with a friend and eating said pizza. Good times! And now I get cold pizza for breakfast, which I have a weird fondness for.
[15:28:09] +1 for renfest and cold pizza.
[15:28:30] I go once every couple of years and enjoy it very much.
[15:28:38] my bank balance is NOT happy with me.
[15:28:54] Furniture will do that.
[15:28:58] * YuviPanda is eating home made food in the mountains
[15:28:59] that too
[15:29:01] also the ren faire
[15:29:02] is somewhat plain, but damn is it tasty
[15:29:08] YuviPanda, which mountains?
[15:29:13] Ironholds: himalayas
[15:29:20] awesome! Why?
[15:29:30] YuviPanda, you can't eat plain food in the mountains. It's inappropriate.
[15:29:35] there's a cryptoparty / hackathon here, and I'm just hanging around for an extra couple weeks
[15:29:39] and for added bonus funtimes: what country are you in? [[The Indian government's cartography department|inquiring minds]] want to know
[15:29:45] halfak, GROOAAAN
[15:29:48] Ironholds: haha :D This one is distinctly Indian
[15:29:49] WOOO!
[15:29:51] halfak: haha :D
[15:30:02] halfak, you wanna mountain pun at me? bryn it on.
[15:30:50] (bilingual pun is best pun)
[15:30:59] * halfak had to google that one
[15:31:20] I was very pleased with myself that my fragmentary Welsh was finally useful!
[15:31:41] unlike the Welsh themselves?
[15:31:42] * YuviPanda runs away
[15:31:54] the welsh are tremendously useful!
[15:32:13] for a thousand years, whenever the British have needed longbowmen to get grossly distorted skeletons and kill the french: the welsh have been there.
[15:32:28] heh
[15:32:36] Whenever we've needed people to get black lung so we can keep our London toes toasty with coal-fed fires: they've been there.
[15:32:46] Whenever we've needed cheap holiday homes: they've been there
[15:32:47] * Ironholds salutes
[15:32:59] would it amaze you to know a lot of people in Wales don't like us very much? ;p
[15:34:08] halfak, re naming, how about "content consumption patterns" or "content consumption metrics"?
[15:34:23] (sorry missed your messages Ironholds)
[15:34:25] +1
[15:34:38] I totally agree with having some buffer, in that case, it really saved me.
[15:34:50] I worry that that's broader than the proposal, though. I mean: we need a centralised hub (which actually should also include a pointer to the new PVs definition) but I think the specific proposal is distinct.
[15:35:32] one of my major jobs got killed on Thursday on stat1003, which in return put me really behind
[15:36:03] Ironholds, I propose starting with broad topics, spec what you plan to and create sub-pages as necessary.
[15:36:14] This way, we start with hubs for relevant material.
[15:36:18] makes sense
[15:36:23] If we go the other way, we end up with a scattered set.
[15:36:33] so let's start a content consumption metrics page, link through to non-existent pages for pageviews, uniques and session tracking
[15:36:39] well, existent for pageviews
[15:36:42] then start work on uniques.
[15:36:43] +1
[15:37:16] awesome! Okay, I'll set up a structure (let's go with "content consumption metrics") and throw you a link and we can hack, if you've got time today?
[16:12:46] halfak, there was a really nice research-related nav template. Remember where it lives? White, graph as the image...
[16:12:52] I want to steal it for this project
[16:13:22] Hmmm. Not sure what you mean.
[16:13:48] It was an infobox!
[16:13:53] ...I currently lack decent words.
[16:14:00] Oh! Infobox :)
[16:14:08] yes. one sec.
[16:14:23] This is the one I'd like to switch to: https://meta.wikimedia.org/wiki/Template:Research_project
[16:14:33] I've been using it for a while.
[16:14:42] This is the "official" one: https://meta.wikimedia.org/wiki/Template:Research_Project
[16:14:49] Notice the minor caps difference.
[16:15:26] I don't know if I'm officially allowed to advocate switching since I'm perma-blocked on this work.
[16:15:46] But *I've* been using the new one with the cute little white graph.
[16:16:07] that's the one! Thankee :)
[16:16:22] :)
[17:08:33] morning DarTar :)
[17:08:44] hey Ironholds
[17:08:49] how goes?
[17:09:06] I’m good, happily hacking data models
[17:09:25] yourself? how did the furniture sprint go?
[17:09:55] we have two bookcases and a bed, or would have a bed if I hadn't forgotten the slats
[17:10:17] thus necessitating a Sunday morning trip to Ikea, which brought us past a Ren Faire, and now I have no money. But an excellent walking stick and some newfound archery experience!
[17:10:35] yay
[17:11:06] I dread Sunday morning trips to ikea
[17:11:29] they have the power of annihilating all the energy of the weekend
[17:13:38] and all of the bank account
[17:13:49] where there is money, it provides tiny wooden dowels
[17:13:57] where there is energy, it provides delicious, stomach-deadening meatballs
[17:16:29] halfak, I finally came up with a vaguely acceptable way of explaining what we're aiming to define with the PV def
[17:16:44] "a way of filtering all of our requests to those that individually constitute a single, human-driven request for a self-contained piece of text-based content."
[17:17:11] wordy and vague, yes, but it touches on the main points. Reminds me of the "reliable, independent, third-party sources" clause in the notability policy
[17:18:06] I like it.
[17:18:07] Ironholds: filtering or tagging ?
[17:18:21] :)
[17:19:07] Ironholds: the plan is not to exclude bot/spider requests but flag them, correct?
[17:19:16] yup
[17:19:24] halfak, yay!
[17:19:37] so do we need a name for spider-less PVs?
[17:19:39] DarTar, the idea of excluding scares the crap out of me. Tagging it is.
[17:19:44] "meatviews"
[17:19:44] right
[17:19:48] ;p
[17:20:00] I like it
[17:20:09] ...that was sarcastic right? RIGHT?
[17:20:13] wait, that includes monkey browsing
[17:20:20] oh god, not more selfie discussions
[17:20:32] wait, if a monkey is browsing, can it claim copyright on any originality in its browsing patterns?
[17:20:39] or does the guy who owns the computer get it?
[17:20:51] * Ironholds images Wikimania 2015, everyone taking photos with a Hadoop printout
[17:20:55] *imagines
[17:20:56] erik zachte’s hamster hitting the F5 key every 2 seconds
[17:21:07] so THAT'S where the "undefined" requests come from.
[17:21:08] that was hands down the worst part of WM2014
[17:21:28] Guerillero, as someone who spent the entire conference dealing with GIS libraries with unstated dependencies, allow me to burst your balloon there ;p
[17:22:29] I do remember thinking of going over to talk to you and then seeing the look on your face and thinking "I better let him work in peace"
[17:23:03] alas, my time to work on my wikipedia android app is over
[17:23:14] off to philosophy
[17:36:26] ironholds, tnegrin: ping
[17:37:39] Pine, poke. Just replied to your email.
[17:37:49] Guerillero, don't we have an android app?
[17:38:32] Pine, FWIW if you're planning on having a substantive discussion here: while I enjoy synchronous discussions, if it's /substantive/ it should probably be on the mailing list (more transparent for now and the future).
[17:38:39] ah ok. I am trying to get this cleared up, I am thinking IRC might be faster. Let me go read
[17:44:28] cool
[18:06:08] channel: can you think off the top of your head of the technical term used in the context of the crowdsourcing / citizen science lit for the pooling and joint evaluation of task responses submitted by participants?
[18:06:56] Moderation
[18:06:58] Curation
[18:07:55] j-mo, tnegrin: discussion continuing on the Analytics list, and I think this is rapidly being cleared up :)
[18:13:11] excellent. glad to hear it, Pine
[19:50:57] halfak: I've been working on sharing the importance/view stuff, as we discussed on Friday. Should I make a new section for it? Not sure where it goes...
[19:51:20] Nettrom, be bold. We can always clean things up later. Whatever you think is best.
[19:51:47] halfak: okay, I'll create a section somewhere and add it, will happen a bit later, paper meeting first
[19:51:50] thanks!
[19:52:06] no. thank you! :D
[19:52:34] hm
[19:52:44] on that note, does anyone know off the top of their head about when we should expect CHI reviews?
[19:52:47] I'm getting antsy waiting
[19:53:02] request to do them, or the results?
[19:54:27] Reviews are due Nov. 10th.
[19:55:09] w/rebuttal period 19-26, if I remember correctly (looked at this 90 mins ago)
[19:55:15] Author rebuttals are 19-26 November 2014.
[19:55:24] Final notification is 15 December 2014
[19:55:32] * halfak races Nettrom with the googles
[19:55:44] * Nettrom not bothered to google
[20:12:37] say, regex nerds; have you ever used the Oniguruma library? If so, how would you compare it to PCRE?
[20:13:57] I haven't.
[20:14:06] hokay!
[20:14:14] I thought you'd appreciate me opening with "say" however
[20:15:39] on a related note I'm gonna spend the evening testing a couple of new string handling libraries in R. I'll let people know if I find anything interesting.
[20:15:58] Sounds good. Is this part of the PV work?
[20:21:49] hey, I said evening! :P
[20:21:59] but I think it has implications for things like PVs; anything where we want to do string matching
[20:22:46] I've got ore, which is an Oniguruma interface that'd replace regexpr/regextract/other baseR functions, and stringi, which uses the ICU library for general string handling
[20:31:10] Gotcha. Is it likely to be more performant or more feature-rich?
[20:33:49] that's what I want to test! Re performant.
[20:34:01] Syntactically it'll definitely reduce the difficulty of, e.g., substring extraction
[20:34:06] for things like Zero MCCs
[20:34:24] I have sweaty nightmares about the regextract(regexpr()) syntax and its slowness.
[20:34:32] and don't get me started on stringr's str_extract
[20:34:43] heh.
[20:35:11] * halfak mumbles something about python.
[21:12:10] Hey DarTar, how do we stand on the bot filtering stuff?
[21:12:17] Anything that needs doing?
[21:17:39] hey halfak
[21:18:01] I just got to the part on my todo list that says "Work on bot stuff or do hadoop streaming"
[21:18:06] So...
[21:18:08] :)
[21:18:09] I think Dan captured the next steps pretty well but I was waiting on kevinator to figure out what to do next
[21:18:24] I thought so too. Just wanted to make sure.
[21:18:31] So - summary:
[21:18:32] user_groups: method we're going with
[21:18:33] E, P, NRU: no bot filtering
[21:18:33] RAE, RSNAE, RROAE: yes bot filtering
[21:18:55] the last item is the one that needs work
[21:19:22] it’s not a huge deal but I want to hear from kevinator where they are with this
[21:19:31] It seemed like milimetric was going to take on the bot filtering aspects of those metrics.
[21:19:46] It is a pretty straightforward problem/solution.
[21:20:10] halfak: yes, but it should be captured in the specs, not just implemented, if that’s what you’re worried about :)
[21:20:31] Oh yeah! I'll get cracking on the metrics pages.
[21:20:36] ha ha
[21:20:46] halfak / DarTar: I already implemented filtering based on that summary, it's going to be deployed this sprint
[21:20:57] Great. thanks milimetric
[21:21:06] awesomeness
[21:21:39] halfak: we can add as a todo, updating the corresponding cards
[21:21:44] pages, that is
[21:22:13] Sure. I'll get them updated with SQL and description today. Do we have a card for this work?
[21:22:43] checking right now
[21:23:15] https://trello.com/c/kUEwwPnt/58-bot-standardized-definition is a relevant one
[21:23:28] but I think that’s closed
[21:23:43] and what we’re talking about is only updating the editor model pages, right?
[21:23:47] I’ll create a new one
[21:24:28] Great.
[21:26:24] halfak: I take that back, https://trello.com/c/xNG1Gtwc/464-decide-whether-to-add-bot-filtering-to-rolling-active-metrics
[21:27:02] Nice
[21:29:15] halfak: I posted the discussion there, for reference
[21:29:29] Thanks DarTar.
[21:29:46] np
[21:34:00] milimetric: DarTar: halfak: thanks for this!
[22:28:00] cool! Found a shitty bug in lubridate
[22:29:56] tnegrin, for my next trick after All Of The Reader Metrics Ever, could I have a project that allows me to write code? ;p
[22:30:34] I'll do my best -- can you review the goals?
[22:37:36] Ironholds, we're gonna write a ton of code for Reader metrics.
[22:37:41] :)
[22:37:49] we are?
[22:37:57] tnegrin, I did, I think? I mean, I didn't see anything problematic.
[22:38:22] how can you code on them
[22:38:24] ?
[22:38:59] the goals? Well, the mobile stuff, although mostly it's chaining together stuff I already wrote. Still counts! Looking forward to it.
[22:39:15] also, for the time being we can actually generate desktop and mobile web session benchmarks
[22:39:26] the NavigationTimings schema HHVM is using.
[22:40:06] Ironholds, spec the definitions, scope out the implications. Learn some stuff about user behavior. Then re-address the definition.
[22:40:20] makes sense!
[22:40:33] on that note, I've created a draft page, filled out a coupla sections as best I can, and copied the etherpads over
[22:40:37] your turn when you've got a minute :)
[22:41:17] heh -- dario and I have a question
[22:41:49] okay?
[22:42:16] * halfak is excited to play around with reader sessions.
[22:42:26] halfak, they are SO MUCH FUN.
[22:42:27] That's going to be important once I get to that stage of HHVM
[22:42:43] yup. Gah, I should start drafting the session methodology page.
[22:42:48] I can at least get the prior art sections done.
[22:42:59] +1
[22:47:55] DarTar: Rolling metrics updated. I made them all perform better while I was at it.
[22:48:05] Card is sent to Done
[22:48:07] w00t
[23:14:41] gotta run. have a good evening folks!
[23:15:02] cheerio!
[23:15:14] * Nettrom is packing up