[00:39:09] DarTar, 1:1?
[00:49:37] hey halfak
[00:49:53] gosh, sorry – I thought it was at 5
[00:49:57] Hey.
[00:49:58] :P
[00:49:59] coming
[00:50:02] kk
[13:58:37] bleeeh
[14:09:26] Hey Ironholds
[14:09:40] hey halfak :). How goes?
[14:10:44] Not bad. Still booting up my brain. Been working late on a parser the past couple of days and had a breakthrough last night.
[14:10:55] Got myself the 2 orders of magnitude I needed.
[14:11:10] yay!
[14:11:19] what's it a parser for?
[14:11:46] Extracting DOIs.
[14:11:53] Regretfully, a regex is insufficient.
[14:11:58] So I built a DOI parser.
[14:12:08] But parsing a whole wiki page is slow
[14:12:10] awesome!
[14:12:16] I ordered 11 books. Stocking up for the vacation.
[14:12:22] So I pinpoint the DOIs and just parse them
[14:12:29] Woo! You got a list?
[14:12:49] not to hand, but highlights are: Atwood, Butler, Hyde.
[14:12:51] Lots of Hyde
[14:13:06] He writes a lot about the anthropological and ethnographic background of gift- versus commerce-based economies
[14:13:35] his most famous book is called "The Gift; The Erotic Life of Property" and is v.good if you like the idea of an ethnographic dive into the impact of a transition from gift-giving to commerce-based environments on the arts
[14:13:50] although the last third is a literary interpretation of Whitman, which felt kinda weird.
[14:14:23] Ahh. Not the first Hyde I found when googling
[14:14:30] and then Butler is awesome feminist sci-fi, and Atwood I have no opinions on, but everyone who tries hitting on me brings her up, sooo.
[14:14:58] Margaret Atwood presented at CHI last year
[14:15:07] The last couple of months of social events have taught me that I'm catnip to teachers who like Margaret Atwood. This is a pretty precise demographic that has so far contained, to a first-order approximation, zero people I'm interested in.
[14:15:14] But I'm always down for book recommendations, so cool!
[14:15:20] About this: http://www.design-engineering.com/motion-control/robotic-arm-extend-authors-signatures-over-cyberspace-10411/
[14:15:22] ooh! What was it like?
[14:15:53] "and created Unotchit Inc. (pronounced “you no touch it”)" - I like this person.
[14:19:13] halfak, any recommendations while I'm putting orders together?
[14:19:51] I don't know if I've read books you haven't read.
[14:19:53] :S
[14:20:07] Ever read Flowers for Algernon?
[14:20:40] yup!
[14:20:44] love that book
[14:20:52] and don't be silly, you read a load of weird and fascinating stuff
[14:21:07] I'm reading Shogun right now, and I'm liking it a lot.
[14:21:17] also, did you recommend "The Green Book: Freedom Under the Snow"?
[14:21:26] I don't think so.
[14:21:30] * halfak googles that
[14:21:33] huh. This is weird.
[14:21:38] Maybe my brain is mind-melding you and J-Mo
[14:25:11] * halfak is digging for more
[14:25:50] I've been working on this recently: http://www.amazon.com/Feynman-Lectures-On-Computation-Richard/dp/0738202967
[14:25:59] ooh
[14:26:06] I have Kozen's "Computation and Automata" on the pile too
[14:26:10] As with Feynman, it's pretty awesome.
[14:26:24] yup! Wishlisted :)
[14:26:33] heh, maybe that'll be one of my holidays next year
[14:26:45] Is "Computation and Automata" a textbook?
[14:26:56] "The time is 9am. I have no duties for a week. I have a bottle of scotch and a dozen cigars. And I have 3.5 volumes of Knuth. COME AT ME BRO."
[14:27:14] yep, albeit a human-friendly one
[14:27:20] sorry, "Automata and Computability"
[14:27:23] Lectures on Computation is *not* a textbook, but it could probably work as one.
[14:28:12] cool!
[14:28:48] oh, halfak; we now have a dedicated "datasets" db on analytics-store, btw.
[14:29:02] For all your static "I need to put ISO codes somewhere where they'll be human-accessible" needs.
[14:29:10] (in fact, that's going to be my illustrative example)
[14:29:59] Cool. We should make some rules about what gets to live there.
[14:30:16] E.g. you can't put a table in datasets without first documenting it on Meta -- or something.
[14:31:43] something like that, yeah
[14:31:54] although wikitech might be a better venue, for purely-internal stuff?
[14:37:37] Ironholds, makes sense to me.
[14:37:59] cool!
[14:38:02] One thing that is nice about Meta is that it is where the schemas live.
[14:38:12] Or maybe that's lame too and they should live on Wikitech
[14:38:15] the EL schemas?
[14:38:18] Yeah
[14:38:23] I think that's mostly a legacy of how SUL works rather than anything else
[14:38:34] Meta is the centralised hub in the prod network
[14:38:54] Oh? I thought it would be because Meta is where we document our research activities that use the EL schemas
[14:40:36] My email doesn't remember, but I bet DarTar does.
[14:40:41] When SF wakes up.
[15:51:09] "My Friend Dario" just entered my play queue.
[15:51:24] https://www.youtube.com/watch?v=_3EXHdT8DKM
[15:51:41] I can't believe you really have it in your play queue halfak. :D
[15:52:07] :) I like it. Not only do I find it relevant, but I enjoy the song too.
[15:52:24] yes, yes. :-)
[15:52:25] I originally discovered the song through Google's recommender system.
[15:52:35] hahaha! that's hilarious.
[15:52:37] So, it fits in my genres of interest :)
[15:53:27] I was just listening to the stream when I thought, "Wait a second. Did she just say, 'Dario'? ... Yes. I have a friend Dario too!"
[15:53:48] :D it's really funny. I can imagine your face.
[15:56:20] :D
[15:57:37] enwiki has about 745k unique doi/article pairs.
[15:57:40] FYI
[15:57:43] :)
[16:06:02] morning
[17:05:27] Just pushed v0.0.4 of mwcites. I'm 100x faster than mwparserfromhell and I can effectively extract nonsense like this doi: 10.1130/0016-7606(1949)60[1527:NAHATM]2.0.CO;2
[17:05:38] Which will break wikimarkup, but not my parser :)
[17:09:28] yay!
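Editor's sketch of the "pinpoint the DOIs and just parse them" idea from the log. This is not mwcites' actual code: the function names, the `10.` prefix pattern, the stop characters, and the bracket-balancing rule are all assumptions made for illustration. A cheap regex finds where a DOI probably starts, then a small scanner reads forward and lets `(...)` and `[...]` stay inside the identifier only while they are balanced, so closing markup such as `}}` or `]]` ends the DOI instead of being swallowed.

```python
import re

# Assumption: a DOI starts with "10.", a 4-9 digit registrant code, and "/".
DOI_START = re.compile(r"10\.\d{4,9}/")

# Bracket pairs allowed inside a DOI suffix, keyed by closer.
CLOSERS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(CLOSERS.values())
# Assumption: these characters always end a DOI inside wiki markup.
STOP_CHARS = set("|<")


def read_doi(text, match):
    """Scan forward from a DOI prefix match; return (doi, end_index)."""
    open_stack = []  # (opener character, index) for brackets still open
    i = match.end()
    while i < len(text):
        ch = text[i]
        if ch.isspace() or ch in STOP_CHARS:
            break
        if ch in OPENERS:
            open_stack.append((ch, i))
        elif ch in CLOSERS:
            if not open_stack or open_stack[-1][0] != CLOSERS[ch]:
                break  # unmatched closer: it belongs to the surrounding markup
            open_stack.pop()
        i += 1
    if open_stack:
        # A bracket never closed, so cut the DOI just before the first one.
        i = open_stack[0][1]
    return text[match.start():i].rstrip(".,;:"), i


def extract_dois(text):
    """Return every DOI-like string found in text, in order of appearance."""
    dois = []
    pos = 0
    while True:
        match = DOI_START.search(text, pos)
        if match is None:
            return dois
        doi, pos = read_doi(text, match)
        if len(doi) > len(match.group()):  # skip bare prefixes with no suffix
            dois.append(doi)


sample = (
    "{{cite journal |title=X "
    "|doi=10.1130/0016-7606(1949)60[1527:NAHATM]2.0.CO;2}} "
    "and [[doi:10.1000/182]]."
)
for doi in extract_dois(sample):
    print(doi)
```

Only the text around each candidate prefix is scanned, which is the cheap part of the trade-off described in the log; a real extractor would also need to handle angle brackets inside older DOIs, which this sketch treats as a stop character.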
[17:09:42] this reminds me of one of my recreational projects
[17:09:51] using ORCIDs to build an Erdos network
[17:23:59] Ironholds, what's an ORCID?
[17:24:10] halfak, unique researcher ID!
[17:24:11] Oh wait. I googled better
[17:24:18] Do I have one?
[17:24:27] Looks like I need to register
[17:24:34] https://orcid.org/register <-- ?
[17:27:09] yay! I successfully reproduced draws from a negative binomial. :)
[17:27:33] I'm so happy for you and j!
[17:27:50] Boy or a girl?
[17:28:26] :P
[17:29:08] Really, not boy or girl, but a simulated probability mass function of the frequency of k.
[17:38:20] you can call them "simmy" for short.
[18:31:50] Where's Ellery? I didn't get to sufficiently praise him for his work.
[18:33:22] He's not commonly or consistently on IRC.
[18:40:42] Boo
[18:40:59] indeed
[18:48:38] orcid.org kinda sucks.
[18:48:45] Google scholar profile is much better.
[18:49:16] No one is going to type in their education and employment history.
[18:49:35] Ironholds, how do orchids get associated with people?
[18:49:49] they have to register :/.
[18:49:57] Google scholar profiles are prob more useful, yep.
[18:50:03] But google hates people automagically crawling them
[18:50:08] I had the devil of a time webscraping search
[18:50:52] I can imagine.
[18:51:03] It would be nice if we could combine the two.
[18:51:15] Why is it that metadata about science/scientists is so locked-up?
[18:55:55] I don't know :(
[18:55:57] also, halfak: 18627581
[18:56:49] ^ what is that number?
[18:57:03] why, that's your Wikidata item ID
[18:57:06] you're metadata now! :P
[18:57:17] hey fhocutt, sorry for the delay. You should have my notes on matchbot in your inbox.
[18:57:19] :)
[18:57:29] I'm a data?
[18:57:31] that should be the standard for HCI researchers
[18:57:38] Economists make it when they win the Nobel
[18:57:50] HCI researchers make it when they transmute into pure data.
[18:58:05] lol
[19:16:53] pie charts!
[19:17:17] * Ironholds shudders
[19:17:19] don't remind me
[19:17:28] and all the same size
[19:17:30] I'll have your pie charts whenever hadoop stops goofing up and crashing the cluster on big queries ;p
[19:17:38] Just for you, I'll even make them in Excel
[19:17:41] no, Open Office.
[19:17:52] So they're as scientifically inaccurate as Excel and as human-unreadable as only FOSS can achieve.
[19:18:01] mission accomplished
[19:18:36] predicated on hadoop not crashing, however ;p
[19:22:22] I can at least retrieve the sampled log data, though
[19:22:27] nice opportunity to test my pageviews lib!
[19:49:33] Oh! Cool. We're going to have Haitham at the research showcase this month.
[19:50:09] neat!
[19:54:51] I never heard about that, but it was part of Asaf's presentation.
[20:09:17] halfak: thanks so much!
[20:11:09] No problem. Happy to help :)
[20:23:34] Ironholds: if I want the output of sprintf('d.all.%s', gsub(' ', '', i), sep='') to be a dataframe as opposed to a string, do you know what I should do?
[20:24:00] what are you actually trying to do?
[20:24:07] that is, what is the purpose of the code and what is i?
[20:24:26] i is a string.
[20:25:10] I have data.frames like d.all.a d.all.b d.all.c etc and i is over (a,b,c, ...)
[20:25:54] I want to make the output of sprintf() in a way that I can write in the for-loop statements like: check if a value is in d.all.a for example
[20:26:31] ah. So, you want to check if i is found in each data.frame?
[20:26:59] let me show you the whole for loop. give me a sec
[20:27:09] totally
[20:27:38] here: http://www.codeshare.io/5tMQy
[20:27:52] Ironholds, ^
[20:28:08] yup
[20:28:44] I...still don't understand what the end goal is.
[20:28:52] What are you trying to determine? And, can you give me an example dataset?
[20:29:42] think about it this way: can we make the output of sprintf() in line 10 a data.frame?
[20:29:55] so I can apply %in% in the line
[20:30:36] I don't know, because without any datasets, I cannot replicate what the code is currently producing ;p
[20:30:54] ooki. let me dig more and if I see I can't figure it out, I'll ping you again. :-)
[20:30:58] okie-dokes
[20:31:01] thanks!
[20:31:03] I mean, it should be producing a vector, right?
[20:31:08] %in% vector is totally valid syntax.
[20:31:40] yeah, but the output of sprintf() is a string. I somehow have to tell the code not to treat that as a string but to handle it as a dataframe
[20:32:16] so, if instead of sprintf() in line 10 I put d.all.author, then it works
[20:32:30] because d.all.author is a data.frame/matrix
[20:33:02] I mean, it's a character vector of length 1, but..
[20:33:08] yeah
[20:33:12] erm
[20:33:16] have you tried inverting it?
[20:33:26] like t()
[20:33:29] sprintf()... %in% page.response.a$item
[20:33:38] wait, no, that's silly. Ignore me.
[20:33:45] np ;p
[20:33:53] if it's producing a single character value, why not == rather than %in%?
[20:34:01] that should work fine. If it doesn't I have no idea :/
[20:34:42] it does produce a single item vector. the problem is that I want it to produce the data.frame that is defined in lines 1-5
[20:35:58] oh!
[20:37:54] Argh. I don't know how to phrase it
[20:38:21] * Ironholds needs this darn vacation
[22:00:34] Ironholds: trying to find a room, brb
[22:01:42] kk
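Editor's note on the sprintf() question from 20:23 onward, which the log leaves unresolved. The underlying problem is that a name built as a string is still just a string; something has to map that name back to the object. In R the usual routes are `get()` on the constructed name or, more idiomatically, keeping the data frames in a named list and indexing it. The sketch below shows the same pattern in Python, used here only to keep one language across the added examples. The table names and contents are hypothetical stand-ins for `d.all.a`, `d.all.b`, and so on; the linked codeshare snippet was not available.

```python
# Hypothetical stand-ins for the d.all.* data frames mentioned in the chat.
tables = {
    "d.all.author": {"item-1", "item-2"},
    "d.all.title": {"item-3"},
}


def contains_item(tables, raw_name, item):
    """Build the table name from raw_name, look the table up, test membership."""
    # Mirrors sprintf('d.all.%s', gsub(' ', '', i)): strip spaces, add the prefix.
    key = "d.all.{}".format(raw_name.replace(" ", ""))
    # The mapping turns the string back into the object it names.
    return item in tables[key]


for name in ("author", "title"):
    print(name, contains_item(tables, name, "item-1"))
```

Keeping the tables in one mapping avoids looking variables up by name in the enclosing scope, and a mistyped name fails with a clear `KeyError` rather than silently comparing against a string.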