[14:59:05] Timezone-appropriate greetings.
[14:59:52] TAG!
[15:16:03] * halfak reviews the philosophical foundation of a thesis.
[15:16:32] Weird sometimes to switch modes and think about how we use "sociomateriality" and "intersubjectivity" to discuss the experiences of humans.
[15:16:59] I find myself asking why we invest in such terminology when there's more familiar terminology that's readily available.
[15:17:09] I suppose not for sociomateriality.
[15:18:27] * guillom has not encountered such combinations of syllables before.
[15:23:06] Those verbostercular terms are indeed most confounding, halfak.
[15:23:38] Erudite vernacular
[15:24:19] Don't be such a hippopotomonstrosesquipedaliophobic. :P
[15:24:37] https://en.wiktionary.org/wiki/hippopotomonstrosesquipedaliophobia
[15:25:37] * hippopotomonstrosesquipedaliophobe
[15:26:51] Emufarmers, that's right. Thanks. :)
[15:32:09] Show-offs :p
[15:32:33] http://www.ucd.ie/artspgs/semantics/ConsequencesErudite.pdf
[15:32:45] "Consequences of Erudite Vernacular Utilized Irrespective of Necessity"
[15:33:25] hah
[15:35:38] * guillom adds to reading list.
[16:42:46] halfak: I'm still very much experimenting, but I thought you might like to see what I've started doing with your data from mwcites. https://plot.ly/~tarrow/10/pmc-ids-cited-on-english-wikipedia/ Has much of this kind of analysis of citations been done before?
[16:44:58] Hey tarrow. Not that I have seen. I did a mini-presentation at the Wikimania Hackathon about this utility and how you can extract identifiers. See https://meta.wikimedia.org/wiki/Research:Scholarly_article_citations_in_Wikipedia
[16:45:25] DarTar would be a better resource for knowing what has made it to the academic literature.
[16:45:34] He doesn't come onto IRC much these days.
[16:46:27] ah cool, it's very interesting. I'll have a little look around
[16:46:50] tarrow, I'd be very interested in talking to you about ways of turning these IDs into metadata and incorporating that into 'mwcites'.
[16:47:03] Do you know if there is a good API for getting PubMed metadata?
[16:48:05] Yep, so I've just started an internship with EuropePMC (I think it can be described as a mirror of the NIH version with some other benefits)
[16:48:21] Cool! An insider.
[16:48:35] * halfak rubs hands together and laughs maniacally
[16:48:36] They have a RESTful API available: https://europepmc.org/RestfulWebService
[16:49:36] I only just started learning Python, but I made a very bad client to query it: https://github.com/tarrow/epmclib
[16:49:45] So, my basic idea is this: I want you to be able to take the TSV (or whatever format) that includes PMIDs and run that through a script that will maintain a cache and make API calls for you.
[16:49:55] This cache can be maintained between runs.
[16:49:59] * halfak checks out repo
[16:50:04] ah cool, that sounds good
[16:50:19] "untangle"?
[16:50:35] yeah, it is a simple XML reader thing
[16:50:48] Oh goodness. It's XML :(
[16:51:21] like I said, I literally started learning anything useful a few days ago, so it is probably far from optimal
[16:51:30] Why do people keep using XML for things?
[16:51:35] No! Not your fault.
[16:51:43] Untangle looks like a saner way of working with the XML :)
[16:52:04] My most successful open source libraries are the ones that make it so no one has to work with the XML in the MediaWiki dumps.
[16:52:07] :)
[16:52:20] This looks great!
[16:52:29] Yeah, I just took the simplest thing I could find on Stack Exchange
[16:52:35] And yay for MIT License!!!
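The cached metadata fetcher described in this exchange might look roughly like the sketch below. It is a minimal illustration, not code from mwcites or epmclib: the Europe PMC search endpoint and JSON field names follow the public RESTful web service documentation, while the cache file name and the assumption that the PMID sits in the last TSV column are hypothetical.

```python
"""Sketch: read PMIDs from a TSV, look each one up against the Europe PMC
RESTful API, and keep a JSON cache on disk so repeated runs don't re-query."""
import csv
import json
import os
import sys

import requests

EPMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"
CACHE_PATH = "epmc_cache.json"  # hypothetical cache file


def load_cache(path=CACHE_PATH):
    # Reload whatever was fetched on previous runs, if anything.
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}


def save_cache(cache, path=CACHE_PATH):
    with open(path, "w") as f:
        json.dump(cache, f)


def fetch_metadata(pmid, cache):
    """Return Europe PMC metadata for a PMID, consulting the cache first."""
    if pmid in cache:
        return cache[pmid]
    params = {"query": "EXT_ID:{0} AND SRC:MED".format(pmid),
              "format": "json"}
    response = requests.get(EPMC_SEARCH, params=params)
    response.raise_for_status()
    results = response.json().get("resultList", {}).get("result", [])
    cache[pmid] = results[0] if results else None
    return cache[pmid]


if __name__ == "__main__":
    # Assumes a TSV (e.g. mwcites output) with the PMID in the last column.
    cache = load_cache()
    with open(sys.argv[1]) as tsv:
        for row in csv.reader(tsv, delimiter="\t"):
            meta = fetch_metadata(row[-1], cache)
            if meta:
                print(row[-1], meta.get("title"))
    save_cache(cache)
```

Persisting the cache as a flat JSON file keeps repeat runs cheap without adding a database dependency; a real implementation in mwcites might well choose something sturdier.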
[16:52:52] but any way I can help would be great
[16:53:19] I wrote it to start populating harej's librarybase with content
[16:53:29] I'd like to pick this up this weekend during my "volunteer hours" and see what it would take to generalize this metadata fetching strategy in mwcites.
[16:53:31] +1 woot
[16:53:36] librarybase :)
[16:53:54] I'd like to have metadata fetchers for all the ID types we gather.
[16:53:58] but I'm also very interested in seeing how the citations are then used
[16:54:26] One possible solution which I considered is Zotero
[16:54:46] I understand that is how citoid works, and we'd probably be looking for similar things
[16:54:58] https://github.com/mediawiki-utilities/python-mwcites/issues/8
[16:55:25] What do you think? Does Zotero have a nice API we could use?
[16:55:39] I don't know anything more about it, to be honest
[16:55:51] I could definitely investigate though
[16:56:24] If you beat me to it, I'd love to read your notes :)
[16:56:38] In an ideal world, maybe not that far off, we could just query Wikidata :)
[16:57:05] Yeah. Now you're talking. If only we could get the publishers to register their own metadata there too.
[16:57:10] Sure, I'll definitely let you know what I find out
[16:57:14] Cool :)
[16:57:42] well, if I get the librarybase thing working, then perhaps people will be interested in pushing it into Wikidata
[16:57:52] This reminds me to eventually get https://www.mediawiki.org/wiki/User:Halfak_%28WMF%29/mediawiki-utilities moved to the main namespace.
[16:58:19] if that looks good, then I might be able to convince the people here to regularly push out the updates they get from NIH or wherever to Wikidata
[16:58:21] tarrow, I think you're right. This would probably be the cleanest import Wikidata ever saw. :)
[16:59:16] since they already have a big old pipe of papers and metadata flowing in
[17:05:38] Excellent; I'll take a little look at citoid/Zotero then. I'll also try to update my little library, because I think we could get data that Zotero won't have. For example, some papers return the license and open access status from EuropePMC, which is probably also interesting. I'll let you know if I make any good progress.
[17:15:50] Sounds great. Thanks tarrow :)
[17:16:32] I'll be hacking on mwcites and mwrefs this weekend -- to merge them and to develop a generalized metadata fetching/caching strategy.
[17:39:43] Librarybase has myriad potential applications
[17:39:55] But my own motivation was to create it for the Wikipedia Library
[17:42:48] Namely: analysis of citation data within a subject area, spawning recommendations for sources, with a corresponding hook into Wikipedia Library offerings.
[17:42:51] WORKFLOWS!
[17:43:43] Also, would anyone object if I replaced Blueprint with Vector? I've had my fill of Blueprint :P
[17:44:13] ^ MW skins?
[17:47:30] Yes
[17:47:56] I initially went with Blueprint because I wanted the "next generation" skin. But then I got tired of it.
[17:48:18] It's not very good for anything other than presentation.
[18:23:28] harej, gotcha. I figured that preferring "vector" was sacrilege, but I suppose that "vector" is the new "monobook".
[18:24:14] I'd love to use a skin that's not vectorbook, but Blueprint is... not good enough
[18:42:35] halfak, re: email to last year's CSCW workshop participants. We can include a link to the Figshare report in the advert for this year's workshop, OR we can send the report 'announcement' as a follow-up email. Do you have a preference for which one of these we do?
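The "just query Wikidata" idea from the exchange above could start with a lookup like this one: a minimal sketch, assuming the public Wikidata SPARQL endpoint and the PubMed ID property (P698), with an arbitrary example PMID. It only checks whether an item already exists for a given PMID, which is the first question any import, from librarybase or elsewhere, would need to answer.

```python
"""Sketch: ask the Wikidata SPARQL endpoint whether an item exists for a PMID."""
import requests

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# P698 is the PubMed ID property; the label service adds English labels.
QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P698 "%s" .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""


def find_items_by_pmid(pmid):
    """Return (item URI, label) pairs for Wikidata items carrying this PMID."""
    response = requests.get(
        SPARQL_ENDPOINT,
        params={"query": QUERY % pmid, "format": "json"},
        headers={"User-Agent": "citation-metadata-sketch/0.1"})  # hypothetical agent string
    response.raise_for_status()
    bindings = response.json()["results"]["bindings"]
    return [(b["item"]["value"], b["itemLabel"]["value"]) for b in bindings]


if __name__ == "__main__":
    # Arbitrary example PMID, purely for illustration.
    for uri, label in find_items_by_pmid("23245604"):
        print(uri, label)
```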
[18:43:13] I may be overthinking this.
[18:47:07] Report announcement
[18:47:12] More likely to get read
[18:47:19] J-Mo, ^
[18:47:41] I put the TL;DR at the top because I figure that people will just be scanning the rest for links and dates.
[18:47:49] * YuviPanda waves vaguely
[18:48:47] sounds good, halfak. So cc me on the list of recipients for the workshop announcement, and I'll reply in a few hours with "oh, and by the way folks, look what we finally have!"
[18:48:55] And there will be much rejoicing.
[18:49:55] halfak: we should get non-bouncer people on IRCCloud :)
[18:50:11] I'm still a non-bouncer person.
[18:50:17] Can I connect my IRC client to IRCCloud?
[18:50:23] halfak: unfortunately not.
[18:50:31] halfak: wah, you don't have a bouncer and are around every time I look?
[18:50:34] are you 3 people?
[18:50:35] I like my client better than theirs
[18:50:40] * guillom waves at YuviPanda.
[18:50:44] ya can't blame you!
[18:50:47] * YuviPanda waves at Guerillero
[18:50:49] err
[18:50:51] guillom:
[18:50:52] Oh. My laptop stays connected whenever I'm not traveling.
[18:50:53] hi to Guerillero too :)
[18:50:58] halfak: ah :)
[19:04:33] Okay, I now have Librarybase using a modified version of Vector. http://librarybase.wmflabs.org/wiki/Librarybase:Home
[19:05:24] logo plz kthx
[19:06:33] The "Appears on English Wikipedia article" thing makes the page a bit slow to load, it seems.
[19:09:07] I think usually a source won't be cited on as many pages?
[19:09:24] There are Wikidata entries that are 1.7 MB in size; they have a similar problem
[19:14:10] The logo should be the Credible Hulk.
[19:57:41] EGalvez_: hi
[19:58:42] EGalvez_: is there a reason you don't want to encourage everyone to publish the survey dates on the public page, unless there is a need for this information to stay private?
[22:59:18] hey halfak, got two questions for you about your VE research. got a min?
[23:00:12] what user actions are involved in an "intra-edit session" and a "saveAttempt"? https://meta.wikimedia.org/wiki/Research:VisualEditor%27s_effect_on_newly_registered_editors/May_2015_study
[23:01:05] o/ J-Mo ... was in meeting... just starting another meeting
[23:01:13] Will come back and answer
[23:01:24] no prob. whenever you have a spare cycle
[23:01:49] so many meetings