[15:22:18] _o/
[15:49:26] o/!
[15:51:38] Hello halfak.
[16:00:31] hey halfak. I'm wondering: am I in the wrong hangout?
[16:00:36] (somehow no one is in. :D)
[16:00:55] oh hey guillom. I haven't seen you for such a long time.
[16:01:43] lzia: I've been hiding in my cave, reading books and papers, and fomenting my plans to take over the world :)
[16:03:13] haha, guillom. happy to have you back
[16:03:18] :)
[16:05:00] lzia: One day when we're both in the office, I'd like to talk to you about social network analysis. I saw that you were investigating it, and I have some ideas I was going to play with in the next few months.
[16:05:21] tomorrow, guillom?
[16:05:29] (Nothing urgent, and my thinking is not yet very clear.)
[16:06:00] (I'll be remote starting Monday for 3 weeks, so let's catch up tomorrow if you'll be there in the morning? if not, early October?)
[16:06:15] I need to work from home tomorrow; but I need to organize my thoughts into writing first anyway :)
[16:13:35] * guillom curses at JabRef.
[16:13:50] Works fine on one computer but not on the other.
[16:14:02] * guillom shakes fist.
[16:17:44] sad that I will miss you, guillom.
[16:21:32] Hmm. What if I mass downloaded papers on SciHub and ran scripts on them to extract a citation graph?
[16:21:52] Obviously I would be guilty of large scale copyright infringement, but what of the citation graph itself?
[16:28:44] hare, I don't think you'd be guilty of copyright infringement
[16:28:55] Unless you redistributed the papers
[16:29:12] I think the citation graph can be relicensed.
[16:29:16] * halfak is not a lawyer
[16:29:57] Right. The citation graph would be sweat-of-brow metadata extraction and not subject to copyright. Doesn't make the methodology more legal though.
[16:33:39] Interesting thought exercise. I wouldn't recommend people try it though.
[19:00:05] hello, everyone! I'll be your neighborhood IRC host for the SPARQL workshop. I'll be looking for questions to relay to the room but it wouldn't hurt to ping me as well.
[19:11:25] neilpquinn, is this workshop open to the public?
[19:11:52] halfak I don't believe so.
[19:12:01] Jonas just posted it to #wikidata
[19:13:37] so maybe it is public? :)
[19:14:05] neilpquinn, if it is public, I'd like to update the topic with information on how to join :D
[19:15:08] halfak I just checked with Dario and he says that the workshop is just for staff, although the recording of the first part will be posted publicly later.
[19:15:39] Oh. OK. We should keep the chat to -staff then, maybe.
[19:15:43] I guess it's not ideal to use a public IRC channel for it, but I'm not sure there's a better one.
[19:15:45] hey folks
[19:15:46] sorry I was not on IRC before
[19:15:51] Meh. s'ok
[19:16:19] yes we’re piggybacking #wikimedia-research to support a workshop targeted at WMF staff
[19:16:24] Not sure, it feels like it might be too serious for staff?
[19:16:39] the presentation will be published on commons later
[19:23:06] I love that the 3 descriptions I can understand on https://www.wikidata.org/wiki/Q3 are so fundamentally different.
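
Those per-language descriptions can be compared side by side with a short query; a minimal sketch, runnable at query.wikidata.org (where the wd: and schema: prefixes are predeclared), with the language list chosen arbitrarily for illustration:

    # Compare the descriptions of Q3 across a few languages
    SELECT ?lang ?description WHERE {
      wd:Q3 schema:description ?description .
      BIND(LANG(?description) AS ?lang)
      FILTER(?lang IN ("en", "fr", "de"))
    }
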
[19:26:51] FYI the cats example that Stas just showed is the first in the "Examples" list in WDQ (so no need to re-write it)
[19:27:06] yeah the example list is amazing
[19:27:20] Billionaires is also in there, findable with the search box
[19:27:57] J-Mo: ^ (fyi)
[19:28:21] good practice, tho ;)
[19:28:30] yup :)
[19:30:50] timeout is thirty seconds btw
[19:33:08] WikidataFacts: we started a bit late so I’m okay letting Stas go until .40
[19:33:21] I’ll give him the 5 minute warning in 2
[19:33:25] I just meant the timeout on the query service :D
[19:33:35] sorry
[19:34:20] ah
[19:34:24] :)
[19:34:33] SMalyshev: i think you just dropped out of the hangout
[19:34:36] stas we can’t hear you any more
[19:34:38] oops
[19:35:00] we’re restarting the speakers in the room too
[19:35:16] I'll try to rejoin
[19:36:30] \o/
[19:37:00] SMalyshev: 5-ish minutes left
[19:38:41] Wondering if we can build timelines?
[19:38:50] Thanks :)
[19:39:21] embed mode is something I discovered very recently
[19:40:03] (I found a timeline-compatible example with "recent events".)
[19:40:11] wow, export as vega, very neat
[19:40:48] so action=purge will update it?
[19:40:57] This. Is. So. Cool.
[19:41:39] guillom: i concur.
[19:41:50] SMalyshev: -1 minute :D
[19:41:51] hot damn this is awesome
[19:42:03] I wish territoriality concerns wouldn't prevent things like this from making it into every Wikimedia project.
[19:42:40] guillom: amen to that
[19:43:26] Just stumbled upon this irc channel. May I ask was is awesome?
[19:43:42] so I'm pretty good on time excluding the network problem :)
[19:44:16] SMalyshev: you are
[19:44:17] andrawaag: you can, but you might have to restate to a clearer question ;)
[19:44:21] hey andrawaag
[19:44:34] we’re hosting an internal SPARQL workshop
[19:44:35] DarTar: yes. Yuri knows the details but yes it's data-driven, cacheable and purgeable
[19:45:01] will be posting the presentations on Commons, we have a few people on the call you might know ;)
[19:46:45] bgood: meet andrawaag
[19:49:09] and this is why we have the 30-second timeout ;)
[19:51:19] even 30 seconds is pretty generous, but i suppose our query rate is still low enough
[19:53:39] so glad to find someone else who writes triples without a space before the dot :D
[19:53:43] ebernhardson: yes, it's mostly fine with this t/o
[19:54:11] andrawaag: I think the specific thing that was awesome was the ability to export a result from the Wikidata Query Service as a Vega graph definition which you can then paste into a Mediawiki page (using the Graph extension).
[19:54:41] btw: we're working on getting a TPF/LDF server running on wikidata too. Will take some time though.
[19:56:30] neilpquinn: thanks
[19:56:43] SMalyshev: grand
[19:59:44] and yes, we expect the TPF interface will allow solving queries like "give me all the humans"
[20:00:26] i'd be curious to find out how much cheaper it is when the queries themselves are cheaper, but the query volume increases 200x
[20:00:42] i can certainly see the server answering the queries being simpler
[20:01:32] also TPF queries can be paged more easily so if you have tons of results it may be easier on the server
[20:02:02] for SPARQL you can have complex conditions that require going through the whole data set before you can get any results
[20:02:23] ebernhardson: save that for the Q&A
[20:02:49] I like the fact you can start consuming data streamed as it comes in
[20:02:55] DBPedia and the Harvard library use the same identifiers for specific individuals?
[20:03:26] Ah that's what VIAF is for?
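
The "give me all the humans" query mentioned above is the canonical expensive case: a single triple pattern with millions of results. A sketch of it follows (the LIMIT is added here only so it finishes within the 30-second timeout on the public endpoint; a TPF server would instead page through the full result set):

    # Every item that is an instance of human (Q5);
    # unbounded, this exhausts the WDQS timeout, so cap it for illustration
    SELECT ?human WHERE {
      ?human wdt:P31 wd:Q5 .
    }
    LIMIT 100
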
[20:03:28] AndyRussG: if I understood it correctly, they both refer to the VIAF identifier
[20:03:29] yeah
[20:03:55] VIAF is a famous authority control database
[20:03:55] K cool!
[20:03:58] thx :)
[20:05:44] The Federation is the future! ;)
[20:07:41] http://bit.ly/artists-san-francisco, http://bit.ly/harvard_san_francisco, http://bit.ly/vivo-viaf-dbpedia, http://goo.gl/ZX1Xr6
[20:07:45] (We're just waiting for Zephram Cochrane to do his thing.)
[20:07:47] From BJ chat.
[20:08:04] we have time till 2161 :)
[20:09:00] thanks Niharika
[20:11:40] Gotta run for now, might make it back for the end of the session... all this is fantastic, congrats all!!
[20:13:18] slides from the upcoming presentation (bgood and Tim_): http://www.slideshare.net/goodb/gene-wiki-and-mediawiki-foundation-sparql-workshop
[20:13:38] ok I put it on commons here: https://commons.wikimedia.org/wiki/File:SPARQL_Workshop.pdf
[20:14:05] andrawaag: looks like you made it in :p
[20:14:08] hopefully I didn't mess up anything, not sure about all the commons rules
[20:14:36] SMalyshev: everything I saw in your deck should be kosher for Commons
[20:14:47] okay then
[20:14:50] DarTar: Yes I did
[20:15:38] andrawaag: I might sign you up for an impromptu demo using P2860 then :D
[20:15:53] DarTar: Why the :p ;)
[20:17:10] andrawaag: I am calling our security team right now
[20:17:15] DarTar: Since we are on the topic of federation, P2888 would be more on topic
[20:19:43] (I'm slightly disappointed that Q834003 isn't Q1701.)
[20:20:27] chief sparqlr
[20:21:13] https://meta.wikimedia.org/wiki/Wikidata_Easter_eggs <- no Q1701 though....
[20:22:11] I only knew of Q42 :)
[20:34:23] Well I'm glad I'm taking this online course in systems biology; didn't think it would come in handy today!
[20:40:19] all those refs are belong to us
[20:41:31] Looking at http://www.slideshare.net/andrewsu/centralized-model-organism-database-biocuration-2014-poster and facepalming at the "the genomic data was only available in the PDF attachment" bit (in the "dark matter" section).
[20:41:58] yeah.... it's crazy
[20:44:39] interesting, I hadn’t realized by default Gene Wiki was using “stated in” -> “scientific literature”
[20:44:53] it's not (bgood here)
[20:45:01] under construction
[20:48:18] ah ic
[20:48:35] I was pretty impressed by Tim_’s multitasking abilities
[20:49:52] some useful links here https://www.wikidata.org/wiki/User:ProteinBoxBot
[20:54:36] (As a non-native English speaker, it feels super weird every time someone pronounces "GUI" as "gooey".)
[20:55:16] I always do that :)
[20:55:42] :D
[20:55:43] +1 bgood very much agreed
[20:56:07] domain-specific UIs for reading and writing content into Wikidata
[21:00:47] DarTar sorry, misunderstood your question a little. Answer is not yet, but when we are successful enough that bioinformatics workflows are written that depend on Wikidata it will indeed become important...
[21:05:02] this is a common problem, for example i wrote up a query the other day to return olympians with the most medals, but Phelps didn't take the top spot because while many athletes have entries for individual events, https://www.wikidata.org/wiki/Q39562 only lists games medaled in and not the specific events
[21:06:36] this is a common problem - data can be incomplete and inconsistent
[21:07:23] e.g. when I did an example on US state governors, I discovered half of the states don't even have governor listed... so some things are well-covered, some still not
[21:08:23] DarTar, you are in a ridiculous timezone!
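
ebernhardson's actual medal query isn't quoted in the log; a sketch of the same idea, assuming P166 ("award received") carries the medal statements and that Q636830 is the "Olympic medal" class (both the class choice and the property path are guesses, not his query):

    # Rank athletes by counted medal statements; Q39562 (Phelps) scores
    # low here because his medals aren't itemized per event
    SELECT ?athlete (COUNT(?medal) AS ?medals) WHERE {
      ?athlete wdt:P166 ?medal .
      ?medal wdt:P31?/wdt:P279* wd:Q636830 .
    }
    GROUP BY ?athlete
    ORDER BY DESC(?medals)
    LIMIT 20
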
[21:08:47] Yes, twitter is awesome for SPARQL: see for example https://twitter.com/egonwillighagen/status/766988771116154880
[21:09:24] UTC-8 is one of the least populated timezones
[21:09:27] http://artscience.cyberclip.com/world-population-by-time-zone
[21:09:38] halfak: I thought it, thanks for saying it :D
[21:10:45] i wonder what that would look like adjusted for economic activity by timezone :P
[21:13:22] ebernhardson: http://tinyurl.com/jl2sabg :P
[21:14:50] the duplicated values make me dubious ;)
[21:15:13] ebernhardson: meaning?
[21:15:27] halfak: I’m always in the most serious TZ wherever I am
[21:15:35] DarTar: if 9 timezones have the exact same profit value, down to the $, the query is probably incorrect
[21:15:43] or the data, whichever
[21:16:00] probably companies with multiple headquarters locations
[21:16:13] and very few companies with net profit entered
[21:16:19] yea
[21:16:37] yeah, 12, 4, and 2 results per timezone
[21:17:58] wow, the timezone offset triple actually removes a lot of companies
[21:18:03] 114 vs 42
[21:20:18] oh yeah, and only 15 net profit statements at all (and several items have multiple profit statements)
[21:22:25] halfak: nice chart, btw. You should be able to express it in SPARQL
[21:22:56] states or countries by TZ -> population
[23:35:57] running to pick up the kids, see you later folks
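
The "states or countries by TZ -> population" chart translates to a short aggregate; a sketch using the usual identifiers (Q6256 "country", P421 "located in time zone", P1082 "population"), with the same duplication caveat as the profit query, since a country spanning several zones is summed into each of them:

    # Total population per time zone, summed over countries;
    # multi-zone countries contribute their full population to every zone
    SELECT ?tz (SUM(?pop) AS ?population) WHERE {
      ?country wdt:P31 wd:Q6256 ;
               wdt:P421 ?tz ;
               wdt:P1082 ?pop .
    }
    GROUP BY ?tz
    ORDER BY DESC(?population)
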