[17:35:01] Structured Data office hour begins in ~25 minutes
[17:58:40] Two minutes!
[17:59:33] * waves *
[17:59:53] o/
[18:00:00] Hey there
[18:00:03] Here we go
[18:00:08] #startmeeting
[18:00:08] Keegan: Error: A meeting name is required, e.g., '#startmeeting Marketing Committee'
[18:00:13] Heh
[18:00:24] Great way to start things off
[18:00:33] :D
[18:00:37] #startmeeting Structured Data on Commons | Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: https://wm-bot.wmflabs.org/logs/%23wikimedia-office/
[18:00:42] Meeting started Tue Jun 26 18:00:37 2018 UTC and is due to finish in 60 minutes. The chair is Keegan. Information about MeetBot at http://wiki.debian.org/MeetBot.
[18:00:42] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[18:00:42] The meeting name has been set to 'structured_data_on_commons___channel_is_logged_and_publicly_posted__do_not_remove_this_note_____logs__https___wm_bot_wmflabs_org_logs__23wikimedia_office_'
[18:00:49] We're on! Welcome all
[18:00:53] Silly bot
[18:00:54] Anyway
[18:00:56] Hi everyone
[18:01:12] Hi
[18:01:12] We're going to start off with a recap of what we've been working on
[18:01:29] And I'll handle keeping track of questions as they come up
[18:01:42] Are there any that people have for later while risler types?
[18:01:53] Hello everyone! Here's a quick summary of where we are now:
[18:02:25] a.) We are coding for multilingual file captions, which will still be the very first released feature (targeting October release now)
[18:03:00] b.) We're preparing Community conversations about Wikidata properties needed for Commons, as well as a chat about Structured Licenses
[18:03:00] There is a prototype available for captions, contact me if you haven't tried it out yet and would like to
[18:03:38] https://commons.wikimedia.org/wiki/Commons:Structured_data/Get_involved/Feedback_requests/Properties_for_Commons
[18:04:01] c.)
we're concurrently developing early work on the integration of depicts "tags" into Commons, starting with Search. Look for a prototype on that within the next month or so
[18:04:20] and now I'll throw it over to Sandra for an update on GLAM-related work
[18:04:53] In the last months we had a first conversation on how GLAM metadata can be mapped to Wikidata and Commons - https://commons.wikimedia.org/wiki/Commons:Structured_data/Get_involved/Feedback_requests/GLAM_metadata_and_ontologies_mapping
[18:05:17] And we are starting to talk about the first GLAM pilot projects that will use Structured Data on Commons for the first time!
[18:05:43] We had a well-attended workshop about possible pilot projects during the Wikimedia Conference in Berlin, where approx. 50 people attended.
[18:06:01] And we have a great longlist of projects which I will follow up on in the upcoming months.
[18:06:01] I'm active in proposing properties, and reviewing property proposals, on Wikidata; HMU if anyone needs help in that area.
[18:06:04] any links for those GLAM pilots?
[18:06:13] Is there User:elmacenderesi?
[18:06:28] This was the session: https://meta.wikimedia.org/wiki/Wikimedia_Conference_2018/Program/47
[18:06:38] There are links to the report and the spreadsheet we collected
[18:06:45] pigsonthewing: Thank you, will absolutely take you up on that at some point (or at least point others to you :) )
[18:07:00] This is just a start - if people are interested to do pilots, please get in touch
[18:07:16] I will also document this better on Commons, as that is still lacking :-)
[18:07:27] Sakhalinio: that user is not in here, I think
[18:07:56] I am looking to photos
[18:08:46] So, any questions about SDC so far?
[18:08:51] Or about the future?
[18:09:15] My understanding is there was a 3 year proposal, and we're more than 2 years into that. What's the progress?
[18:09:35] Not so much structured data on files, but there are two big scanned-image projects (BHL, BL 1 million) that I'm working on Wikidata to build industrial-scale numbers of categories for on Commons (~100,000 + ~60,000)
[18:09:56] Great question. abittaker is the Program Manager, she'll answer that Glrx
[18:10:15] still cleaning up / creating the WD items, but hope to have the cats creating soon
[18:10:18] Hullo Glrx, we're 1.5 years into the program, and working hard on a) MediaWiki infrastructure and b) features for Commons
[18:11:10] Jheald What will the categories describe?
[18:11:14] Actually I don't know this SDC project but I can help you with Wikidata and Commons structuring
[18:11:26] we expect feature development to continue through the project, but we expect people will be able to add captions in October and depicts properties on Commons in January
[18:11:39] All other properties will follow soon after that, in Feb or March
[18:11:43] Sakhalinio: You can learn more about the project (and how you can help) here https://commons.wikimedia.org/wiki/Commons:Structured_data
[18:11:43] it will be cat'ing the images into books, books into authors
[18:11:53] also local map categories
[18:12:01] also subject areas of the books
[18:12:05] Jheald I see! :-)
[18:12:09] We had a GLAM project in Turkey (Pera Museum) but now the museum administrator has removed the square codes because of the wiki block in Turkey
[18:12:27] * Steinsplitter waves
[18:13:22] https://tr.wikipedia.org/wiki/Vikipedi:%C4%B0%C5%9F_birli%C4%9Fi_projesi/2016/WMTR-Pera_(ORK-KD:ERS)
[18:13:26] Current statistics on the BHL wd items at https://www.wikidata.org/wiki/Wikidata:WikiProject_BHL w/ progress page on data augmentation
[18:13:35] Sakhalinio I have seen images from Pera Museum on Commons!
[18:13:41] Something we ought to talk about is the Commons roll-out of Template:Wikidata Infobox, https://commons.wikimedia.org/wiki/Template:Wikidata_Infobox
[18:13:51] Certainly the most high-profile, arguably most significant, integration of Commons with wikidata yet attempted
[18:14:02] Number of uses is now approaching 1.2 million, with Mike Peel's Pi bot currently adding about 4000 a day https://commons.wikimedia.org/wiki/Category:Uses_of_Wikidata_Infobox
[18:14:12] example: https://commons.wikimedia.org/wiki/Category:St._Paul%27s_Cathedral , down the right hand side
[18:14:52] The prototype seems fine. Maybe copyright status should be updated automatically based on the edit on the filedescription. Or do we need a bot to import stuff to structured data? Easiest way would be to parse from filedesc.
[18:14:53] :)
[18:14:54] these have turned up very much in the last month... would be useful to know what ppl think, how they are being received
[18:15:03] Which also means that 1.2 million categories correspond with Wikidata items, which is great
[18:15:25] No: 1.8 million categories correspond to Wikidata items
[18:15:43] Even better :-)
[18:16:24] the infobox seems (apart from minor bikeshedding, which is to be expected) very well received.
[18:16:29] That's a good number for certain
[18:16:36] latest stats: https://www.wikidata.org/wiki/Wikidata:WikiProject_Commons/Links_and_sitelinks/historical#9_June_2018
[18:16:51] plus 100,000 pre-empted by galleries
[18:17:21] It was modified recently so it displays nicely when used in a category with no known Wikidata equivalent
[18:17:56] but important to note that this is out of a total of 6.7 million Commons cats -- so that is 5 million not currently connected
[18:18:02] A lot of these initiatives are being pretty well received. That's always a nice experience.
[18:18:55] many current Commons categories are intersections of two (or more) Wikidata items (cats in Paris, for example)
[18:19:26] Keegan: Makes a change from the response of some on enWP!
[18:19:54] :X
[18:19:56] A striking thing (to me) is how sanely/usefully the template performs on intersection categories, eg: https://commons.wikimedia.org/wiki/Category:Grade_I_listed_churches_in_Bedfordshire
[18:20:28] I'm wondering if that box might display better horizontally
[18:20:32] ... but at the moment it can only be used on Commons cats that have a corresponding WD item where the data is stored
[18:20:46] The Wikidata matching for cats is great progress, it will help the goal of finding translations
[18:20:49] * Steinsplitter is wondering if people saw his msg regarding prototype :)
[18:21:03] Jheald: Because, in that case, there is a single corresponding item
[18:21:03] BIG question I think, is how to extend from 1.8 million cats to the full 6.7 million
[18:21:25] Steinsplitter: which prototype are you discussing?
[18:21:31] We *need* to think what we can do to push this forward
[18:21:45] 2 main options:
[18:21:49] Keegan: The nice prototype on your betawiki
[18:22:01] I would be interested to know what kind of categories are hard to return to one or two Wikidata items, any documentation?
[18:22:04] Jheald: Does Commons want/need exact 1:1 matching?
[18:22:10] 1/ Create namespace for categories on CommonsData
[18:22:27] 2/ Allow items for intersection cats on Wikidata
[18:22:32] What would you have it do, on https://commons.wikimedia.org/wiki/Category:Cats_in_Paris ?
[18:22:37] Keegan: pretty much
[18:22:43] Steinsplitter: Ah, I see. Yes, we're still working on pulling and displaying structured licensing and copyright
[18:22:48] Jheald, what about option 3/ allowing Q-items on Commons
[18:22:55] cool, thanks.
[18:22:58] We'll be getting into that with community consultations starting in the next couple of months
[18:23:24] pigsonthewing: needs an item on Commonsdata or Wikidata that can host the property: Combines topics: cats, Paris
[18:23:47] In theory you can search for cats+Paris and get the same result
[18:23:56] AIUI, risler
[18:24:19] Jheald: I don't see that going on Wikidata; maybe on Commonsdata?
[18:24:31] Keegan: that doesn't give you an infobox, doesn't give the same link-clicky navigability
[18:24:48] Keegan: doesn't help us document the categories in structured terms
[18:25:03] it's about much more than just search
[18:25:28] But we need to be more precise than "Combines topics: cats, Paris". It's "cats *IN* Paris", not "cats from Paris" nor "cats called Paris"
[18:25:37] s/called/named after/
[18:25:52] because documenting the categories in structured terms gives us ready made topics for the images
[18:26:07] some people have used qualifiers on the statements for that
[18:26:25] pigsonthewing: see the usefulness of the Bedfordshire example above, which doesn't do that, but is still valuable
[18:26:36] nature of the relationship can be added as a qualifier
[18:26:49] But if we document the images properly (depicts:cat; location:Paris) do we need (structured data about) the category?
[18:27:23] If I add a picture of Notre Dame and link it to the wikidata cathedral, will it pick up Paris?
[18:27:43] pigsonthewing, ideally there should be an easy way to tag images based on the category they are in
[18:27:43] pigsonthewing: we have to get there. It's all very well to wish for unicorns, but the cats is a much more realistic prospect to work on with bots
[18:27:57] then cascade the topics as suggestions down to the images
[18:28:23] Micru: Could you expand on what you mean by "tag"?
[18:28:39] Is "topic" a useful thing to search for? A topic could be any aspect of the image, including where it was taken, what's in it, who took it, what time of day it was...
[18:28:47] Same issue with the Bedfordshire example though this may help: https://www.wikidata.org/w/index.php?title=Q24974914&diff=703044119&oldid=701200898
[18:28:52] Glrx: we're working on that functionality right now. "statement traversal". will have more info on this in the coming months.
[18:28:56] he means: add a *depicts* : *topic* stmt
[18:29:19] I don't think I'm "wishing for unicorns".
[18:29:40] Jheald, Keegan, recently there was a new tool on Wikidata that would show inferred statements on an item based on other items. I believe something similar could be offered in Commons. A way to show statements from Categories without actually having them on the file page itself
[18:29:58] but Andy even without that, the infobox is still giving an internationalised localised description that's useful to users
[18:30:51] (procedural note: 30 mins left, we're halfway done)
[18:30:53] Andy: qualifier P273 Bedfordshire would be better
[18:30:54] Useful to users, yes - but the aim of SD is to be useful (meaningful) to computers
[18:31:31] usefulness to users is a v important driver
[18:31:53] P273 == not found
[18:31:59] Micru: I haven't seen this tool, do you have a link for the chat?
[18:32:11] getting any structured handle / record on what the remaining 5 million cats are is a step forward
[18:32:15] Keegan, I will have to dig it
[18:32:25] * Keegan nods
[18:32:27] Thanks
[18:32:54] andy: "located in administrative territory" -- haven't been working on places for a bit, may have mis-remembered the number
[18:33:10] Keegan: https://www.wikidata.org/wiki/User:Pasleim/derivedstatements.js
[18:33:31] Much obliged for the link, one moment
[18:33:33] I was 100% against creating items for these intersection cats on Wikidata
[18:33:51] Ah, yes, I have installed that tool on my volunteer account
[18:33:55] I still think a structured space on CommonsData for categories would be better
[18:34:40] But having seen how usefully the {{wikidata infobox}} template performs on intersection cats
[18:35:05] I think items for them on WD is something that is now a necessary structural need
[18:35:10] Keegan, it was announced here: https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2018/06#Announcing_derivedstatements.js
[18:35:30] Jheald: Why against it? Should the intersect category just fall out of two larger categories?
[18:35:34] I would suggest making a proposal to change the notability guidelines on wikidata then
[18:35:53] I know there was a lot of opposition in the past, but that was before there was any obvious use for them
[18:36:04] & the more headway we can get documenting the remaining 5 million categories before 0-day for SD, the better
[18:36:34] RfCs on WD tend to just die without ever any conclusion
[18:37:01] It needs a group agreement that this is something we should do, and then to push it through
[18:37:19] other issues with the infoboxes
[18:37:49] 1) their use being blocked because the sitelink is taken by a gallery...
is a pain
[18:38:04] (responses are being written)
[18:38:37] Need to review, and get a quick decision as to whether we're going to rethink this
[18:39:02] micru, that is a super interesting tool, thank you for sharing! we will definitely share it with our designer and see if similar functionality should be part of the file page on Commons
[18:39:02] How does a Russian user specify his picture is of a cat in Paris?
[18:39:36] eg A) prefer WD item -> Commons cat sitelinks, leave galleries to twist in the wind (suggested by Ghouston, but not taken forward... yet?)
[18:40:50] or B) confirm stick with present status: make new WD item for category, connect to main WD item via main-topic / main-cat pair of properties, and roll these out wholesale
[18:41:44] it's crazy that we don't have infoboxes for the most high-profile things, because these are the very things most likely to have galleries
[18:41:45] Glrx: that Russian user will add depicts statements (either in the UploadWizard, File Page, or some other method). Since both "cat" and "Paris" have labels in Russian, the "tags" will display in his language. We'll try to be as multilingual as the metadata in Wikidata allows us to be.
[18:42:04] needs rapid decision one way or the other, then action
[18:42:52] Is there any way to link an image annotation to a depict statement?
[18:43:05] Glrx: Wikidata is pretty much 100% language agnostic, and Russian in particular is very well developed. CommonsData / Commons presentation can inherit all of that
[18:43:21] Issue 2) re infoboxes
[18:43:37] (2) Real documentation shortfall
[18:44:25] urgently need much better docs, directed at first-time Commons users trying to add/fix/improve an infobox
[18:44:35] Glrx: that Russian user could also add a statement using other properties ("location" or "taken in" or some other properties the community decides upon).
So the picture could "depict" a cat, but also have a different statement explicitly saying the location where the picture was taken.
[18:44:37] eg: A) why does cat not have an infobox ?
[18:44:50] B) why is the map wrong ? how do I fix it ?
[18:45:03] Micru: There is not yet a link between image annotations and depicts statements, but there will be next year :)
[18:45:10] 15 minutes remaining
[18:45:27] C) why is the blue link going to the wrong sort of (ie how to fix homophone issue on Wikidata)
[18:45:46] risler, that is great! how will it be done? with qualifiers containing the annotation ID?
[18:45:56] There are many {{Other versions/image123}} templates to group derived-from images. Is there a SD property to compute such a group?
[18:46:15] D) why is there black text for this term not a blue link (ie how to link wd item to commons if it exists, or create one if it doesn't)
[18:47:15] Glrx: It sounds very sensible to have such a property - it's up to the community to decide if that is wanted/needed and to create it and I would be all for it.
[18:47:25] With luck, Commons users will really start taking to the blue-link to blue-link to blue-link navigation in infoboxes, will really want to start improving them
[18:48:11] A huge boost for SD if they do, because this is exactly how SD info is going to be represented on files -- the very same vocabulary
[18:48:42] risler is writing a reply to Micru. While he does that, I'd like to take a moment to promote the new page https://commons.wikimedia.org/wiki/Commons:Structured_data/Get_involved/Feedback_requests/Properties_for_Commons
[18:48:51] Micru: the exact implementation details are TBD. It will also be tied to statements related to IIIF spec, so that clients can crop/zoom to the specific thing/region of interest in the image. So we have a few use cases we need to account for.
We'll have it all sorted out next year though :)
[18:48:55] It's an exercise in figuring out what properties Commons will need
[18:49:15] risler, thanks for your answer!
[18:49:29] You can use a file provided (or your own) and work through all the statements that might be possible for an image
[18:49:35] So *Recommend* the SD project, particularly the community-interface co-ordinators, take on the infoboxes and their popularisation, and add this as a key task for right now
[18:49:42] Or just list properties if you don't want to do the exercise, that works as well :)
[18:50:26] Keegan, I thought Commons would use Wikidata properties... if there is the need to create additional properties, cannot they be created on Wikidata?
[18:50:37] Micru: Yes and yes
[18:50:51] The question is, what properties will the software need to support initially?
[18:51:00] Great to see IIIF being invoked. For those not familiar: https://en.wikipedia.org/wiki/International_Image_Interoperability_Framework
[18:51:22] Micru, Keegan: Presumably it will be like the Lexicographic properties -- use what WD already has, see what extra Commons needs
[18:51:30] And https://commons.wikimedia.org/wiki/Commons:International_Image_Interoperability_Framework
[18:51:34] Jheald: Yes
[18:51:57] I agree, we cannot predict what is needed until we need it :)
[18:52:32] Per IIIF: a gadget or API to export the contents of a category as an IIIF manifest would be ++valuable for building bridges with the IIIF community
[18:52:32] Right so the idea is that we have a list of what we think we might need, at least. So we're not surprised when we need it, and we can get it created fairly easily, quickly, and painlessly
[18:53:22] Some of the non-Wikimedia heavy users of MediaWiki v interested in IIIF import/export (the National Gallery in the UK for one)
[18:53:45] Yes, we are quite frequently contacted/asked about it
[18:55:08] Okay, about five minutes left. Any remaining questions or comments for now?
[18:55:31] Also on IIIF: we have a WD property to specify part of an image in a way the existing IIIF hack for Commons can display, but there is no URL-formatter currently on WD that can make the link for the property, so even when the data has been added, there is no blue-link currently from the WD page
[18:55:44] This month you'll find the properties exercise that I linked to, it's open for participation for at least all of July
[18:55:58] Is there any possibility of moving forward on that?
[18:56:03] And coming later on this summer we'll have the Depicts prototype out for testing
[18:56:30] How will Fae's copyright question be handled: e.g., file with PD image of Mona Lisa but copyrighted description of the image?
[18:56:48] Soone could write a tool (on Toolserver) to make the IIIF URL using the data from Wikidata
[18:57:01] Keegan, I think there were some mockups about the depict prototype, can you please pass me the link?
[18:57:03] It would be really really good, eg for motivating people to use Shonagon's tool for adding detail coords to WD
[18:57:15] Glrx: good question. We don't know yet.
[18:57:34] s/soone/someone [I wish I could type!]
[18:57:44] There will be a Structured Licensing and Copyright consultation towards the end of the summer, and the issue will be discussed then
[18:57:51] Andy: Shonagon already has. Integrated part of Crotos. But a URL-formatter would be a huge boost to visibility & incentive
[18:58:01] It's largely up to the community to decide, we'll have to bring the conversation together
[18:58:08] Jheald, URL-formatters on Wikidata is something the Wikidata team might be able to handle
[18:58:38] Keegan, Sandra: Any capacity at your end to help write HOWTO docs re fixing/adding WD infoboxes?
[18:58:52] This is something that you help on
[18:59:01] About a minute remaining, so thank you all for coming out and participating
[18:59:17] Jhead: I don't think what's there is what I mean. If I missed something: URL, please?
[18:59:30] abittaker: Needs the push, that this is valuable/needed
[18:59:31] Thank you, all.
[18:59:42] Thanks everybody
[18:59:48] Jheald: Not on that specifically, but I am writing documentation (on very popular request) on how to do current Commons uploads to make them SD-compatible
[18:59:54] wd team have endless tickets to fix, it just gets lost in the mass, w/o a champion
[19:00:04] And that includes guidelines on linking categories with Wikidata items
[19:00:11] abitt. Or a user script?
[19:00:15] Jheald: It's unlikely I'll have time to work on howto docs for the infobox.
[19:00:21] I recommend to keep categories simple for new uploads
[19:00:33] Okay, thanks for coming out (again). I'm going to end the formal meeting, but conversations are free to continue
[19:00:37] #endmeeting
[19:01:03] Meeting ended Tue Jun 26 19:00:50 2018 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[19:01:03] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2018/wikimedia-office.2018-06-26-18.00.html
[19:01:03] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2018/wikimedia-office.2018-06-26-18.00.txt
[19:01:03] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2018/wikimedia-office.2018-06-26-18.00.wiki
[19:01:03] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2018/wikimedia-office.2018-06-26-18.00.log.html
[19:01:03] I need to switch to another meeting unfortunately, but feel free to ping me if there are any further questions!
[19:01:12] spinster: Right now, help is really needed on user-docs for the infoboxes
[19:01:30] the infoboxes have hit - 1.2 million in place
[19:01:52] community needs much better info on how to edit/improve them
[19:02:08] Jheald: If anything, you can file a task on Phabricator for improving them
[19:02:13] at the moment, this is the front-line for WD on Commons
[19:02:13] It's a start to document the need
[19:02:42] but does anybody look at Phab tickets, or ever care about them?
[19:03:37] if having a Phab ticket wld help you argue to be given time to help on this, then I'll do it -- but otherwise it wld just be another ticket for the birds
[19:04:03] Yes. I look at Phab tickets every day. :)
[19:04:26] Okay, I need to get lunch. Thanks again, all!
[19:05:10] Help is really needed on this, and you just need to look at the {{wikidata infobox}} talk page to see how frayed eg RexxxS is getting fielding comments, because there's no good entry-level "how do I do this? how do I fix this?"
[19:10:04] pigsonthewing: misunderstood you, re IIIF display. Crotos has good tool to make coords for a detail, & to display details that a WD item locates. But maybe a JS gadget or user-script to turn the coords-value into a link, on the WD page?
[19:10:49] JS gadget or script should be v simple, if one knows how to make it
[19:15:42] Jheald: regarding making wd items, when I say proposal, I don't necessarily mean a formal rfc (which as far as I can tell isn't a requirement), I just mean that the notability policy will need to be updated at some point, otherwise it's going to create conflict, and that will need some sort of discussion
[19:17:13] but of course the more people from commons who agree that they want it, the better
[19:17:39] Is it correct?
https://commons.wikimedia.org/w/index.php?title=Sultanahmet_Camii&type=revision&diff=308223459&oldid=271729005
[19:23:05] and please ping me if you do, I've argued before for more equal treatment for commons sitelinks
[19:32:10] Sakhalinio: yes.
[19:32:22] https://phabricator.wikimedia.org/T198255
[19:32:35] "Wanted: Wikidata user script to add links using IIIF data"
[19:38:15] Keegan: people are also welcome to join #wikidata and ask questions there... it's not always very active but they'll probably get an answer if they hang around
[19:38:46] nikki: Great, thanks
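
Editor's note on the "turn the coords-value into a link" idea raised near the end of the log: a minimal sketch of what such a helper might look like, not the gadget requested in the Phabricator task. It assumes the stored value is an IIIF-style region string (for example `pct:25,25,50,50`) and follows the IIIF Image API 2.x URL pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The function name `iiif_region_url`, the base URL `https://iiif.example.org/image` and the file name are hypothetical placeholders; the log does not say which IIIF server or Wikidata property would be used.

```python
import re
from urllib.parse import quote

# Accepts "full", "square", "x,y,w,h" or "pct:x,y,w,h" (IIIF Image API 2.x region forms).
# Decimals are tolerated for simplicity, although pixel regions are normally integers.
_REGION_RE = re.compile(r"^(full|square|(pct:)?\d+(\.\d+)?(,\d+(\.\d+)?){3})$")


def iiif_region_url(base_url: str, identifier: str, region: str,
                    size: str = "full", rotation: str = "0",
                    quality: str = "default", fmt: str = "jpg") -> str:
    """Build an IIIF Image API 2.x style URL for one region of an image.

    base_url and identifier are illustrative assumptions; a real tool would
    need the address of an actual IIIF service for Commons files.
    """
    if not _REGION_RE.match(region):
        raise ValueError(f"not a recognised IIIF region: {region!r}")
    # The identifier must be percent-encoded, including any slashes.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


if __name__ == "__main__":
    # Hypothetical server and file name, for illustration only.
    print(iiif_region_url("https://iiif.example.org/image",
                          "Mona Lisa.jpg", "pct:25,25,50,50"))
```

The default size `full` is the Image API 2.x spelling; version 3.0 of the specification uses `max` instead, so a real script would need to match whichever version the target server implements.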