[00:13:12] Can property P921 be used for scientific articles? (this article is about...)
[00:13:22] https://www.wikidata.org/wiki/Property:P921
[14:38:40] Thiemo_WMDE: can you add a link to the relevant ticket here? I can't find it. https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team#Why_can_user_interface_display_60_arcseconds.3F
[14:46:25] DanielK_WMDE: Do you mean https://phabricator.wikimedia.org/T153429 ?
[14:54:57] Ainali: yes, thank you!
[15:09:43] Aleksey_WMDE: http://bikeshed.com/
[15:35:33] Lydia_WMDE: Linked Data Fragments looks cool! That might have solved our human problem just in time ;-)
[15:35:49] multichill: ;-)
[15:36:33] Is it possible to combine SPARQL and LDF server side, or is it always the client doing that?
[15:56:01] Where can I find documentation on what to expect in wbc_entity_usage's eu_aspect column?
[15:59:40] I figured that'd find something with this for sure: https://www.mediawiki.org/w/index.php?search=wbc_entity_usage
[16:02:39] halfak: I'm quite sure DanielK_WMDE worked on that
[16:03:31] At some point, someone told me what to expect there and I've now lost the details.
[16:04:07] https://phabricator.wikimedia.org/T92288 ?
[16:04:12] if you guys tell me how to find a repo's source tree on phabricator/differential, i can link you to the documentation :D
[16:04:31] Github mirror?
[16:04:42] i don't want to :)
[16:04:58] it's there on phab, i have seen it, but it's totally unclear to me how to find the link
[16:05:12] DanielK_WMDE, what is "it"?
[16:05:16] on phab
[16:05:20] Is it the wikibase repo?
[16:05:25] Is it an extension?
[16:05:34] halfak: How to get from https://gerrit.wikimedia.org/r/#/c/197504/4/client/includes/Usage/Sql/SqlUsageTrackerSchemaUpdater.php to just browsing the repo in phab?
[16:05:35] the wikibase repo, yes.
[16:05:41] yes, it's an extension
[16:06:26] https://github.com/wikimedia/mediawiki-extensions-Wikibase
[16:06:30] https://phabricator.wikimedia.org/diffusion/EWBA/browse/HEAD/ ?
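(The question above is about P921, "main subject", which is indeed the property commonly used to state what a scientific article is about. As an illustrative sketch only: the helper below builds a SPARQL query for the Wikidata Query Service that lists scholarly articles with a given main subject. The function name is invented for this example, Q13442814 is the item for "scholarly article", and Q11660 is assumed here to be the item for "artificial intelligence"; sending the query to https://query.wikidata.org/sparql is left to the caller.)

```python
def main_subject_query(subject_qid: str, limit: int = 10) -> str:
    """Build (but do not send) a SPARQL query for scholarly articles
    whose main subject (P921) is the given item."""
    return f"""
SELECT ?article ?articleLabel WHERE {{
  ?article wdt:P31 wd:Q13442814 ;      # instance of: scholarly article
           wdt:P921 wd:{subject_qid} . # main subject
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()

# Q11660 is assumed to be the item for "artificial intelligence".
print(main_subject_query("Q11660"))
```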
[16:06:39] Is that supposed to give an error?
[16:06:52] https://phabricator.wikimedia.org/diffusion/EWBA/browse/ works
[16:07:00] halfak: yes, but we have it on phab, and we should use it, because code review is moving there, right?...
[16:08:13] https://phabricator.wikimedia.org/diffusion/EWBA/
[16:08:27] Would be really nice to have documentation like https://www.mediawiki.org/wiki/Manual:Page_table
[16:09:00] multichill: for a table defined by an extension?
[16:09:11] This is such an awful URL: https://phabricator.wikimedia.org/diffusion/EWBA/
[16:09:15] But it's the right one
[16:09:20] i guess... but i prefer to maintain such documentation as part of the code base
[16:09:31] Yes. Or just a placeholder that contains a link to the documentation. That works too
[16:09:35] it's easier to keep in sync, and it's always clear which documentation refers to which version of the code
[16:09:38] DanielK_WMDE, I understand, but that's really bad for people like me.
[16:09:47] It's completely unfindable at the moment
[16:09:52] Right
[16:10:07] multichill, halfak: https://phabricator.wikimedia.org/diffusion/EWBA/browse/master/docs/usagetracking.wiki
[16:10:30] https://phabricator.wikimedia.org/diffusion/EWBA/browse/master/docs/usagetracking.wiki;0e42902aa8cfbffd2359442019e7d04e1a0fad1a$20
[16:10:31] well, it's in the docs directory of the extension - that's where documentation of the extension should be :)
[16:10:45] DanielK_WMDE, generally, we put these things on mediawiki
[16:10:46] but we do not have comprehensive documentation of our database schema...
[16:11:05] There's comprehensive documentation for the MediaWiki base schema
[16:11:17] halfak: yes, and then forget about it.
[16:11:37] See also https://www.mediawiki.org/wiki/Extension:ORES/Schema
[16:11:41] well, the db schema docs on mw.o are good for core
[16:12:01] but i find it a really bad idea to have the primary copy of the documentation on-wiki
[16:12:05] it should be in sync with the source
[16:12:09] so it should be maintained in git
[16:12:20] i'd love to have it automatically synced with the wiki, though
[16:12:44] i'll put a link on the wiki somewhere
[16:13:00] DanielK_WMDE, and it should be findable by developers and users of the database software.
[16:13:18] DanielK_WMDE, please also add a bunch of relevant search terms around that link.
[16:13:24] Like all of the table names.
[16:13:55] Otherwise, the search index will be useless and you'll just have a link that no one will find.
[16:14:50] it's the only table for which we have good documentation, i'm afraid
[16:15:03] i'll see what i can do
[16:15:49] halfak: does the documentation answer your question?
[16:16:20] The extension is split up into multiple parts. wb, wbs and wbc are the prefixes. wbc is the client, but what is the difference between wb and wbs?
[16:17:23] halfak, multichill: oh, look! https://www.mediawiki.org/wiki/Wikibase/Schema
[16:17:42] what did I say about such documentation on the wiki not being findable, and going out of date quickly?
[16:19:29] Your team created that DanielK_WMDE , you should also keep it up to date or link it to the place with the up-to-date documentation
[16:20:40] Writing and maintaining documentation might not be the most fun job, but it should be done....
[16:21:10] DanielK_WMDE, yes, you do need to keep documentation maintained.
[16:21:12] That is a thing.
[16:21:41] I don't mean to be critical, but I lost an hour today just looking for this stuff before I finally caved and asked for help.
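(For readers landing here with the same eu_aspect question: per Wikibase's docs/usagetracking.wiki, the aspect codes documented at the time included S for sitelinks, L for labels with an optional language modifier such as L.en, T for the linked page title, O for other aspects, and X for all aspects. The sketch below uses a deliberately simplified, SQLite-flavoured version of the table just to illustrate how the column is consulted; real installs use MySQL, an eu_row_id primary key, and binary column types.)

```python
import sqlite3

# Simplified, illustrative stand-in for the wbc_entity_usage table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE wbc_entity_usage (
        eu_entity_id TEXT NOT NULL,  -- e.g. 'Q42'
        eu_aspect    TEXT NOT NULL,  -- e.g. 'S', 'T', 'L.en', 'O', 'X'
        eu_page_id   INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO wbc_entity_usage VALUES (?, ?, ?)",
    [
        ("Q42", "S",    1001),  # page 1001 uses Q42's sitelinks
        ("Q42", "L.en", 1001),  # ...and Q42's English label
        ("Q42", "T",    1002),  # page 1002 uses the linked page title
        ("Q64", "X",    1002),  # page 1002 uses all aspects of Q64
    ],
)

# Which client pages must be purged when Q42's English label changes?
# Pages that used the label directly (L.en) or everything (X).
rows = conn.execute(
    "SELECT DISTINCT eu_page_id FROM wbc_entity_usage "
    "WHERE eu_entity_id = 'Q42' AND eu_aspect IN ('L.en', 'X')"
).fetchall()
print(rows)  # [(1001,)]
```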
[16:24:56] Added the link on https://www.mediawiki.org/wiki/Wikibase/Schema so at least the search returns something useful now
[16:25:35] halfak: when you were looking for it, did you find the schema pages on the wiki?
[16:26:04] Daniel, no. I was searching around for terms related to Entity Usage and the specific columns/table names.
[16:26:18] I wasn't sure if it would be in the wikibase schema since it is a client table.
[16:26:37] aaagh, no
[16:26:41] err
[16:27:07] multichill: don't mess with the docs while i am messing with them ;D
[16:27:22] Or fix edit conflict merging ;)
[16:28:03] hahaha, sorry man
[16:29:20] halfak: WMDE is actually working on this. the prototype looks nice!
[16:29:33] OMG want so much
[16:29:35] <3
[16:30:24] step 1: side-by-side interface; step 2: mine/theirs choice for each chunk.
[16:30:36] halfak: Do you also do non-binary classification these days?
[16:30:46] Article quality is multiclass
[16:30:49] halfak: being "critical" about shitty documentation is fully justified btw :)
[16:31:00] Arguably our collection of binaries provides for complex classifications.
[16:31:02] But also the learning of the classes?
[16:31:14] Oh! Unsupervised.
[16:31:18] A bit like a statistical translation problem
[16:31:31] We don't have facilities for that but I still end up doing some clustering from time to time.
[16:31:53] I was talking about that with one of the Dutch Wikipedians. He recently finished his AI master's and was interested in playing around with that
[16:32:14] https://en.wikipedia.org/wiki/Unsupervised_learning
[16:32:19] multichill, where would you apply it?
[16:33:45] halfak: For https://www.wikidata.org/wiki/User:NoclaimsBot , take for example https://en.wikipedia.org/wiki/User:NoclaimsBot/Template_claim . That's now static
[16:34:10] Input on the wiki side would be what templates and categories are used, on the wikidata side which statements
[16:34:38] multichill, hmm... not sure we
[16:34:48] 'd ever want this to be unsupervised.
[16:35:12] NoClaimsBot only does the first statement on an empty item
[16:35:26] So from completely no info, to P31 -> Q5 (it's a human!)
[16:36:11] halfak: No touching of items that already have statements
[16:36:12] Ahh yes, but wouldn't we want to supervise something like that?
[16:36:19] E.g. teach the algorithm what humans look like?
[16:37:22] If we start out with labeled data (these items are human, these items are not), then it sounds more like supervised learning to me.
[16:37:27] That's already done by Amir. Unsupervised would be interesting because it would turn up new suggestions we didn't think of
[16:37:38] It's an example
[16:37:42] multichill, yeah, so that's not really quite right.
[16:38:01] In the unsupervised case, it would give us "categories" we hadn't noticed were dominant.
[16:38:09] So it wouldn't say, "this is a human".
[16:38:21] It would say, "This has a lot of things in common with these other things."
[16:38:23] It would suggest to add P31 -> Q5
[16:38:31] Then you could look at it and say, "Oh! That's because it's a human"
[16:38:52] With some percentage of certainty
[16:39:43] Yeah. So in the case that we're learning which property/value pairs are relevant for an item, we'd presumably be training on items that already have those property/value pairs, right?
[16:40:08] Yes, that would be the training set.
[16:40:17] Scale is a bit of an issue here ;-)
[16:41:03] You would need to feed it the item<->article (categories/templates) combination for the learning
[16:41:06] halfak, multichill: https://www.mediawiki.org/wiki/Wikibase/Schema/wbc_entity_usage
[16:42:10] ty
[16:45:59] halfak: Anyway, would be a fun side project ;-)
[16:48:39] Anyway, going to blow some more bubbles at http://tinyurl.com/zmsetuq
[22:25:32] !admin 72.25.24.183
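(The idea discussed above, grouping items by their article's templates and categories and then proposing statements that similar items already carry, with some percentage of certainty, can be sketched with stdlib Python. This is a deliberately crude stand-in for real clustering: items are grouped only by identical feature sets, the function name and mock data are invented for illustration, and Q5 is "human". Q12280 is assumed here to be the item for "bridge".)

```python
from collections import defaultdict

def suggest_statements(features, statements):
    """Group pages by their template/category feature set; within each
    group, suggest statements that other members already carry to the
    members that lack them, with the in-group frequency as confidence."""
    clusters = defaultdict(list)
    for page, feats in features.items():
        clusters[frozenset(feats)].append(page)

    suggestions = {}
    for members in clusters.values():
        counts = defaultdict(int)  # statement -> occurrences in cluster
        for page in members:
            for stmt in statements.get(page, ()):
                counts[stmt] += 1
        for page in members:
            missing = {s: c / len(members) for s, c in counts.items()
                       if s not in statements.get(page, set())}
            if missing:
                suggestions[page] = missing
    return suggestions

if __name__ == "__main__":
    features = {  # article -> templates/categories used
        "Alice Example":  {"Template:Infobox person", "Category:1950 births"},
        "Bob Example":    {"Template:Infobox person", "Category:1950 births"},
        "Example Bridge": {"Template:Infobox bridge"},
    }
    statements = {  # item -> (property, value) statements
        "Alice Example":  {("P31", "Q5")},      # instance of: human
        "Bob Example":    set(),                # empty item
        "Example Bridge": {("P31", "Q12280")},
    }
    print(suggest_statements(features, statements))
    # {'Bob Example': {('P31', 'Q5'): 0.5}}
```

Under this sketch, the empty "Bob Example" item gets P31 -> Q5 suggested because half of its feature-identical cluster already carries it, which matches the "suggest, then a human confirms it's a human" workflow described above.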