[06:46:39] Bit creepy to edit an item and see the subject die the next day. https://www.wikidata.org/w/index.php?title=Q7627362&action=history
[07:23:17] nikki: we made some dispatch progress, but not much again
[10:37:35] nikki: I couldn’t resist, I also checked the dump for the most common descriptions :) here’s the script I used: https://gist.github.com/lucaswerkmeister/a5b02cda1ce9ea14874dd2828ce57e79#file-common-descriptions-sh
[10:39:10] :D
[10:39:43] what's jq?
[10:39:54] json query tool
[10:39:58] ah
[10:40:02] really convenient for working with JSON on the command line
[10:52:24] in a quick test (bzcat the dump, head -n 100000, followed by the replacement), replacing "sed" with "perl -pe" is marginally faster for me, did it three times for each and got 29.8/32/32 s with sed and 29.3/30/30 s with perl. not sure if the difference is constant or proportional to the amount of input though
[10:54:16] seems to be constant, so not very helpful
[10:56:53] you mean, just startup time?
[10:57:12] feels weird to me that perl would start faster than sed, but okay
[10:59:51] not sure, I just doubled it to 200k and got 51 for sed and 50 for perl when I'd expect a bigger difference if it's related to the amount of input
[11:02:55] what’s the perl command? I know almost nothing about perl
[11:03:31] also, I just tried `s/.$//` instead of `s/,$//` for sed (. instead of ,), and wow that’s a lot slower (5.6 s vs 1 m 6 s)
[11:03:32] perl -pe 's/,$//' - literally just replacing sed with perl -pe :)
[11:03:56] :D
[11:04:10] okay, with perl I get 1.8 s
[11:04:38] (I have the extracted file on disk, perhaps you’re bottlenecked by the bzcat?)
[11:05:16] probably
[11:05:38] 7 s vs 17 s for 500k lines, looks like perl is significantly faster for me
[11:05:40] thanks :)
[11:05:46] cool :)
[11:06:46] awk '{print substr($0,0,length($0)-1)}' ← 11 s, also faster than sed but doesn’t beat perl
[11:09:31] when I was doing it, I also had "grep descriptions" to skip the lines with no descriptions. not sure if that's any faster
[11:09:37] I imagine most items do have at least one description
[11:10:11] oh, interesting idea
[11:10:21] are the descriptions completely omitted from the JSON if there aren’t any?
[11:10:37] they seem to be, yeah
[11:10:43] nice
[11:12:15] unless of course it's changed recently or I'm misremembering...
[11:13:37] hm, on a local test item I’m getting descriptions: []
[11:14:22] bah
[11:15:25] yeah, I see "descriptions":{}, looking now
[11:15:38] wait, {}?
[11:15:40] I get []
[11:15:46] perhaps my wiki is misconfigured
[11:16:08] grep -vF '"descriptions":{}' would still do the trick, but it’s dubious whether that’s still more efficient
[11:16:14] * nikki nods
[11:16:16] though it might be… JSON parsing can be expensive
[11:16:31] and yeah, it should be an object, 'cause the languages are the keys
[11:17:14] yeah, makes sense
[11:19:12] now I wonder if it changed or if I was wrong all along D:
[11:19:44] I’m seeing an empty array in http://www.wikidata.org/entity/Q3907364.json
[11:19:58] weird
[11:20:28] do you have an item with descriptions: {} ?
[11:23:10] Q32939 has {} in my dump but [] on https://www.wikidata.org/wiki/Special:EntityData/Q32939.json
[11:30:47] weird. even my old dumps have {}. I have no idea why I thought they were left out
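A rough sketch of the kind of pipeline discussed above, for reference. This is not the linked gist (which isn't reproduced here); the file name, sample size, and exact jq filter are assumptions based on the standard JSON dump layout (one entity per line inside a top-level array, each line ending in a comma):

    # Illustrative "most common descriptions" pipeline, not the actual gist.
    bzcat wikidata-all.json.bz2 |
      head -n 100000 |                  # sample for testing
      grep '^{' |                       # drop the "[" / "]" wrapper lines
      grep -vF '"descriptions":{}' |    # cheap pre-filter: skip entities without descriptions
      perl -pe 's/,$//' |               # strip the trailing comma so each line is valid JSON
      jq -r '.descriptions[]?.value' |  # one description string per line
      sort | uniq -c | sort -rn | head -n 20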
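And a sketch of a reproducible timing setup for the sed/perl/awk comparison above (file names are placeholders; the awk call is adjusted to awk's 1-based string indexing):

    # pre-extracted sample, so bzcat isn't the bottleneck
    head -n 500000 wikidata-all.json > lines.json
    time sed 's/,$//'      < lines.json > /dev/null
    time perl -pe 's/,$//' < lines.json > /dev/null
    # note: this strips the last character unconditionally (like s/.$//),
    # not just trailing commas
    time awk '{ print substr($0, 1, length($0) - 1) }' < lines.json > /dev/null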
[11:30:52] sorry for being confusing >_<
[11:31:24] I didn’t find a single descriptions:[] in the dump
[11:31:31] looks like dump and live JSON are inconsistent
[11:32:14] ah, there’s already a task: https://phabricator.wikimedia.org/T138104
[12:00:44] Hello! Often when I use the MediaWiki API, I have to prefix parameters with "g" to make them work. Why?
[12:01:04] For example "grnlimit" instead of "rnlimit" as it is written in the documentation.
[12:02:41] no_gravity: https://www.mediawiki.org/wiki/API:Query#Generators
[12:04:17] Can someone help me construct one of those autofixes for https://www.wikidata.org/wiki/Property_talk:P3065?
[12:04:24] matej_suchanek: Not sure what that means. I have looked at "Generators" as a kind of API endpoint so far.
[12:05:38] matej_suchanek: If a Generator is kind of a thing that uses the API itself - does that mean that for example the random generator could be used to return a random number of search results for a search term?
[12:07:17] no_gravity: you can consider generators as a "role" for some API modules
[12:07:50] such an API module can do something but can also generate pages for another API module with a dedicated function
[12:08:56] matej_suchanek: Hmm... tricky.
[12:09:30] and in order to pass the params to the generator API module, you need to prefix them with "g"
[12:09:56] matej_suchanek: But why?
[12:10:13] because they could conflict with the non-generator module's parameters
[12:10:18] I suppose
[12:11:11] I see.
[12:11:57] sjoerddebruin: I can, is that in the only post on that page?
[12:12:29] Yeah. There is an autofix template now. I'm not very good with regex though. :)
[12:12:59] what's the name of the template?
[12:13:01] I am though, but the request is quite ambiguous
[12:13:18] thib: [[Template:Autofix]]
[12:13:18] [5] https://www.wikidata.org/wiki/Template:Autofix
[12:13:26] thanks
[12:13:59] Thibaut just converted them all with QS afaik, but sources were lost back then.
[12:14:31] if I remember correctly, sources were only "imported from VIAF"
[12:14:49] Oh, it's you. :P
[12:14:53] yeah :P
[12:14:53] Sorry.
[12:14:55] np
[12:14:59] it's not clear to me if and why there's always "02-" at the beginning... etc.
[12:15:45] I mostly add sourcing to identifiers from VIAF to keep track of their changes (like, when they are still spread out over various VIAF identifiers)
[12:18:13] Hi everyone ! I have an issue with a simple request : I want to find all the people who received a given award, and display a timeline of them
[12:18:47] So I tried ?item wdt:P166 wd:Q33232596 pq:P585 ?date. but it is not correct and I can't find an example in the help :(
[12:19:04] Lena_: hi, I'll try to help you
[12:22:09] Lena_: try example "Awarded Chemistry Nobel Prizes"
[12:23:29] matej_suchanek: Perfect, thanks !
[12:31:36] Dispatch is going differently it seems? Step by step now. https://grafana.wikimedia.org/dashboard/db/wikidata-dispatch?refresh=1m&orgId=1&from=now-3h&to=now
[12:31:54] nikki: do you think we still need the new filter?
[12:32:19] which filter?
[12:32:28] that coordinate problem...
[12:33:06] heh, "NaN ns"
[12:33:21] oh, magnus said that he stopped it, didn't he? so we shouldn't need it any more
[12:36:01] disabled then
[12:36:15] thanks
[13:01:00] Hello guys. I've just read the Wikidata data model and I'd like to get all external IDs of a resource. Do I have to filter properties for wikibase:propertyType or is there a smarter way?
[13:10:40] Ciccio_: that's the way to go, I think.
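To make the "g" prefix concrete, a small example (the prop=info choice is arbitrary). list=random takes rnlimit directly, but when the same module runs as generator=random, its parameters gain the g prefix so they can't collide with the consuming module's own parameters:

    # list=random used directly: rnlimit, rnnamespace
    curl 'https://www.wikidata.org/w/api.php?action=query&list=random&rnlimit=5&rnnamespace=0&format=json'
    # the same module as a generator feeding prop=info: grnlimit, grnnamespace
    curl 'https://www.wikidata.org/w/api.php?action=query&generator=random&grnlimit=5&grnnamespace=0&prop=info&format=json'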
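The corrected query behind the tinyurl later in the log isn't visible here; the usual fix for Lena_'s problem is that pq: qualifiers hang off the statement node, so the award has to be matched through p:/ps: rather than wdt:. A sketch (Q33232596 taken from the question):

    curl -G 'https://query.wikidata.org/sparql' --header 'Accept: text/csv' \
      --data-urlencode 'query=
        SELECT ?item ?itemLabel ?date WHERE {
          ?item p:P166 ?st .            # award statement node
          ?st ps:P166 wd:Q33232596 ;    # the award itself
              pq:P585 ?date .           # point-in-time qualifier
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
        } ORDER BY ?date'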
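For Ciccio_'s question, filtering on wikibase:propertyType is indeed the standard approach; a sketch using Q42 as a stand-in item:

    curl -G 'https://query.wikidata.org/sparql' --header 'Accept: text/csv' \
      --data-urlencode 'query=
        SELECT ?prop ?propLabel ?id WHERE {
          wd:Q42 ?p ?id .                              # every direct value of the item
          ?prop wikibase:directClaim ?p ;              # map predicate back to the property entity
                wikibase:propertyType wikibase:ExternalId .
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
        }'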
[13:16:10] yay, constraint checks work again – thanks aude :)
[13:16:24] :O
[13:17:13] and pages shouldn't be broken anymore?
[13:21:37] should be part of the same deployment afaik
[13:21:47] but caches also need to be purged
[13:22:10] Dispatch lag will be under one day soon.
[13:25:31] sjoerddebruin: looks like a dream came true https://www.wikidata.org/wiki/Special:RecentChanges?liveupdate=1&urlversion=2
[13:25:54] edit rates have been low since
[13:26:21] I meant something different
[13:26:36] oh wow
[13:27:17] The header of the page is 550px now...
[13:28:05] http://wikidata.wikiscan.org/gimg.php?type=edits&date=201707&size=big
[13:31:15] ok, thank you DanielK_WMDE !
[17:15:49] nikki: you want to try out another version to trim the last character? :) https://gist.github.com/lucaswerkmeister/a5b02cda1ce9ea14874dd2828ce57e79#gistcomment-2157093
[17:40:11] Lena_: Hi! Something like this? http://tinyurl.com/ya22mqlc
[17:40:34] (haha I was off by a few hours, sorry >_<)
[18:54:38] DanielK_WMDE: one thing already looks great: the correct item for male will be listed first. ;)
[19:04:25] \o/ looking forward
[19:04:45] Now we only need a gadget...
[19:09:11] Same for the music genre jazz.
[19:10:44] sjoerddebruin: i think we need to tweak the search profile quite a bit. I'm not sure yet i'm happy with the results
[19:11:18] Yeah, I guess disambiguation pages can be put even lower...
[19:17:51] sjoerddebruin: what are you referring to? sounds like interesting search news
[19:18:01] nikki: https://phabricator.wikimedia.org/T125500#3467286
[19:19:00] oh, cool!
[19:22:47] ooh. now I need to test my list of examples
[20:15:55] aww, it doesn't fix all my examples
[20:16:01] it does fix some of them though
[20:24:02] WikidataFacts: how big is the dump when extracted btw?
[20:24:49] 154G
[20:24:57] ... ah
[20:25:02] I won't be extracting it any time soon then XD
[20:25:12] and looks like that means 154 GiB, 165 GB
[20:25:21] (stupid binary units :D)
[20:25:38] it's all way too big for my little ssd
[20:25:50] well, technically, if I deleted almost everything else on the drive, it would fit
[20:26:02] but I'd sooner just get another drive
[20:27:39] yeah, looks like that one file takes up >75% of my home dir partition :D
[20:27:58] it does seem that most of the time is extracting the lines, ~29 of the 30ish seconds
[20:28:24] the C thing you linked is actually slower for me, around 34 seconds for 100k lines vs 30 for perl and sed
[20:28:38] wow, weird
[20:28:55] but maybe I did something wrong, I just typed gcc test.c and then piped stuff to a.out
[20:29:05] been a very long time since I last tried to use c
[20:29:33] you could add -O2 but I don’t think it made a difference in my case
[20:29:48] (btw you can also use `make test` to get a nicer named executable, make has implicit rules for C)
[20:31:51] it *is* faster if I extract 100000 lines to another file first, 0.6 for c, 1.4 for perl and 3.6 for sed
[20:34:23] phew, okay :)
[20:34:39] I don’t know how perl could do this much faster than the C version
[20:35:36] yeah, that was odd
[23:10:51] anyone know how to create wikidata items using the api? I can't find it in the docs.
[23:11:37] https://www.wikidata.org/w/api.php?action=help&modules=wbeditentity
[23:12:00] ah
[23:12:04] thank you
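Two short appendix-style sketches for the later parts of the log. First, the compile-and-time setup from the C-vs-perl comparison (test.c refers to the gist comment linked at 17:15:49 and is not reproduced here; file names are placeholders):

    # make's implicit C rule builds ./test from test.c (roughly: cc $(CFLAGS) test.c -o test)
    make CFLAGS=-O2 test
    head -n 100000 wikidata-all.json > sample.json
    time ./test            < sample.json > /dev/null
    time perl -pe 's/,$//' < sample.json > /dev/null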
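Second, a minimal sketch of creating an item with action=wbeditentity, as linked at the end. It assumes an already-logged-in session stored in cookies.txt and targets test.wikidata.org so no junk item lands on the real site; the label text and data payload are placeholders:

    # fetch a CSRF token for the session
    TOKEN=$(curl -s -b cookies.txt \
      'https://test.wikidata.org/w/api.php?action=query&meta=tokens&format=json' \
      | jq -r '.query.tokens.csrftoken')
    # create a new, nearly empty item (wbeditentity requires POST)
    curl -s -b cookies.txt 'https://test.wikidata.org/w/api.php' \
      --data-urlencode 'action=wbeditentity' \
      --data-urlencode 'new=item' \
      --data-urlencode 'data={"labels":{"en":{"language":"en","value":"example item"}}}' \
      --data-urlencode "token=$TOKEN" \
      --data-urlencode 'format=json'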