[00:25:18] I'll keep an eye on these when the German corpus is uploaded
[00:36:57] I just added a whole bunch more from that list too 😏
[00:37:22] I noticed there are some English words in there; it would be nice if we could exclude them
[00:38:13] Oh, you can! I just forgot to document that
[00:38:54] Add them here: https://www.wikidata.org/wiki/Wikidata:Lexicographical_coverage/Filter/de
[00:39:46] ohh thanks!
[03:25:33] @vrandecic you may wish to rerun your analysis of zh-cn to cover "zh-hans" and "zh" forms, and similarly for zh-tw with respect to "zh-hant" and "zh" forms
[03:29:25] I also wonder what the results for eu might be
[03:43:46] I don't have a cleaned-up eu corpus :(
[03:44:11] For zh, that's a good point. I have no idea, but something is obviously wrong
[08:13:50] Some of the words appear multiple times; is there a reason for that? (e.g. dadurch, noch, schon)
[11:36:37] ohh, I see that Japanese and Chinese aren't being split into words, I assume because they don't use spaces to separate words
[11:37:50] It doesn't appear to understand fullwidth punctuation either, since some of the tokens include things like 。 、 ， （ ）
[11:41:10] Maybe it needs to use something like https://en.wikipedia.org/wiki/MeCab for Japanese
[11:46:45] It's not perfect, e.g. it just split "ロシア語" (Russian) into "ロシア 語" (Russia + language) for me, but as it is the list is useless because it's almost entirely English, so it would still be a huge improvement
[12:22:25] Also, you need to map no to nb
[12:24:15] e.g. den, det, de, han, hun all have nb lexemes but are showing up as missing in https://www.wikidata.org/wiki/Wikidata:Lexicographical_coverage/Missing/no
[12:29:56] @vrandecic I have an idea that can be applied to use Wikidata for automatic semantic annotation. (re @wmtelegram_bot: if anyone wants to help with them, that would be welcome! In particular, we need someone to set up SUL for the annotation tool and to finish the migration to PAWS for the coverage analysis)
[12:30:26] We can make use of non-taxonomic relations to accurately annotate items.
[12:30:42] <বোধিসত্ত্ব> @vrandecic, can bn have the data for lexicographical coverage?
[13:10:02] There's also something wrong with the Korean matching, since things like 수, 그, 등 and 이 already have lexemes but are showing up as missing
[14:44:38] https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2020-04-26/In_focus :
[14:44:38] "By allowing the functions in Wikilambda to be called from wikitext, we also allow to create a global space to maintain global templates and modules, another long-lasting wish by the Wikimedia communities."
[14:44:40] Is this true?
[14:45:20] Would Abstract Wikipedia aid the global templates initiative?
[14:46:22] Would Lua be supported as a function, or are you coming up with some new language and new standard?
[18:07:33] I took a screenshot of the whole chat of the presentation from Denny that just ended. It has a lot of names; can I post it here?
[18:10:56] I forget, was this presentation recorded at all? (If not, you may wish to anonymize the names of those not part of this group.) (re @Dennis0123: I took a screenshot of the whole chat of the presentation from Denny that just ended. It has a lot of names; can I post it here?)
[18:18:56] it was... (re @mahir256: I forget, was this presentation recorded at all? (If not, you may wish to anonymize the names of those not part of this group.))
[18:19:52] Anonymizing sounds good. I'm busy writing tools for WD scientific articles. Can you do it if I send it in private? The questions/answers were valuable IMO.
[18:20:40] (I'd meant if it _wasn't_ recorded, then the chat should be anonymized.)
[18:23:26] But the chat itself was not recorded, and the names of those who were silent were never published in the recording... (re @mahir256: (I'd meant if it _wasn't_ recorded, then the chat should be anonymized.))
[18:23:47] Those who spoke got their names on the screen, though...
[19:29:38] What @lucaswerkmeister is currently doing on Twitch is absolutely interesting.
[19:29:43] I am watching this at https://www.twitch.tv/lucaswerkmeister
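The fullwidth-punctuation problem raised around 11:37 can be sketched in a few lines: splitting on Unicode non-word characters (rather than on ASCII spaces only) already treats 。 、 ， （ ） as token boundaries. This is a minimal illustration, not the coverage tool's actual code; the `tokenize` function name is hypothetical, and it still does not solve word segmentation for unspaced Japanese or Chinese, which is what a morphological analyzer like MeCab is for.

```python
import re

def tokenize(text: str) -> list[str]:
    # Split on runs of non-word characters. Python's re module is
    # Unicode-aware by default for str, so fullwidth punctuation such as
    # 。 、 ， （ ） counts as a boundary, not just ASCII punctuation.
    return [t for t in re.split(r"\W+", text) if t]

# Fullwidth punctuation no longer leaks into the tokens:
print(tokenize("これは（テスト）です。"))

# But unspaced Japanese still comes out as one blob -- a segmenter
# such as MeCab is needed to split it into actual words:
print(tokenize("ロシア語を勉強します"))
```

Even with MeCab the results need review, as the chat notes: it may oversegment compounds like ロシア語 into ロシア + 語.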
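The language-code mismatches mentioned above (corpus "no" vs. lexemes tagged "nb"; corpus "zh-cn"/"zh-tw" vs. lexeme forms tagged "zh-hans", "zh-hant", or plain "zh") suggest a normalization step before matching. A minimal sketch, where the `lexeme_codes` helper is hypothetical and the table is inferred from the chat, not taken from the actual analysis code:

```python
# Lexeme language codes that should additionally count as covering a
# given corpus code. Assumed from the issues raised in the chat.
EXTRA_LEXEME_CODES = {
    "no": {"nb"},                # Norwegian Wikipedia text is largely Bokmål
    "zh-cn": {"zh-hans", "zh"},  # simplified forms may carry either tag
    "zh-tw": {"zh-hant", "zh"},  # likewise for traditional forms
}

def lexeme_codes(corpus_code: str) -> set[str]:
    # A word counts as covered if a lexeme form exists in any of these codes.
    return {corpus_code} | EXTRA_LEXEME_CODES.get(corpus_code, set())
```

With a step like this, den, det, de, han, and hun would match their existing nb lexemes instead of appearing on the Missing/no list.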