[14:01:05] o/ Amir1
[14:01:12] halfak: hey
[14:01:40] I'm thinking of moving the ORES pages out of Meta to MediaWiki or maybe even Wikitech. What do you think?
[14:02:13] I disagree about wikitech, it's extremely tech-savvy
[14:02:37] Is that a big problem when our users are tool developers?
[14:02:44] about mediawiki, I have no preference. Did you check other projects?
[14:02:56] We'd still have the ORES Review Tool on mediawikiwiki
[14:04:08] halfak: I mean wikitech is a place for techy people who know the Wikimedia infra. Things for tech people who are not using/familiar with the infra should be on mediawiki. Things like external APIs
[14:04:23] RESTBase, api.php, etc.
[14:04:25] APIs are used by tech people though.
[14:04:36] Hmmm. That's a fair point
[14:05:55] OK, seems like mw is the right place.
[14:06:00] :D
[14:06:30] BTW, have you seen the ACTRIAL discussions on enwiki?
[14:06:52] nope, is it popcorny enough?
[14:07:18] :P
[14:08:16] It seems like ORES may enter the conversation soon.
[14:08:28] Gist is this: there's a huge new page curation backlog
[14:08:33] And not enough people to handle it.
[14:09:03] So they want to take page creation rights away from newly registered accounts (pre-autoconfirmed == the "AC" in ACTRIAL)
[14:09:47] Alternatively, we could use ORES to quickly scan the backlog looking for seriously problematic pages (using the draftquality model)
[14:09:48] and then the number of active enwiki editors takes a nosedive
[14:09:53] Right
[14:10:37] And then, after removing the seriously problematic pages, use the article quality model to prioritize review of the pages that look like they are the highest quality.
[14:10:53] That second step would really be important for Articles-for-Creation.
[14:10:58] Not regular review.
[14:12:06] I see. We definitely need some UX design :(((((
[14:12:32] Amir1, IMO, it shouldn't be us who engineers the user interface.
[14:12:53] It should be us who ensures that the prediction model has high fitness and that ORES stays online and can handle the capacity.
[15:18:30] Amir1: cutting down ores_classification in 2h runs is taking forever. Do you see any problem with setting PurgeScoreCache on it and letting it run until it's done?
[18:58:15] halfak: o/
[19:45:55] https://www.mediawiki.org/wiki/ORES
[19:45:58] o/ glorian_wd
[19:46:02] sorry I missed your ping
[19:48:17] halfak: So right now I am working on extracting the value of "instance of" using the API like this: https://www.wikidata.org/w/api.php?action=parse&oldid=480957070
[19:48:39] I wonder whether my approach is correct, in your view.
[19:49:12] hmm
[19:49:42] Probably not
[19:49:56] I see you're using action=parse
[19:50:08] yeah
[19:50:17] do you have any better way than this?
[19:50:30] because I did not find another Wikidata API where we can specify the revid
[19:51:05] I think the revid is important. Who knows whether a change occurred after an item was labeled.
[19:51:08] Oh I see.
[19:51:40] Actually, wbgetclaims should be much easier to extract from. But unfortunately, we cannot specify a revid with that API
[19:51:46] or wbgetentities
[19:52:09] https://www.wikidata.org/w/api.php?action=query&prop=revisions&revids=480957070&rvprop=content
[19:52:22] glorian_wd, file a bug for wbgetclaims ;)
[19:52:32] But for the meantime, we probably want this anyway.
[19:55:17] want what?
[19:55:37] the API link that you've just posted seems easier to extract from than the parse API, I should try it out
[19:59:09] Right. Don't worry about extracting data out of that though. We've already got that covered.
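Editor's note: a minimal sketch of pulling item content at a specific revid through the query API linked above, then reading the "instance of" (P31) and "subclass of" (P279) values out of it. The function name and the use of the requests library are illustrative assumptions, not the project's actual code.

```python
import json
import requests

API = "https://www.wikidata.org/w/api.php"

def get_item_at_revision(rev_id):
    """Fetch a Wikidata item's JSON content as of a specific revision."""
    response = requests.get(API, params={
        "action": "query",
        "prop": "revisions",
        "revids": rev_id,
        "rvprop": "content",
        "format": "json",
    })
    pages = response.json()["query"]["pages"]
    # A single revid maps to a single page; its content is the item
    # JSON serialized as a string under the "*" key.
    page = next(iter(pages.values()))
    return json.loads(page["revisions"][0]["*"])

item = get_item_at_revision(480957070)
for pid in ("P31", "P279"):  # instance of, subclass of
    for claim in item.get("claims", {}).get(pid, []):
        snak = claim["mainsnak"]
        if snak["snaktype"] == "value":  # skip novalue/somevalue snaks
            print(pid, snak["datavalue"]["value"]["id"])
```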
[20:00:49] Random non-urgent question: how close are we to having ORES for kowiki?
[20:01:39] 480957070&rvprop=content
[20:01:39] [14:52:22] glorian_wd, file a bug for wbgetclaims ;)
[20:01:51] Woops. had something weird in my buffer
[20:01:59] http://labels.wmflabs.org/campaigns/kowiki/?campaigns=stats
[20:02:05] Looks like we aren't that close, RoanKattouw
[20:02:18] 1725 labels out of 7286
[20:03:28] Only 7k? I thought these sets were more like 20k? Did I make that up, or does that 20k number come from somewhere else?
[20:03:36] halfak: hmm, I thought I was supposed to extract the "instance of"/"subclass of" value from each labeled item, then use the agreed formula for calculating the weight
[20:03:42] Either way, it's like 20% done then
[20:03:44] I thought I was supposed to do this? hmm
[20:04:08] glorian_wd, https://gist.github.com/halfak/71fd28c201331945d4c7d366796a9165
[20:04:25] glorian_wd, we can already get information about the item.
[20:04:41] We need you to get the weighted coverage measure based on this information :)
[20:08:25] halfak: does 'weighted coverage measure' mean the weight score of each labeled item?
[20:08:39] weighted completeness score*
[20:08:43] It means the thing we have been talking about.
[20:08:47] :P
[20:11:45] But in order to do that, I think we should get each value of "instance of" or "subclass of" of an item, then, based on that, do some checking against wbs_propertypairs and calculate the weight. Is this approach correct?
[20:12:28] halfak: wbs_propertypairs only has occurrences for "instance of" and "subclass of"
[20:17:41] Sounds fine to me
[20:18:11] Maybe we can ask Wikidata to make property recommendations for us so we don't need to query manually.
[20:22:32] property recommendations? like what?
[20:23:11] oh, speaking about "query", I remembered perhaps it is easier to get the instance-of or subclass-of value using query.wikidata.org
[20:23:54] But I have never tried to use the query service programmatically (e.g. using Python to retrieve SPARQL results from the query service)
[20:24:27] We don't have any functionality to use that in ORES
[20:24:31] We'd need to build it up
[20:24:38] Please use api.php if possible
[20:26:24] halfak: gotcha. Then, I would just use https://www.wikidata.org/w/api.php?action=parse&oldid=480957070
[20:26:40] why?
[20:28:22] Amir1, hey dude. I wanted to ask you about dexbot and having it write to MediaWiki instead of Meta.
[20:28:25] What do you think?
[20:28:52] halfak: it's okay, but I'm trying to fix this UBN bug and then go to sleep
[20:28:55] :)
[20:29:01] Can I do it tomorrow?
[20:29:02] No worries. we can talk later
[20:29:08] godspeed
[20:29:17] halfak: for getting the value of each instance-of and subclass-of, then calculating the weight by matching against wbs_propertypairs?
[20:30:26] glorian_wd, I just showed you that you don't need a fancy strategy to get instance-of and subclass-of because we already have that in revscoring.
[20:30:49] The thing I'm worried about is: how are you going to query wbs_propertypairs?
[20:32:35] halfak: could we just look up the value that we get from instance-of in wbs_propertypairs, then calculate the weight?
[20:32:48] would it be too slow?
[20:32:48] yeah. How will you do that?
[20:35:16] For instance, an item has 'P31': 'Q5'. Then look up this occurrence in wbs_propertypairs, and check if there's a corresponding pid2 on the item. Afterwards, calculate the weight for each pid2 that exists on the item
[20:35:21] halfak: would it work?
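Editor's note: a minimal sketch of the weighting scheme glorian_wd describes just above, assuming wbs_propertypairs has already been loaded into a lookup of the form {(pid1, qid1): {pid2: weight}}. The lookup shape, the example weights, and the function name are assumptions for illustration, not the project's actual code or the table's real schema.

```python
# Hypothetical lookup derived from wbs_propertypairs: for a pair like
# ("P31", "Q5") (instance of: human), the properties expected to
# co-occur with it, weighted by how often they do.
EXPECTED = {
    ("P31", "Q5"): {"P21": 0.94, "P106": 0.89, "P569": 0.85, "P19": 0.52},
}

def weighted_coverage(claims):
    """Score an item by how much of its expected property weight is present."""
    present = set(claims)
    total = covered = 0.0
    for pid in ("P31", "P279"):  # instance of, subclass of
        for claim in claims.get(pid, []):
            snak = claim["mainsnak"]
            if snak["snaktype"] != "value":  # skip novalue/somevalue snaks
                continue
            value = snak["datavalue"]["value"]["id"]
            for pid2, weight in EXPECTED.get((pid, value), {}).items():
                total += weight
                if pid2 in present:
                    covered += weight
    return covered / total if total else None

# e.g., combined with the earlier sketch:
#   weighted_coverage(get_item_at_revision(480957070)["claims"])
```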
[20:35:23] hmm
[20:37:24] glorian_wd, sounds like it would work
[20:41:27] halfak: OK. Does it mean I have to use revscoring to get the values of instance-of and subclass-of?
[20:41:36] So, I don't need to use my fancy strategy? :P
[20:41:52] Yes. Your problem is getting data out of wbs_propertypairs
[20:42:21] alright
[20:42:26] I will try to install revscoring
[21:00:12] (PS1) Ladsgroup: Do not error out when threshold can't be found [extensions/ORES] - https://gerrit.wikimedia.org/r/353178 (https://phabricator.wikimedia.org/T164984)
[21:26:47] (CR) 20after4: [C: 2] Do not error out when threshold can't be found [extensions/ORES] - https://gerrit.wikimedia.org/r/353178 (https://phabricator.wikimedia.org/T164984) (owner: Ladsgroup)
[21:27:01] (PS1) 20after4: Do not error out when threshold can't be found [extensions/ORES] (wmf/1.30.0-wmf.1) - https://gerrit.wikimedia.org/r/353181 (https://phabricator.wikimedia.org/T164984)
[21:27:12] (CR) 20after4: [C: 2] Do not error out when threshold can't be found [extensions/ORES] (wmf/1.30.0-wmf.1) - https://gerrit.wikimedia.org/r/353181 (https://phabricator.wikimedia.org/T164984) (owner: 20after4)
[21:28:44] (Merged) jenkins-bot: Do not error out when threshold can't be found [extensions/ORES] - https://gerrit.wikimedia.org/r/353178 (https://phabricator.wikimedia.org/T164984) (owner: Ladsgroup)
[21:28:54] (Merged) jenkins-bot: Do not error out when threshold can't be found [extensions/ORES] (wmf/1.30.0-wmf.1) - https://gerrit.wikimedia.org/r/353181 (https://phabricator.wikimedia.org/T164984) (owner: 20after4)
[23:13:56] (PS1) Catrope: Remove Finnish text from en-gb.json [extensions/ORES] (wmf/1.29.0-wmf.21) - https://gerrit.wikimedia.org/r/353196
[23:14:10] (CR) Catrope: [C: 2] Remove Finnish text from en-gb.json [extensions/ORES] (wmf/1.29.0-wmf.21) - https://gerrit.wikimedia.org/r/353196 (owner: Catrope)
[23:16:32] (Merged) jenkins-bot: Remove Finnish text from en-gb.json [extensions/ORES] (wmf/1.29.0-wmf.21) - https://gerrit.wikimedia.org/r/353196 (owner: Catrope)
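Editor's note: circling back to the backlog-scanning idea from 14:09 above, this is a minimal sketch of batch-scoring revisions against the enwiki draftquality model via the public ORES scores endpoint. The function name is an assumption, and the revision IDs are placeholders; in practice they would come from the new page curation backlog.

```python
import requests

ORES = "https://ores.wikimedia.org/v3/scores/enwiki/"

def draftquality_predictions(rev_ids):
    """Batch-score revisions with the draftquality model and return
    {rev_id: predicted class}, where the classes are "OK", "spam",
    "vandalism", and "attack"."""
    response = requests.get(ORES, params={
        "models": "draftquality",
        "revids": "|".join(str(r) for r in rev_ids),
    })
    scores = response.json()["enwiki"]["scores"]
    return {
        rev_id: entry["draftquality"]["score"]["prediction"]
        for rev_id, entry in scores.items()
        if "score" in entry["draftquality"]  # skip revisions that errored
    }

# Placeholder revision IDs for illustration only:
print(draftquality_predictions([12345678, 23456789]))
```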