[00:56:39] Scoring-platform-team: Minor cleanup in Makefiles - https://phabricator.wikimedia.org/T168904#3393590 (awight) https://github.com/wiki-ai/wikiclass/pull/43 https://github.com/wiki-ai/editquality/pull/79 https://github.com/wiki-ai/draftquality/pull/4 https://github.com/wiki-ai/draftquality/pull/8
[14:40:47] halfak: o/
[14:42:39] halfak: have you tried to do "revscoring tune" using the wikidatawiki.py which you modified yesterday? I encountered a "NotImplementedError" when running revscoring tune with the latest wikidatawiki.py.
[14:43:18] glorian_wd, I haven't.
[14:43:28] glorian_wd, is this something you want to report in your thesis?
[14:43:37] If so, it's kind of critical that you can get it to run.
[14:43:55] halfak: well, I want to make sure that I am running the classifier with the best config
[14:44:05] You might consider trying to run it with the code I checked in.
[14:44:13] yeah I did
[14:44:19] I used your wikidatawiki.py
[14:44:35] could you try it on your system? run revscoring tune with wikidatawiki.py
[14:44:45] with the latest wikidatawiki.py*
[14:44:54] * halfak starts up the job and watches it go
[14:44:57] Seems to work for me
[14:46:17] hmm, so strange. I am using the latest revscoring, and this error still pops up
[14:46:46] Maybe it would help if you shared the error and how you are calling it.
[14:47:22] https://usercontent.irccloud-cdn.com/file/S9IyGIjo/
[14:47:25] halfak: ^
[14:48:26] https://gist.github.com/GlorianY/b1b2adcdf709a504118fa2c38a30230c
[14:48:46] I identified that the error pops up because of the functions that you wrote.
[14:48:55] the ones that I blocked in the gist above
[14:49:04] the ones that I commented out*
[14:51:26] glorian_wd, you must re-extract features. Have you done that?
[14:51:36] halfak: oh, why?
[14:51:47] Because my features are different.
[14:51:51] Different names, mostly.
[14:51:54] oh darn
[14:51:56] that's right
[14:52:10] alright, thanks :)
[14:52:59] :)
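For context on the fix above: revscoring resolves features by name, so feature values extracted against an older wikidatawiki.py presumably will not line up with the renamed feature definitions, and the tuning step fails until the features are re-extracted. Below is a minimal sketch of re-extracting feature values with revscoring's API extractor; the `item_quality` import, the revision IDs, and the user agent are illustrative placeholders, not the team's actual pipeline.

```python
# Hedged sketch: re-extract feature values after feature definitions change.
# `wikidatawiki.item_quality` is a hypothetical import standing in for the
# feature list defined in the edited wikidatawiki.py.
import mwapi
from revscoring.extractors import api

from wikidatawiki import item_quality  # hypothetical feature list

session = mwapi.Session("https://www.wikidata.org",
                        user_agent="feature re-extraction sketch")
extractor = api.Extractor(session)

for rev_id in [123456789, 234567890]:  # placeholder revision IDs
    # Features are resolved by name; values cached from the old definitions
    # won't match the renamed ones, which is why a fresh extraction run is
    # needed before `revscoring tune`.
    values = list(extractor.extract(rev_id, item_quality))
    print(rev_id, values)
```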
[16:20:29] FYI: https://en.wikipedia.org/wiki/User_talk:EpochFail#Reddit
[16:22:28] halfak: :)
[16:24:59] halfak: could we expect new people coming in next month? I think the ORES team is going to be officially formed in July, so I suppose there will be new employees from the ORES team and they'll chime in on this channel
[16:25:49] glorian_wd, the only new employee for at least a year is awight
[16:25:54] Minimal funding :\
[16:26:13] halfak: oh I see, okay :)
[16:34:24] Amir1 & Nettrom: https://en.wikipedia.org/wiki/User_talk:EpochFail#Reddit FYI
[16:34:41] halfak: hey, I just arrived at the hotel
[16:34:41] yeah
[16:34:42] <3
[16:34:46] :D
[16:35:07] that's awesome, thanks for the ping :)
[16:35:33] great to see that these kinds of events lead to a broader understanding of the work we're doing
[16:50:20] apologies for hammering ORES a bit, I added a 100ms delay between iterations just in case…
[17:03:35] perhaps something you all might be familiar with: any ideas for reducing the categories that exist on pages to a set of independent but important (common?) categories? I've been poking around for something like a chi-squared test for multi-valued categories but not finding anything
[17:08:23] ebernhardson: reminds me of the problem of identifying broader categories that articles fit into, but you're perhaps looking to keep them specific?
[17:09:40] Nettrom: identifying broader categories will work as well, I think. The end goal is to use the identified categories as one-hot encoded values in a search ranking decision tree
[17:10:57] it would apply similarly to templates; right now we manually one-hot encode the templates: Featured article, featured picture, featured sound, featured list, good article on enwiki, but I'm looking to see if something general could be used to identify a larger list
[17:11:33] there's a few approaches I know of from the research literature: 1, WikiBrain has support for various sorts of things like that, but I'm not familiar with their exact algorithms. But it's open source, so you can have a look at their code: https://shilad.github.io/wikibrain/
[17:11:59] 2, in one of our papers we used the WikiProjects as a categorization scheme, because their names tend to be broader :)
[17:13:11] I think some papers also walk up the category tree looking for a set of pre-determined broader categories and identify the shortest-path one, but I can't come up with a reference for that
[17:14:05] the paper I'm referring to in #2 is http://www-users.cs.umn.edu/~morten/publications/icwsm2015-popularity-quality-misalignment.pdf
[17:15:13] WikiProjects sounds like it might be a reasonable middle ground; still somewhat done by hand but easier to extract. I've pondered the category traversal but I need to find a way to do it performantly; basically search will probably need to be doing this for ~500k documents/sec under peak load, so it has to be really cheap to compute
[17:16:24] thanks for the pointers! I'll poke at it
[17:19:41] if it needs to be done on the fly, looking at the WikiProject categories of the talk page would be my first approach; that way you don't need to grab and parse the text of it :)
[17:21:34] those WikiProject categories tend to be fairly regular in structure (or you can enforce that they should be). I've documented a few edge case issues because it's relevant to the current project, discussed here: https://meta.wikimedia.org/wiki/Research:Automated_classification_of_article_importance/Gathering_importance_data
[17:28:54] Nettrom, re hammering ORES, are you using the ores score_revisions utility?
[17:29:03] Looks like we are well within capacity, FWIW
[17:30:09] halfak: I'll use that when I'm running across the English Wikipedia. Right now I'm processing my WikiProject datasets, where I have specific revision IDs I want to score. It's single-threaded and not too many articles, and I can wait to run it if capacity is needed.
[17:30:24] yeah, checking exact categories is really cheap; in Elasticsearch I can basically look up a list of page ids that match a category, intersect it with the list of page ids that are being scored, and it's all done
[17:30:58] Nettrom, perfect. The score_revisions utility is great for when you have specific revision IDs :)
[17:31:27] halfak: ah, I'll look into that
[17:31:37] It manages parallel scorers for you in ways that I know maximize ORES score rate while minimizing resource usage. :)
[17:31:54] "pip install ores" and then "ores score_revisions -h"
[17:32:14] If you are working from Quarry, give it the JSON-lines data type and make sure a field is called "rev_id"
[17:32:33] If TSV, then consider using the tsv2json utility in the json2tsv package. :)
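To make the input format described above concrete, here is a minimal sketch of preparing a JSON-lines file for score_revisions. The file name and revision IDs are placeholders, and the exact positional arguments of the command are an assumption — consult `ores score_revisions -h` for the real interface.

```python
# Hedged sketch: build a JSON-lines input where each line has a "rev_id" field,
# as score_revisions expects. File name and revision IDs are placeholders.
import json

rev_ids = [641962088, 641962089]  # placeholder revision IDs
with open("revisions.jsonl", "w") as f:
    for rev_id in rev_ids:
        f.write(json.dumps({"rev_id": rev_id}) + "\n")

# Then run the utility against the file, e.g. (argument order is an
# assumption -- check `ores score_revisions -h`):
#   ores score_revisions https://ores.wikimedia.org enwiki wp10 < revisions.jsonl
```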
[17:33:09] FWIW, what you are doing right now is fine. This utility just might be a bit faster :)
[17:33:19] And I want to trick you into contributing to it. :D
[17:33:59] ah, yes, this would work just fine as well :)
[17:34:51] FYI for when you are ready for it later: https://github.com/wiki-ai/ores/blob/master/ores/utilities/score_revisions.py
[17:35:40] Note that you can score multiple models per rev_id.
[17:35:58] Oh, I should really allow the user to change the default parallel scorers and batch size from this utility.
[17:36:11] * halfak needs to stop getting distracted from his work :)
[17:36:20] haha, you and me both
[17:36:49] but that looks really useful, and I might want to consider moving SuggestBot over from using its own API code to just plugging into the API here
[17:38:12] \o/ that would be great!
[17:40:35] looks like the first PR to that will be like my PR to the mwviews library: please allow me to set my user agent, I'd like to not be halfak all the time :D
[17:40:50] lool
[19:32:34] Scoring-platform-team-Backlog, Wikidata, artificial-intelligence: Train/test item quality model for Wikidata - https://phabricator.wikimedia.org/T157498#3395961 (Glorian_WD)
[19:32:36] Scoring-platform-team, Scoring-platform-team-Backlog, Wikidata, artificial-intelligence: Engineer features for item quality model - https://phabricator.wikimedia.org/T157497#3395960 (Glorian_WD) Open→Resolved
[19:33:07] Scoring-platform-team-Backlog, Wikidata, WMDE-Tech-Communication-Mentoring-And-Events, artificial-intelligence: Deploy item quality classification model for Wikidata - https://phabricator.wikimedia.org/T127470#3395964 (Glorian_WD)
[19:33:23] Scoring-platform-team-Backlog, Wikidata, artificial-intelligence: Train/test item quality model for Wikidata - https://phabricator.wikimedia.org/T157498#3007324 (Glorian_WD) Open→Resolved a: Glorian_WD
[20:22:10] for ORES, what is TaskRevokedError?
[20:22:35] ragesoss_, not sure exactly. Are you getting it a lot?
[20:22:55] yeah. https://ores.wikimedia.org/v2/scores/enwiki/wp10/641962088?features
[20:23:24] (this is one of the revisions in the tests for my code that consumes ORES data)
[20:24:01] it's been highly unstable, failing sometimes and other times not, since around the time of the 'ores is down' announcement last week.
[20:24:04] http://docs.celeryproject.org/en/3.1/reference/celery.exceptions.html#celery.exceptions.TaskRevokedError
[20:24:10] Not helpful yet, but I found ^
[20:25:01] ah, so not specifically an ORES error, but something that gets passed along from the task framework.
[20:25:10] right
[20:25:24] Can you file a phab task for us and tag #scoring-platform?
[20:25:34] will do
[20:25:43] Amir1, ^ any chance you have some time to look at this?
[20:25:56] I'm thinking that it'll be useful to inspect the relevant celery task.
[20:30:09] huh. only happens with that rev when I include ?features
[20:33:03] halfak: https://phabricator.wikimedia.org/T169367
[20:33:23] ragesoss_, yeah, I'm really not sure what's going on with that.
[20:33:40] We might have gotten some corruption in our celery cache.
[20:33:51] Seems like the score exists in the main cache.
[20:34:51] that makes sense, given that I've found that error on several of the revs in my test suite (which would have been cached) but not on random other revs I tried.
[20:35:47] Yeah, what I'm thinking is that a lot of specific revisions will refuse to execute because of a TaskRevokedError stored in the output cache (task cache) of celery. Once that value is cycled out of the cache it will work again. In the meantime, we should be able to clear that if I'm right.
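For anyone trying to reproduce the report above, here is a minimal sketch of requesting the same cached score and checking whether ORES returns an error payload instead of a prediction. The nested response shape shown in the comments is an assumption about the v2 API, so inspect the actual JSON if it differs; the user agent is a placeholder.

```python
# Hedged sketch: fetch the score reported as failing and check for an error.
# The nested response shape is an assumption about the ORES v2 API.
import requests

url = "https://ores.wikimedia.org/v2/scores/enwiki/wp10/641962088?features"
resp = requests.get(url, headers={"User-Agent": "TaskRevokedError check (example)"})
data = resp.json()

# Assumed shape: {"scores": {"enwiki": {"wp10": {"scores": {"641962088": {...}}}}}}
score = (data.get("scores", {})
             .get("enwiki", {})
             .get("wp10", {})
             .get("scores", {})
             .get("641962088", {}))

if "error" in score:
    # e.g. an error of type "TaskRevokedError" passed through from celery
    print("ORES returned an error:", score["error"])
else:
    print("Prediction:", score.get("prediction"))
```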