[14:47:01] Scoring-platform-team, Discovery-Search, Growth-Team: Allow searching articles by ORES drafttopic - https://phabricator.wikimedia.org/T240517 (Halfak) We won't be able to re-engineer ORES to serve predictions based on sitelinks across wikis with ORES in a reasonable amount of time. ORES is really not...
[14:49:14] Scoring-platform-team, Research: Extract cross-wiki WikiProject tags - https://phabricator.wikimedia.org/T240273 (Halfak) It seems like we should get the wikiproject_to_templates and the wikiproject_taxonomy files in figshare somewhere. I wonder how you feel about extending this record to include those...
[14:55:27] o/
[14:58:31] Scoring-platform-team, Research: Extract cross-wiki WikiProject tags - https://phabricator.wikimedia.org/T240273 (Isaac) Yeah, that works for me. I looked but doesn't seem I can give you edit permissions to a figshare item I created, so just point me towards what files you want uploaded and any additiona...
[15:01:31] Scoring-platform-team, Research: Extract cross-wiki WikiProject tags - https://phabricator.wikimedia.org/T240273 (Halfak) Indeed it looks like my account has been disabled! I emailed support to get it re-enabled. Weird. I'll get the files together and ping back.
[15:19:13] o/ kevinbazira
[15:19:20] How's work on the vectors going?
[15:26:51] o/ halfak
[15:27:04] Work on the vectors is going well!
[15:27:31] The process of generating a cbow 50 cell vector model completed successfully and generated 2 files (.vec and .bin)
[15:27:49] wikimedia/revscoring#1785 (minor_german - e91c063 : halfak): The build passed. https://travis-ci.org/wikimedia/revscoring/builds/625733134
[15:32:23] Scoring-platform-team, articlequality-modeling, editquality-modeling, revscoring, and 2 others: Add English Language idioms to revscoring - https://phabricator.wikimedia.org/T205545 (Halfak) I think that makes a lot of sense. If you return them as a list, it is trivial to count them later.
[15:36:06] Oh we should do the skipgram one.
[15:36:19] *subword
[15:36:22] kevinbazira, ^
[15:36:39] We should generate both 50 and 100 cell vectors to experiment with.
[15:36:47] What do you think?
[15:56:43] kevinbazira, ?
[15:58:55] Alright, I'm going to generate the skipgram one too
[16:00:47] awesome.
[16:05:03] wikimedia/editquality#713 (revscoring-2.6.2 - 51c0eca : Aaron Halfaker): The build passed. https://travis-ci.org/wikimedia/editquality/builds/625751659
[16:05:42] wikimedia/articlequality#262 (revscoring-2.6.2 - ed5567a : Aaron Halfaker): The build passed. https://travis-ci.org/wikimedia/articlequality/builds/625752291
[16:08:25] ORES, Scoring-platform-team (Current), articlequality-modeling, artificial-intelligence: Retrain enwiki and dewiki models with revscoring-2.6.2 - https://phabricator.wikimedia.org/T240724 (Halfak) https://github.com/wikimedia/editquality/pull/218 https://github.com/wikimedia/articlequality/pull/...
[16:08:58] ORES, Scoring-platform-team (Current), articlequality-modeling, artificial-intelligence: Retrain enwiki and dewiki models with revscoring-2.6.2 - https://phabricator.wikimedia.org/T240724 (Halfak)
[16:38:57] Scoring-platform-team, Edit-Review-Improvements-Integrated-Filters, Growth-Team, editquality-modeling, artificial-intelligence: Deploy ORES filters for jawiki - https://phabricator.wikimedia.org/T225563 (Halfak) Hey folks. I just went to go do some work on this and it looks like the "damagin...
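Note on the 15:27-15:58 vector discussion: the plan is to compare cbow and skipgram models at 50 and 100 cells. A minimal sketch of how that grid of runs could be enumerated is below. It assumes the fastText command-line tool is what produces the .vec and .bin pair; the corpus name `enwiki.txt` and the output naming scheme are hypothetical and do not come from the log.

```python
from itertools import product


def fasttext_commands(corpus, models=("cbow", "skipgram"), dims=(50, 100)):
    """Build one fastText CLI invocation per (model, dimension) pair.

    `corpus` and the output prefix pattern are illustrative assumptions.
    """
    commands = []
    for model, dim in product(models, dims):
        # fastText writes <output>.bin and <output>.vec for each run.
        output = f"{corpus}.{model}.{dim}"
        commands.append(
            ["fasttext", model, "-input", corpus, "-output", output, "-dim", str(dim)]
        )
    return commands


if __name__ == "__main__":
    for cmd in fasttext_commands("enwiki.txt"):
        print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run` as-is; building the lists separately keeps the experiment grid easy to inspect before any long training job starts.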
[16:46:07] Scoring-platform-team, Research: Extract cross-wiki WikiProject tags - https://phabricator.wikimedia.org/T240273 (Halfak) @Isaac see https://github.com/halfak/wikitax/tree/master/datasets
[16:49:25] halfak: a small question, here - https://en.wikipedia.org/wiki/?diff=744069031&diffmode=source - i was expecting insertions of "[[" and "]]" to be in the same segment when looking at wikitext.revision.diff.segments_added, but they're in different segments
[16:49:26] [1] https://meta.wikimedia.org/wiki/%22_and_%22
[16:50:58] Yeah, they won't be in the same segment when there is content that was "equal" between them.
[16:51:10] Segments are contiguous additions/removals.
[16:52:47] okay got it, i was relating segments to the lines present in the visual diff
[16:53:19] I think putting those kind of things together will require some post processing.
[16:53:50] If you want to dig into how diffs work, checkout https://github.com/halfak/deltas -- specifically the "segment_matcher".
[17:42:38] Scoring-platform-team (Current), drafttopic-modeling, revscoring, artificial-intelligence: Generate word vectors for ar, cs, en, and ko using FastText - https://phabricator.wikimedia.org/T235184 (Halfak) a:kevinbazira
[17:42:57] ORES, Scoring-platform-team (Current): ORES deployment mid-Dec. 2019 - https://phabricator.wikimedia.org/T240725 (Halfak) a:Halfak
[18:27:42] Just got a sample together for Jeena to look at.
[18:27:49] I'm going to grab some lunch.
[19:28:07] Scoring-platform-team, NewcomerTasks 1.1, Research: Improve drafttopic training data pipeline - https://phabricator.wikimedia.org/T236713 (Isaac) Some follow-up to a conversation with @Halfak and @dr0ptp4kt : This is how I have been adjusting the model outputs based on Wikidata properties in my [[ht...
[19:33:32] back. Forgot to actually change my nick.
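Note on the 16:49-16:53 segments discussion: the point that "[[" and "]]" land in separate segments whenever unchanged content sits between them can be reproduced with a small sketch. This uses Python's standard `difflib.SequenceMatcher` as a stand-in, not the deltas "segment_matcher" mentioned above, and the token lists are made up for illustration.

```python
from difflib import SequenceMatcher


def added_segments(old_tokens, new_tokens):
    """Return each contiguous run of added tokens as one segment."""
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    segments = []
    for tag, _a1, _a2, b1, b2 in matcher.get_opcodes():
        # "insert" and "replace" both introduce new tokens on the b side.
        if tag in ("insert", "replace"):
            segments.append("".join(new_tokens[b1:b2]))
    return segments


# Wrapping "Paris" in a link: "Paris" itself is unchanged ("equal"),
# so the two insertions are not contiguous.
old = ["see", " ", "Paris", " ", "today"]
new = ["see", " ", "[[", "Paris", "]]", " ", "today"]
print(added_segments(old, new))  # ['[[', ']]']
```

Grouping the two brackets back into one logical "link added" change would be the kind of post processing mentioned at 16:53, for example pairing an added "[[" with the next added "]]".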
[19:51:38] Scoring-platform-team, Research: Extract cross-wiki WikiProject tags - https://phabricator.wikimedia.org/T240273 (Isaac) > see https://github.com/halfak/wikitax/tree/master/datasets Complete -- both uploaded with brief descriptions > I emailed support to get it re-enabled. Weird. Sounds good, just let m...
[19:52:24] Scoring-platform-team, Research: Extract cross-wiki WikiProject tags - https://phabricator.wikimedia.org/T240273 (Isaac)
[19:57:06] goin for lunch, back in a bit
[19:59:52] * halfak watches puppet agent tv
[20:23:29] Well the new staging server for wikilabels seems to be broken for no good reason.
[20:23:41] I'm getting a syntax error deep in a package.
[20:23:45] And it's not a syntax error.
[20:23:49] Simple assignment.
[20:24:06] Oh! It's using the word "async"
[20:24:09] Might be reserved.
[20:24:12] shoot.
[20:24:13] hmm
[20:32:10] Aha! It works.
[20:49:12] Scoring-platform-team (Current), Cloud-VPS (Debian Jessie Deprecation), Patch-For-Review: "wikilabels" Cloud VPS project jessie deprecation - https://phabricator.wikimedia.org/T236546 (Halfak) Open→Resolved OK this is done!
[22:01:45] o/ accraze
[22:01:50] I'm looking at i18n for Jade API.
[22:01:52] I see "apihelp-jadeendorse-param-notes": "Notes to save when creating a new proposal A warning will be raised if a proposal already exists. Notes will not be automatically overwritten.",
[22:02:01] jadeendorse shouldn't take notes.
[22:02:23] Is this an oversight or something deeper I'm missing.
[22:02:56] Aha! I don't see notes in Api/Endorse.php
[22:03:06] So it looks like I might be able to just delete this. :)
[22:03:24] I'ma squash it.
[22:16:27] Bam: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Jade/+/558233
[22:19:57] accraze, could I bug you to take a look at https://wikitech.wikimedia.org/wiki/ORES/Prepare_deployment#Push_LFS_data_to_gerrit for me? I was hoping to have Kevin do it but I have found myself blocked. I want to get the new deploy out to beta today if possible.
[22:29:08] Jade, I18n, Patch-For-Review: Please document Jade API messages in qqq - https://phabricator.wikimedia.org/T240780 (Halfak) Thanks for catching this @Amire80!
[22:29:19] Jade, Scoring-platform-team (Current), I18n, Patch-For-Review: Please document Jade API messages in qqq - https://phabricator.wikimedia.org/T240780 (Halfak)
[22:29:33] Bah. Out of time. Gonna call it.
[22:29:37] Have a good one folks!
[22:29:37] o/
[22:33:26] Scoring-platform-team, Discovery-Search, Growth-Team: Allow searching articles by ORES drafttopic - https://phabricator.wikimedia.org/T240517 (Tgr)
[23:30:05] Scoring-platform-team, Discovery-Search, Growth-Team: Allow searching articles by ORES drafttopic - https://phabricator.wikimedia.org/T240517 (Tgr) I imagine in the long term you'd want that capability anyway, since using interwiki data can probably improve predictions even when they are not fully pr...
[23:58:18] ahh just saw this halfak, will give the lfs stuff a try real quick
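Note on the 20:23-20:24 "async" syntax error: a plain assignment to a name called `async` fails to compile on Python 3.7 and later, where `async` is a reserved keyword. The sketch below shows how that can be checked. It assumes the failing package was being imported under Python 3.7 or newer; the log does not say which version the staging server runs or which package was involved, and the sample source strings are made up.

```python
import keyword
import sys


def compiles(source):
    """Return True if `source` parses as a Python module on this interpreter."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:2])))
    # On 3.7+ this reports that "async" is reserved.
    print("async reserved:", keyword.iskeyword("async"))
    # A "simple assignment" that older code could use as a variable name.
    print("'async = True' compiles:", compiles("async = True"))
    # Renaming the variable avoids the clash.
    print("'is_async = True' compiles:", compiles("is_async = True"))
```

When the offending name lives in a third-party dependency, the usual way out is upgrading to a release that renamed it, or pinning the interpreter to a version the package supports.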