[08:25:28] Scoring-platform-team (Current), drafttopic-modeling: Implement English pronoun count features in topic models - https://phabricator.wikimedia.org/T242345 (kevinbazira) I've added English pronoun count features to drafttopic. PR: https://github.com/wikimedia/drafttopic/pull/43
[14:25:49] ORES, Scoring-platform-team: [Discuss] Future ORES architecture - https://phabricator.wikimedia.org/T226193 (Ottomata) https://flyte.org/
[14:36:19] o/ kevinbazira
[14:36:25] I just saw your PR. Having a look.
[14:37:03] o/ halfak
[14:37:12] Are you working on rebuilding the model now?
[14:37:32] Which model should I rebuild?
[14:40:07] either drafttopic or articletopic -- take your pick.
[14:40:12] Eventually we'll want to rebuild both.
[14:40:23] But we need one now to find out if the theory holds in practice.
[14:42:55] Alright, should I rebuild the model before the review?
[14:46:28] Yes. We don't know if the change works yet. We have a hypothesis: "Pronoun counts & proportion should help us identify biographies of women". We need to test it empirically.
[14:47:04] Personally, I can't imagine this *wouldn't* work. But hey, it's always good to know what we're getting before we commit to adding new features.
[14:48:00] I see. Let me start the rebuild process.
[15:10:02] halfak, while rebuilding the drafttopic model, I'm getting a TypeError with load_kv(). Do you mind taking a look at it via a short video call?
[15:13:12] Sure. Call when ready
[15:13:15] kevinbazira, ^
[15:13:31] But you probably need to upgrade your version of revscoring.
[15:13:37] pip install revscoring==2.6.3
[15:14:11] Alright, let me first upgrade revscoring!
[15:58:52] ORES, Scoring-platform-team: [Discuss] Future ORES architecture - https://phabricator.wikimedia.org/T226193 (Nuria) It is probably wise to start here with the build piece and do a design document on how to move the build of models to happen in the cluster. Some projects that do that very thing for scikit...
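For reference, the hypothesis stated at 14:46:28 (pronoun counts and proportions as a signal for biographies of women) can be illustrated with a minimal sketch. This is not the code from the drafttopic PR and does not use the revscoring API; the word lists, the function name `pronoun_features`, and the returned keys are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical pronoun lists for illustration only; the lists used in the
# actual drafttopic PR may differ.
FEMININE = {"she", "her", "hers", "herself"}
MASCULINE = {"he", "him", "his", "himself"}

# Runs of ASCII letters; a deliberately naive tokenizer for English text.
WORD_RE = re.compile(r"[a-z]+")


def pronoun_features(text):
    """Return counts and proportions of feminine and masculine pronouns."""
    counts = Counter(WORD_RE.findall(text.lower()))
    fem = sum(counts[w] for w in FEMININE)
    masc = sum(counts[w] for w in MASCULINE)
    total = fem + masc
    return {
        "feminine_count": fem,
        "masculine_count": masc,
        # Proportions are relative to all gendered pronouns found;
        # 0.0 is returned when the text contains none.
        "feminine_proportion": fem / total if total else 0.0,
        "masculine_proportion": masc / total if total else 0.0,
    }


features = pronoun_features("She published her first paper in 1903. He reviewed it.")
print(features)
```

Whether features like these actually improve the topic model is exactly what the rebuild discussed above is meant to test.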
[20:22:29] Arg. Totally got lost in mwtext land again. I want it to be flexible for non-English wikis and I want a set of tests that work before I declare victory.
[20:22:44] I think I can make it today. Then I'll set up a demo for how it works.
[20:24:01] I need to change locations. back in 30
[21:55:47] MediaWiki-extensions-ORES, Scoring-platform-team, Discovery-Search, Growth-Team, NewcomerTasks 1.1: Expose ORES drafttopic data in ElasticSearch via a custom CirrusSearch keyword - https://phabricator.wikimedia.org/T240559 (Tgr) >>! In T240559#5770430, @dcausse wrote: > I suggest a keyword sl...
[22:48:13] Finally! https://github.com/mediawiki-utilities/python-mwtext
[22:48:29] Turns out this works great for Korean, Arabic, & Vietnamese -- I'm pretty sure.