[13:32:02] o/
[13:32:22] I'm working on the feature vectors again this morning.
[13:32:30] Time to figure out how tfidf will work.
[16:04:54] Well, I just wrote a custom term utility function. Looks like sklearn's tfidf feature selector was kind of lame.
[16:05:13] It makes the vector dense, but doesn't optimize for utility
[16:05:18] Which is a shame
[16:05:21] But easy to work around.
[16:05:27] Now I'm writing tests
[16:30:42] And pushed! I'm going to get some lunch. When I get back, I'll look at the problem of serializing a trained tfidf selector. Then I'll be looking at SelectFromModel.
[17:24:13] Revision-Scoring-As-A-Service, Wikilabels, User-Ladsgroup: Wikilabels UI reports non-200 status errors badly - https://phabricator.wikimedia.org/T138255#2576390 (Ladsgroup)
[17:25:40] https://github.com/wiki-ai/wikilabels/pull/134
[17:25:42] :D
[17:25:55] halfak: o/ I think it's safe to announce the deployment \o/
[17:26:28] Amir1, +1
[17:28:15] Amir1, https://github.com/wiki-ai/wikilabels/pull/134/files#r75912917
[17:30:36] This work on feature vectors is fun.
[17:33:21] halfak: https://github.com/wiki-ai/wikilabels/pull/134/files
[17:33:25] Awesome
[17:33:32] I will work on some NLP stuff soon
[17:33:38] I think it will be fun too
[17:34:48] Amir1, https://github.com/wiki-ai/wikilabels/pull/134/files#r75914192
[17:39:41] halfak: ^
[17:42:39] {{merged}}
[17:42:40] [1] https://meta.wikimedia.org/wiki/Template:merged
[17:43:38] thanks halfak, we should try deploying soon
[17:44:23] Agreed. I want to try to talk to Sage briefly about the changes to the article quality models
[17:46:32] I meant for wikilabels
[17:46:50] but ores sounds good too
[17:47:16] Oh!
[17:47:28] Yeah. Good point. We could do that today no problem.
[17:48:12] hmm
[17:48:16] I'm going to sleep
[17:48:19] OK. Sage has confirmed that this won't mess him up.
[17:48:23] Can I do it tomorrow my time?
[17:48:27] Sure!
[17:48:34] I'll need to copy JS to meta though.
[17:49:09] oh, tomorrow your time
[17:49:18] I do the first parts
[17:49:53] OK sounds good.
[17:57:42] halfak: Okay, what's the plan for the announcement before I call it a day?
[17:58:02] Should I do it? Do you want to do it?
[17:58:16] I think just a post on the enwiki village pump ala. https://meta.wikimedia.org/wiki/Objective_Revision_Evaluation_Service/Extension_announcement
[17:58:30] why don't you do it? I think you deserve the credit.
[17:58:43] I can show up right afterwards to help take any backlash.
[17:59:00] :)
[17:59:12] OK
[17:59:14] thanks
[18:10:51] announcements sent,
[18:10:56] heading to bed
[18:10:58] o/
[19:07:40] hi RoanKattouw Amir1 halfak. RoanKattouw suggested this channel may be of interest
[19:07:51] If you want to talk about ORES then yes :)
[19:08:06] :D
[19:09:41] dr0ptp4kt: Out of curiosity, what's your interest in ORES-related things? My team is going to work on incorporating ORES in a product this and next quarter, and will probably submit some patches to the ORES extension. Is there anything Reading is doing / interested in doing in this area that I don't know about?
[20:02:03] hey RoanKattouw. so, we're thinking about how we can surface things like an article quality panel in a reading ui
[20:02:26] RoanKattouw: but also considering things like we can filter and mix and match with api.php
[20:02:43] RoanKattouw: but also pageview stuff
[20:03:36] RoanKattouw: what's the thing you're talking about?
[20:04:03] RoanKattouw: a question is if tgr and anomie can help with scaling and more general api.php-ness
[20:39:16] Revision-Scoring-As-A-Service, MediaWiki-extensions-ORES: Edits being flagged by review tool on enwiki aren't likely to be damaging - https://phabricator.wikimedia.org/T143738#2577355 (Halfak)
[21:12:20] dr0ptp4kt: Re api.php-ness, I think we probably got it (it's simple) although advice from anomie is always appreciated for anything api.php-related
[21:12:31] I'll send you links to the tasks
[21:13:03] Re help with scaling, I think that would be more for scaling the scoring backend, which you should talk to halfak about
[21:13:30] I've been thinking about scaling quite a bit.
[21:13:42] Re article quality, also halfak. Not sure what you mean by pageview stuff, do you mean like exposing the # of views for that page on the page?
[21:13:51] In the past three weeks, we've doubled our capacity on the same hardware ^_^
[21:47:31] Little changes in memory usage allow us to start up more workers.
[21:48:18] Revision-Scoring-As-A-Service, Wikilabels, rsaas-editquality: Deploy 2016 edit quality campaign to English Wikipedia - https://phabricator.wikimedia.org/T143745#2577541 (Halfak)
[21:48:47] Revision-Scoring-As-A-Service, Wikilabels, rsaas-editquality: Deploy 2016 edit quality campaign to English Wikipedia - https://phabricator.wikimedia.org/T143745#2577555 (Halfak) https://quarry.wmflabs.org/query/11963 https://en.wikipedia.org/w/index.php?title=Wikipedia%3ALabels%2FEdit_quality&type=r...
[21:58:59] OK. Cleaned up the vector stuff more. So right now, I have (1) a viable strategy for gramming, hashing, tfidf selection, and serialization and (2) working tests for all that stuff.
[21:59:41] I think I'll be working to bridge the gap between a feature vector with sub-vectors and the way that ScorerModels behave.
[22:00:23] I think we'll want a utility for training selectors like tfidf and estimator-based strategies
[22:00:38] I suppose I'll work on a demo before I start worrying about what the utility is going to look like.
[22:00:46] I'm calling it a day.
[22:00:47] o/
[22:01:23] Revision-Scoring-As-A-Service, revscoring: Implement abstraction for Sparse Feature Vectors - https://phabricator.wikimedia.org/T132580#2577598 (Halfak) My notes from IRC: ``` [16:58:59] OK. Cleaned up the vector stuff more. So right now, I have (1) a viable strategy for gramming, hashing, tf...
[22:01:48] Revision-Scoring-As-A-Service, revscoring: Implement abstraction for Sparse Feature Vectors - https://phabricator.wikimedia.org/T132580#2577599 (Halfak) BTW, all updates are in https://github.com/wiki-ai/revscoring/pull/284
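
Editor's note: the 21:58:59 message lists gramming, hashing, tfidf selection, and serialization as the pieces of the strategy. The log does not show the actual revscoring code, so the following is only a rough sketch of how those four steps could fit together with scikit-learn. The function names `fit_selector` and `apply_selector`, the "summed tf-idf weight" utility, and all parameter values are assumptions made for illustration; they are not the custom term utility function mentioned at 16:04:54.

```python
import pickle

import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer


def make_hasher(n_features=2 ** 10):
    # Gramming + hashing: word unigrams and bigrams hashed into a fixed-width
    # sparse count vector. norm=None keeps raw counts for the tf-idf step.
    return HashingVectorizer(n_features=n_features, ngram_range=(1, 2),
                             alternate_sign=False, norm=None)


def fit_selector(docs, k, n_features=2 ** 10):
    """Hypothetical helper: learn idf weights and pick the k hashed columns
    with the highest summed tf-idf weight (an assumed utility measure)."""
    counts = make_hasher(n_features).transform(docs)
    tfidf = TfidfTransformer()
    weighted = tfidf.fit_transform(counts)
    utility = np.asarray(weighted.sum(axis=0)).ravel()
    columns = np.sort(np.argsort(utility)[-k:])
    # The trained state is plain sklearn/numpy objects, so it pickles cleanly.
    return {"n_features": n_features, "tfidf": tfidf, "columns": columns}


def apply_selector(state, docs):
    """Hypothetical helper: hash, weight, and keep only the selected columns."""
    counts = make_hasher(state["n_features"]).transform(docs)
    weighted = state["tfidf"].transform(counts)
    return weighted[:, state["columns"]].toarray()


docs = [
    "reverted vandalism on the article",
    "added a citation to the article",
    "fixed a typo in the lead section",
]
state = fit_selector(docs, k=5)
vectors = apply_selector(state, docs)

# Serialization: round-trip the trained selector state through pickle.
restored = pickle.loads(pickle.dumps(state))
```

The hasher is stateless, so only the fitted `TfidfTransformer` and the selected column indices need to be stored; that is one possible reading of "serializing a trained tfidf selector", not a description of what PR 284 does.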