[00:15:20] Revision-Scoring-As-A-Service-Backlog, Edit-Review-Improvements, Collab-Team-Q1-July-Sep-2016, Design: Design system for signifying ORES Good-Faith and Damaging scores - https://phabricator.wikimedia.org/T143455#2593320 (jmatazzoni)
[12:39:37] (PS1) Ladsgroup: Purging should deelte when model is null too [extensions/ORES] - https://gerrit.wikimedia.org/r/307495
[12:40:40] (CR) jenkins-bot: [V: -1] Purging should deelte when model is null too [extensions/ORES] - https://gerrit.wikimedia.org/r/307495 (owner: Ladsgroup)
[12:41:34] (PS2) Ladsgroup: Purging should deelte when model is null too [extensions/ORES] - https://gerrit.wikimedia.org/r/307495
[13:02:09] Revision-Scoring-As-A-Service, MediaWiki-extensions-ORES, User-Ladsgroup: Check model version replaces every time it runs. - https://phabricator.wikimedia.org/T144195#2594289 (Ladsgroup) Since it runs replace instead of something like insert... on duplicate key update oresc_model=oresc_model (see htt...
[13:36:32] (CR) Thiemo Mättig (WMDE): [C: 1] "How can it happen that this is null? Incomplete data from an old version? Isn't it better to clean this up via update.php instead?" [extensions/ORES] - https://gerrit.wikimedia.org/r/307495 (owner: Ladsgroup)
[13:51:55] Well... I just crashed and lost all of my processing from yesterday. Woo. :(
[13:56:11] (CR) Ladsgroup: "Mostly because CheckModelVersion.php is/was broken and removes old rows in ores_model table causing this left join to fail to find such ca" [extensions/ORES] - https://gerrit.wikimedia.org/r/307495 (owner: Ladsgroup)
[13:59:42] :(
[14:06:18] (PS3) Ladsgroup: Purging should delete when model is null too [extensions/ORES] - https://gerrit.wikimedia.org/r/307495
[14:08:08] (CR) Ladsgroup: [C: 2] Purging should delete when model is null too [extensions/ORES] - https://gerrit.wikimedia.org/r/307495 (owner: Ladsgroup)
[15:45:30] OK. I'm tired of this nonsense with losing my data, so I'm reworking the utilities so that we can use them to train feature selectors.
[15:45:43] Intermediate datasets for the win
[17:08:25] Revision-Scoring-As-A-Service, MediaWiki-API, MediaWiki-extensions-ORES, User-Ladsgroup: [Discuss] api.php integration with ORES - https://phabricator.wikimedia.org/T122689#1910930 (DarTar) @Halfak @Ladsgroup can you guys briefly comment on this ticket being closed as resolved? Is there "a plan f...
[17:31:08] Revision-Scoring-As-A-Service, MediaWiki-API, MediaWiki-extensions-ORES, User-Ladsgroup: [Discuss] api.php integration with ORES - https://phabricator.wikimedia.org/T122689#2595168 (Halfak) @DarTar, please see cards mentioned above: * T143614: Introduce ORES rvprop .Mon, Aug 22, 21:09 * T143616:...
[17:43:03] OK. So I'm thinking about adjusting the pipeline of revscoring to produce JSON lines
[17:43:07] Rather than TSV
[17:43:34] So, we'd start with {"rev_id": ..., "label": true} or something like that.
[17:43:57] We can pre-extract data to get {"rev_id": ..., "cache": ..., "label": true}
[17:44:39] And then when we're ready to extract features, we get {"rev_id": ..., "cache": ..., "features": ..., "label": true}
[17:44:54] Except that we'd probably drop the "cache" at that point to save on memory.
[17:49:24] This would give us a lot of flexibility in how we run our feature extractors. E.g. we can cache the datasources that go into training the selectors like TFIDF
[17:49:40] Generating those from the XML dumps would probably be best
[18:07:25] Revision-Scoring-As-A-Service, MediaWiki-API, MediaWiki-extensions-ORES, User-Ladsgroup: [Discuss] api.php integration with ORES - https://phabricator.wikimedia.org/T122689#2595364 (DarTar) @Halfak excellent, thanks. I'll copy them in the description.
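Editor's sketch of the JSON lines pipeline proposed at 17:43 to 17:44 above. It is only an illustration of the idea and not the revscoring implementation: the function names (pre_extract, extract_features, fake_fetch_cache, fake_compute_features) and the sample data are hypothetical, and the real datasources and feature extractors are stubbed out. Each stage reads one JSON object per line, adds a field, and writes one JSON object per line, so every intermediate dataset can be saved and reused.

```python
import json


def pre_extract(obs, fetch_cache):
    """Stage 1: attach a "cache" of pre-fetched datasources to an observation."""
    return {**obs, "cache": fetch_cache(obs["rev_id"])}


def extract_features(obs, compute_features):
    """Stage 2: add "features" and drop the "cache" to save on memory."""
    out = {k: v for k, v in obs.items() if k != "cache"}
    out["features"] = compute_features(obs["cache"])
    return out


# Hypothetical stand-ins for the real datasource fetch and feature extraction.
def fake_fetch_cache(rev_id):
    return {"text": "revision %d text" % rev_id}


def fake_compute_features(cache):
    return [len(cache["text"])]


# Starting dataset: one JSON document per line.
labeled_lines = [
    '{"rev_id": 1, "label": true}',
    '{"rev_id": 22, "label": false}',
]

# Each stage is a line-by-line transformation; its output could be written
# to disk as an intermediate dataset before the next stage runs.
cached_lines = [
    json.dumps(pre_extract(json.loads(line), fake_fetch_cache))
    for line in labeled_lines
]
feature_lines = [
    json.dumps(extract_features(json.loads(line), fake_compute_features))
    for line in cached_lines
]

for line in feature_lines:
    print(line)
```

Because every stage consumes and produces plain lines, a crash only costs the stage that was running; earlier outputs stay on disk.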
[18:08:59] Revision-Scoring-As-A-Service, MediaWiki-API, MediaWiki-extensions-ORES, User-Ladsgroup: [Discuss] api.php integration with ORES - https://phabricator.wikimedia.org/T122689#2595366 (DarTar)
[18:46:35] I just produced yamlconf 0.2.0 so that we can import deep attributes (which will be necessary for specifying dependencies on the commandline)
[19:57:49] o/ Amir1
[20:35:19] halfak: hey
[20:35:21] I was afk
[20:36:16] Hey! No worries. I've been hacking on a generalized extraction pattern for priming vectors all day
[20:37:49] awesome
[21:39:00] (PS1) Ladsgroup: Improve CheckModelVersions.php [extensions/ORES] - https://gerrit.wikimedia.org/r/307624 (https://phabricator.wikimedia.org/T144195)
[21:40:58] halfak: I fixed some bugs in the extension ^_^
[21:41:29] Nice.
[21:41:47] I'm considering a serious refactor to our feature extraction patterns.
[21:44:32] halfak: and only one thing left in my backlog for ORES, two actually. I was thinking it's time to pick up the KDD paper
[21:44:49] That sounds like a good idea. The timing is pretty good too.
[21:46:31] I really want to to finish this paper, it has consuming some parts of my brain RAM
[21:46:46] *it has been
[21:54:27] +1. Nice to get things off of the backburner
[22:01:07] * halfak appreciates the brain RAM metaphore
[22:01:09] -e
[22:01:59] We're up to 299 tests in revscoring O.O
[22:33:38] :D
[23:36:55] I just pushed a bunch more changes. Will describe in phab quick and then I'll call it a day.
[23:39:32] I just realized that JSONing a map with integer terms is going to be a bad-time. Arg
[23:39:43] Since JSON will convert keys to strings
[23:39:46] *sigh*
[23:39:52] Oh well. Leaving notes anyway.
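Editor's sketch of the integer-key problem noted at 23:39 above, using Python's standard json module. The map of term ids to counts is made-up sample data, and the two workarounds shown (converting keys back on load, or storing key/value pairs as a list) are only possible options; the log does not say which approach, if any, was adopted.

```python
import json

# Hypothetical map from integer term ids to counts.
term_counts = {3: 2, 17: 1}

# JSON object keys are always strings, so integer keys change type
# on a round trip through dumps/loads.
round_tripped = json.loads(json.dumps(term_counts))
print(round_tripped)

# Workaround 1: convert the keys back to int after loading.
restored = {int(key): value for key, value in round_tripped.items()}

# Workaround 2: store the map as a list of [key, value] pairs, which
# keeps the keys as JSON numbers.
pairs = json.loads(json.dumps(sorted(term_counts.items())))
from_pairs = {key: value for key, value in pairs}

print(restored == term_counts, from_pairs == term_counts)
```

The first workaround keeps the file format compact but relies on every reader knowing the keys were integers; the second makes the key type explicit in the data itself.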