[12:38:45] <wikibugs>	 (CR) Ladsgroup: "I will work on it in later patches." [extensions/ORES] - https://gerrit.wikimedia.org/r/400625 (https://phabricator.wikimedia.org/T181892) (owner: Ladsgroup)
[14:51:01] <awight>	 o/ Nice to be back!
[15:04:30] <halfak>	 o/
[15:07:32] * halfak starts in on the email
[15:07:46] <halfak>	 Oh!  I have fiwiki models. 
[15:07:49] * halfak pokes at that
[15:07:54] <awight>	 nice work
[15:08:15] <icinga2-wm>	 PROBLEM - puppet on ORES-worker08.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[15:10:20] <awight>	 I’m going to push our logging changes at 18:00 UTC
[15:12:08] <halfak>	 awight, sounds good. :) 
[15:12:14] <halfak>	 fiwiki is WEIRD
[15:12:24] <icinga2-wm>	 PROBLEM - puppet on ORES-redis02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[15:12:30] * halfak starts working on his assumptions. 
[15:12:43] <awight>	 In what way?  Test stats?
[15:13:08] <awight>	 This is the model with FlaggedRevs mixed in, right?
[15:13:28] <halfak>	 goodfaith model gets high ROC-AUC and damaging gets very low ROC-AUC
[15:13:31] <halfak>	 right
[15:14:24] <awight>	 O_o
[15:15:31] <awight>	 Maybe we could use the FR data to train goodfaith, but not when training damaging… until the mystery is dispelled.
[15:18:08] <halfak>	 hmm...  There's definitely something strange going on here. 
[15:21:31] <wikibugs>	 Scoring-platform-team, editquality-modeling, artificial-intelligence: Investigate code generation for model makefile maintenance - https://phabricator.wikimedia.org/T168455#3867303 (awight) @Halfak @Ladsgroup This seemed like a fruitful project, and my prototype is c. 50% complete.  Is there a good t...
[15:37:46] <icinga2-wm>	 RECOVERY - puppet on ORES-worker08.experimental is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures
[15:41:54] <icinga2-wm>	 RECOVERY - puppet on ORES-redis02.experimental is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[15:44:22] <halfak>	 ^ puppet show
[15:46:05] <Adotchar>	 lol
[15:49:58] <halfak>	 :D
[15:50:02] <halfak>	 So many emails. 
[15:54:19] <Amir1>	 awight: hey, can you review this patch soon? https://gerrit.wikimedia.org/r/#/c/400183/ It would be great if I can merge this before the branch cut (in a couple of hours)
[15:55:24] <awight>	 Amir1: Will do!
[15:55:37] <Amir1>	 Thank you!
[16:01:47] <awight>	 Amir1: Do you have time to write a test that integrates using the top-level API response?
[16:02:23] <Amir1>	 I think I should do it in another patch, especially since I will be moving this method around
[16:02:33] <awight>	 kk reviewing
[16:02:59] <Amir1>	 awight: I have integration tests for the API up and working: https://gerrit.wikimedia.org/r/#/c/400623/
[16:03:16] <halfak>	 awight & Amir1: Do we need to do anything to get simple english enabled in rc filters this week?
[16:03:34] <awight>	 halfak: It needs more beta testing.
[16:03:49] <awight>	 RCFilter pieces were failing intermittently.
[16:05:49] <awight>	 Amir1: Those tests still wouldn’t catch the key glitch we saw earlier...
[16:06:52] <Amir1>	 Which key glitch?
[16:07:23] <awight>	 The one where checkModelVersions expected the full API response in some places and a subtree in others
[16:07:28] <halfak>	 gotcha.  Thanks awight 
[16:07:33] <Amir1>	 These tests would catch changes in the API hooks regarding the database query, since they do the round trip to the database and back
[16:07:56] <Amir1>	 Yeah, there are two different things
[16:09:16] <awight>	 Sorry to harp on this, but I still think it’s a good idea to log notices if the API response structure is surprising, rather than silently return null.
[16:10:11] <awight>	 I agree with your argument that it should be “safe”, but IMO it’s even safer to not crash, and also log any funkiness
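awight's point above (log a notice when the API response has a surprising shape, rather than silently returning null) could look roughly like the sketch below. It is in Python for illustration only; the extension itself is PHP, and the response layout, the function name `extract_model_version`, and the logger name are all assumptions rather than the ORES extension's actual code.

```python
import logging

logger = logging.getLogger("ores.sketch")  # hypothetical logger name


def extract_model_version(response, wiki, model):
    """Walk an assumed wiki -> "models" -> model -> "version" path.

    Returns the version string, or None if the structure is surprising.
    Unlike a silent null, the surprise is logged so it can be noticed.
    """
    node = response
    for key in (wiki, "models", model, "version"):
        if not isinstance(node, dict) or key not in node:
            logger.warning(
                "Unexpected API response structure: missing key %r", key
            )
            return None
        node = node[key]
    return node


# A full response resolves; a bare subtree is reported instead of crashing.
full = {"enwiki": {"models": {"damaging": {"version": "0.4.0"}}}}
subtree = {"damaging": {"version": "0.4.0"}}
print(extract_model_version(full, "enwiki", "damaging"))
print(extract_model_version(subtree, "enwiki", "damaging"))
```

The caller still gets a "safe" None in the bad case, but the full-response-versus-subtree mix-up described at 16:07:23 would leave a trace in the logs.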
[16:10:29] * awight shakes off holiday conservatism
[16:10:49] <awight>	 anyway, yeah let’s get this thing out and see what it does.  It does look safe.
[16:13:16] <codezee>	 o/
[16:14:19] <wikibugs>	 (CR) Awight: [C: -1] "Lacking a call to updateModelVersion" (2 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/400183 (https://phabricator.wikimedia.org/T183468) (owner: Ladsgroup)
[16:15:09] <Zppix>	 O/
[16:15:38] <wikibugs>	 (CR) Awight: [C: -1] "This or a future patch should include a test that integrates across getScores, and checks that we've updated the model version if necessar" [extensions/ORES] - https://gerrit.wikimedia.org/r/400183 (https://phabricator.wikimedia.org/T183468) (owner: Ladsgroup)
[16:18:02] <wikibugs>	 (PS5) Ladsgroup: Update model version when it's different in Scoring [extensions/ORES] - https://gerrit.wikimedia.org/r/400183 (https://phabricator.wikimedia.org/T183468)
[16:19:01] <wikibugs>	 (CR) Awight: [C: 2] "Great, concise!" [extensions/ORES] - https://gerrit.wikimedia.org/r/400183 (https://phabricator.wikimedia.org/T183468) (owner: Ladsgroup)
[16:19:36] <wikibugs>	 (CR) Ladsgroup: "I will write integration tests for that ASAP." [extensions/ORES] - https://gerrit.wikimedia.org/r/400183 (https://phabricator.wikimedia.org/T183468) (owner: Ladsgroup)
[16:24:15] <wikibugs>	 (Merged) jenkins-bot: Update model version when it's different in Scoring [extensions/ORES] - https://gerrit.wikimedia.org/r/400183 (https://phabricator.wikimedia.org/T183468) (owner: Ladsgroup)
[16:46:02] <RoanKattouw>	 Amir1: Aw man so I'm working on what you said for normalizing the tag schema and realizing two things
[16:46:33] <RoanKattouw>	 1) This really makes the change_tag_statistics table the table that defines tags (the ID->name mapping) and so it should really be called change_tag... but that table name is already taken
[16:46:45] <RoanKattouw>	 2) There is a valid_tag table with basically no documentation as to what it does
[16:47:03] <Amir1>	 facepalm
[16:47:04] * RoanKattouw grumbles and goes off to do some code archeology into valid_tag
[16:47:26] <RoanKattouw>	 For laughs, go take a look at the schema of the valid_tag table in tables.sql
[16:47:40] <Amir1>	 RoanKattouw: Yeah, I think the name should be changed to something more meaningful but I can't find a proper name either
[16:48:25] <RoanKattouw>	 If there isn't some kind of crazy reason why we can't, I think my preferred approach would be to expand the valid_tag table to contain both the ID+name mapping and the stats
[16:48:36] <RoanKattouw>	 But first I have to figure out WTH that table is used for now
[16:48:41] <Amir1>	 https://github.com/wikimedia/mediawiki/blob/master/maintenance/archives/patch-valid_tag.sql
[16:48:48] <Amir1>	 That's amazing
[16:49:09] <RoanKattouw>	 enwiki.valid_tag contains 5 rows so that's not very promising
[16:53:02] <awight>	 O_O burn it!
[16:54:03] <RoanKattouw>	 Yeah I probably can
[16:54:15] <codezee>	 halfak: I wrote a Model class that holds a collection of these classifiers, it works perfectly but the scoring seems to have taken a hit, scoring one instance with 40 diff classifiers taking 6s
[16:54:25] <RoanKattouw>	 But dinner beckons, so I'll eat first, then figure out whether I can burn it
[16:54:25] <codezee>	 I'm looking for a way to optimize right now
[16:54:53] <RoanKattouw>	 But yes it basically looks useless
[16:55:00] <Amir1>	 RoanKattouw: We checked and it's used in ChangeTags.php but not sure why the methods are orphaned
[16:56:21] <awight>	 RoanKattouw: On another topic, would you feel like talking through our wacky JADE-in-ContentHandler ideas some time, for maybe 30min?
[16:59:34] <Amir1>	 awight: for when you have some free time: https://gerrit.wikimedia.org/r/#/q/owner:Ladsgroup%2540gmail.com+status:open+project:mediawiki/extensions/ORES
[17:00:49] <awight>	 Cool!
[17:03:25] <halfak>	 codezee, damn.  I figured that might be a problem.  I'm still surprised that it takes 6 seconds.  That seems like a really long time. 
[17:03:44] <halfak>	 Is all the time spent waiting on estimator to return a predict_proba()? 
[17:04:44] <awight>	 codezee: halfak: Random thing I’ve been wondering is, what volume of new articles will we be scoring and does the scoring latency matter?
[17:05:11] <halfak>	 ^ good point.  Maybe it's OK to score an article once every second. 
[17:05:17] <halfak>	 I don't think that once every 6 will work
[17:05:23] <codezee>	 awight: yes I'm not sure about that, and if we score a bunch of articles together we could gain on that front
[17:06:19] <codezee>	 halfak: do we not fire parallel scoring requests in ORES currently?
[17:06:46] <halfak>	 codezee, we do not.  That might be preventative, but maybe it is worth a shot. 
[17:06:58] <halfak>	 codezee, oh wait. Yes parallel. 
[17:07:18] <codezee>	 I was thinking it's trivial if the requests are independent and the requester has the patience to wait for 6s
[17:07:18] <halfak>	 But I was just thinking, what if you did parallelization within the prediction itself. 
[17:07:27] <awight>	 https://en.wikipedia.org/wiki/Special:NewPagesFeed suggests it’s 1-2 per minute
[17:08:03] <codezee>	 halfak: I'm trying to do that exactly but when i used multiprocessing it took 54s ! and 52s of that were stuck in acquire_lock
[17:08:07] <codezee>	 something I'm missing here
[17:08:26] <halfak>	 codezee, let's not do that.  Now that I think of it, ORES will fork bomb if that happens. 
[17:09:59] <codezee>	 halfak: the real problem is with the number of estimators per classifier, currently 400, if we drop that to 50 while retaining the fitness we gain 8 times, I've generated results with n_estimators as 50, looking into them
[17:12:15] <halfak>	 OK that sounds good :) 
[17:15:04] <codezee>	 oh, clearly that hypothesis holds, with n_estimators as 50, it takes 1.3s \o/ we just need to balance off n_estimators till the limit we can
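The two timings codezee reports (about 6 s at n_estimators=400, about 1.3 s at n_estimators=50) hint that the cost is not purely proportional to the number of estimators: an 8x cut would predict 6 s / 8 = 0.75 s. A minimal sketch, assuming a linear "fixed overhead plus per-estimator cost" model fitted to just those two rounded figures; the model and the function names are illustrative assumptions, not measurements.

```python
def fit_linear_cost(n1, t1, n2, t2):
    """Fit t = fixed + per_tree * n through two (n_estimators, seconds) points."""
    per_tree = (t1 - t2) / (n1 - n2)
    fixed = t1 - per_tree * n1
    return fixed, per_tree


# The only two data points from the chat; both are rounded, so treat the
# fitted numbers as rough guidance only.
FIXED, PER_TREE = fit_linear_cost(400, 6.0, 50, 1.3)


def predicted_seconds(n_estimators):
    """Predicted scoring time under the assumed linear model."""
    return FIXED + PER_TREE * n_estimators


for n in (50, 100, 200, 400):
    print(n, round(predicted_seconds(n), 2))
```

Under this assumption roughly 0.6 s of each request does not depend on n_estimators at all, so shrinking the forests further would give diminishing returns.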
[17:17:45] <wikibugs>	 (CR) Awight: [C: 2] Introduce ScoreStorage and its Sql implementetion (2 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/398651 (https://phabricator.wikimedia.org/T181334) (owner: Ladsgroup)
[17:19:28] <wikibugs>	 (Merged) jenkins-bot: Introduce ScoreStorage and its Sql implementetion [extensions/ORES] - https://gerrit.wikimedia.org/r/398651 (https://phabricator.wikimedia.org/T181334) (owner: Ladsgroup)
[17:20:09] <codezee>	 and looks like the results aren't even that much affected
[17:20:43] <wikibugs>	 (CR) Awight: [C: 2] "Thanks!" [extensions/ORES] - https://gerrit.wikimedia.org/r/400623 (https://phabricator.wikimedia.org/T182942) (owner: Ladsgroup)
[17:22:01] <wikibugs>	 Scoring-platform-team (Current), ORES, Operations, Graphite, User-fgiunchedi: Regularly purge old ores graphite metrics - https://phabricator.wikimedia.org/T169969#3867919 (Halfak) @fgiunchedi, can you help me figure out what our next step should be here?
[17:22:09] <wikibugs>	 (Merged) jenkins-bot: Integration tests for API [extensions/ORES] - https://gerrit.wikimedia.org/r/400623 (https://phabricator.wikimedia.org/T182942) (owner: Ladsgroup)
[17:23:54] <RoanKattouw>	 awight: Sure! But let's schedule a time. I wanna make progress on the tag stuff today, and tomorrow I have a lot of meetings
[17:24:05] <awight>	 RoanKattouw: k that sounds best to me, too.
[17:24:18] <awight>	 I’ll pencil something in for Thursday, feel free to move it.
[17:24:23] <RoanKattouw>	 Sounds good
[17:24:32] <RoanKattouw>	 Actually Wed and Thu are equally bad so feel free to use either
[17:28:12] <RoanKattouw>	 As for tags: some docs seem to claim that only tags defined in the valid_tag table can be used, but that's clearly not true
[17:28:15] <awight>	 lol next week is fine, too.
[17:29:46] <RoanKattouw>	 Hmm but now you've moved it to 8am :(
[17:29:57] <awight>	 RoanKattouw: Oops, yeah I see you’re back on Pacific time by then.
[17:30:08] <RoanKattouw>	 Yeah sorry
[17:30:19] <RoanKattouw>	 gcal's TZ feature is nice but gets a bit confusing when people travel
[17:30:25] <RoanKattouw>	 I'm on CET this week and PST next week
[17:30:57] <RoanKattouw>	 I'd rather have it this week at a time that I'm awake than next week at a time that I'm wishing I were still asleep ;)
[17:31:32] <awight>	 It’s not a huge rush, cos MediaWiki is explicitly not happening in the current iteration of our project.
[17:31:39] <awight>	 *MediaWiki integration
[17:31:57] <codezee>	 halfak: before I push this major model addition, I want to check this - I've added a new Model class EnsembleClassifier(ProbabilityClassifier) that does everything and made a new RandomForestEnsemble as its subclass, rather than just inserting this functionality in ProbabilityClassifier, does that sound okay?
[17:32:05] <awight>	 RoanKattouw: Your calendar is lamentable :p
[17:32:12] <RoanKattouw>	 Yeah, I know :/
[17:32:26] <RoanKattouw>	 Comes with the territory of being on the annual planning core group this year, I guess
[17:32:47] <RoanKattouw>	 That plus my volunteering engagement eats up a bunch of time, but that's mostly time that non-PST people can't have meetings anyway
[17:33:35] <awight>	 Can I ask what that is?
[17:33:58] <halfak>	 codezee, only concern is that "Ensemble" is already taken as a word. 
[17:34:00] <halfak>	 https://en.wikipedia.org/wiki/Multi-label_classification
[17:34:17] <halfak>	 I'm not sure we have a one-vs-rest or one-vs-all classifier. 
[17:36:11] <RoanKattouw>	 awight: Certainly! https://medium.com/@Srish_Aka_Tux/volunteering-at-scripted-what-i-knew-taught-and-learned-b8174545b8d0  (not with Srishti, I only found out she was doing the same thing today when she published this, but she and I do the same work in different schools)
[17:36:16] <halfak>	 codezee, I think it's One-vs-the-rest
[17:36:22] <RoanKattouw>	 awight: Also, more calendar lamentations, I had to decline again, sorry
[17:37:14] <codezee>	 halfak: any difference b/w one-vs-rest and one-vs-all?
[17:37:43] <halfak>	 Yes.  One vs. all is multiclass and one vs. rest is multilabel, it seems. 
[17:44:06] <awight>	 RoanKattouw: Maybe you can find a slot?  halfak and I normally work c. 14:00-23:00 UTC
[17:44:17] <RoanKattouw>	 OK will do
[17:44:28] <RoanKattouw>	 The one you had before, on Thursday the 4th, would be fine actually
[17:44:45] <halfak>	 codezee, actually, it seems like one-vs-all and one-vs-rest are the same.  I'm unclear on that :/
[17:45:10] <codezee>	 yes I'm also referring to the docs and they seem to be the same
[17:45:29] <codezee>	 anyways i think it wouldn't harm to go with one, i'm going with onevsrest
[17:46:04] <halfak>	 Cool. :) 
[17:46:05] <awight>	 RoanKattouw: Sounds good, thx.  ScriptEd looks fun!  I’ve been hoping to get into exactly that sort of unpaid work, I’ll check that out if I can ever afford to move back to the U.S. ;-)
[17:47:55] <awight>	 RoanKattouw: FYI your “contributions leadership” conflicts
[17:48:08] <awight>	 NVM
[17:48:11] <awight>	 wrong meeting
[18:13:17] <wikibugs>	 (CR) Awight: Clean up ThresholdLookup (3 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/400625 (https://phabricator.wikimedia.org/T181892) (owner: Ladsgroup)
[18:19:29] <travis-ci>	 wiki-ai/revscoring#1403 (ensemble - 2759b3d : Sumit Asthana): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/324230380
[18:22:18] <codezee>	 Drafttopic coming soon.... :D
[18:23:57] <halfak>	 awight, you're a saint for letting me get some lunch before we chat. 
[18:24:10] <awight>	 halfak: lolol better outcomes
[18:24:42] <awight>	 Any time, really.  It’s pouring cats and dogs on the tin roof I’m under, it might not let up within the next 30 min actually
[18:25:56] <wikibugs>	 Scoring-platform-team (Current), ORES, Operations: Investigate why ORES logs are being written to syslog despite explicit logging config.  Fix. - https://phabricator.wikimedia.org/T182614#3868235 (awight) A little additional excitement...  Now that we're seeing all the logs, some previously hidden er...
[18:28:28] <awight>	 ^ I was reading `ls` wrongly :)
[18:35:03] <awight>	 aaand… power went out to the town
[18:35:04] <awight>	 tiny little lightning storm.
[18:35:10] <awight>	 *Get a ground pin, y’all*
[18:38:57] <wikibugs>	 Scoring-platform-team: Jinja error in ORES - https://phabricator.wikimedia.org/T183949#3868333 (awight)
[18:39:11] <wikibugs>	 Scoring-platform-team, ORES: Jinja error in ORES - https://phabricator.wikimedia.org/T183949#3868344 (awight)
[19:03:25] <wikibugs>	 Scoring-platform-team, Collaboration-Team-Triage, Edit-Review-Improvements: "Hide probably good edits" should not hide my own edits on Special:Contributions/Myself - https://phabricator.wikimedia.org/T182462#3868633 (jmatazzoni) The answer here is to turn the feature off locally, on the Contributions...
[19:03:39] <wikibugs>	 Scoring-platform-team, Collaboration-Team-Triage, Edit-Review-Improvements: "Hide probably good edits" should not hide my own edits on Special:Contributions/Myself - https://phabricator.wikimedia.org/T182462#3868639 (jmatazzoni) Open>Resolved a:jmatazzoni
[19:13:23] <RoanKattouw>	 Ugh this change_tag ID stuff is going to be a bit of a pain
[19:13:43] <RoanKattouw>	 There are unique indexes on (ct_rc_id,ct_tag) etc
[19:13:59] <RoanKattouw>	 So migrating from ct_tag to ct_tag_id is going to be pretty annoying
[19:16:10] <RoanKattouw>	 I think what we have to do is not have indexes on (ct_rc_id, ct_tag_id) et al initially, and only introduce them once ct_tag_id is populated
[19:16:24] <RoanKattouw>	 But until then we also have to keep ct_tag populated for the unique index to keep working
[19:16:25] <RoanKattouw>	 sigh
[19:16:36] <RoanKattouw>	 DB migrations are hard
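The ordering RoanKattouw describes (add ct_tag_id without a unique index, keep ct_tag populated, backfill, and only then add the new unique index) can be sketched with Python's built-in sqlite3 module. The column names ct_rc_id, ct_tag and ct_tag_id come from the discussion; the `tag_def` table, its columns, and the index names are hypothetical stand-ins, and the real MediaWiki schema and migration tooling differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Simplified starting point: tags stored as strings, uniqueness on the name.
cur.execute("CREATE TABLE change_tag (ct_rc_id INTEGER, ct_tag TEXT NOT NULL)")
cur.execute("CREATE UNIQUE INDEX ct_rc_tag ON change_tag (ct_rc_id, ct_tag)")
cur.executemany(
    "INSERT INTO change_tag VALUES (?, ?)",
    [(1, "mobile edit"), (1, "visualeditor"), (2, "mobile edit")],
)

# Step 1: add the ID column as nullable, with no index on it yet.
# tag_def is a hypothetical ID -> name mapping table.
cur.execute(
    "CREATE TABLE tag_def (td_id INTEGER PRIMARY KEY, td_name TEXT UNIQUE NOT NULL)"
)
cur.execute("ALTER TABLE change_tag ADD COLUMN ct_tag_id INTEGER")

# Step 2: backfill. ct_tag stays populated, so the old unique index
# keeps protecting against duplicates during the transition.
cur.execute("INSERT INTO tag_def (td_name) SELECT DISTINCT ct_tag FROM change_tag")
cur.execute(
    "UPDATE change_tag SET ct_tag_id = "
    "(SELECT td_id FROM tag_def WHERE td_name = change_tag.ct_tag)"
)

# Step 3: only once every row has an ID, introduce the new unique index.
remaining = cur.execute(
    "SELECT COUNT(*) FROM change_tag WHERE ct_tag_id IS NULL"
).fetchone()[0]
if remaining == 0:
    cur.execute("CREATE UNIQUE INDEX ct_rc_tag_id ON change_tag (ct_rc_id, ct_tag_id)")
conn.commit()
```

On a live database the backfill would run in batches while writers populate both columns, which is the part that makes the migration annoying.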
[19:29:40] <wikibugs>	 (PS5) Ladsgroup: Clean up ThresholdLookup [extensions/ORES] - https://gerrit.wikimedia.org/r/400625 (https://phabricator.wikimedia.org/T181892)
[19:30:22] <wikibugs>	 (CR) Ladsgroup: Clean up ThresholdLookup (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/400625 (https://phabricator.wikimedia.org/T181892) (owner: Ladsgroup)
[19:38:49] <wikibugs>	 (PS1) Ladsgroup: Remove maintenance/CheckModelVersions.php [extensions/ORES] - https://gerrit.wikimedia.org/r/401592 (https://phabricator.wikimedia.org/T183468)
[19:42:19] <wikibugs>	 (CR) Awight: [C: 2] Clean up ThresholdLookup [extensions/ORES] - https://gerrit.wikimedia.org/r/400625 (https://phabricator.wikimedia.org/T181892) (owner: Ladsgroup)
[19:44:47] <wikibugs>	 (Merged) jenkins-bot: Clean up ThresholdLookup [extensions/ORES] - https://gerrit.wikimedia.org/r/400625 (https://phabricator.wikimedia.org/T181892) (owner: Ladsgroup)
[19:46:36] <wikibugs>	 (CR) Awight: [C: 2] Remove maintenance/CheckModelVersions.php [extensions/ORES] - https://gerrit.wikimedia.org/r/401592 (https://phabricator.wikimedia.org/T183468) (owner: Ladsgroup)
[19:49:05] <wikibugs>	 (Merged) jenkins-bot: Remove maintenance/CheckModelVersions.php [extensions/ORES] - https://gerrit.wikimedia.org/r/401592 (https://phabricator.wikimedia.org/T183468) (owner: Ladsgroup)
[19:53:32] <halfak>	 arg.  I somehow got a different roc-auc for true and false in the goodfaith model for fiwiki
[19:53:37] <halfak>	 should not be possible. 
[19:53:40] * halfak looks into it. 
[19:56:16] <halfak>	 WTF >:( 
[19:56:21] <halfak>	 Doesn't happen with any other models. 
[19:56:37] <halfak>	 Maybe it's because of something we did with RandomForest in recent revscoring code. 
[19:56:38] <halfak>	 hmmm
[19:58:03] <halfak>	 Looks like it predicts 100% True all the time. 
[20:08:29] <halfak>	 Ohhh... Seems like we have a rounding error here. 
[20:08:30] <halfak>	 Hmm. 
[20:10:02] <awight>	 Just saw this.  Looks fun!
[20:10:15] <awight>	 Anything I can write tests for?
[20:10:50] <awight>	 I just cleared my plate of “hard” coding tasks
[20:11:39] <awight>	 err, not the best choice of adjective.  I mean to say, the remaining stuff is reading + thinking rather than coding.
[20:13:17] <halfak>	 I'm not sure.  Let me talk it out. 
[20:13:32] <halfak>	 So we set up threshold_ndigits so that we could limit the number of thresholds that we report statistics for. 
[20:13:42] <halfak>	 This is primarily because it was taking up too much space. 
[20:14:13] <halfak>	 By limiting thresholds to ndigits (3 by default) we limit the number of rows of stats substantially. 
[20:14:45] <halfak>	 For most models, this means that we'll have ~500 - 1000 rows of data. 
[20:15:11] <halfak>	 With the fiwiki model, the damaging/not-goodfaith case is so uncommon that the estimator gives a very low likelihood prediction. 
[20:15:31] <halfak>	 This is because we are boosting 20k representative observations with 200k known good observations. 
[20:15:58] <halfak>	 This vanishingly small likelihood estimate has useful increments smaller than 0.001
[20:16:04] <halfak>	 So the rounding breaks the statistics. 
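The rounding problem halfak describes can be reproduced in a few lines of plain Python. This is a sketch with made-up scores, not revscoring code: `distinct_thresholds` is a hypothetical helper, and only the idea of rounding to `ndigits` (3 by default) comes from the discussion.

```python
def distinct_thresholds(scores, ndigits):
    """Thresholds that survive after rounding each score to ndigits places."""
    return sorted({round(s, ndigits) for s in scores})


# Hypothetical predicted probabilities for a very rare positive class:
# most of the useful variation sits below 0.001.
scores = [0.00012, 0.00034, 0.00049, 0.0007, 0.0011, 0.35, 0.92]

coarse = distinct_thresholds(scores, 3)  # the default precision
fine = distinct_thresholds(scores, 5)    # option (2): more digits

print(len(coarse), coarse)
print(len(fine), fine)
```

At three digits the five small scores collapse into just two thresholds (0.0 and 0.001), so statistics computed per threshold can no longer tell those observations apart; at five digits all seven scores stay distinct.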
[20:16:22] <halfak>	 Options I see (1) find a better way to generate a small-ish set of useful thresholds. 
[20:16:36] <halfak>	 (2) round to more digits for this model specifically. 
[20:17:00] <halfak>	 I think that's it.  I can't think of a better option. 
[20:17:02] <halfak>	 awight, ^ 
[20:17:39] <awight>	 IMO (2)
[20:17:42] <awight>	 for now, at least.
[20:18:09] <awight>	 I’m just wondering what that actually looks like in the number of stats rows, though.
[20:18:20] <halfak>	 Oh about 10
[20:18:22] <halfak>	 Maybe 20
[20:18:24] <halfak>	 heh
[20:18:30] <awight>	 Assuming we set ndigits=5, though
[20:18:38] <halfak>	 Ahh yeah.  Let's see. 
[20:18:55] <awight>	 We get a long tail of almost nothing, then 10,000 useless data points at the very edge?
[20:19:04] <awight>	 s/useless/nearly redundant/
[20:19:34] <halfak>	 Not clear to me. 
[20:19:54] <halfak>	 I think that if we want to do (1), we're going to want some information theoretic measure of the usefulness of a threshold. 
[20:20:04] <halfak>	 E.g. how much did the statistics change at this increment. 
[20:20:22] <halfak>	 I like recall specifically for this because it's not very sensitive to frequency of the positive class. 
[20:20:22] <awight>	 Yeah or less sophisticated, a 2d line graph compression algo
[20:20:54] <halfak>	 Arguably a threshold is only interesting if it includes more stuff. 
[20:21:12] <halfak>	 Oh wait.  It's also useful if it excludes stuff while still including the same amount of stuff. 
[20:21:13] <halfak>	 Hmm. 
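One way to read option (1) with recall as the usefulness measure: walk the candidate thresholds from high to low and keep one only when recall has grown by some minimum amount since the last kept threshold. The sketch below is a simple stand-in under that assumption, not an information-theoretic measure and not revscoring's implementation; the function names, `min_delta`, and the data are hypothetical. It deliberately ignores the second case halfak raises (a threshold that excludes negatives at unchanged recall).

```python
def recall_at(threshold, scores, labels):
    """Fraction of positive observations scored at or above the threshold."""
    positives = [s for s, y in zip(scores, labels) if y]
    if not positives:
        return 0.0
    return sum(s >= threshold for s in positives) / len(positives)


def select_thresholds(scores, labels, min_delta=0.05):
    """Keep thresholds where recall rose by >= min_delta since the last kept one."""
    kept = []
    last_recall = None
    # Lowering the threshold can only include more items, so recall never drops.
    for t in sorted(set(scores), reverse=True):
        r = recall_at(t, scores, labels)
        if last_recall is None or r - last_recall >= min_delta:
            kept.append((t, r))
            last_recall = r
    return kept


# Hypothetical scores and true labels (1 = positive class).
scores = [0.9, 0.8, 0.004, 0.003, 0.002, 0.001]
labels = [1, 0, 1, 1, 0, 1]
for threshold, recall in select_thresholds(scores, labels):
    print(threshold, recall)
```

Because selection works on the raw scores, thresholds below 0.001 survive when they matter, and the number of reported rows is bounded by roughly 1 / min_delta instead of by a fixed rounding precision.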
[20:25:02] <awight>	 halfak: There’s something I’m not understanding.  Are we limiting ndigits in the inputs or outputs?
[20:25:18] <awight>	 Cos it seems that limiting just the output precision would give us what we want.
[20:25:18] <halfak>	 Both. 
[20:25:35] <halfak>	 Right.  We generate statistics based on what we publish. 
[20:25:45] <halfak>	 But we could generate statistics and only publish a subset of thresholds. 
[20:25:57] <halfak>	 Still we have the problem that it's hard to pick a useful threshold for fiwiki. 
[20:25:59] <halfak>	 :|
[20:26:01] <awight>	 Arbitrary precision on the input dimension, but limited precision for the output might give us a high-fidelity curve with finite data size.
[20:26:03] <wikibugs>	 (PS1) Ladsgroup: Fully deprecate Cache.php [extensions/ORES] - https://gerrit.wikimedia.org/r/401608 (https://phabricator.wikimedia.org/T181334)
[20:26:06] <awight>	 haha that’s for real though.
[20:28:03] <halfak>	 OK I have code that I think will work.  Re-CV-ifying
[20:30:18] <wikibugs>	 (CR) jerkins-bot: [V: -1] Fully deprecate Cache.php [extensions/ORES] - https://gerrit.wikimedia.org/r/401608 (https://phabricator.wikimedia.org/T181334) (owner: Ladsgroup)
[20:30:19] <Amir1>	 ^ The final patch to deprecate the most horrible class in ORES extension is now up ^_^
[20:30:27] <halfak>	 \o/
[20:35:34] <wikibugs>	 (PS2) Ladsgroup: Fully deprecate Cache.php [extensions/ORES] - https://gerrit.wikimedia.org/r/401608 (https://phabricator.wikimedia.org/T181334)
[20:40:24] <travis-ci>	 wiki-ai/revscoring#1405 (threshold_ndigits_option - 0b4d664 : halfak): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/324277893
[20:43:51] <halfak>	 Okay!  And away we go!
[20:44:17] <wikibugs>	 (CR) Awight: [C: 2] "I didn't even feel the surgery happen!" (2 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/401608 (https://phabricator.wikimedia.org/T181334) (owner: Ladsgroup)
[20:45:57] <wikibugs>	 (Merged) jenkins-bot: Fully deprecate Cache.php [extensions/ORES] - https://gerrit.wikimedia.org/r/401608 (https://phabricator.wikimedia.org/T181334) (owner: Ladsgroup)
[20:47:35] <awight>	 Gotta do grocery things, back in a few hours.
[20:48:25] <halfak>	 Good time for me to take a break too so I'll be AFK for about 30 mins.  Time to pedal!
[20:51:29] <wikibugs>	 (PS1) Ladsgroup: Follow up to I4246706 [extensions/ORES] - https://gerrit.wikimedia.org/r/401611
[20:51:42] <wikibugs>	 (CR) Ladsgroup: "Done in https://gerrit.wikimedia.org/r/401611" (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/401608 (https://phabricator.wikimedia.org/T181334) (owner: Ladsgroup)
[20:57:13] <Amir1>	 I'm calling it a day 
[20:57:14] <Amir1>	 o/
[21:17:06] <halfak>	 o/ 
[21:17:14] <halfak>	 Have a good one, Amir1 :) 
[23:03:40] <halfak>	 OK I'm out of here. 
[23:03:45] <halfak>	 See ya, folks!
[23:24:22] <wikibugs>	 (CR) Petar.petkovic: "This change causes:" [extensions/ORES] - https://gerrit.wikimedia.org/r/401608 (https://phabricator.wikimedia.org/T181334) (owner: Ladsgroup)