[00:19:32] halfak: fyi I’ve gathered the things we did and threw them into rough categories, https://phabricator.wikimedia.org/phame/post/view/77/status_update_october_6_2017/
[00:19:44] I’ll add human language by early next week...
[00:19:54] *have thrown
[00:20:46] (PS1) Sbisson: RCFilters: default highlight according to preference [extensions/ORES] - https://gerrit.wikimedia.org/r/382632 (https://phabricator.wikimedia.org/T172757)
[01:25:54] Scoring-platform-team, MediaWiki-Vagrant, MediaWiki-extensions-ORES: Can't enable ores role in vagrant - https://phabricator.wikimedia.org/T177555#3663220 (Mooeypoo) I manually updated composer afterwards in vagrant-ssh, and then ran `foreachwiki update.php --quick` and added the wg variables for ORE...
[01:44:00] Scoring-platform-team (Current), Collaboration-Team-Triage, ORES, Patch-For-Review: Make RCFilters compatible with both the old and new thresholds APIs - https://phabricator.wikimedia.org/T175053#3663229 (awight) The code is in good shape and ready for another review.
[12:49:11] (CR) Ladsgroup: "I think this helps in the performance of the queries, is there any chance of looking into this?" [extensions/ORES] - https://gerrit.wikimedia.org/r/349235 (https://phabricator.wikimedia.org/T163337) (owner: Ladsgroup)
[12:50:57] (Abandoned) Ladsgroup: Use DISTINCT option on ChangesList select [extensions/ORES] - https://gerrit.wikimedia.org/r/349235 (https://phabricator.wikimedia.org/T163337) (owner: Ladsgroup)
[14:09:29] o/
[14:56:47] Scoring-platform-team (Current), ORES, revscoring, artificial-intelligence: Revscoring 2.0 takes up too much memory - https://phabricator.wikimedia.org/T177544#3664639 (Halfak) Just thought I should check on the draft quality model. Loading just that into memory and got the biggest boost to RES...
[15:03:02] halfak: o/
[15:03:21] The extension code would benefit from eyeballs if you have them to spare.
[15:03:39] will do.
I'm very worried about memory usage today. Will do a bit of both :)
[15:03:50] FYI: https://phabricator.wikimedia.org/T177544
[15:04:05] I’ll stand up for my new thresholds logic, but the fallback stuff is kludgey as hell
[15:04:17] halfak: ah ok. That’s equally important IMO
[15:06:02] 34k per model would be outstanding.
[15:06:09] aren’t those MB, though?
[15:07:07] yeah.
[15:08:55] halfak: Could we store thresholds in a database rather than explicitly in-memory?
[15:09:02] It’s accessed infrequently.
[15:09:28] indexes would be ideal, come to think of it.
[15:09:49] O(1) get
[15:18:22] hey! just got in a meeting done in 40 mins
[15:21:55] It would be really slick if the threshold lookup code tried a database first, and if unconfigured falls back to more granular stats merged into the model file.
[15:37:44] So this hypothetical thresholds state db would be in MySQL.
[15:38:51] How about, it has a basic schema with the indexed columns, then the remainder of statistics for each threshold are a json blob?
[15:54:18] > Differences between the current environment and the environment in which the model was constructed ...
[15:54:22] that’s awesome.
[15:57:31] I love that it’s just a warning, too 8D
[16:00:08] wiki-ai/revscoring#1257 (oneliners - a8092bc : Adam Roses Wight): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/284273726
[16:08:22] halfak: did you add pronunciation statement to ORES item?
[16:08:33] we have a bet here
[16:13:26] Scoring-platform-team (Current), ORES, revscoring, artificial-intelligence: Revscoring 2.0 takes up too much memory - https://phabricator.wikimedia.org/T177544#3662554 (awight) Just playing around, I dumped the thresholds table to json: ``` m = Model.load(open("models/enwiki.damaging.gradient_boo...
[16:14:30] halfak: ^ Am I forgetting some orthogonal dimension, or does m.info['statistics'].label_thresholds have the right order of magnitude size for what we’re looking to store?
[16:19:04] Yeah it seems to have the thresholds for each label. LOL so the hydrated form is 20,000x bigger?
[16:46:58] OMG dog problems
[16:47:15] I have moles in my yard and Isla is trying to get them -- ruining all my hard work in the process
[16:47:34] Amir1, I didn't add the pronunciation statement
[16:47:36] * halfak searches
[16:48:31] Lydia_WMDE: I won
[16:48:35] Wait. is there a wikidata item for ORES?
[16:49:26] awight, not sure what you mean 20,000x bigger
[16:49:56] There are some but not for ORES as far as I checked
[16:50:07] we have for the pages
[16:50:31] Oh gotcha. Does ORES need an item? That would be cool but I'm not sure what it would be used for.
[16:50:46] Definitely it will be super cool
[16:54:08] * halfak has a COI ;0
[16:54:25] halfak: It seems that we can dump the entire info['statistics'].label_thresholds to json in just 3.5MB
[16:55:09] awight, for which model?
[16:55:36] enwiki.damaging
[16:56:26] That's a lot of space still.
[16:56:28] * awight checks draftquality
[16:56:31] And one of the smaller ones
[16:56:37] yeah. draftquality will be a lot bigger, I think
[16:56:40] like 30MB
[16:56:45] or maybe 100MB
[16:57:55] "In [6] used 824.5898 MiB RAM" Just loaded the draftquality model into memory
[16:57:55] Buffer 6 is empty.
[16:57:59] O_O
[16:58:17] hahaha
[17:00:28] True that 3.5MB is a lot, but only 20% of a 16MB serialized model. Actually, is the model storing a json or a serialized ModelInfo of the threshold stats?
[17:00:41] pickle serialized.
[17:01:15] According to my estimate, the draft quality model's model_info formatted as JSON is 129 MB
[17:01:34] Which is about the size of the serialized model.
[17:01:52] I was off by 20k about that 20,000x thing, sorry. For some reason I was thinking your numbers applied to each *row*
[17:02:21] We need to stop storing threshold information for *literally every threshold* it seems.
[17:02:38] As dirty as the database suggestion is, I’m attracted to the fact that we use indexes for exactly what they’re meant to do.
[17:02:57] ^?
[17:03:20] I was floating a crazy idea an hour ago
[17:03:31] Oh I must have missed that while in meeting.
[17:03:36] that the threshold stats could optionally be stored in mysql
[17:03:50] " thresholds state db"
[17:03:51] ?
[17:04:04] yes. max(recall) above precision X is really elegant in a db
[17:04:17] Oh! Yeah. I see what you are saying there.
[17:04:25] and we only need to call it once per day, per stat per model.
[17:04:42] The JSON is 80.5MB when pickled rather than JSON dump'd
[17:05:07] It messes up the whole thing you had going with the self-contained models, but I was saying we could provide a few granular stats in the model.
[17:05:19] And 90.9MB when I just pickle the whole object without JSON formatting.
[17:05:20] as a fallback.
[17:05:54] awight, yeah, that's a bummer. I do really like keeping them together. But we could have parallel outputs I think
[17:06:01] foo.model, foo.model_info
[17:06:18] And we could provide a checksum in foo.model to make sure foo.model_info is right.
[17:07:47] .model can still be loaded as if it’s a single thing, and it is responsible for loading its own model_info either from db or sister file.
[17:08:19] akin to loading its dictionary
[17:08:42] awight, we wouldn't want to load it by default every time we load the model though.
[17:08:47] +1
[17:08:50] that’s the win
[17:08:54] Because celery doesn't need the info and uwsgi doesn't need the model.
[17:09:26] & if the model_info is available via db, we never have to load the detailed info into memory, that’s all done on an external db
[17:10:05] How do you think this wonky scheme compares to just cutting down on granularity...
[17:11:39] awight, short term, cut down on granularity. Long term, consider alternative schemes.
[17:11:51] yup.
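The "thresholds state db" idea above (indexed columns plus a JSON blob, queried for max(recall) above precision X) can be sketched roughly as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for MySQL, and the table name, column names, helper function, and sample rows are all hypothetical; none of them come from revscoring or ORES.

```python
import json
import sqlite3

# Assumption: sqlite3 stands in for MySQL. The table, its columns, and the
# sample rows below are made up for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE threshold_stats (
        model TEXT NOT NULL,
        label TEXT NOT NULL,
        threshold REAL NOT NULL,
        stat_precision REAL NOT NULL,
        stat_recall REAL NOT NULL,
        other_stats TEXT NOT NULL  -- remaining statistics as a JSON blob
    )
    """
)
# Index the columns that the lookup filters and sorts on.
conn.execute(
    "CREATE INDEX idx_threshold_lookup "
    "ON threshold_stats (model, label, stat_precision, stat_recall)"
)

sample_rows = [
    (0.10, 0.30, 0.98, {"filter_rate": 0.20}),
    (0.50, 0.62, 0.80, {"filter_rate": 0.75}),
    (0.80, 0.91, 0.45, {"filter_rate": 0.93}),
    (0.95, 0.99, 0.10, {"filter_rate": 0.99}),
]
conn.executemany(
    "INSERT INTO threshold_stats VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("enwiki.damaging", "true", t, p, r, json.dumps(extra))
        for t, p, r, extra in sample_rows
    ],
)


def max_recall_at_precision(conn, model, label, min_precision):
    """Return the row with the highest recall whose precision >= min_precision."""
    row = conn.execute(
        "SELECT threshold, stat_precision, stat_recall, other_stats "
        "FROM threshold_stats "
        "WHERE model = ? AND label = ? AND stat_precision >= ? "
        "ORDER BY stat_recall DESC LIMIT 1",
        (model, label, min_precision),
    ).fetchone()
    if row is None:
        return None
    threshold, precision, recall, blob = row
    result = {"threshold": threshold, "precision": precision, "recall": recall}
    result.update(json.loads(blob))  # merge the non-indexed statistics
    return result


print(max_recall_at_precision(conn, "enwiki.damaging", "true", 0.9))
```

With a layout like this the detailed statistics never need to be loaded into the scoring process; only the single matching row is fetched per query.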
[17:12:18] OK I have confirmed: A tuple of 20 ints * 200k rows = ~160MB in python memory
[17:12:51] So that says we're doing an OK job of efficiently storing our data.
[17:13:00] In classes that is. It's about the same as a tuple
[17:14:22] Are you prepared to cut granularity by 100-500x?
[17:16:56] Yes. I think so.
[17:17:06] The hard part is identifying the *right* place to cut granularity.
[17:17:18] I need some information theoretic measure of "important" thresholds.
[17:17:29] OR we can just round to 4 decimals.
[17:18:00] That's the difference between 20MB and 129MB
[17:18:01] Scoring-platform-team (Current), ORES, revscoring, artificial-intelligence: Reduce label_thresholds granularity - https://phabricator.wikimedia.org/T177636#3665251 (awight)
[17:18:35] If we go down to 3 decimals, we'll be down to 700MB
[17:19:02] halfak: My initial thought (recorded in that task) is that we could have the segments between each data point stay within a certain distance from the “real” function, and then interpolate when we do the calculations.
[17:19:27] I don’t think it would be too hard to do over multiple functions, assuming that algorithm doesn’t already exist.
[17:20:18] awight, the real problem is having the thresholds reported represent the variance in statistics at threshold optimally.
[17:20:23] Just iterate and keep track of the deviation between the quantized and actual line, then write a point before it exceeds your epsilon
[17:20:37] E.g. we should have a lot of datapoints around high confidence and few around low confidence for uncommon classes.
[17:20:37] halfak: I think my proposal solves that.
[17:21:06] awight, OK yeah. I like that idea.
Let's file it though and do something dumb :)
[17:21:08] we carefully map any curves, and straight lines are just kept within a tolerance
[17:21:12] lolol
[17:21:13] agreed
[17:22:03] I think we can set the default threshold digits for sklearn probability classifiers to 4 digits, retrain the models, and be done pretty soon.
[17:22:37] \o/
[17:23:09] I'll get a PR together. We'll need to retrain the models overnight. I think it'll go smooth because I just cleaned up the Makefile.
[17:23:30] So this was only revscoring 2.0, true?
[17:23:44] Somehow we have 32 million thresholds in draftquality "OK"
[17:23:46] WTF
[17:23:48] right
[17:23:54] LOL
[17:24:04] Only information about a small set of thresholds in revscoring 1.3
[17:24:09] like 5-10 thresholds.
[17:24:18] gotcha
[17:24:38] We’re about to be blocked on file handles again.
[17:24:40] Oh woops.
[17:24:49] 32million chars in draftquality "OK
[17:24:56] ah
[17:25:05] * halfak re-checks rows
[17:25:26] 152k for no rounding
[17:25:40] 9131 for rounding at 4 digits
[17:25:51] 994 for rounding at 3 digits
[17:26:10] I wonder if we should round at 3.
[17:26:13] Hmm.
[17:26:42] The difference between 3.7 MB and 470K
[17:26:55] I like 3 digits.
[17:27:04] * halfak works.
[17:28:10] +1 the --
[17:28:32] * halfak races to get this together so he can go eat lunch
[17:40:00] I’ll be around to CR any time.
[17:47:38] awight, https://github.com/wiki-ai/revscoring/pull/365
[17:49:13] looking
[17:50:13] wiki-ai/revscoring#1258 (memory_usage - 09766f4 : halfak): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/284319896
[17:50:56] Scoring-platform-team (Current), ORES, revscoring, artificial-intelligence: Revscoring 2.0 takes up too much memory - https://phabricator.wikimedia.org/T177544#3665375 (Halfak) When formatting json, the thresholds are rounded and limited. In this case, the default is 4 decimal places. You can...
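The tolerance-based idea floated at 17:19 to 17:21 (iterate, track the deviation from the real curve, and write a point before it exceeds epsilon) was filed for later rather than implemented. A rough sketch of what it might look like is below; it assumes each statistic is a list of (threshold, value) points with strictly increasing thresholds, and the function names are hypothetical, not part of revscoring.

```python
def _within_tolerance(points, start, end, epsilon):
    """True if every point strictly between start and end lies within epsilon
    of the straight line joining points[start] and points[end]."""
    x0, y0 = points[start]
    x1, y1 = points[end]
    for x, y in points[start + 1:end]:
        interpolated = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        if abs(y - interpolated) > epsilon:
            return False
    return True


def thin_curve(points, epsilon):
    """Greedily drop points that linear interpolation can recover to within
    epsilon. Assumes x values are strictly increasing."""
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    anchor = 0  # index of the most recently kept point
    for candidate in range(2, len(points)):
        if not _within_tolerance(points, anchor, candidate, epsilon):
            # Stretching the segment to `candidate` would exceed epsilon,
            # so write the previous point and restart from there.
            anchor = candidate - 1
            kept.append(points[anchor])
    kept.append(points[-1])
    return kept


# A flat stretch followed by a steady climb: only the corner point survives.
curve = [(0, 0.0), (1, 0.0), (2, 0.0), (3, 1.0), (4, 2.0)]
print(thin_curve(curve, epsilon=0.01))
```

Curved regions keep many points while straight stretches collapse to their endpoints, which is the behaviour described in the chat. The check is quadratic in the worst case, which is acceptable for a sketch but would want tightening for curves with hundreds of thousands of rows.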
[17:51:06] Scoring-platform-team (Current), ORES, revscoring, artificial-intelligence: Revscoring 2.0 takes up too much memory - https://phabricator.wikimedia.org/T177544#3665376 (Halfak) https://github.com/wiki-ai/revscoring/pull/365
[17:53:19] Running to lunch/next meeting
[17:53:22] Back in 30-45
[17:53:37] halfak: util.round doesn’t eliminate any values from the list. does the grouping do that?
[17:53:41] k see you
[17:55:23] ok confirmed that itertools.groupby does exactly that.
[17:59:54] * awight scratches head trying to figure out how to regenerate just the stats on a model
[18:05:41] Something about > self.info['statistics'].fit(score_labels)
[18:07:49] revscoring test_model
[18:17:47] revscoring test_model models/enwiki.damaging.gradient_boosting.model damaging --model-file=models/enwiki.da
[18:17:48] maging.gradient_boosting-round3.model --observations=datasets/enwiki.labeled_revisions.w_cache.20k_2015.json
[18:22:24] halfak: oops, something I didn’t catch before merging.
[18:22:25] File "/media/sf_work/revscoring/revscoring/scoring/statistics/classification/classification.py", line 112, in fit
[18:22:26] threshold_ndigits=self.threshold_ndigits,
[18:22:27] AttributeError: 'Classification' object has no attribute 'threshold_ndigits'
[18:22:35] Probably simple...
[18:24:12] Weird. Maybe it’s initializing .info with the serialized statistics
[18:30:47] meh I’ll just retrain with 100 observations
[18:34:21] 1000.
[18:34:23] love the makefile
[18:35:22] fwiw, I got about 370 thresholds, and they’re all unique to 3 decimals.
[18:35:39] I’ll dial that down to 2 decimals just to feel like I’ve done my smoke test due diligence.
[18:37:30] Success. thresholds are rounded to 2 places now, and there are c. 130 now.
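The round-then-group step confirmed at 17:55 can be illustrated with a small standalone sketch. It is not the code from the revscoring pull request: the function name and the sample data are hypothetical, and it simply shows how rounding followed by `itertools.groupby` collapses neighbouring thresholds into one row each.

```python
from itertools import groupby


def round_and_dedupe(threshold_stats, ndigits=3):
    """Round each threshold and keep the first row for every rounded value.

    `threshold_stats` is an iterable of (threshold, stats) pairs sorted by
    threshold. groupby only merges consecutive equal keys, so the sort order
    matters.
    """
    rounded = ((round(threshold, ndigits), stats) for threshold, stats in threshold_stats)
    return [next(group) for _, group in groupby(rounded, key=lambda pair: pair[0])]


# Made-up rows: six raw thresholds, four distinct values at 3 decimals.
rows = [
    (0.12341, {"recall": 0.99}),
    (0.12349, {"recall": 0.99}),
    (0.1236, {"recall": 0.98}),
    (0.5001, {"recall": 0.70}),
    (0.5004, {"recall": 0.70}),
    (0.9, {"recall": 0.20}),
]
reduced = round_and_dedupe(rows, ndigits=3)
print(len(rows), "->", len(reduced))
```

Rounding alone leaves the list the same length; it is the grouping that removes rows, which matches the row counts reported above (152k unrounded, 9131 at 4 digits, 994 at 3 digits).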
[18:39:54] Scoring-platform-team: revscoring model_info display should include target prediction value - https://phabricator.wikimedia.org/T177649#3665605 (awight)
[18:40:10] Scoring-platform-team: revscoring model_info display should include target prediction value - https://phabricator.wikimedia.org/T177649#3665617 (awight) p:Triage>Lowest
[18:41:54] halfak: Helpful if I deploy that to labs?
[18:42:15] Or should that wait until you retrain a few models to test those?
[19:10:14] awight, need to retrain
[19:20:52] awight, https://gerrit.wikimedia.org/r/382765
[19:33:39] * halfak begins the process of rebuilding the files on our new big beefy stats machine.
[19:35:21] Scoring-platform-team (Current), ORES, revscoring, artificial-intelligence: Revscoring 2.0 takes up too much memory - https://phabricator.wikimedia.org/T177544#3665826 (Halfak) OK Just released revscoring 2.0.8. Now I'm going to rebuild all of the models -- starting with the big set of editquali...
[19:37:46] halfak: for that you just “rm models/*” and make?
[19:37:57] thinking about how -j should work...
[19:38:31] Yup.
[19:38:51] I should have pushed to packagist…
[19:39:02] thanks for deploying!
[19:39:18] so, the file handles…
[19:39:22] Looks like stats machine is broken. Moving to ores-misc-01
[19:39:26] oof
[19:40:28] I’ve read that the null handles come from files that are opened and later deleted.
[19:41:01] No idea what that would be.
[19:41:09] I can look at that now though. Lotsa waiting ahead.
[19:41:17] Oh! I should review your extension work.
[19:41:18] ty
[19:41:22] Maybe Amir1 can get that in beta tomorrow.
[19:42:40] Just realized that the celery worker recovers thanks to a kick from puppet.
[19:45:14] What do you think about a “pending deployment” column?
[19:46:33] Scoring-platform-team (Current), ORES, Operations, Patch-For-Review, and 2 others: Stress/capacity test new ores* cluster - https://phabricator.wikimedia.org/T169246#3665899 (awight)
[19:46:36] Scoring-platform-team (Current), ORES, Operations, Patch-For-Review: Give ores admins read access to /srv/log/ores/main.log* - https://phabricator.wikimedia.org/T175736#3601826 (awight) Open>Resolved
[19:49:52] awight, seems like a good idea to me
[19:50:10] We can try for a while, at least.
[19:50:17] awight, not sure what I can do in this review. I can't comment on any good practices with PHP/MW
[19:50:20] It all looks bad to me :)
[19:51:07] I like that you have a function for the explicit formula conversion
[19:51:26] lol
[19:51:43] It's hacky and contained
[19:51:49] Well, I can walk you through the fallback if you want. It’s not wholesome.
[19:52:22] The scariest maneuver is when I put garbage into the new-thresholds cache as a reminder to not try fetching for another minute.
[19:54:04] Pretty sure the ripcords are easy to find when we dump old-thresholds support
[19:54:39] Is that the empty array you put in memcached?
[19:54:42] yup
[19:55:37] I might actually try a git headstand for fun, to split this and make the fallback revertable.
[19:56:37] I'm a little unnerved to see "damaging" in a plane string and not "goodfaith"
[19:57:57] plain
[19:58:57] Is that just to check if we're v1 or v2?
[19:59:07] *1.3 or 2.0?
[20:01:34] Serialized model files are smaller by about 66%
[20:01:39] with revscoring 2.0.8
[20:05:41] Ok other than that one Q I think I'm ready to +1 the extension work
[20:05:53] I'm going to look at file descriptors and redis.
[20:07:11] I'm doing so much engineering work today. This is awesome!
[20:13:37] (PS18) Awight: Support new thresholds API [extensions/ORES] - https://gerrit.wikimedia.org/r/380893 (https://phabricator.wikimedia.org/T175053)
[20:13:40] (PS1) Awight: Fallback to old thresholds API as necessary [extensions/ORES] - https://gerrit.wikimedia.org/r/382778 (https://phabricator.wikimedia.org/T175053)
[20:14:59] (CR) jerkins-bot: [V: -1] Support new thresholds API [extensions/ORES] - https://gerrit.wikimedia.org/r/380893 (https://phabricator.wikimedia.org/T175053) (owner: Awight)
[20:15:01] (CR) jerkins-bot: [V: -1] Fallback to old thresholds API as necessary [extensions/ORES] - https://gerrit.wikimedia.org/r/382778 (https://phabricator.wikimedia.org/T175053) (owner: Awight)
[20:15:20] awight, do you know if celery workers are using a lot of file handles?
[20:15:38] And I'm wondering how I could check file handles used on my own machine.
[20:32:38] halfak: awight: I went back and spammed the contact list with the link to sign up for the JADE feedback group themselves. Strategy worked, picked up a bunch more. I'll follow up on Monday or so.
[20:37:25] o/
[20:37:44] Keegan: link? and what would the group do exactly?
[20:38:56] Zppix: Not like a formal thing. It's just a mass message list to send a notice when there's something needing feedback. As you follow in this channel, you'll likely already know about whatever is being sent out. But you're welcome to sign up.
[20:38:58] https://meta.wikimedia.org/wiki/Global_message_delivery/Targets/JADE
[20:39:21] Every so often a "Hi, this thing about JADE needs some attention "
[20:39:35] This way the team knows they're not just communicating into the dark
[20:40:01] Ah i dont need any more mass message messages (say that 10x fast) im in here everytime im on irc and i get notifs from the scoring platform team project on phab
[20:40:58] right
[20:42:25] Im kinda on pause for dev with ores and such until I find time to setup my ubuntu vm with vagrant
[20:57:39] o/ was in meeting.
[20:57:42] reading scrollback
[20:57:49] Nice, Keegan :)
[20:58:00] halfak: do you ever not have a meeting lol
[20:58:00] I responded to Baba Tabita on the talk page.
[20:58:07] GOOD QUESTION
[20:58:10] Seriously though
[20:58:43] * Zppix hands halfak a nice cold pint
[21:00:10] :D
[21:00:22] I could use it. Almost to EOD on Friday! Wooo
[21:00:42] end of day?
[21:00:45] (eod)?
[21:00:55] OK so I've confirmed that the redis connection in celery workers is *not* accounting for the bunch of file handles.
[21:00:56] yeah
[21:01:00] eod = end of day
[21:09:20] halfak: legoktm just mentioned a huge possible legal issue on the post from Keegan on wikitech-l
[21:09:46] Oh?
[21:10:11] What legal issue?
[21:13:54] halfak: https://lists.wikimedia.org/pipermail/wikitech-l/2017-October/088975.html
[21:14:21] Oh they can go to hell
[21:14:23] :)
[21:14:44] halfak: you should be a lawyer :P
[21:15:02] we're going to sue you, "well you can go to hell" :P
[21:15:14] instant case drop right there
[21:15:16] Case closed
[21:17:18] I don't think it's a huge problem
[21:17:22] it was more of a "fyi"
[21:17:36] thanks legoktm
[21:17:58] Will keep this in mind. Might talk to legal about it.
[21:18:14] halfak: it sounds like you had the legal thing figured out though xD
[21:18:48] I'll just confirm the complete effectiveness of the "go to hell" defense.
[21:20:30] I think a trademark on term "jade" is pretty silly and it seemed like the nodejs person didn't want to fight it, but we have a pretty solid legal team
[21:21:55] halfak: the visit to legal is just a formality, they just need to enter into the paperwork halfak says "go to hell"
[21:25:27] That's where all paperwork goes eventually.
[21:25:37] OK. I give up on the filehandle stuff. But let me first record my notes.
[21:27:23] Scoring-platform-team, ORES: Clean up file handle and Redis connection management in ORES worker and celery processes - https://phabricator.wikimedia.org/T177036#3645111 (Halfak) OK so I checked on this and it looks there's no effect at all on the file-handle count by dropping the connection to redis in...
[21:35:57] halfak: 66% savings, act today to pre-order your model!
[21:36:19] :D
[21:36:22] Your point wrt. “damaging” is important, yeah
[21:36:33] We need to make an API call that doesn’t assume either damaging or reverted.
[21:36:39] Or we have to hardcode enwiki.
[21:37:18] Make it to the v3 version of the API
[21:37:37] /v3/scores/enwiki/?model_info
[21:38:21] awight: hardcode enwiki... ill make sure to file that for halfak into the your crazy :P
[21:39:00] oh... well that'd be the wikiId
[21:39:00] lol @ owning the english word for a type of rock
[21:39:06] awight, right
[21:39:13] they should go to hell
[21:39:17] :)
[21:39:31] I’m already using the v3 route
[21:40:12] halfak: I wonder if hell is trademarked?
[21:40:20] is that enough to tell me that revscoring 2.0 is available?
[21:40:35] Zppix: lol it most certainly is
[21:40:40] awight, yes. Output format will be 2.0-ish
[21:41:03] aha. So {wikiId} then
[21:41:04] great
[21:41:34] https://ores.wikimedia.org/versions should be machine readable
[21:41:54] Also, it's bad that it's kind of hard-coded for which libraries matter.
[21:42:26] (CR) Awight: [C: -1] "TODO: hit /v3/scores/{wikiId}/?model_info to test API compatibility, rather than assuming the "damaging" model is present." [extensions/ORES] - https://gerrit.wikimedia.org/r/382778 (https://phabricator.wikimedia.org/T175053) (owner: Awight)
[21:42:54] halfak: whoa, neat!
[21:44:20] Don’t forget to have an EOD
[21:45:38] Yeah. Just about to do that :)
[22:21:27] (CR) Catrope: [C: 2] WLFilters: Temporarily stop respecting hideNonDamaging on WL with beta feature [extensions/ORES] - https://gerrit.wikimedia.org/r/382627 (owner: Sbisson)
[22:29:15] (Merged) jenkins-bot: WLFilters: Temporarily stop respecting hideNonDamaging on WL with beta feature [extensions/ORES] - https://gerrit.wikimedia.org/r/382627 (owner: Sbisson)