[13:49:46] <halfak>	 o/
[13:53:38] <halfak>	 feeling better today
[14:04:21] <paladox>	 halfak i think i'ved figured out how awight can use lfs now
[14:04:26] <paladox>	 without needing a password
[14:04:30] <halfak>	 Oh? 
[14:05:13] <paladox>	 halfak see https://phabricator.wikimedia.org/T180627#4064693
[14:05:18] <paladox>	 i tested the steps my self
[14:14:42] * halfak clicks
[14:27:15] <paladox>	 awight hi
[14:27:28] <paladox>	 awight i figured out how to get lfs working for you, see  https://phabricator.wikimedia.org/T180627#4064693
[14:27:29] <awight>	 paladox: Thanks for the suggestion on the git-lfs project
[14:27:34] <awight>	 yes, cool!
[14:27:36] <paladox>	 :)
[14:28:56] <awight>	 paladox: What about this step, https://wikitech.wikimedia.org/wiki/Git-lfs#Setting_a_http_password_in_gerrit
[14:29:04] <paladox>	 awight not needed
[14:29:20] <paladox>	 awight i figured out how to do it without http pasword
[14:29:21] <awight>	 ooh that would be great.
[14:29:35] <paladox>	 awight you doin't even need to set the lfs url
[14:29:36] <awight>	 Can you show me git remote -v?
[14:29:59] <paladox>	 origin	https://gerrit.wikimedia.org/r/test/gerrit-ping (fetch)
[14:29:59] <paladox>	 origin	https://gerrit.wikimedia.org/r/test/gerrit-ping (push)
[14:30:01] <paladox>	 awight ^^
[14:30:19] <paladox>	 awight you git lfs install
[14:30:24] <paladox>	 then it should work
[14:30:28] <awight>	 oh try a git lfs-enabled submodule
[14:30:44] <paladox>	 awight hmm, where?
[14:31:17] <awight>	 You could use that same test repo, and add scoring/ores/assets as a submodule.
[14:31:29] <paladox>	 yep
[14:32:05] <paladox>	 awight i guess ores/deploy?
[14:32:47] <awight>	 That's the production repo, but you don't need all the other submodules.  It would be enough to add scoring/ores/assets under your test repo.
[14:33:36] <paladox>	 ok
[14:37:15] <wikibugs>	 10Scoring-platform-team (Current), 10Scap, 10Patch-For-Review: Support git-lfs - https://phabricator.wikimedia.org/T180627#4064880 (10awight)
[14:38:50] * halfak claws eyes out
[14:39:16] <halfak>	 Turns out I missed a stack of annual plan deadlines because they were announced after hours on Friday last week. 
[14:39:17] <halfak>	 WTF
[14:39:45] <halfak>	 I don't think it'll affect us though. 
[14:48:37] <awight>	 That... should be against the rules.
[14:49:42] <halfak>	 Agred. 
[14:49:44] <halfak>	 *e
[14:54:35] <awight>	 halfak: Are you working on those today, or have some time for CR?  I'm out on a limb with this, https://github.com/wiki-ai/editquality/pull/144
[14:54:53] <halfak>	 awight, I left notes
[14:55:02] <halfak>	 I don't like the direction you are taking autolabel
[14:55:03] <awight>	 thanks!
[14:55:15] <halfak>	 And it seems like the new output is a little confusing. 
[14:55:17] <awight>	 Splitting the output files?
[14:55:26] <awight>	 we could do that in explicit rules, too
[14:55:31] <halfak>	 awight, right.  I don't think that's a good job for autolabel. 
[14:56:01] <awight>	 It sort of seems appropriate, since it's categorizing, but I can see that it isn't how any other utilities work in ORES
[14:56:06] <halfak>	 In the most common case, it's not necessary. 
[14:56:20] <awight>	 Really?  cos we only use labeled?
[14:56:27] <wikibugs>	 10Scoring-platform-team (Current), 10Scap, 10Patch-For-Review: Support git-lfs - https://phabricator.wikimedia.org/T180627#4064926 (10Paladox) It works with submodules.
[14:56:33] <awight>	 wait wat
[14:56:36] <halfak>	 No.  Cause we never have to work with "needs_review" separately. 
[14:56:51] <halfak>	 Loading "needs_review" into wikilabels does not require the production of that specific file. 
[14:57:20] <halfak>	 I usually do: cat <file> | grep '"needs_review": true' | ./utility task_inserts ...
[14:57:23] * awight squints at https://docs.google.com/drawings/d/1KdDdwKCIRLuveb2vIn6dicgl8vEvZId5WbNcufil9g0/edit
[14:57:26] <wikibugs>	 10Scoring-platform-team (Current), 10Operations, 10Patch-For-Review, 10Release-Engineering-Team (Watching / External), 10Wikimedia-Incident: [Blocked] Cache ORES virtualenv within versioned source - https://phabricator.wikimedia.org/T181071#4064932 (10fgiunchedi)
[14:57:38] <awight>	 It looks like every workflow requires a split on needs_review 
[14:57:44] <halfak>	 We on;y need to store the "needs_review" file in the case that we manually perform the split with shuffling.  
[14:57:48] <halfak>	 nope. 
[14:58:15] <halfak>	 because every time we shuffle/sample, we generate a non-deterministic sample. 
[14:58:23] <halfak>	 And we need to store it so that it is replicable. 
[14:58:37] <halfak>	 you can't run the pipeline from the beginning when you shuf 
[15:02:10] <awight>	 halfak: I'm fine with this change.  Is there anything else I should do to that PR?
[15:02:23] <awight>	 you'll like this one:
[15:05:30] <halfak>	 awight, the generated makefile looks like a mess. 
[15:05:38] <halfak>	 Seems like there are other changes to make but it's hard to say. 
[15:07:06] <halfak>	 awight, seems like main_template should be the template itself rather than a file to look-up.  That'd make it easier to write a test for.
[15:07:26] <halfak>	 You can then attempt to load the template in the main() function -- which is a far nicer place to throw an error like FileNotFound. 
[15:08:51] <halfak>	 You can have a nice function in codegen called "read_template" or something. 
[15:08:58] <halfak>	 that uses default paths or whatever. 
[15:09:10] <halfak>	 Maybe the default path is built into generate_make
[15:10:45] <wikibugs>	 10Scoring-platform-team (Current), 10Scap, 10Patch-For-Review: Support git-lfs - https://phabricator.wikimedia.org/T180627#4064976 (10mmodell) `git lfs install` sets up the following in the current user's gitconfig:  ``` [filter "lfs"]     clean = git-lfs clean -- %f     smudge = git-lfs smudge -- %f     pro...
[15:15:28] <awight>	 halfak: I replaced the multiple outputs thing with explicit rules.  Good idea, this looks better to me.
[15:17:30] <halfak>	 :)  I'm glad.
[15:17:58] <halfak>	 BTW, I'm working on tgr's PR and I'm wondering what happened to the "intersected_samples" thing we discussed. 
[15:18:12] <halfak>	 I know we merged the revscoring PR, but did that get implemented in the template?
[15:18:19] <halfak>	 Or is that maybe part of your current work?
[15:18:37] <travis-ci>	 wiki-ai/editquality#205 (simplify_template - efacd90 : Adam Wight): The build has errored. https://travis-ci.org/wiki-ai/editquality/builds/355901519
[15:18:43] <halfak>	 awight, ^ 
[15:19:52] <awight>	 halfak: I haven't touched intersected_samples, I wanted to ask more questions to be sure I understand.
[15:20:17] <halfak>	 I want to put a quick gist together of how I'd like to see the config-per-wiki change. 
[15:20:24] <halfak>	 Gimme a sec.  Want to run it past you. 
[15:20:35] <awight>	 Great!  I'm eager to see that cleaned up.
[15:22:16] <travis-ci>	 wiki-ai/editquality#206 (simplify_template - eac3fcd : Adam Wight): The build has errored. https://travis-ci.org/wiki-ai/editquality/builds/355903178
[15:23:24] <wikibugs>	 10Scoring-platform-team (Current), 10Scap, 10Patch-For-Review: Support git-lfs - https://phabricator.wikimedia.org/T180627#4065029 (10Paladox) @mmodell possibly want to get scap to run git lfs install so that anyone installing it (without using the puppet class) will have lfs working for them.
[15:23:52] <travis-ci>	 wiki-ai/editquality#203 (isolate_codegen - aaccff4 : Adam Wight): The build passed. https://travis-ci.org/wiki-ai/editquality/builds/355895741
[15:24:17] <travis-ci>	 wiki-ai/editquality#205 (simplify_template - efacd90 : Adam Wight): The build has errored. https://travis-ci.org/wiki-ai/editquality/builds/355901519
[15:26:00] <awight>	 ^ those are external failures, Travis is having network problems
[15:30:18] <travis-ci>	 wiki-ai/editquality#205 (simplify_template - efacd90 : Adam Wight): The build has errored. https://travis-ci.org/wiki-ai/editquality/builds/355901519
[15:35:59] <travis-ci>	 wiki-ai/editquality#205 (simplify_template - efacd90 : Adam Wight): The build has errored. https://travis-ci.org/wiki-ai/editquality/builds/355901519
[15:37:16] <awight>	 lol an AWS error.  https://status.aws.amazon.com
[15:38:46] <halfak>	 https://gist.github.com/halfak/f00ea4efb2b158fe3d1924b3dfb10d58
[15:38:49] <halfak>	 awight, ^ 
[15:38:57] <halfak>	 Note the use of lists to preserve ordering. 
[15:39:21] <halfak>	 Also the appropriation of the full dataset name in fields to give us flexibility. 
[15:39:49] <awight>	 halfak: What this supposed to be a list of hashes?  https://gist.github.com/halfak/f00ea4efb2b158fe3d1924b3dfb10d58#file-huwiki-yaml-L33
[15:39:56] <awight>	 *Was this
[15:40:16] <awight>	 Why is line 38 different?
[15:40:34] <halfak>	 typo
[15:40:35] <halfak>	 fixed
[15:40:41] <halfak>	 list of strings. 
[15:41:05] <halfak>	 There's no meaning in the name/value pair in the hashes.  They just get mashed together. 
[15:41:28] <awight>	 Splitting out the samples into sections is interesting
[15:43:32] <halfak>	 Just removed another unnecessary list-hash on line 28
[15:44:03] <awight>	 Got it.  Your point is made, though!
[15:44:12] <awight>	 This is a lot like a simplified makefile
[15:45:02] <halfak>	 Right :)  that's my thoughts.  I want flexibility to deal with new cases, but also to minimize the boilerplate. 
[15:45:25] <awight>	 Very cool that you can change all the wiring in this format, since samples are all named.
[15:45:48] <halfak>	 :)  
[15:46:13] <halfak>	 I wonder if we could take the sections and put them in files that Makefile.j2 imports
[15:46:24] <halfak>	 So that it's easier to edit in isolation
[15:46:41] <awight>	 The one thing I don't like is that it loses some of the benefit of having standard workflows.
[15:49:09] <halfak>	 Oh? 
[15:49:28] <halfak>	 To me, I see "intersected_samples" as a standard workflow component
[15:49:38] <halfak>	 Or "merged_samples", etc. 
[15:52:54] <awight>	 Well, the wiring is the same in c. 90% of cases, so the codegen should be able to guess all the rules for e.g. "balanced set" given a boolean flag and the quarry ID
[15:54:10] <halfak>	 The boolean flag takes up nearly the same amount of config space as specifying the file name. 
[15:54:20] <halfak>	 Also, what happens when we run a new sample that is unbalanced?
[15:54:32] <halfak>	 As we will soon for wikis with super-old samples. 
[15:54:33] <awight>	 I was thinking about using partials, so we can do basically implement your suggestions, but with a level of abstraction
[15:55:02] <awight>	 We could reuse the partial for similar cases, or write from scratch if the wiring is unlike other cases.
[16:01:14] <wikibugs>	 10Scoring-platform-team (Current), 10Scap, 10Patch-For-Review: Support git-lfs - https://phabricator.wikimedia.org/T180627#4065325 (10demon) Not using git-review will eliminate a few points of confusion I think :)
[16:03:03] <wikibugs>	 10Scoring-platform-team (Current), 10ORES, 10Operations: Reboot oresrdb - https://phabricator.wikimedia.org/T189781#4065332 (10Halfak) @akosiaris said he'd write up the incident report.
[16:26:18] <awight>	 Meanwhile, halfak I'm still missing why huwiki requires the "intersected" step.  The intersection returns the same observations that were in human_labeled, but merged on top of autolabeled fields for those observations.  But this seems redundant with the merge?
[16:26:46] <halfak>	 awight, the data in wikilabels does not have "needs_review" so we don't know how to merge
[16:27:00] <halfak>	 See http://labels.wmflabs.org/campaigns/huwiki/33/?tasks
[16:30:11] <awight>	 merge_labels never looks for needs_review fields in human_labeled data.
[16:30:21] <halfak>	 awight, but our merge process does. 
[16:31:44] <awight>	 Is this unique to huwiki, or is there an existing wiki I can look at as an example of this pattern?
[16:32:10] <awight>	 I'm looking at cswiki, which also uses a balanced sample for review, but don't see anything about human_labeled["autolabeled"]["needs_review"]
[16:32:11] <halfak>	 Not sure where else this happens. 
[16:32:33] <halfak>	 We just worked on this pattern
[16:32:41] <halfak>	 The merge_labels
[16:32:54] <awight>	 Yes I'm worried that I might have missed a requirement
[16:33:25] <halfak>	 https://github.com/wiki-ai/editquality/blob/master/Makefile#L496
[16:33:27] <halfak>	 awight, ^ 
[16:33:38] <halfak>	 See how we filter human_labels based on "needs_review"?
[16:33:51] <awight>	 O_o I missed that somehow
[16:33:53] <awight>	 yes
[16:34:17] <awight>	 ah
[16:34:21] <halfak>	 https://github.com/wiki-ai/editquality/pull/144/files#diff-0bf008681d5a4a89a7eaa7cb88dc0b61R85
[16:34:25] <awight>	 note how that's gone on my simplify_template branch.
[16:34:30] <awight>	 https://github.com/wiki-ai/editquality/commit/3646d18790ba321a3d13d850a0d8e864ee33e6ea
[16:34:44] <awight>	 The solution was to improve the merge_labels logic.
[16:34:56] <halfak>	 Right.  But we need that field
[16:35:07] <awight>	 We get it by merging with autolabeled
[16:35:23] <halfak>	 Yes.  This isn't merged yet. 
[16:35:29] <awight>	 at merge time, we can use autolabeled.needs_review
[16:35:52] <awight>	 What I'm thinking is that with improved merge_labels, we don't need this intersection thing.
[16:35:52] <halfak>	 Sure. 
[16:52:36] <awight>	 halfak: PR #144 is ready for re-review, I ran the Travis builds once AWS settled down
[16:55:02] <halfak>	 awight, so this doesn't just simplify.  It changes the logic. 
[16:55:14] <halfak>	 also a mean([]) errors out. 
[16:55:23] <halfak>	 That doesn't seem like it gets caught here
[16:55:25] <awight>	 thought I fixed that... can do
[16:55:29] <halfak>	 What if all labels are null?
[16:55:48] <awight>	 if all labels are null, we should drop the observation
[16:55:58] <halfak>	 OK.  Well currently, it errors out. 
[16:56:11] <halfak>	 The old logic assumed !damaging and goodfaith in the case of a null label. 
[16:59:11] <awight>	 What commit are you at?
[16:59:20] <awight>	 I protected the mean with,
[16:59:24] <awight>	 https://www.irccloud.com/pastebin/4ToE6bOA/
[16:59:41] * halfak refreshes
[16:59:50] <halfak>	 https://github.com/wiki-ai/editquality/pull/144/files#diff-c818e8b09494901a9948ec961003c293R102
[17:00:23] <awight>	 And what's the stack trace?  I don't understand how label_lists['damaging'] can ever be []
[17:01:13] <halfak>	 If all of the values of null?
[17:01:53] <halfak>	         label_lists = defaultdict(list)
[17:02:08] <halfak>	 if no append is made, then label_lists['damaging'] returns an []
[17:02:22] <awight>	 running to lunch, back in an hour.
[17:05:12] <halfak>	 OK me too 
[18:11:29] <awight>	 back
[18:38:10] <awight>	 I just checked in a REPL and defaultdict will return False for ('damaging' in label_lists)
[18:43:09] <halfak>	 defaultdict(list)?
[18:43:15] <halfak>	 That would be inane
[18:43:31] <halfak>	 awight, ^ 
[18:43:49] <awight>	 yes
[18:44:04] * awight assembled the paste
[18:44:48] <awight>	 https://phabricator.wikimedia.org/P6866
[18:45:01] <Vermont>	 halfak: https://meta.wikimedia.org/wiki/Proposals_for_closing_projects/Closure_of_Simple_English_Wikipedia_(3)
[18:45:07] <Vermont>	 lawl
[18:45:11] <halfak>	 awight, https://gist.github.com/halfak/6b7c1d304f58f12119f4d711ad30446a
[18:45:50] <awight>	 I don't like how fragile it is, but the idea is that d['foo'] is never called unless we're appending
[18:45:58] <halfak>	 awight, I was talking about calculating mean()
[18:46:00] <awight>	 point being, did you actually get a stack trace?
[18:46:05] <halfak>	 See my paste
[18:46:16] <awight>	 mean is protected by a conditional, which I still believe works.
[18:46:32] <halfak>	 ahh
[18:46:58] <halfak>	 Vermont, lol
[18:47:16] <halfak>	 awight, got it
[18:47:55] <awight>	 Vermont: very nice.  We should rename enwiki "Not-so-simple English Wikipedia"
[18:48:16] <Vermont>	 let’s do it
[18:48:45] <awight>	 It should go over just fine, since there's no requirement to give the wiki a simple title
[18:49:45] <awight>	 halfak: Since the merge_labels change will actually affect the data, should I do a before-after test on one of the models?
[18:55:58] <halfak>	 Yeah.  That's a good idea. 
[19:04:35] <awight>	 revscoring model_info <model_file> is the output we want, right?
[19:10:25] <awight>	 halfak: Very minor question.  What do you think about dumping model_info to text files as part of the Makefile?
[19:10:48] <awight>	 It would make it easier to see changes over time.
[19:23:27] <halfak>	 awight, we could do that.  The cv_train script writes model_info to stdout when it finishes
[19:23:37] <halfak>	 Oh wait.  no it writes it to stderr
[19:23:43] <halfak>	 :)
[19:23:47] <halfak>	 model goes to stdout
[19:23:53] <awight>	 probably fine to make it a separate rule
[19:23:58] <halfak>	 I'm OK with dumping model_info to a file too
[19:24:06] <awight>	 kk I'll wedge it in
[19:24:07] <halfak>	 Maybe you'd want to jump it in json format
[19:24:11] <halfak>	 not sure
[19:24:13] <awight>	 good idea
[19:24:19] <halfak>	 I've got to run to a doctor's appointment. 
[19:24:25] <halfak>	 back in probably 2 hours. 
[19:24:26] <awight>	 I might go with "anything" to start with :)
[19:24:29] <awight>	 Enjoy!
[19:24:41] <awight>	  </tongue in cheek>
[19:24:54] * halfak hopes his recent knee success is indicative of long-term knee success. 
[19:27:18] <wikibugs>	 10Scoring-platform-team (Current), 10Scap, 10Patch-For-Review: Support git-lfs - https://phabricator.wikimedia.org/T180627#4066362 (10mmodell) >>! In T180627#4065325, @demon wrote: > Not using git-review will eliminate a few points of confusion I think :)  +1
[19:30:19] <awight>	 Dramatically worse model stats.
[20:02:23] <wikibugs>	 10Scoring-platform-team (Current), 10ORES, 10Patch-For-Review: Update ORES wheels for new revscoring requirements - https://phabricator.wikimedia.org/T188447#4066455 (10Sumit) Yeah we'll need scipy >= 0.18.1 but i see for revscoring scipy is already set as - scipy >= 0.13.3, < 1.0.999
[20:07:37] <wikibugs>	 10Scoring-platform-team (Current), 10ORES, 10Patch-For-Review: Update ORES wheels for new revscoring requirements - https://phabricator.wikimedia.org/T188447#4066481 (10awight) @Sumit Cool, thank you.  All this means is that we need to update the scipy wheel.
[20:44:24] <awight>	 Whew, that was a dumb mistake.
[20:44:57] * awight arches brow
[20:58:12] <awight>	 merge_labels actually does an intersection, and we were adding needs_review human_labeled rows by passing straight through in the Makefile.
[20:58:24] <awight>	 I'll do the same thing, from merge_labels.
[21:16:48] <awight>	 halfak: Hanging out?