[11:13:28] <Amir1>	 o/
[11:14:21] <Hauskatze>	 o//
[11:27:01] <wikibugs>	 Scoring-platform-team, Operations, Release-Engineering-Team (Watching / External): Contact number of some WMDE staff should be avalible to SRE/RelEng - https://phabricator.wikimedia.org/T210721 (Ladsgroup) >>! In T210721#4784825, @WMDE-leszek wrote: > I take it on me. I've briefly talked about this t...
[11:27:26] <Amir1>	 akosiaris: hey, tell me if you're around for a deployment
[11:34:10] <akosiaris>	 Amir1: I am
[11:34:27] <Amir1>	 is it fine to do a deployment?
[11:39:39] <wikibugs>	 (CR) Ladsgroup: [V: +2 C: +2] Revert "Temp: Revert result serializier back to pickle" [services/ores/deploy] - https://gerrit.wikimedia.org/r/479500 (owner: Ladsgroup)
[11:41:01] <wikibugs>	 (PS1) Ladsgroup: Bump ores to HEAD [services/ores/deploy] - https://gerrit.wikimedia.org/r/480059
[11:41:50] <wikibugs>	 (CR) Ladsgroup: [V: +2 C: +2] Bump ores to HEAD [services/ores/deploy] - https://gerrit.wikimedia.org/r/480059 (owner: Ladsgroup)
[11:47:25] <akosiaris>	 Amir1: sure, why not ?
[11:47:49] <Amir1>	 akosiaris: I don't want to step on anyone's toes :D
[11:48:39] <akosiaris>	 no toes in the dancefloor around right now
[11:48:52] <Amir1>	 So let's get the party started
[12:33:22] <wikibugs>	 ORES, Scoring-platform-team, Services (designing): Merge ORES precaching with ORESFetchScoreJob - https://phabricator.wikimedia.org/T201868 (Ladsgroup)
[12:33:25] <wikibugs>	 ORES, Scoring-platform-team (Current), Core Platform Team Backlog (Watching / External), Services (watching), User-Ladsgroup: ORES precaching seem to not understand its own config - https://phabricator.wikimedia.org/T211267 (Ladsgroup) Open→Resolved The second part is also done: {F276...
[12:39:06] <Amir1>	 SPAM IS COMING
[12:39:18] <wikibugs>	 Scoring-platform-team (Current): Read through teahouse literature to find exact outcome metric. - https://phabricator.wikimedia.org/T209652 (Ladsgroup) Open→Resolved
[12:39:20] <wikibugs>	 Scoring-platform-team (Current), articlequality-modeling, User-Ladsgroup, artificial-intelligence: Turn article quality javascript into a gadget - https://phabricator.wikimedia.org/T202744 (Ladsgroup) Open→Resolved a:Ladsgroup
[12:39:22] <wikibugs>	 Scoring-platform-team (Current): Create project page about Newcomer quality - https://phabricator.wikimedia.org/T210211 (Ladsgroup) Open→Resolved
[12:39:24] <wikibugs>	 Scoring-platform-team (Current): Respond to ORES question from Mark Wang - https://phabricator.wikimedia.org/T210086 (Ladsgroup) Open→Resolved
[12:39:26] <wikibugs>	 Scoring-platform-team (Current), articlequality-modeling, artificial-intelligence: Respond to questions about ORES article quality model - https://phabricator.wikimedia.org/T209951 (Ladsgroup) Open→Resolved
[12:39:28] <wikibugs>	 Scoring-platform-team (Current): Label 100 more non-singleton sessions and repeat model selection. - https://phabricator.wikimedia.org/T209728 (Ladsgroup) Open→Resolved
[12:39:30] <wikibugs>	 ORES, Scoring-platform-team (Current), User-Ladsgroup: Travis is failing on master of ORES - https://phabricator.wikimedia.org/T209852 (Ladsgroup) Open→Resolved
[12:39:32] <wikibugs>	 ORES, Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: Upgrade celery to 4.1.0 for ORES - https://phabricator.wikimedia.org/T178441 (Ladsgroup)
[12:39:34] <wikibugs>	 Scoring-platform-team (Current): create prediction function for newcomerquality package - https://phabricator.wikimedia.org/T211192 (Ladsgroup) Open→Resolved
[12:39:36] <wikibugs>	 ORES, Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: Migrate ores celery configs to celery 4 - https://phabricator.wikimedia.org/T209587 (Ladsgroup) Open→Resolved
[12:39:38] <wikibugs>	 Scoring-platform-team (Current), articlequality-modeling, artificial-intelligence: Update documentation for ArticleQuality.js - https://phabricator.wikimedia.org/T209387 (Ladsgroup) Open→Resolved
[12:39:40] <wikibugs>	 ORES, Scoring-platform-team (Current), User-Ladsgroup: Reduce connection timeout to PoolCounter - https://phabricator.wikimedia.org/T208577 (Ladsgroup) Open→Resolved
[12:39:42] <wikibugs>	 ORES, Scoring-platform-team (Current), User-Ladsgroup: ORES should log 500 responses - https://phabricator.wikimedia.org/T208623 (Ladsgroup) Open→Resolved
[12:39:44] <wikibugs>	 Scoring-platform-team (Current): Incoporate newcomerquality model into a python package - training side - https://phabricator.wikimedia.org/T208365 (Ladsgroup) Open→Resolved
[12:39:46] <wikibugs>	 Scoring-platform-team (Current): Evaluate Newcomer Model - https://phabricator.wikimedia.org/T208364 (Ladsgroup) Open→Resolved
[12:39:48] <wikibugs>	 Scoring-platform-team (Current): Qualitative Analysis of Session-Edit mismatches. - https://phabricator.wikimedia.org/T208362 (Ladsgroup) Open→Resolved
[12:39:50] <wikibugs>	 ORES, Scoring-platform-team (Current), User-Ladsgroup: PoolCounter lock time has increased tenfold only in codfw - https://phabricator.wikimedia.org/T208608 (Ladsgroup) Open→Resolved a:Ladsgroup
[12:39:52] <wikibugs>	 ORES, Scoring-platform-team, Performance, Release-Engineering-Team (Next): Try to increase ORES deployment parallelism - https://phabricator.wikimedia.org/T197180 (Ladsgroup)
[12:39:54] <wikibugs>	 Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: [Epic] Use LFS for large ORES files - https://phabricator.wikimedia.org/T197096 (Ladsgroup) Open→Resolved
[12:39:57] <wikibugs>	 ORES, Scoring-platform-team (Current), Scap, Patch-For-Review, User-Ladsgroup: ORES deployment finish "successfully" even when uwsgi and celery fail to successfully start up - https://phabricator.wikimedia.org/T170950 (Ladsgroup) Open→Resolved
[12:40:00] <wikibugs>	 ORES, Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: ores returns error for draftquality models - https://phabricator.wikimedia.org/T209060 (Ladsgroup) Open→Resolved
[12:40:02] <wikibugs>	 Scoring-platform-team (Current): Create sample Newcomer quality predictions for TeaHouse hosts to sanity check - https://phabricator.wikimedia.org/T209607 (Ladsgroup) Open→Resolved
[12:40:03] <wikibugs>	 Scoring-platform-team (Current): Understand TeaHouse desires of Newcomer-Quality predictions - https://phabricator.wikimedia.org/T208367 (Ladsgroup) Open→Resolved
[12:40:06] <wikibugs>	 Scoring-platform-team (Current): Post to enwiki TeaHouse about conducting AI-experiment - https://phabricator.wikimedia.org/T209608 (Ladsgroup) Open→Resolved
[12:40:09] <wikibugs>	 Scoring-platform-team, Operations, Wikimedia-Incident: ORES overload incident, 2017-11-28 - https://phabricator.wikimedia.org/T181538 (Ladsgroup)
[12:40:12] <wikibugs>	 Scoring-platform-team (Current), Operations, User-Ladsgroup, Wikimedia-Incident: Investigate redis-cluster or other techniques for making Redis not a single point of failure. - https://phabricator.wikimedia.org/T181559 (Ladsgroup) Open→Resolved
[12:40:14] <wikibugs>	 Scoring-platform-team (Current), Operations, Scap, Patch-For-Review, and 2 others: Deployment git server can't supply ORES hosts in parallel - https://phabricator.wikimedia.org/T191842 (Ladsgroup) Open→Resolved
[13:41:00] <wikibugs>	 Scoring-platform-team, Research, Wikilabels, Research-2017-18-Q4: Design a data collection pilot using WikiLabels platform (mining reasons) - https://phabricator.wikimedia.org/T186351 (Miriam) Open→Resolved
[13:41:29] <wikibugs>	 Scoring-platform-team, Research, Wikilabels, Research-2017-18-Q4: Design a data collection pilot using WikiLabels platform (mining reasons) - https://phabricator.wikimedia.org/T186351 (Miriam)
[14:17:02] <wikibugs>	 ORES, Scoring-platform-team (Current), Operations, Security-Team, and 2 others: Fetching ORES API from en.wikipedia.org blocked in debug mode - https://phabricator.wikimedia.org/T211511 (Ladsgroup) Open→Resolved This is done.
[15:08:53] <halfak>	 o/ 
[15:09:18] <halfak>	 Amir1, https://github.com/wikimedia/editquality/pull/173  looks like the model files were committed directly -- rather than LFS.  Is that right? 
[15:10:15] <Amir1>	 halfak: It should not, git lfs should automatically handle it
[15:11:10] <Amir1>	 it's very likely that the file is committed through git lfs but github understands and shows it like it's not there
[15:11:22] <halfak>	 Hmm.  I'm looking for some indication that it is in LFS :| 
[15:11:31] <halfak>	 I see that old PRs look the same as this. 
[15:11:38] <halfak>	 So maybe it's fine. 
[15:13:38] <halfak>	 Looks like you can't click the trash can icon for LFS files.  Good enough for me. 
[15:13:42] <halfak>	 {{merged}}
[15:13:42] <AsimovBot>	 [1] https://meta.wikimedia.org/wiki/Template:merged
[15:13:45] <halfak>	 Thanks for picking that up. 
[15:14:06] <halfak>	 We have 3 models pending deployment now. 
[15:14:24] <halfak>	 Galician Article quality, Italian & German damaging/goodfaith. 
[15:15:34] <Amir1>	 halfak: I also have some other PRs :D
[15:16:25] <Amir1>	 https://github.com/wikimedia/ores/pull/305
[15:16:29] <Amir1>	 https://github.com/wikimedia/ores/pull/304
[15:18:57] <Amir1>	 It's not stored in LFS, wtf
[15:19:06] <Amir1>	 I will fix it
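One way to tell whether a committed model file went through LFS is to look at the blob git actually stored: an LFS-tracked file is committed as a small text pointer rather than the binary itself. The following is a minimal sketch of that check, assuming the pointer layout from the Git LFS spec (a `version` line followed by `oid` and `size` keys). The function name `looks_like_lfs_pointer` is hypothetical, and how the blob bytes are obtained (for example via `git show <rev>:<path>`) is left to the caller.

```python
# First line of every Git LFS pointer file, per the Git LFS spec.
LFS_VERSION_LINE = "version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(blob: bytes) -> bool:
    """Return True if `blob` has the shape of a Git LFS pointer file.

    `blob` is assumed to be the content git stored for the path,
    not the smudged working-tree file.
    """
    try:
        text = blob.decode("utf-8")
    except UnicodeDecodeError:
        # Raw binary (e.g. a pickled model) is not a pointer.
        return False
    lines = text.splitlines()
    if not lines or lines[0] != LFS_VERSION_LINE:
        return False
    # Remaining lines are "key value" pairs; a pointer needs oid and size.
    keys = {line.split(" ", 1)[0] for line in lines[1:] if line}
    return {"oid", "size"} <= keys
```

The `git lfs ls-files` command lists the paths LFS is tracking in the current checkout, which is a quicker check when the repository is available locally.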
[15:21:42] <halfak>	 Damn.  This happened last time somehow. 
[15:21:44] <halfak>	 kk
[15:23:26] <halfak>	 Never did like that watchdog thing. 
[15:35:23] * halfak hacks away on translatewiki models. 
[15:45:21] <Amir1>	 I go eat something, will be back in fifteen minutes or so
[15:52:50] <halfak>	 kk
[16:03:46] <Amir1>	 back now
[16:06:36] <wikibugs>	 ORES, Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: Change default serializer of celery from pickle to json - https://phabricator.wikimedia.org/T206333 (Ladsgroup) the last bit is root_caches which is a dictionary on its own: ` {'datasource.revision.text': "Lots of text", 'da...
[16:48:16] <travis-ci>	 wikimedia/ores#1220 (poc_json_datasources - dd96453 : Amir Sarabadani): The build failed. https://travis-ci.org/wikimedia/ores/builds/469108170
[17:50:47] <awight>	 hoo: Mind if I schedule a 30 min meeting for us?
[17:54:35] <Amir1>	 halfak: oh I forgot to talk about dewiki damaging model, one thing is I want to talk to Birgit before moving forward but she's on vacation until Jan 8th
[17:55:12] <halfak>	 Amir1, I think we should be OK with an ORES deployment, but we might want to wait on the ORES extension until she can support us. 
[17:55:24] <Amir1>	 yes
[18:00:36] <halfak>	 cool. 
[18:00:45] <halfak>	 It'll be nice to get those new models out :) 
[18:13:17] <Amir1>	 okay the editquality now has git lfs for dewiki models 
[18:13:51] <Amir1>	 I need to leave, will be back later. Won't work much though 
[18:13:53] <Amir1>	 o/
[18:18:15] <HareJ>	 halfak: I'm not sure if you noticed this but while we were talking about ORES with the academics I relegated ORES's old long name to footnote status https://www.mediawiki.org/w/index.php?title=ORES&diff=2998716&oldid=2986497&diffmode=source
[18:20:55] <HareJ>	 oh also, where are the Q2 goals written down?
[18:21:04] <HareJ>	 and have you set up a drafting zone for Q3 goals yet?
[19:00:12] <wikibugs>	 Jade, Scoring-platform-team (Current), Design: Discuss and create a UI mockup for the JADE editor interface - https://phabricator.wikimedia.org/T168993 (Harej)
[19:00:16] <wikibugs>	 Jade, Scoring-platform-team, Design: JADE UI should provide the target's ORES score, and a human-readable explanation - https://phabricator.wikimedia.org/T181327 (Harej)
[19:00:26] <awight>	 HareJ: I actually liked Claudia's suggestion of "Open revision evaluation service"
[19:00:37] <awight>	 especially now that I know "objective" was always meant to be a joke...
[19:00:50] <HareJ>	 "Open" is a good alternative name, yes
[19:01:00] <awight>	 halfak: How are the "include_*" flags turned on during extraction?
[19:01:00] <HareJ>	 I think in practice we all just call it ORES
[19:01:03] <awight>	 :)
[19:01:06] <awight>	 "ores", even
[19:02:30] <halfak>	 yess..  "ores"  like the English word for unrefined rocks 
[19:03:08] <HareJ>	 awight: I looked through https://phabricator.wikimedia.org/T210535 and I think it's a good starting point for conversations with Prateek
[19:03:25] <halfak>	 awight, https://github.com/wikimedia/ores/blob/master/ores/wsgi/util.py#L68
[19:03:37] <halfak>	 e.g. "https://github.com/wikimedia/ores/blob/master/ores/wsgi/util.py#L91"
[19:03:39] <halfak>	 Woops. 
[19:03:43] <halfak>	 Quotes unnecessary
[19:04:02] <awight>	 HareJ: I also did some documentation in the task description of https://phabricator.wikimedia.org/T199128
[19:04:38] <halfak>	 See also https://github.com/wikimedia/ores/blob/master/ores/scoring_systems/scoring_system.py#L124
[19:05:14] <awight>	 halfak: This might be something else.  I think I get it now, though.  https://github.com/wikimedia/revscoring/blob/master/revscoring/datasources/revision_oriented.py#L243
[19:05:16] <halfak>	 Finally: https://github.com/wikimedia/ores/blob/master/ores/scoring_context.py#L89
[19:05:39] <halfak>	 aha!
[19:06:26] <halfak>	 I'm running away for lunch.  I'll be working on goals, translatewiki, and an UPE post when I get back. 
[19:06:29] <awight>	 o/
[19:26:03] <awight>	 HareJ: I didn't mean to throw a wrench in, btw--I agree that T210535 and its subtasks are a good description of what is needed, I just wanted to point out that I had left some functional specs as well.
[19:26:04] <stashbot>	 T210535: [Epic] Code support for Jade user testing - https://phabricator.wikimedia.org/T210535
[19:26:34] <HareJ>	 That should be fine
[20:17:10] <travis-ci>	 wikimedia/editquality#398 (translatewiki - d7cb91e : halfak): The build failed. https://travis-ci.org/wikimedia/editquality/builds/469193720
[21:10:25] <halfak>	 ROC-AUC of 0.971 is pretty darn good. 
[21:10:34] <halfak>	 PR-AUC of 0.879!
[21:28:33] <awight>	 halfak: For translatewiki reverted?
[21:28:42] <halfak>	 Yup. 
[21:28:48] <halfak>	 Fighting with LFS now. 
[21:28:57] <awight>	 haha the 80-20 rule
[21:29:42] <halfak>	 oh crap.  Now the dreaded "ERROR: Authentication error: Authentication required: You must have push access to verify locks" error
[21:30:15] <awight>	 I don't even know that one yet, but it sounds miserable.
[21:30:46] <halfak>	 Top result on google: "Ran into the same problem on my Mac. This issue was closed in the expectation that it will be fixed soon, but this was more than a year ago."
[21:30:49] <halfak>	 *sigh*
[21:32:43] <wikibugs>	 Jade, Scoring-platform-team, Design: JADE UI should provide the target's ORES score, and a human-readable explanation - https://phabricator.wikimedia.org/T181327 (awight) I'm curious about the reasoning behind descoping, or how this will fit into our plans in the future?  Happy to support scope reduc...
[21:35:13] <awight>	 > To disable lock verification, you can use the lfs.[remote].locksverify configuration described in git-lfs-config(5).
[21:35:16] <awight>	 eww.
[21:35:52] <wikibugs>	 Jade, Scoring-platform-team, Design: JADE UI should provide the target's ORES score, and a human-readable explanation - https://phabricator.wikimedia.org/T181327 (Halfak) It doesn't seem like this is well described.  It looks like this task is about implementing a LIME-like thing for ORES on top of J...
[21:36:13] <halfak>	 Turns out authentication is borked so I need to take my username out of the upstream URL so that it remembers to ask for my username a second time 
[21:36:17] * halfak pulls hair out. 
[21:39:17] <awight>	 so many faces to palm.
[21:41:12] <wikibugs>	 Jade, Scoring-platform-team, Design: JADE UI should provide the target's ORES score, and a human-readable explanation - https://phabricator.wikimedia.org/T181327 (Harej) Earlier discussion on this task suggests that it got de-prioritized because it's more of a feature for ORES than for Jade, but real...
[21:45:17] <halfak>	 This is so crazy.  WTF.  Now I can't submit a pull request because github doesn't understand what changes are in my branch!?  
[21:45:18] <halfak>	 Ahh!
[21:46:38] <halfak>	 https://github.com/wikimedia/editquality/compare/translatewiki?expand=1
[21:47:01] <halfak>	 Oh wait.  "master and translatewiki are entirely different commit histories." 
[21:47:13] <halfak>	 But I just pulled changes into master before starting work. 
[21:48:21] * halfak face-palms hard. 
[21:48:40] <halfak>	 ...and getting to work on my manual merge work. 
[21:48:41] <halfak>	 AHH
[21:58:33] <halfak>	 Yay!  meeting canceled so I can spend extra time fixing this. 
[21:58:39] * halfak gets his mop in gear
[22:07:51] <travis-ci>	 wikimedia/editquality#401 (translatewiki_fixed - 6d85a40 : Aaron Halfaker): The build failed. https://travis-ci.org/wikimedia/editquality/builds/469242863
[22:16:54] * halfak pours gasoline on his old local git and sets it ablaze. 
[22:18:18] <halfak>	 Pointer file error: Unable to parse pointer at: "models/translatewiki.reverted.random_forest.model"
[22:18:19] <halfak>	 WTF
[22:19:24] <awight>	 I have this really great idea, it's an isomorphism between a file system and version control, but so abstract that everything boils down to irreversible hash pointers.
[22:20:01] <HareJ>	 Git, the filesystem
[22:20:10] <halfak>	 new magic command I'm going to try seems too straightforward to ever work:
[22:20:11] <halfak>	 "git lfs migrate import --fixup --everything"
[22:20:13] <HareJ>	 If you accidentally enter the wrong incantation you can format your hard drive and start over
[22:20:21] <halfak>	 lolol
[22:20:33] <awight>	 Is there documentation for how we use lfs?
[22:20:41] <halfak>	 LFS uses us
[22:20:53] <HareJ>	 that's going on office bash
[22:21:13] <awight>	 how we fail to use
[22:24:18] <halfak>	 I've fully deleted everything and started from scratch.  Everything is still broken. 
[22:24:26] <HareJ>	 awight: the sandvig paper is good by the way, it's actually written like a proper piece of literature and not indecipherable journal speak
[22:25:10] <awight>	 HareJ: We're overdue for reaching out to him, I was supposed to have done that ages ago.
[22:25:11] <HareJ>	 also, my new medication regime means i can actually focus on reading texts!...except when a job interview (where I'm the interviewer) abruptly falls on my lap
[22:26:22] <awight>	 Wow, long live the regime!  Can you spray some of this book-reading medication into the Intertubes?
[22:26:42] <awight>	 It also helps if the text is written to be read, like you pointed out.
[22:27:12] <halfak>	 No way man.  Gotta write that text in order to police the boundaries of your discipline against the unwashed masses. 
[22:27:37] * awight tugs at the friar's robes
[22:27:55] <halfak>	 If you don't know what *I* mean when I say cross-cultural operationalization of articulation work, then I just don't know how to talk to you. ;P 
[22:29:32] <awight>	 It's reasonable that people have technical language, as long as the explanations are forthcoming when needed.
[22:32:28] <awight>	 The speaker can choose to waste time either by * using a crude vulgarization, which loses some important information relative to its jargspeak, or * using the right words, but working harder to bridge a knowledge gap for 99% of the humans.
[22:36:36] <halfak>	 I like crude vulgarizations as they tend to remind jargon-users of what we really mean when we use words. 
[22:36:51] <halfak>	 It often causes a re-negotiation of the jargon that sharpens things. 
[22:38:02] <halfak>	 E.g. "articulation work" == figuring out how we should do things. 
[22:41:55] <awight>	 ty :)
[22:42:11] <awight>	 a glossary would be helpful metadata for most content
[22:42:22] <awight>	 kind of like embedding fonts in a PDF
[22:45:44] <HareJ>	 "a journalist for
[22:45:44] <HareJ>	 The Atlantic queried Netflix URL stems repeatedly to determine that there are 76,897 microgenres
[22:45:44] <HareJ>	 of movies produced by the Netflix recommender algorithm in 2014"
[22:45:52] <HareJ>	 I have a feeling we would end up producing a similar kind of system
[22:46:00] <HareJ>	 Where as it turns out, Wikipedia can be organized into thousands of discrete subjects.
[22:48:05] <awight>	 HareJ: jmatazzoni was just talking about this too, e.g. we have a generator for wikiprojects which lets you interactively define a (micro)genre by giving examples of articles.
[22:48:34] <HareJ>	 probably because I told him that project scoping is one of the hardest things to do with wikiprojects
[22:48:35] <awight>	 As you add articles, a machine learning model is retrained and shows you a selection of 10 more articles which would be chosen
[22:48:57] <awight>	 haha very well lit that fire under him then
[22:49:25] <HareJ>	 so I think our topic modeling idea is a popular one; now what I'm wondering is how the hell to make it happen
[22:49:37] <awight>	 ++team
[22:49:44] <awight>	 it's urgent
[22:50:47] <awight>	 meanwhile, we can easily make a prototype of something like I described, or your idea
[22:53:31] <halfak>	 I want to talk to Disco folks about that. 
[22:53:55] <HareJ>	 disco, eh?
[22:53:59] <halfak>	 I think they have the right expertise to know what kind of technological hurdle we're looking at. 
[22:54:49] <halfak>	 💃💃💃
[22:55:02] <HareJ>	 discovery? search platform?
[22:55:11] <halfak>	 Yeah
[22:55:32] <awight>	 Seems heck of easy, like we store an average embedding vector for every latest revision
[22:55:39] <halfak>	 right
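The "average embedding vector" idea above can be sketched as follows. This is only an illustration under stated assumptions: `word_vectors` is a hypothetical in-memory mapping from token to a pre-trained vector, and `average_embedding` is an invented helper name, not anything from ORES or revscoring.

```python
def average_embedding(tokens, word_vectors):
    """Average the vectors of the tokens that have a known embedding.

    `word_vectors` is a hypothetical dict mapping token -> list of floats,
    all of the same length. Returns None if no token is known.
    """
    vectors = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vectors:
        return None
    dim = len(vectors[0])
    # Component-wise mean over all known token vectors.
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
```

Storing one such vector per latest revision would turn "find related articles" into a nearest-vector lookup, which is the part the rest of the conversation treats as the open engineering question.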
[22:55:44] <HareJ>	 Everything is easy until you actually do it!
[22:55:47] <awight>	 :)
[22:55:48] <halfak>	 lol
[22:56:06] <HareJ>	 Wikipedia has mountains of edge cases that are more than willing to throw a curve ball at you
[22:56:06] <awight>	 I'd doubt we even need ML for this
[22:56:26] <halfak>	 awight, at some point, we need to draw a line around ML 
[22:56:35] <halfak>	 The embedding was *learned* 
[22:56:40] <halfak>	 So it sort of is ML
[22:56:41] <awight>	 There's no classification, just nearness.  But I bet there's something interesting about how to weight each dimension.
[22:56:48] <awight>	 Ah nice
[22:57:00] <halfak>	 classification == nearness with a threshold.
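A minimal sketch of "classification == nearness with a threshold", assuming cosine similarity as the nearness measure and an arbitrary cutoff of 0.8. Both choices, and the names `cosine_similarity` and `in_topic`, are illustrative assumptions rather than anything specified in the discussion.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def in_topic(article_vec, topic_vec, threshold=0.8):
    """Binary decision: is the article near enough to the topic vector?"""
    return cosine_similarity(article_vec, topic_vec) >= threshold
```

Lowering the threshold widens the "neighborhood" around the topic vector, so the same nearness score yields different class boundaries depending on where the line is drawn.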
[22:57:25] <awight>	 Do you know what the goal is when developing an embedding?  To be distributed across the corpus, or to excel at some specific type of classification?
[22:57:47] <halfak>	 To capture the maximum amount of information in a small package. 
[22:58:41] <awight>	 excellent formula wrt classification and nearness, that's helpful to me!
[22:58:46] <HareJ>	 To reduce our amazingly complex world down into standardized archetypes ;]
[22:59:40] <awight>	 So the kind of algorithms we're using could be thought of as a space transformation that implies a new metric?
[23:02:13] <halfak>	 Sure.  That's a fair way to look at it. 
[23:02:59] <halfak>	 The new metric optimizes the signal for reconstructing the original document using only a few numbers. 
[23:03:34] <halfak>	 In theory, you could use the vector to make a document that looks like an original wiki article -- at least thematically. 
[23:04:15] <halfak>	 You wouldn't be able to regenerate sentences.  I bet the word cloud would be nearly identical. 
[23:08:41] <HareJ>	 Would we need some way to go from vectors to human-identifiable subject areas, or would it be sufficient to treat subject areas as abstractions? "We know these two articles are related, but we don't know why."
[23:09:12] <HareJ>	 I think the most flexible system is one that isn't held back by traditional notions of subject areas. We could get pretty creative in grouping articles together.
[23:09:37] <halfak>	 HareJ, I think you're thinking about it right. 
[23:09:40] <HareJ>	 Ultimately this grouping exercise is in pursuit of a goal, and not just so we can have nice categories to sort through.
[23:09:48] <halfak>	 This allows you to design your own hierarchy if you want. 
[23:10:18] <halfak>	 Right.  Easy "Neighborhoods" and useful directories/topic-spaces for study/recommendation/evaluation/etc. 
[23:10:38] <HareJ>	 It could also in theory power a recommendation system for readers.
[23:10:41] <HareJ>	 Speaking of, I wonder how they do it.
[23:10:51] <HareJ>	 The feature that recommends other articles to read.
[23:10:55] <halfak>	 Right.  Morelike. 
[23:11:08] <halfak>	 It uses text similarity -- which is very similar, but harder to index. 
[23:11:10] <HareJ>	 I thought morelike was used strictly by the recommendation API? Or is it used by both?
[23:11:17] <halfak>	 I think it is both. 
[23:12:07] <HareJ>	 Is morelike also used for recommending articles for CX?
[23:12:37] <halfak>	 I think so. 
[23:12:57] <HareJ>	 But what we want ultimately is something more sophisticated than morelike
[23:13:03] <HareJ>	 For reasons I may not totally grasp yet :)
[23:13:17] <HareJ>	 ("He's not an engineer! Get him!")
[23:14:10] * awight whittles a forked stick in the corner of room
[23:15:30] <halfak>	 HareJ, yeah.  I think we want to talk about options with the search folk -- given their expertise in this area. 
[23:15:44] <halfak>	 OK so I think I figured out git issues a bit. 
[23:16:05] <halfak>	 I'm running git-lfs 2.6.0 locally. 
[23:16:14] <halfak>	 And on stat1007, I'm running "Error: unknown flag: --version"
[23:16:17] <halfak>	 lol
[23:16:23] <awight>	  /o]
[23:16:41] <HareJ>	 I'd be eager to participate in this meeting with search platform. Maybe. It might go over my head :)
[23:17:51] <halfak>	 Na.  I think it'll be use-case negotiation. 
[23:17:58] <halfak>	 So you'd be at home there, HareJ 
[23:18:07] <halfak>	 I haven't set anything up yet.  Want to beat me to it? :) 
[23:19:02] <HareJ>	 Maybe! I want to get through a few things first.
[23:19:31] <halfak>	 "git-lfs/2.3.4"
[23:19:46] <halfak>	 Which is apparently a very stupid version of git-lfs. 
[23:27:33] <awight>	 Annoyingly, $wgMWLoggerDefaultSpi if configured correctly will silently break $wgDebugToolbar.
[23:41:33] <halfak>	 OK good enough: https://github.com/wikimedia/editquality/pull/174
[23:41:47] <halfak>	 I think we'll need to get git-lfs updated on stat1007 soon. 
[23:41:54] <halfak>	 But that's a problem for tomorrow-Aaron
[23:42:01] <halfak>	 Have a good night, folks!
[23:46:10] <wikibugs>	 Jade, Scoring-platform-team (Current), DBA, Operations, and 2 others: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) Here are some example queries to help with reviewing the DDL.  @Marostegui, I'm especially intereste...
[23:46:17] <awight>	 o/