[11:13:28] o/
[11:14:21] o//
[11:27:01] Scoring-platform-team, Operations, Release-Engineering-Team (Watching / External): Contact number of some WMDE staff should be avalible to SRE/RelEng - https://phabricator.wikimedia.org/T210721 (Ladsgroup) >>! In T210721#4784825, @WMDE-leszek wrote: > I take it on me. I've briefly talked about this t...
[11:27:26] akosiaris: hey, tell me if you're around for a deployment
[11:34:10] Amir1: I am
[11:34:27] is it fine to do a deployment?
[11:39:39] (CR) Ladsgroup: [V: +2 C: +2] Revert "Temp: Revert result serializier back to pickle" [services/ores/deploy] - https://gerrit.wikimedia.org/r/479500 (owner: Ladsgroup)
[11:41:01] (PS1) Ladsgroup: Bump ores to HEAD [services/ores/deploy] - https://gerrit.wikimedia.org/r/480059
[11:41:50] (CR) Ladsgroup: [V: +2 C: +2] Bump ores to HEAD [services/ores/deploy] - https://gerrit.wikimedia.org/r/480059 (owner: Ladsgroup)
[11:47:25] Amir1: sure, why not ?
[11:47:49] akosiaris: I don't want to step on anyone's toes :D
[11:48:39] no toes in the dancefloor around right now
[11:48:52] So let's get the party started
[12:33:22] ORES, Scoring-platform-team, Services (designing): Merge ORES precaching with ORESFetchScoreJob - https://phabricator.wikimedia.org/T201868 (Ladsgroup)
[12:33:25] ORES, Scoring-platform-team (Current), Core Platform Team Backlog (Watching / External), Services (watching), User-Ladsgroup: ORES precaching seem to not understand its own config - https://phabricator.wikimedia.org/T211267 (Ladsgroup) Open→Resolved The second part is also done: {F276...
[12:39:06] SPAM IS COMING
[12:39:18] Scoring-platform-team (Current): Read through teahouse literature to find exact outcome metric.
- https://phabricator.wikimedia.org/T209652 (Ladsgroup) Open→Resolved
[12:39:20] Scoring-platform-team (Current), articlequality-modeling, User-Ladsgroup, artificial-intelligence: Turn article quality javascript into a gadget - https://phabricator.wikimedia.org/T202744 (Ladsgroup) Open→Resolved a:Ladsgroup
[12:39:22] Scoring-platform-team (Current): Create project page about Newcomer quality - https://phabricator.wikimedia.org/T210211 (Ladsgroup) Open→Resolved
[12:39:24] Scoring-platform-team (Current): Respond to ORES question from Mark Wang - https://phabricator.wikimedia.org/T210086 (Ladsgroup) Open→Resolved
[12:39:26] Scoring-platform-team (Current), articlequality-modeling, artificial-intelligence: Respond to questions about ORES article quality model - https://phabricator.wikimedia.org/T209951 (Ladsgroup) Open→Resolved
[12:39:28] Scoring-platform-team (Current): Label 100 more non-singleton sessions and repeat model selection. - https://phabricator.wikimedia.org/T209728 (Ladsgroup) Open→Resolved
[12:39:30] ORES, Scoring-platform-team (Current), User-Ladsgroup: Travis is failing on master of ORES - https://phabricator.wikimedia.org/T209852 (Ladsgroup) Open→Resolved
[12:39:32] ORES, Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: Upgrade celery to 4.1.0 for ORES - https://phabricator.wikimedia.org/T178441 (Ladsgroup)
[12:39:34] Scoring-platform-team (Current): create prediction function for newcomerquality package - https://phabricator.wikimedia.org/T211192 (Ladsgroup) Open→Resolved
[12:39:36] ORES, Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: Migrate ores celery configs to celery 4 - https://phabricator.wikimedia.org/T209587 (Ladsgroup) Open→Resolved
[12:39:38] Scoring-platform-team (Current), articlequality-modeling, artificial-intelligence: Update documentation for ArticleQuality.js -
https://phabricator.wikimedia.org/T209387 (Ladsgroup) Open→Resolved
[12:39:40] ORES, Scoring-platform-team (Current), User-Ladsgroup: Reduce connection timeout to PoolCounter - https://phabricator.wikimedia.org/T208577 (Ladsgroup) Open→Resolved
[12:39:42] ORES, Scoring-platform-team (Current), User-Ladsgroup: ORES should log 500 responses - https://phabricator.wikimedia.org/T208623 (Ladsgroup) Open→Resolved
[12:39:44] Scoring-platform-team (Current): Incoporate newcomerquality model into a python package - training side - https://phabricator.wikimedia.org/T208365 (Ladsgroup) Open→Resolved
[12:39:46] Scoring-platform-team (Current): Evaluate Newcomer Model - https://phabricator.wikimedia.org/T208364 (Ladsgroup) Open→Resolved
[12:39:48] Scoring-platform-team (Current): Qualitative Analysis of Session-Edit mismatches. - https://phabricator.wikimedia.org/T208362 (Ladsgroup) Open→Resolved
[12:39:50] ORES, Scoring-platform-team (Current), User-Ladsgroup: PoolCounter lock time has increased tenfold only in codfw - https://phabricator.wikimedia.org/T208608 (Ladsgroup) Open→Resolved a:Ladsgroup
[12:39:52] ORES, Scoring-platform-team, Performance, Release-Engineering-Team (Next): Try to increase ORES deployment parallelism - https://phabricator.wikimedia.org/T197180 (Ladsgroup)
[12:39:54] Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: [Epic] Use LFS for large ORES files - https://phabricator.wikimedia.org/T197096 (Ladsgroup) Open→Resolved
[12:39:57] ORES, Scoring-platform-team (Current), Scap, Patch-For-Review, User-Ladsgroup: ORES deployment finish "successfully" even when uwsgi and celery fail to successfully start up - https://phabricator.wikimedia.org/T170950 (Ladsgroup) Open→Resolved
[12:40:00] ORES, Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: ores returns error for draftquality models -
https://phabricator.wikimedia.org/T209060 (Ladsgroup) Open→Resolved
[12:40:02] Scoring-platform-team (Current): Create sample Newcomer quality predictions for TeaHouse hosts to sanity check - https://phabricator.wikimedia.org/T209607 (Ladsgroup) Open→Resolved
[12:40:03] Scoring-platform-team (Current): Understand TeaHouse desires of Newcomer-Quality predictions - https://phabricator.wikimedia.org/T208367 (Ladsgroup) Open→Resolved
[12:40:06] Scoring-platform-team (Current): Post to enwiki TeaHouse about conducting AI-experiment - https://phabricator.wikimedia.org/T209608 (Ladsgroup) Open→Resolved
[12:40:09] Scoring-platform-team, Operations, Wikimedia-Incident: ORES overload incident, 2017-11-28 - https://phabricator.wikimedia.org/T181538 (Ladsgroup)
[12:40:12] Scoring-platform-team (Current), Operations, User-Ladsgroup, Wikimedia-Incident: Investigate redis-cluster or other techniques for making Redis not a single point of failure. - https://phabricator.wikimedia.org/T181559 (Ladsgroup) Open→Resolved
[12:40:14] Scoring-platform-team (Current), Operations, Scap, Patch-For-Review, and 2 others: Deployment git server can't supply ORES hosts in parallel - https://phabricator.wikimedia.org/T191842 (Ladsgroup) Open→Resolved
[13:41:00] Scoring-platform-team, Research, Wikilabels, Research-2017-18-Q4: Design a data collection pilot using WikiLabels platform (mining reasons) - https://phabricator.wikimedia.org/T186351 (Miriam) Open→Resolved
[13:41:29] Scoring-platform-team, Research, Wikilabels, Research-2017-18-Q4: Design a data collection pilot using WikiLabels platform (mining reasons) - https://phabricator.wikimedia.org/T186351 (Miriam)
[14:17:02] ORES, Scoring-platform-team (Current), Operations, Security-Team, and 2 others: Fetching ORES API from en.wikipedia.org blocked in debug mode - https://phabricator.wikimedia.org/T211511 (Ladsgroup) Open→Resolved This is
done.
[15:08:53] o/
[15:09:18] Amir1, https://github.com/wikimedia/editquality/pull/173 looks like the model files were committed directly -- rather than LFS. Is that right?
[15:10:15] halfak: It should not, git lfs should automatically handle it
[15:11:10] it's very likely that the file is committed through git lfs but github understands and shows it like it's not there
[15:11:22] Hmm. I'm looking for some indication that it is in LFS :|
[15:11:31] I see that old PRs look the same as this.
[15:11:38] So maybe it's fine.
[15:13:38] Looks like you can't click the trash can icon for LFS files. Good enough for me.
[15:13:42] {{merged}}
[15:13:42] [1] https://meta.wikimedia.org/wiki/Template:merged
[15:13:45] Thanks for picking that up.
[15:14:06] We have 3 models pending deployment now.
[15:14:24] Galician Article quality, Italian & German damaging/goodfaith.
[15:15:34] halfak: I also have some other PRs :D
[15:16:25] https://github.com/wikimedia/ores/pull/305
[15:16:29] https://github.com/wikimedia/ores/pull/304
[15:18:57] It's not stored in LFS, wtf
[15:19:06] I will fix it
[15:21:42] Damn. This happened last time somehow.
[15:21:44] kk
[15:23:26] Never did like that watchdog thing.
[15:35:23] * halfak hacks away on translatewiki models.
[15:45:21] I go eat something, will be back in fifteen minutes or so
[15:52:50] kk
[16:03:46] back now
[16:06:36] ORES, Scoring-platform-team (Current), Patch-For-Review, User-Ladsgroup: Change default serializer of celery from pickle to json - https://phabricator.wikimedia.org/T206333 (Ladsgroup) the last bit is root_caches which is a dictionary on its own: ` {'datasource.revision.text': "Lots of text", 'da...
[16:48:16] wikimedia/ores#1220 (poc_json_datasources - dd96453 : Amir Sarabadani): The build failed. https://travis-ci.org/wikimedia/ores/builds/469108170
[17:50:47] hoo: Mind if I schedule a 30 min meeting for us?
[17:54:35] halfak: oh I forgot to talk about dewiki damaging model, one thing is I want to talk to Birgit before moving forward but she's on vacation until Jan 8th
[17:55:12] Amir1, I think we should be OK with an ORES deployment, but we might want to wait on the ORES extension until she can support us.
[17:55:24] yes
[18:00:36] cool.
[18:00:45] It'll be nice to get those new models out :)
[18:13:17] okay the editquality now has git lfs for dewiki models
[18:13:51] I need to leave, will be back later. Won't work much though
[18:13:53] o/
[18:18:15] halfak: I'm not sure if you noticed this but while we were talking about ORES with the academics I relegated ORES's old long name to footnote status https://www.mediawiki.org/w/index.php?title=ORES&diff=2998716&oldid=2986497&diffmode=source
[18:20:55] oh also, where are the Q2 goals written down?
[18:21:04] and have you set up a drafting zone for Q3 goals yet?
[19:00:12] Jade, Scoring-platform-team (Current), Design: Discuss and create a UI mockup for the JADE editor interface - https://phabricator.wikimedia.org/T168993 (Harej)
[19:00:16] Jade, Scoring-platform-team, Design: JADE UI should provide the target's ORES score, and a human-readable explanation - https://phabricator.wikimedia.org/T181327 (Harej)
[19:00:26] HareJ: I actually liked Claudia's suggestion of "Open revision evaluation service"
[19:00:37] especially now that I know "objective" was always meant to be a joke...
[19:00:50] "Open" is a good alternative name, yes
[19:01:00] halfak: How are the "include_*" flags turned on during extraction?
[19:01:00] I think in practice we all just call it ORES
[19:01:03] :)
[19:01:06] "ores", even
[19:02:30] yess.. "ores" like the English word for unrefined rocks
[19:03:08] awight: I looked through https://phabricator.wikimedia.org/T210535 and I think it's a good starting point for conversations with Prateek
[19:03:25] awight, https://github.com/wikimedia/ores/blob/master/ores/wsgi/util.py#L68
[19:03:37] e.g. "https://github.com/wikimedia/ores/blob/master/ores/wsgi/util.py#L91"
[19:03:39] Woops.
[19:03:43] Quotes unnecessary
[19:04:02] HareJ: I also did some documentation in the task description of https://phabricator.wikimedia.org/T199128
[19:04:38] See also https://github.com/wikimedia/ores/blob/master/ores/scoring_systems/scoring_system.py#L124
[19:05:14] halfak: This might be something else. I think I get it now, though. https://github.com/wikimedia/revscoring/blob/master/revscoring/datasources/revision_oriented.py#L243
[19:05:16] Finally: https://github.com/wikimedia/ores/blob/master/ores/scoring_context.py#L89
[19:05:39] aha!
[19:06:26] I'm running away for lunch. I'll be working on goals, translatewiki, and an UPE post when I get back.
[19:06:29] o/
[19:26:03] HareJ: I didn't mean to throw a wrench in, btw--I agree that T210535 and its subtasks are a good description of what is needed, I just wanted to point out that I had left some functional specs as well.
[19:26:04] T210535: [Epic] Code support for Jade user testing - https://phabricator.wikimedia.org/T210535
[19:26:34] That should be fine
[20:17:10] wikimedia/editquality#398 (translatewiki - d7cb91e : halfak): The build failed. https://travis-ci.org/wikimedia/editquality/builds/469193720
[21:10:25] ROC-AUC of 0.971 is pretty darn good.
[21:10:34] PR-AUC of 0.879!
[21:28:33] halfak: For translatewiki reverted?
[21:28:42] Yup.
[21:28:48] Fighting with LFS now.
[21:28:57] haha the 80-20 rule
[21:29:42] oh crap. Now the dreaded "ERROR: Authentication error: Authentication required: You must have push access to verify locks" error
[21:30:15] I don't even know that one yet, but it sounds miserable.
[21:30:46] Top result on google: "Ran into the same problem on my Mac. This issue was closed in the expectation that it will be fixed soon, but this was more than a year ago."
[21:30:49] *sigh*
[21:32:43] Jade, Scoring-platform-team, Design: JADE UI should provide the target's ORES score, and a human-readable explanation - https://phabricator.wikimedia.org/T181327 (awight) I'm curious about the reasoning behind descoping, or how this will fit into our plans in the future? Happy to support scope reduc...
[21:35:13] > To disable lock verification, you can use the lfs.[remote].locksverify configuration described in git-lfs-config(5).
[21:35:16] eww.
[21:35:52] Jade, Scoring-platform-team, Design: JADE UI should provide the target's ORES score, and a human-readable explanation - https://phabricator.wikimedia.org/T181327 (Halfak) It doesn't seem like this is well described. It looks like this task is about implementing a LIME-like thing for ORES on top of J...
[21:36:13] Turns out authentication is borked so I need to take my username out of the upstream URL so that it remembers to ask for my username a second time
[21:36:17] * halfak pulls hair out.
[21:39:17] so many faces to palm.
[21:41:12] Jade, Scoring-platform-team, Design: JADE UI should provide the target's ORES score, and a human-readable explanation - https://phabricator.wikimedia.org/T181327 (Harej) Earlier discussion on this task suggests that it got de-prioritized because it's more of a feature for ORES than for Jade, but real...
[21:45:17] This is so crazy. WTF. Now I can't submit a pull request because github doesn't understand what changes are in my branch!?
[21:45:18] Ahh!
[21:46:38] https://github.com/wikimedia/editquality/compare/translatewiki?expand=1
[21:47:01] Oh wait. "master and translatewiki are entirely different commit histories."
[21:47:13] But I just pulled changes into master before starting work.
[21:48:21] * halfak face-palms hard.
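The recurring question in this log ("I'm looking for some indication that it is in LFS") comes down to whether the blob stored in git is a small LFS pointer or the model binary itself. A minimal sketch of that check follows, assuming the Git LFS v1 pointer layout (a `version` line followed by `oid sha256:` and `size` lines). The function name `looks_like_lfs_pointer` is hypothetical and this is not part of editquality or ORES.

```python
import re

# First line of every Git LFS v1 pointer file.
LFS_VERSION_LINE = "version https://git-lfs.github.com/spec/v1"
OID_RE = re.compile(r"^oid sha256:[0-9a-f]{64}$")
SIZE_RE = re.compile(r"^size \d+$")


def looks_like_lfs_pointer(data: bytes) -> bool:
    """Return True if `data` resembles a Git LFS v1 pointer file.

    Hypothetical helper for illustration only.
    """
    # Pointer files are tiny text files; a real model binary is far larger.
    if len(data) >= 1024:
        return False
    try:
        lines = data.decode("utf-8").splitlines()
    except UnicodeDecodeError:
        # Pickled or otherwise binary content usually fails to decode.
        return False
    if not lines or lines[0] != LFS_VERSION_LINE:
        return False
    has_oid = any(OID_RE.match(line) for line in lines[1:])
    has_size = any(SIZE_RE.match(line) for line in lines[1:])
    return has_oid and has_size
```

One way to feed it is to pipe the stored blob for a path, for example the output of `git cat-file -p HEAD:<path>`, into the function. `git lfs ls-files` gives a similar answer from the git-lfs side. Which of these behaves well on an old git-lfs such as the 2.3.4 mentioned later in the log is not something this sketch verifies.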
[21:48:40] ...and getting to work on my manual merge work.
[21:48:41] AHH
[21:58:33] Yay! meeting canceled so I can spend extra time fixing this.
[21:58:39] * halfak gets his mop in gear
[22:07:51] wikimedia/editquality#401 (translatewiki_fixed - 6d85a40 : Aaron Halfaker): The build failed. https://travis-ci.org/wikimedia/editquality/builds/469242863
[22:16:54] * halfak pours gasoline on his old local git and sets it ablaze.
[22:18:18] Pointer file error: Unable to parse pointer at: "models/translatewiki.reverted.random_forest.model"
[22:18:19] WTF
[22:19:24] I have this really great idea, it's an isomorphism between a file system and version control, but so abstract that everything boils down to irreversible hash pointers.
[22:20:01] Git, the filesystem
[22:20:10] new magic command I'm going to try seems too straightforward to ever work:
[22:20:11] "git lfs migrate import --fixup --everything"
[22:20:13] If you accidentally enter the wrong incantation you can format your hard drive and start over
[22:20:21] lolol
[22:20:33] Is there documentation for how we use lfs?
[22:20:41] LFS uses us
[22:20:53] that's going on office bash
[22:21:13] how we fail to use
[22:24:18] I've fully deleted everything and started from scratch. Everything is still broken.
[22:24:26] awight: the sandvig paper is good by the way, it's actually written like a proper piece of literature and not indecipherable journal speak
[22:25:10] HareJ: We're overdue for reaching out to him, I was supposed to have done that ages ago.
[22:25:11] also, my new medication regime means i can actually focus on reading texts!...except when a job interview (where I'm the interviewer) abruptly falls on my lap
[22:26:22] Wow, long live the regime! Can you spray some of this book-reading medication into the Intertubes?
[22:26:42] It also helps if the text is written to be read, like you pointed out.
[22:27:12] No way man. Gotta write that text in order to police the boundaries of your discipline against the unwashed masses.
[22:27:37] * awight tugs at the friar's robes
[22:27:55] If you don't know what *I* mean when I say cross-cultural operationalization of articulation work, then I just don't know how to talk to you. ;P
[22:29:32] It's reasonable that people have technical language, as long as the explanations are forthcoming when needed.
[22:32:28] The speaker can choose to waste time either by * a crude vulgarization which loses some important information relative to its jargspeak, or * use the right words, but work harder to bridge a knowledge gap for 99% of the humans.
[22:36:36] I like crude vulgarizations as they tend to remind jargon-users of what we really mean when we use words.
[22:36:51] It often causes a re-negotiation of the jargon that sharpens things.
[22:38:02] E.g. "articulation work" == figuring out how we should do things.
[22:41:55] ty :)
[22:42:11] a glossary would be helpful metadata for most content
[22:42:22] kind of like embedding fonts in a PDF
[22:45:44] "a journalist for The Atlantic queried Netflix URL stems repeatedly to determine that there are 76,897 microgenres of movies produced by the Netflix recommender algorithm in 2014"
[22:45:52] I have a feeling we would end up producing a similar kind of system
[22:46:00] Where as it turns out, Wikipedia can be organized into thousands of discrete subjects.
[22:48:05] HareJ: jmatazzoni was just talking about this too, e.g. we have a generator for wikiprojects which lets you interactively define a (micro)genre by giving examples of articles.
[22:48:34] probably because I told him that project scoping is one of the hardest things to do with wikiprojects
[22:48:35] As you add articles, a machine learning model is retrained and shows you a selection of 10 more articles which would be chosen
[22:48:57] haha very well lit that fire under him then
[22:49:25] so I think our topic modeling idea is a popular one; now what I'm wondering is how the hell to make it happen
[22:49:37] ++team
[22:49:44] it's urgent
[22:50:47] meanwhile, we can easily make a prototype of something like I described, or your idea
[22:53:31] I want to talk to Disco folks about that.
[22:53:55] disco, eh?
[22:53:59] I think they have the right expertise to know what kind of technological hurdle we're looking at.
[22:54:49] 💃💃💃
[22:55:02] discovery? search platform?
[22:55:11] Yeah
[22:55:32] Seems heck of easy, like we store an average embedding vector for every latest revision
[22:55:39] right
[22:55:44] Everything is easy until you actually do it!
[22:55:47] :)
[22:55:48] lol
[22:56:06] Wikipedia has mountains of edge cases that are more than willing to throw a curve ball at you
[22:56:06] I'd doubt we even need ML for this
[22:56:26] awight, at some point, we need to draw a line around ML
[22:56:35] The embedding was *learned*
[22:56:40] So it sort of is ML
[22:56:41] There's no classification, just nearness. But I bet there's something interesting about how to weight each dimension.
[22:56:48] Ah nice
[22:57:00] classification == nearness with a threshold.
[22:57:25] Do you know what the goal is when developing an embedding? To be distributed across for the corpus, or to excel at some specific type of classification?
[22:57:47] To capture the maximum amount of information in a small package.
[22:58:41] excellent formula wrt classification and nearness, that's helpful to me!
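The "classification == nearness with a threshold" remark, combined with the idea of defining a topic by example articles, can be sketched in a few lines. This is a toy illustration under stated assumptions and not anything from ORES or the search platform: the two-dimensional vectors, the use of cosine similarity as the nearness measure, the 0.8 threshold, and the helper names (`mean_vector`, `cosine_similarity`, `in_topic`) are all made up for the example.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def in_topic(article_vec, example_vecs, threshold=0.8):
    """Classify by nearness: is the article close enough to the examples?

    The topic is represented by the centroid of the example articles'
    vectors; the threshold is an arbitrary assumption for this sketch.
    """
    centroid = mean_vector(example_vecs)
    return cosine_similarity(article_vec, centroid) >= threshold


# Toy "embeddings" for two example articles that define a topic.
examples = [[1.0, 0.0], [0.8, 0.2]]
near = in_topic([1.0, 0.1], examples)   # points the same way as the examples
far = in_topic([0.0, 1.0], examples)    # nearly orthogonal to the examples
```

Adding more example articles just moves the centroid, which is one cheap way to approximate the "retrain as you add articles" loop described above without fitting a separate classifier. Weighting each dimension, as raised in the chat, would replace the plain dot product with a weighted one.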
[22:58:46] To reduce our amazingly complex world down into standardized archetypes ;]
[22:59:40] So the kind of algorithms we're using could be thought of as a space transformation that implies a new metric?
[23:02:13] Sure. That's a fair way to look at it.
[23:02:59] The new metric optimizes the signal for reconstructing original document using only a few numbers.
[23:03:34] In theory, you could use the vector to make a document that looks like an original wiki article -- at least thematically.
[23:04:15] You wouldn't be able to regenerate sentences. I bet the word cloud would be nearly identical.
[23:08:41] Would we need some way to go from vectors to human-identifiable subject areas, or would it be sufficient to treat subject areas as abstractions? "We know these two articles are related, but we don't know why."
[23:09:12] I think the most flexible system is one that isn't held back by traditional notions of subject areas. We could get pretty creative in grouping articles together.
[23:09:37] HareJ, I think you're thinking about it right.
[23:09:40] Ultimately this grouping exercise is in pursuit of a goal, and not just so we can have nice categories to sort through.
[23:09:48] This allows you to design your own hierarchy if you want.
[23:10:18] Right. Easy "Neighborhoods" and useful directories/topic-spaces for study/recommendation/evaluation/etc.
[23:10:38] It could also in theory power a recommendation system for readers.
[23:10:41] Speaking of, I wonder how they do it.
[23:10:51] The feature that recommends other articles to read.
[23:10:55] Right. Morelike.
[23:11:08] It uses text similarity -- which is very similar, but harder to index.
[23:11:10] I thought morelike was used strictly by the recommendation API? Or is it used by both?
[23:11:17] I think it is both.
[23:12:07] Is morelike also used for recommending articles for CX?
[23:12:37] I think so.
[23:12:57] But what we want ultimately is something more sophisticated than morelike
[23:13:03] For reasons I may not totally grasp yet :)
[23:13:17] ("He's not an engineer! Get him!")
[23:14:10] * awight whittles a forked stick in the corner of room
[23:15:30] HareJ, yeah. I think we want to talk about options with the search folk -- given their expertise in this area.
[23:15:44] OK so I think I figured out git issues a bit.
[23:16:05] I'm running git-lfs 2.6.0 locally.
[23:16:14] And on stat1007, I'm running "Error: unknown flag: --version"
[23:16:17] lol
[23:16:23] /o]
[23:16:41] I'd be eager to participate in this meeting with search platform. Maybe. It might go over my head :)
[23:17:51] Na. I think it'll be use-case negotiation.
[23:17:58] So you'd be at home there, HareJ
[23:18:07] I haven't set anything up yet. Want to beat me to it? :)
[23:19:02] Maybe! I want to get through a few things first.
[23:19:31] "git-lfs/2.3.4"
[23:19:46] Which is apparently a very stupid version of git-lfs.
[23:27:33] Annoyingly, $wgMWLoggerDefaultSpi if configured correctly will silently break $wgDebugToolbar.
[23:41:33] OK good enough: https://github.com/wikimedia/editquality/pull/174
[23:41:47] I think we'll need to get git-lfs updated on stat1007 soon.
[23:41:54] But that's a problem for tomorrow-Aaron
[23:42:01] Have a good night, folks!
[23:46:10] Jade, Scoring-platform-team (Current), DBA, Operations, and 2 others: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) Here are some example queries to help with reviewing the DDL. @Marostegui, I'm especially intereste...
[23:46:17] o/