[00:20:48] <wikibugs>	 10Scoring-platform-team (Current), 10MediaWiki-extensions-ORES: ORES extension highlights edits that are patrolled - https://phabricator.wikimedia.org/T187337#3974260 (10Ladsgroup) It's probably both, can't say for sure. I need to investigate in more depth
[00:41:00] <icinga-wm>	 PROBLEM - ORES web node labs ores-web-02 on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:41:50] <icinga-wm>	 RECOVERY - ORES web node labs ores-web-02 on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 440 bytes in 0.552 second response time
[00:42:01] <halfak>	 ^ wut
[01:41:41] <icinga-wm>	 PROBLEM - ORES web node labs ores-web-02 on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:42:40] <icinga-wm>	 RECOVERY - ORES web node labs ores-web-02 on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 443 bytes in 2.041 second response time
[01:43:54] <halfak>	 Looks like we're getting hammered. 
[01:44:00] <halfak>	 Not too worried about it. 
[01:44:06] <halfak>	 This is why we have an experimental deploy
[01:50:13] <wikibugs>	 10Scoring-platform-team (Current), 10editquality-modeling, 10User-Ladsgroup, 10User-Tgr, 10artificial-intelligence: Train/test damaging and goodfaith model for Hungarian Wikipedia - https://phabricator.wikimedia.org/T185903#3974519 (10Tgr) Uh, sorry, I haven't had much free time in the last couple days....
[04:02:11] <icinga-wm>	 PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:03:10] <icinga-wm>	 RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.540 second response time
[13:01:39] <awight>	 akosiaris: Just holler if there’s anything I should do to support the last step, of patching changeprop requests over to the new cluster & deactivating our role…
[13:01:59] <awight>	 Looking great so far!
[13:02:56] <awight>	 ores2* is getting a strangely spikey workload, but seems perfectly healthy
[13:04:02] <awight>	 I’ll keep reminding myself, once scb* is out of the picture, I should revert the decreased worker count.
[13:05:29] <awight>	 hehe, is it something I said?  we just got a 400% surge in requests
[13:05:56] <awight>	 small number of errors for the time being.
[13:06:25] <awight>	 lol ores* is averaging 0% CPU
[13:06:30] <awight>	 err...
[13:07:00] <awight>	 Maybe we can install a bunch more memory and dial this way up.  I donno how that decision should be made.
[13:15:26] <akosiaris>	 awight: no I don't think there's anything for you to do. It's just me doing some puppet work (https://gerrit.wikimedia.org/r/#/c/408560/ is already up) and some cleanup work. Only question is when
[13:15:46] <awight>	 :-) take your time, I’m just here to catch falling embers
[13:16:10] <akosiaris>	 what on earth happened 15 mins ago ?
[13:16:17] <awight>	 Do you think those machines are under-RAMmed, btw?  64GB For 32 processors…
[13:16:21] <akosiaris>	 1k req/sec ?
[13:16:28] <awight>	 akosiaris: Some kind of external request load, is all I know.
[13:16:41] <awight>	 I’d love to have a graph of volume per user-agent.
[13:16:45] <awight>	 I think there’s a task for that...
[13:17:59] <akosiaris>	 most of those boxes have a ton of memory free 
[13:18:16] <akosiaris>	 so I am not sure I follow how they are under-RAMmed
[13:18:37] <awight>	 Well, they have 50% of memory free but maybe 0.5% CPU usage
[13:18:43] <akosiaris>	 so ?
[13:18:59] <awight>	 so it seems we’ll be memory-bound when we increase the number of workers, just feels like a waste
[13:19:27] <awight>	 we must boil the ocean with those processors!! :)
[13:19:40] <akosiaris>	 we knew we will always be memory bound. It's the nature of the software
[13:19:48] <awight>	 kk
[13:19:53] <akosiaris>	 https://grafana.wikimedia.org/dashboard/db/prometheus-cluster-breakdown?orgId=1&var-datasource=eqiad%20prometheus%2Fops&var-cluster=ores&var-instance=All btw
[13:20:06] <akosiaris>	 it's a nice cluster breakdown dashboard
[13:20:26] <akosiaris>	 CPU usage will slightly increase when SCB is out of the picture
[13:20:34] <awight>	 I think there are some memory optimizations we can make, fwiw.  Long story short, any read-only memory we can allocate before forking is shared ideally.
[13:20:36] <akosiaris>	 whereas memory usage will remain the same
[13:21:32] * awight bookmarks
[13:24:12] <awight>	 I’m assuming the oscillations in memory used are Python GC.  Gross that it’s happening at the same time across machines, though.  Maybe we can do something to stagger the collection, if that’s the cause.
[13:25:37] <awight>	 lol, I’m not even going there: https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172
[13:29:00] <awight>	 That was a messed-up thing to read in the morning.
[14:09:47] <halfak_>	 o/
[14:10:11] <halfak_>	 FYI, I'll be in late today.  Am at a doctors apt
[14:10:18] <awight>	 enjoy!
[14:10:21] <halfak_>	 Probably online in 1.5 hours
[14:10:50] <awight>	 I have my local changeprop sending events to ores/jade, which parses the judgment and stores in a reasonable place :D
[14:11:15] <halfak_>	 Nice!
[14:11:23] <awight>	 I’m thinking I’ll add a pseudo scoring model, which deals with judgment things.
[14:11:49] <awight>	 It seems to fit nicely into the models that way.
[14:12:26] <awight>	 When you’re back, I’d like some conceptual review before I do the cleanup.
[14:49:11] <halfak>	 o/
[14:50:30] <awight>	 halfak: https://phabricator.wikimedia.org/P6706
[14:50:41] <awight>	 Some extremely rough edges to deal with yet
[14:51:03] <awight>	 Did you see my note about disguising jade as a scoring model?
[14:51:19] <awight>	 I’m no longer sure that it fits well.
[14:51:24] <halfak>	 Yeah.  I've been considering it.  The idea is interesting. 
[14:51:37] <awight>	 We don’t want to score on-demand.
[14:51:41] <halfak>	 The problem is that JADE will soon have judgements that have nothing to do with our prediction models. 
[14:51:54] <awight>	 I think that’s fine, right?
[14:52:09] <halfak>	 Well, we'll be transferring a lot of irrelevant data. 
[14:52:44] <halfak>	 Given that Amir put in some work into having a RAW format without line breaks to save on bandwidth, it seems we're taking several steps back. 
[14:53:08] <awight>	 You mean, in the most common case that a client requests all models for a rev_id?
[14:53:44] <halfak>	 awight, well I'm not sure that's the most common case for a client. 
[14:53:48] <halfak>	 But it is common for our precaching. 
[14:53:57] <halfak>	 So it ends up being the most common request. 
[14:54:12] <halfak>	 I'd like to be able to get the minimal amount of information I need in a request. 
[14:54:19] <awight>	 yeah precaching is where this is an extremely bad fit, we want to *not* do anything for JADE in response.
[14:54:31] <awight>	 The precaching endpoint doesn’t actually return data, does it?
[14:54:32] <halfak>	 Agreed. 
[14:54:38] <halfak>	 It does
[14:54:44] <awight>	 we should disable that.
[14:54:48] <halfak>	 no
[14:54:51] <awight>	 ?
[14:54:56] <awight>	 Changeprop doesn’t do anything with the response.
[14:54:59] <halfak>	 If anything needs to be scored based on the event it returns the score that was generated. 
[14:55:00] <halfak>	 It will
[14:55:09] <halfak>	 It will be part of the scored-revision event feed. 
[14:57:40] * awight fumbles looking for the production change-prop config
[14:57:58] <halfak>	 Not sure why the current config matters
[14:58:47] <halfak>	 https://phabricator.wikimedia.org/T167180
[14:59:06] <awight>	 Got it.
[15:00:26] <halfak>	 https://gist.github.com/halfak/6945c4763be32530dc0ab9c8906d72b5
[15:00:28] <halfak>	 awight, ^ 
[15:00:30] <awight>	 I want to return a “no score available” response from JadeModel, if we try to pseudo-score an recent entity (don’t know how to detect that)
[15:00:43] <awight>	 In which case, we would omit the “judgment” score from any output.
[15:01:01] <awight>	 or we’d return “judgment”: false perhaps
[15:01:24] <halfak>	 What's "JadeModel"?
[15:01:49] <awight>	 the hypothetical pseudoscorer that lets Jade sit in for a scoring model
[15:02:05] <halfak>	 Ahh yeah.  So, I'm not convinced that is a good idea. 
[15:02:09] <awight>	 In the case that our rev_id is old enough, it would do the mwapi work.
[15:02:12] <awight>	 me neither.
[15:02:28] <awight>	 It just happens to let us reuse all of the score caching stuff
[15:02:32] <halfak>	 But I do like the idea of trying to find some way to fit this in the current abstractions. 
[15:02:37] <halfak>	 So it's attractive in a way. 
[15:02:43] <halfak>	 right. 
[15:03:06] <awight>	 One thing about your “false” paste above—we would omit the judgment.meta data as well.
[15:03:17] <halfak>	 Why omit that?
[15:03:23] <awight>	 ooh
[15:03:26] <awight>	 I see what you’ve done
[15:03:32] <awight>	 judgment is nested under "damaging"
[15:03:41] <awight>	 I don’t know…
[15:04:08] <halfak>	 Right.  I think we can use ORES config to say something like: judgement_path: ["diff", "damaging"]
[15:04:09] <awight>	 What about the structure in my earlier paste, where judgment is parallel to damaging?  That lets us have unrelated predictions, as you were saying.
[15:04:29] <awight>	 err s/predictions/judgments/
[15:04:47] <halfak>	 Right.  if it isn't related to a prediction, ORES shouldn't return it. 
[15:04:53] <awight>	 whoa
[15:06:29] <halfak>	 Refresh https://gist.github.com/halfak/6945c4763be32530dc0ab9c8906d72b5
[15:06:35] <halfak>	 I just added something else. 
[15:06:52] <halfak>	 It helps see why I think dropping judgement there makes sense from a consumer perspective. 
[15:07:10] <halfak>	 Compare to https://ores.wikimedia.org/v3/scores/enwiki/34234210/damaging?features
[15:07:56] <awight>	 okay, wfm
[15:08:11] <awight>	 Will there be a way to pull all judgments, whether or not they have an ORES equivalent?
[15:08:45] <halfak>	 I think that should be part of the JADE API and not ORES
[15:08:50] <awight>	 And if a context only has the “reverted” model, would we want to still provide damaging.judgment, a la wikilabels?
[15:09:08] <halfak>	 ORES should be limited. 
[15:09:13] <awight>	 kk
[15:09:13] <halfak>	 Interesting question. 
[15:09:44] <halfak>	 I think that maybe we'd decide to include "damaging" with "reverted" as it is a proxy 
[15:09:52] <halfak>	 But not "goodfaith"
[15:09:57] <halfak>	 As it is not a proxy
[15:10:39] <awight>	 there’s a lot of traversing, you’re probably right that it should be configurable.  e.g. judgment.damaging.data.damaging -> scores.damaging.judgment.data
[15:10:42] <awight>	 yuck
[15:11:01] <wikibugs>	 10Scoring-platform-team (Current), 10editquality-modeling, 10User-Ladsgroup, 10User-Tgr, 10artificial-intelligence: Train/test damaging and goodfaith model for Hungarian Wikipedia - https://phabricator.wikimedia.org/T185903#3975631 (10Halfak) No worries :)  Just taking inventory and checking on what we h...
[15:15:32] <halfak>	 I'm imagining that it will be helpful to our users if *we* make the link between a judgement schema and a prediction model. :)
[15:15:40] <halfak>	 Also you can do things like this:
[15:16:12] <halfak>	 my_score = doc['scores'][str(revid)]['damaging']
[15:16:37] <awight>	 +1 sounds good
[15:16:38] <halfak>	 if my_score['score']['prediction'] != my_score['judgement']['data']: do something different
[15:17:35] <halfak>	 Maybe we can employ ?judgements= like we do with ?model_info= and ?models=
[15:17:56] <halfak>	 e.g. we could ask ORES to predict "damaging" but give us the "damaging" and "goodfaith" judgements if available. 
[15:18:24] <halfak>	 judgements=damaging|goodfaith
[15:18:55] <awight>	 Why restrict the predictions, in that case?
[15:19:12] <halfak>	 No good reason that I can think of. 
[15:19:27] <halfak>	 But just thinking that it's an option and it would fit with other things we do in ORES
[15:19:53] <awight>	 Thanks for the ideas, making this fit into ORES is the biggest challenge left
[15:20:03] <halfak>	 awight, agreed.  
[15:20:19] <halfak>	 Seems like expanding ScoreCache is the right next step. 
[15:21:07] <awight>	 oh?
[15:21:12] <halfak>	 it'll be hard to deal with the case where we have a judgement but we have not generated a score yet. 
[15:21:14] <awight>	 I found it could stay untouched
[15:21:23] <halfak>	 Oh!  Maybe we can score everything that has a judgement. 
[15:21:28] <awight>	 I’m not sure that’s an issue
[15:21:34] <awight>	 I think they’re independent
[15:21:38] <halfak>	 What is?
[15:22:01] <halfak>	 Are you imagining that you'll store the judgement under a different key?
[15:22:06] <awight>	 If a judgment exists, then a scoring request for the revision picks it up transparently.  if not, we’re not obliged to fetch the judgment
[15:22:08] <awight>	 yes
[15:22:13] <awight>	 I realized it’s already set up that way.
[15:22:14] <awight>	 :D
[15:22:15] <halfak>	 If you do, then it'll be common to find a judgement but no score cased and vice versa. 
[15:22:23] <halfak>	 *cached
[15:22:32] <awight>	 ores:wiki:damaging:21:0.0.0 ores:wiki:judgment:21:0.0.0
[15:23:10] <awight>	 no problem though.  If the top-level scorer sees a cached judgment, it grabs that and continues with normal scoring.
[15:23:15] <awight>	 Already working!
[15:23:24] <halfak>	 But what if it doesn't see a cached judgement?
[15:23:32] <awight>	 That’s where it gets funky
[15:23:37] <awight>	 It will call our JadeModel
[15:23:49] <halfak>	 Right.  Seems to me that it's better to always have a score/judgement blob together. 
[15:23:55] <halfak>	 Like you were thinking yesterday
[15:24:16] <awight>	 Ideally, that would have two outcomes: * recent? => return false, no score, * old? => fetch judgment from mwapi
[15:24:32] <awight>	 but my point is that the scores are already split up by model, which is perfect for us.
[15:24:36] <halfak>	 We have no way of asking the "recent" question. 
[15:24:42] <awight>	 right, that burns.
[15:25:04] <halfak>	 awight, so why not have judgements split up by model-schema relationship?
[15:25:20] <awight>	 oof.
[15:25:47] <halfak>	 Yeah.  Not clear which is the best way. 
[15:26:24] <halfak>	 In thinking about recentness, it seems to me that we can drop that and just always look up a judgement when scoring if no judgement is already in the cache. 
[15:26:59] <halfak>	 If a judgement is present in the cache then we can guarantee there is a score with jade-ores listener
[15:27:07] <awight>	 If we store judgment bits alongside each model’s score, we end up with some kind of hash storage for each score, and a special slot for judgments.  If we keep judgments separate, we need to add a post-processing step that re-maps judgment data into each score.
[15:27:53] <halfak>	 ^ and we need to handle the cases of score, but no judgement and judgement but no score
[15:28:23] <awight>	 I think I see what you mean.  So if there’s any ORES score available, then we don’t have to fetch the judgment.  If there’s no ORES score, always search for a judgment.
[15:28:35] <halfak>	 Right. 
[15:29:11] <halfak>	 Fun story, if we do this then we also have guarantees about a quick cached ORES score being available for every judgement that was recently added to MW-JADE
[15:30:19] <awight>	 I don’t see how that would work, I guess you’re suggesting we take the create_judgment feed and if no score exists, run it through precaching?
[15:31:43] <awight>	 halfak: Hey don’t stop brainstorming here, but put this on your review queue, https://github.com/wiki-ai/ores/compare/jade_data
[15:32:32] <awight>	 There’s another tiny piece, https://gerrit.wikimedia.org/r/#/c/410542/3/puppet/modules/ores/templates/ores.yaml.erb
[15:36:23] <halfak>	 awight, right. 
[15:36:36] <halfak>	 (Popped out to get a snack)
[15:36:37] <awight>	 Wait.  I’m back to something we talked about yesterday.  What if we let JadeModel always pull from mwapi?  It’s lighter-weight than the other scorers, and should happen in parallel.  We can insert “false” judgments with an exclusive-create perhaps, so that Jade changeprop will always take priority.
[15:37:10] <awight>	 Then the JADE glue to file judgments into each model is just a minor postprocessing step.
[15:37:40] <halfak>	 IO for scorermodels gets bundled
[15:37:46] <halfak>	 Not parallel
[15:37:48] <halfak>	 hmm
[15:38:20] <awight>	 It also makes the JadeModel conceptually closer to a normal scorer
[15:38:43] <halfak>	 Sure.  I figured that's what you were already doing. 
[15:39:04] <awight>	 Pretty close, but I haven’t written the “always fetch” part until we work out the ideal behavior
[15:39:07] <halfak>	 E.g. make a dependency that knows how to look up a judgement for the target rev_id
[15:39:13] <awight>	 +1
[15:39:20] <halfak>	 Always fetch as in never-cache?
[15:39:27] <awight>	 no, sorry
[15:39:33] <awight>	 always fetch if nothing is cached
[15:39:39] <awight>	 if “false” is cached, use that.
[15:39:47] <halfak>	 Isn't that what we have been talking about?
[15:39:49] <awight>	 (probably “{}” rather than “false”)
[15:40:03] <awight>	 eh I had a slightly different understanding until now.
[15:40:07] <awight>	 So I guess we’re on the same page.
[15:40:18] <awight>	 I’m suggesting we don’t tamper with existing score caching
[15:40:29] <halfak>	 Oh... maybe you're saying that when we fetch and there's no matching judgement, we'd store a placeholder so we know not to fetch again
[15:40:34] <awight>	 yes
[15:40:49] <halfak>	 I don't see what this has to do with existing score caching?
[15:40:52] <halfak>	 -?
[15:40:52] <AsimovBot>	 Asimov v. 2, By jem (IRC) / -jem- (Wikimedia), 2010-18 - Bot IRC de apoyo a los proyectos y al movimiento Wikimedia programado en PHP - Las órdenes deben escribirse precedidas de alguno de los prefijos admitidos (@!-=) - 13Lista de órdenes: -ord - 13Problemas o sugerencias: -sug - 13Ayuda: -? 15,02orden / -ic - 10http://wikimedia.es/asimov?uselang=en
[15:40:55] <awight>	 and that we let the judgments be cached on their own, rather than trying to HSET into existing scores
[15:40:57] <awight>	 lol
[15:40:57] <halfak>	 o/ AsimovBot 
[15:41:08] <awight>	 you found its special purpose!
[15:41:17] <awight>	 -ord
[15:41:17] <AsimovBot>	 Órdenes: ab abraza acad act actualidad actualizador ad alias alusiones art ascii aviso ayuda ayudando ayudante ayudanteop biblio bibliotecario bin bing blist bot botop buferes bug cab cac cafe calc cas cat cb cdb char comp creadores cu d dec demoda dest diff dominio dpd drae expand fetch filtro flames gallery gblock google grupo guser hex hiperignore hora ic ide ignore ip isbn l links lista log logs ls13 =>
[15:41:20] <AsimovBot>	 mant mem msg mw nuke oldid op operador ord ort otrs otrsteam pag patea phab php ping pong prefijos proyecto radio random rank rb re reconocido relevo reset restringir spam stats status sug sul tam tarea text ticket ticketid tra user v vad vec vot webchat wikinick wikirank wle wlm
[15:41:22] <halfak>	 awight, OK but then we have the problem of handling many cases
[15:41:31] <halfak>	 And weird LRU behavior
[15:41:44] <awight>	 I don’t think LRU behavior matters any more
[15:41:58] <awight>	 We already have each model-score going out of cache independently.
[15:42:01] <halfak>	 It does if we store judgement separate from the score
[15:42:28] <awight>	 If we try to HSET, then we have bits of the judgment going out of cache separately.
[15:42:49] <awight>	 if “damaging” is cached but “goodfaith” has expired, then we have to pull the entire judgment anyway
[15:42:52] <halfak>	 Right
[15:43:00] <halfak>	 And the score
[15:44:48] * awight is wondering if the “?” has arrived
[15:45:14] <halfak>	 Not sure I have a question...
[15:45:27] <awight>	 If judgment is still in the cache in the scenario I just gave, then we only have to recalculate “goodfaith”, then postprocess the judgment bits into that.
[15:45:51] <awight>	 judgment will be the last thing to expire from LRU in this design
[15:45:59] <awight>	 hrm
[15:46:06] <halfak>	 Why would it be the last thing to expire? 
[15:46:31] <awight>	 still a nastiness about having to inject “judgment” into the list of models to fetch if we get a request for models=goodfaith|damaging
[15:47:01] <halfak>	 "nastiness"?
[15:47:14] <awight>	 Well, change to the existing score processing design.
[15:47:52] <awight>	 It would be the last thing to expire because we would always retrieve the judgment from cache if processing any model that might include judgment bits.
[15:48:13] <halfak>	 Unless the user tells us they don't want them. 
[15:48:42] <halfak>	 I guess we could touch the judgement and not use it. 
[15:48:43] <awight>	 +1, sounds like a good thing to support.
[15:49:22] <awight>	 It would be harmless to let it go out of cache first of course, it can just be fetched from mwapi again.
[15:52:18] <halfak>	 I guess.  Seems like we're handling 2x as many cases by needing to store separately. 
[15:52:44] <awight>	 It feels less disruptive, to me.
[15:52:51] <halfak>	 "disruptive"
[15:53:08] <awight>	 Making major changes to scoring that we’ll end up reverting...
[15:53:55] <awight>	 It would be nice if we had another class of things like judgments that we needed to shoehorn in here, it might make the generalizations more obvious.
[15:55:52] <halfak>	 awight, right.  So I'm worried about shoehorning a pile of judgements into the wrong abstraction (ScorerModel) 
[15:56:05] <awight>	 That part is still questionable.
[15:56:05] <halfak>	 And I don't think we're talking about major changes if we are adding fields to the cache. 
[15:56:47] <halfak>	 Right now, we just have the "score" field in the cache.  We've toyed with the idea of adding a "features" field. 
[15:56:53] <halfak>	 But we've decided against that. 
[15:56:59] <halfak>	 Ack! 
[15:57:03] <awight>	 I’m not sure how there would be more cases to handle if we store judgments in their own cell; HSET may or may not do exactly that, transparently.
[15:57:05] <halfak>	 Just realized something about cache. 
[15:57:07] <awight>	 uh, oh.
[15:57:28] <halfak>	 Well first for your question: how are there more cases?
[15:58:02] <halfak>	 If we store separately: Score, No judgement; Judgement, No score, Score and judgement; No score, no judgement
[15:58:17] <halfak>	 If we store together: No score, No judgement; Score and judgement
[15:58:43] <halfak>	 So the problem I realized with the cache is that cached scores depend on feature injection. 
[15:58:57] <halfak>	 There's a hash in the key that represents the injected features. 
[15:59:12] <awight>	 But those are orthogonal.  IMO, the cases are actually, judgment vs no judgment.  In one case we return cached, in the other case we calculate.  score vs no score is handled by existing logic.
[15:59:22] <halfak>	 If we store judgement in the cache with the scores, we need to duplicate for different injections. 
[15:59:25] <awight>	 ok, yes I’ve seen the injected_features subkey
[15:59:29] <awight>	 argh
[15:59:37] <awight>	 Well, I guess we just accept that?
[15:59:57] <awight>	 Is it okay that we’ll have the same judgment for each injection?
[16:00:05] <halfak>	 awight, but we have complex logic for subtracting missing revs from revs that are cached. 
[16:00:20] <halfak>	 gotta go to a meeting.  I have thoughts on this though. 
[16:00:23] <awight>	 ooh—but we can’t go around updating judgments for unknown keys
[16:00:24] <halfak>	 back soon. 
[16:00:27] <awight>	 ok
[16:56:29] <halfak>	 back
[16:56:39] <halfak>	 Bah.  I have another meeting coming :( 
[17:31:56] <halfak>	 Lunch!
[18:10:52] <wikibugs>	 (03CR) 10Umherirrender: [C: 032] build: Updating mediawiki/mediawiki-codesniffer to 16.0.0 [extensions/ORES] - 10https://gerrit.wikimedia.org/r/410872 (owner: 10Libraryupgrader)
[18:14:20] <awight>	 Argh.  We have to fetch both Jade:Revision/123 and Jade:Diff/123
[18:27:16] <wikibugs>	 (03CR) 10jenkins-bot: build: Updating mediawiki/mediawiki-codesniffer to 16.0.0 [extensions/ORES] - 10https://gerrit.wikimedia.org/r/410872 (owner: 10Libraryupgrader)
[18:42:35] <halfak|Lunch>	 changing locations.  Back in 60 minutes. 
[18:54:27] <luke-v-hopkins>	 Hello everyone. I'm working with a group for our capstone course in Computer Science at Metropolitan State University of Denver. Our task is to help with a FOSS project. The only requirements are we follow a SCRUM process and we have about a 9 week limit. With that said, does anyone here have any suggestions or can direct us as needed.
[19:04:18] <wikibugs>	 10Scoring-platform-team (Current), 10JADE: Build prototype JADE extension - https://phabricator.wikimedia.org/T187216#3976696 (10awight)
[19:06:06] <awight>	 luke-v-hopkins: Hi, that sounds like fun!  This channel is focused on the following projects, let us know if anything catches your fancy: https://www.mediawiki.org/wiki/Wikimedia_Scoring_Platform_team#Projects
[19:06:27] <awight>	 Otherwise, if you’re interested in Wiki projects in general, you can try the #wikimedia-dev channel...
[19:06:47] <luke-v-hopkins>	 I'll take a look. Thanks.
[19:26:39] <awight>	 halfak: I wrote the last validation for Ext:JADE, it’s ready for review if you’re in that kinda head space.
[19:26:44] <awight>	 *kind o’
[19:26:53] <halfak>	 o/
[19:27:10] <halfak>	 maaaybe
[19:27:51] <halfak>	 I really want to finish off some work I left from yesterday -- Bengali Wikisource and Bosnian.  I should have some time after that depending on how my meetings shuffle out at the end of the day. 
[19:28:31] <awight>	 Sounds good
[19:28:52] <awight>	 Am*r will want to review anyway, so we might as well leave it all for next week
[19:30:39] <awight>	 FYI, I can rough out the suppression support for ores.jade today, but next Monday will probably be blocked until we decide on an appropriate abstraction for ¿JadeModel
[19:38:57] <halfak>	 Gotcha.  
[19:39:04] <halfak>	 Also Monday is the holiday :P 
[19:40:56] <awight>	 harr, no work
[19:40:58] <awight>	 I see
[19:49:34] <halfak>	 https://github.com/wiki-ai/editquality/pull/129
[19:53:21] <awight>	 akosiaris: Only if you happen to know—I’m trying to understand what’s responsible for rendering deployment_vars into the deployed changeprop config file.
[19:53:41] <awight>	 I’d like to compare the real config file with a vagrant thing I’m working on,
[19:53:44] <awight>	 *.
[19:54:59] <awight>	 I may have found the glue: https://gerrit.wikimedia.org/r/mediawiki/services/change-propagation/deploy
[19:56:46] <awight>	 yup, that was it.  scap/templates/config.yaml.j2
[19:58:55] <awight>	 halfak: Fun side note, the change-propagation config is rendered using the same templating language as code-generation.  We might want to build new change-prop configs as we deploy wikis.
[20:05:03] <luke-v-hopkins>	 I've looked over each of the projects briefly, awight. There are a few surface level questions I have if you are anyone else is available to discuss them?
[20:05:29] <awight>	 luke-v-hopkins: Please go ahead and ask here!
[20:06:09] <luke-v-hopkins>	 Alright. First, what are the present goals of this subset of MediaWiki? I saw the mission, but how do you envision your teams progress this year for instance?
[20:07:36] <halfak>	 awight, we don't use changeprop configs anymore ^_^
[20:07:43] <awight>	 harrr
[20:07:54] <awight>	 We certainly have a lot of cruft in there still
[20:07:59] <halfak>	 o/ luke-v-hopkins 
[20:08:01] <halfak>	 Welcome!
[20:09:39] <awight>	 halfak: Are you sure about that?  I don’t know of any other explanation for why scb1002 still has busy web workers.
[20:12:57] <awight>	 luke-v-hopkins: Good question.  Our immediate goals can be seen here: https://phabricator.wikimedia.org/tag/scoring-platform-team-current/
[20:13:15] <awight>	 Speaking to your course requirements, it’s not a strict agile/scrum process.
[20:14:03] <awight>	 luke-v-hopkins: Here are our quarterly goals, https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2017-18_Q3#Program_5._Scoring_Platform_(ORES)
[20:14:42] <awight>	 luke-v-hopkins: Here are the alternative directions we might take for “fiscal year” 2019, which begins in July: https://www.mediawiki.org/wiki/Wikimedia_Scoring_Platform_team/FY2019
[20:14:59] <awight>	 For your purposes, it sounds like you might be most interested in our quarterly goals.
[20:16:10] <awight>	 Of course, these goals were prepared with the current team in mind, so if you have specific interests or strengths, it might make sense if you pursued a new project entirely.  We’d be happy to talk about that, if so.
[20:16:46] <awight>	 It depends on whether you feel like working closely with other people, on your own, synchronously vs. asynchronously, etc.
[20:19:44] <luke-v-hopkins>	 Thanks, awight. A entirely new project might be out of scope of our timeline. With that said, here is my next question.
[20:20:01] <awight>	 :)
[20:20:31] <luke-v-hopkins>	 What types of low to moderate-priority contributions would be beneficial in say the 'revscoring' or 'Wiki-Labels' projects?
[20:22:23] <wikibugs>	 10Scoring-platform-team, 10ChangeProp, 10ORES, 10Services (doing): Change ORES rules to send all events to new "/precache" endpoint - https://phabricator.wikimedia.org/T158437#3036977 (10awight) The above PR has been merged.
[20:23:16] <travis-ci>	 wiki-ai/editquality#120 (bigger_samples - 1bdf074 : Aaron Halfaker): The build passed. https://travis-ci.org/wiki-ai/editquality/builds/342036900
[20:25:12] <awight>	 luke-v-hopkins: As part of answering your question, I’ll make queries to our workboards :)
[20:25:27] <awight>	 These are the known tasks for revscoring and wikilabels, https://phabricator.wikimedia.org/project/board/1901/query/VsoxdtTrlfib/
[20:26:00] <awight>	 Anything tagged with “Easy” should be possible with a minimum of coordination effort.
[20:26:17] <awight>	 luke-v-hopkins: You might want to try one of these ^ tasks just to see if the projects suit you?
[20:27:58] <awight>	 If you enjoy writing tests, this is an “epic” task that you can do any amount of, from 1% to 100% :D, https://phabricator.wikimedia.org/T171080
[20:29:53] <luke-v-hopkins>	 Okay. Seeing your project's progress so far, I agree that testing (at least in some form) is the best until I can collaborate a little more with my group a bit more...
[20:30:00] <awight>	 luke-v-hopkins: Lovely!
[20:30:48] <awight>	 luke-v-hopkins: FYI, everyone on our team started as a volunteer, so we’re a pretty good group in terms of having experience with “non-staff” interaction.  Visit IRC any time to check in.
[20:31:29] <luke-v-hopkins>	 Alright. Last question, does your group have a specific protocol for handling tests outside of what MediaWiki suggests?
[20:32:23] <luke-v-hopkins>	 For example, validation from certain people, GitHub pull-requests, etc?
[20:34:13] <wikibugs>	 10Scoring-platform-team (Current), 10Wikilabels, 10editquality-modeling, 10artificial-intelligence: Edit quality campaign for Bosnian - https://phabricator.wikimedia.org/T174784#3977029 (10Halfak)
[20:34:26] <awight>	 luke-v-hopkins: Yes, actually.  The wikilabels repo falls outside of the MediaWiki recommendations.
[20:34:37] <wikibugs>	 10Scoring-platform-team (Current), 10Wikilabels, 10editquality-modeling, 10artificial-intelligence: Edit quality campaign for Bosnian - https://phabricator.wikimedia.org/T174784#3572994 (10Halfak) http://labels.wmflabs.org/ui/bawiki/
[20:34:48] <awight>	 luke-v-hopkins: On the other hand, it’s not entirely configured with the CI that we want to have for our Python repos.
[20:35:38] <awight>	 luke-v-hopkins: Here’s an example repo that’s set up with all of the CI, https://github.com/wiki-ai/ores/commits/master
[20:36:05] <awight>	 Commits and pull requests trigger Travis-CI and codecov.io
[20:36:38] <awight>	 We can happily write the infrastructure glue to trigger CI in the wikilabels repo, unless that would be fun for you?
[20:36:42] <wikibugs>	 10Scoring-platform-team (Current), 10Wikilabels, 10editquality-modeling, 10Bengali-Sites, 10artificial-intelligence: Bengali Wikisource not recognized as a wikimedia wiki - https://phabricator.wikimedia.org/T187503#3977049 (10Halfak)
[20:37:10] <awight>	 Oh—sorry, we *do* have the CI set up for wikilabels already.
[20:37:28] <awight>	 https://codecov.io/gh/wiki-ai/wikilabels
[20:37:54] <awight>	 https://travis-ci.org/wiki-ai/wikilabels/builds
[20:38:19] <awight>	 If you look inside a build, you’ll see, https://travis-ci.org/wiki-ai/wikilabels/builds/341527810
[20:38:31] <awight>	 We’re using flake8, eslint, stylelint, and pytest.
[20:39:10] <awight>	 Once your local virtualenv is set up, you should be able to run all of those tools on your clone.
[20:39:58] <awight>	 luke-v-hopkins: Thanks for your questions!
[20:41:52] <luke-v-hopkins>	 Thank you, awight. I'll take a look over what you just posted myself first, then confer a bit with my group tonight. If there's agreement, I'm sure we'll have more questions. Until then.
[20:42:25] <awight>	 Looking forward to it :)
[20:43:16] <wikibugs>	 10Scoring-platform-team (Current), 10Wikilabels, 10editquality-modeling, 10Bengali-Sites, 10artificial-intelligence: Bengali Wikisource not recognized as a wikimedia wiki - https://phabricator.wikimedia.org/T187503#3977097 (10Halfak) https://github.com/wiki-ai/wikilabels/pull/227
[20:46:29] <wikibugs>	 10Scoring-platform-team (Current), 10Wikilabels, 10editquality-modeling, 10artificial-intelligence: Edit quality campaign for Bosnian - https://phabricator.wikimedia.org/T174784#3977103 (10Halfak) I'll update the name of the campaign as soon as someone gets back to me with a translation.
[20:48:43] <awight>	 halfak: Tiny thing I noticed: ORES will be running on Jade: edits by default.  Meta-moderation ist güt?
[20:49:00] <halfak>	 haha yup
[20:49:33] <halfak>	 Only indications of damage will be in the comment. 
[20:49:58] <halfak>	 damaging: false, "FU Wikipedia :P"
[20:52:14] * apergos peeks in, snickers, disappears again
[20:52:32] <wikibugs>	 10Scoring-platform-team, 10Wikilabels, 10editquality-modeling, 10artificial-intelligence: Edit quality campaign for Bosnian - https://phabricator.wikimedia.org/T174784#3977127 (10Halfak)
[20:53:33] <wikibugs>	 10Scoring-platform-team (Current), 10ORES, 10editquality-modeling, 10artificial-intelligence: Train and test damaging/goodfaith models for Catalan Wikipedia - https://phabricator.wikimedia.org/T186749#3977131 (10Halfak) a:03Halfak
[20:53:55] <wikibugs>	 10Scoring-platform-team (Current), 10Analytics, 10ChangeProp, 10EventBus, and 4 others: Create generalized "precache" endpoint for ORES - https://phabricator.wikimedia.org/T148714#3977133 (10awight) Argh, attached to the wrong bug, please ignore.
[20:54:45] <wikibugs>	 10Scoring-platform-team, 10ChangeProp, 10ORES, 10Patch-For-Review, 10Services (doing): Change ORES rules to send all events to new "/precache" endpoint - https://phabricator.wikimedia.org/T158437#3977139 (10awight) The config patch above smoke tests correctly under mw-vagrant.
[20:55:31] * awight marvels at apergos’s highlight rules
[20:56:12] <wikibugs>	 10Scoring-platform-team (Current), 10ChangeProp, 10ORES, 10Patch-For-Review, 10Services (doing): Change ORES rules to send all events to new "/precache" endpoint - https://phabricator.wikimedia.org/T158437#3977140 (10awight)
[20:57:02] <awight>	 Have a nice looong weekend!
[20:57:41] <halfak>	 awight, taking off?
[20:57:46] <awight>	 o/
[20:57:48] <halfak>	 o/
[20:58:51] <halfak>	 Oh my. 
[20:58:59] <halfak>	 Catalan finished and edit type campaign!
[20:59:02] <halfak>	 Not edit quality
[20:59:11] <halfak>	 Ha!  Let's get a fun model deployed for them :) 
[20:59:30] <halfak>	 I'll get this in front of codezee since this will use his OneVsRest stuff
[21:00:36] <wikibugs>	 10Scoring-platform-team (Current), 10ORES, 10editquality-modeling, 10artificial-intelligence: Train and test edit type model for Catalan Wikipedia - https://phabricator.wikimedia.org/T186749#3977151 (10Halfak)
[21:01:51] <awight>	 HARR
[21:02:00] <awight>	 Accidental, you think?
[21:04:26] <awight>	 added / modified / removed...
[21:04:33] <awight>	 What model is this for?
[21:10:00] <halfak>	 edit_type
[21:10:29] <halfak>	 A long time ago, someone was really interested in that model from catalan wiki.
[21:13:24] <awight>	 haha wow
[21:24:40] <awight>	 k I’m taking a long walk, bye for real