[08:07:52] Machine-Learning-Team, Analytics, SRE: Kubeflow on stat machines - https://phabricator.wikimedia.org/T275551 (elukey) p:Triage→Medium
[14:15:01] Machine-Learning-Team, drafttopic-modeling, Discovery-Search (Current work), Patch-For-Review: Implement CirrusSearch keyword for drafttopic - https://phabricator.wikimedia.org/T268272 (Gehel) Open→Resolved
[14:15:03] Machine-Learning-Team, drafttopic-modeling, Discovery-Search (Current work), Epic: [Epic] Add drafttopic predictions to ElasticSearch index for the Draft namespace where available - https://phabricator.wikimedia.org/T249341 (Gehel)
[16:28:39] Sorry all, I'm a terrible manager for not seeing it. Our mid-week meeting will only be 30 minutes and start after the Tech Department meeting
[16:29:42] Also, I'd love any feedback on this talk page, which we will use as a blog/forum for machine learning at the foundation
[16:29:43] https://www.mediawiki.org/wiki/Talk:Machine_Learning
[16:33:59] Hard to say if people will use it, but I like it :)
[17:09:23] I actually think we have a better use case than a private company. Transparency is our friend in the ML case. And frankly I feel like I'd use it!
[17:12:29] Yeah, I think it's generally a good thing. Hopefully it's discoverable enough (and we advertise it) so people can use it
[17:12:58] There is definitely an issue with what a "model" is if we are retraining models every night
[17:13:02] versions of a model?
[17:13:05] trainings of a model?
[17:13:19] instances?
[17:13:24] revisions?
[17:13:30] let me fetch my thesaurus :)
[17:13:44] generations might also work
[17:17:44] accraze did have the idea of auto-updating a model's mediawiki/wikitech (we should decide which) page with the latest training details, performance metrics, etc.
[17:20:03] yeah i really like the idea of doing something like model cards
[17:20:21] accraze and I are going to tackle taking two existing ORES models and trying to write up cards for them as a proof of concept
[17:21:10] google has some nice examples of model cards here: https://modelcards.withgoogle.com/model-reports
[17:22:34] this paper does a deep dive into using cards for model reporting: https://arxiv.org/abs/1810.03993
[17:23:19] We could even auto-update a model's card with the latest performance metrics, AUC, etc.
[17:23:59] Those cards have a good information density, I love them
[17:24:21] You don't have to be an ML expert, but it's also not just magic and buzzwords
[17:26:52] * elukey never heard of model cards
[17:27:58] They're a bit like product brochures, but useful :)
[17:29:06] Given that a lot of the models we will deploy are from the community, the card also provides a point where folks can get context about what a model that we serve is, who requested it, etc.
[17:29:33] What training data it had is what I'd want to know
[17:29:52] A lot of that knowledge for some ORES models is lost in a sea of closed phab tickets, and that violates our goal of transparency
[17:30:16] Backfill of that data sounds both daunting and desirable
[17:30:44] Yeah, we will do some, but we will have to rely on the community for that too because there are 100+ models right now
[17:30:59] Are we considering a freshness horizon for models?
[17:32:09] I mean, not everyone needs to be retrained at the same frequency, but is a model trained on 10-year-old data (and with 10-year-old ML best practices) not a liability?
[17:32:50] s/everyone/everything/
[17:44:08] yeah this is true, all models degrade at different rates
[17:47:15] Love that Chris covered it in the meeting seconds after I mentioned it here :)
[17:50:12] Just-in-time presentations
[22:05:41] Machine-Learning-Team: Model Reporting - https://phabricator.wikimedia.org/T276397 (ACraze)
[22:08:20] Machine-Learning-Team: Experiment with on-wiki model documentation - https://phabricator.wikimedia.org/T276398 (ACraze)
[22:08:49] Machine-Learning-Team: Experiment with on-wiki model documentation - https://phabricator.wikimedia.org/T276398 (ACraze)
[22:08:51] Machine-Learning-Team: Model Reporting - https://phabricator.wikimedia.org/T276397 (ACraze)
[22:10:56] artificial-intelligence, Machine-Learning-Team (Active Tasks): Model Inventory - https://phabricator.wikimedia.org/T275709 (ACraze)
[22:10:57] Machine-Learning-Team: Model Reporting - https://phabricator.wikimedia.org/T276397 (ACraze)
[22:19:39] Machine-Learning-Team: Experiment with on-wiki model documentation - https://phabricator.wikimedia.org/T276398 (ACraze)
[22:33:00] Machine-Learning-Team: Model Reporting - https://phabricator.wikimedia.org/T276397 (ACraze)
[22:42:17] Machine-Learning-Team: Model Reporting - https://phabricator.wikimedia.org/T276397 (ACraze)
[22:45:30] artificial-intelligence, Machine-Learning-Team (Active Tasks): Model Inventory - https://phabricator.wikimedia.org/T275709 (ACraze)
[22:51:56] Machine-Learning-Team: Experiment with on-wiki model documentation - https://phabricator.wikimedia.org/T276398 (ACraze) @calbon do you have a preference on which models to try this out with? I was thinking maybe the `en.damaging` editquality model and maybe the en.articlequality model, but am open to any oth...
[23:22:21] Machine-Learning-Team: Experiment with on-wiki model documentation - https://phabricator.wikimedia.org/T276398 (calbon) Let's try whichever one is the easiest to start out with. That'll give us a good idea of what the lowest-hanging fruit is.
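
The auto-updated model card idea from the 17:17-17:23 discussion could look roughly like the following: render the card's wikitext from held-out evaluation results, then publish it via the MediaWiki API. This is a minimal sketch, not the team's implementation; the page title, wiki host, metric set, and credentials are illustrative assumptions. mwclient and scikit-learn are real libraries (pip install mwclient scikit-learn).

```python
"""Hypothetical sketch: refresh an on-wiki model card after a retrain."""
import datetime

import mwclient
from sklearn.metrics import precision_score, recall_score, roc_auc_score

# Assumed card layout; the real fields would follow the model cards
# paper (https://arxiv.org/abs/1810.03993).
CARD_TEMPLATE = """\
== Model card: {name} ==
* '''Last retrained:''' {trained}
* '''Training data:''' {data}
* '''ROC AUC:''' {auc:.3f}
* '''Precision:''' {precision:.3f}
* '''Recall:''' {recall:.3f}
"""


def build_card(name, data_desc, y_true, y_score, threshold=0.5):
    """Render card wikitext from held-out labels and model scores."""
    y_pred = [int(s >= threshold) for s in y_score]
    return CARD_TEMPLATE.format(
        name=name,
        trained=datetime.date.today().isoformat(),
        data=data_desc,
        auc=roc_auc_score(y_true, y_score),
        precision=precision_score(y_true, y_pred),
        recall=recall_score(y_true, y_pred),
    )


def publish_card(wikitext, title, user, password):
    """Overwrite the model's card page via the MediaWiki API."""
    site = mwclient.Site('www.mediawiki.org')  # assumed host (vs. wikitech)
    site.login(user, password)                 # a bot account in practice
    site.pages[title].save(wikitext, summary='Bot: refresh model card metrics')
```

In practice something like this would run as the last step of a nightly retraining pipeline, so the on-wiki card always reflects the most recent training run rather than whatever was written when the model shipped.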
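
The "freshness horizon" question (17:30:59) also lends itself to a small sketch. Since all models degrade at different rates, each entry in a model inventory could carry its own horizon, and anything past it gets flagged for retraining. The inventory structure, model names, and horizon values below are illustrative assumptions, not data from the actual 100+ model inventory.

```python
"""Hypothetical sketch: flag models past their freshness horizon."""
from datetime import date, timedelta

# Illustrative inventory entries; horizons are per-model because
# models degrade at different rates.
MODEL_INVENTORY = [
    {"name": "enwiki-damaging", "trained": date(2021, 2, 15), "horizon_days": 180},
    {"name": "enwiki-articlequality", "trained": date(2019, 6, 1), "horizon_days": 365},
]


def stale_models(inventory, today=None):
    """Return models whose last training run is older than their horizon."""
    today = today or date.today()
    return [
        m for m in inventory
        if today - m["trained"] > timedelta(days=m["horizon_days"])
    ]


if __name__ == "__main__":
    for m in stale_models(MODEL_INVENTORY):
        print(f"{m['name']} is past its freshness horizon; consider retraining")
```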