[14:54:26] "It's, Oh, So Quiet ..."
[15:45:18] dsaez: i was going to send you an email but I'll use IRC :) i fixed the language on https://wiki-topic.toolforge.org/ so that it doesn't point to the wrong dashboard anymore. thanks for pointing that out. also, at this point we should discuss pushing people towards the outlink-based model because I think that's the better fit for most people's needs (topics for Wikipedia articles in a bunch of languages) though I did note some outliers
[15:45:18] in the model predictions: https://meta.wikimedia.org/wiki/Research:Language-Agnostic_Topic_Classification/Outlink_model_performance/All_wikis#Outliers_and_Anecdotal_Observations
[15:57:55] hey isaacj. This week I'll have a meeting with Connie from product analytics; they want to use the model, and we will discuss how to productionize it. I know how to use the previous one end-to-end, and it is "easy" to run it periodically. Maybe we can talk beforehand, or you can join that meeting, and we can discuss which is more suitable for their needs.
[16:00:08] yeah, i'd be happy to join. it's easy to run periodically though not at production-level, which i think would require someone with better data engineering skills than i. right now, i collect all pages + outlinks via PySpark but then push it to stat1007 to do the final prediction step (only takes a few hours even though it's sequential and not in parallel)
[16:00:13] dsaez: ^^
[18:21:14] hello! just saw this note about inviting Isaac to the meeting with Product Analytics. happy to do so ! :)
[18:38:05] thx mayakpwiki
[19:03:52] as well -- thanks for adding me mayakpwiki
[19:29:45] isaacj, let's see which is the easiest model to implement. For this kind of scenario I think stability is more important than precision
[19:48:20] yeah, that makes sense. hopefully they give similar results but i actually haven't checked that yet