[19:08:41] <leila>	 isaacj: thanks for your work and update in T238437#6844215 . Is the task going to be on the enwiki data? and if so, does it have to be?
[19:08:42] <stashbot>	 T238437: Identify and prepare a data-set for Fair Ranking Track at TREC - https://phabricator.wikimedia.org/T238437
[19:10:43] <leila>	 dsaez: thanks for the update at T260566#6844279 . What does announcing the model to the NLP group at Cambridge mean? are they asked to take a particular action based on the announcement? 
[19:10:45] <stashbot>	 T260566: Develop 1 model to identify Misinformation - https://phabricator.wikimedia.org/T260566
[19:11:37] <dsaez>	 leila, just fixed, it was the datasets, not models :D
[19:12:05] <leila>	 dsaez: oh! thanks! :D and are they asked to do something with the dataset?
[19:12:48] <dsaez>	 we are planning to start a formal collaboration
[19:13:32] <isaacj>	 leila: yes - we discussed multilingual datasets but chose not to do it this year for a few reasons: what we're going to propose is already pretty hard and adding the multilingual challenge was decided to be too much for one competition. the "hard" parts for this challenge are supposed to be about the fairness components so we wanted to make sure we did a good job with that first. in the past, these competitions have also been 
[19:13:32] <isaacj>	 iterated on over multiple years so if this goes well, we're considering that as an extension for next year. gathering data on wikiprojects in other languages is pretty difficult too. it's more or less straightforward for arabic, hungarian, turkish, and french because they use the PageAssessments extension that english wikipedia uses but I didn't have great insight into the activity levels on those wikis and whether the data we'd 
[19:13:32] <isaacj>	 extract would be high quality.
[19:15:31] <leila>	 dsaez: (I suspect you have a lot of context that I may be missing.) you already made the data-set public, correct? how does the formal collaboration enter? 
[19:16:29] <dsaez>	 the dataset will be published next week
[19:16:59] <dsaez>	 and we ran some baselines, but we need to create better algorithms
[19:17:06] <leila>	 isaacj: excellent. thank you! please add that in the task description (what dataset you work on and possible future data-sets). it's good to have it documented that you've considered it and there are solid reasons for delaying adding languages.
[19:18:17] <leila>	 dsaez: got you. I thought you were talking about this dataset that is already published, but now I remember there are multiple of them: https://zenodo.org/record/4433137#.YDAOy3WYWEA . All clear. thanks.
[19:26:48] <isaacj>	 leila: sounds good
[19:36:14] <leila>	 isaacj: thanks so much for responding to the external question we received.
[19:36:44] <isaacj>	 :thumbs up: easy one to do