[00:17:17] Quarry, Patch-For-Review: Quarry should refuse to save results that are way too large - https://phabricator.wikimedia.org/T188564 (zhuyifei1999) >>! In T188564#4795463, @Framawiki wrote: > If they were not stored as sqlite files would the problem be partially resolved? I have trouble seeing the interest...
[16:41:07] * leila waves to the channel
[16:41:50] ciao leila :)
[16:42:01] miriam: ciao. :) what's going on?
[16:42:28] dsaez: I don't know if I ever mentioned this to you. Martin, who will talk to us today, is the one you requested to come talk with us, re topic modeling and the WSDM submission.
[16:47:23] cool
[16:49:47] dsaez: did you see Eriq's email?
[16:49:54] yep
[16:50:23] I didn't get exactly what the problem is, but I'll read it again later, I'm trying to finish the rec api
[16:53:51] dsaez: the examples and categories of problems he has reported seem to be things we should be able to fix (via synonym detection?)
[16:54:45] dsaez: will you have a chance to talk with Eriq before you sign off? I want to understand if he's blocked on you or he can update the synonym detection model or the clean up steps on his own
[16:55:16] let me see his email again
[16:55:37] ooki
[16:56:02] ooh, I see
[16:56:11] this is what I reported a long time ago
[16:56:22] it's a recall problem on the ground truth
[16:57:10] some of them should be fixable no? for example, the en_Music video example and the issue of accent sensitivity?
[16:57:14] dsaez: ^
[16:58:27] dsaez: or case normalization.
[16:58:29] accents are part of the language
[16:58:50] dsaez: do you expect the algorithm to get the accents right?
[16:59:04] they're not accessory things, it would be unacceptable to recommend misspelds words
[16:59:36] *misspell
[16:59:51] accents are not optional in Spanish.
[17:00:22] dsaez: this is an alignment problem (Even for recommendation we can discuss). In terms of alignment, we should align them if the issue is just the accent, no? there is an editor in the loop for any application and that editor can make the judgment call as to accept or fix and use.
[17:00:49] we can discuss case sensitivity, but I still don't think that is correct
[17:01:02] but ó and o are two different things
[17:01:31] In Spanish a word can completely change its meaning
[17:01:36] depending on the accent
[17:02:02] we have a recall problem in the ground truth that is more general
[17:03:21] assuming that getting the full ground truth is impossible, we have two options: 1) use the ground truth that we have, and assume that we are reporting a lower bound for our precision, or 2) add a new layer of manual evaluation.
[17:03:26] dsaez: we should figure out if we can fix that. That can explain part of the reason that precision @ 1 is not high in some of the pairs. In Spanish, can you give some examples of words (that can be section titles) where a change in accent can completely change their meaning while the rest of the letters are the same?
[17:04:21] I would need to look for examples, but this is just not correct. It's like saying that V and B are the same letter
[17:04:22] dsaez: yeah. the lower bound is definitely something we should emphasize (I took a note now), but let's see if we can change that as well.
[17:04:29] haha
[17:04:51] dsaez: I'm not attacking Spanish. ;) I can imagine that can happen, I'm looking for some examples to see if at the section level this can happen often
[17:05:32] dsaez: the other example about gendered section titles is a good one, too. and that one actually is interesting to figure out how to address as en, for example, doesn't have gender
[17:05:47] I don't know, but if we want to expand this method to all the languages, we shouldn't consider cherry-picked filters.
[17:05:48] actually in can. ;)
[17:06:00] yeah. I understand that point. :)
[17:07:25] As you know, I'm not a big fan of putting make-up on the results. I would prefer to explain the limitations of our evaluation and emphasize that we are reporting a lower bound,
[17:07:37] the gender thing is also tricky.
[17:07:50] dsaez: I'll attend Tech Mgmt now, will come back here later.
[17:07:51] especially for the knowledge gap program ;)
[18:07:21] dsaez: are you joining us?
[20:14:18] J-Mo1: do you have some time to do one pass over http://52.47.121.188:8080/ and provide feedback? (for the eliciting new editor interests experiment)
[20:14:37] J-Mo1: we're trying to send it out of the door this Friday or next Monday. Hopefully we won't miss this deadline.
[20:15:00] J-Mo1: I'm heading to lunch now. Open to have a chat later (virtually or here).
[20:19:16] leila|lunch: sure thing!
[21:34:08] Quarry, Patch-For-Review: Quarry should refuse to save results that are way too large - https://phabricator.wikimedia.org/T188564 (Framawiki) >>! In T188564#4796104, @zhuyifei1999 wrote: >>>! In T188564#4795463, @Framawiki wrote: >> The problem is that the recording at the end of the workers' task is to...
[21:41:31] J-Mo1: thanks for your feedback. catching up.
[21:41:40] np leila
[21:46:26] J-Mo1: Ok. done with reviewing your feedback. thanks. one question, and you mentioned it in your email: I think we have to make the instructions /much/ shorter and simpler.
[21:46:57] J-Mo1: some of the things described there are things people will see when they are confronted with the question. Explaining the details at first can be deterring/confusing.
[21:47:15] J-Mo1: I'm also thinking it's good to separate the instruction page from the questionnaire.
[21:47:45] J-Mo1: the idea would be that first you read the instructions, and only when you click "I read the instruction and I'm ready to go" or simply "Next" ;) you can see the questions.
[21:47:59] J-Mo1: otherwise, I would skip reading the questions and would get to the first question.
[21:48:00] yep, I agree with all of this leila!
[21:48:19] J-Mo1: Ok. Let me try to make it shorter, run it by you, and send it to Ramtin then. thanks!
[21:48:39] you can just send it to Ramtin, leila. You clearly know what you're talking about here ;)
[21:48:48] haha ok.
[21:51:17] J-Mo1: what about the font size differences? :D
[21:51:39] J-Mo1: should the question text be bigger than the lists and response options? it confuses me.
[21:51:53] I agree.
[21:52:11] J-Mo1: is there a best practice on font size/type?
[21:52:12] :D
[21:52:58] question leila: I had some of these same suggestions, but decided not to make them because I was worried it would feel like overkill… should I follow your example and send a follow-up email with some of these lower-level formatting suggestions?
[21:53:22] not sure what best practices are in general, but you could use the same heading ratios as the WMF style guide…
[21:53:34] J-Mo1: I /think/ they're important. Why don't we put all our suggestions in https://etherpad.wikimedia.org/p/CBAooDPb6p and send them in one go?
[21:53:46] https://github.com/wikimedia/WikimediaUI-Style-Guide
[21:53:53] sounds good leila
[21:54:18] I've got a meeting, but will add some more stuff to that pad in about an hour
[21:54:27] J-Mo1: sounds good. thanks.
[22:29:30] J-Mo1: I left some feedback in the etherpad. I'm heading to a meeting and will be back in an hour. Please simplify the questionnaire even further if you can find ways.
[23:26:52] leila: I supplemented your feedback on the etherpad. Will you share it with Ramtin et al?