[02:14:35] yo halfak|Net. Welcome to CA! ;)
[17:41:17] Gone for dinner a-team
[19:18:08] hi everyone! we're starting the Research Showcase in about 10 minutes. Today's speaker will be Lucas Dixon from Jigsaw. YouTube stream and speaker names & abstracts are linked in the channel topic. I'll be fielding questions for the speakers during the showcase, so if you have a question, please ping me so I see it and I'll ask it for you during the Q&A period.
[19:27:53] hi J-Mo
[19:28:05] hi Guest23954! welcome
[19:28:24] deh! why do you see me as that. I'm Leila. aaaa! internets.
[19:28:54] how do I know you're not just masquerading as Leila?
[19:29:05] :D
[19:29:34] oh look mako is in the chan today. hi mako. Are you going to watch our research showcase?
[19:32:12] video came late, somehow. but it's okay. hearing the room now.
[19:32:14] the showcase has started! I'm IRC host. Watch here: https://www.youtube.com/watch?v=nMENRAkeHnQ
[19:34:53] J-Mo: sure!
[19:35:33] ohai! Lucas Dixon from Jigsaw is talking about Wikipedia talk page discussions. https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#November_2017
[19:36:02] * cscott waves to mako
[19:37:25] hey folks o/
[19:37:30] o/
[19:40:04] https://en.wikipedia.org/wiki/Artificial_intelligence_in_fiction
[19:57:32] it seems a bit fragile to have a hand-curated list of "identity terms"
[19:57:44] it seems you'd be vulnerable to someone discovering/abusing a new slur
[19:58:35] i.e., what happens when milk or Keurig machines suddenly become a new slur
[19:58:51] i'd be interested in seeing the single-word "pinned AUC" probabilities
[19:59:27] for the entire vocabulary, to see if the top things are all epithets (as expected) or if there are new unknown "identity terms" which are being mislabeled
[20:00:24] I see this cscott and will ask for you. thanks!
[20:01:25] thx!
[20:03:50] J-Mo, I would like to know which representation of text they are using for their analysis. Bag of words? Embeddings?
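dsaez's question above comes down to how a comment gets turned into numbers before classification. A minimal sketch of the bag-of-words option, for contrast with the sequence/embedding models discussed in the talk; the tokenizer and the tiny vocabulary here are illustrative assumptions, not the speaker's actual pipeline:

```python
import re
from collections import Counter

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary term appears in a comment.

    Word order is discarded entirely -- this is what distinguishes
    bag-of-words from the sequence models mentioned in the talk.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]

# Illustrative vocabulary -- a real system would build this from the corpus.
vocab = ["you", "idiot", "thanks"]
print(bag_of_words("You absolute idiot, you!", vocab))  # [2, 1, 0]
```

An embedding representation would instead map each token to a learned dense vector, so that related words end up near each other rather than in unrelated columns.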
[20:04:07] got it dsaez
[20:05:47] cscott: i couldn't help but notice that searching for a single four-letter string would have found every single one of their top-scoring toxic messages
[20:07:37] i'm sure that showing just the top-scoring messages wasn't the most effective way to show how this is useful. but it seems a lot of wind-up for something one could do with grep :)
[20:08:12] * cscott is amused to see how folks who study toxicity have to talk about/around it when presenting their work :)
[20:08:59] and i suspect it would have had very few false positives :)
[20:09:08] I lost the room. is it just me?
[20:09:12] (i never really thought about how carefully you have to write your talk when doing this research)
[20:09:13] no, lost here too
[20:09:14] mako do you want me to bring this up with the speaker?
[20:09:20] J-Mo: we lost the room.
[20:09:24] thanks, Nettrom
[20:09:29] I have also lost the feed on youtube
[20:09:38] +1
[20:09:44] back
[20:09:51] J-Mo: we're back.
[20:09:59] i reloaded and it is back
[20:09:59] it's been cutting in and out
[20:10:03] mako: in a slightly more mathematical fashion, we're constructing a detailed sequence model of the input text, and then evaluating it based on single-word probabilities.
[20:10:12] I refreshed and it is working
[20:10:13] i've got the stream back, fwiw
[20:10:16] back
[20:10:20] thanks folks!
[20:11:03] mako: even if we carefully curated a list of "identity terms" to level out the single-word probabilities, it seems like that just pops the problem down a level.
[20:11:19] mako: now conversations which have certain two-word phrases get bias, etc.
[20:11:46] i think i know the answer (i.e., blacklists are easy to work around, etc. etc.)
[20:12:08] sorry about that blip. looks like it was on youtube's end
[20:12:12] J-Mo, just to add context to my question: will their system be sensitive to variations of keywords, such as "fuc.k you" or something like this?
This is something very common to see in online forums.
[20:12:55] yeah, i think my question is basically contained within cscott's :)
[20:13:09] got it dsaez and mako
[20:14:32] i think i would just be interested in seeing examples of things that would not be caught by the simpler methods but are caught by this technique
[20:14:39] J-Mo: Question: Were you folks blindsided by these reports https://motherboard.vice.com/en_us/article/j5jmj8/google-artificial-intelligence-bias (the identity term issue)?
[20:14:40] and showing the top-scoring terms doesn't get us that
[20:14:47] J-Mo: (cont) If so, how can researchers and developers avoid similar "didn't see this" mistakes in the future? If not... what happened there?
[20:15:18] got it apergos
[20:16:48] J-Mo: We (WMF team writing ORES) think there's a lot of potential in getting direct feedback from humans about false positives. This doesn't seem to be fleshed out in your work yet; it was mentioned as something that happens ad hoc. What are the obstacles to systematically asking for more of this type of feedback?
[20:17:15] awight: nice, I was going to ask that question on your behalf :D
[20:17:17] mako: you could train a single-word probability model (a proxy for your "simple grep"), and then compare it to your deep neural network
[20:17:39] * mako nods
[20:17:51] showing the most extreme differences, e.g., to show the worth of your deep network
[20:18:13] got it awight
[20:18:49] i'm interested in the sort of feedback loop sketched, where you have a biased network with an attention model, which gets corrected using crowd-sourced bandit learning (click here if this assessment is wrong)...
[20:19:08] ...and then that gets fed back to source additional training data (e.g. from wiki articles) in order to correct the biases identified
[20:20:03] so the more folks identify that assessments based on the word "gay" (pinpointed by the attention model) are incorrect, the more unbiased training data containing the word "gay" are introduced.
[20:20:04] cscott: that's a big project halfak|net_derp and awight are working on (codenamed JADE)
[20:20:23] https://www.mediawiki.org/wiki/JADE
[20:20:32] thx
[20:21:02] i talked with halfak|net_derp about bias correction at wikimania this year, yeah
[20:22:25] didn't know about JADE specifically, thanks for the pointer
[20:23:16] (i was interested in the systemic bias that creeps into machine translation -- most obviously with pronoun choices used for words like "doctor" or "nurse")
[20:23:49] fwiw, we'll be providing the "adversarial" feedback in parallel with the AI outputs, because we believe it will be higher-quality data
[20:26:19] cscott: hang out in #wikimedia-ai <3
[20:26:31] there is such a channel? uh ohes
[20:26:44] so many channels
[20:27:11] lol don't worry, it's just humans flogging the machines for now. Dynamics will be inverted soon I'm sure.
[20:27:32] i need an ai to suggest irc channels about ai
[20:28:20] "likely to leave a conversation" -- interesting to see whether participants leave a conversation after a certain point.
[20:28:34] J-Mo: to follow onto your question, is it a problem if we give authors an ML tool to detect how their toxicity would be perceived ahead of time? Or is that gaming aspect offset by the huge impact on medium-good-faith authors getting pre-moderation correction?
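The single-word probability baseline cscott proposes above (a "proxy for grep" to compare against the deep network) could look something like this minimal sketch. The toy training data, the Laplace smoothing choice, and the max-over-words scoring rule are my illustrative assumptions, not anything from the talk:

```python
import re
from collections import defaultdict

def train_word_probs(comments, labels, smoothing=1.0):
    """Estimate P(toxic | word) for each word, with Laplace smoothing.

    No context, no sequence model -- just per-word statistics over
    labeled comments, i.e. grep-strength toxicity detection.
    """
    toxic_count = defaultdict(int)
    total_count = defaultdict(int)
    for text, label in zip(comments, labels):
        for word in set(re.findall(r"[a-z']+", text.lower())):
            total_count[word] += 1
            toxic_count[word] += label
    return {w: (toxic_count[w] + smoothing) / (total_count[w] + 2 * smoothing)
            for w in total_count}

def score(text, probs):
    """Score a comment by its single most toxic-looking word.

    Unknown words fall back to an uninformative 0.5 prior.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return max((probs.get(w, 0.5) for w in words), default=0.5)

# Toy labeled data -- purely illustrative (1 = toxic, 0 = clean).
comments = ["you idiot", "what an idiot", "thanks for the help", "great work thanks"]
labels = [1, 1, 0, 0]
probs = train_word_probs(comments, labels)
print(score("what an idiot", probs))  # 0.75
```

The comparison cscott sketches would then rank comments by the gap between this baseline's score and the deep network's score, and show the largest disagreements as evidence of what the network adds.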
[20:28:53] the converse is a little dangerous -- i bet an insult or a retort is quite likely to prompt a response, but that doesn't mean that trolling is a "good conversation" tactic
[20:29:34] isn't the deliberate misspelling of four-letter words already evidence of gaming the system based on feedback from blacklist-based moderation?
[20:30:16] awight will ask if we have time
[20:30:18] that would be a hucking forrible outcome
[20:30:32] J-Mo: nbd, put me at the bottom of the queue. I'm just musing
[20:31:43] awight: now you're gaming the system based on your expectation of feedback from human-based moderation ;)
[20:31:51] lol
[20:32:04] humans are more flexible in adjusting to your gaming attempts though
[20:32:14] hopefully
[20:32:27] you can't fool me, i know you meant to write h*cking
[20:32:27] models need human intervention to patch them up (for now :-P)
[20:33:43] (https://www.npr.org/sections/alltechconsidered/2017/04/23/524514526/dogs-are-doggos-an-internet-language-built-around-love-for-the-puppers if anyone needs me to explain the joke)
[20:34:24] DarTar: Thankfully, jerks seem to love using English curse words :)
[20:36:53] I can provide a pointer to the NLP research he just mentioned, if anyone's interested
[20:36:59] please do
[20:37:10] hang on a little bit, let me dig it up
[20:38:12] http://forum.opennmt.net/t/training-romance-multi-way-model/86/8
[20:38:19] the "zero-shot translation" paper in particular
[20:38:41] https://arxiv.org/pdf/1611.04558v1
[20:39:22] "In addition to improving the translation quality of language pairs that the model was trained with, our models can also learn to perform implicit bridging between language pairs never seen explicitly during training, showing that transfer learning and zero-shot translation is possible for neural translation. Finally, we show analyses that hints at a universal interlingua representation in our models and show some interesting examples when mixing languages."
[20:39:23] thanks
[20:40:37] My question was just, gaming vs. helping the low-hanging accidental abusers
[20:41:56] 4chan helping research. nice.
[20:42:34] "comments made between midnight and 6am"
[20:42:49] "bad night" hypothesis, really
[20:43:08] and that's a wrap. thanks everyone!
[20:43:25] thanks a lot
[20:43:30] thanks all o/
[20:43:36] * cscott enjoyed this a lot
[21:09:50] Thank you!