[15:32:03] Revision-Scoring-As-A-Service, Wikilabels, Chinese-Sites, User-Ladsgroup: Chinese translations are not being loaded - https://phabricator.wikimedia.org/T154897#2930662 (Ladsgroup) a:Ladsgroup
[15:32:16] Revision-Scoring-As-A-Service, Wikilabels, Chinese-Sites, User-Ladsgroup: Chinese translations are not being loaded - https://phabricator.wikimedia.org/T154897#2930666 (Ladsgroup) https://github.com/wiki-ai/wikilabels/pull/151 will fix this
[16:04:01] Revision-Scoring-As-A-Service, Wikilabels, Chinese-Sites, User-Ladsgroup: Chinese translations are not being loaded - https://phabricator.wikimedia.org/T154897#2930818 (Ladsgroup) Fix merged and deployed. It should be okay now.
[16:04:50] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, Wikilabels, Chinese-Sites: Edit quality campaign for Chinese Wikipedia - https://phabricator.wikimedia.org/T116474#2930822 (Ladsgroup)
[16:04:52] Revision-Scoring-As-A-Service, Wikilabels, Chinese-Sites, User-Ladsgroup: Chinese translations are not being loaded - https://phabricator.wikimedia.org/T154897#2927713 (Ladsgroup) Open>Resolved
[17:32:17] o/
[17:32:33] Hey folks. Just setting up for the AI Wishlist session. It will start in 1.5 hours.
[18:36:09] hi
[18:43:59] Hi all! At 11AM PT, we'll be streaming a #wikidev17 chat on AI and discussion will take place in this IRC channel.
[18:43:59] What should an AI do for you? Building an AI Wishlist
[18:43:59] STREAM: https://www.youtube.com/watch?v=j8ND7Uu4e_s
[18:43:59] DISCUSS: https://webchat.freenode.net/?channels=#wikimedia-ai (NOTE: different IRC room)
[18:43:59] NOTES: https://etherpad.wikimedia.org/p/devsummit17-ai-wishlist
[18:44:42] melodykramer: this is the IRC room, right?
[18:45:06] That's correct - x-posted that from #wmhack, which is where other conference discussions are taking place.
[18:45:21] And if you'd like to see the entire program, it's here: https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2017/Program
[18:45:56] Yep, checked it out briefly
[18:54:53] Hey folks. Just setting up. The stream should be online.
[18:55:23] yes, it is working fine (sound not really understandable yet, but guess that'll come)
[18:55:39] For anyone else who wants to connect, see https://www.youtube.com/watch?v=j8ND7Uu4e_s
[18:56:03] It sounds good, think it is just the people all talking at the same time
[18:56:20] halfak: bit noisy in the background
[18:56:46] Sorry about that. I've muted. I'll unmute when I start talking.
[18:57:02] hello everyone!
[18:57:29] Sorry for the shoddy setup.
[18:57:32] Hey Nettrom!
[18:57:49] I'm going to use the article importance model as an example for our wishlist format :)
[18:58:00] Also, I'm going to recruit y'all into TEAM IRC soon.
[18:58:03] So be prepared.
[18:58:06] ;)
[18:58:17] halfak: cool :)
[18:58:26] 2 minute warning.
[18:59:21] BTW, I might try to head out at 12:05, meeting conflict :(
[18:59:25] thanks for the last-minute set-up, halfak
[19:02:36] can hear you fine!
[19:02:38] perfect sound
[19:02:39] we can hear you!
[19:03:43] that's https://etherpad.wikimedia.org/p/devsummit17-ai-wishlist
[19:05:18] the other one is more filled out: https://etherpad.wikimedia.org/p/devsummit17-AI_wishlist
[19:05:53] fixed the link in the program
[19:06:18] yay SuggestBot!
[19:06:40] we're in English, French, Portuguese, Russian, Persian, Norwegian, and Swedish
[19:07:15] o/
[19:07:38] hey Amir1
[19:09:07] o/, I'm helping bridge the IRL-IRC group :D
[19:09:24] could we have the etherpad link again?
[19:09:25] awight: thanks for helping out with that!
[19:09:29] https://etherpad.wikimedia.org/p/devsummit17-AI_wishlist-IRC
[19:09:36] https://etherpad.wikimedia.org/p/devsummit17-ai-wishlist
[19:09:42] https://etherpad.wikimedia.org/p/devsummit17-AI_wishlist
[19:10:49] I guess we should brainstorm in both IRC and etherpad?
[19:11:05] sounds good
[19:11:27] agreed, probably good to brainstorm here and copy ideas over to the etherpad?
[19:11:31] https://etherpad.wikimedia.org/p/devsummit17-AI_wishlist-IRC like basvb said...
[19:11:34] Nettrom: +1
[19:11:37] thx
[19:16:30] So let's start our discussions?
[19:17:04] yeah, I guess this is a problem of "where do we start"?
[19:17:12] I'm ruminating on whether to drift towards tasks that take a lot of human effort, or tasks that are unpleasant for people
[19:17:24] well, I've been walking around with image classification ideas
[19:17:25] or are currently overlooked
[19:17:29] cool!
[19:17:43] all those categories sound interesting to me
[19:17:55] basvb: to help with Commons search? do you have anything written down?
[19:17:59] Nettrom: does the article importance model only use properties like pageviews and pagelinks, or does it also factor in recommendations from Wikipedia article assessment?
[19:18:05] to help with crude categorization mainly
[19:18:09] I've an example even
[19:18:35] https://commons.wikimedia.org/wiki/User:Basvb/Deeplearning
[19:18:56] but it was a bit of a case of looking for a use case for a solution; admittedly the other way around is better
[19:19:36] Could be used for crude image categorisation: do we see a dog, human, car, nature, city street in the image
[19:19:46] codezee: we haven't decided on that exactly yet, I'm currently reading up on the literature to see what others have done. I've built some rudimentary models for predicting WikiProject importance assessments based on pageviews and pagelinks, but it's not great.
[19:19:59] basvb: I think I've seen a phab ticket about image classification for Commons, such that we can detect objects in the images
[19:20:11] but can't find the ticket now
[19:20:14] yes, somebody did a project on it
[19:20:35] can look for that one
[19:21:18] Free association: it might be rad to have an AI that helped connect editors with WikiProjects
[19:22:58] basvb: perhaps this https://phabricator.wikimedia.org/T49492 ?
[19:23:36] Nettrom: importance in general could be ambiguous, but if that's tied to a specific WikiProject as was mentioned, domain-specific info could be used for similarity measures with that domain
[19:23:49] awight: WikiProject discovery is a great idea, I'm sure there have been tons of contributors over the years who had no idea they existed
[19:23:53] glorian_wd: have been there once, but there is an even better one I think, but guess we should add that one
[19:23:53] Nettrom: do you have any opinions about a measure of importance in the general sense?
[19:24:32] in terms of free association: maybe an AI coach for new users?
[19:24:41] codezee: that's a great point about domain-specific info for WikiProjects, I'm writing that one down! :)
[19:25:35] Nettrom: I go by Sumit on etherpad, I'll add anything there if needed :)
[19:26:11] basvb: I like that idea--but maybe there's a way to complement that with human 1:1 interactions, like dropping new editors into various greeting workflows
[19:26:57] yes, combining the two is always the best
[19:27:03] or well, in that case
[19:27:15] codezee: yeah, let's use the etherpad for that as well, adding it now
[19:28:39] I think this page may give some inspiration: http://wikipapers.referata.com/wiki/List_of_open_questions
[19:30:00] codezee: regarding your other question on importance in the general sense, much of the research uses link-based metrics (e.g. PageRank), but Wikipedia's link graph is really dense, so I'm still not sure if those are really good. At the moment I am also considering whether WikiProject assessments work if you turn them into votes based on project size (e.g. WP Biography gets lots of votes because it's big)
[19:30:09] Oh, I also had some idea of a fact-check bot, trying to understand wiki articles and check facts against other wikis/Wikidata/the www
[19:31:27] * awight reminds comrades to add names to "Attendees" at the top of the doc
[19:32:21] awight: nice reminder :)
[19:33:31] there was a talk on search results on Wikipedia at the Dutch Wikimedia Conference by a prof
[19:33:46] let me search for it and add that one
[19:34:05] glorian_wd: wow, nice list!
[19:34:37] awight: but I think there are only a few topics related to AI
[19:34:40] on that list
[19:36:25] Sad to see that the number of publications dropped after 2010--I don't remember that being true?
[19:37:02] hmm, just thinking out loud… an AI for predicting notability of a topic? or an AI for predicting the quality of a reference?
[19:37:33] like and like
[19:38:12] however, I worry that predicting notability might be trained on existing notability guidelines, which can create a vicious cycle of (ethno)centrism basically
[19:38:29] awight: that's my concern as well, so I'm not sold on either of these ideas :)
[19:38:51] interesting, yeah, predicting the quality of a reference could be treacherous for the same reason
[19:39:20] *but that's sort of where ORES shines: they've severed "will be reverted" from "will damage the corpus"
[19:39:30] which is an opening to help improve culture
[19:39:51] Nettrom: mind if we record your ideas, and take note of the dangers?
[19:40:00] awight: go for it!
[19:40:31] yes, best to write everything down for ideas
[19:40:47] it's always easier to write an idea down than to come up with it again
[19:45:51] hey
[19:45:58] hi
[19:48:16] soo, I just submit my contributions, or do I have to follow some special rules?
[19:48:25] yes, just submit them
[19:48:32] ok, thank you
[19:48:35] or well, scream them here and we will agree with your ideas
[19:48:47] :)
[19:48:49] does anyone know if there's a way to find content on Commons that might fit an article? or is that basically "search by keywords" on Commons?
[19:48:50] and then we write it down here in IRC
[19:49:04] search by keyword/search other language version/add yourself
[19:49:11] use categories
[19:49:34] basvb: I really like the fact checker idea--feel like there was a recent talk at WMF using Wikidata connections to check facts?
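The importance-model discussion above (pageviews and pagelinks as features, WikiProject ratings as labels) can be made concrete with a minimal sketch. This is not Nettrom's actual model; the feature values, label set, and classifier choice below are all assumptions for illustration.

```python
# Minimal sketch: predict a WikiProject importance rating ("Top"/"High"/"Mid"/"Low")
# from the two signals mentioned above -- pageviews and incoming links.
# All data here is invented; a real model would be trained on a dump of
# WikiProject assessments joined with pageview and link-table counts.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# One row per assessed article: [mean daily pageviews, number of inlinks]
features = np.log1p(np.array([
    [12000, 900],   # widely read and widely linked
    [2500, 310],
    [400, 55],
    [30, 4],
    [5, 1],
], dtype=float))
labels = np.array(["Top", "High", "Mid", "Low", "Low"])

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(features, labels)

# Score a new article (values invented); WikiProject-specific signals,
# as codezee suggests, could simply be appended as extra feature columns.
print(model.predict(np.log1p([[800.0, 120.0]])))
```

Turning several projects' ratings into votes weighted by project size, as floated at 19:30, would just be a pre-processing step on the labels before training something like this.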
[19:49:37] How about an AI which suggests which words in your article should be linked and which not
[19:49:44] basvb: ooh +1
[19:49:52] seems like an easy one to make
[19:49:56] the data is there
[19:50:22] Ari__: you found https://etherpad.wikimedia.org/p/devsummit17-AI_wishlist-IRC?
[19:50:48] I think there are fashions in whether to over- or underlink; we would need some kind of error signal where readers label whether a paragraph is appropriately linked?
[19:51:10] basvb: can you tell in a line what the link https://commons.wikimedia.org/wiki/File:20161119_Key_note_Maarten_de_Rijke_WCN_2016.pdf that you posted talks about?
[19:51:24] I don't know Dutch :|
[19:51:26] basvb: random paper, http://link.springer.com/chapter/10.1007%2F978-3-319-38791-8_40
[19:51:42] codezee will do
[19:51:50] I didn't think it through very well, and I don't know much about artificial intelligence, but would there be the possibility to let an AI calculate, like, the "best" solution possible for complex problems we have going on?
[19:52:11] awight: you can use a scroll bar to say how many links you want
[19:52:19] so I can put it at few and you at many
[19:52:25] and it keeps suggesting more and more
[19:52:58] awight, that paper seems to be in the right direction
[19:53:12] there are likely some papers building knowledge bases based on Wikipedia
[19:53:52] Ari__: it might not be able to solve the whole problem, but might be able to help with certain parts of it. Is there a particular problem you had in mind?
[19:55:14] codezee: I did a quick summary based on memory
[19:55:22] can do a more thorough one later
[19:55:39] I was thinking about political conflicts, for example, or how to solve problems with food production in the most sustainable way
[19:55:44] that should be fine for now :)
[19:56:11] AI is good at helping with predictions; based on these predictions, humans are good at deciding what is important
[19:56:43] forgot how they wrote that down exactly in the paper I read recently
[19:57:16] Ari__: I think that might be a bit too high-level for the current stuff in AI
[19:57:43] as those problems have a very open world
[19:57:51] dunno--political conflicts are probably detectable via edit wars and how entrenched editors behave
[19:58:15] although w:Chocolate has been a battleground as well
[19:58:28] detecting them should be possible
[19:59:31] ah sorry, I see the backscroll now: calculating the "best" solution to politics
[20:00:08] but maybe there is an AI approach we could take in which we model whether it will help to divide warring content into "opinion A" and "opinion B"
[20:00:19] alright, best wishlist ever completed: https://etherpad.wikimedia.org/p/devsummit17-AI_wishlist-05
[20:00:25] So sort of a discussion summarizer?
[20:00:52] DarTar: did you read all the wishlists?
[20:01:18] basvb: hmm. I was thinking, predict whether a paragraph is contentious, and if so whether it will survive if isolated into a "criticism" sort of section
[20:01:34] aah, more in the articles themselves, awight?
[20:01:40] exactly
[20:01:45] I was thinking more in the line of conflicts between users
[20:01:55] nice, that sounds like a second useful thing
[20:02:07] for articles we could do some stuff like: balance prediction, neutralizing texts
[20:02:07] probability that editors will collide and convert into pure drama ;-)
[20:02:24] is balance a term used outside nl wiki?
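Going back to the link-suggestion idea raised at 19:49-19:52 (suggest which words should be linked, with a user-adjustable "how many links" control): a minimal sketch under stated assumptions. The link-probability numbers are invented; in practice they would be mined from how often each phrase is already linked in existing articles, which is the "data is there" point above.

```python
# Sketch of the "which words should be linked" idea with the adjustable
# threshold ("scroll bar") behaviour described above. The probability table
# is invented; a real version would estimate it from existing wikilinks.
from typing import Dict, List, Tuple

# P(phrase is linked | phrase appears in an article), estimated from a corpus.
LINK_PROBABILITY: Dict[str, float] = {
    "machine learning": 0.83,
    "neural network": 0.71,
    "algorithm": 0.22,
    "computer": 0.05,
}

def suggest_links(text: str, threshold: float) -> List[Tuple[str, float]]:
    """Return candidate phrases whose estimated link probability meets the threshold.

    Lowering the threshold (sliding the bar) surfaces more suggestions."""
    text = text.lower()
    hits = [(phrase, p) for phrase, p in LINK_PROBABILITY.items()
            if phrase in text and p >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

article = "A neural network is a machine learning algorithm run on a computer."
print(suggest_links(article, threshold=0.7))  # conservative setting: few suggestions
print(suggest_links(article, threshold=0.1))  # liberal setting: many suggestions
```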
[20:02:50] the balance between standpoints in the article
[20:03:06] one can be neutral but only show one side of an issue
[20:03:29] +1, but I'm not familiar with the rhetoric on enwiki
[20:04:55] basvb: Can you elaborate on what an AI would do to analyze warring editors?
[20:05:08] within a discussion?
[20:05:19] It could summarize standpoints in a large discussion
[20:05:24] pro and contra
[20:05:28] wow, that would be fancy
[20:05:37] I gotta run, thanks everyone for a good set of ideas, and see you all around! :)
[20:05:50] thanks to you as well, bye
[20:05:58] let me write that one down, awight
[20:05:59] basvb: I jammed that into the etherpad, if you want to tweak it
[20:06:00] kk!
[20:06:03] thanks
[21:12:54] Hello. I am the remote moderator for the "Algorithmic dangers and transparency -- Best practices" session.
[21:13:00] You can ping me, but I will follow directly as well.
[21:13:15] hi matt_flaschen_, is the stream supposed to be on yet?
[21:13:23] No
[21:13:29] ok :)
[21:13:52] hey basvb, sorry I missed your ping. No, I haven't yet
[21:14:15] DarTar: it was just the set-up to say ours was better ;)
[21:14:25] :D
[21:14:30] basvb, actually it should be on.
[21:14:37] https://www.youtube.com/watch?v=w__x1p66y5U
[21:15:02] matt_flaschen_: aah, works now on https://www.youtube.com/watch?v=myB278_QthA
[21:16:27] new stream works, thanks!
[21:16:28] Stream link: https://www.youtube.com/watch?v=myB278_QthA
[21:16:30] Can you tell me if it is working for you?
[21:16:32] https://etherpad.wikimedia.org/p/devsummit17-AI_ethics
[21:16:46] works for me
[21:16:50] halfak
[21:16:50] * DarTar waves at Nettrom
[21:16:56] Sorry for the inconvenience.
[21:16:56] oh, J-Mo1
[21:16:59] * Nettrom waves at DarTar and J-Mo1
[21:17:03] hello!!!
[21:17:08] hi
[21:17:14] I'm participating remotely ^_^
[21:17:30] I figured
[21:17:49] looks like we have two separate pads again: https://etherpad.wikimedia.org/p/devsummit17-algorithmic-dangers https://etherpad.wikimedia.org/p/devsummit17-AI_ethics
[21:18:02] HaeB, the correct one is https://etherpad.wikimedia.org/p/devsummit17-AI_ethics .
[21:18:05] Please don't use the other one.
[21:18:17] .. we had serious vandalism and repeated hostile action on our beloved MediaWiki at OSGeo.org :-(
[21:18:51] The link on the program was recently corrected.
[21:19:19] thx to Legoktm and others for the assist on that
[21:22:52] Of course, remote participants should feel free to submit questions.
[21:23:10] And answers to these session questions.
[21:23:47] One hand - New topic
[21:23:52] Two hands - Continue current topic
[21:23:58] Please use that here as well.
[21:25:40] what's two hands on IRC?
[21:25:54] hh:
[21:25:57] \o/ ?
[21:26:09] +1 for HH
[21:26:15] :)
[21:26:20] ✋
[21:26:24] vs ✋✋
[21:26:27] stream sound gone?
[21:27:25] Anyone else having sound problems?
[21:27:38] yes
[21:27:40] I have sound just fine
[21:27:45] or well, no then
[21:27:47] questioners need a mic, or the speaker repeats the question... sound here is fine for the speaker
[21:28:13] Sorry, I'll make sure we do that.
[21:28:59] hmm, I have a stream issue in general, it seems
[21:29:11] I've asked people to come up. Sorry about that.
[21:29:24] Can you hear Adam
[21:29:25] ?
[21:29:46] "Another thing that caught my eye"
[21:29:49] ouch
[21:30:00] that's the mic and don't scream, this isn't black metal :P
[21:30:05] :p
[21:30:05] had my volume way up
[21:30:12] He's not screaming, sorry about that.
[21:30:38] Sorry. We were confused about whether or not audio should have been coming out of the speakers here.
[21:30:46] baaaa
[21:33:14] Staeiouuuuuuuuuuuuuu!
[21:33:57] ✋✋ BTW, you can have a similar problem in predicting article quality: contributor experience helps predict quality, but you might end up encoding that higher quality comes from the experienced folks. This is partly why ORES' quality model is the way it is, because it's trained to label articles without considering contributor experience.
[21:37:00] ... does the identity of the editor factor in.. big question
[21:38:12] re: AI design to avoid discrimination, have you guys seen this? https://research.google.com/bigpicture/attacking-discrimination-in-ml/
[21:38:17] and https://drive.google.com/file/d/0B-wQVEjH9yuhanpyQjUwQS1JOTQ/view
[21:38:42] or like car insurance companies biasing/increasing the price of insurance if the driver is under e.g. 25 years old
[21:39:15] dbb, answered above.
[21:39:23] DarTar, GhassanMas, should I raise that in person?
[21:40:19] GhassanMas: the Google Research link above addresses specifically that question, it's a great read/interactive demo, more details in the paper
[21:40:46] DarTar: that first link is a great explanation of looking at model health beyond just ROC AUC!
[21:41:06] awight: yep, optimizing for equal opportunity — as they call it
[21:41:43] could you talk about "weapons of math destruction"? black box problem
[21:42:33] the false positive issue is also there for abuse filters, right?
[21:43:27] basvb: yes, but I believe most patrollers accept the idea that abuse filters (being based on heuristics) are by design error-prone
[21:43:34] slowking, did Aaron answer your question, or should I raise it?
[21:43:54] I think there's a higher risk of people blindly trusting the output of black box ML tools
[21:44:12] H: in the 2-5 year time frame, I suspect you will need to differentiate between simple things (link spam) and finer-grained things like harassment or racism.. be certain to do a good job, and limit the scope of the basics, and embed that into the evolution of the filtering
[21:44:13] that was good - I wonder how we improve feedback and explain the AI so there is buy-in?
[21:44:17] DarTar: I'm trying to understand one thing: how they're pretty much looking at the reverse of our problem, where we were overfitting based on population group and deciding to be "group unaware"
[21:44:18] DarTar: well, it's interesting to search for the differences, what extra issues are there with black box methods?
[21:45:24] awight: I haven't read the paper yet, the Google link was circulated 2 days before xmas, but that's the impression I too am getting from the short demo :)
[21:45:25] Remember, please use one hand/two hands if you want me to relay to the session.
[21:45:38] maybe more samples
[21:45:49] the black box/AI might learn more easily from false positives compared to the abuse filter
[21:46:00] how to avoid baking bias into the algorithm - make it flexible + responsive enough to change and improve
[21:46:18] one hand - new topic
[21:46:22] two hands - continue
[21:46:24] yes, the abuse filter is set-and-forget, no improvement
[21:46:43] basvb: for sure, just speculating about people's attitudes towards AF, knowing how it works
[21:47:27] yes, I'm also just searching for the main differences; the main difference might be more unexpected biases that are harder to spot
[21:47:47] i've seen AF reject external links added by librarians as part of 1lib1ref
[21:47:56] slowking: :(
[21:48:01] more happening -> more potential side effects?
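On the "optimizing for equal opportunity" point from the Google demo linked above: the core idea is to pick per-group score cutoffs that equalize the true positive rate, rather than applying one global cutoff to a model that scores groups differently. A toy sketch follows; the scores, groups, and numbers are fabricated and this is not how any production model is configured.

```python
# Toy illustration of "equal opportunity" thresholding, as in the demo linked
# above: choose a per-group cutoff so each group's true positive rate matches
# a target, instead of one global cutoff. All data below is fabricated.
import numpy as np

def cutoff_for_tpr(scores: np.ndarray, labels: np.ndarray, target_tpr: float) -> float:
    """Smallest cutoff that still flags at least target_tpr of the true positives."""
    positive_scores = np.sort(scores[labels == 1])
    drop = int(np.floor((1 - target_tpr) * len(positive_scores)))
    return float(positive_scores[drop])

rng = np.random.default_rng(42)
labels = rng.integers(0, 2, size=1000)
# Pretend the model systematically scores group B lower than group A
# for the same underlying label.
scores_a = np.clip(rng.normal(0.65, 0.15, size=1000) + 0.1 * labels, 0, 1)
scores_b = np.clip(rng.normal(0.45, 0.15, size=1000) + 0.1 * labels, 0, 1)

print("group A cutoff:", cutoff_for_tpr(scores_a, labels, target_tpr=0.9))
print("group B cutoff:", cutoff_for_tpr(scores_b, labels, target_tpr=0.9))
```

The catch raised later in the session is that on-wiki there is very little demographic data to define groups with, beyond splits like anonymous vs. registered.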
[21:51:21] agree that a "chain of evidence" is important to review important cases
[21:51:48] so, save the revert context, editor, article, bot revision, and other context
[21:52:08] yes, but you likely need to go further than saving
[21:52:09] graph-based storage is really useful for this
[21:52:50] making complex algorithms insightful could be hard if you make decisions based on ensembles of classifiers (1000s of weighted decisions) or deep learning
[21:52:53] * dbb tends toward "human in the loop" in many cases
[21:53:03] awight: the main limitation to applying that model (e.g. designing for equal opportunity instead of demographic parity) to what we're trying to do on wiki is that we lack true demographic data about most of our contributors
[21:53:50] other than the anonymous/registered distinction or self-reported gender or demographic data (tiny), there's very little to work with
[21:53:51] Anyone who hasn't gotten a chance to say anything?
[21:53:56] Who would like to get involved?
[21:54:00] Jaol is up now.
[21:55:31] we do faster with it
[21:55:37] at least
[21:55:48] DarTar: ah, also, I'm starting to see that we could have actually compensated for the bias against anonymous editors rather than just zeroing out that feature, which might fit into the "equal opportunity" approach as described in the paper
[21:59:14] .. thinks of "protected pages" extended to "volatile topics" versus not-so-volatile topics
[21:59:35] I am still struggling with the question of how to enforce policies based on the notion of a protected class in a pseudonymous system
[21:59:45] with no demographic data
[21:59:57] see also https://blog.wikimedia.org/2016/09/12/research-newsletter-august-2016/#Ethics_researcher:_Vandal_fighters_should_not_be_allowed_to_see_whether_an_edit_was_made_anonymously (and community discussion about the review/paper at https://en.wikipedia.org/wiki/Wikipedia_talk:Wikipedia_Signpost/2016-09-06/Recent_research )
[22:01:48] doesn't Huggle have an explicit "good faith revert" option?
[22:02:25] Hearing Stuart OK?
[22:02:36] He is standing on the other side of the microphone
[22:02:45] broad observation: databases were built on boolean predicate matches, but later a distance function changed search.. similarly applies to "damaging edits"
[22:02:47] we are already in a class system - unconfirmed get AF at external links + bot reversion
[22:03:02] yes halfak, we hear all of you perfectly
[22:03:20] I hear Stuart
[22:03:25] Great :)
[22:04:14] the class is based on edit count, but oblivious to anti-registration bias
[22:04:19] (... STiki and Twinkle do have good-faith revert msgs, at least e.g. https://en.wikipedia.org/w/index.php?curid=336822&diff=759391167&oldid=759375659 )
[22:04:47] and they found people blew by the Twinkle love as a speed bump
[22:04:49] +1 on feedback to anons
[22:04:54] so what are the big extra ethical issues which are caused by AI compared to user preconceptions or rule-based biases?
[22:05:57] many who don't want to be tracked by admins or potential stalkers go IP from public to stay anonymous
[22:06:15] unfortunately, human scammers/criminals do use clues like that.. no easy answer.. as "a good liar will say EXACTLY what a real persuasive human says"
[22:08:59] H: not true - depends on the motivation and skill level of the vandal.. see our wiki attacks for an example
[22:09:17] HH: sorry
[22:09:20] Maybe the EU's upcoming "right to explanation" is interesting in this context
[22:09:56] https://arxiv.org/abs/1606.08813
[22:10:11] dbb, is that in response to Aaron's point about vandals not being stupid?
[22:10:15] Rather, being stupid
[22:10:20] yes
[22:10:46] .. we had high-skill, high-motivation attacks on our beloved WikiMedia instance :-((
[22:11:14] https://blog.wikimedia.org/2016/12/26/research-newsletter-november-2016/#.22Privacy.2C_anonymity.2C_and_perceived_risk_in_open_collaboration:_a_study_of_Tor_users_and_Wikipedians.22
[22:11:37] you may make 3 edits in one minute that are related to changing a fact, e.g. a soccer match
[22:13:28] or 3 external links in 20 minutes
[22:15:16] digit flippers namechecked by Staeiou
[22:15:39] would AI on dumb vandals free up humans for the less dumb?
[22:15:51] +1 slowking
[22:16:26] slowking, I think halfak just answered your question.
[22:16:28] If not, I can raise it.
[22:16:42] we need a fact-checking bot for the subtle vandalism
[22:16:54] .. thinks of misdemeanors versus felonies of vandalism
[22:16:57] yes, we also need to sell it to the patrollers
[22:18:21] Thanks to y'all :)
[22:18:31] thank you halfak!
[22:18:33] thanks returned
[22:18:44] thanks everyone, great discussion!
[22:18:47] super-important, thx all
[22:25:14] bye all
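One concrete way to act on the closing point above ("would AI on dumb vandals free up humans for the less dumb?") is to combine the two separate signals mentioned in this session, damaging and good faith, when triaging edits. A minimal sketch, under stated assumptions: the URL and response layout follow the ORES v3 API as deployed around this time (verify against the live documentation before relying on it), the revision id is a placeholder, and the thresholds are invented rather than production values.

```python
# Sketch: triage recent edits with the damaging and goodfaith models, so
# confident bad-faith damage goes to an auto-revert queue while subtle or
# good-faith damage goes to human patrollers (with friendlier feedback).
# Thresholds are invented; response layout assumes the ORES v3 API.
import requests

ORES_URL = "https://ores.wikimedia.org/v3/scores/enwiki/"

def ores_probabilities(revid: int) -> dict:
    """Return P(damaging) and P(goodfaith) for one English Wikipedia revision."""
    response = requests.get(
        ORES_URL,
        params={"models": "damaging|goodfaith", "revids": revid},
        timeout=10,
    )
    response.raise_for_status()
    rev_scores = response.json()["enwiki"]["scores"][str(revid)]
    return {model: rev_scores[model]["score"]["probability"]["true"]
            for model in ("damaging", "goodfaith")}

def triage(p_damaging: float, p_goodfaith: float) -> str:
    if p_damaging > 0.90 and p_goodfaith < 0.20:
        return "auto-revert queue"      # confident, bad-faith damage
    if p_damaging > 0.50:
        return "human patrol queue"     # subtle or possibly good-faith damage
    return "no action"

probs = ores_probabilities(123456789)   # placeholder revision id
print(probs, "->", triage(probs["damaging"], probs["goodfaith"]))
```

Logging each routing decision together with the revision, scores, and eventual human outcome would also give the "chain of evidence" that was asked for earlier in the session.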