[18:04:12] Hey folks!
[18:04:20] Who is on IRC duty for the showcase.
[18:04:21] ?
[18:20:02] hi
[18:22:34] o/ FaFlo
[18:22:55] great to have you presenting your work today :D
[18:23:12] dsaez, do you know who the IRC host is today?
[18:23:25] I can't be around so I wanted to make sure the topic got changed :D
[18:23:44] FaFlo, gonna need to watch your presentation after-the-fact.
[18:24:17] hi, thanks! Dario says he is
[18:24:27] *being the IRC master
[18:24:51] halfak
[18:25:27] * halfak wonders if Dario remembers his IRC-fu
[18:31:43] hey all, showcase is starting right now
[18:31:59] I’ll be hosting on IRC if there’s anyone watching and relaying questions
[18:32:19] do we have an URL?
[18:32:26] also: hello everyone :)
[18:32:33] sure, one sec
[18:32:41] https://www.youtube.com/watch?v=uK7AvNKq0sg
[18:32:50] Nettrom ^^
[18:32:54] cheers!
[18:33:01] https://www.youtube.com/watch?v=uK7AvNKq0sg
[18:33:39] cool to see both nb and nn on that language chart!
[18:34:07] woo! Norway FTW
[18:41:20] Nigragorga pigogarolo is my new spirit animal
[18:52:43] I’ll be relaying to the speaker any questions from the channel, ping me if you want to ask one
[18:55:03] J-Mo: I have a question and a comment, if you want to add me to the queue
[18:55:15] you can go first, DarTar
[18:55:23] cool
[18:55:28] I have a question as well, and I'll go at the end
[19:00:26] WikiLabels just mentioned by DarTar: https://labels.wmflabs.org/
[19:05:50] btw that was the legacy (pre-2010) wikipedia logo on the first slide ;) https://en.wikipedia.org/wiki/Wikipedia_logo#Logos
[19:07:19] Good catch. I didn't notice. Still LOLing over "Mance Legstrong"
[19:10:16] one can distinguish it by the presence of the klingon character (the spiky thing on the top right of the globe)
[19:11:10] this dataset is SO amazing
[19:11:16] here is the research newsletter's coverage of the dataset paper https://meta.wikimedia.org/wiki/Research:Newsletter/2017/August#Who_wrote_this?_A_new_dataset_tracks_the_provenance_of_English_Wikipedia_text_over_15_years
[19:11:34] I bet many people don’t know it exists
[19:14:50] \o/
[19:15:00] hey halfak
[19:15:38] just got done with my other thing. Who is this Manse guy?
[19:15:40] these plots on the post 2007 recovery of edit-related actions is super-interesting, I don’t think I’ve seen this before
[19:15:41] ;)
[19:15:49] +1
[19:16:08] DarTar, yeah, I was really only plotting the productivity patterns.
[19:16:16] halfak not sure who Mance is, but sounds like a BAD ROLE MODEL for young athletes
[19:16:20] i.e. how many words added that survived.
[19:16:32] halfak: yep
[19:17:16] Partial revert = an edit?
[19:17:20] :\
[19:18:31] HaeB: tweetable factoid that fun fact
[19:20:28] yes, also the rise of the refs from 2006 on
[19:20:40] is the slide deck available online?
[19:22:15] HaeB: I don’t have it yet, we can get it later
[19:25:07] J-Mo: two Qs here, but please put me at the end of the queue if there other Qs
[19:25:45] got it. Order: halfak, HaeB, J-Mo, DarTar
[19:26:07] * halfak trolls FaFl.o from the peanut gallery.
[19:26:21] awesome use of a Jupyter notebook
[19:27:43] J-Mo: I was thinking I’d like to get the link of that NB too
[19:27:57] I think he's saying these are dynamically generated?
[19:28:00] I'm going to ask
[19:28:15] I thought he had a static / cached one
[19:28:20] to demo
[19:28:33] if they don’t they should
[19:28:55] DarTar, steal that "ref" tag cloud.
[19:29:44] indeed
[19:31:07] HaeB: since you’re in the room you can ask the Q directly?
[19:31:31] DarTar: ok
[19:33:15] Right. I add "The apple is really red"
[19:33:24] And in the next edit, you change it to "The apple is red"
[19:33:30] Then I have been partially reverted.
[19:34:01] i thought i unmuted
[19:37:54] if there are final questions about either preso, I’ll take them now after asking Fabian my last Q
[19:45:28] J-Mo: I think you can close the show after this
[19:46:13] got it
[19:47:44] great work and presentations, thanks everyone!
[19:49:33] thanks for joining, folks
[19:50:44] thanks for having me guys
[19:50:51] I'll be around a couple minutes more
[19:50:53] great talk FaFlo
[19:51:28] J-Mo was curious about how you treat markup as tokens
[19:51:39] i.e. if you include/exclude anything in specific
[19:51:42] I was wondering that too after seeing "ref"
[19:51:54] Seems like it should be a tag like ""
[19:52:01] so we retrieve the complete markup
[19:52:18] then we split after a self-built rule set for the tokens
[19:52:26] (sentences is quite standard)
[19:52:39] and we actually split every special character into a single token
[19:52:49] that includes "<" etc
[19:53:10] ah, but some exceptions for sentences actually:
[19:53:22] for example, a url is treated as a sentence
[19:53:25] FaFlo: good point about being careful regarding interpretation of partial/later reverts - IIRC that seems to have been where the "even good bots fight" paper by yasseri et al went awry https://upload.wikimedia.org/wikipedia/commons/f/f4/Operationalizing-conflict-bots-wikipedia-cscw-preprint.pdf
[19:53:43] it is not pretty, but it works.. @splitting
[19:53:51] Interesting FaFlo. I was wondering if I could use the API to track "controversial" citations, e.g. I see if "http://inforwars.com/Obama_is_muslim" is being added and removed?
[19:54:23] yes and no, because currently, you would exactly have to put it back together from the single tokens
[19:54:38] yeah, that's what it sounds like. Thanks!
[19:54:47] so it is possible, but tricky
[19:55:16] i'd like to have a call that works for retrieving only references, external links etc
[19:55:23] I guess you could use some heuristics on the domain name on a per-article basis
[19:55:30] yeah
[19:55:46] yes
[19:55:56] FaFlo: that would be fantastic, and very much in line with what we’re doing this coming year,
[19:56:12] we’re actually planning to release a feed of changes to ref tags and external links
[19:56:53] sounds interesting, I could help with that if needed
[19:57:13] I’ll make sure to loop you in when we start working on it >:-)
[19:57:24] perfect
[19:57:25] also
[19:57:39] the research I mentioned is a spin-off of this paper https://arxiv.org/pdf/1805.05345.pdf
[19:57:41] i also just hired a new developer that could be of assistance
[19:57:55] ah yes, I know that one, great paper
[19:57:58] the separate study that looks at deletions and refactoring in talk pages is under review
[19:58:07] but I can share it with you privately
[19:58:11] if you’re not the reviewer
[19:58:13] :D
[19:58:18] haha, nope
[19:58:26] ok hang on
[20:00:22] FaFlo: {{done}}
[20:00:28] thanks!
[20:01:03] I’ll share the talk and slides with the rest of the team working on this topic
[20:01:13] …and with that I’m checking out for tonight
[20:01:18] c ya
[20:01:52] me too, bye
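
Note on the token discussion (19:52 to 19:54): the point that every special character becomes its own token, so a URL has to be "put back together from the single tokens", can be illustrated with a rough sketch. This is not the speakers' actual rule set, which was not shown in the channel. The regular expression, the function names `tokenize` and `contains_url`, and the `example.org` address are all assumptions made up for illustration.

```python
import re

# Assumed rule (not the real rule set): runs of word characters stay
# together, and every other non-space character is a token on its own.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(markup: str) -> list[str]:
    """Split wiki markup into tokens, one token per special character."""
    return TOKEN_RE.findall(markup)


def contains_url(tokens: list[str], url: str) -> bool:
    """Check whether the tokens of `url` appear contiguously in `tokens`.

    Rough heuristic: a longer URL that starts with the same tokens
    (e.g. an extra path segment) would also match.
    """
    needle = tokenize(url)
    if not needle:
        return False
    n = len(needle)
    return any(tokens[i:i + n] == needle for i in range(len(tokens) - n + 1))


# Hypothetical revisions of one article, as raw markup.
rev_with_ref = "The apple is red.<ref>http://example.org/a</ref>"
rev_without_ref = "The apple is red."

print(tokenize("<ref>"))
print(contains_url(tokenize(rev_with_ref), "http://example.org/a"))
print(contains_url(tokenize(rev_without_ref), "http://example.org/a"))
```

Running the same check over successive revisions would show when a given link appears and disappears, which is the "possible, but tricky" approach mentioned above. A dedicated call for references and external links, as wished for at 19:55, would avoid the reassembly step.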