[10:27:18] Amir1: btw, OAuth integration has been done :)
[10:27:23] Amir1: for PAWS :)
[10:27:33] Amir1: so if you login now and can just start using scripts
[10:27:42] no need to use password or deal with OAuth stuff
[11:43:10] YuviPanda: awesome
[11:43:14] I was asleep
[16:11:13] o/
[16:11:34] Amir1, any problems with the Wikidata edit quality campaign?
[16:11:44] * halfak just gave an interview to Data Skeptic podcast.
[16:14:02] o/ hey
[16:14:21] halfak: Tell me when you're around
[16:14:28] or check telegram
[16:14:49] Oh yeah. I got your message on telegram, but I got interrupted for the interview.
[16:15:26] Oh yeah. I think that we should make an announcement about the Edit quality campaign right now.
[16:16:02] Then, when we do a blog post/report announcement, we can call attention to the Edit quality campaign again.
[16:16:17] I think the multiple announcements will help us gather more labelers.
[16:16:19] sure
[16:16:24] I write that
[16:16:37] right now
[16:18:07] Cool. If you put into a shared editing space, I'll help draft.
[16:18:52] sure thing
[16:19:15] See https://en.wikipedia.org/w/index.php?title=Wikipedia:Village_pump_(technical)&diff=prev&oldid=688131520
[16:19:28] ^ That's the post I made on the enwiki village pump about the "Edit types" campaign.,
[16:19:35] Might serve as a good reference.
[16:20:18] great :)
[16:20:43] You might like to include some graphic from https://meta.wikimedia.org/wiki/ORES/What
[16:49:34] hello
[16:49:39] o/ ToAruShiroiNeko
[16:49:53] failed to burn my house down
[16:50:08] Yay!
[16:50:09] note to self, crates burn good, too good
[16:50:11] That's always a win
[16:55:43] Not the crate burning, but the non-house burning.
[16:55:48] Anyway. I'l glad you're OK.
[16:55:51] *m
[16:55:57] ToAruShiroiNeko, ^
[17:04:16] ToAruShiroiNeko, so, it looks like we only have 791 revisions to label in urwiki.
[17:04:25] I guess that this is a bot-pedia?
[17:10:11] ToAruShiroiNeko, thoughts on running the campaign with less than 1k obs.?
[17:10:22] I'm considering just going ahead to see if we can get it to work.
[17:10:29] hmm
[17:10:30] We could also boost our sample size.
[17:10:34] yes we can
[17:10:49] Would not be terrible to triple it.
[17:10:57] I think all smaler wikis are very bot pedia a year ago
[17:10:59] I wish we had an urdu wikipedian to talk to.
[17:11:03] Yeah.
[17:11:03] itnerwiki linking and all
[17:11:08] mind you that doesnt happen anymore
[17:11:14] per wikidata
[17:11:14] Yeah. Wikidata now.
[17:11:22] perhaps this summer is a better start for our sample
[17:11:29] where no more itnerwiki bot activity should exist
[17:11:40] I'm worries about the periodic nature of vandalism.
[17:11:44] we normally do 2014
[17:11:56] We get a lot more over the summer months than the western school year in english Wikiepdia.
[17:12:08] we can randomly sample staring on say may 1st for 2015 until now
[17:12:19] that should cover most types of vandalism
[17:12:36] Yeah... you're probably right. Let me try changing the sample and see what happens.
[17:12:45] Can you relay our progress to the urdu contact.
[17:12:51] sure
[17:12:53] It will be good to know that we're not jsut sitting on this.
[17:13:02] * halfak goes to change the sample.
[17:13:57] http://quarry.wmflabs.org/query/6337
[17:19:36] I think they have constant feedback seing thigs happen but yes it is always good to inform :)
[17:37:23] Wooops! Looks like I broke revscoring.
[17:53:10] ToAruShiroiNeko, so I ran the revert detections script on the 20k sample (old one) from urwiki. We only have 100 reverted edits!
[17:53:24] We might need to generate a balanced dataset like we've done for Wikidata.
[17:53:30] yeah, perhaps
[17:53:33] In the meantime, I'll try out the new sample.
[17:53:47] I dont really exp[ect too many reverts on smaller wikis
[17:53:55] since often vandalism isnt detected at all
[17:54:11] you have 2-5 people at best running a site with thousands of pages
[17:54:23] which are mostly bot generated stubs
[17:54:47] wiki-ai/revscoring#354 (fix_import_from_path - f835169 : halfak): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/95208672
[17:56:30] ToAruShiroiNeko, OK.. Running on the new sample now.
[17:56:40] ToAruShiroiNeko, you think that we can ever support urwiki then?
[17:56:46] We'd need a substantial amount of labels.
[17:57:03] Seems like it's a community that could really use some damage detection support. :S
[18:20:09] yes
[18:20:21] remember how we talked about supporting smaller wikis in bulk?
[18:21:01] since they will primarily face non local vandalism
[18:21:11] this is a new type of challenge for us I think
[18:21:40] ur wiki has 188 active users btw
[18:21:43] so it is not so bad
[18:21:56] 15 admins
[18:22:33] I like to divide that value by 10
[18:22:59] so we probably habe 19 people that may potentially want to help us while others are one off editors
[18:23:50] Seems like that could work.
[18:24:15] doesnt change lack of reverts possibly
[18:29:56] ToAruShiroiNeko, yeah... Even so, I think we can address that.
[18:30:11] We're getting an order of magnitude more reverts/revision in urwiki than in wikidata :)
[18:30:42] FYI: http://socio-technologist.blogspot.com/2015/12/disparate-impact-of-damage-detection-on.html
[18:43:16] no-user AUC is when you dont use user.age and user.is_anon right?
[18:43:40] Yes
[18:43:58] I think we can get 0.02 - 0.05 AUC back from model tuning.
[18:44:01] I'm still working on that.
[18:44:39] I am pleased to see that we have gotten to a place where we worry about 0.02 AUCs rather than much larger gains :)
[18:44:47] racing towards the 90s
[18:53:59] :D
[18:54:32] I'm hoping my next block entry will be on either (1) the results of the discussion on-wiki or (2) the results of model-tuning.
[18:55:44] ok
[18:59:09] *block --> blog
[18:59:11] yikes
[18:59:19] * halfak drinks more coffee
[19:15:47] * halfak releases revscoring==0.7.5
[19:15:56] Includes tuning utility!
[19:15:58] :D
[19:27:50] * YuviPanda waves weakly
[19:28:31] hmm
[19:28:46] this is something we discussed but
[19:29:04] do you think we can reasonably have a system where we let the user tain a model by turning a feature on or off?
[19:29:08] or more features
[19:29:17] ToAruShiroiNeko, we did way better with the 2015 only dataset.
[19:29:22] 545 sketchy edits.
[19:29:24] :D
[19:30:04] Oh wait. No. That's worse
[19:30:05] :S
[19:30:06] lol
[19:30:35] ToAruShiroiNeko, we can't turn features on and off.
[19:30:41] That's not really how the models work.
[19:30:51] However, we *can* hold a feature value constant.
[19:31:39] Related to https://github.com/wiki-ai/ores/issues/100 and https://github.com/wiki-ai/ores/issues/101
[19:31:56] I've shown that it can be done. We just need to work out the request pattern for doing it.
[19:32:00] * YuviPanda leeches some energy from halfak
[19:32:17] * halfak remembers that he forgot to eat lunch.
[19:32:41] YuviPanda, http://socio-technologist.blogspot.com/2015/12/disparate-impact-of-damage-detection-on.html
[19:32:51] I think we have some work to do here.
[19:33:07] I just talked to Dario about changing our goals for next quarter so that I can focus more on this.
[19:33:36] \o/
[19:33:42] more time to do things!
[19:33:44] * YuviPanda rads
[19:33:46] *reads
[19:38:14] So yeah. I'll be writing a report tomorrow or Tuesday that summarizes the problem, explains the loss in fitness, advocates for deploying the new models immediately, and discusses how our current projects are aimed at increasing fitness without resorting to these problematic user-dimensions.
[19:39:08] halfak: that's nice, and I'm glad that the people 'we' have working on things like this think of things like this
[19:39:35] :)
[19:39:36] I wonder if we can / should offer both the models simultaneously..
[19:40:06] YuviPanda, yeah. We discussed that too. (1) If we do that, the default should not use these user dimensions and (2) we'll need to do some refactoring to ORES.
[19:40:59] yeah, (1) seems sane
[19:41:05] and 2 yeah not sure how hard/easy that is
[19:41:16] * YuviPanda knows surprisingly nothing of ORES's code still
[19:41:26] Well, it might not be *so* bad. ORES has the capacity to host an infinite number of models.
[19:41:36] But they are in a flat directory key-value pairs.
[19:41:50] So we can have enwiki/damaging and enwiki/damaging-user
[19:42:20] Not that I'm looking at it again, that doesn't look so bad.
[19:42:31] \o/
[19:42:34] cool :)
[19:42:51] Yeah... that's a good proposal :)
[19:42:58] It will be helpful for making the comparison too.
[19:43:36] So, we'll have "-user" for all "reverted", "damaging" and "goodfaith" models -- at least for a little while.
[19:44:02] It seems that we will want to deprecate those as soon as the non-"-user" models are high enough fitness.
[19:44:09] Do you think that makes sense?
[19:44:10] yeah
[19:44:14] We'd be removing a path from the API.
[19:44:15] I think explicitly setting a numeric goal
[19:44:20] might be good
[19:44:24] although might be harsh too
[19:44:29] on you that is.
[19:44:49] That's OK. If we can't hit the goal, let's have another honest conversation.
[19:45:54] IMO, the goal should be 0.840 AUC.
[19:46:05] This is the state of the art as of http://repository.upenn.edu/cgi/viewcontent.cgi?article=1515&context=cis_papers
[19:46:44] So, some of the models are already ready.
[19:47:15] :)
[19:47:40] halfak no I mean the user can order ores to retrain a model without a feature
[19:48:35] ToAruShiroiNeko, yeah... ORES doesn't retrain models itself though.
[19:48:41] That sounds like a lot of new code to maintain.
[19:49:17] Seems like we should think about how we'd like users train models directly first.
[19:49:34] E.g. we could provide a UI for gathering datasets and adapters to ORES and Quarry.
[19:49:37] * YuviPanda continues work on PAWS for a bit
[19:56:02] halfak: hmm, I wonder if there's an ecosystem of tools emerging now that'll need integration (ORES, Quarry, PAWS)
[19:56:32] Indeed. That's a good point.
[19:56:40] I want Wikilabels to be integrated with quarry too.
[19:56:51] So that people can pull in datasets for labeling from a query.
[19:58:46] right
[19:58:53] halfak: I've been thinking of writing a small library
[19:58:59] halfak: that can pull datasets from quarry easily
[19:59:35] YuviPanda, next step is being able to run the queries :D
[19:59:42] halfak: and put that into PAWS. as well as a small library for pageviews.
[19:59:51] halfak: hmm, so the problem there is that you need OAuth :)
[19:59:57] and it isn't open to all
[20:00:02] Oh yeah. No totally do what you were thinking.
[20:00:05] but that problem kindof goes awway with PAWS
[20:00:09] PAWS to Quarry as a lib would be great.
[20:00:12] since it's got deep OAuth integration now
[20:00:24] halfak: I want PAWS to have direct mysql access too :D
[20:00:33] +1
[20:00:33] I wonder if PAWS can become a workbench of sorts...
[20:00:39] +1!
[20:00:39] mix in different stuff from multiple places
[20:00:41] and publish it
[20:00:49] Wikiworkbench == next Wikimedia Project
[20:00:56] https://phabricator.wikimedia.org/T119859
[20:01:11] that bug includes notebook rendering
[20:01:17] so you can do some work and publish the notebook
[20:01:35] +1. Any chance for collaboratice notebooks?
[20:01:51] It would be great if we could do github style commit access sharing.
[20:02:08] halfak: upstream is working on it
[20:02:17] Gotcha. \o/
[20:02:20] halfak: there's a project to diff and merge notebooks
[20:02:24] halfak: which is their first step
[20:02:29] +1
[20:02:49] halfak: there's http://colaboratory.jupyter.org/welcome/
[20:03:06] which uses google docs I think
[20:03:08] for collaboration
[20:03:13] which is sadly out of the question for us
[20:05:01] Yeah. Agreed.
[20:09:15] all of this somehow reminds me of http://imgur.com/gallery/Yodcq
[20:09:45] OK. I'm off for the day. Have a good one, folks!
[20:10:33] halfak: cya!
[21:24:43] Cya