[00:03:03] halfak: are you sure we need ORES extension for user contributions?
[00:03:14] it sounds like a hard task
[00:03:20] Amir1, I use it. It's good for reviewing the activities of a new user.
[00:03:26] But no. I'm not sure it is necessaru
[00:03:30] *necessary
[00:03:44] It's just a shame that ScoredRevisions has that functionality and the ORES extension will not
[00:03:47] okay
[00:04:55] OK. 9/10000 edits are vandalism
[00:05:01] Another 3 are probably damaging
[00:11:07] We should have a feature for # of capital letters added in a description.
[00:11:10] https://www.wikidata.org/wiki/?diff=284751409
[00:11:22] https://www.wikidata.org/wiki/?diff=284751409
[00:11:23] But honestly, this looks pretty good.
[00:11:33] it gives me 76%
[00:11:45] it's not as good as I want
[00:11:54] but it's not 8%
[00:12:06] can you check again halfak ?
[00:12:13] Wait a tick
[00:12:14] http://ores.wmflabs.org/scores/wikidatawiki/reverted/284751409/
[00:12:18] Not gives me 100%
[00:12:21] *Now
[00:12:26] WTF
[00:13:00] WEIRD!
[00:13:42] I'm going to check some of the others.
[00:14:18] Yeah... all the rest check out.
[00:14:22] This is strange.
[00:15:12] I'm OK with us not catching https://www.wikidata.org/wiki/?diff=279323030
[00:15:49] Otherwise, it looks like we can set the threshold at 0.95 and get what we want.
[00:17:04] If we set the threshold that high, we only need to review 1.3% of edits!
[00:17:06] This is crazy!
[00:17:22] We need more observations and we need to formalize this revert review process.
[00:17:28] http://ores.wmflabs.org/scores/wikidatawiki/reverted/279323030/
[00:17:40] it's crazy that we catch this too
[00:17:51] * Amir1 headdesks
[00:18:47] Wait a second. We... weren't catching that a moment ago
[00:18:58] What is going on with my scores!?!
[00:19:30] Hmmm.. Could this be caching?
[00:19:42] We didn't update the version with this recent deploy. I figured the models were similar enough.
[00:19:50] * halfak checks against scoring utility
[00:21:24] some of my scores have changed too
[00:21:29] check the etherpad
[00:21:52] Local scorer confirms our new scores.
[00:21:57] I think we hit a caching issue.
[00:22:07] Arg. Now I need to regenerate scores.
[00:22:37] I confirmed a few that "changed"
[00:23:17] * halfak starts scoring script again
[00:23:29] so basically at threshold 95% we have recall 100%
[00:25:48] sample everything that is above 95% I bet we can find some vandalism that people didn't catch
[00:26:25] Good point. Let's look at that. I'll post that in the etherpad.
[00:26:31] Also, I can't seem to score 284477427 anymore
[00:26:39] Looks like it's one of the bugs I reported.
[00:26:58] okay
[00:27:03] I will take a look at it
[00:27:10] and try to fix the bug
[00:27:45] that would be great if we can publish these results in two or three days
[00:29:33] +1
[00:29:55] It'll be hard with the holiday, but I plan to be hacking on Friday and Saturday.
[00:32:14] Wow.
[00:32:21] Lots of vandalism here that wasn't reverted!
[00:32:22] 284477427 is a pywikibase bug
[00:32:49] I looked at that and I'd figured. Glad to find it :)
[00:32:57] *glad you found it
[00:33:20] :)
[00:33:55] Woops. looks like I filed a dupe!
[00:35:00] 282788592 is probably a wb-vandalism bug
[00:35:43] https://phabricator.wikimedia.org/T117802
[00:35:55] halfak: do you have some list so we can revert vandalisms?
[00:36:19] Yeah. I'm still generating scores, but I'll paste what I have so far.
[00:36:50] thanks
[00:37:42] I think our time constraint is too strict for wikidata
[00:37:53] We should move the revert window back.
[00:37:58] Right now, it is 48 hours.
[00:38:08] Apparently vandalism in wikidata tends to sit for a while.
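The threshold figures quoted above (reviewing only edits scored at or above 0.95 while still catching all known vandalism) come down to two ratios: the fraction of edits flagged for review and the recall among labeled vandalism. A minimal sketch of that arithmetic follows. The function name `review_stats` and the sample data are hypothetical and made up for illustration; they are not the scoring script or the data discussed in this log.

```python
def review_stats(scored_edits, threshold):
    """Return (fraction_flagged, recall) for a score threshold.

    scored_edits: iterable of (score, is_vandalism) pairs, where score is
    the model's probability and is_vandalism is a human judgement.
    recall is None when the sample contains no vandalism at all.
    """
    scored_edits = list(scored_edits)
    if not scored_edits:
        raise ValueError("need at least one scored edit")

    # Edits a reviewer would have to look at.
    flagged = [(score, bad) for score, bad in scored_edits if score >= threshold]

    total_vandalism = sum(1 for _, bad in scored_edits if bad)
    caught = sum(1 for _, bad in flagged if bad)

    fraction_flagged = len(flagged) / len(scored_edits)
    recall = caught / total_vandalism if total_vandalism else None
    return fraction_flagged, recall


# Made-up sample: 8 edits, 2 of them vandalism.
sample = [
    (0.99, True), (0.97, True), (0.96, False), (0.40, False),
    (0.10, False), (0.05, False), (0.02, False), (0.01, False),
]
print(review_stats(sample, 0.95))  # (0.375, 1.0)
```

On real data the interesting question is how far the threshold can be raised before recall drops below 1.0, which is what the sampling of high-scoring edits in this session is probing.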
[00:40:35] the best thing is finishing this campaign
[00:42:07] halfak: your scores seem wrong
[00:42:19] http://ores-staging.wmflabs.org/scores/wikidatawiki/reverted/285666558/
[00:42:43] http://ores.wmflabs.org/scores/wikidatawiki/reverted/285666558/
[00:42:44] but you gave 1.0
[00:42:49] Hmmm
[00:43:01] it gives 6%
[00:43:07] oh boy
[00:44:17] we also should push staging model to the main server
[00:44:21] yup. Something is definitely very weird.
[00:44:27] It was pushed.
[00:44:36] We could be experiencing caching issues.
[00:44:54] But I think my scoring script is somehow making mistakes.
[00:45:04] let's score everything directly
[00:45:17] using wb-vandalism
[00:47:04] I'm getting headaches
[00:47:23] Yes. I'm worried that it is something to do with the ORES server
[00:47:31] I'm running some tests
[00:51:05] Ha! I made the error happen. Now to figure out why.
[00:51:16] OH! It's a dropped rev_id.
[00:51:31] Whenever we get a scoring failure I get an off by one!
[00:51:38] It's in my script!
[00:51:40] Not ORES
[00:51:41] :D
[00:52:26] OK Running again
[00:52:28] ^ Amir1
[00:53:10] yay
[00:53:11] yay
[00:53:18] https://gist.github.com/halfak/765261c47b9204a60db0
[00:53:24] ^ This is what causes the error
[00:53:42] if you run zip() on uneven length iterables, it will just iterate until the first one is done.
[00:54:00] Lesson learned!
[00:55:59] New paste
[00:56:05] Ready in etherpad
[00:59:13] awesome
[01:00:33] "Martin Shkreli" is one of the items I prefer to be vandalized
[01:00:35] :D
[01:01:03] lol I had a good laugh at that one too.
[01:02:23] should I revert this vandalism? Nah, let's stay there for a while :D
[01:05:46] OK. new list is ready
[01:12:20] WTF is this? https://www.wikidata.org/wiki/?diff=286096053
[01:12:21] Seriously
[01:12:27] Every year gets an item
[01:18:39] interestingly this vandalism was there for 24 days: https://www.wikidata.org/w/index.php?title=Q79995&diff=next&oldid=278316832
[01:18:55] that's such a shame
[01:19:31] Yeah.
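The zip() pitfall described at [00:53:42] can be reproduced in a few lines. This is a minimal sketch, not the code from the linked gist: the function names `pair_naive` and `pair_checked` and the sample revision IDs and scores are hypothetical. It assumes, as the discussion suggests, that a failed scoring request drops one score from the results so that the scores list ends up shorter than the rev_id list.

```python
from itertools import zip_longest


def pair_naive(rev_ids, scores):
    """Pair IDs with scores by position.

    zip() stops at the shorter input, so a dropped score silently shifts
    every later score onto the wrong rev_id and loses the last rev_id.
    """
    return list(zip(rev_ids, scores))


def pair_checked(rev_ids, scores):
    """Pair IDs with scores by position, failing loudly on a length mismatch."""
    missing = object()  # sentinel that cannot collide with real data
    pairs = []
    for rev_id, score in zip_longest(rev_ids, scores, fillvalue=missing):
        if rev_id is missing or score is missing:
            raise ValueError("rev_ids and scores have different lengths")
        pairs.append((rev_id, score))
    return pairs


rev_ids = [101, 102, 103, 104]
# Suppose scoring 102 failed and its result was dropped from the list.
scores = [0.10, 0.97, 0.05]

print(pair_naive(rev_ids, scores))
# [(101, 0.1), (102, 0.97), (103, 0.05)]
# 102 now carries 103's score, 103 carries 104's, and 104 has vanished.
```

On Python 3.10 and later, `zip(rev_ids, scores, strict=True)` raises the same kind of error without the sentinel loop. A more robust design is to have the scorer return (rev_id, score) pairs, or a dict keyed by rev_id, so alignment never depends on position.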
[01:19:57] We should catch badwords added in fields somehow.
[01:33:32] OK. That looks pretty good.
[01:33:38] thanks :)
[01:33:45] the good thing is we can improve it too
[01:33:57] we should teach it page moves are not bad
[01:34:03] yeah.
[01:34:10] I'm conflicted about person IDs
[01:34:21] (it's not wikidata's concern to fix vandalism in clients)
[01:34:23] It seems like some other system should worry about verifying those.
[01:34:40] +1 re. page moves and deletions
[01:34:53] Also, it seems that merge and redirect within Wikidata still is getting picked up.
[01:34:58] that's because I was asked to consider these statements more carefully
[01:35:04] because vandalism is more common
[01:35:10] Would you post this on-wiki with our notes?
[01:35:11] and verifying it is harder
[01:35:21] sure
[01:35:38] I should have sorted the list in the other direction.
[01:35:45] halfak: about page deletions, it's sorted out now
[01:35:57] I don't know why page moves still come up
[01:36:06] I see a few page deletions.
[01:36:17] oh
[01:36:25] They were pretty high before
[01:36:28] It must be because pages often get moved and then moved back.
[01:36:37] So that looks like a revert
[01:36:38] I started from 100% and I didn't get anything
[01:36:46] So our algorithm learns that page moves are bad.
[01:36:53] exactly
[01:36:54] We'll do way better with human labels :)
[01:37:03] Gotcha.
[01:37:36] I'm going to make a card quick about investigating wikidata's precision and recall.
[01:40:55] Amir1, I just merged the two lists
[01:43:35] and I just posted it
[01:43:36] https://www.wikidata.org/wiki/Wikidata_talk:ORES/Report_mistakes
[01:43:50] I'm not sure if it was the right place
[01:44:05] feel free to move wherever you think is okay
[01:44:08] I want to run through a quick sample of 0.80s and 0.70s.
[01:44:13] You should probably go to bed :P
[01:44:17] Or make breakfast ;)
[01:45:01] 5:15 AM
[01:45:10] break fast is better :D
[01:45:54] *breakfast
[01:46:13] I will study a little
[01:46:21] so I'm afk :)
[01:49:27] OK. Have good braining :)
[01:50:26] thanks :)
[03:40:08] OK. After cleanup and doing some spot checking, I think we can safely say that we can reduce the workload of reviewing the recent changes feed in Wikidata by two orders of magnitude.
[03:41:00] Reviewing 1% of edits will get you -- as far as we can tell -- 100% of the vandalism.
[03:41:04] \o/
[08:26:40] halfak https://phabricator.wikimedia.org/tag/revscoring/ this still looks populated
[08:27:14] few items though
[08:50:15] halfak 1% rules over 99% eh? :p
[16:23:27] o/ ToAruShiroiNeko
[16:23:40] Yeah. I tried to get that board deleted so that it would just show the project page.
[16:23:50] It turns out that phab doesn't support board deletion.
[16:23:51] :S
[16:36:30] o/ Amir1
[16:36:39] hey halfak
[16:36:46] these numbers are amazing
[16:36:48] Just started coffee and am reviewing your post on Wikidata
[16:36:53] Seriously.
[16:38:29] * Amir1 is waiting for coffee to kick in
[16:38:35] :D
[16:44:34] I'm hoping to get some more docs and test coverage in for the new PR this morning.
[16:50:09] ... before people show up to celebrate.
[17:01:13] Do you need any help?
[17:01:28] That I can do
[17:01:48] * halfak thinks.
[17:02:08] Actually no. You should study and/or have fun today
[17:02:10] :D
[17:25:59] thanks halfak :)
[23:47:56] hello all
[23:48:04] halfak I am way ahead of you today
[23:48:44] I have some feedback for you regarding Chinese Wikipedia character storage
[23:48:47] and also for ECOC
[23:50:57] helder pointed out https://zh.wikipedia.org/wiki/Special:PrefixIndex/MediaWiki:Conversiontable/zh
[23:50:59] etc etc