[00:02:04] awight: hey!
[00:02:07] :)
[00:02:23] thanks for your work on the extension
[00:02:43] re your code review, what do you suggest as the name of the function?
[00:09:11] halfak: the ar.wp guy finished the job :)
[00:09:15] it's in meta
[00:09:22] \o/
[00:09:33] I just saw his private message on Facebook
[00:09:33] I think I'm ~3 languages behind then
[00:09:39] Which is awesome.
[00:09:56] Maaaybe you could take a look at implementing those languages while I work on ORES UI?
[00:10:15] It would be especially helpful for ar since I struggle to look at ltr.
[00:10:17] *rtl!
[00:10:49] Stupid eye-brain interface is like "what is this?" and I'm like "words, yo" and it's like "nope. Can't be"
[00:11:07] :))))
[00:11:13] it happens to me too
[00:11:38] bidi gave me headaches all the time :)
[00:11:43] I'll do it
[00:12:46] Yeah. That's what my editor does. Not OK. I wish I could tell it to just put everything LTR and I'll work it out.
[00:13:16] Really, a big part is that anything substantially beyond ASCII (anything non-Latin) puts the cursor at the wrong position too.
[00:13:32] So I get to do a lot of type-undo-move-type-undo-move...
[00:13:48] :)))
[00:14:07] Next time I see you, I'll teach you how to work with bidi and non-Latin
[00:49:39] Amir1: Running off for the day, sorry! I think that function will be fine for now.
[00:49:58] oh thanks :)
[00:49:59] It's small enough that, in the unlikely event that someone other than us looks at it, they can just guess what it does ;)
[00:50:12] let's change it later
[00:50:16] I'll think about it
[00:50:21] Everything else in your patches was great, so I had to find something pedantic to complain about.
[00:50:37] I'll be around tomorrow, thanks!
[00:50:37] if you have anything that needs review, I'd be thrilled
[00:50:43] hehe ok
[00:50:52] thanks
[00:50:56] :)
[00:50:59] Amir1, can you link me to the latest ORES UI code?
[00:51:11] * halfak finally is ready to get it merged into ORES!
[00:51:18] I did take a look at the Special:Contributions integration, cannot remember if I got anywhere however.
[00:51:28] bye
[00:51:52] halfak: https://github.com/wiki-ai/ORES-GUI
[00:52:42] halfak: http://rabble.ca/sites/rabble/files/node-images/6887406195_ff05a643de_z.jpg
[01:10:39] halfak: I'm trying to translate the names I know in Arabic; for some of them I'm not sure
[01:10:45] so I skip them
[01:10:51] would that be okay with you?
[01:11:18] Amir1, as in adding comments after them with the English translation?
[01:11:34] yes
[01:12:09] e.g. Arabic has tons of names for "these" and "those"
[01:12:27] I don't remember which word implies what
[01:12:48] but I know it's about "these"
[01:16:59] Amir1, no problem then. Thanks for doing translations for what you can :)
[01:17:02] Also see Amir1, https://github.com/wiki-ai/ores/pull/118
[01:17:23] It seems to work as expected when I run the dev server.
[01:17:42] We need to handle a few more errors, but I can file those as feature requests.
[01:17:49] Want to review and merge when ready?
[01:18:28] I've got to run. I'm having company over soon and I have to clean up.
[01:18:35] I'll swing by to check messages in a couple hours.
[01:18:36] o/
[01:18:40] and merged
[01:18:54] thanks for doing this
[01:19:02] next time I'll work on Flask directly
[01:33:58] damn you git
[01:36:42] halfak: https://github.com/wiki-ai/revscoring/pull/235
[02:28:12] hey Amir1, sorry it took so long, had to put out some fires
[02:28:21] are you still around?
[02:28:52] I'm not going to be on here much longer, but I wanna make sure to catch you today
[14:44:18] o/
[20:59:16] o/ halfak
[20:59:23] I'm still a little bit sleepy
[20:59:31] waiting for coffee to kick in
[20:59:40] o/ Amir1
[21:27:07] halfak: I've a suggestion to make.
[21:27:33] as a community liaison, I see people have trouble making use of our scores
[21:27:55] they don't know what 90% means and it varies between models
[21:28:16] so 90% in one model is not a bad thing but in another model it is
[21:29:28] so I suggest we define something called "ORES hardness" (or any name you suggest) and put it in user preferences, 1 would be the easiest, catching only really bad edits (high precision, low recall)
[21:29:45] and five would be the toughest (low precision, high recall)
[21:30:29] Amir1, seems like those are the only two thresholds we need.
[21:30:35] and we should determine the thresholds at those numbers
[21:30:41] +1
[21:30:58] So we could flag something as "review" (high recall) and "revert" (high precision).
[21:31:45] hmm
[21:32:50] okay :)
[21:33:55] I want to run some analysis to see what numbers these should be
[21:35:16] +1. It would be great if this was part of a testing utility for revscoring.
[21:35:32] E.g. we could split `train_test` into `train` and `test`.
[21:35:45] That way, we can have more control over our test set
[21:36:04] E.g. we can train over a balanced set and test with a random sample.
[21:36:46] that would be good
[21:37:14] next week is the between-semesters break
[21:37:24] so I've got nothing else to do except ORES
[21:37:26] yay
[21:37:33] \o/
[21:37:43] Regretfully, I'll be at a workshop for the second half of the week.
[21:39:45] assign something hard to me, and I'll keep myself busy :D
[21:40:25] Generating these thresholds sounds like a good assignment to me.
[21:42:24] o/ awight
[22:07:44] Amir1 halfak: This morning I pushed this commit directly
[22:07:46] Amir1 https://github.com/wiki-ai/ores/commit/fadb8e592dc8c6230888c47ed3bccf0a65e9ced3
[22:07:47] Amir1 I hope I didn't do anything bad
[22:07:49] Amir1 It was a very minor and easy bug
[22:07:59] strangely I was disconnected
[22:08:25] Amir1, no worries. looks good.
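The "review" (high recall) and "revert" (high precision) thresholds proposed at 21:29-21:31 could be derived from a labeled test set roughly as follows. This is a minimal sketch, not revscoring's API: `precision_recall_at`, `pick_thresholds`, and the 0.9 targets are hypothetical names and values chosen only for illustration.

```python
from typing import List, Optional, Tuple


def precision_recall_at(scores: List[float], labels: List[bool],
                        threshold: float) -> Tuple[float, float]:
    """Flag every edit scoring >= threshold and return (precision, recall)."""
    flagged = [label for score, label in zip(scores, labels)
               if score >= threshold]
    true_pos = sum(flagged)
    total_pos = sum(labels)
    # With nothing flagged there are no false positives, so call precision 1.0.
    precision = true_pos / len(flagged) if flagged else 1.0
    recall = true_pos / total_pos if total_pos else 0.0
    return precision, recall


def pick_thresholds(scores: List[float], labels: List[bool],
                    min_precision: float = 0.9,  # assumed target
                    min_recall: float = 0.9      # assumed target
                    ) -> Tuple[Optional[float], Optional[float]]:
    """Return (revert_threshold, review_threshold).

    revert: the lowest observed score whose precision meets min_precision.
    review: the highest observed score whose recall meets min_recall.
    Either value is None when no observed score satisfies the target.
    """
    revert = None
    review = None
    for threshold in sorted(set(scores)):
        precision, recall = precision_recall_at(scores, labels, threshold)
        if revert is None and precision >= min_precision:
            revert = threshold
        if recall >= min_recall:
            review = threshold  # recall only falls as the threshold rises
    return revert, review
```

Because the targets are stated as precision and recall rather than as raw scores, the same two settings would mean the same thing in every model, even when a 90% score in one model is not comparable to 90% in another.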
[22:08:41] https://travis-ci.org/wiki-ai/revscoring/builds/103713329
[22:08:48] I also added Arabic tests
[22:08:54] waiting to see if it passes
[22:09:09] \o/
[22:09:24] Amir1, was going to show you this too https://github.com/wiki-ai/revscoring/issues/234
[22:09:26] It's an idea that I wrote up yesterday.
[22:09:47] This would help me optimize our feature extraction performance.
[22:11:23] that would be really cool
[22:11:33] I don't know how you want to do it
[22:11:40] but that sounds awesome
[22:11:46] Yeah. So that's something I'd like to discuss.
[22:11:55] I think I can do an initial WIP PR tonight
[22:12:09] That will put some timing events in extract_features
[22:12:09] great
[22:12:31] It would be nice to hand this off to you and let you take a pass on it.
[22:12:39] after these Arabic tests, I'll go change the GUI based on May suggestions
[22:12:39] But I'll be available to chat.
[22:12:50] and I'll send you my draft of the paper
[22:12:56] Amir1, great. BTW, I'm just finishing up v1.0 updates to wikiclass.
[22:13:06] We've got updates to all of the editquality models ready.
[22:13:17] I've re-generated all of the tuning reports too :)
[22:13:20] oh I need to run some analysis on test data
[22:13:30] \o/
[22:13:36] Oh yeah. I generated that dataset for you when I woke up this morning!
[22:13:45] yeah
[22:13:46] So many fun things to do. So little time in the day to do them :\
[22:13:47] thanks
[22:14:03] I have a lot of time today
[22:14:06] I just woke up
[22:14:08] :)
[22:14:21] I'm considering working on the ORES extension too, if I get some time
[22:14:51] halfak: I would be thrilled to check the PR
[22:15:12] and do whatever you think I should do
[22:15:24] I've never done profiling before
[22:15:39] Amir1, regretfully, I don't think that using Python's cProfile will get us what we want.
[22:15:49] but I think it will be obvious from my initial PR how that will work.
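The "timing events in extract_features" idea at 22:12:09 might look something like the following. A sketch only: `TimingLog` and `extract_features_timed` are hypothetical names, features are modeled as plain callables, and none of this reflects what the actual revscoring PR does.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class TimingLog:
    """Collects wall-clock durations keyed by an event name."""

    def __init__(self):
        self.durations = defaultdict(list)

    @contextmanager
    def timed(self, name):
        """Record how long the enclosed block takes, even if it raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.durations[name].append(time.perf_counter() - start)

    def summary(self):
        """Return (name, calls, total_seconds) rows, slowest first."""
        rows = [(name, len(times), sum(times))
                for name, times in self.durations.items()]
        return sorted(rows, key=lambda row: row[2], reverse=True)


def extract_features_timed(features, revision, log):
    """Hypothetical stand-in for extract_features.

    `features` maps a feature name to a callable that takes the revision.
    Each call is wrapped in a timing event so slow features stand out.
    """
    values = {}
    for name, func in features.items():
        with log.timed(name):
            values[name] = func(revision)
    return values
```

Unlike cProfile, which reports per-function totals, this records one event per named feature, so the summary can be read directly as "which features cost the most to extract".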
[22:16:02] Or at least it will open it wide for criticism :)
[22:16:54] I meant profiling in general
[22:17:07] I'd like to see how it works
[22:20:09] halfak: it's safe to merge the Arabic patch, please review the PR
[22:20:11] :)
[22:20:25] Amir1, will do in 5 min
[22:20:29] Ran 4 tests in 0.077s
[22:20:31] OK
[22:20:35] ok :)
[22:20:37] thanks
[22:22:03] Uhoh. So, this PR is going to need to be ported to revscoring 1.0
[22:22:09] I'll do both the review and the porting.
[22:22:14] We should probably merge 1.0 soon.
[22:22:27] I just want to make sure that the article quality models work as expected first.
[22:22:56] have we released revscoring 1.0?
[22:23:12] that's cool
[22:23:17] I didn't know that
[22:23:53] 1.0 isn't out yet. But I have all of the boxes tick'd.
[22:24:00] Just want to make sure all the models still work first :)
[22:24:14] When we release 1.0, editquality and wikiclass will be ready to go.
[22:24:25] And wb-vandalism will become deprecated.
[22:24:44] awesome
[22:24:48] It was really easy to merge wb-vandalism stuff into editquality now that wikibase is a core part of revscoring :)
[22:25:16] I told you that some features are from revscoring 1.0. Have you had time to add them?
[22:25:29] *are missing
[22:25:29] *are missing?
[22:25:31] ha
[22:25:35] jinx
[22:25:42] :P
[22:25:44] So, I have everything that I thought made sense
[22:25:50] And re-generated the wikibase model.
[22:25:58] You can see it in the WIP PR I have for editquality
[22:26:01] * halfak gets link
[22:26:33] * halfak's computer locks up loading the PR
[22:26:34] lol
[22:26:53] haha
[22:27:02] https://github.com/wiki-ai/editquality/blob/rs_v1/editquality/feature_lists/wikidatawiki.py
[22:31:12] I'm searching for my notes re. missing features
[22:32:09] I found them
[22:32:16] halfak: two features were missing
[22:32:18] diff.identifiers_changed
[22:32:26] and diff.en_label_touched
[22:32:36] both of them are extremely important
[22:32:39] en_label_changed
[22:32:41] Is there
[22:32:52] maybe I forgot or you added it later
[22:33:20] diff.number_changed_identifiers
[22:33:22] ^ that it?
[22:34:08] if isinstance(old.target, str): counter += 1
[22:34:13] https://github.com/wiki-ai/wb-vandalism/blob/master/wb_vandalism/features/diff.py#L166
[22:34:21] I think so now
[22:34:24] yeah
[22:34:30] Ok. That's doable.
[22:34:38] * halfak hacks on revscoring PR.
[22:34:50] that's important :)
[22:34:53] thanks
[22:39:09] Ha! We already have "identifiers changed" implemented.
[22:39:15] I just need to add it to the feature list :)
[22:39:43] Nope. It's there :)
[22:39:43] https://github.com/wiki-ai/editquality/blob/rs_v1/editquality/feature_lists/mediawiki.py#L97
[22:39:46] \o/
[22:43:08] awesome
[22:43:17] probably it wasn't there
[22:43:22] thanks
[22:43:40] Amir1, I'm sure I added those after you pointed them out and just forgot :)
[22:44:09] \o/
[22:46:42] * halfak pulls up the Arabic PR
[22:47:39] Hello all
[22:47:47] o/ pipivoj
[22:49:33] {{merged}}
[22:49:33] [1] https://meta.wikimedia.org/wiki/Template:merged
[22:49:34] YuviPanda, is PAWS disabled ATM? I'm having trouble logging in
[22:50:02] pipivoj: yeah, we're dealing with
[22:50:06] http://www.ubuntu.com/usn/usn-2870-1/
[22:50:11] a security vulnerability everywhere
[22:50:14] so it's going to be shaky
[22:50:16] for a bit
[22:50:18] sorry about that
[22:50:20] ok, nice to have fast feedback
[22:52:40] \o/
[23:01:12] Amir1, I found a dictionary for Arabic in aspell!
[23:01:19] * halfak adds that to the Arabic stuff.
[23:01:34] yay
[23:21:20] WORKING WITH MIXED DIRECTION TEXT IS AWFUL AHHH
[23:30:05] * halfak crosses fingers and monitors travis
[23:30:19] Amir1, willing to merge this monster as soon as I get tests passing on travis?
[23:30:30] (giant v1.0 PR)
[23:30:32] of course
[23:30:36] Cool
[23:30:39] * halfak keeps monitoring
[23:30:46] that would be my honor
[23:41:55] halfak: see this
[23:41:57] https://gist.github.com/Ladsgroup/d1e94f1100f8822c6dfa
[23:42:03] this is an analysis of the test set
[23:42:22] it can tell people what the best threshold to choose would be
[23:42:51] What are the columns?
[23:42:56] Oh! I see.
[23:43:24] Our recall looks bad at f4
[23:43:30] Or not as good as I'd like
[23:44:18] I can go on to f5 or f6
[23:44:37] Shouldn't need to. It'll be important that we look into that.
[23:44:43] Maybe some of the damaging edits are mislabeled.
[23:44:51] hmm
[23:44:56] I can give a list of them
[23:45:01] It would be nice if we had a good interface for reviewing them
[23:45:17] +1 +1 +1 +1 +1 ....
[23:46:11] especially if it shows damaging edits with low scores first and not-damaging edits with high scores first
[23:46:35] so we can see whether we are missing some features, have mislabeled edits, etc.
[23:48:12] we have damaging edits with these scores: [0.020074954581917433, 0.02398766085883239, 0.05767973435439568, 0.06174985381944237, 0.08295150632116773, 0.09603874304760768, 0.09731090857607992, 0.15932536252770257, 0.22263375677283145, 0.24245648070651324, 0.2722669293828467, 0.31387378742441174, 0.3207530364629796, 0.3224913777973052, 0.41698496208767855, 0.4400582351118262, 0.45135386846911585, 0.5, 0.5222901095029724, 0.5400177595550764, 0.5438202248495151, 0.5783641437925711,
[23:48:59] it's interesting to me that we have even a 0.02 score in our predictor
[23:49:28] If I go and change a dot it scores me at least 50%
[23:50:08] Amir1, yeah. I want to see that 0.02 one
[23:50:14] let me get rev ids
[23:50:49] 625453607: True, 0.097
[23:51:09] 641821881: 0.02
[23:51:30] 634594551: 0.02
[23:51:50] 606878467
[23:52:08] 645979588
[23:52:09] 641821881 isn't damaging
[23:52:40] 640127045
[23:52:56] 618595654
[23:53:06] these are all damaging ones with less than 10%
[23:53:12] let's check
[23:53:53] 625453607 isn't damaging
[23:53:59] https://en.wikipedia.org/wiki/?diff=640127045 is not damaging
[23:54:32] we had a vandal in our labeling system
[23:54:45] This is possible
[23:55:02] it's strange to me "641821881" scored 0.02
[23:55:19] oh my bad
[23:55:20] Really?
[23:55:26] it's definitely
[23:55:28] good thing
[23:55:31] :)
[23:55:36] I thought the editor was adding "bold text"
[23:55:44] I think the labeler made the same mistake
[23:55:48] yeah ^
[23:56:13] we definitely need to do some reviews
[23:56:38] what do you suggest halfak? :)
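The review ordering suggested at 23:46:11 (damaging edits with low scores first, not-damaging edits with high scores first) amounts to sorting by how strongly the model disagrees with the label. A minimal sketch, assuming each observation is a `(rev_id, labeled_damaging, score)` tuple; `review_queue` is a hypothetical helper and the revision IDs in the usage example are made up, not taken from the test set.

```python
from typing import List, Tuple

Observation = Tuple[int, bool, float]  # (rev_id, labeled_damaging, score)


def disagreement(observation: Observation) -> float:
    """How far the damaging-score is from what the label would suggest."""
    _rev_id, labeled_damaging, score = observation
    # A damaging label with a low score, or a not-damaging label with a
    # high score, both yield a value close to 1.0.
    return (1.0 - score) if labeled_damaging else score


def review_queue(observations: List[Observation]) -> List[Observation]:
    """Order observations so the most suspicious labels come first."""
    return sorted(observations, key=disagreement, reverse=True)


if __name__ == "__main__":
    # Made-up revision IDs and scores, for illustration only.
    sample = [
        (101, True, 0.02),   # labeled damaging, model says almost surely fine
        (102, True, 0.75),   # label and model roughly agree
        (103, False, 0.95),  # labeled fine, model says almost surely damaging
        (104, False, 0.10),  # label and model agree
    ]
    for rev_id, labeled_damaging, score in review_queue(sample):
        print(rev_id, labeled_damaging, score)
```

Working down such a queue would surface both kinds of problem mentioned above: labels that are simply wrong, and genuinely damaging edits that the current features fail to catch.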