[00:03:40] halfak: I think we are good to go to cluster in a month (the ORES extension) [00:04:03] but I'm terrible at estimating :D [00:04:08] :P [00:04:31] but the biggest tasks are being handled [00:05:25] if ores.wmflabs.org can handle that many requests, it'll be a piece of cake [00:06:16] we need to determine which wikis we should start with (we also need community consensus, which won't be a problem for fa.wp at least) [00:06:43] I think we'll get consensus fast. [00:06:51] We're also going to get a lot more demands placed on us. [00:07:18] If the service gets popular (like I suspect), we'll be both the heroes and villains. [00:08:28] "We have a problem." [00:08:28] "Remember, there are no such things as problems, only opportunities." [00:08:28] "Well then, we have a DDoS opportunity" [00:09:08] I think we will have some "opportunities" like this when the extension is in the cluster [00:25:27] halfak: ^ :D [00:26:54] :D [00:36:01] Hey Amir1. [00:36:15] So... thinking about this change in the scoring scale. When do you think we should do it? [00:36:17] Hey! [00:36:42] So, we have two options: Reasonable percentage and Balanced percentage [00:37:11] Reasonable percentage means "revisions with this score have roughly the same probability of being damaging" [00:37:21] in an ideal world we could run old models too, it would be perfect to use reasonable percentage [00:37:22] Balanced percentage is what we have deployed right now [00:37:41] So, we could deploy pairs of models. [00:37:56] yeah [00:37:56] We'd still end up updating the old models, but we can train them in a balanced way for now. [00:38:02] that's a great idea [00:38:10] we can implement "damaging_beta" [00:38:16] And deploy e.g. "damaging_new" with a reasonable percentage. [00:42:14] Amir1, I'd like to avoid increasing our models exponentially. [00:42:22] But maybe we can do this just for a transition period. 
[00:42:33] yeah [00:42:39] definitely agreed on that [00:50:37] regarding the extension: We also have some upstream issues [00:50:50] core doesn't support things very well [00:51:11] Sounds like MediaWiki :\ [00:51:58] https://gerrit.wikimedia.org/r/#/c/265676/ [00:52:15] performance of this patch is horrible IMO [00:52:32] I'm trying to use a hook but there is none [00:53:51] I'll ask more from core people [00:56:10] Amir1, does it need a DB index? [00:56:20] A btree will be nice for filtering on arbitrary thresholds. [00:56:25] I had an idea of meeting devs at hackathons (cause it was pretty useful for me) but there is nothing in the near future [00:56:39] halfak: no [00:57:34] the issue is it needs to call and get the user preferences for every line of recent changes [00:58:08] we can load that threshold and add it as an attribute to the recent changes object [00:58:29] but there is no robust way to do it [00:58:43] Oh! Yikes [01:03:09] OR we can load the preference from the db and add it as attrbitue [01:03:12] *attribute [01:03:25] my Englis is really going backwards [01:03:34] it's pretty easy to do [01:03:38] You mean with a join? [01:03:46] yeah [01:03:53] Also, isn't the pref in the user_properties table already? [01:03:54] left join I guess (not sure) [01:04:08] yeah I think so [01:04:13] let me check my local [01:04:15] yay [01:04:21] it's really easy to [01:04:26] * Amir1 dances [01:14:37] halfak: do you have a timeframe for this? https://phabricator.wikimedia.org/T106867 [01:14:59] I think we should mark this solved before deploying the extension [01:15:23] Amir1, +1. [01:15:33] So I was just talking to YuviPanda about this. [01:16:33] We're blocked on akosiaris, so I'm going to check with him when he gets back from OoO. [01:19:07] I googled but I couldn't find what OoO means :D [01:19:17] *find out [01:21:37] Out of Office [01:21:50] AFK in work contexts :D [01:22:15] haha [01:29:27] I hate taking complex strings from users!!! 
AHHH [02:00:09] halfak: joining with user_properties won't work because it doesn't save the value when it's the default, so the join result would be empty [02:00:15] but I found a simple solution [02:00:19] I load the threshold [02:00:26] and just add it to the select [02:00:28] left join [02:00:31] http://stackoverflow.com/questions/3097150/add-a-temporary-column-with-a-value [02:00:44] that won't work either [02:00:57] Left join will return records even when only one table matches. [02:01:18] yeah I know [02:01:34] Hmm. I must be misunderstanding. Sorry. [02:01:46] maybe I'm wrong [02:01:51] I tried with left join [02:02:00] let me give it another try [02:02:07] IFNULL(up_value, "default")? [02:03:21] hmm [02:04:04] left join needs an "on" [02:04:27] joining recent changes with user_properties [02:04:33] we don't hvae anything [02:04:36] *have [02:04:50] left join user_properties on rc_user = up_user [02:05:21] we are getting a query of recent changes, not recent contributions of myself [02:05:35] I want to see anyone's edit not just mine [02:06:37] the join itself doesn't sound bad but its performance is poor AFAIK [02:06:51] I do the join and then fix it using WHERE [02:07:02] but that's a horrible idea :D [02:08:06] Oh! [02:08:32] Can't you put your user's threshold in the query itself? [02:09:03] http://stackoverflow.com/questions/3097150/add-a-temporary-column-with-a-value [02:09:08] I do it this way [02:09:25] (I showed this link to you about ten minutes earlier) [02:09:34] I think it's the best workaround [02:09:36] But why not just do it in the where? [02:10:00] WHERE ores_score > %(my_threshold)s " ... or whatever [02:11:00] We do this when we want to only get results of "hide good edits" [02:11:29] but in general we need this threshold to format the line (e.g. add the "r" flag) later [02:12:09] Oh yeah. That makes sense. You need the view builder that works with the rows to have access to the value while it decides what flags belong? 
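The workaround discussed above (loading the viewer's threshold once and selecting it as a constant column, so the code that formats each row can read it) can be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not the extension's PHP code: the table `recent_changes_demo`, its columns, the alias `ores_threshold`, and the helper `fetch_changes` are all hypothetical names made up for the example.

```python
import sqlite3


def fetch_changes(conn, threshold, hide_good=False):
    """Return (rc_id, score, ores_threshold) rows.

    The viewer's threshold is bound as a constant column, so no join
    against a per-user preferences table is needed.
    """
    sql = "SELECT rc_id, score, ? AS ores_threshold FROM recent_changes_demo"
    params = [threshold]
    if hide_good:
        # Only the "hide good edits" case also filters in the WHERE clause.
        sql += " WHERE score > ?"
        params.append(threshold)
    sql += " ORDER BY rc_id"
    return conn.execute(sql, params).fetchall()


# Hypothetical demo data: three changes with made-up scores.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_changes_demo (rc_id INTEGER PRIMARY KEY, score REAL)"
)
conn.executemany(
    "INSERT INTO recent_changes_demo VALUES (?, ?)",
    [(1, 0.10), (2, 0.85), (3, 0.40)],
)

# The row formatter can decide on the "r" flag from the row alone.
rows = fetch_changes(conn, threshold=0.5)
flags = {rc_id: ("r" if score > thr else "") for rc_id, score, thr in rows}
print(flags)
```

Because the threshold belongs to the person viewing the list rather than to the author of each change, joining on `rc_user = up_user` would look up the wrong user's preference, which is why the constant column fits this case better.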
[02:14:49] yes [02:14:53] exactly [02:45:10] This evaluation metric PR turned into more than I wanted. [02:54:15] Looks like this PR is going to chill for the night. [02:54:25] :)))) [02:54:32] o/ [02:57:35] o/ [03:09:03] (PS7) Ladsgroup: Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) [03:09:58] (CR) jenkins-bot: [V: -1] Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) (owner: Ladsgroup) [03:12:23] (PS8) Ladsgroup: Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) [03:16:17] (CR) Krinkle: Let people choose a threshold (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) (owner: Ladsgroup) [03:16:24] (CR) Krinkle: [C: -1] Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) (owner: Ladsgroup) [03:20:50] Krinkle: hey, thank you for your feedback, higher recall means it catches a larger proportion of vandalism [03:20:59] what do you suggest as an alternative? [04:04:24] Amir1: I'm not sure, but "high/low recall" is imho not going to work in the interface. Might as well remove it in that case. 
[04:07:43] thanks :) [04:09:43] (PS9) Ladsgroup: Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) [04:10:13] Krinkle: ^ [04:10:16] :) [04:10:45] (CR) jenkins-bot: [V: -1] Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) (owner: Ladsgroup) [04:12:32] (PS6) Ladsgroup: Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) [04:15:42] (PS10) Ladsgroup: Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) [04:19:58] (CR) Jcrespo: Some minor improvements to the database schema (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup) [04:24:42] (CR) Ladsgroup: Some minor improvements to the database schema (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup) [04:52:49] (CR) Ladsgroup: Some minor improvements to the database schema (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup) [05:38:59] Krinkle: I'm new to RL, so I can't see why this patch is not working. 
The CSS is okay and the extension.json correctly picks up the CSS file: https://gerrit.wikimedia.org/r/266167 [05:39:21] I'd appreciate it if you took a look at this [06:56:19] (PS7) Ladsgroup: Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) [06:57:26] (CR) jenkins-bot: [V: -1] Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup) [07:00:00] (PS8) Ladsgroup: [WIP] Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) [07:00:55] (CR) jenkins-bot: [V: -1] [WIP] Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup) [09:27:02] (CR) Hoo man: "Style is inconsistent (sometimes you have int(3) sometimes INTEGER(3)), but MySQL wont mind that. Not sure about SQLite offhand." [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup) [12:31:42] (CR) Jcrespo: "Please use UNSIGNED SMALLINT (or UNSIGNED TINYINT if there is less than 255 values), without argument. Sqlite recognizes it, and mysql int" [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup) [15:12:42] hi is anybody around? 
[15:14:46] come on guys [15:16:03] hey :) [15:16:19] hi [15:16:23] how are you [15:16:38] are you into this AI stuff [15:17:04] yeah I'm one of the revscoring team [15:17:34] im not into wiki media but im into AI and programming [15:18:01] awesome [15:18:04] i was hoping to chat about some aspects of AI to help me develop my system [15:18:37] I think halfak and I can help you [15:18:39] :) [15:18:49] what exactly does a revscoring team member do? [15:18:56] good [15:18:59] nice [15:19:23] we are making an AI infrastructure for wiki [15:19:33] very cool [15:19:37] what language [15:19:45] so people can fight against vandalism, find good-faith editors [15:19:54] etc. [15:20:06] the goal is to be language-less [15:20:21] so you're writing an engine [15:20:24] we run this project in about 20 languages [15:20:36] we are writing an API [15:20:58] blog.wikimedia.org/2015/11/30/artificial-intelligence-x-ray-specs/ [15:21:07] read this if you haven't [15:21:53] i do most of my code dev in vb6 and assembly then migrate to just assembly when im finished.. i only do desktop applications.. i dont do internet stuff [15:22:36] we write our system in python [15:23:21] if you know python you can simply understand our code and use it [15:24:52] i use the win32 api [15:25:07] and api components i create in asm [15:26:13] anyway my main issue is the *understanding unit* [15:28:09] do you do feature extraction? [15:29:26] in other unrelated programs i have [15:29:57] ive done experimental data mining that used that [15:32:27] that's cool [15:32:34] we do data mining too [15:32:49] mostly to get swear words for different languages [15:32:57] lol [15:33:14] so does your system use any *understanding* [15:33:28] finding swearwords is a vital part of writing anti-vandalism [15:34:07] I actually don't understand what you mean by understanding [15:34:17] do you mean feature extraction? 
[15:34:25] no [15:34:41] * halfak wonders what "understanding" could mean [15:35:03] true AI needs to understand things [15:35:13] for instance .. [15:35:27] do something at a certain place for a certain reason [15:35:55] o/ halfak :) [15:35:58] Conceptual, but what does it mean to "understand"? Is recognizing a relationship understanding? [15:36:01] o/ Amir1 [15:36:08] These are philosophical questions. [15:36:33] I can talk about how the cortical columns in our brain "understand" relationships and temporal patterns by identifying correlations. [15:36:35] to a computer its not philosophy tho [15:36:42] The machine isn't doing something all that different. [15:37:21] directive requirements must be understood [15:37:28] There's no magical "understanding", it's a bunch of correlations and relationships that can be learned through observation and then applied in new circumstances. [15:37:54] true.. exactly [15:38:31] So, in this way, a simple linear regression is "understanding" a trend. [15:38:34] however it must be stored organized and optimized for later consideration and use [15:38:49] https://en.wikipedia.org/wiki/Linear_regression [15:39:35] the understanding *unit* must optimize organize store consider and access [15:39:46] and be reconfigurable [15:43:28] by golly i think you guys have done it [15:43:54] thanks so much [15:44:34] (CR) Ladsgroup: [C: 2] Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) (owner: Ladsgroup) [15:44:53] :P Not sure we've done something marvelous. This isn't the kind of intelligence that you have a conversation with. It's the kind that replicates subjective human judgement. 
[15:45:26] Anyway, discussions of deeper AI are always welcome here :) [15:45:40] who wants to have a conversation with a computer [15:45:41] (Merged) jenkins-bot: Let people choose a threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/265676 (https://phabricator.wikimedia.org/T122684) (owner: Ladsgroup) [15:45:44] ? not i [15:46:49] natural language processing is a part of AI im not really interested in [16:00:30] * halfak is a computer [16:00:40] In an abstract sense. [16:00:52] We use NLP to detect vandalism [16:00:57] And good-faith contributors [16:14:07] thanks for the convo got to go cya [16:23:04] (PS9) Ladsgroup: [WIP] Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) [16:23:22] (CR) jenkins-bot: [V: -1] [WIP] Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup) [16:26:22] oh halfak https://gerrit.wikimedia.org/r/#/c/265676/10/includes/Hooks.php,cm [16:26:22] line 87 [16:31:02] Amir1, this is what you were talking to me about last night? [16:31:19] yes [16:31:23] it's merged now [16:31:28] works with good perofrmance [16:31:33] performance [16:33:17] Great! :) [16:53:33] * halfak runs a test that includes experimental test statistics. 
[17:58:25] (PS10) Ladsgroup: [WIP] Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) [17:59:25] (CR) jenkins-bot: [V: -1] [WIP] Some minor improvements to the database schema [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup) [18:01:33] (CR) Ladsgroup: "PS 10 is rebase only" [extensions/ORES] - https://gerrit.wikimedia.org/r/265944 (https://phabricator.wikimedia.org/T124443) (owner: Ladsgroup) [18:45:44] o/ Amir1 [18:45:46] See https://gist.github.com/halfak/c01ea5ea4638b7ba9014 [18:46:02] This shows the new evaluation stats and how to work with them using the train_test utility. [18:46:29] You can also see how "balance-sample-weight" works as a param we can set. [18:46:44] It doesn't seem to affect PR-AUC or ROC-AUC. [18:46:59] But that's hard to tell. [18:48:06] You can see in the output how the thresholds change for the balanced and non-balanced weightings. [19:02:52] Just finishing documentation. [19:02:58] Will post PR shortly. [19:28:14] hi [19:29:54] Hi! [19:30:07] I'm just about to run to a meeting, but I thought I'd say "Hi" back :) [19:30:16] I'll be around again in 1 hour if you want to chat. [19:30:17] hey man ..im developing a chat bot.. do u have any suggestions for me? [19:32:27] I'm not familiar with chatbots. :/ [19:32:45] Well.. I know what they are, but I never made one. [19:32:49] * halfak runs to meeting. [19:32:53] ooh man [19:33:29] * Singam is waiting [19:57:14] hi stashbot [19:57:14] how are you man [21:50:42] halfak: I just saw your message at the log [21:50:57] Hey Amir1, are you able to pull up the gist? [21:51:18] I'm reading it right now :) [21:52:32] :) [21:52:37] which one is balanced [21:52:47] If I get that correctly [21:52:49] I've got a PR open for `revscoring`, but I'm still working on test coverage. 
[21:54:10] it seems the first one is trying to weight [21:55:31] okay, this looks good to me [21:55:34] both situations [21:56:30] but the numbers of the first one look more intuitive [21:56:41] (thresholds) [21:56:48] what do you think halfak [21:57:56] * halfak looks again [21:58:00] Forgot which one was first [21:58:37] Indeed. More intuitive, but that's the weird one where the probability is wrong. [22:08:54] The real probability is more *right* with the second version. [22:09:09] But if you'd like to stick with the first ("balanced") version, that makes our lives easier. [22:10:55] it's your call [22:11:07] :) [22:59:21] I like easy. I'll do the balanced set first. [22:59:51] Amir1, do you have time to give me notes on that PR? [23:03:22] sure [23:03:27] let me see [23:04:34] I'll have a new PR with more test coverage in a moment. [23:05:38] is it #237? [23:05:59] https://github.com/wiki-ai/revscoring/pull/237 [23:06:04] Yeah. Sorry for the confusion [23:08:21] Test coverage reports 93% in my local nosetests [23:08:22] :) [23:09:53] that's super awesome [23:10:18] let's see if travis is happy [23:12:26] btw halfak do you want to give a try to the ores extension? [23:13:53] Yeah. Should I just use the ORES role or should I load up any particular patchset? [23:14:38] no just enable it :) [23:14:44] that is much easier [23:15:13] I'm finishing the schema change [23:15:17] Right now I test it [23:15:53] before that I'm writing an ad-hoc maintenance script to populate the database since I drop the tables [23:20:34] * halfak updates. [23:26:38] * halfak waits for vagrant to boot [23:45:41] Amir1, our PR-AUC for the wikidata model is 0.93! [23:45:46] *0.94 [23:45:50] That's really really good. [23:46:20] We'll want to go through this dataset and do some re-labeling though. The thresholds are misleading. 
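The "balanced" versus unweighted trade-off above can be illustrated with one common way of computing balanced sample weights, where each observation is weighted by `n_samples / (n_classes * class_count)`. That is the formula scikit-learn documents for its "balanced" class weighting; whether revscoring's "balance-sample-weight" option computes exactly this is an assumption here, and the function name and toy labels below are hypothetical.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Weight each observation so every class carries equal total weight.

    Assumed formula: n_samples / (n_classes * count_of_that_class).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return [n_samples / (n_classes * counts[label]) for label in labels]


# Toy data: 2 damaging edits out of 10 (made-up proportions).
labels = [True] * 2 + [False] * 8
weights = balanced_sample_weights(labels)

damaging_total = sum(w for w, y in zip(weights, labels) if y)
good_total = sum(w for w, y in zip(weights, labels) if not y)
print(weights[0], weights[-1], damaging_total, good_total)
```

With weights like these, a model is trained as if both classes were equally common, which would explain why the balanced thresholds look more intuitive while the scores no longer track the real-world probability of an edit being damaging.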
[23:47:02] what are their thresholds; [23:47:19] sorry for ; writing php scripts right now [23:48:55] https://gist.github.com/halfak/067113e7f5365ae53fdd [23:49:24] It wants us to use thresholds much closer to 0.5 and claims much worse recall and filter rate. [23:49:54] I think that we should manually curate a test set. [23:50:02] Right now, we run tests with 4k observations. [23:50:11] So that's 2k reverts to review. [23:53:27] that's not bad [23:53:43] We can focus on the ones that are scored badly, I suppose. [23:53:53] yeah [23:53:54] E.g. revisions not reverted that scored highly [23:53:58] like the last time [23:54:02] Yeah. [23:54:04] I can help with that [23:54:06] Just gotta formalize this somehow [23:54:22] So that we're not just fooling ourselves into thinking the model is better than it is. [23:54:35] E.g. we could create a wiki page with sections of edits in a table format. [23:54:47] We'd want to be able to read the page back as structured data later. [23:56:46] +1 [23:58:52] * halfak installs new sklearn on ores-compute.labs [23:59:04] Oh! My vagrant. I was going to look at that :)
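The review idea above (look first at revisions that were not reverted but scored highly) can be sketched in a few lines. This is a rough illustration with made-up revision IDs and scores, not code from revscoring. It assumes recall means the share of reverted edits scoring at or above the threshold, and filter rate means the share of all edits scoring below it; both function names are hypothetical.

```python
def recall_and_filter_rate(observations, threshold):
    """observations: list of (rev_id, score, reverted) tuples."""
    flagged = [obs for obs in observations if obs[1] >= threshold]
    positives = sum(1 for _, _, reverted in observations if reverted)
    true_positives = sum(1 for _, _, reverted in flagged if reverted)
    recall = true_positives / positives if positives else 0.0
    filter_rate = 1 - len(flagged) / len(observations)
    return recall, filter_rate


def review_candidates(observations, threshold):
    """Rev IDs of non-reverted edits scoring at or above the threshold,
    highest score first. These are the likely mislabeled observations."""
    suspicious = [
        (score, rev_id)
        for rev_id, score, reverted in observations
        if not reverted and score >= threshold
    ]
    return [rev_id for score, rev_id in sorted(suspicious, reverse=True)]


# Made-up observations: (rev_id, score, reverted)
observations = [
    (101, 0.95, True),
    (102, 0.90, False),
    (103, 0.70, True),
    (104, 0.30, False),
    (105, 0.20, True),
    (106, 0.05, False),
]

recall, filter_rate = recall_and_filter_rate(observations, threshold=0.6)
candidates = review_candidates(observations, threshold=0.6)
print(recall, filter_rate, candidates)
```

Writing the candidate list out in a fixed table layout is one way to keep the manual re-labeling readable as structured data afterwards.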