[00:32:19] * halfak tunes all the models
[00:32:52] YuviPanda, I'm considering spinning up an extra x-large instance for tuning and model building.
[00:33:24] I guess I don't need anything -- I just wanted you to know that I'm using the crap out of ores-compute.eqiad.wmflabs
[00:47:04] halfak: sure! do it if you need it :D
[00:53:12] YuviPanda, BTW, I did a deploy today.
[00:53:27] \o/
[00:53:28] how did it go?
[00:53:32] I'm feeling confident enough to do them without you nearby so long as I'm not doing a big update.
[00:53:35] No problems. :)
[00:53:40] I was just talking to someone about how deploys should feel like trains
[00:53:42] Didn't even get an alarm email :)
[00:53:43] rather than rocket launches
[00:53:50] awesome!
[00:53:56] * YuviPanda feels happy about our deployment system
[00:54:07] And now our Wikidata model is substantially more fit.
[00:54:20] wheee!
[00:54:24] * YuviPanda takes all the models to the gym
[00:54:29] :D
[00:54:42] I've been working them out like you wouldn't believe.
[00:54:48] :D
[00:54:51] For every model we actually use, we train about 90
[00:54:53] halfak: how's your neck?
[00:55:01] And now I've automated that reporting :)
[00:55:09] nice
[00:55:09] Neck is amazing (compared to yesterday)
[00:55:13] hah!
[00:55:15] great :D
[01:12:32] o/ bmansurov_away
[01:24:45] hello ;)
[05:42:23] Hey folks. New tuning report for Wikidata. See https://github.com/wiki-ai/wb-vandalism/blob/wikidata_tuning/tuning_reports/wikidata.reverted.md
[05:43:10] Looks like we can get ~0.97 ROC-AUC with a RandomForestClassifier or GradientBoostingClassifier.
[05:43:19] Sooo, I'm going to look into getting that implemented.
[05:43:32] Bad news is that these types of models produce much larger files than the old SVCs.
[05:45:21] Amir1, check this out.
[05:45:21] https://github.com/wiki-ai/wb-vandalism/blob/wikidata_tuning/tuning_reports/wikidata.reverted.md
[05:45:23] ^
[05:48:21] * halfak goes to bed.
[05:48:23] o/
[05:49:36] o/
[05:49:37] hey :)
[05:50:38] that's amazing halfak
[05:50:48] so are we using RF now?
[07:48:42] hi
[10:12:04] (PS1) Awight: Switch from the "reverted" to "damaging" model [extensions/ORES] - https://gerrit.wikimedia.org/r/257851 (https://phabricator.wikimedia.org/T112856)
[10:14:46] (CR) Awight: Flag reverted risk rows using the recentChangesFlag (2 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/247185 (https://phabricator.wikimedia.org/T112856) (owner: Awight)
[10:23:11] (PS2) Awight: Switch from the "reverted" to "damaging" model [extensions/ORES] - https://gerrit.wikimedia.org/r/257851 (https://phabricator.wikimedia.org/T112856)
[10:32:15] (PS3) Awight: Switch from the "reverted" to "damaging" model [extensions/ORES] - https://gerrit.wikimedia.org/r/257851 (https://phabricator.wikimedia.org/T112856)
[10:45:13] !log sudo gnt-instance modify -H cpu_type=SandyBridge seaborgium to get aesni and sse cpu flags. rebooting seaborgium to apply those
[10:45:13] No log is open in #wikimedia-ai - use log on to open one, log list to list the available logs.
[10:45:20] wrong channel
[10:49:58] (PS1) Awight: Simplify down to a single threshold [extensions/ORES] - https://gerrit.wikimedia.org/r/257856 (https://phabricator.wikimedia.org/T112856)
[13:53:29] o/
[14:43:54] o/ halfak
[14:44:01] Hey Amir1
[14:44:14] Hey, did you get my email?
[14:44:25] Still working through my morning email wave. :)
[14:44:33] oh sure
[14:44:35] :)
[14:55:21] Amir1, we should really set up a dev. server for ORES that works in a trivial way.
[14:55:26] I've been meaning to do this.
[14:55:58] I want to make a model that makes a prediction based on the last two digits in the revision ID.
[14:56:11] agreed
[14:56:12] E.g. 36238250 would have a 50% probability of being reverted
[14:56:18] haha
[14:56:29] So that we can confirm that scores are transferred correctly by looking at the rev_id.
[14:56:29] :D
[14:56:30] I think that would get a really good AUC
[14:56:31] :D
[14:56:34] lol
[14:56:43] distribute it randomly
[14:56:53] Yeah. It should be really good for eyeballing :)
[14:57:21] So, I think that we want to implement a scoring model that is not a Machine Learned classifier.
[14:57:21] +if we want to get a highly scored revision we need to do about 100 edits
[14:57:35] Instead, it's just a basic ScorerModel with the rules implemented manually.
[14:57:45] Amir1, that's a good point.
[14:57:56] We could do the last digit *10
[14:58:04] E.g. 32847239 would be 90%
[14:58:22] good
[14:58:32] agree
[14:58:43] If you build that system
[14:58:48] I install it in labs
[14:59:04] *install the extension and make it useable
[14:59:55] just give me a url
[14:59:57] :D
[15:00:53] OK. Looks like we can get 0.84 AUC for enwiki damaging and 0.87 AUC for enwiki goodfaith with gradient boost classifiers
[15:01:08] I need to do some tests with file sizes of the models.
[15:01:12] But this looks good.
[15:01:18] OK back to the test server.
[15:01:29] This should probably be part of ORES.
[15:01:53] Without having a trained model, you should be able to just run "ores dev_server" and it will have a single model that makes predictions.
[15:02:03] What is the DB name of your test wiki?
[15:02:05] "testwiki"
[15:02:07] "test"?
[15:03:24] oreswiki
[15:03:54] So, I think we should either hardcode the wiki-name for vagrant or make it configurable.
[15:05:02] I suppose that the config file will work.
[15:05:06] Doesn't need to be a param.
[15:07:18] +1
[15:18:53] "orestestwiki" sounds more fun to me halfak
[15:22:11] It works!
[15:22:22] Now what do I call this model?
[15:24:22] "revid"
[15:24:38] version "test only"
[15:24:54] Oh wait... I need to make the version configurable so that we can experiment with caching.
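Editor's sketch of the rule-based test model discussed above (score taken from the last digit times 10, or from the last two digits, with a configurable version for cache experiments). This is only an illustration under stated assumptions: the class name `RevIdScorer`, the `digits` parameter, and the shape of the returned dictionary are hypothetical and are not taken from the ORES codebase or its ScorerModel interface.

```python
class RevIdScorer:
    """Hypothetical rule-based scorer; not the actual ORES ScorerModel API."""

    def __init__(self, version="test only", digits=1):
        # digits=1 -> last digit * 10 (e.g. 32847239 -> 90%)
        # digits=2 -> last two digits as a percentage (e.g. 36238250 -> 50%)
        if digits not in (1, 2):
            raise ValueError("digits must be 1 or 2")
        # The version is configurable so that cache behaviour can be tested.
        self.version = version
        self.modulus = 10 ** digits

    def score(self, rev_id):
        # No API lookup is needed: the score depends only on the rev_id.
        remainder = abs(int(rev_id)) % self.modulus
        p_true = remainder / self.modulus
        p_false = (self.modulus - remainder) / self.modulus
        return {
            "prediction": p_true >= 0.5,
            "probability": {True: p_true, False: p_false},
            "version": self.version,
        }
```

With these assumptions, `RevIdScorer().score(32847239)` reports a probability of 0.9 for the positive class, and `RevIdScorer(digits=2).score(36238250)` reports 0.5, so a score can be checked by eye against the rev_id it belongs to.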
[15:33:14] Amir1, https://github.com/wiki-ai/ores/pull/107
[15:34:23] That should work great :)
[15:34:37] You can set up a config file for your local installation.
[15:35:22] It doesn't need to get anything from the API to do the scoring, so the only thing you'll need to change is the name of the model and the name of the wiki.
[15:37:19] can we get this from ores.wmflabs.org?
[15:37:38] like ores.wmflabs.org/scores/testwiki/1234
[15:37:58] halfak: ^
[15:37:59] Amir1, I think it should be run separately from the ORES service.
[15:38:08] you'll want to change the version to check on caching anyway.
[15:38:16] Right now, that happens from the config file.
[15:38:27] You won't be able to just change the config file on the ORES main service.
[15:38:34] I also made a line note
[15:38:40] I responded :)
[15:39:12] You should be able to install ORES dependencies and simply run "ores dev_server" from the main directory of ORES.
[15:39:44] If you want to run it from another directory, you can provide a path to the config "ores dev_server --config=.../myconfig.yaml"
[15:40:17] \o/
[15:40:28] It's hard to do this on tools
[15:40:41] Amir1, you can actually run it on localhost too :)
[15:40:43] since installing dependencies sometimes can be hard
[15:40:50] Yeah I can
[15:40:58] Also, you should consider using vagrant.
[15:41:01] :P
[15:41:14] Not ORES vagrant, but MediaWiki vagrant.
[15:41:16] but I want to make this working for tools.wmflabs.org/ores/mediawiki
[15:41:25] por que?
[15:42:03] absolutely :D
[15:42:26] I don't know if it's possible to use vagrant on tools
[15:42:35] but I'll give it a try
[15:42:41] Use it on your local machine
[15:43:15] my reason to install a mediawiki in tools is to make possible shared testing and checking
[15:43:30] so I can easily reproduce bugs, etc.
[15:45:27] Amir1, this is also the purpose of vagrant, but I can see how being able to *just show* someone ORES/mediawiki integration could be valuable.
[15:45:45] yeah
[15:45:48] exactly
[15:47:54] So, deploying a model for testwiki would potentially be valuable, but I don't think it would be great for testing.
[15:48:01] We could also have a test instance of ORES.
[15:48:14] eventually, ores.wmflabs.org will *be* the test instance.
[15:48:43] * halfak thinks
[15:52:19] OK. So I think I agree that we should release a model for testwiki, but it seems like we ought to do the testing of the ORES extension primarily against a custom instance.
[15:53:19] * halfak wonders if we can count testwiki as another wiki supported by ORES
[15:56:23] o/ bmansurov
[15:58:07] halfak: hello ;)
[16:02:09] Good... morning?
[16:02:10] :D
[16:02:23] I suppose it's probably evening there
[16:42:53] https://commons.wikimedia.org/wiki/File:ORES.swagger_with_fitness_plots.svg
[16:44:51] YuviPanda, when you get back, I want to talk to you about what our swagger spec is going to look like when we switch our domain and routing.
[16:45:12] It seems like it is going to need to conform to the call pattern, not the shape of the actual service.
[16:51:19] https://meta.wikimedia.org/wiki/Talk:Objective_Revision_Evaluation_Service#Automatically_generated_documentation
[16:51:38] ^ Some discussion of including model performance information in the swagger documentation.
[19:35:35] o/ YuviPanda
[19:35:54] hi halfak
[19:36:11] See my notes in the backscroll re. swagger and the new endpoint.
[19:36:20] But I think that I might have figured out an answer to my own question.
[19:36:43] We can host a complete doc at "/?spec" and partial docs at "//?spec"
[19:37:33] So when you go to en.wikipedia.org/api/ores/?spec, you'll get the swagger spec, you'll get the latter.
[19:37:48] Also, I'm seriously considering generating swagger specs online within ORES.
[19:37:57] yeah, a lot of tools do that
[19:38:02] So that we can include endpoints for each wiki
[19:38:04] so it's always up to date
[19:38:20] OK. Last hare-brained idea.
[19:38:36] Including figures depicting fitness statistics in the Swagger documentation.
[19:39:02] Essentially, we'd show fancy graphs next to a description of every model.
[19:39:04] figures as in numbers?
[19:39:06] ah
[19:39:08] graphs
[19:39:11] right
[19:39:12] ok
[19:39:15] See https://meta.wikimedia.org/wiki/Talk:Objective_Revision_Evaluation_Service#Automatically_generated_documentation
[19:39:16] well, swagger itself is just YAML
[19:39:20] There's a little mockup
[19:39:37] nice!
[19:39:43] I figured there might be some nice way that we can do some HTML or markdown or something within a description of an endpoint.
[19:39:46] I wonder if swagger-ui supports that
[19:39:50] if not we can probably hack it in
[19:39:52] yeah
[19:40:05] So doesn't sound too crazy?
[19:40:20] no
[19:40:28] 's all a matter of prioritization tho
[19:40:30] :D
[19:40:36] Yeah. Definitely.
[19:40:54] OK. One more crazy idea.
[19:41:06] So I built a simple rule-based model that should make testing ORES easier.
[19:41:09] It will work on any wiki.
[19:41:24] It takes the last two digits of a rev_id and reports that as a score.
[19:41:38] I'm considering adding this model to live ORES and registering it for testwiki.
[19:41:51] So that we can play around with extensions and stuff on testwiki.
[19:44:58] YuviPanda, ^ thoughts?
[19:45:24] +1
[19:45:36] I like it, decouples UI work from model gathering / building work
[19:47:34] OK. Will try to get that into the next deploy then. It is pretty easy. :)
[19:48:15] \o/
[19:48:20] halfak: do you know if dartar is in the office today?
[19:48:44] Should be. It doesn't look like a WFH day for him.
[19:49:09] * halfak pulls up DarTar's calendar and says "nope"
[19:49:22] Both because nothing suggests he's OOO and also because holy crap meetings.
[19:49:40] I only have 2-3 hours of meetings per day these days :D
[19:50:03] hahaha
[19:50:05] nice
[19:50:12] I have 1.5h of set meetings a week and hate it...
[19:50:20] but I have to realize I'm livin the dream
[19:50:48] * halfak invites Yuvi to a bunch of meetings to talk about managing things and politics.
[19:51:35] shit I've to go to adam's thing tonight
[19:51:38] and I've no real pants.
[19:51:40] this is unideal
[19:51:43] Yup. me too
[19:51:45] I have pants though
[19:51:47] :P
[19:52:03] It starts at 8:45 local time.
[19:52:06] 8:45PM
[19:52:26] * halfak will have had a beer and dinner before he connects.
[19:53:43] I just skipped out of a meeting about meetings that appeared on my calendar.
[19:53:52] It was good.
[19:54:24] nice
[20:28:04] awight, just invited you to a short meeting with May to talk about design support for the ORES extension. If you can't make it, don't worry. I'll have good notes.
[20:30:39] word!
[20:31:02] rats, I did have a thing at that time...
[20:31:05] I'll try to sneak out.
[20:31:16] * halfak adds a metric ton of cards to the phab board.
[20:31:33] Badly needed though, thanks for getting the designer attention!
[20:31:37] you mean fuck, right?
[20:31:53] IMO the RecentChanges UI is pretty grim.
[20:32:06] awight, currently or our changes to it?
[20:32:46] what currently exists.
[20:32:57] Yeah. Totally agree there.
[20:33:25] Might be a good time to re-think the UI there.
[20:33:37] We can talk about short- and long-term potential.
[20:33:39] I did have a slight stroke of insight, and started pushing some of the RC UI into a Mustache template...
[20:33:42] but yeah
[20:33:43] Can't redesign everything now.
[20:33:49] Oooh.
[20:33:56] Screenshot or mock you could share?
[20:34:09] it's exactly the same. That's how they like it in mw-core
[20:34:23] But I've decoupled the backend from the view
[20:34:44] Gotcha.
[21:05:02] Curses!
[21:05:18] I need dictionaries installed on stat1003 in order to do my model tuning. Arg!
[21:49:31] halfak: shouldn't be too hard to send a puppet patch for :D
[21:49:45] YuviPanda, yeah. You're right.
[21:49:51] I should start doing that more often. :/
[21:50:12] I got a surprise meeting and was trying to monitor the job while paying some attention in the meeting :)
[21:50:16] fun
[21:51:02] halfak: modules/statistics/manifests/compute.pp is what you want btw
[22:01:21] (PS1) Awight: Switch RC flag letters [extensions/ORES] - https://gerrit.wikimedia.org/r/258042