[01:48:27] o/ YuviPanda [01:48:32] hi [01:48:39] Just got back. [01:48:45] me too [01:48:55] * YuviPanda got his hair cut and re-red it a little bit [01:48:56] Want to do the ORES redis stuff? [01:49:08] halfak: am ok waiting for tomorrow... [01:49:30] OK. That's cool with me too. :) [01:49:35] +1 [01:49:39] Also pics or it didn't happen [01:49:42] * YuviPanda is taking it a bit easy this week [01:49:44] halfak: once it dries [01:49:46] it isn't that red tho [01:49:47] kk [01:49:48] need to re-bleach [01:49:53] might / should do before all staff [01:49:54] probably [01:49:57] and get green [01:49:58] :) [01:50:03] Green? [01:50:10] yeah [01:50:14] once the red flares [01:50:16] * halfak tries to imagine YuviGreen [01:51:55] So, I'm thinking that ORES should take a list of config files that it will merge. [01:52:05] hmm the usual trick is you have a .d folder [01:52:08] config.d [01:52:11] and then you merge everything in it [01:52:13] Boo. [01:52:16] k [01:52:31] that provides a lot more flexibility [01:52:33] Oh wait. That'll work fine for prod ORES [01:52:41] How do you know what order to merge? [01:52:47] Alpha order? [01:52:57] ah, so that's why the files are named '50-' '100-' [01:53:02] and that number is called 'priority' [01:53:15] you do a natural sort before merging [01:53:18] Wait. 100-something would come before 50-something [01:53:21] :P [01:53:23] natural sort :P [01:53:36] https://en.wikipedia.org/wiki/Natural_sort_order :D [01:53:51] Woah! [01:53:56] TIL [01:54:01] and http://blog.codinghorror.com/sorting-for-humans-natural-sort-order/ [01:54:47] halfak: the bigger question is how to merge arrays [01:54:51] halfak: I don't have an answer to that [01:55:00] do you merge or do you replace or do you append? [01:55:31] this is why grrrit-wm doesn't actually do a merge :D it just takes two separate files, one for connections and one for everything else [01:55:54] most merging things get away with this by not supporting arrays :P [01:56:20] Replace. 
Rewrite your array if you need to change it [01:56:26] I figure that's the safest [01:56:50] * YuviPanda nods [01:56:59] but it'll prove frustrating very quickly the first time you want to just add something [01:57:08] or just remove something [01:57:11] which I've run into [01:57:18] YuviPanda, fair enough. Don't use merge for anything like that [01:57:29] Also, how would merge remove something? [01:57:36] indeed, can it? it probably can't [01:57:44] Sounds like config files are getting awfully complicated [01:57:46] unless you construct your config files like patch files [01:57:47] yeah [01:57:58] I guess this is why I had just used connections.yaml and config.yaml [01:58:02] and not actually done any automerging [01:58:10] and had forgotten about that until you brought it up [02:00:16] I thought you were the one that suggested the merging strategy to me :P [02:00:34] I did [02:00:45] and then actually talking about it triggered painful backmemories :| [02:00:58] the reason it works in apache et al's case is probably that they have their own language [02:01:09] and different array directives merge or add depending on the context [02:01:13] which yaml doesn't [02:01:30] I guess splitting it into two files (one for 'connections' and stuff and another for everything else) then? [02:01:31] Well.. either way, I'd like to stick with what we've got and implement the config directory strategy [02:01:37] +1 [02:02:14] halfak: that way, we can also enforce that only the connections config can vary between staging and prod [02:02:18] and the config config does not [02:03:07] +1 [02:03:23] We'll need a way to get the connections config on each machine [02:03:26] And a place to store it. [02:03:32] How do you want to do that? 
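[Editorial sketch of the config-directory strategy discussed above: files are merged in natural sort order of their names, so `50-` comes before `100-`, and arrays are replaced rather than appended. The file names and keys are made up for illustration, and the parsed YAML is represented as plain dicts so the example needs no YAML library. This is not the actual ORES implementation.]

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so '50-' sorts before '100-'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def merge(base, override):
    """Recursively merge two config dicts.

    Nested dicts are merged key by key. Everything else, including
    lists, is replaced wholesale by the later value.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_config_dir(configs):
    """Merge a {file name: parsed config} mapping in natural sort order."""
    result = {}
    for name in sorted(configs, key=natural_key):
        result = merge(result, configs[name])
    return result


# Hypothetical contents of a config.d directory, already parsed.
example_configs = {
    "100-prod.yaml": {"cache": {"host": "ores-redis-01"}, "wikis": ["enwiki"]},
    "50-base.yaml": {"cache": {"host": "localhost", "port": 6379},
                     "wikis": ["enwiki", "testwiki"]},
}
merged_example = merge_config_dir(example_configs)
```

[With replace semantics there is no way to add or remove a single array item from a later file, which is the frustration raised in the conversation.]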
[02:03:34] since it's all public right now [02:03:38] and we don't have password [02:03:41] s [02:03:45] we can just put that in ores-wikimedia-config [02:03:49] when we move it into production and have passwords [02:03:58] there's a private config repo we can store passwords in [02:04:00] and we can use that [02:04:05] hmm [02:04:08] or maybe [02:04:13] we should put connections.yaml in puppet [02:04:19] and config.yaml in ores-wikimedia-config [02:04:24] +1 [02:04:30] Makes sense to me [02:04:34] ok! [02:04:36] so let's do that :D [02:04:54] * halfak puts the final touches on yaml config merging [02:05:01] With backwards compatibility! [02:05:02] WOOO [02:05:10] Oh wait. I should write tests [02:05:11] halfak: how about you split up the config into connections and config and ship a connections.yaml.sample into ores-wikimedia-config (as documentation?) and I do the puppet work [02:05:11] :) [02:05:33] That sounds good. [02:05:37] ok! [02:06:17] halfak: anything else before I run off? [02:06:28] Na. I think I'm good. Have a good one! [02:06:53] halfak: <3 thanks [02:07:10] halfak: btw, reminder that we need to do some quarry and other work for the cscw workshop :D i dunno what we're to do yet [02:08:11] halfak: I also have general long term thinking about quarry and how it relates to notebooks in my thoughts. do think about it too [02:08:37] * YuviPanda runs off [02:09:08] Ahh yes. I have to ping the MO folks to get you feature requests for integration with their metadata wiki. [02:09:17] They were supposed to have sketches together [09:23:18] (PS1) Awight: [WIP] messing with thresholds [extensions/ORES] - https://gerrit.wikimedia.org/r/259645 [12:45:13] (CR) Ladsgroup: Log API requests (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/259214 (owner: Awight) [14:27:10] halfak: o/ [14:27:17] my internet connection is not stable [14:27:30] but anyway. 
I'm running our dataset against Kian [14:27:50] using sampling w/ replacement [14:28:27] https://github.com/Ladsgroup/Kian/commit/6ec8444514594fb13cb8506d8cea3fd0e412c728 [14:28:37] I also wrote first part of the blog post [14:40:16] I've a question re. this task https://phabricator.wikimedia.org/T120999 (I maybe do it) [14:40:33] what do you mean by "reason" column? [14:42:40] I got to go, be back soon [14:53:17] Boo.. Sorry to miss the messages [15:40:12] o/ Amir1 [15:40:17] sorry to not respond earlier [15:40:18] hey :) [15:40:26] np [15:40:58] The "reason" column would contain information about why the edit was or was not considered damaging. [15:41:23] Right now, we use this in the prelabel script to track why we thought an edit does or does not need review in Wikilabels. [15:41:42] E.g. user_rights, reverted, blocked_user, total_edits, etc. [15:41:56] If the edit is reverted, or saved by a blocked user, it needs review. [15:42:20] If the edit was saved by a user with advanced rights or someone who has a very high total of lifetime edits, it doesn't need review. [15:42:39] This lets us go back through the list to amend it based on various criteria. [15:42:57] I think it would be good to include this in the balanced revert script. [15:42:58] gotcha :) [15:43:13] :) [15:43:31] BTW, just fleshed out the ORES extension MVP here: https://phabricator.wikimedia.org/T120923 [15:43:39] Requirements are up for debate. [15:43:58] I think we could get away with just 1 and 1.A for features if there was a good reason. [15:45:19] I'm not the kind of guy who determine the requirement or debate about them, I'm kind of person who make the product meet the requirement [15:45:21] :D [15:45:30] I'll probably start working on features [15:45:32] soonish [15:46:02] yay! [15:46:35] One thing I'm worried about is that including ORES on watchlist and usercontribs breaks the model that LegoKTM started us on -- a table of scores that runs parallel to recentchanges. 
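[Editorial sketch of the "reason" column explained above: each edit gets a needs-review decision plus the name of the rule that produced it, so the list can be amended later by criterion. The field names, the set of trusted rights, the edit-count cutoff, the rule order and the fall-through case are all assumptions for illustration. They are not taken from the prelabel script.]

```python
def prelabel(edit, trusted_rights=frozenset({"sysop", "bot"}),
             edit_count_cutoff=1000):
    """Return (needs_review, reason) for an edit described as a dict.

    Rule order, field names and thresholds are hypothetical.
    """
    if edit.get("reverted"):
        return True, "reverted"
    if edit.get("user_blocked"):
        return True, "blocked_user"
    if trusted_rights & set(edit.get("user_rights", ())):
        return False, "user_rights"
    if edit.get("user_total_edits", 0) >= edit_count_cutoff:
        return False, "total_edits"
    # Assumption: with no matching rule, keep the edit for review
    # and record that no reason applied.
    return True, None


reverted_edit = {"reverted": True, "user_total_edits": 5000}
admin_edit = {"user_rights": ["sysop"], "user_total_edits": 12}
```

[Keeping the reason next to the decision makes it possible to go back and, for example, re-queue every edit that was skipped only because of `total_edits`.]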
[15:46:43] I also reviewed some patches in gerrit [15:46:44] Oh wait. Watchlist *is* recent changes :) [15:46:54] I don't know if we can make them merged soon [15:46:55] So maybe just usercontribs would be a problem [15:47:13] Amir1, I think we need to reign in our gerrit patches. [15:47:34] agreed [15:47:40] I've been thinking of asking all of our MW dev volunteers to coordinate better around reviewing patches so that we can unblock each other. [15:48:01] Not quite sure how to do that, but it seems like a couple of meetings would help us get into a rhythm. [15:48:32] +1 [15:48:54] usercontribs sounds like an issue [15:49:15] let's ask legoktm [15:50:26] got to go [15:50:30] be back soon [15:52:22] Hokay [15:52:26] * halfak hacks on outline [15:59:18] I'm starting to think that, for botpedias, we might just focus on human editors when building our models. [15:59:47] So the model will *maybe* make bad predictions about bot edits, but it wouldn't really be intended to run on bot edits, [16:00:03] By default, the recent changes feed filters bot edits from the list. [16:00:12] * halfak "hmm"s about that [16:01:54] bmansurov, o/ [16:02:00] o/ [16:02:25] Been thinking about our code review problems. [16:02:48] I'd like to bring all of us together to discuss a pattern that will work better. [16:03:01] sure [16:03:09] E.g. putting cards on the review column of the board and calling out blocked items on a periodic basis. [16:03:22] sounds good [16:03:32] What are your usual working hours? [16:05:00] 6-8:30, 15:00-18:00, 21:00-23:30 GMT+5 [16:09:14] * halfak does conversion to UTC [16:09:57] So 0100-0330, 1000-1300 and 1600-1830 UTC? [16:10:27] yes [16:10:42] Cool. So it looks like our best time for interaction with PT folks is during their morning. [16:11:08] I'm thinking that a half-hour long checkin once per week would be a good place to start. 
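[Editorial sketch of the GMT+5 to UTC conversion done by hand above, using only the Python standard library. The helper name `to_utc` is made up for illustration.]

```python
from datetime import datetime, timedelta, timezone

# Fixed offset used in the conversation.
GMT_PLUS_5 = timezone(timedelta(hours=5))


def to_utc(hhmm):
    """Convert an 'H:MM' wall-clock time in GMT+5 to 'HHMM' in UTC."""
    local = datetime.strptime(hhmm, "%H:%M").replace(tzinfo=GMT_PLUS_5)
    return local.astimezone(timezone.utc).strftime("%H%M")


working_hours = [("6:00", "8:30"), ("15:00", "18:00"), ("21:00", "23:30")]
utc_hours = [(to_utc(start), to_utc(end)) for start, end in working_hours]
```

[This reproduces the ranges confirmed in the chat: 0100-0330, 1000-1300 and 1600-1830 UTC.]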
[16:11:08] btw, my times can be flexible [16:11:18] sure [16:11:26] I think you're already super flexible working these hours :) [16:11:37] ok [16:29:55] o/ Amir1 [16:29:59] Still hacking on outline :) [16:30:12] o/ [16:30:14] I just filed a request to set up a mailing list for ai@wikimedia.org [16:30:18] Reading from log [16:30:22] *ai@lists.wikimedia.org [16:30:34] that would be really good to have [16:30:35] So that we can have an async extension of the conversations that happen here. [16:30:38] :) [16:30:56] It'll be good to publicize that with our blog posts. [16:31:19] Much easier for people to follow a mailing list than to log into our IRC channel or watch our talk page. [16:31:52] Also, I figure that we can start posting our weekly update there to reach a bigger audience. :) [16:34:48] yeah [16:35:02] it's a pretty good idea in wiki environment [16:40:00] halfak: there was a discussion about how people can flag things from recent changes [16:40:11] In Wikidata? [16:40:12] for example, the edit is flagged by ORES [16:40:20] no in the extension [16:40:24] Gotcha [16:40:35] but I check it and realize it's not, [16:40:52] what's the best approach to remove the flag [16:41:15] the best approach IMO is showing "r" when the edit is not patrolled [16:41:24] *only when [16:42:12] so we can track edits that has been marked patrolled but our system (probably mistakenly) scored it high and we can improve the system [16:42:26] halfak: what do you think [16:44:08] +1 [16:44:18] I think that's a great idea. [16:44:39] Proper integration of quality control practice in mediawiki would involve sharing of patrolling information. [16:45:06] I'm not 100% sure that we could make use of it directly for modeling, but we can at least use it to learn from our false positives. [16:45:40] It would be great if we could also have integration of revert *reasons* with undo/rollback [16:45:59] So that we can differentiate vandalism from good-faith mistakes and content disputes. 
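[Editorial sketch of the patrolling idea above: show the review flag only while an edit is unpatrolled, and collect edits that were patrolled despite a high score as likely false positives to learn from. The 0.8 threshold and the field names are assumptions, not part of the ORES extension.]

```python
def show_review_flag(score, patrolled, threshold=0.8):
    """Flag an edit only if it scores high and nobody has patrolled it yet."""
    return score >= threshold and not patrolled


def likely_false_positives(edits, threshold=0.8):
    """Return rev ids of edits that scored high but were marked patrolled."""
    return [edit["rev_id"] for edit in edits
            if edit["patrolled"] and edit["score"] >= threshold]


sample_edits = [
    {"rev_id": 1, "score": 0.95, "patrolled": False},  # still flagged
    {"rev_id": 2, "score": 0.91, "patrolled": True},   # likely false positive
    {"rev_id": 3, "score": 0.10, "patrolled": True},   # never flagged
]
```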
[16:46:20] I wonder if that can be done with rev_tags on the reverting edit. [16:46:44] It would be great if we could make selecting a *reason* be required when reverting. [16:46:58] Of course, the set of reasons should be modifiable by the users of a wiki. [16:47:06] Like the sidebar [16:49:29] OK. I finished my pass on the draft. [16:49:44] they are separated into several different tools, like huggle, twinkle, etc. [16:49:47] Anything in [brackets] is an editorial note. [16:50:12] It seems like tools could draw from such a structured list. [16:50:34] https://etherpad.wikimedia.org/p/wmde_ores_blogpost [16:51:23] other things are content of the blog? [16:51:34] (anything except []) [16:52:16] Amir1, yeah... Wording that I thought was useful anyway. [16:52:31] great [16:52:35] I like to put notes in [bracks] when I am not sure how to word it yet [16:52:53] Still, feel free to re-write anything that's not in [bracks] too :) [16:53:13] thanks [16:53:23] But generally, I think we should talk about what's missing before we start fleshing things out. [16:53:29] (or what needs to be cut) [16:54:17] let me read it carefully to see what's missing [16:54:33] at a glance I think we covered everything [16:57:01] Once we feel good about it, we should run through it with Lydia. [16:57:25] Maybe a VOIP call would be good. [17:03:07] usually they put it google docs [20:39:45] halfak: what is 'semantic_form' [20:40:28] :) It's a complex form element that allows Wikipedians to encode both semantic and syntactic meaning about what happened in an edit. [20:40:34] * halfak gets a screenshot. [20:44:05] http://imgur.com/VVANECN [20:44:34] It lets you select a "reason" that an edit was made from a drop-down and then add syntactic "operations" to each relevant reason. [20:44:44] ah [20:44:46] interesting [20:44:52] In this edit, a user was both doing cleanup and improving POV by modifying a sentence. 
[20:45:03] I'm considering what it'd take to build a generic form control [20:45:06] This is all for https://en.wikipedia.org/wiki/Wikipedia:Labels/Edit_types [20:45:11] where people can specify the spec in JSON Schema or something like that [20:45:16] lol [20:45:17] and have it output JSON [20:45:19] that's what we have :D [20:45:22] indeed [20:45:27] I suspected :D [20:45:32] http://labels.wmflabs.org/form_builder/ [20:45:33] :D [20:45:34] this is for harej and wikiprojextx [20:45:44] It build OOjs-UI forms [20:45:50] haha crazyasfuck [20:45:52] harej: ^ [20:46:07] Regretfully, the complex form element of semantic labeling required me to do some javascript. [20:46:13] * YuviPanda nods [20:46:19] does it output stuff into anything? [20:46:29] halfak: can that form builder be used for any forms on Wikipedia? [20:46:48] if it can't be right now, we can work on making that happen [20:47:04] Well, right now, we're only loading them into WikiLabels. We have a little bit of custom code to turn a JSON blob into a set of form fields. [20:47:12] right now the signup forms on WPX wikiprojects rely on formwizard, and each one has to have its own MediaWiki-space page [20:47:18] Also for turning the values into JSOn and loading JSON blobs into the form [20:47:30] See https://github.com/wiki-ai/wikilabels/blob/semantic_form/forms/available/edit_type.yaml for the config that produces the semantic form. [20:47:34] * harej returns to his other job: liberating data from HTML and PDF [20:47:44] Not that internationalization is at the bottom of the form. [20:47:46] *note [20:47:52] if we can seperate that out into: 1. yaml/json to OOJS UI, 2. input/output adapters [20:48:05] i/o adapters can be wikitext for example [20:48:07] YuviPanda, should be pretty easy. [20:48:13] and be widely more useful [20:48:15] ! 
[20:48:27] You are going to have an aneurysm when you see what I did to make OOjs-UI work [20:48:46] https://github.com/wiki-ai/wikilabels/blob/semantic_form/wikilabels/wsgi/static/js/oo.util.js [20:48:59] Giant switch statements. [20:49:04] I've been told by many people that aneurysms and OOJs-UI go hand in hand [20:49:09] * halfak sighs heavily [20:49:19] it's almost like writing java! [20:49:25] anyway, https://phabricator.wikimedia.org/T120902 is the task [20:49:32] How do you get data out of the widget? Well it depends on the widget. WHY!? [20:49:58] haha, so it has all the negatives of Java and none of the negatives?! [20:50:14] I was just going to use react but then it can't live on wiki easily [20:50:37] We should aggressively adopt whatever the UI standardization people are pushing. [20:50:41] +1 [20:50:57] BTW violetto ^ [20:51:10] however, there's this wikitech-l thread that went very weird with no answers to the question of 'who is maintaining OOJS-UI?' [20:51:20] mostly 'all of us!' which vaguely means 'none of us'? [20:51:26] Yeah. I read the first few messages and then checked out [20:51:29] anyway, there's certainly heavy politics there I've steered away [20:51:31] from [20:51:38] Yup. [20:51:46] Really, I should make time for fixing the bugs I've filed. [20:52:02] * YuviPanda nods [20:52:06] anyway [20:52:58] * YuviPanda cc's halfak on https://phabricator.wikimedia.org/T120902 [20:53:41] I'm really interested in making this (JSON-->useful "Form" with general interfaces) generalized and importing it to wikiLabels [20:53:57] +1 [20:54:07] already love being here halfak [20:55:16] oojs is being mostly supported by the ve team itself because it was built by the same team [20:55:37] violetto, what's the long-term plan for ui standardization? [20:55:43] the downside is that because it's supporting one product, things change according to what ve specifically needs [20:55:44] Sticking with oojs-ui? 
[20:56:01] we've been doing user research to find out that future [20:56:08] Gotcha. [20:56:25] halfak: I read that as 'they gonna politick so we do not have to' [20:56:26] but it's likely going to be a stylesheet in plain html and css [20:56:40] :P [20:56:43] and leave the technology for certain groups to choose [20:56:51] but that's going to hurt performance [20:56:51] Gotcha. That would be nice. [20:57:57] https://phabricator.wikimedia.org/T118918 this is what volker and i have been working on [20:58:02] YuviPanda, would that make switching to react easier? [20:58:10] to get that less file for everyone to use [20:58:27] no, switching to react is basically impossible with MW I'm afraid without a lot of redoing [20:58:40] we'll be forever stuck with the underlying architecture of 2008 JS :) [20:58:45] I've no hope there [20:59:10] YuviPanda, how about iframes? [20:59:12] * halfak runs away [20:59:28] halfak: :D I might use iframes for my notebookviewer stuff eventually [20:59:52] I suppose that might make sense for interactive plots. [20:59:56] halfak: iframes are a fairly useful security isolation tools [20:59:58] yah [21:00:03] now *frames*, fuck those :P [21:09:01] * halfak reviews violetto's less file [21:09:27] It's not much. And I think that's really nice. [21:13:44] * halfak likes simple things that serve a clear purpose and don't try to do anything else. [21:15:45] yeah, that's the whole point of standardization [21:16:03] you look for areas where things can be simplified [21:16:22] and the interface becomes standard because things suddenly look like they belong together [21:16:31] because they share styles with each other [21:16:50] +1 [21:17:05] Do you guys have an engineer in the ui standardization team? [21:17:23] a ux engineer, who's volker [21:18:13] Gotcha. Does he have time to work on standard tool developer tool kits like OOjs-UI? 
[21:18:22] and me who suddenly have become a bit of a front end engineer, but i wouldn't count that as one [21:18:30] yeah slowly but surely [21:18:40] we want to build something like this: wikimedia-ui.wmflabs.org [21:18:48] that's like my "mock up" [21:19:06] but it's usable [21:19:31] are you on the design mailing list halfak ? [21:19:43] What interprets "" and turns it into HTML? [21:19:46] I'm not. [21:19:51] * halfak fears new mailing lists. [21:20:23] * halfak subscribes anyway [21:20:31] it's using webcomponents.js to encapsulate what could be a long html [21:21:12] Gotcha. [21:21:34] so turns into html [21:21:52] native html elements [21:22:58] ideally we want somethings as simple as this -- https://upload.wikimedia.org/wikipedia/commons/0/07/How_to_turn_pencil_and_paper_sketch_to_high_fidelity_mock_up_with_Sketch_app.gif [21:23:07] Will try to find an opportunity to adopt these. Is webcomponents.js available on wikimedia wikis right now? [21:23:53] on labs, YuviPanda would you say so since it's on wikimedia-ui.wmflabs.org? [21:26:15] Woops. Wrong button = logout [21:26:40] But yeah, I think that you could set up whatever you wanted on a labs instance, so that doesn't mean it's working on wikitech or any other labs wikis. [21:26:59] I've got a meeting in a minute, but I'll run a quick test when I get a chance. [21:28:34] ok [22:28:34] violetto, turns out it's harder to check that I thought. [22:28:52] I've been looking in resourceloader, but it's hard to search the giant array! [22:30:56] atom [22:30:58] woops [22:45:40] hard to check if it's available of any wikimedia wikis halfak ? [22:49:48] Yeah. That's right. [23:00:02] halfak: am here [23:00:07] Hey! [23:00:16] BTW, just did a deploy to wikilabels. [23:00:19] Trains run on time :) [23:00:25] woo! [23:01:02] should we setup redis now [23:01:08] Yes [23:01:18] I'm going to be 50% in a meeting though. [23:01:28] For the next hour. [23:02:07] halfak: sure. 
[23:06:53] YuviPanda, so it seems like we need to [23:06:59] 1. create a new instance [23:07:18] 2. redirect ores to this new instance [23:12:22] halfak: yeah doing that now [23:21:29] So, it seems like we should stand these up in parallel and have a new "ores-redis" hack in place for a new deploy. [23:21:40] e.g. "ores-redis-big" [23:21:45] make sense? [23:21:48] YuviPanda, ^ [23:22:17] halfak: already created redis-01 [23:22:19] ores-redis-01 [23:22:32] Works for me [23:29:39] * halfak is hacking the deploy config [23:29:46] kinda slowly [23:30:32] YuviPanda, was thinking... we could do two redis servers right now. Why don't we do that? [23:32:22] E.g. have celery use `ores-redis-01` and the score cache live in `ores-redis` [23:41:56] halfak: sorry, got sidetracked [23:42:10] No worries [23:42:18] Still only 50% here [23:42:40] halfak: mostly because in prod we'd use multi-instance redis, where we'd have two redis processes on the same machine (which works out great since redis uses only one core) [23:43:04] Oh. Well we can do ports too. [23:43:17] yeah [23:43:21] but that requires a bit of puppet change [23:43:24] hmm [23:43:26] maybe I can do that now sure [23:43:30] I'll do that now [23:43:46] and we can do ores-redis-01 6379 and 6380 [23:46:19] OK. will make the change in the config. How will this work on staging? [23:46:23] YuviPanda, ^ [23:47:31] halfak: ugh, yeah that needs the connections.yaml stuff [23:47:43] We can do it. [23:53:25] halfak: kk am doing the puppet change
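[Editorial sketch of the two-port layout discussed above: two redis processes on `ores-redis-01`, one for celery and one for the score cache. Which service gets which port, and the key names in the returned dict, are assumptions. This is not the real connections.yaml schema.]

```python
def redis_connections(host="ores-redis-01", celery_port=6379, cache_port=6380):
    """Build connection settings for celery and the score cache.

    The port assignment and key names are hypothetical.
    """
    celery_url = "redis://{0}:{1}/0".format(host, celery_port)
    return {
        "celery": {"broker_url": celery_url, "result_backend": celery_url},
        "score_cache": {"host": host, "port": cache_port},
    }


# Staging could reuse the same function with a different host,
# which is the part that would live in connections.yaml.
prod = redis_connections()
staging = redis_connections(host="localhost")
```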