[20:54:27] hey folks!
[20:57:00] Hi MissGayle
[21:01:08] #startmeeting RFC meeting
[21:01:09] Meeting started Wed Mar 18 21:01:08 2015 UTC and is due to finish in 60 minutes. The chair is TimStarling. Information about MeetBot at http://wiki.debian.org/MeetBot.
[21:01:09] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[21:01:09] The meeting name has been set to 'rfc_meeting'
[21:01:28] #topic Master & slave datacenter strategy | RFC meeting | Topic is Wikimedia meetings channel | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/
[21:01:35] #link https://www.mediawiki.org/wiki/Requests_for_comment/Master_%26_slave_datacenter_strategy_for_MediaWiki
[21:01:58] Hi James!
[21:02:45] ok, where should we start?
[21:03:17] * aude waves
[21:03:23] hello
[21:03:29] hi
[21:03:34] bonjour :)
[21:03:45] to me the overall strategy makes a lot of sense
[21:04:09] o/
[21:04:12] Hey, everyone
[21:04:21] I have some questions about the job queue portion
[21:04:42] AaronSchulz: do you want to say something about overall strategy and current status?
[21:05:19] In terms of strategy can't think of a way to do so without just repeating the RfC
[21:05:24] * AaronSchulz can speak about status a bit though
[21:05:52] some of the stuff is done already or well in progress (see https://phabricator.wikimedia.org/T88445)
[21:06:19] the memcached bits are the next major thing to get done
[21:06:30] The only Gerrit change I found was https://gerrit.wikimedia.org/r/#/c/187074
[21:06:33] Are there others?
[21:06:41] I also agree with the overall strategy
[21:06:47] I do fear the devil is in the details here though :)
[21:06:48] RoanKattouw: everything else was merged or not done yet
[21:06:51] Oh I see a bunch of smaller stuff on the task
[21:07:09] the user_touched stuff is blocked on memcached (I'd like to build off the "touch key" stuff for that)
[21:07:25] the distributed cache clears should help a lot
[21:07:36] It looks good, I wonder how to do rollback, though
[21:07:37] I still need to set up the local job queues in eqiad at least...not too much work as the code is already there
[21:07:42] i am a little worried that round-trips to the master database from distributed clients may be too slow though
[21:07:50] gwicke: what about the queue?
[21:07:57] at least in some cases like page updates i think we do reads & writes and it gets potentially nasty with long RTT
[21:07:57] Yeah, should rollback also use an AJAX post?
[21:08:04] Platonides: you mean rollback to using the older caches?
[21:08:13] superm401: YES! death to rollback GET
[21:08:20] AaronSchulz: do you expect the current design to handle full http purge loads?
[21:08:24] or wiki page rollback?
[21:08:26] AaronSchulz, I mean a GET with action=rollback
[21:08:30] AaronSchulz, wiki page
[21:08:41] since the user expects to see the action result
[21:09:16] gwicke: what do you mean?
[21:09:17] superm401: +1
[21:09:29] Platonides: a GET page with an AJAX POST (and a form submit backup for non-JS) should handle the job there
[21:09:41] brion, don't page updates still go to the master since it's a POST?
[21:09:46] well, Platonides says "action result"
[21:09:49] <_joe_> AaronSchulz: you have a series of challenges ahead of you if you do active/active, and I think you addressed most of them. what we didn't consider is that all interaction with caches (varnish, etc) should be replicated in both Datacenters, AFAICS
[21:09:54] Or maybe I'm misunderstanding what you mean by 'page update'?
[21:09:54] which I suppose will be handled by this new cookie
[21:09:59] mark: I was wondering if the job queue design is intended for reliable purging in general
[21:10:00] we can redirect to the diff, have it returned by the POST response, or just not have it (maybe just show a link)
[21:10:00] superm401: yes, but that lets us treat the GET and the POST separately and it just keeps things cleaner
[21:10:01] basically the same as an edit
[21:10:17] the UX could be revisited if we want
[21:10:42] with an edit, you have a POST which sets a datacenter_preferred cookie in its response
[21:10:57] that causes the subsequent page view to go to the master, so that you see your own updates
[21:11:07] rollback could be done in the same way
[21:11:08] gwicke: probably, but we'd need to test the load to know for sure
[21:11:12] brion, I was referring to your "at least in some cases like page updates ..." comment, not the part about rollback.
[21:11:38] AaronSchulz: makes sense; could maybe also think about pub/sub by then
[21:11:39] gwicke: the python broker is a prototype, though it's on each client/server rather than some central service, so I'm not too worried about that...the redis part is sharded too
[21:11:55] <_joe_> AaronSchulz: so say you have an edit on page ABC, how do you propagate that info to the secondary datacenter caches? I am thinking of varnish and restbase, but also parsoid
[21:12:02] superm401: ah durrrrr, you are correct. under this proposal the POSTs get routed to master DC and there’s no RTT worry :D
[21:12:15] * brion reads more closely :D
[21:12:15] gwicke: it won't use the job queue for purging memcached if that's what you mean
[21:12:38] TimStarling: datacenter_preferred? is that supposed to be set to "eqiad"/"codfw" or a symbolic name ("master")?
[21:12:39] _joe_: really the only info we need is an event stream of edits / renames / deletions etc
[21:12:40] brion: we don't do lots of updates on page views, mostly only on POSTS that go to the active DC anyway
[21:12:45] varnish purges are up to many hundreds a second of course
[21:12:46] so RTTs should be super low
[21:12:51] AaronSchulz: exactly yeah :D
[21:12:53] possibly more, so that might strain the job queue indeed
[21:12:54] brion: I'm fixing the ones we *do* have
[21:12:56] Yes, Cassandra/RESTBase should be covered.
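
For reference, the sticky-cookie flow described above (a write POST sets a short-lived datacenter_preferred cookie in its response, and the presence of that cookie routes the user's next few requests to the master datacenter) boils down to something like the following minimal sketch. The helper name, the 10-second TTL, and the use of plain setcookie() are illustrative assumptions; in MediaWiki the cookie would be set through the response object, and the actual routing decision based on the cookie would be made at the CDN/load-balancer layer, not in PHP.

<?php
// Minimal sketch: after handling a write (POST), set a short-lived cookie so
// the routing layer sends this user's next few requests to the master
// datacenter, letting them see their own change despite replication lag.
// The helper name, TTL, and cookie value are assumptions for illustration.

function setPreferredDatacenterCookie( $ttlSeconds = 10 ) {
    setcookie(
        'datacenter_preferred', // name used in the RFC discussion
        'master',               // symbolic value; could also be a concrete DC name
        time() + $ttlSeconds,   // expires after a few seconds of "stickiness"
        '/',                    // path
        '',                     // domain (default)
        false,                  // secure flag left to the real deployment
        true                    // httpOnly: the cookie is only for routing
    );
}

// Hypothetical call site, right after a successful edit/rollback POST:
// setPreferredDatacenterCookie();
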
[21:13:01] cutting down on dbperformance.log spam
[21:13:22] <_joe_> gwicke: ok so we need that and consumers that operate in a second datacenter
[21:13:28] that log contains stuff that would be problematic for multi-DC use
[21:13:40] <_joe_> and we have rapidly delved into the rabbit hole of replication :)
[21:14:01] paravoid: according to the RFC, existence of the cookie causes the request to be routed to the master
[21:14:17] then the cookie expires after a few seconds, and so the user goes back to be routed normally
[21:14:23] it's a bit unclear to me from the RFC
[21:14:36] _joe_: yea, not worried about restbase stuff since it uses cassandra (I'm assuming we will use a global cluster and use LOCAL quorum when needed)
[21:14:45] <_joe_> AaronSchulz: right
[21:14:50] _joe_: but yes, lots of rabbit holes :)
[21:14:55] AaronSchulz: *nod*, defaulting to localQuorum already
[21:14:57] <_joe_> AaronSchulz: varnish _is_ a concern though :)
[21:15:03] like elasticsearch is totally up in the air
[21:15:18] swift is also up in the air
[21:15:25] we don't have a global swift cluster
[21:15:27] <_joe_> AaronSchulz: you were thinking of writing to two independent queues, that would make sense
[21:15:32] _joe_: we could keep using htcpurger or try the new daemon (I tacked on some PURGE support, but never really tested it)
[21:15:38] we have two independent swift clusters
[21:16:00] we'd still need mediawiki to update both swift clusters
[21:16:07] <_joe_> about that... I think thumbs may stay local to a DC
[21:16:21] <_joe_> and just "replicate" the originals
[21:16:33] my main doubt is about the implementation of the event / job queue
[21:16:35] but perhaps we should keep elasticsearch and swift out of this discussion for now?
[21:16:38] _joe_: probably makes sense
[21:16:40] * aude thinks something like elastic would not be too much a problem
[21:16:42] it's complicated enough with it
[21:16:46] <_joe_> mark: yes
[21:16:48] _joe_: I haven't thought about that optimization
[21:16:49] <_joe_> :)
[21:16:53] we can discuss them separately later
[21:17:16] Will Redis really replicate in "a few seconds"?
[21:17:25] <_joe_> mark: my general advice there is we should replicate as few things as possible
[21:17:27] My understanding is the MariaDB slave lag can be more than that.
[21:17:54] _joe_: yeah, I'm planning to change that bloom filter code to not need redis slaves
[21:18:27] do we need a more general fix for high-latency stuff in db replication beyond what we’ve already done?
[21:18:33] master/slave is very stateful, requires manual switchover and usage knowledge, and is a SPOF...so the less stuff the better
[21:18:47] there are a few built-in tolerance limits for replication lag: datacenter_preferred cookie expiry, WANObjectCache tombstone expiry
[21:18:47] bulk delete/update operations can be slow for instance, i know breaking them up into smaller pieces can help with replication.
[21:18:50] <_joe_> AaronSchulz: I couldn't agree more.
[21:18:59] superm401: unless it was having problems it should, yes
[21:19:05] last thing we want is for everything to explode redirecting to master during a rep delay
[21:19:05] e.g. network issues
[21:19:09] what bad things will happen if replication lag exceeds the limit?
[21:19:21] <_joe_> I don't see any
[21:19:25] TimStarling: which limit?
[21:19:31] brion no, last thing we want is a split-brain case ;)
[21:19:35] <_joe_> I see bad things if replication breaks down
[21:19:38] loadbalancer has a few thresholds
[21:19:44] tombstone expiry, datacenter_preferred cookie expiry
[21:19:46] penultimate thing we want then ;)
[21:19:48] AaronSchulz, can you add guidance for "Code that uses caches must be aware of whether it needs to do explicit purges (use the WAN cache) or can use the local cache"?
[21:20:23] TimStarling: well, if replication (say for sessions) was that bad (e.g. > 10 seconds) then sometimes ChronologyProtector would not apply all the time
[21:20:34] and by lag I mean all the different sorts of lag, I'm basically wondering what sort of disaster it will be if the network connection goes down for a minute or two
[21:20:40] _joe_: the worst about master/slave is having to decide to throw away data that might still be on the old master
[21:21:03] if this lag was >= sticky DC cookie TTL then you might save a page and come back later and not see the change I guess
[21:21:14] since writes all go to master, i think the worst case is you get out of date data on reads and mysterious hangs on writes (from user perspective)?
[21:21:16] hm.... i know we have code that writes a new revision and then uses master for some lookup and post edit thing
[21:21:16] I think things already suck in that case about equally
[21:21:26] yeah
[21:21:29] though cross-DC lag is more likely due to network trouble I suppose
[21:21:29] might need to rethink that
[21:21:48] well, we already will lose HTCP packets
[21:21:51] aude: does it do that in the same request?
[21:21:58] aude, wouldn't that still be on master, cause of the POST request then datacenter_preferred?
[21:22:28] AaronSchulz: mostly yes, but might be exceptions to that
[21:22:37] superm401: we do sometimes have 10 seconds or so of lag, usually due to a bad disk or some stupid code inserting 20k rows at once
[21:22:49] Right
[21:22:52] dbperformance.log should warn when huge writes happen so we can find that code
[21:22:53] i'm thinking of various stuff for wikidata, but don't have specific example right now
[21:22:58] it's bad enough right now
[21:23:05] will the master just drop the remote slave if it's getting behind too far?
[21:23:20] #info AaronSchulz: dbperformance.log should warn when huge writes happen so we can find that code
[21:23:23] Can the datacenter_preferred cookie last a few minutes? Our POST request count has to be pretty low compared to GETs, so it would seem this would be okay, and be a little more error-tolerant.
[21:23:27] aude: it's just random "update stuff on page views" that you need to be careful of mostly
[21:23:28] in general, will you add warnings to any code that writes to datastores/caches in a non-POST request? :)
[21:23:35] seems to be our secondary updates
[21:23:39] emit log messages I mean
[21:23:40] aude: using EnqueueJob jobs is one solution
[21:23:46] yeah
[21:23:48] (that queue should always be local)
[21:24:02] mark: we do that already
[21:24:08] that's why the log is spammed ;)
[21:24:09] ok :)
[21:24:17] mark: I think writes to $_SESSION would have to throw an exception
[21:24:19] it will be much better tomorrow after the deploy dust settles
[21:24:29] would job enqueuing go local, then get pushed up to master? or would it require a full master RTT at enqueuing time?
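
The ChronologyProtector-style protection mentioned at [21:20:23] is essentially "read your own writes": after a write, remember where the master's binlog was, and on the user's next request make the local slave catch up to that point (with a bounded wait) before reading from it. Below is a hedged sketch of that idea using PDO and MySQL's MASTER_POS_WAIT(); the helper names and the hand-off of the position between requests (e.g. via the session store) are assumptions, not MediaWiki's actual implementation.

<?php
// Hedged sketch of "read your own writes" across replication lag.
// Illustrative only: the helper names and the hand-off of $masterPos between
// requests (e.g. stashed in the session) are assumptions.

// After a write on the master, remember the master's binlog position.
function recordMasterPosition( PDO $master ) {
    $row = $master->query( 'SHOW MASTER STATUS' )->fetch( PDO::FETCH_ASSOC );
    return array( 'file' => $row['File'], 'pos' => (int)$row['Position'] );
}

// On a later request, before reading from a local slave, wait (bounded) for
// it to reach that position; if the wait fails or times out, the caller can
// fall back to the master or accept a possibly stale read.
function waitForSlave( PDO $slave, array $masterPos, $timeoutSeconds = 10 ) {
    $stmt = $slave->prepare( 'SELECT MASTER_POS_WAIT(?, ?, ?)' );
    $stmt->execute( array( $masterPos['file'], $masterPos['pos'], $timeoutSeconds ) );
    $result = $stmt->fetchColumn();
    // NULL means the slave thread is not running; -1 means the wait timed out.
    return $result !== null && $result !== false && (int)$result >= 0;
}
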
[21:24:29] we still have user_touched and CentralAuth spam to fix though
[21:24:35] $_SESSION will be read from the local redis slave
[21:24:41] so you can't write it back to the master if you wanted to
[21:24:49] brion: enqueue jobs should be configured as always local (which then push to the master)
[21:24:54] TimStarling: right
[21:24:56] ok that makes sense
[21:24:58] that is the main point of them, it should always be fast
[21:25:00] <_joe_> I'm not sure cross-DC redis replication is a good idea
[21:25:01] yeah
[21:25:12] <_joe_> we should test it right away
[21:25:28] <_joe_> it may fail in ways we don't anticipate
[21:25:36] we don't log $SESSION use on GET though
[21:25:39] <_joe_> no one does that AFAIK
[21:25:48] there will be some legit use cases, as long as they are rare
[21:26:16] heh, we used to write on every request with a session until I patched that lately
[21:26:27] random question -- what about WQS? Does BlazeGraph support cross-DC replication?
[21:26:40] no idea about blazegraph
[21:26:42] <_joe_> paravoid: of course not
[21:26:51] can we agree on this meeting that any additional datastores added to our infrastructure
[21:26:52] I thought they were going to use a feed-based updater
[21:27:03] feed-based indeed
[21:27:04] if they go with that, then the same thing can just be run in both DCs
[21:27:08] should support cross-DC replication /or/ what we build on top of it can?
[21:27:09] yeah
[21:27:15] I pushed for that during the eval
[21:27:28] (before Blaze though, but it's the same idea)
[21:27:35] paravoid: if it's an index then having a reliable event stream is okay too
[21:27:45] _joe_, Redis use case you might not be aware of: GettingStarted uses it to keep a category list. If there is no replication, I think it will get out of date.
[21:27:58] It's updated on POST (edit).
[21:28:02] paravoid: something like wdq is basically an index slave to wikidata
[21:28:12] if we end up having finally openstreetmap rendering, it's also event/feed based like wdq
[21:28:17] <_joe_> paravoid: but we think we can get around that - we are keeping multi-dc into account
[21:28:36] <_joe_> so yes, what gwicke said
[21:28:44] finally having*
[21:28:45] #action superm401 to file bug about GettingStarted Redis use
[21:28:46] sure -- still, can we agree that multi-DC considerations should be part of any (relevant) RFC from now on?
[21:28:54] <_joe_> yes
[21:29:04] yes
[21:29:17] +1
[21:29:20] #action AaronSchulz to expand "Code that uses caches must be aware of whether it needs to do explicit purges (use the WAN cache) or can use the local cache" to explain how developers know which to use.
[21:29:28] #agreed multi-DC considerations should be part of any (relevant) RFC from now on
[21:29:30] aude: can you or someone look through Wikibase use of memcached to see what needs broadcasted purges?
[21:29:52] awesome :)
[21:29:54] AaronSchulz: together with DanielK_WMDE_ , i think so
[21:29:58] how does that work, is there a list of primary considerations for rfc's to add this to or just a best known practice?
[21:30:00] I'll be doing that for lots of extensions, but wikibase is a bit harder to understand than others :)
[21:30:08] \o/
[21:30:27] aude: hopefully wancache will be enough, but if something is needed let me know
[21:30:44] #info aude and DanielK need to look into "Wikibase use of memcached to see what needs broadcasted purges"
[21:31:03] brion: it would be nice to have a sprint for extensions...maybe even hack day stuff :)
[21:31:19] chasemp, good point. We should have a list of things to consider in every RFC (sort of like how every IETF RFC has "security considerations", but done in a more practical way).
[21:31:23] e.g. (does this need the new cache or should it be left alone)
[21:31:27] AaronSchulz: file it in phab :D
[21:31:39] that’d be a great topic for lyon
[21:31:52] do we need an api.php proxy which can tell which POST requests are writes and which are reads?
[21:31:58] i don't fully understand the details yet and probably will have some questions
[21:32:25] aude, AaronSchulz: on a first glance, client/maintenance/populateInterwiki.php
[21:32:27] currently a lot of traffic is posted to api.php with no URL parameters
[21:32:44] TimStarling: probably
[21:32:49] and possibly lib/includes/store/CachingEntityRevisionLookup.php
[21:32:56] Are people regularly using POST for api.php reads just because the library does that by default?
[21:32:59] #action (spagewmf) < superm401> We should have a list of things to consider in every RFC (sort of like how every IETF RFC has "security considerations", but done in a more practical way).
[21:33:00] DanielK_WMDE_: that's just a crappy script, not for wmf use
[21:33:14] for people like us, on our dev wikis
[21:33:18] aude: good then. i was wondering :)
[21:33:23] superm401: it's often about size limits in the URL
[21:33:25] it does suck when you have to use POST to get around URL size limits (and GET can't use the body per standards)
[21:33:32] should probly be under utils/, not maintenance/
[21:33:32] superm401, take into account url lengths
[21:33:33] superm401: yes, probably, and the library authors probably make that choice because they are concerned about URL length limits
[21:33:36] such things might end up on the main DC for no reason
[21:33:53] unless more sophisticated routing rules were used...which sounds...ugh
[21:33:59] AaronSchulz: i don’t think that’s hugely worse than the current situation for bot stuff
[21:34:12] most in-browser or app api clients shouldn’t do that i hope :D
[21:34:13] well, yeah not *worse* than now, true, heh
[21:34:17] but…. for mobile we might need to check
[21:35:00] perhaps mediawiki itself can be the proxy for that
[21:35:10] like, always POST api.php to the local dc, and mediawiki can proxy to the master if needed
[21:35:15] * gwicke imagines the latencies
[21:35:46] although, would only happen on write, which is normally slower anyway
[21:36:03] would be better than always to master
[21:36:10] yeah, for sure
[21:36:24] mark, I think we need some stats to estimate what proportion of POSTs are api.php.
[21:36:31] that'd be good yes
[21:36:36] Then, if that figure is significant, we can check which proportion of those are not actually writes.
[21:36:48] I can tell you that most of the parsoid requests are
[21:36:49] #action check on what proportion of API reqs are POST that maybe aren’t writes
[21:36:54] The first figure should be relatively simple.
[21:37:04] * AaronSchulz tries to volunteer brion to AJAXify rollback :D
[21:37:24] AaronSchulz: why not :) put it in phab and assign me :D
[21:38:20] There is already a rollback API, which helps. :)
[21:38:28] yes, indeed
[21:38:34] yup
[21:38:55] brion, I think it should also show a "Submit" button for no-JS fallback.
[21:39:06] Ideally
[21:40:00] superm401: agreed
[21:40:33] AaronSchulz: in the current scheme, will there be separate jobs enqueued for each consumer?
[21:40:46] or do you plan to support some kind of pub/sub?
[21:41:17] * springle wakes, reads backlog
[21:41:19] What do you mean by consumer?
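
The "MediaWiki as its own api.php proxy" idea floated at [21:35:00]-[21:35:10] amounts to inspecting the requested action and only forwarding the request to the master datacenter when the module actually writes; MediaWiki's API modules already declare this via ApiBase::isWriteMode(). The standalone sketch below approximates that with a hard-coded action list and a hypothetical forwardToMasterDc() helper, both of which are assumptions for illustration only.

<?php
// Rough, standalone sketch of routing api.php POSTs by intent rather than by
// HTTP method. Many clients POST pure reads just to dodge URL length limits,
// so the method alone is a poor routing signal. In MediaWiki itself the
// write/read distinction comes from each API module's isWriteMode() flag;
// the action list and forwardToMasterDc() helper here are assumptions.

function apiRequestNeedsMaster( array $params ) {
    $writeActions = array( 'edit', 'rollback', 'delete', 'move', 'upload' ); // assumed subset
    $action = isset( $params['action'] ) ? $params['action'] : 'query';
    return in_array( $action, $writeActions, true );
}

// Hypothetical handling in the secondary (slave) datacenter:
// if ( $_SERVER['REQUEST_METHOD'] === 'POST' && apiRequestNeedsMaster( $_POST ) ) {
//     forwardToMasterDc( $_SERVER, $_POST ); // assumed helper: internal HTTP proxy hop
// } else {
//     // read-only request: safe to serve locally from the slave DB cluster
// }
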
[21:41:49] any service interested in following a kind of event
[21:41:55] edits, say
[21:42:35] gwicke: there is no generic pubsub thing, no
[21:42:44] Like a RESTBase hook into MediaWiki listening to edits then broadcasting to all DCs with a job?
[21:43:01] AaronSchulz: k
[21:43:02] there is a purge relayer and an 'enqueue' queue that relays jobs to the master
[21:43:18] I mentioned a 'multienqueue' queue as a possible hack for elastic
[21:43:42] superm401: pub/sub basically
[21:43:46] but that may end up not happening (I guess that's sort of the closest thing to pubsub, but the jobs would "know" of the consumers)
[21:43:54] or at least, the endpoints (e.g. DC)
[21:44:04] true pubsub wouldn't require that knowledge
[21:44:05] yeah, it would get most of the perf benefits
[21:44:11] but not the decoupling
[21:44:53] maybe stupid question, is the goal to architect for an unknown number of datacenters or specifically to solve problems for just 2 or just 3? lots of overlap but some cases not necessarily.
[21:44:54] it would be nice to model the jobs around events though
[21:45:02] as a stepping stone towards pub/sub
[21:45:15] chasemp: it's for any small number of DCs, not just 2
[21:45:18] how is the WANObjectCache purge service going to work? that is a REST-like interface, isn't it?
[21:45:38] AaronSchulz: gotcha thanks
[21:46:10] it is REST-like, HTTP, and has local daemons (on apaches and on the cache servers...though the latter don't *have* to be local to the cache)
[21:46:28] will there be a specific RFC about it?
[21:46:29] TimStarling: that stuff is all just a prototype in a github project of mine (linked to in phab)
[21:46:50] AaronSchulz, can you document WANObjectCache somewhere? It's mentioned twice in passing, but I really have no idea what it does.
[21:47:34] superm401: it wraps BagOStuff with a similar interface but can broadcast some purge related operations (e.g. delete())
[21:47:48] AIUI User Aaron makes edit, gets datacenter_preferred cookie and sees his change. AIUI user Brion doesn't see the edit even on shift-Reload until replication completes. If Brion knows to add ?action=purge, does he get the new content?
[21:47:54] the patch is full of long prologue comments
[21:48:41] AaronSchulz: which ticket?
[21:48:44] spagewmf: well I'd suppose the purge will be enqueued after the first one, so it wouldn't purge any faster
[21:49:13] AaronSchulz: so it doesn't set the datacenter_preferred cookie
[21:49:35] from which ticket is the github project linked, or alternatively, may we have a link to the github project?
[21:50:23] https://github.com/AaronSchulz/python-memcached-relay I think
[21:50:39] right
[21:50:54] I'm just trying to find where I linked it, apparently not in the main memcached subtask
[21:51:59] ok i gotta run, ping me if y’all need to volunteer me for more bits ;)
[21:52:26] aahh, it was https://phabricator.wikimedia.org/T88340
[21:52:28] it sounds like it basically implements a REST event queue, with a special subscriber that purges various caches
[21:52:33] which is an old duplicate now
[21:53:37] AaronSchulz: I'm definitely very interested in that area, just watched your repo
[21:55:04] alright, any conclusions for the notes?
[21:55:07] re DB replication, if we solve the remaining tables without primary keys, we can trail galera for multi-DC commit
[21:55:13] trial*
[21:55:18] gwicke: sure; solving purges is easier than pubsub in general though
[21:55:26] I guess we still have a way to go on this, we should probably schedule another meeting in a month or so, right?
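
To make the WANObjectCache description at [21:47:34] a little more concrete: callers keep using a BagOStuff-style get/set/delete interface, and the WAN wrapper relays delete() (plus a short-lived tombstone) to every datacenter. The sketch below is a hedged illustration based only on that description; the key format, the TTL, and how $wanCache is obtained are assumptions, and the details may differ from whatever patch finally gets merged.

<?php
// Hedged illustration of the intended WAN cache usage pattern, based on the
// description above ("wraps BagOStuff ... can broadcast some purge related
// operations (e.g. delete())"). Key format and TTL are assumptions; $wanCache
// is assumed to be an instance of the proposed wrapper.

function getUserTouched( $wanCache, PDO $localSlaveDb, $userId ) {
    $key = 'user-touched:' . (int)$userId; // illustrative key format

    $value = $wanCache->get( $key );
    if ( $value !== false ) {
        return $value; // cache hit, served from the local datacenter
    }

    // Cache miss: regenerate from a local slave. This is reasonable because
    // the broadcast purge below tombstones stale copies in every datacenter.
    $stmt = $localSlaveDb->prepare( 'SELECT user_touched FROM user WHERE user_id = ?' );
    $stmt->execute( array( (int)$userId ) );
    $value = (string)$stmt->fetchColumn();

    $wanCache->set( $key, $value, 3600 ); // TTL is an assumption
    return $value;
}

// After updating user_touched on the master, broadcast the purge so both
// datacenters drop their copy; a short-lived tombstone covers replication lag:
// $wanCache->delete( 'user-touched:' . $userId );
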
[21:55:31] agreed
[21:55:31] I can assume a lot more stuff and use time-based reasoning more :)
[21:55:38] can we agree that the general direction is good?
[21:55:49] yes that would be nice
[21:56:02] and that this is something that's useful and needed
[21:56:27] AaronSchulz: indeed
[21:56:39] gwicke: of course since it's a simple rest API we could swap it out if some general solution comes into use and happens to be adequate for cross-DC stuff
[21:56:55] (e.g. no cross-DC zookeeper or anything crazy)
[21:57:10] +1 to paravoid
[21:57:22] AaronSchulz: yup, that's what I was thinking too
[21:57:41] It will be cool to have all read requests (even uncacheable things like watchlist) work if master goes down.
[21:58:03] +1 from me on both strategy & usefulness
[21:58:04] another solution might let the logs backlog for longer (since servers can be down for some time...though for now we can just wipe such servers...memcached would work that way anyway, heh)
[21:58:22] #agreed yes we want this, AaronSchulz please continue with designing and prototyping
[21:58:26] AaronSchulz's patches like https://gerrit.wikimedia.org/r/#/c/194962/ are already being merged, regardless of RFC's approval :)
[21:58:30] springle: we should talk about that later
[21:58:41] "move fast and make things better"
[21:58:43] I defer hot/hot stuff till after we get hot/warm working
[21:58:48] AaronSchulz: aye. just a data point
[21:59:20] gwicke and I have some concerns about cross-DC galera, though I think it might work fine if we are careful and the broadcast mechanism used optimizes RTTs
[21:59:50] the certification-based approach is fairly sound though
[22:00:15] of course it has its long-tail of stuff to fix...like GET_LOCK() not working, blah blah
[22:00:39] we wouldn't be the first to use galera cross-DC. but i agree it isn't necessarily a silver bullet
[22:00:49] also you can get lag with galera unless you use a setting that hurts perf a bit
[22:01:00] but that lag should be rare afaik
[22:01:15] one step at a time ;)
[22:01:23] I notice the IPSec project is moving along now, I guess we can expect that to be done before this project is deployed?
[22:01:37] yes
[22:01:41] unless something goes really bad? :)
[22:01:51] TimStarling, mark: It would be good for keeping Aaron allocated to this to get a +1 on continued work from the architecture committee
[22:02:02] it's scheduled to be finished in the next 2-3 weeks
[22:02:06] So I can wave that at people who ask if it is worthwhile
[22:02:14] bd808: see the last #agreed
[22:02:17] bd808: where do you need the +1? :)
[22:02:51] I can point to these notes :)
[22:03:11] time's up, thanks everyone for coming
[22:03:35] thanks AaronSchulz, this is nice :)
[22:03:39] +1
[22:03:41] very nice
[22:03:45] next week may possibly be something multimedia related, the committee notes are vague
[22:03:59] it's a major effort, but very important
[22:04:29] #endmeeting
[22:04:29] Meeting ended Wed Mar 18 22:04:29 2015 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[22:04:30] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-03-18-21.01.html
[22:04:30] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-03-18-21.01.txt
[22:04:30] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-03-18-21.01.wiki
[22:04:30] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-03-18-21.01.log.html
[22:04:44] unless something goes really bad? :)
[22:05:01] AaronSchulz: https://www.mediawiki.org/wiki/Requests_for_comment/Master_%26_slave_datacenter_strategy_for_MediaWiki#Design_implications talks about "changes would be needed to MediaWiki development standards", are those standards on-wiki anywhere?
[22:05:06] you mean like staff members working on it mysteriously disappearing? ;)
[22:05:25] TimStarling: we're suing the NSA now, soon it won't be needed anymore ;-)
[22:05:32] spagewmf: no, it's still a draft RfC, heh
[22:05:43] but the bullet points describe them
[22:06:14] spagewmf: I guess we could start updating some stuff as of today though
[22:07:12] probably all the stuff except the wan cache (since that's not even merged) can be done now
[22:07:51] AaronSchulz: I mean does mw.org say anything about caching, master-slave currently? I think I've seen it but not organized and obviously not in "How to write your first extension" :)
[22:08:16] maybe, I honestly haven't checked lately
[22:08:17] spagewmf: "it's complicated" ;)
[22:08:45] spagewmf: did you see the performance guidelines pages?
[22:09:02] I guess we could hammer stuff out if you want later
[22:10:38] AaronSchulz: I'll make a #documentation phab task for it, thanks.
[22:12:31] AaronSchulz: I found https://www.mediawiki.org/wiki/User:Aaron_Schulz/How_to_make_MediaWiki_fast but it's paywalled :)
[22:18:17] TimStarling: FYI I updated https://phabricator.wikimedia.org/T88666 with link to your AGREED
[22:45:10] The weekly bug triage meeting for VisualEditor is starting in about 15 to 20 minutes.  If you’re interested in joining, the information is posted on wiki at https://www.mediawiki.org/wiki/Talk:VisualEditor/Portal  This meeting will happen in Google Hangouts.  A few people will also be in this channel during the meeting, although if the recent trend holds, it will be more like an "office half-hour" than
[22:45:10] an "office hour".
[23:00:40] #startmeeting VisualEditor Weekly Bug Triage Meeting
[23:00:40] Meeting started Wed Mar 18 23:00:40 2015 UTC and is due to finish in 60 minutes. The chair is whatami. Information about MeetBot at http://wiki.debian.org/MeetBot.
[23:00:40] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[23:00:40] The meeting name has been set to 'visualeditor_weekly_bug_triage_meeting'
[23:00:58] #chair James_F
[23:00:58] Warning: Nick not in channel: James_F
[23:00:58] Current chairs: James_F whatami
[23:01:08] #topic VisualEditor -- See https://www.mediawiki.org/wiki/Talk:VisualEditor/Portal for link to audio  THIS CHANNEL IS ALWAYS PUBLICLY LOGGED
[23:01:17] #link https://phabricator.wikimedia.org/project/board/1015/
[23:01:39] Hello, everyone, and welcome to what is probably the penultimate weekly bug triage meeting for VisualEditor.
[23:01:51] The main speaker for this meeting will be James Forrester, who is the Product Manager for VisualEditor.  James will probably not be commenting on IRC himself, but you can hear him and talk to him if you join the Google Hangout. If you’re not joining the conference call, then talk to us here, and someone will get your ideas or questions passed along.
[23:01:58] I have some links you need to know about:
[23:02:02] 1)  Where to find information about the meeting, including how to join via Google Hangouts:  https://www.mediawiki.org/wiki/Talk:VisualEditor/Portal
[23:02:07] 2) The project board (with release criteria):  https://phabricator.wikimedia.org/project/profile/1015/
[23:02:12] 3) The Phabricator work board (with list of newly nominated bugs):  https://phabricator.wikimedia.org/project/view/1015/
[23:04:23] Jared Zimmerman, UX person, has stopped in to ask about expanding the user testing that's being planned to include subjective measures like whether editing is enjoyable.
[23:08:07] If anyone has any views about how much that should matter, then feel free to speak up or contact James F or Jared later.
[23:09:35] Moving on with the agenda:
[23:10:09] There was significant progress on the burn-down chart.
[23:10:53] The major change is that the Citoid service is reasonably stable in production.
[23:11:31] The extension (the software in VisualEditor that calls the service) is still not deployed.
[23:12:34] However, it still needs some more work.
[23:13:12] As for the rest, James_F says "Other patches got fixed. No one cares."
[23:13:28] The first newly nominated bug is https://phabricator.wikimedia.org/T93125
[23:13:41] It has been accepted on grounds of corruption.
[23:13:59] https://phabricator.wikimedia.org/T93128 is about & characters getting converted to HTML.
[23:14:06] It is being accepted on grounds of corruption.
[23:14:37] https://phabricator.wikimedia.org/T72143 is about copying and pasting.
[23:15:27] https://phabricator.wikimedia.org/T93045 is a new round of [[Mediawiki:Badtitletext]].
[23:15:33] It's a corruption bug and is accepted.
[23:16:02] https://phabricator.wikimedia.org/T71494 is about span tags again, and will be merged with the other and accepted.
[23:16:30] https://phabricator.wikimedia.org/T92993 is related to performance measurements.
[23:16:50] This is being accepted.
[23:17:17] https://phabricator.wikimedia.org/T92896 is about dirty diffs, with unnecessary spaces being kept at the end of links.
[23:17:52] It's being accepted as a polish bug.
[23:18:56] It appears that there are some complications involving non-English or non-Latin languages around that one, but it's being accepted anyway.
[23:19:43] https://phabricator.wikimedia.org/T92583 is a technical debt bug.
[23:20:01] It's being accepted as "polish".
[23:20:22] When you create a reference, the links in the citation preview don't always work.
[23:20:46] The goal of this bug is to make them either always work or never work, but not sometimes work and sometimes not work.
[23:21:36] That's the end of the meeting. If you have questions, please follow up with James F in the #mediawiki-visualeditor channel, or leave a message at http://mediawiki.org/wiki/VisualEditor/Feedback.
[23:21:47] #endmeeting
[23:21:48] Meeting ended Wed Mar 18 23:21:47 2015 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[23:21:48] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-03-18-23.00.html
[23:21:48] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-03-18-23.00.txt
[23:21:48] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-03-18-23.00.wiki
[23:21:48] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-03-18-23.00.log.html