[12:59:20] Content Translation office hour starting here in less than a minute
[13:01:15] #startmeeting Language Team office hour - September 2017
[13:01:15] Meeting started Wed Sep 20 13:01:15 2017 UTC and is due to finish in 60 minutes. The chair is kart_. Information about MeetBot at http://wiki.debian.org/MeetBot.
[13:01:15] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[13:01:15] The meeting name has been set to 'language_team_office_hour___september_2017'
[13:01:25] Welcome to this online+IRC office for Content Translation
[13:01:35] Our main conversation is happening on Google Hangout/youtube:
[13:01:42] #link https://www.youtube.com/watch?v=MD-BKoSj-oY
[13:01:54] Please let us know if you would like to join on the hangout
[13:02:07] We will also be taking questions here
[13:02:18] Reminder that the logs of this channel will be recorded and posted on meta wiki
[13:02:25] The recording from the last meeting is at:
[13:02:40] #link https://www.youtube.com/watch?v=8Euhu4Q7HF4
[13:02:42] and logs are at:
[13:02:54] #link https://meta.wikimedia.org/wiki/Special:MyLanguage/IRC_office_hours/Office_hours_2017-06-27
[13:03:08] We are talking today about updates to Content Translation.
[13:03:39] If you have any questions please ask us here and we will address them either on IRC or in the main session.
[13:20:28] Dydh da
[13:32:00] #link https://www.mediawiki.org/wiki/New_Editor_Experiences
[13:37:54] https://www.mediawiki.org/wiki/New_Editor_Experiences does this apply to small Wikipedias as well?
[13:40:47] Aguyintobooks: The research was done with two medium sized wikis for now.
[13:41:17] Czech and Korean
[13:41:29] I see
[13:42:04] Aguyintobooks: You can ping AbbeyRipstra or neilpquinn for more information.
[13:42:33] Is there an equivalent research program, or one that is planned or completed, for the smaller size wikipedias?
[13:42:42] The research document (on that page) is pretty extensive too
[13:43:20] Aguyintobooks: I don't think so. But Abbey and Neil would know more. Did you have any particular language/wiki in mind?
[13:45:28] yes my local one: https://kw.wikipedia.org/wiki/Folen_dre
[13:47:02] Aguyintobooks: ahh ok. But sorry, we don't know for sure.
[13:48:51] Ok, well thanks anyway
[13:49:55] Aguyintobooks: in case you are interested there was an earlier research for readers that had more variety in terms of the wikis that were covered
[13:50:01] #link https://meta.wikimedia.org/wiki/New_Readers/Findings
[13:50:14] Aguyintobooks: ^^
[13:50:25] I will check it out
[13:54:07] I am mainly asking after noticing the disparity between the different Wikipedias. For example, I live in Launceston, which is in Cornwall, the Cornish language being that of the kw-wiki.
[13:54:22] Hi, can I read the research done on Nigeria?
[13:55:07] its en-wiki page is pretty good: https://en.wikipedia.org/wiki/Launceston,_Cornwall. the corresponding kw-wiki page is a stub: https://kw.wikipedia.org/wiki/Lannstefan.
[13:57:58] mojaam: AbbeyRipstra can help in that.
[13:58:27] @kart_ thanks, I've checked the logs and found it.
[13:59:13] #endmeeting
[13:59:14] Meeting ended Wed Sep 20 13:59:13 2017 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[13:59:14] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-09-20-13.01.html
[13:59:14] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-09-20-13.01.txt
[13:59:14] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-09-20-13.01.wiki
[13:59:14] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-09-20-13.01.log.html
[14:01:18] Aguyintobooks: That is exactly what the Content Translation tool is aimed at helping with. You can find a video about it in this blog post: https://blog.wikimedia.org/2016/07/16/content-translation-milestone/
[14:17:46] The content translation tool does not include machine translation for all languages. Is this correct?
[21:01:19] #startmeeting RFC meeting
[21:01:20] Meeting started Wed Sep 20 21:01:19 2017 UTC and is due to finish in 60 minutes. The chair is TimStarling. Information about MeetBot at http://wiki.debian.org/MeetBot.
[21:01:20] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[21:01:20] The meeting name has been set to 'rfc_meeting'
[21:01:58] #topic Introduce InterruptMutexManager T161749
[21:01:59] T161749: Introduce InterruptMutexManager - https://phabricator.wikimedia.org/T161749
[21:02:03] * legoktm waves
[21:03:07] apparently AaronSchulz is not here today, we have Krinkle though
[21:03:56] o/
[21:05:19] Aye, so in a nutshell, we've got various different ways of doing locking (mysql, memcached, PoolCounter) and using the best tool for a particular purpose is sometimes not possible (or requires hardcoding things with either no good fallback, or duplication of fallback code).
[21:05:43] This task proposes an abstract interface for such operations that could be implemented in different ways.
[21:06:16] The task doesn't currently mention how such an instance would be obtained, but I suppose it would be via MediaWikiServices
[21:06:31] but the PoolCounter core class is already a generic interface, you specify the class name in $wgPoolCounterConf
[21:06:34] there is an abstract base class called LockManager. can you explain how that relates to the new interface?
[21:06:45] that's the split between the core PoolCounter and the extension PoolCounter
[21:06:54] is the problem just that we should have chosen different names for those?
[21:06:56] Either configurable (similar to JobQueue backend and other such configuration variables), or a relevant extension might claim the default automagically.
[21:08:10] TimStarling: That's a good point. The current task doesn't seem to have considered PoolCounter (I think). However it does mention the desire for interacting with a subset of features that cannot be emulated by MySQL/Memcached
[21:08:19] Which is presumably why we don't have a default for PoolCounter in core?
[21:08:36] This is mainly about the mutex case (not semaphores)
[21:08:36] DanielK_WMDE: some LockManager subclasses could implement it
[21:08:55] hey AaronSchulz!
[21:09:22] that can* be emulated by MySQL/Memcached.
[21:09:36] LockManager has to be broad enough to support lots of things, some of which use WaitConditionLoop (a retry/sleep loop)
[21:09:51] I think we don't have one in core because nobody has asked for that before?
[21:09:52] so that cannot implement that interface itself
[21:10:10] PoolCounter only has three abstract functions that you need to implement: acquireForMe(), acquireForAnyone() and release()
[21:10:21] most of the complexity is in the config
[21:10:25] PoolCounter can implement it, though not all things that implement it could implement PoolCounter
[21:12:31] acquireForAnyone() and wait queue limits and such are useful but more specific logic that requires a service aware of PC-specific logic
[21:12:38] I'm blurry on the intended use case. For Wikidata's ChangeDispatcher, we need mutex logic so multiple instances of the dispatchChanges maintenance script don't try to dispatch to the same wiki at the same time.
[21:13:05] We have LockManagerSqlChangeDispatchCoordinator that uses the lock() and unlock() functions from the LockManager base class.
[21:13:05] DanielK_WMDE: that probably doesn't need good performance, right?
[21:13:31] well, it shouldn't have *terrible* performance, but it doesn't need to be blazingly fast, i suppose
[21:13:41] I suppose any use of getScopedLock or lock or IDatabase or BagOStuff qualifies.
[21:13:48] so a dumb retry-sleep-work is OK for that
[21:13:49] time between lock and unlock ranges from seconds to minutes
[21:14:16] rather than an interrupt based system with queueing (probably FIFO)
[21:14:17] Which brings up the *other* abstract lock interface we have in core, in addition to PC, which is BagOStuff::lock
[21:14:30] one of the goals in the RFC is to have a fallback sequence
[21:14:46] which is something that core PoolCounter doesn't provide, like I say, all complexity is in the config
[21:15:12] which is fine for WMF but maybe not fine in general?
[21:15:13] It seems to me that the difference between the different locking interfaces, their intended use case and their contracts need to be made very clear. I for one am a bit confused right now
[21:15:31] BagOStuff::lock() isn't meant to be that general, it seems like a quick-and-dirty way to fake CAS
[21:15:45] lots of things use it since it happens to be there AFAIK ;)
[21:15:59] I think for some cases using db::lock or bagostuff::lock directly is fine, but I think we have a habit of using it in lieu of the thing this RFC proposes.
[21:16:17] the PoolCounter extension doesn't even register itself, so there's no way for a theoretical fallback sequence to even use it at present
[21:17:53] DanielK_WMDE: LockManager: SH/EX locks, not for high performance/contention, PoolCounter: has to support acquireForAny() cache logic; IMM => high performance EX locks
[21:18:27] the low criteria for LockManager makes it easy to implement in lots of ways
[21:18:47] AaronSchulz: so the new interface provides the same basic functionality as LockManager, but adds guarantees?
[21:19:42] I can't really think of a way to merge these into one or two interfaces without curtailing the possible sub-classes (such as those for vanilla installs just grabbing whatever is available that works)
[21:19:49] Ah, I didn't know LockManager is usable outside of filebackend. Looks like it is.
[21:20:01] DanielK_WMDE: basically, but also doesn't have to support SH locks
[21:20:27] we could have a basic interface that doesn't require SH locks
[21:21:01] DanielK_WMDE: not really useful without the interrupt part too
[21:21:49] AaronSchulz: i didn't get the interrupt part. what should i be reading?
[21:21:56] for-loops either have very spammy polling or backoff (somewhat the opposite of FIFO in effect) and overshoot (if it doesn't timeout due to being raced out)
[21:23:01] DanielK_WMDE: I suppose the InterruptMutexManager class comment
[21:24:27] * Mutexes are acquired via native blocking and are released automatically on disconnect.
[21:24:29] * These are suitable for high-traffic items that need FIFO-style blocking without polling.
[21:24:34] AaronSchulz: you mean that?
[21:28:22] so it's the simplest (most possible implementations) interface to guarantee that kind of non-slow-polling property
[21:29:41] so the proposed patch doesn't have any registration or fallback, it's unclear how you would get an instance
[21:30:11] having an interface for code that just needs some mutex mechanism definitely sounds like a good thing. It should probably be a generic service in MediaWikiServices.
[21:30:50] So application logic can just ask for a mutex manager to be injected
[21:31:09] the patch is not on the top of my todo list, but it would be nice to have at some point
[21:31:50] would the generic service wiring mechanism be sufficient to provide the registration and fallback mechanism you have in mind?
[21:32:12] then we wouldn't have to come up with a new config/registry/factory mechanism
[21:33:57] I'd like to know a little bit more about how the different locking interfaces we have actually relate to each other, and whether they have any overlap.
[21:34:18] in my mind, we should have an interface for each useful set of methods/guarantees.
[21:34:38] the fact that we have several locking interfaces with different behaviours seems to indicate a need for such interface definitions.
[21:34:51] Daniel asked about the use case earlier and we didn't get an answer
[21:35:07] does that mean that nobody really needs or wants this?
[21:35:43] is it tech debt? will existing code start to use it when it is available?
[21:35:48] a common case would be (has mysql/postgres => use named DB lock version)
[21:35:58] <_joe_> which kind of promises do we expect this lock interface to make? PoolCounter doesn't guarantee global locks to be unique, if you have more than one poolcounter instance, right?
[21:36:24] AaronSchulz: use for what?
[21:37:14] DanielK_WMDE: edit stash locking as the commit msg mentions, also https://phabricator.wikimedia.org/T161749#3622647
[21:37:33] if there's more than one PC server, the server is chosen from the key based on consistent hashing
[21:37:44] so there is only one pool per key, as long as the server is not flapping
[21:37:45] <_joe_> TimStarling: and if one server fails?
[21:38:10] <_joe_> TimStarling: yeah, my point is if it's acceptable for the lock to give no strict guarantee of uniqueness
[21:38:10] yeah, PC prioritises staying up over being correct, that is true
[21:38:41] AaronSchulz: do you think wikibase should use it instead of using LockManager?
[21:38:52] it sounds like it should
[21:38:55] <_joe_> because what most people expect of a lock is for it to be a guarantee across the whole application environment, and that requires a distributed datastore
[21:39:15] if there are zero servers available then it will pretend the lock was acquired, which is the ultimate in non-unique locking, right?
[21:39:47] <_joe_> like zookeeper or etcd, but both have serious performance issues. I'm not sure low-latency distributed locking is something available outside of some large internet companies
[21:40:14] <_joe_> so I would ponder which kind of promises we want this locking interface to make to the programmer.
[21:40:49] <_joe_> I would advise against offering a global unique lock for things like guaranteeing single execution across different servers.
[21:41:36] <_joe_> PC gets away with its simple and effective design because we're ok with having at most N concurrent processes thinking they have the lock
[21:41:50] all of the use cases there are for caching or flap-tolerant
[21:41:56] <_joe_> ok
[21:42:16] I suppose one could do something like BagOStuff::getQoS() to see how rigorous they are
[21:42:43] <_joe_> I think DanielK_WMDE hinted at a different use-case earlier, that's why I was asking
[21:42:54] I would prefer separate interfaces for different sets of guarantees. could be compatible interfaces though
[21:43:19] if they have to be rigorous they could probably just use DB->lock() or the DB subclass directly
[21:43:31] but type hinting is nicer than checking if ( $cache->getQoS() & QOS_GLOBAL ) ...
[21:43:41] (since we tend to store stuff like that in the DB masters anyway)
[21:43:48] <_joe_> yes
[21:44:10] what are our notes and action items?
[21:44:13] that's what we used to do for wikidata, but that caused us to hog connections to the master db.
[21:44:26] so we switched to a Redis based LockManager
[21:44:46] I can only think of FileBackend not necessarily being alongside DB_MASTER writes and using no SPOFs for canonical data
[21:44:46] a nice narrow interface would be preferable to binding to the LockManager base class
[21:45:04] is the RFC to be approved?
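The exchange above about multiple PoolCounter servers (one server chosen per key by consistent hashing, and a lock that is "pretended" to be acquired when no server is available) can be illustrated with a rough sketch. This is not PoolCounter's actual code: the function names, the server names, and the use of rendezvous (highest-random-weight) hashing as the consistent-hashing variant are all assumptions made for illustration.

```python
import hashlib
from typing import Optional, Sequence, Tuple


def pick_server(key: str, servers: Sequence[str]) -> Optional[str]:
    """Map a lock key to one server using rendezvous hashing.

    Hypothetical helper: every server gets a pseudo-random weight for the
    key and the highest weight wins, so a key keeps its server as long as
    that server stays in the list.
    """
    if not servers:
        return None

    def weight(server: str) -> int:
        digest = hashlib.sha256(f"{server}|{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(servers, key=weight)


def route_lock_request(key: str, servers: Sequence[str]) -> Tuple[str, Optional[str]]:
    """Decide where a lock request for `key` would go.

    With no servers left, the caller proceeds as if the lock were held
    ("fail open"), which trades uniqueness for availability.
    """
    server = pick_server(key, servers)
    if server is None:
        return ("fail-open", None)
    return ("ask-server", server)


if __name__ == "__main__":
    pool = ["pc1.example", "pc2.example", "pc3.example"]  # made-up names
    print(route_lock_request("edit-stash:Main_Page", pool))
    print(route_lock_request("edit-stash:Main_Page", []))
```

If a key's server drops out of the list while the key is held, the next request for that key goes to a different server, which is the "flapping" case in which two processes can both believe they hold the lock.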
[21:45:09] in that case, a distributed and rigorous mutex could be useful in theory
[21:45:28] I was really just focused on the cache and lightweight data cases
[21:45:41] <_joe_> AaronSchulz: a distributed and rigorous mutex is a very, very complex problem, I'd leave it aside
[21:45:47] DanielK_WMDE: yeah, an interface probably could do too
[21:45:49] TimStarling: i'm in favor of defining an interface for a generic mutex service. i have no idea if the interface as proposed is ideal
[21:46:55] I think it can be approved assuming ServiceWiring integration?
[21:47:12] that's a separate question from whether anyone cares enough to bother implementing it
[21:47:54] +1
[21:48:20] the PC patch, https://gerrit.wikimedia.org/r/#/c/332951/ , has a few things I would want to follow up about in gerrit, it's a bit weird
[21:48:27] there isn't much to implement. Aaron's patch covers most of it. just needs DI integration and unit tests.
[21:49:54] it's become clearer to me in the course of this discussion that the existing PC interface is not enough, it makes sense to have a variety of lock interfaces
[21:50:36] hmm, I may have comments on the core patch as well
[21:51:18] timeout as a function parameter is of course different from the PC policy of centralised configuration
[21:51:44] but that's not in the RFC, so doesn't need to block the RFC
[21:53:21] so approve or last call? more conventional to go to last call
[21:53:42] sgtm
[21:54:03] you mean last call sounds good?
[21:54:20] yes. and more conversation
[21:55:04] #info move RFC to last call
[21:55:13] #endmeeting
[21:55:13] Meeting ended Wed Sep 20 21:55:13 2017 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[21:55:13] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-09-20-21.01.html
[21:55:13] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-09-20-21.01.txt
[21:55:13] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-09-20-21.01.wiki
[21:55:13] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-09-20-21.01.log.html
[22:00:51] AaronSchulz: Could you review https://phabricator.wikimedia.org/T161749#3622991 at some point?
[22:01:04] Made a comparison of the existing lock managers
[22:03:10] Krinkle: does "limited semaphore" mean N-semaphore?
[22:03:24] e.g. true semaphore and not just a mutex
[22:03:31] AaronSchulz: Yeah, so not support for reference counting but with a maximum
[22:03:37] like PC does
[22:03:41] not sure if there's a better name for that
[22:04:11] If there's better things to also/instead add to the table, feel free
[22:04:18] I'm just trying to get a better picture of things.
[22:04:39] maybe copy to the task description and edit there?
[22:05:11] I suppose another column could be no-polling
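The "no-polling" property raised in the closing message, and the retry/sleep versus native-blocking distinction from the RFC meeting, might be sketched as follows. This is only an in-process analogy using Python's `threading.Lock`, not MediaWiki code: the function names and the polling interval are made up, and `threading.Lock` makes no FIFO promise, so the sketch covers the polling aspect only.

```python
import threading
import time
from typing import Tuple


def acquire_by_polling(lock: threading.Lock, timeout: float,
                       interval: float = 0.01) -> Tuple[bool, int]:
    """Retry/sleep loop: try, sleep, try again until the deadline passes.

    Returns (acquired, attempts). A waiter only notices a release at its
    next attempt, so a short interval is spammy and a long one overshoots.
    """
    deadline = time.monotonic() + timeout
    attempts = 0
    while True:
        attempts += 1
        if lock.acquire(blocking=False):
            return True, attempts
        if time.monotonic() >= deadline:
            return False, attempts
        time.sleep(interval)


def acquire_by_blocking(lock: threading.Lock, timeout: float) -> bool:
    """Native blocking: the waiter sleeps inside the primitive and is woken
    on release, with no retry loop in the calling code."""
    return lock.acquire(timeout=timeout)


if __name__ == "__main__":
    lock = threading.Lock()
    lock.acquire()  # simulate another holder
    print(acquire_by_polling(lock, timeout=0.05))   # gives up after several attempts
    print(acquire_by_blocking(lock, timeout=0.05))  # gives up after one blocking wait
    lock.release()
```

In a comparison table like the one mentioned above, the first function stands in for implementations built on a retry/sleep loop and the second for ones that can block natively.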