[18:11:23] Hey everyone, Roan Kattouw is currently tech talking about ResourceLoader Tips and Tricks: https://www.youtube.com/watch?v=AGiDYvcroWEif you have questions for about the Tech Talk feel free to ask here. [18:37:05] I note that a lot of virtual file generators are stored in the (ExtensionName)Hooks.php file [18:37:20] Is that a good practice given that these aren't really hooks? [18:46:48] srrodlund: lots of echo since you unmuted :) [18:46:54] (in youtube anyway) [18:47:06] i got one! well not a question [18:47:17] FYI, Timo and I recently added callbackParam for callback [18:47:17] https://gerrit.wikimedia.org/r/c/mediawiki/core/+/544240 [18:47:19] sounds ok to me on YouTube, no echo [18:47:30] hm ok. [18:47:46] oh i have it open in another window somewhere... [18:48:04] oh good thanks ottomata I wasn't hearing it, but I do that sometimes, too! [18:48:15] Especially when there is a stream and a hangout at once [18:51:24] hey, is there a way to change the value of the virtual json file in tests? I want to write to some qunit tests for different values of a config variable but I couldn't find an easy way [18:53:08] hm, could hack that in with the callbackParam, but would be bit hacky, putting test conditional in code [18:53:25] Is there a plan to add documentation on this and can we help / where? [18:54:27] ... My question [18:54:47] Am I lagged with the video? [18:54:53] I dunno [18:55:07] Thanks! [18:55:10] :( [18:55:11] ty! [18:55:12] OH no [18:55:12] Apparently [18:55:15] We missed it [18:55:19] rats! [18:55:23] I think I'm lagged [18:55:32] well try asking RoanKattouw directly :-P [18:55:35] documentation on what btw? [18:55:38] It's okay I'll just ask RoanKattouw later 😁 [18:55:59] Everything he showed, but more specifically the last bonus with the advanced stuff [18:55:59] I think matmarex wrote the imagemodule stuff [18:56:03] Yes. Also, please let me know, so I can document the question [18:56:58] The invert and colors for image module through extension.json documentation would be incredibly helpful [19:00:07] https://www.mediawiki.org/wiki/Manual:ResourceLoaderImageModule.php [19:00:32] that was super helpful and will be a nice thing to send to new devs in the future, thanks Roan and Sarah! [19:00:58] Yay! Happy it was helpful [19:01:10] I'll publish a corrected version of the slides soon [19:01:34] Some errors snuck in, I tried to call them out as I went [19:03:22] And yes huge props to MatmaRex on the image module code (especially the OOUI image module stuff which is dark magic) [20:50:38] ooh i'll have to watch the video of that [20:53:06] <_joe_> I'll try to be around for the IRC meeting as long as I manage [20:56:43] my goat baby monitor batteries died during the first TechCom meeting so I was happy for the extra time to check on her and get the charger [20:56:49] (she was due yesterday) [21:00:05] #startmeeting TechCom RFC discussion [21:00:06] Meeting started Wed Oct 23 21:00:05 2019 UTC and is due to finish in 60 minutes. The chair is KateChapman. Information about MeetBot at http://wiki.debian.org/MeetBot. [21:00:06] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. [21:00:06] The meeting name has been set to 'techcom_rfc_discussion' [21:00:20] #topic Require use of common storage abstractions https://phabricator.wikimedia.org/T215465 [21:00:59] so who is here to discuss common storage abstractions and requiring their use? [21:01:07] o/ [21:01:14] o/ [21:01:52] o/ [21:01:53] o/ [21:02:06] o/ [21:02:22] excellent. Let's start off by talking about the benefits this would provide and also possible concerns. [21:02:29] Krinkle: did you want to give an overview? [21:02:36] <_joe_> o/ (ish) [21:02:49] OK [21:03:36] So today we're looking to explore the general area. Whether a policy of this type is desirable, and what benefits it might have (and for whom), and what restrictions/process it might put in place (and for whom). [21:04:20] The general area being to formalise a strong preference for using existing abstractions within MediaWiki over connecting to available backends directly. [21:04:38] As example, to e.g. require use of BagOStuff over connecting to a Memcached server directly. [21:04:52] [21:05:22] anyone have any questions so far? [21:05:23] can you give an example for db access? [21:06:37] I think we have defined DB access less well; at the least folks should use the accessors in place of manually writing SQL whenever possible, but there's a lot of wiggle room where the current APIs are hard to extend "cleanly" and easy to extend "poorly" :D [21:06:45] The main sql interface that comes to mind is the wikimedia-rdbms library. so it could mean that we would not allow (unless agreed ahead of time to do otherwise) that an extension should not use php socket or mysqli functions directly but instead use the core IDatabase implemetnation. [21:06:57] *nod* [21:08:13] <_joe_> I have one question: the idea is to have a policy and document it, or do we want to establish ways to enforce it? [21:08:42] _joe_: can you give an example of what enforcing would mean? [21:08:48] a technical limitation? [21:08:54] <_joe_> yes [21:09:05] <_joe_> some form of code analysis [21:09:13] <_joe_> in ci [21:09:38] I didn't think of that as it seemed impossible to me, but I suppose ^ yeah, static analysis could catch it, but no run-time way I think that would be hard to do with maybe not justified amount of effort required. [21:09:52] <_joe_> the alternative model is "we will rollback your code if we notice" [21:10:12] <_joe_> I wasn't suggesting a runtime enforcement, no [21:10:25] If this became a development policy, the burdon would be on +2'ers to enforce this. TechCom could be a last resort if there is a dispute though. [21:10:44] And yeah, generally once a policy like this forms, it doesn't take long before someone new gets told off and then someone writes a PHPCS rule for it. [21:11:01] <_joe_> ack [21:11:08] That would still allow room for exceptions to be made (via phpcs) so that seems fine. Not sure we should code that into the policy thoguh [21:11:17] * via phpcs.xml [21:12:09] I have a question, is there a list of storage abstractions currently available? [21:12:13] some mechanism for allowing exceptions should be included (who grants them? what's the process?) [21:13:02] a perhaps less obvious example of something that could fall under this policy is shell execution. We could decide to limit the policy to external services. But requiring use of MediaWiki\Shell over shell_exec seems preferable as well (and already socially established I believe). That can be a non-techcom coding convention on its own of course, so could also keep out of scope [21:13:42] let's keep that out of this discussion, as it's not strictly storage-related [21:13:44] <_joe_> answering the question from cdanis: no, there is no such thing, and it should be a consequence of this RfC IMO [21:13:49] _joe_: ack [21:14:18] apergos: how exceptions works is open for debate. The default answer (if not specified otherwisse by the policy) would be that TechCom decides (TechCom task, or techcom@ e-mail required; RFC not required). [21:14:44] <_joe_> #info <_joe_> While there is no current list of recommended storage abstractions, one should exist as part of this RfC [21:14:50] all right [21:15:22] #info how exceptions works is open for debate. The default answer (if not specified otherwisse by the policy) would be that TechCom decides (TechCom task, or techcom@ e-mail required; RFC not required). [21:15:49] any other questions or comments on this level? If not next I was going to bring up the levels of abstraction [21:16:16] <_joe_> My next question is: we have plenty of different-level abstractions for storage, do we want to allow use of them all or rather pick some of them, and narrowly define their promises in terms of consistency, availability, latency, and cost of storage? [21:16:22] <_joe_> eheh KateChapman :P [21:17:01] can you give an example of where the same thign has multiple abtractions, or where two separate things are abstracted at different levels? [21:17:11] <_joe_> ok [21:17:24] <_joe_> so BagOfStuff is a very generic k-v interface [21:17:52] <_joe_> but different implementations have radically different guarantees in terms of how storage is treated [21:18:21] <_joe_> think of RestBagOfStuff (that we back with kask/cassandra AIUI) and memcachedbagofstuff [21:18:31] or apcubag vs sqlbag [21:18:38] <_joe_> they can be plugged in independently into code [21:18:49] <_joe_> that uses bagofstuff [21:19:12] <_joe_> and change radically how things are stored and what expectations you can have on their peristence / consistency [21:19:27] or latency [21:19:41] <_joe_> yeah it wasn't a complete list [21:20:23] <_joe_> this can result in people deploying mediawiki making wildly inaccurate choices on what instance of that interface to use [21:21:03] <_joe_> so I would hope that the allowed interfaces would be more generalized than MemcachedBagOfStuff but less generic than BagOfStuff [21:21:28] <_joe_> so something like VolatileBagOfStuff, if you allow me the horrible name [21:21:51] I think not coding a against a specfic backend (postgrest, memcached) is good and allows for easy porting down the road. [21:22:00] _joe_ (or others) do you think there are current scenarios where the right level of abstraction doesn't exist? Meaning it would be hard to make a recommendation through policy that is possible to follow? [21:22:05] but, I don't see yet how that ties in with the policy. [21:22:20] For the policy, as I understand it, it would mean to interact with Memcached you have to use MemcachedBagOfStuff [21:23:06] <_joe_> ok, I hoped we could also become picky about what interfaces are recommended amongst the existing ones [21:23:09] Whether behind the type hint of "BagOStuff" with a generic factory like "getMainStash()" or directly as 'new MemcachedBagOfStuff($wgUnicornMemcached['servers'])' is imho out of scope. [21:23:22] <_joe_> but probably is matter of a further RfC [21:23:30] <_joe_> more general than just MediaWiki [21:23:42] I do agree with you more generally that BagOStuff specifically should probably allow more granular promises so that a consumer can make expectations based on it. [21:24:12] I imagine we have a lot of code right now written with SqlBag or MemcBag in mind that would not work as expected if given a ApcuBag in a multi-server setup. [21:24:12] <_joe_> but for now you just want to establish that people shouldn't reinvent the wheel in every extension [21:24:20] <_joe_> fair enough :) [21:24:51] I believe Aaron has some patches for this, very much like you're saying. Basic key/value store vs replication vs merging/lock/sync etc. [21:24:55] But yeah out of scope I think [21:24:59] <_joe_> ok [21:25:35] <_joe_> that can be part of a more general RfC about storage interfaces and MediaWiki can be a special case where those would be applied. [21:25:38] maybe it is out of scope for this discussion, but I think it might be nice to see a very limited set of abstractions allowed in WMF production, not just for maintainability and understandability of our own MW installations, but also for a future where all of those abstractions are properly instrumented with basic metrics monitoring, distributed tracing support, etc [21:25:43] is the lowest public/stable interface we have for abstracting the various storage backends MW supports always at the same abstraction layer currently? [21:26:16] #info it might be nice to see a very limited set of abstractions allowed in WMF production, not just for maintainability and understandability of our own MW installations, but also for a future where all of those abstractions are properly instrumented with basic metrics monitoring, distributed tracing support, etc [21:26:23] <_joe_> cdanis: yeah I think that can be matter of a later RfC [21:27:15] <_joe_> Krinkle: I still think we should allow, per policy, to refer developers to a list of allowed interfaces [21:27:45] <_joe_> not /require/, but /allow/, so that if in the future we want to restrict which interfaces to use, it can be done without changing the policy again [21:28:20] Yeah, I think a list of examples would be useful to get more feedback, but I don't want to whitelist that the policy only applies to that set of specific backend services [21:28:55] We might want something more generic like for a given backend service, MW should only offer one abstraction, and that one should be re-used/improved for all other use cases (unless othewise… …) [21:28:59] <_joe_> yeah I wasn't suggesting that. I find already problematic that most of our interfaces are pretty technology specific [21:29:30] <_joe_> and yes, 1 abstraction per storage system is absolutely a must [21:29:51] <_joe_> but well, maybe more of a single /library/ [21:29:55] OK. So next: current violations and whether they are a problem? [21:30:29] s/are a problem/ could be improved by using a another one of the interfaes [21:31:07] Krinkle: the policy is speaking to those using the abstractions. Are you also thinking some developing the abstractions guidelines? [21:31:08] <_joe_> Krinkle: there is at least one use of redis with a direct connection, using technology-specific verbs [21:32:10] KateChapman: Yeah, that should be much rarer (we introduce those maybe once every 10 years), but I think could be coded as part of this. Don't create new ones and don't bypass existing ones. [21:32:48] also brion bought up "What should the interfaces look like in how they are communicated to the dev that needs to use them?" is that something to discuss? [21:33:45] <_joe_> Krinkle: I would suggest that violations should be eventually resolved. In some cases they might come from a specific need that our interfaces don't cover [21:34:20] Yeah, I was thinking we could try to enumerate some here as it might help gain confidence or find holes in our ideas. [21:34:32] <_joe_> but I would consider such exceptions technical debt. It surely is at least as far as the SRE side of the equation is concerned [21:35:13] <_joe_> IIRC one is Extension:GettingStarted ? [21:35:47] <_joe_> https://www.mediawiki.org/wiki/Extension:GettingStarted suggests my memory is not failing me [21:37:03] Yeah, Redis use in GettingStarted does not seem "necessary" I think it would be accommodated by the general abstractions we have for that in core. [21:37:14] But, we have two [21:37:22] 1) RedisBag, and 2) RedisConnectionPool [21:37:25] the second is more low-level [21:37:29] should that be considered internal? [21:37:40] <_joe_> I would think so [21:38:11] Is it used by JobQueueRedis, and RedisPubSubFeed (aka for RCStream.py) [21:38:12] <_joe_> I also think RedisBag is already quite narrowed to a specific technology, but I can live with that :) [21:38:28] should those be able to use RedisBag? [21:38:31] <_joe_> both now unused IIRC [21:38:36] for WMF yes, [21:38:43] but still supported in core [21:38:52] and seems like a valid use case [21:39:02] <_joe_> I doubt JobQueueRedis would be able to, it's a ton of storage-specific lua [21:39:58] <_joe_> what one could argue is that the real interface is the one that /uses/ jobqueueredis [21:40:01] And for RCFeed, it uses pub() which seems like it shoudl not be exposed via RedisBag [21:40:05] <_joe_> that's a specific implementation [21:40:16] <_joe_> of the JobQueue interface [21:40:41] <_joe_> developers should use the JobQueue interface, not the specific implementation. [21:41:02] So I guess we're allowing RedisConnectionPool to be used directly, but only it requires more than trivial key-value logic. Or are we saying we don't allow that for WMF, but elsewhere in core and WMF extensions if disabled by default? [21:41:03] <_joe_> being an implementation, JobQueueRedis can use redisconnectionpool [21:41:19] Rihgt but not php-redis directly [21:41:29] so for the policy I guess we're saying redisconnectionpool is the abstraction. [21:41:35] <_joe_> Krinkle: I think it should only be allowed in implementing an implementation of a storage interface [21:41:56] <_joe_> so interface: JobQueue implementation: RedisJobQueue [21:42:07] <_joe_> implementations should use our lowe-level abstractions, yes [21:42:17] redisconnectionpool < redisbago < redis revision blob store < Revision Service. [21:42:24] It's quite blurry where the draw the line. [21:42:35] <_joe_> sure [21:43:11] Should we distinguish "2nd level storage interface" in the policy, or can we stop it at redisconnectionpool for the purpose of "hard" requirements? [21:43:46] <_joe_> but yes, in general use of low-level interfaces like RedisConnectionPool should be limited to implementing interfaces and considered case-by-case [21:43:56] Or something use-case driven (which cannot be easily enforced) like if there are multiple layers available to use the highest available one that can work. But that seems odd to include in a policy. [21:43:59] <_joe_> using php-redis directly should be of course out of the question [21:44:20] <_joe_> it's easier to make specific examples than to formalize a policy [21:44:32] <_joe_> but also, I have talked too much :) [21:44:35] Right but arguably everything is a storage interface that uses redis within a class that will be used by something else. GettingStarted probably wrapped it in a class as well. [21:44:39] that doestn't make it right [21:44:47] <_joe_> yes [21:45:51] next: memcached [21:45:59] We have MemcachedClient (low level) and MemcachedBagOStuff [21:46:10] unlike redis, I thikn we can consider MemcachedClient as fully internal [21:46:16] even MemcLockManager uses the BagO layer [21:46:43] <_joe_> so the situation is better defined in that case. [21:47:02] <_joe_> WanCache is also an interface built on top of it right? [21:47:08] Sure yeah [21:47:27] it types against BagO generally, but yeah, commonly used with memc [21:47:51] <_joe_> so I would assume people should in general use WanCache maybe? As opposed to a LocalCache? [21:48:14] <_joe_> as a developer, it would be easier for me to reason in terms of locality of the caches (which also imply performance) [21:48:14] anything above MemcBag, I don't really mind for this purpose. [21:48:39] <_joe_> rather than "this uses memcached / this uses apc" which requires specific knowledge [21:48:54] but yeah, from a MW specific perspective, WANObjectCache is the default recommendation. With LocalClusterCache as second choice if less indirection is needed. [21:49:05] both of which wrap the same MemBag [21:49:05] 12 minutes left [21:49:31] <_joe_> ok, I thought that we could summarize this with: [21:49:39] also next: sql. It seems we always use IDatabase there, even for things like SqlBagOStuff and MySqlLockManager [21:49:52] <_joe_> - only use mediawiki's abstraction, not direct php drivers [21:50:18] <_joe_> - when possible, use one of the recommended Interfaces (with a list to be maintained with proper documentation) [21:50:53] <_joe_> - when not possible, that can be done after having reached an agreement? [21:51:41] <_joe_> yes sql is pretty uncontroversial [21:52:16] <_joe_> everything including kv access like external storage or parsercache use the mw abstractions already [21:53:13] So thinking about Echo and GettingStarted and how to codify (if at all) against this. [21:53:19] GettingStarted did use RedisConnectionPool [21:53:23] via RedisConnRef [21:53:48] But not only to use a lower layer, but also to use its own connection (different hostname/port config) [21:54:39] <_joe_> that would fall into the "when not possible" category [21:54:49] <_joe_> but I'm not 100% convinced that's justified [21:54:55] <_joe_> esp in an extension [21:55:30] <_joe_> recommended interfaces vs low-level abstraction [21:55:36] Right for that rule to apply we would need to disallow RedisConnectionpool as well [21:55:55] and draw the line at RedisBag instead (and other Redis* things) [21:56:06] which means TechCom involvement if somethig like RedisBag or RedisJQ would be created in the future [21:56:09] maybe that's a good thing? [21:56:14] <_joe_> not include it in the recommended interfaces, yep [21:56:20] <_joe_> i think so [21:56:28] cool, that wasn't what I said earlier, but makes sense to me [21:58:52] <_joe_> any other matter that can be discussed in 2 minutes? [21:58:56] <_joe_> 1 minute :P [21:59:04] I'm hungry [21:59:10] <_joe_> eheh [21:59:26] I'm mostly asleep at this point (sorry) [21:59:34] I'll follow up on wikitech-l for more input. [21:59:36] Thanks all :) [22:00:15] thanks all! [22:00:19] #endmeeting [22:00:19] Meeting ended Wed Oct 23 22:00:19 2019 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) [22:00:19] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2019/wikimedia-office.2019-10-23-21.00.html [22:00:19] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2019/wikimedia-office.2019-10-23-21.00.txt [22:00:19] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2019/wikimedia-office.2019-10-23-21.00.wiki [22:00:19] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2019/wikimedia-office.2019-10-23-21.00.log.html [22:00:30] good night!