[20:58:52] * duesen sees two meetbots
[20:59:17] At least one of them won't be happy, then.
[20:59:29] <_joe_> probably we've implemented locking on memcached
[21:01:32] #startmeeting
[21:01:32] duesen: Error: A meeting name is required, e.g., '#startmeeting Marketing Committee'
[21:01:46] #startmeeting TechCom RFC discussion
[21:01:46] Meeting started Wed Sep 16 21:01:46 2020 UTC and is due to finish in 60 minutes. The chair is duesen. Information about MeetBot at http://wiki.debian.org/MeetBot.
[21:01:46] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[21:01:47] The meeting name has been set to 'techcom_rfc_discussion'
[21:02:14] #topic PHP microservice for containerized shell execution
[21:02:23] #link https://phabricator.wikimedia.org/T260330
[21:02:32] * legoktm waves
[21:02:40] * duesen wibbles
[21:03:05] who's here to talk about running shell commands in a (micro)service?
[21:03:11] o/
[21:03:22] <_joe_> o/
[21:04:21] \o
[21:04:34] one thing I need is agreement on the broad idea of a BoxedCommand and the MW interface I'm proposing for that
[21:05:01] another thing is the tricky remaining problems, things that have not been implemented yet
[21:05:44] and maybe we can talk about how it actually fits into kubernetes etc. since people keep asking me about that and I just give a shrug and say "hopefully not my problem"
[21:06:22] For the BoxedCommand interface, do you mean the one defined here? https://gerrit.wikimedia.org/r/c/mediawiki/libs/Shellbox/+/622707/20/src/Command/BoxedCommand.php
[21:06:49] yeah, or more succinctly, the example code under "File API" in the task
[21:07:06] something like
[21:07:15] #info Key questions: 1) BoxedCommand php interface, 2) tricky remaining problems, 3) integration with Kubernetes
[21:07:34] ok I won't paste examples, they'll get too long, you can read the task
[21:08:15] basically a BoxedCommand always has a private empty working directory, even if it has no files
[21:08:35] you add input files to it, which are placed into the working directory
[21:08:57] you register output files, which can be either fixed filenames or a specific sort of glob pattern
[21:09:04] <_joe_> TimStarling: I wondered how one can manipulate the execution environment variables
[21:09:18] there is environment()
[21:09:37] <_joe_> heh I read the examples in the task only
[21:09:46] this existed previously
[21:10:12] in the old code it was implemented by prepending name=value pairs to the shell command line
[21:10:23] in the new code I used the environment option to proc_open() by default
[21:10:45] the BoxedCommand API looks reasonable to me, have you tried converting real MediaWiki shell outs to use the new API?
[21:10:46] proc_open() has a few nice features that we weren't using
[21:10:49] <_joe_> another question: usually command execution environments allow you to either interpret the command in a shell, or avoid doing so. AIUI we default to using a shell, correct?
[21:11:17] legoktm: not yet
[21:11:45] _joe_: yes
[21:12:24] #info environment options are implemented via the environment option to proc_open()
[21:12:24] fundamentally commands are strings, which are run via the shell
[21:12:39] #info fundamentally commands are strings, which are run via the shell
[21:12:49] this is partly a legacy of wfShellExec(), which is still commonly called
[21:12:59] wfShellExec() just takes a single string argument
[21:13:08] TimStarling: do environment variables work for remote commands already? are they transferred like files?
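For reference, a minimal sketch of the proc_open() environment option mentioned above; the command, working directory and variable name are placeholders, not Shellbox code:

    <?php
    // With an $env array supplied, the child sees only these variables
    // instead of inheriting PHP's own environment.
    $descriptors = [
        0 => [ 'pipe', 'r' ], // stdin
        1 => [ 'pipe', 'w' ], // stdout
        2 => [ 'pipe', 'w' ], // stderr
    ];
    $env = [ 'SOME_VAR' => 'some value' ];

    // A string command is run via /bin/sh -c on POSIX systems.
    $proc = proc_open( 'echo "$SOME_VAR"', $descriptors, $pipes, '/tmp', $env );
    if ( $proc === false ) {
        throw new RuntimeException( 'proc_open() failed' );
    }
    fclose( $pipes[0] );
    $stdout = stream_get_contents( $pipes[1] );
    $stderr = stream_get_contents( $pipes[2] );
    fclose( $pipes[1] );
    fclose( $pipes[2] );
    $exitCode = proc_close( $proc );
    echo $stdout; // "some value\n"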
[21:13:25] environment variables are transferred remotely using Command::getClientData()
[21:13:35] they are JSON-serialized, not binary like files
[21:13:58] I think it would be nice to see some real commands converted, just to see what it's like and make sure the API isn't too clunky, but I don't see any issues, it looks nice to me
[21:14:24] Command::getClientData() returns a JSON-serializable array containing the command string and various options
[21:14:40] #info for remote executions, environment variables are encoded as JSON, while files are transferred as binary blobs
[21:14:43] then on the server side there is the internal Command::setClientData() which restores those options
[21:14:44] <_joe_> TimStarling: my 2c is that it would be nice to allow non-shell execution in the future, but probably it's only a nice-to-have
[21:15:37] <_joe_> thinking out loud: could there be a case where env data is not json serializable? I guess not if all values are php strings
[21:15:43] we have restrict(Shell::NO_EXECVE) which is currently implemented by running firejail via the shell, and firejail runs the command not using the shell
[21:16:04] if it is not valid UTF-8 then it can't be JSON serialized
[21:16:11] <_joe_> right
[21:16:36] I tried playing with the json_encode() options but there is apparently no way around this
[21:16:43] <_joe_> but I would consider that a feature tbh
[21:17:00] for the most part I expect that environment variable values are not directly user controlled and should be safe to JSON serialize
[21:17:01] <_joe_> if your env variable is not a utf-8 string, I'm happy rejecting it :P
[21:17:30] I check errors from json_encode() and throw if it is not serializable
[21:17:38] <_joe_> good enough for me
[21:17:44] <_joe_> actually better than the alternative
[21:17:50] _joe_: is the primary motivation for non-shell execution for better performance?
[21:18:05] <_joe_> legoktm: and smaller attack surface, more importantly
[21:18:19] TimStarling: if running via firejail means that the command isn't run in a shell, does this mean that the command syntax would be different? e.g. pipes wouldn't work?
[21:18:45] I've been talking about pipes on the task, legoktm had some thoughts an hour or so ago
[21:18:48] <_joe_> there's less trickery you can do if the process and args are directly executed with execve()
[21:18:57] #info non-shell execution would be nice to have, to reduce attack surface (and improve performance)
[21:19:09] if you do "foo | bar" this becomes "firejail foo | bar", which means foo runs under firejail and bar does not
[21:19:11] https://phabricator.wikimedia.org/T260330#6468248 about whether we need pipes
[21:19:17] <_joe_> TimStarling: do we allow pipes? And more importantly, do we use that?
[21:19:46] we allow them by not disallowing them
[21:19:47] legoktm is saying that we don't use it and that we can go with one of the options I presented which was to reject commands with pipes, at least in BoxedCommand
[21:19:54] <_joe_> TimStarling: I would assume you execute something like "firejail -c '$commands'"
[21:20:06] <_joe_> +1
[21:20:24] <_joe_> I would really prefer we don't allow fancier stuff than executing one command
[21:20:45] <_joe_> I suspected EasyTimeline could use pipes, btw
[21:21:06] redirections have the same problem
[21:21:23] there is also filehandle duplication "2>&1" etc. but that's not really a security issue
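To make the "reject commands with pipes" option concrete, here is one possible shape for such a check, purely illustrative and not the actual Shellbox validation; a real implementation would also have to account for operators appearing inside quoted arguments:

    <?php
    // Hypothetical helper: reject command strings containing shell control
    // operators (pipes, redirections, chaining, substitution).
    function assertNoShellOperators( string $command ): void {
        if ( preg_match( '/[|&;<>]|\$\(|`/', $command ) ) {
            throw new InvalidArgumentException(
                "Pipes, redirections and command chaining are not supported: $command"
            );
        }
    }

    assertNoShellOperators( 'rsvg-convert -o out.png in.svg' ); // passes
    assertNoShellOperators( 'foo | bar' );                      // throws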
[21:21:57] #info rough consensus that we don't need to support pipes, and probably don't want to
[21:22:13] <_joe_> I would consider all that an antipattern. Shell execution should return stdout and stderr, and the mangling should happen in code
[21:22:20] TimStarling: what would we need redirection for if we can directly capture stdout and stderr?
[21:22:25] +1 _joe_
[21:22:36] <_joe_> I mean, I try to avoid it in *rakefiles*. This is code running in production.
[21:22:48] we have includeStderr() which does do that
[21:23:04] actually it appends 2>&1 but whatever
[21:23:06] it could do that
[21:23:37] item 2. tricky questions
[21:24:12] #info we probably don't want stdout/stderr redirection either
[21:24:12] so I think we should support the basic getStdout(), getStdErr() and the includeStdErr() options which merges the two streams, but no arbitrary redirections
[21:24:27] BoxedCommand and UnboxedCommand share a base class with a lot of shared functionality
[21:24:35] <_joe_> yes, if BoxedCommand controls it, it's fine
[21:24:45] some things don't really make sense for BoxedCommand
[21:24:54] <_joe_> my point was about allowing redirections in the command to execute
[21:25:16] #info legoktm> so I think we should support the basic getStdout(), getStdErr() and the includeStdErr() options which merges the two streams, but no arbitrary redirections
[21:25:20] for example I was talking on the change about whitelistPath()
[21:26:01] this is meant to tell the sandbox to allow access to a specific local file or directory, but that doesn't work in BoxedCommand, the local file won't be transferred
[21:26:37] another example is restrict() -- the assumption is that the caller understands what is going on inside the sandbox, the caller basically configures the sandbox
[21:27:22] a third example is the MediaWiki firejail.profile file
[21:28:08] for UnboxedCommand it makes sense to continue to support these things
[21:28:23] but for BoxedCommand maybe the server should be configuring them on a per-route basis
[21:28:37] <_joe_> yes, the idea is that when you do an RPC call, whoever set up the service you're calling has decided the restrictions for you. So I guess +1 to what you just said :D
[21:29:07] if ( $cmd instanceof UnboxedCommand ) { $cmd->restrict( ... ) } ?
[21:29:18] <_joe_> each route will be a separate instance of the service btw, in a different container
[21:29:36] but then there is the local mode, which exists for the convenience of third parties
[21:29:50] in which MediaWiki just executes the boxed command locally
[21:30:16] we could duplicate the per-route configuration into MediaWiki for the benefit of that use case
[21:30:18] <_joe_> so in local mode some of the above restrictions still make sense?
[21:30:27] yes
[21:31:06] or we could ignore restrict() calls in the remote mode and use them in the local mode
[21:31:20] <_joe_> I assumed you could just make it so that if you're using RPC, restrict() and such would simply be ignored, but whitelistPath() is trickier
[21:31:44] tangent re firejail.profile - if systemd-run can support everything we use firejail for, do we want to continue to support firejail in the long run? I would think not
[21:31:54] currently whitelistPath is used by one thing only, that is GitInfo, which we don't think will be a BoxedCommand
[21:31:55] <_joe_> no.
[21:32:06] <_joe_> TimStarling: ack
[21:32:43] <_joe_> legoktm: my "no." was to you, systemd-run is by far the most obvious choice for us
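A minimal sketch expanding the one-liner above, assuming restrict() stays available on locally executed commands: restrictions are applied only when the command runs on the local host, while a remote route is trusted to be sandboxed by its own server-side configuration. $cmd stands for a command obtained from whatever factory is eventually provided; the class and constant names are the ones mentioned in the meeting, and the pattern itself is only an illustration.

    <?php
    use MediaWiki\Shell\Shell;

    // Only meaningful for local execution; for a remote (boxed) route the
    // per-route configuration on the server decides the sandboxing instead.
    if ( $cmd instanceof UnboxedCommand ) {
        $cmd->restrict( Shell::NO_NETWORK | Shell::NO_EXECVE );
    }
    $result = $cmd->execute();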
[21:33:05] <_joe_> otoh
[21:33:08] <_joe_> portability
[21:33:17] <_joe_> do we require Linux to run MediaWiki?
[21:33:18] ignoring restrict() for remote execution sounds potentially risky. or at least non-obvious when reading the calling code.
[21:33:20] firejail is already Linux-only
[21:33:30] <_joe_> ack
[21:33:38] <_joe_> yeah then no reason.
[21:33:52] firejail has different goals to systemd-run, they've solved some tricky problems, like disabling execve
[21:33:53] <_joe_> duesen: what does restrict() allow you to do exactly?
[21:34:00] IIRC the idea I had around whitelistPath() was that you mark the files/dirs you need, and then firejail would hide everything else (to an extent). but that never worked out because how firejail implements path whitelisting and blacklisting is broken
[21:34:10] in the current system, the decision to permit/deny something is generally maintained close to the code calling it. E.g. a syntaxhighlight shelling to pygments, or SpecialVersion shelling to git. In the new system, these restrictions would need to be done ahead of time in the image configuration. I generally like that as it's more strict. But it also allows for mismatches. E.g. if something doesn't need networking but the image permits
[21:34:11] it anyway, would it be useful to verify this in some way?
[21:34:11] I haven't tested it yet in systemd-run but I doubt it can disable execve()
[21:34:21] _joe_: i don't know the details, but i assume it allows you to specify restrictions. which would only apply if executed locally.
[21:34:54] <_joe_> Krinkle: I doubt networking will be allowed in any of those images, but I get the example
[21:35:02] to illustrate in a perhaps over-engineered way, what if part of the rest protocol included sending the restrict flags and the other end will deny if it's expecting something too strict or too loose.
[21:35:05] _joe_: I'd say for remote execution we'll require linux, but local-only should work on windows...
[21:35:31] right, maybe there's just nothing left to restrict. no network and no local disk.
[21:35:36] sub process we don't care per se?
[21:35:37] remote execution will work on windows, remote mode is not the complicated part
[21:35:57] firejail/systemd-run are complicated but I understand these are not even proposed for production
[21:35:58] I would like to see some kind of automated exploit testing that tries to break out of the sandbox
[21:36:04] #info for local execution of boxed commands, do we want an explicit restrict() to apply? Or do we want to locally emulate per-route (per-command) restrictions?
[21:36:07] they are just going to be legacy sandbox methods for 3rd parties
[21:36:12] <_joe_> Krinkle: the point is, we're moving the control of the resources from the caller to me, and I like it a lot :D
[21:36:26] #info I would like to see some kind of automated exploit testing that tries to break out of the sandbox
[21:36:32] <_joe_> but your idea is interesting - adding limits to the wire protocol
[21:36:32] I like that, chaos monkey approach
[21:36:42] _joe_: yeah, not for switching but for verifying
[21:36:58] but maybe we don't need it anymore
[21:37:03] previously _joe_ said there will be no container inside a container
[21:37:14] <_joe_> yes, we won't do that
[21:37:30] <_joe_> if the limits are stricter than what we configured server-side, they will be ignored
[21:38:04] I don't know how the service would verify the limits are implemented. E.g. if NO_NETWORK is set, does it try to ping somewhere and verify that it fails?
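A sketch of how best-effort verification of a claimed NO_NETWORK restriction might look: attempt an outbound connection and expect it to fail. The function name, probe host and startup check are hypothetical, not proposed code.

    <?php
    // Hypothetical self-check: under NO_NETWORK, an outbound TCP connection
    // to a normally-reachable host should fail.
    function networkAppearsBlocked( string $probeHost, int $probePort = 80 ): bool {
        $errno = 0;
        $errstr = '';
        $conn = @fsockopen( $probeHost, $probePort, $errno, $errstr, 1.0 );
        if ( $conn !== false ) {
            fclose( $conn );
            return false; // connection succeeded: restriction not enforced
        }
        return true; // connection failed, as expected under NO_NETWORK
    }

    // e.g. warn at service startup if the check and the configuration disagree
    $configClaimsNoNetwork = true; // would come from the route configuration
    if ( $configClaimsNoNetwork && !networkAppearsBlocked( 'example.org' ) ) {
        trigger_error( 'NO_NETWORK is configured but the network is reachable', E_USER_WARNING );
    }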
[21:38:30] legoktm: my thinking was that the shell service would be told what its limits are, just best-effort verification
[21:38:50] <_joe_> legoktm: it would have access to its configuration, that could include its limits pretty easily
[21:39:03] <_joe_> I think this is unnecessary btw
[21:39:06] currently the server can be configured with useSystemd, useFirejail, if they are both false then NO_NETWORK will be ignored
[21:39:08] <_joe_> in practical terms
[21:39:17] <_joe_> it just makes the interface more fair to the caller
[21:39:47] ah, firejail inside the shell service container?
[21:39:54] <_joe_> TimStarling: NO_NETWORK will be the default condition, enforced by kubernetes, unless we whitelist internet access
[21:40:04] I assume there will be a host link
[21:40:06] in general I think having remote mode ignore restrict() and whitelist/blacklistPath() seems like the easiest way forward, just trusting that the service is set up with the appropriate level of restrictions/isolation
[21:40:26] for a world where we have no idea what commands client code may want to run, that makes sense to me. But we generally do know, and we are going to set up specific images and rules for each such command.
[21:40:31] but continue to support those (or their successors) in local mode
[21:40:55] ok, tricky problem #2
[21:40:56] this configuration doesn't have to be wikimedia-only and specific to our environment. It could be generic, so others can use it.
[21:41:11] In that case, it would make sense to also enforce the same rules in the same way for local execution
[21:41:12] (sidenote: we don't even shell out to git in production because of the pre-computed "git cache")
[21:41:14] server-side validation of commands against configuration
[21:41:23] I think whitelist() should cause an error, since that's not about protecting stuff, but setting a functional requirement that we can't satisfy, right?
[21:41:57] or would we want callers to sometimes do file writes directly and sometimes indirectly when emulating the box service.
[21:42:06] seems simpler to always follow the tmp/remote approach
[21:42:14] <_joe_> Krinkle: +1. Even worse, it assumes you have access to the local filesystem where you e.g. have mediawiki's sources
[21:42:20] <_joe_> and you don't
[21:42:20] fwiw whitelist is effectively broken in the current firejail implementation: https://phabricator.wikimedia.org/T182486 so if we lost it it's not a regression
[21:42:50] <_joe_> TimStarling: what are your doubts about server-side validation?
[21:42:52] so you have this whole JSON-serializable command blob sent to the server
[21:43:14] do we want to validate the whole thing using JSON schema or something? (that would be a new dependency added to MW)
[21:44:08] validating the command string means doing some sort of parsing on it, I've been looking at the POSIX shell grammar for ideas on that
[21:44:24] <_joe_> TimStarling: yes, I assume you want to do that
[21:44:25] what constraints do we want to place on the command string?
[21:44:26] the blob doesn't seem that large or complicated to where JSON schema would be an advantage (considering the cost of the dependency)
[21:44:41] compared to just writing the validation by hand
[21:45:07] <_joe_> TimStarling: mostly you want to ensure the executable is the one you're configured to execute, then you might want to do regex matches on the arguments?
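A sketch of what hand-written per-route validation could look like, as an alternative to pulling in a JSON-schema dependency: an exact match on the executable plus a pattern applied to each argument. The policy shape, binary and patterns are made up for illustration and are not the proposed configuration format.

    <?php
    // Hypothetical per-route policy for one command.
    $routePolicy = [
        'executable' => '/usr/bin/rsvg-convert',
        // no whitespace, no shell syntax; '..' is checked separately below
        'argumentPattern' => '/^[A-Za-z0-9._=\/-]+$/',
    ];

    function validateCommand( array $policy, string $executable, array $args ): void {
        if ( $executable !== $policy['executable'] ) {
            throw new InvalidArgumentException( "Executable not allowed: $executable" );
        }
        foreach ( $args as $arg ) {
            if ( !preg_match( $policy['argumentPattern'], $arg )
                || strpos( $arg, '..' ) !== false
            ) {
                throw new InvalidArgumentException( "Argument rejected: $arg" );
            }
        }
    }

    validateCommand( $routePolicy, '/usr/bin/rsvg-convert', [ '-o', 'out.png', 'in.svg' ] );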
[21:45:43] <_joe_> like check that no argument includes '../', or that all characters in arguments are ASCII
[21:45:48] <_joe_> (just as examples)
[21:45:54] what are we trying to protect against?
[21:46:05] I think the way to do validation by hand would be to allow exact matches of properties, and also to allow unconstrained properties
[21:46:09] <_joe_> commands that inject user input
[21:47:07] <_joe_> legoktm: also if this service needs to be something more than just "execute commands coming from MediaWiki", it seems like a valuable thing to do
[21:47:17] <_joe_> being able to set constraints on the command-line arguments
[21:47:26] I can imagine a constraint like ensuring that a command runs a given binary, with no pipelines etc.
[21:47:32] I'd so much love if we could push escaping of user input to the actual point of execution, where we know exactly what will be executing the string. Doing quoting/escaping on one system, and then sending the result off to another system to be run via an unknown shell sounds really scary.
[21:47:56] if we're worried about malformed commands, that seems maybe more valuable for the unboxed than the boxed implementation.
[21:47:57] <_joe_> TimStarling: yes
[21:48:04] #info I think the way to do validation by hand would be to allow exact matches of properties, and also to allow unconstrained properties
[21:48:07] regexes are potentially error-prone for that, maybe we want to parse the command and validate a structured representation of it
[21:48:11] although if the images are persistent, then I guess we are worried about it for the boxed one as well.
[21:48:11] <_joe_> duesen: that was the idea yes
[21:48:36] <_joe_> TimStarling: yes, I was thinking in python I'd do this using argparse.ArgumentParser
[21:48:37] I recall Tim saying that the images are not concurrently shared with other commands, but that they are recycled/long-running.
[21:48:38] _joe_: but the current proposal takes the entire shebang as a single string...
[21:49:35] <_joe_> duesen: and I assumed the remote service would do the validation of that string
[21:49:49] I will defer to _joe_ on any details about containers/images
[21:49:55] is it possible to get the benefits of native PID namespaces without killing the pod, e.g. do we do something there to prevent background/sleeper processes from staying around?
[21:50:21] <_joe_> Krinkle: what would that sleeper process access?
[21:50:32] <_joe_> I think it's a third-order issue
[21:50:43] I'm thinking that commands are generally really simple if we can disallow everything fancy, including quoting, escaping, pipes, variables, etc. That would be possible if we could inject escaped values late in the process: the command string would contain a placeholder, and there would be a setParameter( $name, $value ) method.
[21:50:56] Time check: 10 minutes
[21:51:21] we have drifted on to the last agenda item I suggested which is container details
[21:51:42] <_joe_> ok
[21:51:56] _joe_: my point was that validation is really hard if you allow all the fancy syntax. and could be much simpler if we could side-step this.
[21:51:57] _joe_: what I mean is a user tricking MW/box service to run something unintended with side effect of a sub process staying around in the container, able to e.g. observe past runs and/or influence future ones.
[21:52:35] it sounds like we mostly treat the box service as being without much risk, not allowed to do anything, which is imho only fine if the image is shut down afterwards and not reused from a live pool.
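A sketch of the placeholder idea from [21:50:43]: the command template stays trivial to validate, and user-supplied values are escaped only by the side that actually executes the command and knows its shell. setParameter() and this substitution helper are hypothetical, not part of the proposed API.

    <?php
    // Hypothetical late substitution of named placeholders.
    $template = 'rsvg-convert -w {width} -o {output} {input}';
    $parameters = [
        'width' => '300', // imagine this came from user input
        'output' => 'out.png',
        'input' => 'in.svg',
    ];

    function substituteParameters( string $template, array $parameters ): string {
        return preg_replace_callback(
            '/\{([a-z]+)\}/',
            static function ( array $m ) use ( $parameters ) {
                if ( !array_key_exists( $m[1], $parameters ) ) {
                    throw new InvalidArgumentException( "Unknown placeholder: {$m[1]}" );
                }
                // escapeshellarg() assumes the executing host uses a POSIX shell.
                return escapeshellarg( (string)$parameters[$m[1]] );
            },
            $template
        );
    }

    echo substituteParameters( $template, $parameters );
    // rsvg-convert -w '300' -o 'out.png' 'in.svg'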
[21:53:08] <_joe_> Krinkle: ok, that container is a stripped-down linux distro with bash, and one other executable. what do you think a "sleeper process" would be able to do?
[21:53:12] there could be privacy consequences of allowing one shell command to access the temp files of another
[21:53:23] <_joe_> what I mean is - are we passing PIIs to this?
[21:53:34] I don't think we can reasonably avoid that, yes.
[21:53:36] for example, the first command could be from a public wiki and the second command could be a private wiki
[21:53:40] #info commands are generally really simple if we can disallow everything fancy like quoting. For user input in parameters, we could use placeholders that are substituted late in the process.
[21:53:46] <_joe_> ok, that's a fair point
[21:54:15] _joe_: I don't know specifically about PII, but content from private wikis definitely
[21:54:19] <_joe_> I might add, though, that I think this is much more improbable as an attack surface than anything we have today
[21:54:20] if we had some "fairly good" approach to being able to kill the command and anything it did between runs, that'd address my concern.
[21:54:39] <_joe_> also in terms of stealing private data
[21:54:43] and yeah, it'd be great if we can at least make sure tmp output files of "good commands" are cleaned up indeed
[21:54:58] <_joe_> the command will be killed, but the php process that launched it, potentially not
[21:55:09] <_joe_> so not the whole container, no
[21:55:15] #info there could be privacy consequences of allowing one shell command to access the temp files of another. for example, the first command could be from a public wiki and the second command could be a private wiki
[21:55:27] <_joe_> that could be done on a case-by-case basis though
[21:55:38] would it be feasible within a container to give the spawned process a PID namespace of some kind that we can easily kill (without all the other security benefits of a true container)
[21:55:51] #info <_joe_> the command will be killed, but the php process that launched it, potentially not
[21:56:06] wouldn't it be sufficient to randomize the file names, and prevent enumeration of temp files?
[21:56:07] <_joe_> Krinkle: I *think* so, but I have to verify
[21:56:34] <_joe_> duesen: not if the attacker gains root via a local privilege escalation of some kind
[21:56:58] <_joe_> so on this topic, I think we should spawn a separate task and I'd like to involve more SREs
[21:57:40] it would be nice to know exactly what is feasible and what is not
[21:57:51] setrlimit() is unprivileged and I assume it would be allowed
[21:58:02] <_joe_> I would assume as well
[21:58:22] <_joe_> TimStarling: ask me again when it's not midnight though :P
[21:58:40] fair enough
[21:58:55] we are nearly at time
[21:59:00] any last things to sort out?
[21:59:18] <_joe_> to be clear, I'm not discounting the possibility of respawning a container for each execution, esp for the most sensitive tasks. It's a big performance hit though
[21:59:35] probably >1s
[21:59:37] _joe_: maybe only for the private wikis?
[22:00:24] <_joe_> duesen: if we want to go that way, I'd assume we'd end up using some system similar to aws lambda on the backend
[22:00:40] right
[22:00:52] ok, anything that still needs to be captured in #info?
[22:00:53] <_joe_> but I think this could be also a future evolution
[22:01:31] <_joe_> the practical fact is that executing external programs remotely from where the mediawiki secrets are reduces our vulnerability surface by a lot
[22:01:40] maybe process groups can be as good as PID namespaces, if the command is restricted from creating a process group
[22:01:58] but that probably means allowing seccomp inside the container
[22:02:24] ok, i'm closing the meeting.
[22:02:39] please continue the conversation on the ticket or on #mediawiki-core
[22:02:44] #endmeeting
[22:02:44] Meeting ended Wed Sep 16 22:02:44 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[22:02:45] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2020/wikimedia-office.2020-09-16-21.01.html
[22:02:45] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2020/wikimedia-office.2020-09-16-21.01.txt
[22:02:45] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2020/wikimedia-office.2020-09-16-21.01.wiki
[22:02:45] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2020/wikimedia-office.2020-09-16-21.01.log.html
[22:02:49] <_joe_> ack :)
[22:02:53] Thanks for hosting.
[22:02:53] thank you all!