[16:58:21] * bd808 looks around for anomie [16:58:41] * anomie forgot to click "join" after opening the hangout [17:18:01] bd808, anomie: if you guys are meeting -- AaronSchulz really needs anomie to review his patches to make CentralAuth multi-dc compatible -- https://gerrit.wikimedia.org/r/#/q/owner:%22Aaron+Schulz+%253Caschulz%2540wikimedia.org%253E%22+status:open+project:mediawiki/extensions/CentralAuth,n,z [17:21:47] ori: I looked at it, thought "It'll probably do what it purports to do, but I don't really know all the details of why it's being done, or about CA for that matter, so..." [17:22:59] anomie: "....so I'll verify that it works, give it a +1, wait a week, and then +2 it if no one objects"? :) [17:23:14] ori: "... so I don't really know how to test it or anything" [17:23:32] AaronSchulz: ^ [17:27:27] labs has the db2 slave [17:28:02] AaronSchulz: you'll probably need to be a lot more verbose and explicit than that [17:28:11] that's all rather terse [17:28:20] maybe add a comment or send brad an email with instructions on how to test it? [17:28:44] * AaronSchulz was still typing [17:28:51] TYPE FASTER [17:28:58] * ori kids [17:29:24] bd808: Hmm. Hangouts just died on me. [17:29:32] meh. ok [17:29:47] maybe email is better. In any case it's a matter of (a) looking for race condition at account creation and (b) doing creation/login/logout/views on labs and making sure that works [17:30:08] * AaronSchulz still needs to get back to chris on that autologin bit [21:54:20] csteipp: have a sec? [21:54:39] Give me about 3 minutes [21:58:54] bd808: welcome back, sir :) [21:59:09] thanks greg-g [21:59:15] I trust you had a relaxing time? [21:59:25] * bd808 is trying to remember how this work thing is supposed to go [21:59:46] coffee, typey typey typey, something something, sleep [22:00:14] yeah. totally unplugged and I only thought about work in a vague "I'll have to do things when I get home" sense [22:01:00] sweet [22:01:04] well done [22:01:41] I *almost* bought a sim card for my phone and then decided not to. I was very glad by day 2 that I hadn't [22:01:56] yeah bd808 glad your about, and I already bugged you about random minutia [22:02:06] it's like you never had a vacation at all now :) [22:02:31] I'd honestly hoped nobody would notice I was gone. Apparently that didn't happen [22:06:24] csteipp: ready? [22:07:17] AaronSchulz: Yes! [22:07:27] csteipp: if ChronologyProtector used user ID (else IP) for keys instead of being part of the session and MW supported some X-MediaWiki-CP-User: header, parsoid could get easy chronology protection on behalf of the user. However, I wonder if a simple rule of ignoring the header unless the REMOTE_ADDR is local is needed (and is enough). "Enough" to avoid a [22:07:27] side-channel leakage of whether a user was recently doing stuff on wiki by how slow adding the header for a given user ID makes requests. Usually state changing data makes public logs anyway, but not always (e.g. oversight deletions, event logging type stuff, ect...). HMAC + timestamp check instead of a REMOTE_ADDR check could beef it up, but makes things a [22:07:27] wee bit more complex. [22:07:47] some context at https://phabricator.wikimedia.org/T111264 [22:08:58] AaronSchulz: So to start... ChronologyProtector? [22:09:07] the class in db/ [22:09:20] currently is stores DB master positions in the user session [22:09:29] so the next page views make sure whatever slave is used is caught up to that [22:09:46] cool [22:09:50] the positions are stored (at the end of the request) when the user causes writes [22:09:57] *it stores [22:12:09] using CP will slow the page views a bit (or even several seconds if lag is several seconds) [22:12:22] since it waits on DB slave connection [22:12:42] though if the user didn't cause any DB writes it won't [22:12:54] (recent writes) [22:12:56] It waits on the slave instead of going to the master directly? [22:13:01] yep [22:14:23] So the general attack being, find a way to read a page as the user (CSRF). If there's a spike in latency, we know the user did a write. If it hasn't been logged anywhere, it may have been a private / suppressed activity. Is that right? [22:14:45] well if we add a simple header for the user, no CSRF is needed [22:14:51] the bug is about splitting CP from sessions [22:15:22] the question is how much to lock down any CP-specific header for things like parsoid ect. [22:15:28] Oh, so if it's not in the session, then the attacker can just query as the user directly. Uhg, really don't like that. [22:16:24] well "as the user" only for CP [22:16:57] Right :) [22:17:22] but that's enough to maybe tell if the user is active (if anything is causing DB writes, analytic tracking stuff being the easiest since it doesn't require the user to be making edits or anything) [22:17:52] it's easy for code to slip in writes without anyone knowing...MW is big :) [22:18:08] so we can't assume writes would be for stuff like edits and whatnot [22:18:11] So yeah, checking REMOTE_ADDR would solve this for parsoid. There are certainly ways to have our servers make querries, so I don't love it. But the chances of being able to set a custom header like that should be small. [22:19:06] so "is local" wouldn't just be 127.x but any of the local IP ranges, of course [22:19:08] What would the hmac actually be of? Some secret we give to the user? [22:19:15] Right [22:19:25] csteipp: a secret only parsoid boxes and MW know I'd assume [22:19:53] restbase too if needed [22:20:09] Oh, only our services, not the user. gatcha. [22:20:54] That would work. It should be an hmac-based signature over the request, so it can't be stolen. [22:20:58] so the user wants parsoid html+rdfa, parsoid needs to fetch stuff from MW, but wants to see any recent changes [22:21:04] s/stolen/replayed/ [22:21:27] gwicke doesn't want to forward the whole MW cookie, as then revdelete/private stuff can show depending on the users rights [22:21:34] that would mess with their caching model [22:21:45] Yeah, it would [22:22:14] a TTL should be enough, as replay doesn't cause any true state changes [22:22:23] the only "state" is just caching afaik [22:23:45] what is your threat model? An attacker guessing the CP cookie? [22:24:24] we could wrap that in a JWT [22:24:33] well it's open source, heh [22:24:36] gwicke: who's threat model? mine? [22:24:43] we don't show user IDs much but they are not officially hidden [22:24:49] so no counting on that [22:24:52] user id's are public [22:25:08] yep [22:25:13] ah [22:25:24] why not use timestamps? [22:25:27] or counters? [22:25:53] I assume the id is used to resolve to the offset? [22:26:57] the user ID gives master position [22:28:05] (or IP if logged out, all via redis lookup) [22:28:28] could we set a cookie with the actual position instead? [22:29:05] we don't expose any of this in cookies or URLs atm since it's a bit messy [22:29:49] if we didn't protect the value, then I guess one concern would be a DOS, by asking for an offset far in the future [22:29:51] so if I make an edit, there could be 3 DB master positions stored, could be weird fitting them into one cookie (or split up) [22:30:12] we'd also have to sanity check the value (via hmac and binlog coordinate rules) [22:30:27] heh, I remember that coming up the last multi-dc rfc [22:31:05] is there a way to translate a timestamp to a master position? [22:32:15] no way to get historical timestamp => positions [22:33:00] if we used pt-heartbeat (we will soon), we could query the master UNIX timestamp and wait till the slave heartbeat.heartbeat table has heartbeat.t > (that timestamp). Though that would require polling, ugh. [22:33:02] but, would it be possible to get ? [22:34:12] currently the waiting use a nice blocking MASTER_POS_WAIT method [22:34:28] but that needs an exact coordinate [22:35:04] csteipp: so how good is just the "is local" IP check? [22:35:28] hmm [22:35:32] obviously more vulnerable to proxying surface [22:35:49] 'is local' IP checks would require XFF forwarding [22:36:17] well MW has decent XFF support [22:36:25] which isn't an issue where we do that anyway, like for save APIs [22:36:36] but, for Parsoid that's not something we would normally supply [22:36:46] I guess that would help with any servers that mindless relays headers (assuming it sends XFF) [22:36:53] it shouldn't need to know about the client requesting this [22:37:54] AaronSchulz: It depends on what is being proxied in-- if it's the User id from their cookie, then it's not much better than allowing attacker direct access. [22:38:53] If there's a cookie with the master position, then attacker can't know that. But if it's just user id from the cookie, then attacker can ask parsoid to do the lookup, and you still have the leak. [22:39:31] we could protect the id with a signature [22:40:16] while the attacker knows the id, it couldn't fake the signature [22:40:50] csteipp: I wonder if this can be done now already [22:41:08] e.g. does a cross domain iframe element still expose onload triggers? [22:41:34] well, that would require the user to be also visiting some site...so that's a lot narrower [22:41:36] nevermind [22:41:56] gwicke: yeah, that was the hmac idea [22:42:17] it's not too complex, but does require a bit of config to be in sync [22:42:32] if that's fine for parsoid the we could do that [22:42:36] we already use JWTs to protect page state in restbase, for example [22:43:15] the config is basically one secret [22:44:43] gwicke: As long as the attacker can't ask parsoid, with an unsigned user id, for something that parsoid then signs and asks mediawiki about [22:44:47] top hit for PHP is https://github.com/firebase/php-jwt [22:45:20] that's already in mediawiki/vendor [22:45:23] csteipp: so far, parsoid doesn't have any special privileges [22:45:26] so no signing [22:45:51] it also doesn't get any sensitive request information from RB [22:45:55] Sorry, "service" [22:46:17] So as long as the attacker can't ask RB, with an unsigned request... etc. [22:46:38] as long as other services don't know the secret, they won't be able to sign anything [22:46:40] Otherwise the timing attack is still there, just with extra overhead for both services. [22:47:24] neither RB nor Parsoid would know the CP secret [22:47:49] Hmm? AaronSchulz, I thought that is what you were trying to do? [22:48:25] yeah, there is a missing link there [22:48:36] gwicke: Or you're saying mediawiki gives a cookie / jwt to the browser, then the browser uses that to query RB? [22:48:45] csteipp: yes [22:49:21] a la https://www.mediawiki.org/wiki/Requests_for_comment/Service-oriented_architecture_authentication [22:49:42] either (a) the stuff that makes DB writes has to be a request that gives a signed token to the user to give on their next parsoid request or (b) parsoid would have to know that a request it is acting behalf of is "for user X AND authentic" [22:49:43] but limited to CP, for now [22:50:21] I would rather not trust RB or Parsoid more than absolutely necessary [22:50:25] gwicke: maybe that phab task needs some scenarios of what web requests actually happen for example [22:51:30] one is hitting 'edit' using VE right after saving [22:51:34] and before the slave has caught up [22:51:40] gwicke: So yes, you can go that route, if Aaron's ok with that. I think you would have to make that cookie based, otherwise you're going to have requests get out of sync... but that's more you're territory :) [22:53:27] so the JS could ask for a signed CP token and then use it for getting the html+rdfa to show on 'edit' [22:54:02] (ask some JTW or cruder HMAC api.php endpoint) [22:54:10] added that use case to https://phabricator.wikimedia.org/T111264 [22:54:42] maybe the token could be prefilled for performance like the csrf tokens are [22:54:42] the save end point should basically set a cookie [22:55:57] a timestamp would be nicer in some ways [22:56:56] there's a somewhat related problem when we are getting serious about caching authenticated requests [22:57:54] one of the ideas we discussed in the past was to keep a list of recently edited things (or just a 'last edit' timestamp) in a cookie, and then disabling caching based on that [22:59:44] * AaronSchulz looks at ResourceLoaderUserTokensModule [22:59:52] FB is handling the same issue with write-through cache invalidation combined with a stable client -> proxy mapping [23:01:31] a cookie-based solution would have the advantage of working cross-DC as well [23:02:04] well the sticky DC cookie should be set here anyway [23:02:19] so checking redis for the position is fine [23:02:55] unless you by "works" mean "let's you not have to stick" [23:03:19] if the only data changed was in those DBs (not swift or anything), then yes that is true in theory [23:05:23] gwicke: so yeah, a special ResourceLoaderUserTokensModule token seems like it could do the trick [23:05:38] * AaronSchulz will update the task [23:07:32] AaronSchulz: the JWT stuff standardizes some things around validity, how the hash is combined with the content etc; could be worth reusing [23:07:51] despite the pain of pulling in PHP libs [23:08:15] relative pain [23:10:46] gwicke: as in one hash for several token entities or something? [23:10:51] * AaronSchulz needs to read up on that spec [23:11:33] https://en.wikipedia.org/wiki/JSON_Web_Token [23:13:32] php-jwt seems to have seen some amount of scrutiny [23:33:36] gwicke: you realize chris already reviewed php-jwt, and we're using it for OAuth and ContentTranslation? :) [23:33:59] legoktm: oh, didn't know that [23:34:08] my grep-fu seems to be too weak [23:34:57] may I quote you on the bug? [23:36:43] legoktm: went ahead at https://phabricator.wikimedia.org/T111264#1683486 [23:38:08] legoktm: thanks! [23:38:36] gwicke: I was looking for the task, it's https://phabricator.wikimedia.org/T106762 [23:38:51] https://phabricator.wikimedia.org/T106762#1489741 "In production, you shouldn't set JWT::$leeway to more than a few seconds, if at all."