[18:01:51] o/ is CREDIT happening today?
[18:06:56] yeah same question. I wanted to present but I don't see anything happening
[18:07:02] I wonder if we have it or not?
[18:07:23] the calendar has it but the wiki https://www.mediawiki.org/wiki/CREDIT_showcase#Presenting says "TBD"
[18:07:39] so I guess I'll wait another 5 mins and then bail...
[21:02:33] hi
[21:02:50] in a moment, we'll discuss https://phabricator.wikimedia.org/T167906
[21:03:00] * DanielK_WMDE_ is looking for meetbot
[21:03:09] * DanielK_WMDE_ sees *two* meetbots
[21:03:15] #startmeeting
[21:03:15] DanielK_WMDE_: Error: A meeting name is required, e.g., '#startmeeting Marketing Committee'
[21:03:15] DanielK_WMDE_: Error: A meeting name is required, e.g., '#startmeeting Marketing Committee'
[21:03:20] mo' meetbots mo' problems
[21:03:32] #startmeeting ArchCom RFC
[21:03:35] Meeting started Wed Jul 5 21:03:32 2017 UTC and is due to finish in 60 minutes. The chair is DanielK_WMDE_. Information about MeetBot at http://wiki.debian.org/MeetBot.
[21:03:35] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[21:03:35] The meeting name has been set to 'archcom_rfc'
[21:03:35] Meeting started Wed Jul 5 21:03:32 2017 UTC and is due to finish in 60 minutes. The chair is DanielK_WMDE_. Information about MeetBot at http://wiki.debian.org/MeetBot.
[21:03:35] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[21:03:35] The meeting name has been set to 'archcom_rfc'
[21:03:37] meetbot, now in stereo?
[21:03:43] #topic Make API usage limits easier to understand, implement, and more adaptive to varying request costs / concurrency limiting
[21:03:50] #link https://phabricator.wikimedia.org/T167906
[21:04:11] TheresNoTime: yes. potentially writing into each other's view...
[21:04:18] can someone kick one of them?
[21:04:22] anyway
[21:04:44] gwicke: what are the main questions you want to have answered today?
[21:05:31] I would like to get feedback on the direction of using concurrency limits rather than rate limits to better handle differing request costs
[21:06:09] if we reach consensus, then this would provide a mandate to investigate avenues for implementing this
[21:06:11] gwicke: in practical terms, this means a limit to the number of concurrent requests from a single IP, or for a single session ID?
[21:06:27] and concurrent is across the cluster, not on a single node, right?
[21:06:27] either, that's a separate question
[21:06:56] for now, it is IPs, but in the future I expect it to be tokens or some other, more unique property
[21:07:21] concurrent means how many requests can be outstanding from any given client
[21:07:34] identified by (ip, token, something else)
[21:07:40] is there a reason to choose only one of those options?
[21:07:49] concurrency vs rate limiting I mean
[21:08:13] what would happen when the concurrency limit is exceeded?
[21:08:18] complexity would be one reason
[21:08:54] gwicke: iirc, you said that it's very hard to come up with useful rate limits
[21:09:08] DanielK_WMDE_: I only see one meetbot job running on the grid. I'm afraid if I mess with it that both bots will die...
[21:09:12] people struggle to clearly understand and implement rate limits right now, and the hope is to achieve more effective protection by moving to something that is easier to understand and implement
[21:09:32] yes, rate limits are also very blunt instruments
[21:09:45] bd808: huh, fun.
[21:09:47] there is no need to implement concurrency limits at all, right?
[21:10:01] if you're overrun, some of your connections just get blocked
[21:10:04] TimStarling: requests would be rejected with status 429
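To make the mechanism under discussion concrete, here is a minimal sketch, purely illustrative and not taken from the RFC: in-flight requests are counted per client key (a token when one is supplied, the IP otherwise, as suggested later in the discussion), and anything over the cap is answered with 429. The limit value and the request object are invented for the example.

```python
# Illustrative sketch only; not the actual proposal or any production code.
# Per-client concurrency limiting: count in-flight requests per key and
# reject with 429 once the cap is exceeded.
import threading
from collections import defaultdict

MAX_IN_FLIGHT = 5  # assumed value, purely for illustration

_lock = threading.Lock()
_in_flight = defaultdict(int)

def client_key(request):
    # Fall back to the IP if no token is given (hypothetical request dict).
    return request.get("token") or request["ip"]

def handle(request, do_work):
    key = client_key(request)
    with _lock:
        if _in_flight[key] >= MAX_IN_FLIGHT:
            return 429  # "Too Many Requests"
        _in_flight[key] += 1
    try:
        return do_work(request)
    finally:
        with _lock:
            _in_flight[key] -= 1
```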
[21:10:18] bd808: an op could just kick one of them from the channel... oh, well. no big deal.
[21:10:36] IP limits would probably be pretty harmful for Tools
[21:10:38] or do you mean people who maintain the services struggle with rate limits?
[21:10:54] tgr: clients struggle with them
[21:10:59] you'd need to maintain an IP whitelist
[21:11:02] we have a finite IP pool for a large percentage of Action API requests
[21:11:16] bd808: on labs, you mean? because they share an IP?
[21:11:20] the handling of labs & ips is a separate question
[21:11:30] how is it separate?
[21:11:41] bd808: the scoring team wants to limit concurrency by user agent, that seems like a useful approach to me
[21:11:43] "< gwicke> identified by (ip, token, something else)"
[21:11:44] it's orthogonal to rate vs. concurrency limiting
[21:12:11] we already set different limits for internal ips, including labs
[21:12:13] tgr: what, only 100 firefox users at once?...
[21:12:36] DanielK_WMDE_: makes more sense for services that are not exposed to browsers
[21:12:41] bd808: with orthogonal, I mean that they can be discussed completely independently
[21:12:48] #info limits based on IP is potentially problematic. a whitelist would be needed for wmlabs, at least.
[21:13:02] the only real fix for this is requiring api tokens
[21:13:14] and that's kind of nasty for open data
[21:13:26] tgr: yea, but we are talking about public apis. so one UA would for instance identify pywikibot.
[21:13:26] and then you have two problems
[21:13:28] right, but let's keep that discussion for later
[21:13:46] if it is trivially easy to get another token then it's trivially easy to abuse.
[21:13:53] DanielK_WMDE_: so put your username into the pywikibot UA or suffer
[21:13:58] if it's not trivially easy then it isn't open
[21:14:04] bd808: you could fall back to IP based limits if no token is given.
[21:14:09] being reachable is the whole point of client UAs
[21:14:28] tgr: pywikibot should do that by design if it's a logged in action...
[21:14:35] Less chance for user error
[21:14:39] hmm, it seems that the discussion has moved to other topics
[21:14:40] indeed
[21:15:12] gwicke: the whole comparison between rate limiting and concurrency rings false to me
[21:15:16] gwicke: so you want to discuss the mechanism in the abstract?
[21:15:17] any thoughts on the RFC?
[21:15:29] we don't do rate limiting in the sense that limiting is proposed here
[21:15:36] gwicke: this seems to be a crucial aspect, though. can we really discuss the rest of your proposal without an idea of what we are going to apply the limits to?
[21:15:42] i.e. limiting enforced by the server
[21:16:10] we ask users to rate limit themselves, at best
[21:16:10] DanielK_WMDE_: how would one affect the other?
[21:16:15] I'm not sure what issue we're trying to solve tbh. I disagree with the assertion that rate limiting is hard on the client side
[21:16:30] (sorry, mobile)
[21:16:30] so of course that's harder for them to implement than limiting done by us
[21:16:34] concurrency is typically an app layer concern. the app knows what is actually concurrent, like rendering a page or whatever
[21:16:44] tgr: we enforce rate limits a) at the varnish level, and b) on a global REST API level, and c) on a per-endpoint level
[21:17:33] so one of the issues is that limits typically only apply to cache misses, which are very hard for clients to predict
[21:18:04] also, global limits need to be set very conservatively, as some endpoints are very cheap, and high request rates are expected and wanted
[21:18:22] as a result, global rate limits fail to protect expensive end points
[21:18:53] global concurrency limits however could help to provide baseline protection for both cheap & expensive end points
[21:19:11] in a way that is more predictable for clients
[21:19:12] gwicke: if we just assume that we can somehow identify a client, and leave it for later to find out how, then it's fine. If we cannot assume that, we have to think about the pros and cons of various approximations, and how these approximations interact with different kinds of limits.
[21:19:19] I'm having a hard time with the argument that IP is orthogonal when the RFC seems to propose building an IP based rate limiter into Varnish
[21:19:23] as no distinction between cache hits & misses needs to be made
[21:19:41] #info I'm having a hard time with the argument that IP is orthogonal when the RFC seems to propose building an IP based rate limiter into Varnish
[21:20:03] orthogonal means that a change along one dimension does not affect the other dimension
[21:20:24] you can swap ips against tokens without that affecting the discussion of rate vs. concurrency limits
[21:20:47] Well then you shouldn't use IPs in your example ;-)
[21:20:55] gwicke: only if both are viable ways to identify a single client. which they are not really.
[21:21:06] fwiw, I do see the problem that gwicke is trying to solve.
[21:21:07] And I agree that concurrency limits make more sense than rate limits, for the reason just laid out.
[21:21:33] Well, lemme rephrase. At the application level, rate limits make sense. That's the best way to control behavior on an expensive endpoint
[21:21:40] (See pingLimiter() in MW core, for example)
[21:21:45] I agree as well, but you need to define the granularity of concurrency
[21:21:52] But, concurrency limiting could make sense @ the varnish level
[21:21:58] Instead of rate-limiting
[21:22:32] maybe we can try to focus on some kind of baseline consensus.
[21:22:39] let me try and phrase some questions.
[21:22:44] bd808: could you elaborate on granularity?
[21:22:51] assuming that we can effectively identify individual clients:
[21:22:59] 1) would concurrency limiting be useful?
[21:23:13] 2) can concurrency limiting replace rate limiting?
[21:23:18] I also think that if we're talking varnish-level stuff, it would benefit to have someone from ops (Brandon preferably) involved :)
[21:23:37] gwicke: the width of the bucket that is used to count concurrent requests. Is it an IPv4 address? A CIDR mask? A unique token that is obtained by X? ....
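To make the "bucket width" question concrete, here is a hypothetical key-normalization helper showing three of the granularities just mentioned (an exact address, a CIDR aggregate, an API token). None of these choices are settled in the RFC; the function name and formats are invented.

```python
# Hypothetical illustration of different "bucket widths" for counting
# concurrent requests; not part of the RFC.
import ipaddress
from typing import Optional

def bucket_key(ip: str, token: Optional[str] = None,
               cidr_bits: Optional[int] = None) -> str:
    if token is not None:
        return f"token:{token}"   # unique per registered client
    if cidr_bits is not None:
        net = ipaddress.ip_network(f"{ip}/{cidr_bits}", strict=False)
        return f"net:{net}"       # aggregate bucket, e.g. a /24
    return f"ip:{ip}"             # one bucket per address

# Examples:
# bucket_key("192.0.2.15")                 -> "ip:192.0.2.15"
# bucket_key("192.0.2.15", cidr_bits=24)   -> "net:192.0.2.0/24"
# bucket_key("192.0.2.15", token="abc123") -> "token:abc123"
```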
[21:23:47] RainbowSprinkles: this came out of a discussion with him
[21:23:56] Fair 'nuff
[21:23:57] :)
[21:24:19] bd808: okay, so back to the question of the key
[21:24:33] I grok that these are implementation details but they are important ones
[21:24:51] my point is that any kind of limiting will be on some key
[21:25:02] so we can abstract that question out, and discuss it separately
[21:25:34] gwicke: only if we can assume that all keys are created equal
[21:25:35] assuming we have a key, what should we do to limit requests for that key
[21:26:16] so are you looking for "resolved: concurrency limits are good" or "resolved: rate limiting should be replaced with concurrency limiting in all tiers"?
[21:26:27] follow-up discussion: which keys should we use
[21:27:02] the first is easy
[21:27:03] baby steps, essentially ;)
[21:27:58] I think "resolved: rate limiting should be replaced with concurrency limiting in all tiers" overstates it a bit, but I was thinking about replacing rate limits in the REST API with concurrency limits
[21:28:06] bd808: maybe we can soften the second point to "concurrency limits can replace rate limits at least in cases where it's hard to define good rate limits"
[21:28:32] gwicke: under what circumstances do you think rate limits should *not* be replaced?
[21:28:43] the emphasis is more about getting good baseline protections in place
[21:29:09] DanielK_WMDE_: rate limits could still make sense where requests are extremely cheap, but we want to limit them anyhow
[21:29:16] like a client hitting Varnish over & over
[21:29:24] from a sub-millisecond connection
[21:29:35] #info DanielK_WMDE_: rate limits could still make sense where requests are extremely cheap, but we want to limit them anyhow
[21:29:54] perhaps also specific functionality that has high per-request costs that is not captured in the response time
[21:30:28] gwicke: like editing pages?
[21:30:49] yeah, that's a good example
[21:30:55] as much of the cost in that is async
[21:31:14] #info perhaps also specific functionality that has high per-request costs that is not captured in the response time [like editing pages]
[21:31:25] I think most write actions make sense to be rate-limited.
[21:31:27] so I think rate limits still have a place, but it is not a good baseline policy
[21:31:46] #info I think most write actions make sense to be rate-limited.
[21:32:13] gwicke: so perhaps the rfc should clarify for what service and/or what use case concurrency limits should be implemented.
[21:32:34] right now, it says "API", which could mean anything
[21:32:52] if we can find a way to implement concurrency limits globally, then I think that would make the most sense
[21:33:05] they can be used to soften different types of abuse certainly. concurrency is better for thundering herd and maybe is needed at multiple levels of granularity for a good SLA
[21:34:11] specific end points (like those that write, as mentioned above) can then add rate limiting per entrypoint / action
[21:34:16] say for instance that we only allow 10K requests in flight at all and then further only 0.1% can be attributed to a given client
[21:34:32] gwicke: globally, as in "one client can only make 1 request at a time to the wikimedia cluster"? No matter what endpoint they are hitting?
[21:34:58] I think you'll very quickly need to figure out your token, and it clearly can't be on IP. I wonder if analytics' work on "unique devices" would be of any use here
[21:35:02] DanielK_WMDE_: yes, although 1 is a bit too low
[21:35:19] yea, for some value of 1.
[21:35:20] also, keep in mind that we are talking about sliding averages, with the ability to burst
[21:36:00] that's what we do for rate limits as well
[21:36:02] #info we are talking about sliding averages, with the ability to burst
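A rough sketch of what "sliding averages, with the ability to burst" could look like for a concurrency limit, assuming an exponentially decaying average of the in-flight count; the soft limit, hard cap, and half-life are invented numbers, not values from the RFC.

```python
# Purely illustrative sketch of a concurrency limit enforced on a sliding
# average with room for short bursts; all thresholds are made up.
import time

SOFT_LIMIT = 4      # sustained in-flight requests allowed (assumed)
HARD_LIMIT = 20     # short bursts may go this high (assumed)
HALF_LIFE = 10.0    # seconds over which the average decays (assumed)

class SlidingConcurrency:
    def __init__(self):
        self.in_flight = 0
        self.avg = 0.0
        self.last = time.monotonic()

    def _decay(self):
        # Pull the running average toward the current in-flight count,
        # weighted by how much time has passed.
        now = time.monotonic()
        alpha = 0.5 ** ((now - self.last) / HALF_LIFE)
        self.avg = alpha * self.avg + (1 - alpha) * self.in_flight
        self.last = now

    def try_acquire(self) -> bool:
        self._decay()
        if self.in_flight >= HARD_LIMIT or self.avg >= SOFT_LIMIT:
            return False  # caller would answer 429 here
        self.in_flight += 1
        return True

    def release(self):
        self._decay()
        self.in_flight -= 1
```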
[21:36:13] RainbowSprinkles: the unique devices count is predicated on client cookies seen on request. it's not really stable for bucketing
[21:36:23] * RainbowSprinkles nods
[21:36:32] I didn't know, was just thinking aloud :)
[21:36:50] yeah, it's totally a fair question to ask :)
[21:36:55] #info concurrency [limiting] maybe is needed at multiple levels of granularity for a good SLA. for instance that we only allow 10K requests in flight at all and then further only 0.1% can be attributed to a given client
[21:37:41] #info I think you'll very quickly need to figure out your token, and it clearly can't be on IP
[21:37:49] SLA == "service level agreement" there for those that don't think in English tech acronyms
[21:38:12] might be a generational thing at this point
[21:38:42] they seemed to be all over tech a couple years ago, less so now
[21:39:09] Not cool anymore. We're in a post-SLA industry now
[21:39:14] I'm thinking that whatever key we use by default, we should offer the option to provide a token explicitly, to overcome any troubles with shared IPs or such.
[21:39:16] SLAs will be responsive to service levels
[21:39:36] RainbowSprinkles: lol
[21:40:29] DanielK_WMDE_: that would allow us to push anonymous limits further down
[21:40:36] DanielK_WMDE_: I like that. If it defaulted to IP and the 429 response pointed to a way to break out of the jail that would be nicer than just hitting the wall
[21:40:48] I think bblack's plans are very much along those lines
[21:40:58] Do we have any idea on how clients respond to 429 generally?
[21:41:16] (I imagine most will either handle it gracefully or just barf)
[21:41:19] #info I'm thinking that whatever key we use by default, we should offer the option to provide a token explicitly, to overcome any troubles with shared IPs or such.
[21:41:22] probably like any other 4xx
[21:41:34] unless they are built to know what to do
[21:41:38] RainbowSprinkles: ideally, they back off; in practice, I think many just retry more or less immediately
[21:42:08] note that PoolCounter's policy is to try waiting for a slot, and to only deliver an error message if the concurrency exceeds a second higher limit, or if a timeout is exceeded
[21:42:09] Let me try to summarize what we have so far:
[21:42:16] a 429 I think is supposed to suggest a delay
[21:42:28] bd808: In the response?
[21:42:30] retry-after, yeah
[21:42:34] Ah
[21:42:35] the idea being to protect users from seeing errors as far as possible
[21:42:41] 1) concurrency limits would be good to have, but we have to figure out good keys/buckets.
[21:42:41] but, as far as I know, we don't set that
[21:42:52] 2) rate limits still have their place, especially for write operations
[21:43:45] is there general consensus that concurrency limits make the most sense at a high (global ideally) level?
[21:44:20] while rate limits would be specific to certain end points
[21:44:43] TimStarling: Soft enforcement, by making the client wait? That should work nicely especially for trivial cases. Which is why we already do it, I guess.
[21:45:41] gwicke: having high level concurrency limits certainly seems good. having more restrictive limits (rate or concurrency) for specific endpoints would also be good, though.
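On the question of how clients respond to 429, the well-behaved case described above might look roughly like this sketch: honour Retry-After when the server sets it (the log notes it currently isn't set) and otherwise back off exponentially instead of retrying immediately. The URL, timings, and function name are placeholders.

```python
# Hypothetical client-side handling of 429 responses: honour Retry-After
# if present, otherwise back off exponentially rather than retrying at once.
import time
import requests

def get_with_backoff(url, max_attempts=5):
    delay = 1.0  # seconds; starting value is an assumption
    for attempt in range(max_attempts):
        resp = requests.get(url)
        if resp.status_code != 429:
            return resp
        retry_after = resp.headers.get("Retry-After")
        # Assumes a numeric Retry-After; it may also be an HTTP date.
        wait = float(retry_after) if retry_after else delay
        time.sleep(wait)
        delay *= 2  # exponential back-off when no Retry-After is given
    raise RuntimeError("giving up after repeated 429 responses")

# e.g. get_with_backoff("https://en.wikipedia.org/api/rest_v1/page/title/Foo")
```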
[21:46:12] TimStarling: I like the idea of using delays to pace clients as well
[21:47:06] #info TimStarling suggests making the client wait for a response, instead of sending a 429 response right away, to enforce a back-off without any cooperation from the client.
[21:48:14] gwicke: to me, global limits seem a good option, if they are easy enough to implement.
[21:48:27] okay, it sounds like the next step would be to investigate ways to actually implement global concurrency limits
[21:48:34] they are actually easier to implement than per-endpoint solutions, afaict
[21:49:05] as long as we can afford it (and connections are cheaper in varnish than at any later time) then I think waiting is a good idea
[21:49:05] gwicke: yes, it seems the discussion could benefit from a more concrete proposal
[21:50:05] or even at the load balancer...
[21:50:17] *** ten minute warning ***
[21:50:25] our LBs are behind varnish
[21:50:28] as long as we hash connections to nginx and varnish instances by our $key (ip currently), we might be able to implement limiting with an in-memory counter
[21:51:07] nginx already has a module for that, varnish does not yet
[21:52:41] hashing connections to varnish based on $key seems bad for cache efficiency... shouldn't we hash (shard) by requested resource?
[21:52:52] supporting application-level keys will exceed what LVS can help us with right now
[21:53:28] DanielK_WMDE_: we do, but that's between the first & second level of varnishes
[21:53:44] the hashing I'm talking about is internet -> nginx, and nginx -> first-level varnish
[21:54:09] you mean LVS -> nginx
[21:54:16] ah, I thought this was between nginx and varnish. ok then
[21:54:48] the internet will presumably not hash source IP addresses for you, but LVS can do that
[21:54:54] TimStarling: different way to write it
[21:54:55] in principle
[21:55:02] in my case, the arrow is LVS
[21:55:28] in any case, LVS can't hash on API tokens
[21:55:40] but nginx could, potentially
[21:55:58] true
[21:56:10] #info LVS can't hash on API tokens. but nginx could, potentially
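A toy illustration, not from the RFC, of why hashing connections by $key helps: if all traffic for a given key reaches the same front-end instance, that instance can enforce the per-key limit with a purely local in-memory counter and no cross-node coordination. The instance names and the limit are invented.

```python
# Toy illustration: route each client key to a fixed front-end instance so a
# purely local in-memory counter is enough to enforce a per-key limit.
import hashlib

FRONTENDS = ["fe1", "fe2", "fe3", "fe4"]  # invented instance names
MAX_IN_FLIGHT = 5                          # invented limit

def pick_frontend(key: str) -> str:
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return FRONTENDS[int.from_bytes(digest[:4], "big") % len(FRONTENDS)]

# On each front-end, a plain dict of counters suffices, because all traffic
# for a given key is guaranteed to arrive at this one instance.
local_in_flight = {}

def admit(key: str) -> bool:
    count = local_in_flight.get(key, 0)
    if count >= MAX_IN_FLIGHT:
        return False  # would translate to a 429 upstream
    local_in_flight[key] = count + 1
    return True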
[21:56:57] ok, let's wrap up.
[21:57:00] #info next steps: investigate options for implementing global concurrency limits, considering future support for application-level keys like API tokens
[21:57:39] a singular could have done there as well
[21:57:40] I hear general support for the idea, with a lot of need for discussion about the details, especially the key to limit by.
[21:58:02] (Thanks, All!)
[21:58:22] DanielK_WMDE_: yeah, that's always been a popular topic ;)
[21:58:34] what, "details"?
[21:58:42] no, the $key
[21:58:43] that's where the devil is, right?
[21:59:04] the $key question, so to speak
[21:59:13] right :)
[21:59:34] aaaanywayy.. thanks for the input, everyone!
[21:59:42] * DanielK_WMDE_ is waiting for the ball to drop
[22:00:03] #endmeeting
[22:00:03] Meeting ended Wed Jul 5 22:00:03 2017 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[22:00:03] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-07-05-21.03.html
[22:00:03] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-07-05-21.03.txt
[22:00:03] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-07-05-21.03.wiki
[22:00:03] Meeting ended Wed Jul 5 22:00:03 2017 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[22:00:03] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-07-05-21.03.html
[22:00:03] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-07-05-21.03.txt
[22:00:03] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-07-05-21.03.wiki
[22:00:03] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-07-05-21.03.log.html
[22:00:03] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2017/wikimedia-office.2017-07-05-21.03.log.html
[22:00:07] thanks all :)
[22:00:12] :)
[22:02:50] anyone notice the clone meetbot?
[22:04:26] Hi TimStarling, Gabriel (gwicke), Ryan (Kaldari), Rob-Sterbal, and All, I have a question about installing MediaWiki. Is it ok to ask it here?
[22:04:39] Zppix: yeah. I was waiting for the meeting to end before trying to it
[22:05:06] Thanks for this recently in the "[MediaWiki-l] Installing Mediawiki" thread from July 3, Gabriel: "The current default setup runs all these containers on a single machine, using minikube."
[22:06:22] Scott_WUaS: keep in mind though that it is very early days
[22:07:04] Thanks, gwicke.
[22:07:26] that said, we now have a container-based service delivery pipeline as an annual goal, which means that we will be spending some time on polishing the dev environment as well
[22:07:41] I opened Minikube to Darwin - brew-cask - minikube.rb.tmpl since I'm on a Mac
[22:07:52] it's a big project across ops, release engineering & services though, with a lot of moving parts
[22:07:53] I'm a newbie to this.
[22:08:07] Shall I open minikube.rb.tmpl ?
[22:08:30] I'm not a Darwin user myself
[22:08:33] Or what shall I open to begin the installation process, please?
[22:08:46] Pchelolo might be able to help you
[22:09:05] Thnx ...
[22:09:21] * Pchelolo reading the backlog
[22:09:22] Hi :Pchelolo
[22:10:31] Scott_WUaS: do you use homebrew? if yes then just type `brew install minikube`
[22:10:48] I've had some notes on installation somewhere, lemme find them
[22:11:31] Thnx ... will email you as a backup
[22:12:56] Scott_WUaS: https://gist.github.com/Pchelolo/333e482559f7025eb253a2643ddb56d3
[22:14:39] Thanks!
[22:22:36] Downloaded Homebrew successfully
[22:23:16] :Pchelolo Tried your step #2 "brew install minikube docker-machine-driver-xhyve kubernetes-cli" but got an error message
[22:30:09] In Terminal, with Homebrew installed ... Find Cellar works ... Cellar ... but
[22:31:15] but Cellar/wget/1.16.1 ... etc .... doesn't work in https://brew.sh/
[22:33:22] When I write ls -l bin, I get ...
[22:34:10] Macinto8020c8c2:local RedLotus$ ls -l bin total 24 lrwxr-xr-x 1 root wheel 76 May 24 2012 MozyHomeBackup -> /Library/PreferencePanes/MozyHome.prefPane/Contents/Resources/MozyHomeBackup lrwxr-xr-x 1 RedLotus admin 28 Jul 5 15:20 brew -> /usr/local/Homebrew/bin/brew lrwxr-xr-x 1 root wheel 27 Apr 22 2014 gpg -> /usr/local/MacGPG2/bin/gpg2
[22:34:50] Could the gpg2 be blocking this somehow?
[22:42:13] Which minikube do I start please per your "4. minikube start --vm-driver=xhyve --show-libmachine-logs --v=10 --alsologtostderr" :Pchelolo I wonder?
[22:48:57] install minikube gives me info and a new prompt, but minikube start gives me an error message ... and this page seems relevant but not sure how https://github.com/kubernetes/minikube