[11:12:18] bblack: re: exp policy size-based cutoff, couldn't we move the whole exp logic after cluster_fe_backend_response, thus avoiding splitting the code between wm_common_backend_response and vcl_backend_response (fe)?
[11:18:32] after calling cluster_fe_backend_response, that is
[11:22:05] uhm, but cluster_fe_backend_response returns
[11:35:19] yeah I'm confused as to where exactly to put the size-based hfp cutoff
[11:36:05] not wm_common_backend_response as we want it on frontends only
[11:36:42] probably not in cluster_fe_backend_response as we don't want to copy-paste it in multiple VCL files
[11:42:05] ema: instead of retracting service IPs from bgp announcements, we should probably raise MED instead eh
[13:21:35] mark: oh, so play with MED values to promote/demote IPs, so to speak
[13:24:59] yes
[13:25:05] instead of retracting the ip entirely
[13:25:10] making the host not eligible to be used at all
[13:25:13] just make it less attractive
[13:25:21] i guess there could be situations where we want to retract the ip entirely
[13:25:27] but that should probably wait for the FSM stuff
[13:25:37] now I was just thinking of raising the MED while the depool threshold is in effect
[13:25:48] and maybe until init completes, dunno
[13:32:21] mark: what happens if the host with higher MED is unreachable? Do packets in that case get routed to the other hosts?
[13:32:25] I imagine so
[13:32:51] so the router simply picks the router with the lowest MED (all other things being equal) at all times
[13:33:02] so the higher MED one isn't even used
[13:33:15] and it will soon drop its bgp due to timeout or whatever
[13:33:23] oh right, lowest
[13:33:24] even better would be BFD, i might add that some day
[13:33:32] yeah see it as 'distance'
[13:33:53] brb
[13:34:36] HTTP Immutable Responses - https://tools.ietf.org/html/rfc8246
[13:47:43] sorry, diaper duties...
[13:47:51] ema: and I was thinking
[13:48:04] after this works we should also add prometheus metrics for 'active med' per service
[13:48:26] and besides the obvious benefits of that, it would also allow e.g. grafana to work out at all times what the active master is
[13:48:37] (the lvs instance with the lowest med per service ip)
[13:48:48] nice, yes
[13:48:55] and then you could have dashboards which only show the metrics of the active lvs without the clutter of the backups
[13:49:05] which are irrelevant for most purposes/people
[13:50:09] and possibly even a scriptable way to figure out who the master is!
[13:50:19] yes
[13:50:24] "master"
[13:50:31] the active one anyway
[13:50:34] right
[13:51:28] but, multiple services share the same ip
[13:51:38] so one service with a higher MED would probably affect all the others
[13:52:01] (e.g. port 80 vs 443)
[14:33:53] mark: one idea that we had floated before was to set the MED as the sum of the weights of the pooled realservers
[14:34:14] mm
[14:34:19] so that the LVS with connectivity to the most realservers wins
[14:34:26] but in case of complex outages, this could get a bit messy
[14:34:30] flapping traffic etc.
[14:34:33] yes
[14:34:55] especially with ips shared by multiple services
[14:38:36] it could also flap when depooling a realserver, based on which pybal updates the config from etcd first, right?
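
A minimal sketch of the idea discussed above (keep announcing the service IP but raise the MED while the depool threshold is in effect, so the routers prefer another LVS host); this is not PyBal code, and the function name, constants, and threshold semantics are illustrative assumptions:

```python
# Hypothetical sketch, not actual PyBal code: choose the MED to attach to
# the BGP announcement for a service IP, raising it (making the host less
# attractive) instead of withdrawing the route when too few realservers
# are pooled. BASE_MED, RAISED_MED and the threshold are made-up values.

BASE_MED = 0
RAISED_MED = 100  # higher MED = less preferred, but route stays announced


def advertised_med(pooled_weight_sum, total_weight_sum, depool_threshold=0.5):
    """Return the MED to advertise for a service IP.

    pooled_weight_sum:  sum of weights of realservers currently pooled
    total_weight_sum:   sum of weights of all configured realservers
    depool_threshold:   fraction below which the depool threshold kicks in
    """
    if total_weight_sum == 0:
        return RAISED_MED
    pooled_fraction = pooled_weight_sum / total_weight_sum
    # While the depool threshold is in effect, demote this LVS host by
    # raising the MED; routers pick the peer announcing the same prefix
    # with the lowest MED, all other attributes being equal.
    if pooled_fraction < depool_threshold:
        return RAISED_MED
    return BASE_MED


if __name__ == "__main__":
    print(advertised_med(pooled_weight_sum=3, total_weight_sum=10))  # -> 100
    print(advertised_med(pooled_weight_sum=8, total_weight_sum=10))  # -> 0
```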
[14:44:09] volans: i can't parse that
[14:46:09] mark: when doing a normal depool of a realserver, different LVSes pick up that depool from etcd at slightly different times, so the change in MED could make it flap each time we pool/depool a realserver IIUIC
[14:46:26] yes
[14:46:37] could mitigate that I guess with a delay
[14:48:36] or, instead of the sum of weights, using a % of the reachable realservers
[14:50:26] and some delta threshold (switch only if > 15% diff)
[14:50:46] to account for the change of percentages when adding/removing servers
[15:22:43] i'm inclined to just go with a static raised value
[15:22:48] only on depool threshold
[16:05:19] bblack: I was looking into 200 responses with CL:0 to try to get rid of the vcl workaround for T144257
[16:05:19] T144257: Certain images failing to load in ulsfo - https://phabricator.wikimedia.org/T144257
[16:06:34] bblack: and noticed that we do generate 200s with CL:0 for healthchecks, it might be nicer to include some context info? https://gerrit.wikimedia.org/r/#/c/393251/
[16:08:03] as an alternative we could return 204s I guess, the response body being empty, but then we'd have to update the pybal checks ugh
[16:11:14] not that the healthcheck responses have anything to do with T144257 or its workaround, it just came to mind while poking around :)
[16:11:15] T144257: Certain images failing to load in ulsfo - https://phabricator.wikimedia.org/T144257
[16:16:42] there are a few such responses on upload btw http://bit.ly/2iMvIRw
[21:41:00] 10Traffic, 10Operations, 10Performance-Team: load.php requests taking multiple minutes - https://phabricator.wikimedia.org/T181315#3786217 (10Tgr)
[21:57:29] 10Traffic, 10Operations, 10Performance-Team: load.php requests taking multiple minutes - https://phabricator.wikimedia.org/T181315#3786261 (10Tgr)
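
A small sketch of the anti-flapping variant suggested above at 14:48-14:50 (derive the MED from the percentage of reachable realservers, but only switch when the percentage moves by more than a 15% delta); the class, mapping from percentage to MED, and numbers are illustrative assumptions, not PyBal's implementation:

```python
# Hypothetical sketch of the delta-threshold idea: each LVS recomputes its
# MED from the % of reachable realservers, but ignores small changes so a
# single pool/depool picked up at slightly different times by each LVS
# doesn't flip which host has the lowest MED. All names are illustrative.

DELTA_THRESHOLD = 15.0  # percentage points


class MedTracker:
    def __init__(self):
        self.last_pct = None  # last percentage that actually changed the MED
        self.current_med = 0

    def update(self, reachable, configured):
        """Recompute the advertised MED from the reachable-realserver %."""
        pct = 100.0 * reachable / configured if configured else 0.0
        if self.last_pct is None or abs(pct - self.last_pct) > DELTA_THRESHOLD:
            # Fewer reachable realservers -> higher MED (less attractive).
            self.current_med = int(100 - pct)
            self.last_pct = pct
        return self.current_med


if __name__ == "__main__":
    tracker = MedTracker()
    print(tracker.update(10, 10))  # 100% reachable -> MED 0
    print(tracker.update(9, 10))   # only a 10-point change: MED stays 0
    print(tracker.update(6, 10))   # 40-point drop since last switch -> MED 40
```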