[06:54:19] elukey: thank you! That's an upload node for the record
[06:54:45] it seems that removing max_connections was the right thing to do indeed:
[06:54:48] https://grafana.wikimedia.org/dashboard/db/varnish-failed-fetches?orgId=1&from=1507525839837&to=1507532032552
[06:55:34] there are still a bunch of fetch failures, but nothing compared to the huge spike we had when reaching max_connections (~8K IIRC)
[08:04:56] <_joe_> ema: do you have a standard set of hosts we run pcc on for the caches?
[08:08:50] upload: cp2020.codfw.wmnet cp4026.ulsfo.wmnet cp1073.eqiad.wmnet cp3048.esams.wmnet
[08:08:53] text: cp2013.codfw.wmnet cp4028.ulsfo.wmnet cp1053.eqiad.wmnet cp3030.esams.wmnet
[08:08:56] misc: cp2018.codfw.wmnet cp1051.eqiad.wmnet cp3010.esams.wmnet
[08:08:58] _joe_: ^
[08:09:04] <_joe_> ema: thanks :)
[08:09:52] <_joe_> ema: I have to ask you to start looking at the first few patches before I can proceed, as I reached the length limit for the patchset series in gerrit, I think
[08:09:58] <_joe_> but lemme run pcc on those first
[08:10:04] ok
[08:16:49] <_joe_> https://gerrit.wikimedia.org/r/#/c/382674/ is reviewable. Before you do that though, get some early reward by looking at https://gerrit.wikimedia.org/r/#/c/383078/1 which is a (partial) point of arrival :)
[08:17:30] wow
[08:21:15] _joe_: looks good, ship it!
[08:22:10] I've published PS5 which just fixes two minor things in logging.pp comments
[08:23:00] <_joe_> oh
[08:23:07] <_joe_> ok :)
[08:34:04] bblack, ema: https://phabricator.wikimedia.org/T177742 , let me know if you have any objections/alternative proposals
[08:36:38] ^ also paravoid
[09:53:24] moritzm: interesting!
[09:55:56] ok so I've tested setting bgp-med to 0 on lvs3001 and to 100 on lvs3003, all looks good
[12:54:02] Traffic, netops, Operations, Pybal, Patch-For-Review: Deploy pybal with BGP MED support (for primary/backup) in production - https://phabricator.wikimedia.org/T165584#3270574 (ema) All load balancers are now using BGP MED. Primaries send the MED attribute with a value of 0, backups send 100....
[12:59:31] godog: https://grafana.wikimedia.org/dashboard/db/ipvs-backend-connections updated to use the aggregation rules, much snappier now!
[13:01:09] umh, https://grafana-admin.wikimedia.org/dashboard/db/load-balancers also would need some love in that regard
[13:01:39] actually we did have a couple of rules for it already but I've removed them in bcdade7c8265b763d56d731d2c8b9d41d97dc9e0, doh
[13:05:04] JFTR, godog's out today, it's a local bank holiday in Sevilla
[13:05:21] oh right
[16:29:45] Traffic, Analytics-Kanban, Operations: Invalid "wikimedia" family in unique devices data due to misplaced WMF-Last-Access-Global cookie - https://phabricator.wikimedia.org/T174640#3669886 (Nuria) a:JAllemandou
[18:55:25] Traffic, Analytics-Kanban, Operations, Patch-For-Review: Invalid "wikimedia" family in unique devices data due to misplaced WMF-Last-Access-Global cookie - https://phabricator.wikimedia.org/T174640#3670314 (JAllemandou) The change above doesn't change the behavior of cookies, but at least removes...