[10:27:58] 10Traffic, 10Operations, 10Page-Previews, 10RESTBase, and 2 others: Cached page previews not shown when refreshed - https://phabricator.wikimedia.org/T184534#3957874 (10phuedx) >>! In T184534#3954704, @BBlack wrote: > I think to really comprehend the right fix here, I'd need to rewind a little and figure o...
[12:53:41] 10Traffic, 10Gerrit, 10Operations, 10Phabricator, 10periodic-update: Phabricator and Gerrit: Improve the way that maintenance downtime is communicated to users. - https://phabricator.wikimedia.org/T180655#3958098 (10demon) https://gerrit.googlesource.com/plugins/motd/+/master could be useful on gerrit's...
[14:07:26] ema: I think we've hit 1w as of roughly now on upload-ulsfo?
[14:07:29] still no ramps!
[14:07:39] https://grafana.wikimedia.org/dashboard/db/varnish-mailbox-lag?orgId=1&from=now-7d&to=now&var-datasource=ulsfo%20prometheus%2Fops&var-cache_type=upload&var-server=All
[14:08:09] \o/ for now.
I think that's a pretty solid improvement indicator, although a second week will help solidify that view
[14:09:33] bblack: one week exactly, yes!
[14:42:03] 10netops, 10Operations, 10fundraising-tech-ops: bonded/redundant network connections for fundraising hosts - https://phabricator.wikimedia.org/T171962#3958405 (10Jgreen)
[14:44:06] 10netops, 10Operations, 10fundraising-tech-ops: bonded/redundant network connections for fundraising hosts - https://phabricator.wikimedia.org/T171962#3958412 (10Jgreen) We've done all hosts but civi1001, frdb1001, and frdb1001 which require fundraising downtime.
[14:44:15] 10netops, 10Operations, 10fundraising-tech-ops: bonded/redundant network connections for fundraising hosts - https://phabricator.wikimedia.org/T171962#3958413 (10Jgreen) 05Open>03Resolved
[14:44:18] 10netops, 10Operations, 10ops-eqiad, 10Patch-For-Review: eqiad: rack frack refresh equipment - https://phabricator.wikimedia.org/T169644#3958414 (10Jgreen)
[16:24:48] the 30 days view of that graph is great
[17:07:56] <_joe_> bblack: we just had a big spike of mailbox lag on 4023 btw
[17:08:01] <_joe_> you jinxed it
[17:09:15] <_joe_> it's still a small peak, and seems related to an increased backend activity
[17:20:43] _joe_: related to wmf.20 attempts and such? or natural traffic?
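[Editor's note: for context on the "mailbox lag" metric being watched here, a hedged sketch. Varnish mailbox lag is conventionally the gap between the `MAIN.exp_mailed` and `MAIN.exp_received` counters (objects handed to the expiry thread vs. objects it has processed); these are real Varnish counters, but the live `varnishstat` invocation is only illustrative and assumes a running varnishd.]

```shell
#!/bin/sh
# Sketch: mailbox lag = objects mailed to the expiry thread minus objects
# the expiry thread has received/processed. A steadily growing difference
# is the "ramp" discussed in this log.
mailbox_lag() {
    mailed=$1
    received=$2
    echo $(( mailed - received ))
}

# Live use on a cache host would look roughly like this (requires varnishd):
#   mailed=$(varnishstat -1 -f MAIN.exp_mailed | awk '{print $2}')
#   received=$(varnishstat -1 -f MAIN.exp_received | awk '{print $2}')
#   mailbox_lag "$mailed" "$received"

# Illustrative values only:
mailbox_lag 500000 120000
```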
[17:21:37] <_joe_> no idea, maybe the former
[17:22:16] yeah I see 4023, still spiking up so far
[17:22:30] but the old ramps came up slower, and reached the multi-millions, this is something else
[17:23:56] err no, I was wrong about ramp slopes
[17:24:03] the slop does look very similar to the old ones
[17:24:07] *slope
[17:24:16] getting into the few-hundred-K range in ~15m
[17:24:31] if it's a ramp like the old ones, it will probably keep building for quite a while though and reach millions
[17:26:58] it seems like possibly a scan-attack sort of thing (which often are merely misguided rather than malicious)
[17:27:09] but it's odd it's hitting one server harder than the rest in that case
[17:28:01] maybe huge traffic influx on a particular media file too
[17:44:09] !log cp4023: experimental, "renice -19 39007" (backend cache-timeout aka expiry thread), to see if mbox lag resolves on its own quicker
[17:44:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:03:37] !log cp4023: now seems to be leveling off on lag and decreasing objhdr locks. either expiry thread prio helped (which argues for our prio-related patches) or it was naturally going to end?
[18:03:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:03:52] ema: whenever (monday?), take a peek at cp4023 around the time of these SALs
[18:08:24] !log cp4023: after a brief period of levelling off a bit: sharp, steep recovery of mbox lag ramp back to ~6K. not sure if this is a new floor or will drop further, but seems pretty ok.
[18:08:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
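[Editor's note: the `renice -19 39007` in the 17:44 SAL entry raised the scheduling priority of the expiry thread's task ID (39007 was that day's TID on cp4023) and requires root. A minimal unprivileged sketch of the same mechanics follows; the `sleep` process and the +10 nice value are stand-ins, not anything from the log. On Linux, one would typically find the expiry thread's TID among varnishd's tasks first, e.g. with `ps -L -p <varnishd-pid> -o tid=,comm=`.]

```shell
#!/bin/sh
# Demonstrate renice mechanics on a throwaway process (no root needed,
# since we only *lower* priority; -19 as in the log would need root).
sleep 60 &
pid=$!

# Old-style "renice <prio> <pid>" also works, as in the log's "renice -19 39007";
# the POSIX form is "renice -n <delta> -p <pid>".
renice -n 10 -p "$pid" >/dev/null

# Read back the nice value to confirm it took effect.
nice_val=$(ps -o ni= -p "$pid" | tr -d ' ')
echo "$nice_val"

kill "$pid"
```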