[04:23:44] <_joe_> elukey: uh since when?
[04:23:54] <_joe_> yesterday morning it was still in the same condition
[04:25:40] <_joe_> oh that patch, I did misread yesterday morning and I was convinced it was already merged
[06:21:34] _joe_ morning! The remaining step for that imo is to create bandwidth alerts for the mc hosts
[06:23:16] <_joe_> having fun in mallorca?
[06:24:30] elukey: I will create a task if you go away :)
[06:32:19] _joe_: weather is alternating between clouds/rain and nice sun, of course it rained like crazy when I was on vacation last week :D
[06:32:35] <_joe_> at least that.
[06:32:38] we are in a little old town in the north of mallorca
[06:32:39] ahhahah
[06:33:21] jijiki: :)
[11:11:45] I see a couple of snapshots failed, will check them later, not time sensitive
[14:37:20] from 13:52 to 14:25 there was a large amount of s3 writes
[18:45:41] we have 20 unhandled CRITs again. maybe some can be ACKed / downtimed / linked to tickets / something in SAL please
[18:47:00] disabling notifications without further action does not make them handled and leaves it ambiguous whether it was forgotten from last time (common) or is actually known
[18:48:04] well the ms-be2043 could be related to https://phabricator.wikimedia.org/T222654
[18:48:54] restbase-dev1006 is known, they are working on it
[18:49:00] 3 are wmcs
[18:49:10] cr1-codfw is being handled
[18:49:41] and I don't remember about cr2-*
[18:49:52] the rest I don't know
[19:05:33] on netmon1002 there are 3 failed systemd units all related to netbox reports
[19:07:37] jijiki: thanks! i can't help but wonder though if we can actually remove these things. if something like "is wmcs" truly means "not actionable" for example
[19:08:14] I don't know
[19:09:01] it seems the only way you can know these things are handled and worked on is through IRC context?
[19:09:29] that would be the part i'm trying to fix
[19:24:00] hehe
[19:31:59] so the netbox stuff looks like a permission issue
[19:32:01] [ERROR] Unexpected exception occurred during check: [Errno 13] Permission denied: '/etc/netbox/repor
[19:32:40] i'll just make a ticket for that
[21:38:35] is there anyone still awake who could have a look at the 500 rate on cache upload?
[21:39:34] Might be thumbor related (since it is cache upload)
[21:39:44] cc: godog, cdanis ^^
[22:28:12] we restarted 3 varnish backends and things recovered now
[22:28:32] it was mailbox lag stuff and SAL showed traffic people doing the same recently