[04:49:25] FIRING: SystemdUnitFailed: statograph_post.service on alert1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:54:25] RESOLVED: SystemdUnitFailed: statograph_post.service on alert1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:35:48] FIRING: PuppetFailure: Puppet has failed on logging-hd2005:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[05:45:48] RESOLVED: PuppetFailure: Puppet has failed on logging-hd2005:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[14:48:35] a bit time sensitive as I'd like to deploy it before fixing the issue to make sure the alert is working properly, could someone have a quick look at https://gerrit.wikimedia.org/r/c/operations/alerts/+/1297163 ? https://alerts.wikimedia.org/?q=%40state%3Dactive&q=alertname%3DCertAlmostExpired
[14:59:33] XioNoX: I took a brief look, and it seems to be fine. The only thing I noticed, if you're interested in adding it, is that the test case for the critical alert scenario is missing.
[15:01:39] tappof: thanks, I copy-pasted the existing ones
[15:03:22] tappof: happy to add it if you can help with the proper "values" to use :)
[15:08:08] Yeah, sure XioNoX ... but I'm in a meeting right now. Would it be too late if I took a look tomorrow morning?
[15:15:45] tappof: I tried something but it's not working, I think I'm getting confused with the `&cert_expired` no worries, tomorrow is fine, thanks !
[15:16:59] Ack XioNoX, I'll take a look and let you know.
[15:20:08] tappof: PS2 passes CI, but it looks wrong to me :)