[00:35:48] FIRING: PuppetFailure: Puppet has failed on logging-hd2005:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[00:45:48] RESOLVED: PuppetFailure: Puppet has failed on logging-hd2005:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[01:32:37] FIRING: OpensearchClusterHealth: Opensearch cluster health reported as red - https://wikitech.wikimedia.org/wiki/Runbook - https://grafana.wikimedia.org/d/e7d7fa18-7bc3-4548-bb07-ef261a9d3b8b/opensearch-cluster-health?var-cluster=production-elk7-codfw - https://alerts.wikimedia.org/?q=alertname%3DOpensearchClusterHealth
[01:37:37] RESOLVED: OpensearchClusterHealth: Opensearch cluster health reported as red - https://wikitech.wikimedia.org/wiki/Runbook - https://grafana.wikimedia.org/d/e7d7fa18-7bc3-4548-bb07-ef261a9d3b8b/opensearch-cluster-health?var-cluster=production-elk7-codfw - https://alerts.wikimedia.org/?q=alertname%3DOpensearchClusterHealth
[01:46:40] FIRING: LogstashKafkaConsumerLag: Too many messages in logging-eqiad for group logstash7-codfw - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag
[01:51:40] RESOLVED: LogstashKafkaConsumerLag: Too many messages in logging-eqiad for group logstash7-codfw - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag
[07:46:25] FIRING: SystemdUnitFailed: thanos-compact.service on titan2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:46:36] ^^ it's me
[07:51:25] RESOLVED: SystemdUnitFailed: thanos-compact.service on titan2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:04:48] FIRING: PuppetFailure: Puppet has failed on logging-hd2005:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[12:44:48] RESOLVED: PuppetFailure: Puppet has failed on logging-hd2005:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[17:13:35] FIRING: DiskSpace: Disk space kafka-logging1002:9100:/srv 3.95% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=kafka-logging1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[17:18:35] RESOLVED: DiskSpace: Disk space kafka-logging1002:9100:/srv 3.901% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=kafka-logging1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[17:28:35] FIRING: DiskSpace: Disk space kafka-logging1002:9100:/srv 3.949% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=kafka-logging1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[20:07:22] added 150G to /srv on kafka-logging1002
[20:08:35] RESOLVED: DiskSpace: Disk space kafka-logging1002:9100:/srv 3.382% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=kafka-logging1002 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace