[00:48:56] FIRING: SystemdUnitFailed: curator_actions_cluster_wide.service on logstash2026:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:48:56] FIRING: SystemdUnitFailed: curator_actions_cluster_wide.service on logstash2026:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:48:56] FIRING: SystemdUnitFailed: curator_actions_cluster_wide.service on logstash2026:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:24:40] FIRING: LogstashKafkaConsumerLag: Too many messages in logging-eqiad for group logstash7-codfw - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag
[10:29:40] RESOLVED: LogstashKafkaConsumerLag: Too many messages in logging-eqiad for group logstash7-codfw - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag
[12:46:28] I've silenced the curator_actions_cluster_wide.service alert until next week btw
[13:50:43] FIRING: BenthosKafkaConsumerLag: Too many messages in logging-eqiad for group benthos-mw-accesslog-metrics - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops&var-consumer_group=benthos-mw-accesslog-metrics - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
[13:55:43] RESOLVED: BenthosKafkaConsumerLag: Too many messages in logging-eqiad for group benthos-mw-accesslog-metrics - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops&var-consumer_group=benthos-mw-accesslog-metrics - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
[13:56:18] thx go.dog
[13:56:41] hi folks, I'm trying to create an annotation query based on haproxy_process_build_info that includes the running haproxy version as the content of the label "version"
[13:57:19] label_values(haproxy_process_build_info, version) doesn't do the trick, what am I missing? probably because the value is always the same (aka 1)?
[14:02:59] sigh I see.. can't use label_values directly
[14:09:12] and last_over_time(haproxy_process_build_info[1h]) will flood the dashboard with annotations
[14:36:32] vgutierrez: ah, you're looking for changes in the value from the same instance?
[14:36:45] yes
[14:36:57] I'd like to get automatic annotations when haproxy gets updated
[16:13:14] vgutierrez: interesting, were you able to get something working?
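A minimal sketch of why the first two attempts fall flat, assuming the usual shape of haproxy_process_build_info (a constant-1 gauge whose version is carried in a label; the instance value below is made up): an upgrade only ever changes a label, never the sample value, so the annotation query has to spot new series appearing rather than values changing.

```
# Shape of the metric -- the value is always 1, the version lives in a label:
#   haproxy_process_build_info{instance="cp1001:9100", version="2.6.12"} 1

# Value-based functions therefore never see an upgrade:
changes(haproxy_process_build_info[1d])   # 0 for every series: the value never leaves 1

# And last_over_time simply re-emits a sample at every evaluation step,
# hence the flood of annotations on the dashboard:
last_over_time(haproxy_process_build_info[1h])
```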
[16:15:18] nope
[16:15:32] closest seems to be https://github.com/grafana/grafana/issues/11948#issuecomment-403841249
[16:15:50] `haproxy_process_build_info unless (haproxy_process_build_info offset 1m)`
[16:16:12] but it doesn't seem to work properly if you try to get it per instance/cluster/site
[16:16:40] the approach proposed here https://github.com/grafana/grafana/issues/11948#issuecomment-389743476 makes sense but needs a recording rule
[16:18:15] I see, yeah not ideal but workable at least
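One way the recording-rule route could be wired up, as a sketch only: the rule name, the `max by` label set and the 1m offset below are assumptions, not necessarily what the linked comment proposes. The idea is to record one series per site/cluster/instance/version and then have the annotation query return only series that did not exist a step earlier.

```yaml
# Hypothetical Prometheus recording rule (names are made up):
groups:
  - name: haproxy_version_annotations
    rules:
      # One constant-1 series per (site, cluster, instance, version);
      # an upgrade shows up as a brand-new series for that instance.
      - record: haproxy:build_info:by_version
        expr: max by (site, cluster, instance, version) (haproxy_process_build_info)
```

The Grafana annotation query would then be `haproxy:build_info:by_version unless (haproxy:build_info:by_version offset 1m)`, which only returns series that were absent a minute earlier; a longer offset may be needed to ride out scrape gaps.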