[08:08:41] FIRING: [228x] PrometheusRuleEvaluationFailures: Prometheus rule evaluation failures (instance titan1001:17902) - https://wikitech.wikimedia.org/wiki/Prometheus - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRuleEvaluationFailures
[08:08:52] FIRING: [2x] ThanosRuleHighRuleEvaluationFailures: Thanos Rule is failing to evaluate rules. - https://wikitech.wikimedia.org/wiki/Thanos#Alerts - https://grafana.wikimedia.org/d/35da848f5f92b2dc612e0c3a0577b8a1/thanos-rule - https://alerts.wikimedia.org/?q=alertname%3DThanosRuleHighRuleEvaluationFailures
[08:13:41] RESOLVED: [248x] PrometheusRuleEvaluationFailures: Prometheus rule evaluation failures (instance titan1001:17902) - https://wikitech.wikimedia.org/wiki/Prometheus - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRuleEvaluationFailures
[08:13:52] RESOLVED: [2x] ThanosRuleHighRuleEvaluationFailures: Thanos Rule is failing to evaluate rules. - https://wikitech.wikimedia.org/wiki/Thanos#Alerts - https://grafana.wikimedia.org/d/35da848f5f92b2dc612e0c3a0577b8a1/thanos-rule - https://alerts.wikimedia.org/?q=alertname%3DThanosRuleHighRuleEvaluationFailures
[08:17:49] that was me ^
[18:22:23] denisse: I don't mean for you to drop everything and fix my problem, this is kind of a nice-to-have for debugging a thing
[18:23:02] andrewbogott: No worries, I'm happy to help.
[18:23:07] Let's see if we can figure it out. :)
[18:23:22] thank you!
[18:24:20] * andrewbogott expecting something like "did you restart X to get the new config?" but in theory puppet does that
[18:25:15] andrewbogott: It may be that Puppet hasn't run on those hosts yet, but looking at the time you merged that patch I think it's unlikely.
[18:25:23] yep, it's been > an hour
[18:31:31] I've modified the services setting but I can't see any logs for magnum nor heat.
[18:31:48] Dummy question, but, are they currently generating logs?
[18:32:08] I'm asking this because they may not be generating anything hence nothing is sent to the index.
[18:35:19] they are definitely logging to their local log files /var/log/magnum/*
[18:35:46] they ought to be logging to LOG_LOCAL0 as well since I told them to...
[18:36:08] Yes, I took a look at the keystone config and it logs to LOG_LOCAL0 successfully.
[18:37:56] I'm wondering if it could be that JSON is not being parsed correctly.
[18:38:15] could be! although I believe one of the services is exporting json and the other is not
[18:38:18] Looking at the cinder config, it has `use_json = false`.
[18:38:27] so I expected to see one or the other at least!
[18:38:40] denisse: now that I've interrupted you I realize I need to step away for 20 minutes :/ please continue poking or stop, as you prefer!
[18:38:55] andrewbogott: Sure thing.
[18:40:43] Continuing with logging as JSON, trove logs as JSON and I can see trove logs. I'm still wondering where the culprit may be.
[18:49:00] I can see some logs with message "Error parsing json" that correlate to the time the patch was merged.
[19:21:40] denisse: want me to switch off the json flag and see what happens?
[19:24:19] (done, for magnum)
[19:24:41] Nice, let's wait for a puppet run and see what happens.
[19:25:05] I restarted the services, I don't think puppet is involved now
[19:25:14] (and turned off on the hosts emitting the logs because I hacked a config change)
[20:04:22] i'm not a great deb packager, and i was wondering is there a good place to see the build steps for the opensearch deb maintained by observability? inflatador and i were chatting and the thought is for mutualized opensearch (basically opensearch-as-a-service) and for the opensearch backend for the content sites those will be using the Observability-maintained opensearch. i understand we're building form source
[20:06:06] s/form/from/ (i looked a bit in codesearch, the opensearch article on wikitech, and revisited the deb pages on wikitech, but was thinking maybe someone has a pointer)
[20:07:53] one other piece here is that most likely we'll be looking to do v1 opensearch, then later v2 opensearch, in order to make the migration from elasticsearch 7.10 to v1 opensearch smoother, and v1-to-v2 smoother. we may need to consider the build approach for v1.
[20:09:43] or maybe we're relying on ubuntu debs or vendor debs? anyway, interested to learn more if anyone knows off the top of their head.
[22:33:02] I think we're using vendor debs but I'm not 100% on that
[22:33:05] dr0ptp4kt: ^
[22:57:03] just installed opensearch2 from our repos on a cloud server and it looks like we are using vendor debs: https://phabricator.wikimedia.org/P67742