[08:40:01] * volans back FYI, reading backlogs and emails
[08:44:08] welcome back volans!
[14:16:52] btw godog -- re https://wikitech.wikimedia.org/wiki/Incident_documentation/20190425-prometheus -- I think I have found some of the queries responsible
[14:18:14] there are some prometheus-side settings to limit max # of samples read for a query (which has a good but not perfect correspondence to RAM consumption)
[14:21:16] cdanis: awesome (re: problematic queries), seems like the default limits are a bit generous/optimistic
[14:21:29] in 2.something they added a limit for the # of samples
[14:21:48] let me find the post
[14:22:10] https://www.robustperception.io/limiting-promql-resource-usage
[14:22:19] --query.max-samples in 2.5.0
[14:22:36] in theory we were under this limit at the defaults!
[14:27:21] sigh, mmh perhaps we could start with a 30/40% reduction in max samples
[14:29:20] also maybe I should try asking in #prometheus on freenode again
[14:33:30] why not, can't hurt! I'm lurking there and folks seem to be generally helpful
[14:51:48] yeah, last time I asked no one seemed to be around/interested; I'll try again
[17:09:19] godog: these swift replicates are taking unreasonably long
[17:09:29] it will be days heh