[14:47:01] Flagging this for y'all. We found out that JVMs running in containers on linux 6.12+ assign the max heap size as 1/4th of the *node max memory* instead of the cgroup max memory. https://phabricator.wikimedia.org/T405361 We're evaluating how to mitigate.
[14:47:19] This is not affecting wikikube in any way, as it is running linux 6.1.0, but we had to install a backported kernel to get a kernel fix for a bug affecting the ceph kernel modules
[14:50:54] (in dse-k8s clusters, I mean)
[15:28:04] <_joe_> brouberol: the "obvious" fix is to pass to the jvm the max heap size explicitly and use the value from your resources definition, if we're talking about a helm chart, that's what I'd do
[15:28:10] <_joe_> but, sigh
[15:28:29] I get the feeling it isn't just one helm chart that's the issue heh
[15:30:00] dumb question, is there an openjdk package in common amongst all the DSE usages?
[15:39:58] <_joe_> cdanis: yeah I was thinking airflow as well :/
[18:00:16] _joe_: we're still undecided between rebuilding a 6.12 kernel with CONFIG_CPUSETS_V1=y or setting Xmx everywhere. The issue is that we have operators in charge of running flink applications, spark applications, airflow jobs running spark applications, etc etc, so that might be a pretty far reaching change, and things might be missed. I'd tend to favour rebuilding a kernel with a custom option until the openjdk fix is backported to older versions and available via apt repos. I just don't know how cumbersome it might prove to be
[18:03:45] brouberol: moritzm would be a good person to weigh in on both sides of that
[18:14:37] Yes, sounds about right!
[19:54:51] > is there an openjdk package in common amongst all the DSE usages?
[19:55:44] One per openjdk version, so not one, but not many. In container land we mostly have jdk8 and 17
[19:56:28] I was going to suggest something silly like dockerfile templates to add a wrapper script that sets -Xmx
[19:59:36] * inflatador wonders if there's a way to make a fake `/proc/cgroups` for the containers to read
[20:04:37] I can imagine a few different ways yeah
[20:04:54] can you implement the translation you need?
[20:14:32] sorry, you lost me on that one. If it helps, the problem is with OpenJDK's detection of cgroups (uses the old /proc/cgroups), not a lack of cgroup features. So the thought is if it sees `cpuset` in the output of `/proc/cgroups` it will happily use cgroups to determine memory settings instead of falling back to checking host memory
[20:16:08] if you can write code that knows how to emit the input the jdk expects, then yeah, i can imagine a few different options for presenting that to pods
[20:16:28] gotta log off for now though, have a good weekend all :)
[20:16:39] ACK, thanks! Have a great weekend
[20:19:02] If it requires writing code, probably beyond my capabilities ;)
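
For reference, a rough sketch of the "wrapper script that sets -Xmx" idea from the log. This is not what was deployed, only an illustration under assumptions: the container has a cgroup v2 hierarchy mounted at `/sys/fs/cgroup` with a `memory.max` file, the heap should be 1/4 of that limit (mirroring the 1/4th default mentioned above), and the names `xmx_flag`, `build_command`, `main` and `CGROUP_V2_LIMIT` are made up for this example.

```python
import os
import sys
from typing import List, Optional

# Assumption: cgroup v2 is mounted here inside the container.
CGROUP_V2_LIMIT = "/sys/fs/cgroup/memory.max"


def xmx_flag(raw_limit: str, fraction: float = 0.25) -> Optional[str]:
    """Turn the content of memory.max into a -Xmx flag, or None if unlimited."""
    value = raw_limit.strip()
    if value == "max":  # cgroup v2 writes "max" when no limit is set
        return None
    limit_bytes = int(value)
    heap_mib = max(1, int(limit_bytes * fraction) // (1024 * 1024))
    return f"-Xmx{heap_mib}m"


def build_command(argv: List[str], limit_path: str = CGROUP_V2_LIMIT) -> List[str]:
    """Prepend a computed -Xmx to the java arguments unless one is already given."""
    if any(arg.startswith("-Xmx") for arg in argv):
        return ["java", *argv]  # respect an explicit setting
    try:
        with open(limit_path) as handle:
            flag = xmx_flag(handle.read())
    except (OSError, ValueError):
        flag = None  # no readable limit: leave the JVM defaults alone
    return ["java", *([flag] if flag else []), *argv]


def main() -> None:
    """Entry point for a wrapper: replace this process with the real java."""
    command = build_command(sys.argv[1:])
    os.execvp(command[0], command)
```

With a 1 GiB cgroup limit, `build_command(["-jar", "app.jar"])` would produce `["java", "-Xmx256m", "-jar", "app.jar"]`. A real wrapper would call `main()` and would also need to decide how to treat options passed via `JAVA_TOOL_OPTIONS`, which this sketch ignores.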