[10:03:31] follow up from yesterday - TIL that a k8s controller can also run outside the cluster. In the AMD GPU node labeller case, it is sufficient to give it a valid kubeconfig to point it at the cluster
[10:03:57] and of course the right perms in deployment charts to allow it to modify "nodes" etc.
[10:04:38] since the other gpu plugin is running outside k8s (on the bare metal OS) and the labeller needs to read /dev/ etc., I'd be inclined to run it on bare metal as well
[10:05:15] it seems counterintuitive to wrap it in a container and then let it escape the container to read from the underlying OS
[10:08:24] I'll experiment with it and report back :)
[13:15:34] The new opensearch on k8s services will be behind basic auth. Is anyone scraping password-protected endpoints in k8s with prometheus? It's not a hard requirement but just curious if we had something in place for that
[14:07:37] inflatador: Are the prometheus metrics behind authentication, too? That seems odd. Normally, they are on a separate port.
[14:35:40] btullis yes, the exporter is integrated with opensearch via a plugin (ref https://github.com/opensearch-project/opensearch-prometheus-exporter?tab=readme-ov-file#install-or-remove-plugin ) so it's subject to the same rules. We can enable "anonymous auth" ( https://docs.opensearch.org/2.19/security/access-control/anonymous-authentication/ ) so it's not a blocker, just curious if there was precedent
[14:45:08] inflatador: what is providing the basic auth, and could it accept mtls as well?
[14:46:26] cdanis it's enforced by OpenSearch itself. It could accept mTLS as well
[14:46:53] out of curiosity what does opensearch use as an http server?
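For reference, a minimal sketch of what "scraping a basic-auth protected endpoint" amounts to on the wire: the scraper sends an `Authorization: Basic ...` header with each request. The hostname, credentials, and the `/_prometheus/metrics` path below are assumptions for illustration (the path is what the exporter plugin is commonly documented to serve), and the helper names are hypothetical. This is not how Prometheus itself is configured, only an equivalent request built with the Python standard library.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_scrape_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying Basic auth credentials."""
    req = urllib.request.Request(url)
    req.add_header("Authorization", basic_auth_header(username, password))
    return req


# Hypothetical target and credentials; nothing is sent over the network here.
example_request = build_scrape_request(
    "https://opensearch.example.svc:9200/_prometheus/metrics",
    "metrics-reader",
    "not-a-real-password",
)
```

Passing `example_request` to `urllib.request.urlopen` would perform the actual scrape. In a real setup the password would come from a secret rather than a literal.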
[14:47:50] making prometheus send the basic auth header is relatively straightforward iirc, just a config option in the target config
[14:48:26] OpenSearch doesn't have a separate HTTP server, the REST API is part of it
[14:48:53] taavi: depends -- you can't reference a k8s secret from the prom labels, I think, only from the prom operator CRDs
[14:48:57] you can do RBAC stuff via config files or the rest API
[14:49:35] oh right, missed this is k8s - sorry
[14:49:37] inflatador: https://gerrit.wikimedia.org/g/operations/puppet/+/ef0edcb8946f27910bbede9faefb988d9d472a82/modules/profile/manifests/prometheus/k8s.pp#265 k8s prom is already scraping the pods while presenting its (puppet-CA-backed) TLS cert
[14:50:58] so I think, make OpenSearch accept that cert -- `'names' => [{ 'organisation' => 'system:monitoring' }],` from the definition of $client_cert looks promising
[14:51:04] cdanis nice! Yeah, I agree
[14:51:06] and it should just work
[14:52:20] * inflatador starts reading https://docs.opensearch.org/2.19/security/authentication-backends/client-auth/
[14:52:46] ah
[14:53:14] inflatador: we create an intermediate CA specifically for prometheus
[14:54:03] is that installed automatically as part of `wmf-certificates`?
[14:56:33] that just has the root CA, which if you want basic auth for any/all traffic, is probably too broad to trust
[14:57:39] I don't really have a choice re: basic auth, that comes with the operator ;) . I'm fine w/read-only access from anywhere within WMF
[16:45:37] Should pods in the same namespace be able to reach one another by default? I'm getting some connection failures and just wanted to rule out FW issues
[17:51:12] inflatador: As per this: https://github.com/wikimedia/operations-deployment-charts/blob/master/helmfile.d/admin_ng/values/dse-k8s.yaml#L36-L51 - we allow egress from all pods in a namespace to all other pods in the same namespace. But we still need to configure ingress by the destination pods.
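To illustrate the last point, here is a rough sketch of an ingress rule on the destination pods that admits traffic from any pod in the same namespace. It uses the plain Kubernetes `networking.k8s.io/v1` NetworkPolicy schema, which may differ from how the deployment charts actually express ingress rules. The policy name, namespace, labels, port, and the helper function are all hypothetical. The manifest is built as a Python dict and printed as JSON, which `kubectl apply -f -` accepts.

```python
import json


def same_namespace_ingress_policy(name: str, namespace: str,
                                  pod_labels: dict, port: int) -> dict:
    """Build a NetworkPolicy allowing ingress to the selected pods
    from every pod in the same namespace on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Destination pods that this policy applies to.
            "podSelector": {"matchLabels": pod_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # An empty podSelector with no namespaceSelector matches
                    # all pods in the policy's own namespace.
                    "from": [{"podSelector": {}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical values for illustration only.
policy = same_namespace_ingress_policy(
    name="allow-same-namespace",
    namespace="opensearch-test",
    pod_labels={"app": "opensearch"},
    port=9200,
)
print(json.dumps(policy, indent=2))
```

With egress already allowed namespace-wide, a rule of this shape on the receiving side is the missing half when pod-to-pod connections inside one namespace are refused.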