[07:38:08] Hi! Do we have a logstash dashboard somewhere for Kubernetes audit logs? We've seen some resources be deleted without our knowledge in the past couple of days, and we'd like to investigate why. Thanks
[08:32:39] brouberol: do you mean specifically for audit.k8s.io/v1 (see https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/)? We don't have it enabled, AFAIK.
[08:33:02] Yep, that was what I meant
[08:33:29] well, we'll do without
[08:33:30] thanks!
[09:12:52] wait, don't we have audit logs on the k8s masters themselves?
[09:42:07] We do have the audit logs enabled, at least on the dse-k8s cluster.
[09:42:11] https://www.irccloud.com/pastebin/wSD6Yzus/
[09:43:04] Here's a sample entry: https://phabricator.wikimedia.org/P74696
[09:45:05] brouberol:
[09:45:33] alright, now we're talking! Thanks btullis
[09:45:55] I'll go dig into that trove to learn more about the PG cluster deletion
[09:46:21] brouberol: Should we look at whether this would be worth indexing into logstash? Or might it be a) too noisy or b) too security-sensitive?
[09:46:50] past experience tells me that there's a lot of value in having a proper dashboard over these logs
[09:47:13] my experience with logstash is that I stay as far away from it as possible to keep my anger level low
[09:47:15] They will be split across both dse-k8s-ctrl100[1-2] as well, because of the LVS load-balancing.
[09:52:56] a cursory look shows that resource deletions are not logged
[11:32:55] oh, I stand corrected! I indeed see that there are 2 policies under https://github.com/wikimedia/operations-puppet/tree/production/modules/k8s/files. The modify-pods one is enabled by default, the default one just on staging-eqiad apparently?
[11:33:01] that's interesting, I had missed this development
[11:33:25] but yes, delete isn't tracked for the first one.
verbs: ["create", "patch", "update"]
[11:33:34] and just for pods apparently
[11:36:34] +1 on getting them into logstash fwiw
[11:36:56] I see they are like a few tens of megabytes uncompressed, shouldn't be an issue for logstash
[11:37:19] maybe across all clusters like 1 gigabyte per day?
[11:38:12] no more. like 2.5GB?
[11:38:16] but still, meh?
[11:58:42] yep, I agree with the sentiment
[14:43:58] hey folks!
[14:44:06] I filed a change to move Kartotherian to Istio ingress - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1133389
[14:44:19] if anybody could give it a check I'd be grateful
[14:45:40] in particular, I see that for staging we use "staging: true" in a lot of values-staging.yaml files, but afaics from the ingress module we don't have that option anymore
[14:46:05] yes, it was dropped in ingress.istio:1.2.0
[14:46:10] * akosiaris reviewing
[14:46:42] then what is the alternative, if you have used it?
[14:48:45] the CI diff for staging is probably not correct right now
[14:48:58] also, shall we move more services to ingress??
[14:49:26] overall? almost definitely
[14:49:41] we've already moved 1 MediaWiki installation to it, namely mw-wikifunctions
[14:49:42] ack ack I wanted to make sure that this was the way to go
[14:49:50] you don't even need the old staging behavior now
[14:50:09] 1.2.0 does things in a smarter way, so that flag was just removed
[14:50:15] nothing to migrate to
[14:50:18] very nice then
[14:50:24] I'll check what it does
[14:50:31] I recently found out as well
[14:50:56] cause my original stuff for mw-wikifunctions used 1.1 and that staging: true key
[14:51:37] btw, a gotcha for you. If something in the wikikube cluster itself needs to talk to ingress for some reason, special egress rules are needed
[14:51:46] if you can avoid that and go via the mesh, it's preferable
[14:52:14] but if you do, look at the wikifunctions egress rules that target the ingressgateways in the istio-system namespace.
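[Editor's note: the egress gotcha above could look roughly like the following NetworkPolicy. This is a hedged sketch only — the namespace, policy name, pod labels, and port are illustrative assumptions, not the actual wikifunctions rules from deployment-charts.]

```yaml
# Sketch: allow a workload to reach the Istio ingressgateway pods in the
# istio-system namespace. Names, labels, and port are assumptions, not
# the real wikifunctions configuration.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-istio-ingress   # hypothetical name
  namespace: my-service                 # hypothetical namespace
spec:
  podSelector: {}                       # applies to all pods in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: istio-system
          podSelector:
            matchLabels:
              app: istio-ingressgateway  # upstream default gateway label
      ports:
        - protocol: TCP
          port: 8443                     # assumed TLS port on the gateway
```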
[14:52:15] okok interesting, will remember it
[14:52:24] * elukey nods
[14:52:46] the other one that I have in mind is Citoid, since we are doing https://wikitech.wikimedia.org/wiki/SLO/Citoid
[14:53:00] it would be nice to use the istio ingress SLIs instead
[14:53:07] and its dependency, Zotero
[14:53:54] that one I'm not 100% sure about; it is meant to be called only by citoid, right? We could leave it with mesh only
[14:59:27] true
[15:04:32] thanks for the review akosiaris, I included your suggestion, should be ready now. I'll sync with Yiannis to test staging
[15:04:41] 👍
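[Editor's note: the gap discussed in the morning — deletes not being tracked by the pods policy — could be closed with an audit.k8s.io/v1 rule like the one below. This is a hedged sketch of the upstream audit-policy API only, not the policy actually shipped in operations-puppet.]

```yaml
# Sketch of an audit.k8s.io/v1 policy that also records resource
# deletions at the Metadata level. Illustrative only; not the policy
# deployed in operations-puppet.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Record who deleted what, without request/response bodies.
  - level: Metadata
    verbs: ["delete", "deletecollection"]
  # Keep the existing pod-mutation behavior (verbs from the log above).
  - level: RequestResponse
    verbs: ["create", "patch", "update"]
    resources:
      - group: ""            # core API group
        resources: ["pods"]
  # Ignore everything else to keep volume low.
  - level: None
```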