[11:14:21] (CR) AikoChou: [C:+2] revertrisk: suppress FastAPIDeprecationWarning from upstream KServe [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1276644 (https://phabricator.wikimedia.org/T416384) (owner: AikoChou)
[11:18:05] (Merged) jenkins-bot: revertrisk: suppress FastAPIDeprecationWarning from upstream KServe [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1276644 (https://phabricator.wikimedia.org/T416384) (owner: AikoChou)
[11:42:59] (CR) Kevin Bazira: [C:+1] revscoring: fix SudachiPy 0.5.2 build [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1276755 (https://phabricator.wikimedia.org/T416384) (owner: AikoChou)
[12:10:17] (CR) AikoChou: [C:+2] revscoring: fix SudachiPy 0.5.2 build [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1276755 (https://phabricator.wikimedia.org/T416384) (owner: AikoChou)
[12:11:40] (Merged) jenkins-bot: revscoring: fix SudachiPy 0.5.2 build [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1276755 (https://phabricator.wikimedia.org/T416384) (owner: AikoChou)
[12:31:47] klausman, dpogorzelski o/ as FYI I just switched the cert-manager's config of ml-staging-codfw to the discovery2026 pki intermediate.
The 'discovery' intermediate is about to expire on Sunday, and for various reasons we preferred to move to a newer one
[12:31:53] we'll have to do the same for prod
[12:33:57] bartosz, isaranto since I was syncing admin_ng I went ahead and tried your ingress change as well
[12:33:58] Error: UPGRADE FAILED: release knative-serving failed, and has been rolled back due to atomic being set: cannot patch "knative-ingress-gateway" with kind Gateway: admission webhook "validation.istio.io" denied the request: configuration is invalid: server cannot have TLS settings for plain text HTTP ports
[12:38:33] there are also several errors now like
[12:38:34] Error: Failed to render chart: exit status 1: Error: unable to build kubernetes objects from release manifest: resource mapping not found for name: "default" namespace: "" from "": no matches for kind "ClusterStorageContainer" in version "serving.kserve.io/v1alpha1"
[12:46:56] ok so we definitely need to revert and revisit how to tackle this
[13:23:21] elukey: regarding the cert-man changes, do we need to bounce any services or will the change filter in automagically?
[13:25:32] klausman: in staging in theory no, since the certs have a short expiry.. in prod yes, cert-manager will not do it if the intermediate changes :( Janis is working on something for Wikikube, we should be able to apply the same to ml-serve
[13:44:00] elukey: o/ were you trying the ingress change on staging? one thing is that staging and prod aren't fully aligned rn. staging has been updated to kserve 0.17. could that be affecting it?
[13:46:41] aiko: in theory no, I think it is probably a matter of config https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1277436
[13:50:03] ack!
:)
[13:52:30] ahhh no maybe https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1277436/6/custom_deploy.d/istio/ml-serve/config_1.24.2.yaml needs to be applied
[13:52:55] (CR) AikoChou: [C:+2] revscoring: add threshold to elapsed_time logging [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1270917 (https://phabricator.wikimedia.org/T416384) (owner: AikoChou)
[13:56:21] Machine-Learning-Team, Research: AI/ML Model Request: Text-to-Speech - https://phabricator.wikimedia.org/T419288#11861122 (Sucheta-Salgaonkar-WMF)
[13:57:11] {"time":"2026-04-27T13:54:38.429243Z","level":"warning","scope":"envoy config","msg":"delta config for type.googleapis.com/envoy.config.listener.v3.Listener rejected: Error adding/updating listener(s) 0.0.0.0_8081: error adding listener '0.0.0.0:8081': filter chain '' has the same matching rules defined as ''. duplicate matcher is:
[13:57:11] {}\n","caller":"external/envoy/source/extensions/config_subscription/grpc/delta_subscription_state.cc:276","thread":"20"}
[13:57:29] (Merged) jenkins-bot: revscoring: add threshold to elapsed_time logging [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1270917 (https://phabricator.wikimedia.org/T416384) (owner: AikoChou)
[13:57:36] so yeah something is still wrong
[15:10:33] I think Dawid and Bartosz are also messing with Istio on staging, so that error might be related
[15:11:04] s/messing with/working on/