[07:35:45] serviceops, Kubernetes: Add pod ip address blocks to staging-eqiad - https://phabricator.wikimedia.org/T386232#10558448 (JMeybohm) Ah, I was pretty sure I was reading netbox wrong :) 10.64.80.0/20 and 10.192.80.0/20 should be fine as well though...
[08:00:51] serviceops, Content-Transform-Team, Maps (Kartotherian): Review maps outage happened on Feb 17th - https://phabricator.wikimedia.org/T386648#10558506 (Jgiannelos) A possible scenario can be the following: * We have mixed servers (k8s and bare metal) * The old bare metal is allowed to talk to public...
[08:20:52] serviceops, Page Content Service, Content-Transform-Team (Work In Progress), Patch-For-Review: Dont propagate server error details to end users - https://phabricator.wikimedia.org/T385821#10558549 (Jgiannelos)
[08:21:35] serviceops, Page Content Service, Content-Transform-Team (Work In Progress), Patch-For-Review: Dont propagate server error details to end users - https://phabricator.wikimedia.org/T385821#10558552 (Jgiannelos) a:Jgiannelos
[09:32:45] serviceops, Prod-Kubernetes, Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10558767 (ops-monitoring-bot) pool host wikikube-worker[1148-1153].eqiad.wmnet by kamila@cumin1002 with reason: reimage complete
[09:32:46] serviceops, Prod-Kubernetes, Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10558768 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by kamila@cumin1002 pool for host wikikube-worker[1148-1153].eqiad.wmnet completed:...
[09:42:54] serviceops, Prod-Kubernetes, Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10558826 (ops-monitoring-bot) pool host wikikube-worker1123.eqiad.wmnet by kamila@cumin1002 with reason: reimage complete
[09:42:55] serviceops, Prod-Kubernetes, Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10558827 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by kamila@cumin1002 pool for host wikikube-worker1123.eqiad.wmnet completed: - wiki...
[10:05:08] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Replace k8s-controller-sidecars with built in Sidecar containers on k8s 1.31 - https://phabricator.wikimedia.org/T386694 (JMeybohm) NEW
[10:36:09] serviceops, Content-Transform-Team, Maps (Kartotherian): Review maps outage happened on Feb 17th 2025 - https://phabricator.wikimedia.org/T386648#10558981 (Aklapper)
[10:40:28] serviceops, Content-Transform-Team, Maps (Kartotherian): Review maps outage happened on Feb 17th 2025 - https://phabricator.wikimedia.org/T386648#10558996 (elukey) Thanks a lot for the brainbounce :) From [[ https://logstash.wikimedia.org/app/dashboards#/view/7f883390-fe76-11ea-b848-090a7444f26c?_g=...
[13:11:15] serviceops, Growth-Team, GrowthExperiments, MW-on-K8s, Patch-For-Review: Migrate GrowthExperiments maintenance jobs to mw-cron - https://phabricator.wikimedia.org/T385782#10559397 (Urbanecm_WMF) Question: How would this work impact beta? Some of our jobs are fairly important to run there, as...
[13:54:17] serviceops, Content-Transform-Team, Maps (Kartotherian): Review maps outage happened on Feb 17th 2025 - https://phabricator.wikimedia.org/T386648#10559641 (Jgiannelos) The errors not showing up on the k8s side could be because there was actually no error, the `ETIMEDOUT` is raised in the client side,...
[14:01:37] serviceops, Growth-Team, GrowthExperiments, MW-on-K8s, Patch-For-Review: Migrate GrowthExperiments maintenance jobs to mw-cron - https://phabricator.wikimedia.org/T385782#10559681 (Urbanecm_WMF) > jobs that are low criticality and could be migrated first Mentioned on the patch. I'd suggest t...
[14:10:13] serviceops, Content-Transform-Team, Maps (Kartotherian): Review maps outage happened on Feb 17th 2025 - https://phabricator.wikimedia.org/T386648#10559717 (elukey) >>! In T386648#10559641, @Jgiannelos wrote: > The errors not showing up on the k8s side could be because there was actually no error, the...
[14:11:21] serviceops, Content-Transform-Team, Maps (Kartotherian): Review maps outage happened on Feb 17th 2025 - https://phabricator.wikimedia.org/T386648#10559718 (Jgiannelos) I agree, but the only env that could hang on `en.wikipedia.org` is k8s, maps nodes can talk to that endpoint directly.
[14:12:54] serviceops, Growth-Team, GrowthExperiments, MW-on-K8s, Patch-For-Review: Migrate GrowthExperiments maintenance jobs to mw-cron - https://phabricator.wikimedia.org/T385782#10559725 (Urbanecm_WMF) >>! In T385782#10559681, @Urbanecm_WMF wrote: > I _think_ some jobs can be removed. I'll double ch...
[14:53:04] serviceops, Content-Transform-Team, Maps (Kartotherian): Review maps outage happened on Feb 17th 2025 - https://phabricator.wikimedia.org/T386648#10559821 (elukey) I want to make a test via LVS load balancing config, namely put more weights on k8s pods/workers (5 in total in each DC) so that more req...
[17:18:00] serviceops, Growth-Team, GrowthExperiments, MW-on-K8s, Patch-For-Review: Migrate GrowthExperiments maintenance jobs to mw-cron - https://phabricator.wikimedia.org/T385782#10560359 (Urbanecm_WMF)
[18:03:57] serviceops, Content-Transform-Team, Maps (Kartotherian): Review maps outage happened on Feb 17th 2025 - https://phabricator.wikimedia.org/T386648#10560543 (Scott_French) I took a quick look this morning after seeing the discussion in IRC, and it looks like none of the k8s worker nodes that have been...
[21:46:13] serviceops, Release-Engineering-Team, Scap: OSError "Message too long" from scap helmfile diffs - https://phabricator.wikimedia.org/T386759 (RLazarus) NEW