[06:17:05] serviceops, SRE: New WMF docker registry credentials - https://phabricator.wikimedia.org/T412524#11458693 (Marostegui)
[06:17:10] serviceops, SRE: New WMF docker registry credentials - https://phabricator.wikimedia.org/T412524#11458694 (Marostegui) p:Triage→Medium
[08:13:17] serviceops, SRE: New WMF docker registry credentials - https://phabricator.wikimedia.org/T412524#11458891 (elukey)
[08:33:08] serviceops, SRE: New WMF docker registry credentials - https://phabricator.wikimedia.org/T412524#11458924 (elukey) High level things to do afaics: 1) Our current strategy so far has been to add docker credentials to the root's home directory, essentially to limit the users that can push images. The buil...
[09:35:05] serviceops, Shellbox: Sudden increase in shellbox-syntaxhighlighing requests lead to api_appservers running out of idle workers - https://phabricator.wikimedia.org/T325890#11459040 (jijiki) Open→Invalid Bluntly closing, api_appservers do not exist anymore.
[09:55:10] bjensen, matthieulec - o/
[09:55:28] I have the new spicerack release ready to be installed
[09:55:47] I got confused by pings from both of you, who should I work with to deploy it? :D
[09:57:07] ah, thanks! my change is the one that's dependent on the spicerack deployment, so probably me?
[09:58:37] yes Blake is the right POC
[09:58:56] thanks Luca!
[10:07:11] oook! bjensen so usually when we deploy the new spicerack debian packages on the cumin nodes we test it briefly on one to see if the issue is fixed, but in this case it may be tricky
[10:07:20] so we can do this
[10:07:30] 1) I deploy the new spicerack version and I do a quick sanity check
[10:07:38] 2) you revert your puppet change and deploy it
[10:07:55] and I think the errors were coming from a cookbook, right? We could try it again and verify all is good
[10:07:58] if you have time of course
[10:08:39] i think rather than a cookbook, the python console was being used to test the service catalog directly
[10:08:51] but i'd be happy to perform that testing again after the steps you've outlined :)
[10:09:02] see https://phabricator.wikimedia.org/T412457 with the steps to reproduce
[10:12:04] bjensen: ah nice that was the sanity check that I wanted to do :D
[10:12:42] sounds good!
[10:12:49] bjensen: so yeah I think we are good to deploy your puppet patch, I'll update spicerack in the meantime
[10:12:59] ah, okay, i'll revert my reversion :p
[10:19:13] that's https://gerrit.wikimedia.org/r/c/operations/puppet/+/1218213, if someone has a moment to review
[10:19:45] +1ed, spicerack upgraded
[10:19:49] (on cumin hosts)
[10:21:00] deploying now
[10:21:54] puppet-merge complete
[10:35:00] i'm now unable to reproduce the issue in T412457, so it seems things are fixed :)
[10:35:04] thanks elukey!
[10:36:17] nice!
[10:50:47] great, thanks Luca
[11:12:47] serviceops: Move EXCLUDED_SERVICES attribute from sre.discovery.datacentre to service catalog - https://phabricator.wikimedia.org/T412211#11459533 (Blake) Luca deployed a new version of Spicerack, and I redeployed the change to the service catalog - things look good. The remaining work is to switch to using...
[11:36:37] serviceops, Content-Transform-Team, OKR-Work: Migrate parsoidtest functionality to kubernetes - https://phabricator.wikimedia.org/T386246#11459657 (jijiki)
[11:38:31] serviceops, Content-Transform-Team, OKR-Work: Migrate parsoidtest functionality to kubernetes - https://phabricator.wikimedia.org/T386246#11459663 (jijiki)
[11:38:32] serviceops, Content-Transform-Team, MediaWiki-Engineering, OKR-Work, Readers Essential Work 2025: Transition parsoidtest1001 to PHP 8.1 - https://phabricator.wikimedia.org/T380485#11459664 (jijiki)
[11:38:45] serviceops, Content-Transform-Team, OKR-Work: Migrate parsoidtest functionality to kubernetes - https://phabricator.wikimedia.org/T386246#11459666 (jijiki)
[11:38:48] serviceops, MW-on-K8s, Release-Engineering-Team (Priority Backlog 📥): Provide an mwdebug functionality on kubernetes (mw-experimental) - https://phabricator.wikimedia.org/T276994#11459667 (jijiki)
[11:39:40] serviceops, Content-Transform-Team, OKR-Work: Migrate parsoidtest functionality to kubernetes - https://phabricator.wikimedia.org/T386246#11459671 (jijiki) I have updated the task description, to reflect how we could potentially move this forward
[11:55:35] serviceops: Ensure all Chart.yaml files include required metadata fields - https://phabricator.wikimedia.org/T412693 (jijiki) NEW
[12:13:08] serviceops: ☂️ Production Readiness and Service Excellence - https://phabricator.wikimedia.org/T412696 (jijiki) NEW
[12:13:38] serviceops: ☂️ Production Readiness and Service Excellence - https://phabricator.wikimedia.org/T412696#11459801 (jijiki)
[12:13:39] serviceops: Ensure all Chart.yaml files include required metadata fields - https://phabricator.wikimedia.org/T412693#11459802 (jijiki)
[12:15:38] serviceops: Ensure all Chart.yaml files include required metadata fields - https://phabricator.wikimedia.org/T412693#11459820 (jijiki)
[12:23:56] serviceops, Patch-For-Review: Ensure all Chart.yaml files include required metadata fields - https://phabricator.wikimedia.org/T412693#11459859 (jijiki)
[12:46:35] serviceops, Patch-For-Review: Ensure all Chart.yaml files include required metadata fields - https://phabricator.wikimedia.org/T412693#11459908 (jijiki)
[12:53:48] serviceops, Patch-For-Review: Ensure all Chart.yaml files include required metadata fields - https://phabricator.wikimedia.org/T412693#11459931 (jijiki)
[12:54:31] serviceops: Enforce Chart.yaml metadata in CI - https://phabricator.wikimedia.org/T412699 (jijiki) NEW
[12:55:17] serviceops, Patch-For-Review: Ensure all Chart.yaml files include required metadata fields - https://phabricator.wikimedia.org/T412693#11459951 (jijiki)
[12:55:18] serviceops: Enforce Chart.yaml metadata in CI - https://phabricator.wikimedia.org/T412699#11459952 (jijiki)
[12:55:19] serviceops: ☂️ Production Readiness and Service Excellence - https://phabricator.wikimedia.org/T412696#11459953 (jijiki)
[13:05:52] serviceops, Infrastructure-Foundations: Improve release process of Spicerack and service catalog - https://phabricator.wikimedia.org/T412700 (MLechvien-WMF) NEW
[13:06:38] serviceops, Infrastructure-Foundations: Improve release process of Spicerack and service catalog - https://phabricator.wikimedia.org/T412700#11459989 (MLechvien-WMF) As I discussed separately with @elukey @Blake , filing the task to continue the discussion here.
[13:11:11] serviceops, Infrastructure-Foundations: Improve release process of Spicerack and service catalog - https://phabricator.wikimedia.org/T412700#11459994 (MLechvien-WMF)
[13:30:35] serviceops, Patch-For-Review: Ensure all Chart.yaml files include required metadata fields - https://phabricator.wikimedia.org/T412693#11460035 (JMeybohm) Are you planning to enforce this via CI checks?
[13:47:03] serviceops, Prod-Kubernetes, good first task, Kubernetes: Replace k8s-controller-sidecars with built in Sidecar containers on k8s 1.31 - https://phabricator.wikimedia.org/T386694#11460116 (JMeybohm) Thank you for tagging this task with #good_first_task for Wikimedia newcomers! Newcomers often ma...
[13:49:49] serviceops, Prod-Kubernetes, Kubernetes: Replace k8s-controller-sidecars with built in Sidecar containers on k8s 1.31 - https://phabricator.wikimedia.org/T386694#11460160 (JMeybohm)
[14:09:34] serviceops, Infrastructure-Foundations, Patch-For-Review: Improve release process of Spicerack and service catalog - https://phabricator.wikimedia.org/T412700#11460318 (Volans) I've sent a proposal patch with a possible approach to keep the benefits of the `@dataclass` approach and allow to ignore ex...
[15:22:06] serviceops, Infrastructure-Foundations, Patch-For-Review: Improve release process of Spicerack and service catalog - https://phabricator.wikimedia.org/T412700#11460687 (elukey) p:Triage→Medium
[15:23:32] serviceops, observability, Patch-For-Review: Create a visual representation of where each service is active from, any given time - https://phabricator.wikimedia.org/T327663#11460730 (MLechvien-WMF) Tested the script today locally on cumin1003: ` root@cumin1003:/home/matthieulec# python3 ./export_serv...
[15:41:48] serviceops, DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install wikikube-worker refresh - https://phabricator.wikimedia.org/T408760#11460843 (VRiley-WMF)
[15:49:30] serviceops, Infrastructure-Foundations, SRE, SRE-tools, Datacenter-Switchover: Support locking cookbooks run except for switchover related cookbooks - https://phabricator.wikimedia.org/T330997#11460869 (LSobanski) @Clement_Goubert is this still needed?
[16:15:28] serviceops, DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install wikikube-worker refresh - https://phabricator.wikimedia.org/T408760#11461007 (VRiley-WMF) We currently are out of space in row A for 8 - 10 Gig connections due to power. I have notified @Clement_Goubert to see if we might be able to move t...
[16:56:50] serviceops, MW-on-K8s, SRE: Pushing to the docker registry fails with 500 Internal Server Error - https://phabricator.wikimedia.org/T412265#11461202 (dancy) Here are pretrain image build and push times for the last several days: Unless otherwise noted, full l10n rebuild occurred for each of these im...
[17:01:05] serviceops, MW-on-K8s, SRE: Pushing to the docker registry fails with 500 Internal Server Error - https://phabricator.wikimedia.org/T412265#11461220 (Urbanecm_WMF) FWIW, the third deployment attempt has succeeded today. But, I still have very little idea about what went wrong and why, so I'm leaving...
[17:11:46] serviceops, MW-Interfaces-Team, Traffic, Epic, and 3 others: Epic: Enforce API rate limits (WE5.1.3c) - https://phabricator.wikimedia.org/T412585#11461265 (Clement_Goubert) >>! In T412585#11458421, @matmarex wrote: > I haven't been able to find this in the related tasks: what does the error respo...
[17:22:25] serviceops, Infrastructure-Foundations, SRE, SRE-tools, Datacenter-Switchover: Support locking cookbooks run except for switchover related cookbooks - https://phabricator.wikimedia.org/T330997#11461316 (Clement_Goubert) Not strictly, but it would be nice to have for peace of mind. This may be...
[17:34:50] serviceops, DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install wikikube-worker refresh - https://phabricator.wikimedia.org/T408760#11461353 (Clement_Goubert) Thanks for the heads up @VRiley-WMF. I've tried to plan out how the balance of servers per row would look like after the refreshes in [[ https:...
[19:44:39] serviceops, MW-Interfaces-Team, Traffic, Epic, and 3 others: Epic: Enforce API rate limits (WE5.1.3c) - https://phabricator.wikimedia.org/T412585#11461635 (daniel) >>! In T412585#11458421, @matmarex wrote: > is there some way to "force" an error response to test client libraries? Hm... That is u...