[00:01:15] (approved) bd808: Release 6.1.0 [toolforge-repos/python-toolforge] - https://gitlab.wikimedia.org/toolforge-repos/python-toolforge/-/merge_requests/28 (https://phabricator.wikimedia.org/T333727 https://phabricator.wikimedia.org/T333728 https://phabricator.wikimedia.org/T339940 https://phabricator.wikimedia.org/T396115) (owner: lucaswerkmeister)
[04:00:40] Cloud-VPS (Project-requests), Continuous-Integration-Infrastructure (Zuul upgrade): Request creation of zuul-runners VPS project - https://phabricator.wikimedia.org/T396540#10902534 (Andrew) +1
[07:24:25] (update) taavi: registry-admission: local: Exempt local-path-storage [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/795
[07:24:29] (update) taavi: registry-admission: local: Exempt local-path-storage [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/795
[07:27:51] (merge) taavi: registry-admission: local: Exempt local-path-storage [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/795
[07:27:54] (update) taavi: logging: Init component [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/796 (https://phabricator.wikimedia.org/T386480)
[07:29:24] (update) taavi: logging: Init component [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/796 (https://phabricator.wikimedia.org/T386480)
[07:29:25] (update) taavi: logging: Add basic rate limiting and retention config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/807 (https://phabricator.wikimedia.org/T386480)
[07:29:31] (update) taavi: logging: 
Add basic rate limiting and retention config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/807 (https://phabricator.wikimedia.org/T386480)
[07:29:34] (update) taavi: logging: Init component [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/796 (https://phabricator.wikimedia.org/T386480)
[07:47:57] (PS1) Muehlenhoff: Add stub keytab for install7002 [labs/private] - https://gerrit.wikimedia.org/r/1155594
[07:49:51] (CR) Muehlenhoff: [V:+2 C:+2] Add stub keytab for install7002 [labs/private] - https://gerrit.wikimedia.org/r/1155594 (owner: Muehlenhoff)
[08:11:02] (PS1) Elukey: Rename docker_registry_ha's occurrences to docker_registry [labs/private] - https://gerrit.wikimedia.org/r/1155601 (https://phabricator.wikimedia.org/T390251)
[08:18:31] cloud-services-team, Toolforge (Toolforge iteration 20), Patch-For-Review: `toolforge jobs dump` fails for tools.stewardsbot - https://phabricator.wikimedia.org/T396210#10903049 (dcaro) @Raymond_Ndibe Hmmm... that change should not have been backwards incompatible :/, it should not have changed t...
[08:29:46] cloud-services-team, Toolforge (Toolforge iteration 20), Patch-For-Review: `toolforge jobs dump` fails for tools.stewardsbot - https://phabricator.wikimedia.org/T396210#10903092 (dcaro) Next steps then: * Add support for both, warning the health_check_type as deprecated * Monitor its usage * Rem...
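A minimal sketch of the first step dcaro lists above (support both field names, warn on the old one), assuming the health-check config arrives as a plain dict. The function name and the dict shape are hypothetical and are not taken from jobs-api or jobs-cli.

```python
import warnings
from typing import Optional

CURRENT_KEY = "type"                  # field name returned by default
DEPRECATED_KEY = "health_check_type"  # older field name, still accepted


def read_health_check_type(health_check: dict) -> Optional[str]:
    """Return the health-check type, accepting both field names.

    Hypothetical helper: prefers the current key and falls back to the
    deprecated one, emitting a DeprecationWarning when the fallback is used.
    """
    if CURRENT_KEY in health_check:
        return health_check[CURRENT_KEY]
    if DEPRECATED_KEY in health_check:
        warnings.warn(
            f"'{DEPRECATED_KEY}' is deprecated, use '{CURRENT_KEY}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return health_check[DEPRECATED_KEY]
    return None
```

Counting how often the fallback branch is taken would be one way to cover the "monitor its usage" step before the old name is removed.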
[08:42:29] (update) dcaro: bump_version: copy from jobs-api [repos/cloud/toolforge/toolforge-weld] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/67
[08:55:38] cloud-services-team, Toolforge (Toolforge iteration 20), Patch-For-Review: `toolforge jobs dump` fails for tools.stewardsbot - https://phabricator.wikimedia.org/T396210#10903226 (dcaro) Resolved→Open
[08:55:59] (open) dcaro: health_check: default to 'type' but support 'health_check_type' [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/107 (https://phabricator.wikimedia.org/T396210)
[08:58:28] Toolforge (Toolforge iteration 21), Patch-For-Review: [jobs-api] expose health_check.type deprecation metrics - https://phabricator.wikimedia.org/T396236#10903233 (dcaro)
[08:58:30] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: [jobs-api] Introduce deprecation metrics - https://phabricator.wikimedia.org/T390137#10903235 (dcaro)
[08:58:32] Toolforge (Toolforge iteration 21): [components-api] deploy on tools - https://phabricator.wikimedia.org/T394337#10903241 (dcaro)
[08:58:33] Toolforge (Toolforge iteration 21), Patch-For-Review: [components-api] Add endpoint to get what would be the "current" config - https://phabricator.wikimedia.org/T394753#10903237 (dcaro)
[08:58:34] Toolforge (Toolforge iteration 21), Patch-For-Review: [components-api] Add alerts and runbooks for basic service health - https://phabricator.wikimedia.org/T394275#10903239 (dcaro)
[08:58:36] Toolforge (Toolforge iteration 21): [jobs-api] bug in runtime diff_with_running_job function - https://phabricator.wikimedia.org/T394734#10903249 (dcaro)
[08:58:45] Toolforge (Toolforge iteration 21), Patch-For-Review: [components-api,buildsa-api] When building and deploying, if none of the settings changed, the jobs are not restarted - 
https://phabricator.wikimedia.org/T389044#10903247 (dcaro)
[08:58:49] Toolforge (Toolforge iteration 21), Patch-For-Review: [components-cli,components-api, toolforge-weld] Add a warning message saying it's 'beta' - https://phabricator.wikimedia.org/T394277#10903245 (dcaro)
[08:58:53] Toolforge (Toolforge iteration 21), Patch-For-Review: [jobs-api] check for diff in services when running diff_with_running_job - https://phabricator.wikimedia.org/T392717#10903251 (dcaro)
[08:59:26] (open) dcaro: health-check: return `type` by default [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/174 (https://phabricator.wikimedia.org/T396210)
[09:00:46] cloud-services-team, Toolforge: Provision object storage volumes for Loki - https://phabricator.wikimedia.org/T396574 (taavi) NEW
[09:04:43] (update) lucaswerkmeister: Release 6.1.0 [toolforge-repos/python-toolforge] - https://gitlab.wikimedia.org/toolforge-repos/python-toolforge/-/merge_requests/28 (https://phabricator.wikimedia.org/T333727 https://phabricator.wikimedia.org/T333728 https://phabricator.wikimedia.org/T339940 https://phabricator.wikimedia.org/T396115)
[09:06:33] (merge) lucaswerkmeister: Release 6.1.0 [toolforge-repos/python-toolforge] - https://gitlab.wikimedia.org/toolforge-repos/python-toolforge/-/merge_requests/28 (https://phabricator.wikimedia.org/T333727 https://phabricator.wikimedia.org/T333728 https://phabricator.wikimedia.org/T339940 https://phabricator.wikimedia.org/T396115)
[09:14:50] (open) dcaro: functional_tests.jobs: add tests for health-check [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/810 (https://phabricator.wikimedia.org/T396210)
[09:21:50] Cloud-VPS (Project-requests), Continuous-Integration-Infrastructure (Zuul upgrade): Request creation of zuul-runners VPS project - 
https://phabricator.wikimedia.org/T396540#10903338 (fnegri) Open→In progress a:fnegri
[09:30:27] cloud-services-team, Toolforge: Provision object storage volumes for Loki - https://phabricator.wikimedia.org/T396574#10903370 (taavi) p:Triage→High
[09:32:36] Cloud-VPS (Project-requests), Continuous-Integration-Infrastructure (Zuul upgrade): Request creation of zuul-runners VPS project - https://phabricator.wikimedia.org/T396540#10903390 (hashar) `runners` comes from GitLab semantic: https://docs.gitlab.com/runner/ . Zuul refers to test resources as nodes and...
[09:32:51] cloud-services-team, Toolforge: Provision object storage volumes for Loki - https://phabricator.wikimedia.org/T396574#10903393 (taavi)
[09:36:09] Cloud-VPS (Project-requests), Continuous-Integration-Infrastructure (Zuul upgrade): Request creation of zuul-runners VPS project - https://phabricator.wikimedia.org/T396540#10903420 (fnegri) I will wait for @bd808 to +1 or -1 the name change. :)
[09:36:24] (update) taavi: logging: Init component [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/796 (https://phabricator.wikimedia.org/T386480)
[09:36:29] (update) taavi: logging: Add basic rate limiting and retention config [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/807 (https://phabricator.wikimedia.org/T386480)
[09:38:56] Cloud-VPS (Quota-requests): Temporary quota increase for 'cvn' - https://phabricator.wikimedia.org/T395274#10903423 (fnegri) Open→Declined I'm gonna close this as Declined for now, feel free to reopen if IPv6 doesn't fulfill your needs.
[09:43:20] Cloud-VPS (Quota-requests): Pixel project "disk40" flavor, and perhaps a few more cores? 
- https://phabricator.wikimedia.org/T395837#10903443 (fnegri) Open→Declined Closing this as Declined for now, @Mhurd feel free to reopen with more details if Cinder volumes are not a viable solution.
[09:55:52] (update) dcaro: health_check: default to 'type' but support 'health_check_type' [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/107 (https://phabricator.wikimedia.org/T396210)
[10:10:04] (update) dcaro: functional_tests.jobs: add tests for health-check [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/810 (https://phabricator.wikimedia.org/T396210)
[10:13:57] (update) dcaro: functional_tests.jobs: add tests for health-check [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/810 (https://phabricator.wikimedia.org/T396210)
[10:14:52] (open) dcaro: toolforge_deploy_mr: force install the packages [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/247
[10:15:27] (update) dcaro: toolforge_deploy_mr: force install the packages [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/247
[10:16:31] (update) dcaro: health_check: default to 'type' but support 'health_check_type' [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/107 (https://phabricator.wikimedia.org/T396210)
[10:20:18] (update) dcaro: health-check: return `type` by default [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/174 (https://phabricator.wikimedia.org/T396210)
[10:24:55] (update) dcaro: health-check: return `type` by default [repos/cloud/toolforge/jobs-api] - 
https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/174 (https://phabricator.wikimedia.org/T396210)
[10:38:42] Toolforge (Toolforge iteration 21), Patch-For-Review: [jobs-api] Create storage layer, and save business models in persistent storage - https://phabricator.wikimedia.org/T359650#10903640 (dcaro)
[10:38:43] cloud-services-team, Toolforge (Toolforge iteration 21), Goal: [harbor] Move harbor data to object storage service - https://phabricator.wikimedia.org/T350687#10903642 (dcaro)
[10:38:45] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 21): Intermittent redis connection timeouts in Toolforge - https://phabricator.wikimedia.org/T318479#10903644 (dcaro)
[10:38:46] Toolforge (Toolforge iteration 21), Upstream: [builds-builder,jobs-api,upstream] Calling nontrivial Procfile commands with arguments results in confusing error (“no such file or directory”) - https://phabricator.wikimedia.org/T356016#10903648 (dcaro)
[10:38:48] Toolforge (Toolforge iteration 21), Upstream: [builds-builder] golang based images get infinite nested loops for procfile entries - https://phabricator.wikimedia.org/T363417#10903650 (dcaro)
[10:38:50] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 21), Goal, Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#10903646 (dcaro)
[10:38:52] Toolforge (Toolforge iteration 21): [toolforge] simplify calling the different toolforge apis from within the containers - https://phabricator.wikimedia.org/T356377#10903652 (dcaro)
[10:39:00] cloud-services-team, Toolforge (Toolforge iteration 21): [kyverno] Upgrade to `3.3.9` chart (`1.13` app) for k8s 1.30 support - https://phabricator.wikimedia.org/T394787#10903662 (dcaro)
[10:39:02] Toolforge (Toolforge iteration 21): [k8s,infra] use the new docker-registry.svc.toolforge.org host everywhere - 
https://phabricator.wikimedia.org/T394902#10903664 (dcaro)
[10:39:06] Toolforge (Toolforge iteration 21), Patch-For-Review: [components-api] Add support for port/helathcheck for continuous jobs in tool config/depolyment - https://phabricator.wikimedia.org/T362072#10903660 (dcaro)
[10:39:10] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#10903658 (dcaro)
[10:39:15] Toolforge (Toolforge iteration 21), Patch-For-Review: [cicd] create cicd flow for non repo owners - https://phabricator.wikimedia.org/T394595#10903666 (dcaro)
[10:39:19] Cloud Services Proposals, cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 21), Cloud-Services-Origin-Team, and 3 others: [Hypothesis] WE6.3.10 start a beta for the push-to-deploy features - https://phabricator.wikimedia.org/T393564#10903668 (dcaro)
[10:39:23] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: [jobs-api] when running a command with wrong quoting, no logs nor useful feedback is given to the user - https://phabricator.wikimedia.org/T356267#10903670 (dcaro)
[10:39:27] Toolforge (Toolforge iteration 21), Patch-For-Review: Persist important toolforge k8s components logs - https://phabricator.wikimedia.org/T383081#10903672 (dcaro)
[10:39:31] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: [jobs-api,infra] upgrade all the existing toolforge jobs to the latest job version - https://phabricator.wikimedia.org/T359649#10903674 (dcaro)
[10:39:39] Toolforge (Toolforge iteration 21), Epic: [cicd] Streamline toolforge cli deployment and external contributor ci flows - https://phabricator.wikimedia.org/T392524#10903680 (dcaro)
[10:39:43] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: [jobs-api] Split the `*Job` 
API models into three - https://phabricator.wikimedia.org/T390136#10903676 (dcaro)
[10:39:47] Toolforge (Toolforge iteration 21), Patch-For-Review: [jobs-api] refactor models - https://phabricator.wikimedia.org/T389118#10903678 (dcaro)
[10:39:57] cloud-services-team, Toolforge (Toolforge iteration 21), Epic: [jobs-api,webservice] Run webservices via the jobs framework - https://phabricator.wikimedia.org/T348755#10903683 (dcaro)
[10:40:01] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 21), Epic: [components-api] First iteration of the component API - https://phabricator.wikimedia.org/T362051#10903687 (dcaro)
[10:40:05] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: [harbor,infra] Find a way to manage toolforge project policies with code - https://phabricator.wikimedia.org/T360509#10903685 (dcaro)
[10:40:09] cloud-services-team, Toolforge (Toolforge iteration 21), Epic: [jobs-api] expose jobs-api continuous jobs to the internet via `toolname.toolforge.org`, just like webservice - https://phabricator.wikimedia.org/T388092#10903691 (dcaro)
[10:40:13] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: Toolforge: Replace all bastion with grid-less bookworm based bastion hosts - https://phabricator.wikimedia.org/T314665#10903689 (dcaro)
[10:40:22] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 21), Epic: [KR] WE6.3 Introduce a sustainability scoring system for the Toolforge platform - https://phabricator.wikimedia.org/T368600#10903697 (dcaro)
[10:40:26] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: [jobs-api] Periodically refresh image-config data - https://phabricator.wikimedia.org/T357112#10903693 (dcaro)
[10:40:30] cloud-services-team, Toolforge (Toolforge iteration 21): [builds-builder] Upgrade python buildpack to v0.17.0 or newer for Poetry support - 
https://phabricator.wikimedia.org/T374056#10903699 (dcaro)
[10:40:34] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: [k8s,infra] Upgrade Toolforge to Uwubernetes (1.30) - https://phabricator.wikimedia.org/T362869#10903701 (dcaro)
[10:40:38] Toolforge (Toolforge iteration 21): [maintain-harbor,ci] the version number does not get bumped on every release - https://phabricator.wikimedia.org/T396504#10903705 (dcaro)
[10:40:42] Toolforge (Toolforge iteration 21): [components-api] Add all missing options for scheduled components - https://phabricator.wikimedia.org/T395071#10903708 (dcaro)
[10:40:46] Toolforge (Toolforge iteration 21), good first task: [components-cli] bash autocomplete does not autocomplete file name when creating config - https://phabricator.wikimedia.org/T395077#10903706 (dcaro)
[10:40:50] Toolforge (Toolforge iteration 21), good first task: [components-api] use the `build.params.image_name` to compare with the `component` - https://phabricator.wikimedia.org/T395076#10903707 (dcaro)
[10:40:54] Toolforge (Toolforge iteration 21): [components-api] add all the missing options for continuous components - https://phabricator.wikimedia.org/T395070#10903709 (dcaro)
[10:40:58] Toolforge (Toolforge iteration 21): [components-api] Add support for scheduled components - https://phabricator.wikimedia.org/T395065#10903710 (dcaro)
[10:41:02] Toolforge (Toolforge iteration 21): [components-api,components-cli] add `deploy cancel` feature - https://phabricator.wikimedia.org/T395039#10903711 (dcaro)
[10:41:06] Toolforge (Toolforge iteration 21): [infra] move toolsbeta to `toolsbeta.org` domain - https://phabricator.wikimedia.org/T394997#10903713 (dcaro)
[10:41:10] cloud-services-team, Toolforge (Toolforge iteration 21): [components-api] optionally log deployments to SAL automatically - https://phabricator.wikimedia.org/T393169#10903712 (dcaro)
[10:41:14] Toolforge (Toolforge iteration 21): 
[envvars] show the 'global' envvars when running `toolforge envvars list` - https://phabricator.wikimedia.org/T394408#10903717 (dcaro)
[10:41:18] Toolforge (Toolforge iteration 21): [functional-tests,builds-builder] create a test suite to run builds for all the sample tools we have - https://phabricator.wikimedia.org/T394927#10903716 (dcaro)
[10:41:22] Toolforge (Toolforge iteration 21), good first task: [components-cli] make `toolforge components deployment show` show the latest deployment if no id passed - https://phabricator.wikimedia.org/T394994#10903714 (dcaro)
[10:41:26] Toolforge (Toolforge iteration 21), good first task: [components-api] add `GET` endpoint `/v1/tool//deployments/latest` - https://phabricator.wikimedia.org/T394990#10903715 (dcaro)
[10:41:30] Toolforge (Toolforge iteration 21): [builds-cli] add resolved reference when showing a build - https://phabricator.wikimedia.org/T394300#10903718 (dcaro)
[10:41:38] Toolforge (Toolforge iteration 21), Documentation: [components-api] Add admin documentation page - https://phabricator.wikimedia.org/T394280#10903719 (dcaro)
[10:41:42] Toolforge (Toolforge iteration 21): [builds-api] define a policy to update runtimes - https://phabricator.wikimedia.org/T393937#10903721 (dcaro)
[10:41:46] Toolforge (Toolforge iteration 21), Documentation: [components-api,components-cli] add user documentation page - https://phabricator.wikimedia.org/T394279#10903720 (dcaro)
[10:41:50] cloud-services-team, Toolforge (Toolforge iteration 21): [jobs-api] Indicate when a job is too big to be scheduled - https://phabricator.wikimedia.org/T383515#10903722 (dcaro)
[10:41:54] Toolforge (Toolforge iteration 21): [toolforge,jobs] "toolforge jobs logs" fails when job has not started yet - https://phabricator.wikimedia.org/T349775#10903724 (dcaro)
[10:41:58] cloud-services-team, Toolforge (Toolforge iteration 21), Sustainability (Incident Followup): 
[docs,envvars-api,jobs-api,builds-api] create docs on how to operate the cluster and core components - https://phabricator.wikimedia.org/T380959#10903723 (dcaro)
[10:42:02] cloud-services-team (FY2024/2025-Q3-Q4), Cloud-VPS (Debian Buster Deprecation), Toolforge (Toolforge iteration 21), Epic, Goal: [infra] Toolforge: migrate to Debian Bullseye or later - https://phabricator.wikimedia.org/T311897#10903725 (dcaro)
[10:42:10] Toolforge (Toolforge iteration 21): [maintain-harbor,ci] the version number does not get bumped on every release - https://phabricator.wikimedia.org/T396504#10903728 (dcaro) p:Triage→Low
[10:42:14] cloud-services-team, Toolforge (Toolforge iteration 20), Patch-For-Review: `toolforge jobs dump` fails for tools.stewardsbot - https://phabricator.wikimedia.org/T396210#10903730 (dcaro) Open→In progress
[10:42:18] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: `toolforge jobs dump` fails for tools.stewardsbot - https://phabricator.wikimedia.org/T396210#10903732 (dcaro) p:Triage→Medium
[10:42:24] cloud-services-team, Toolforge (Toolforge iteration 21), Patch-For-Review: `toolforge jobs dump` fails for tools.stewardsbot - https://phabricator.wikimedia.org/T396210#10903735 (dcaro) a:Raymond_Ndibe→dcaro
[10:45:09] cloud-services-team, Toolforge: [infra,ci] Alert when toolforge-deploy changes are not deployed - https://phabricator.wikimedia.org/T358908#10903755 (dcaro)
[11:01:29] (PS4) Majavah: toolforge: Add cookbook to mirror Loki-related images [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1154802 (https://phabricator.wikimedia.org/T386480)
[11:03:15] (CR) Majavah: [C:+2] toolforge: Add cookbook to mirror Loki-related images (2 comments) [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1154802 (https://phabricator.wikimedia.org/T386480) (owner: Majavah)
[11:06:52] (Merged) jenkins-bot: toolforge: Add cookbook to mirror 
Loki-related images [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1154802 (https://phabricator.wikimedia.org/T386480) (owner: Majavah)
[11:09:16] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.logging.copy_images_to_registry for Loki 3.5.0, Alloy 1.9.1
[11:09:17] !log taavi@cloudcumin1001 tools Updating container image docker-registry.svc.toolforge.org/grafana/loki:3.5.0
[11:09:19] !log taavi@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.k8s.logging.copy_images_to_registry (exit_code=99) for Loki 3.5.0, Alloy 1.9.1
[11:11:34] (PS1) Majavah: toolforge: logging: Fix container image tag syntax [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1155641
[11:12:35] (CR) Majavah: [C:+2] toolforge: logging: Fix container image tag syntax [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1155641 (owner: Majavah)
[11:16:13] (Merged) jenkins-bot: toolforge: logging: Fix container image tag syntax [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1155641 (owner: Majavah)
[11:18:11] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.logging.copy_images_to_registry for Loki 3.5.0, Alloy 1.9.1
[11:18:11] !log taavi@cloudcumin1001 tools Updating container image docker-registry.svc.toolforge.org/grafana/loki:3.5.0
[11:18:32] !log taavi@cloudcumin1001 tools Updating container image docker-registry.svc.toolforge.org/grafana/alloy:v1.9.1
[11:19:19] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.logging.copy_images_to_registry (exit_code=0) for Loki 3.5.0, Alloy 1.9.1
[11:20:28] (update) dcaro: [deploy] skip build if refs are same [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/77 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[11:30:00] cloud-services-team, Toolforge: Provision object storage volumes for Loki - 
https://phabricator.wikimedia.org/T396574#10903920 (taavi) @Andrew do you happen to know if individual buckets can be configured to have a storage limit applied to them without having to go through the Rados admin CLI?
[11:38:23] (PS1) NkwadaNora: rearrange the location of some files [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1155646
[11:39:29] (update) taavi: Adding loki [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/226 (https://phabricator.wikimedia.org/T386480) (owner: rook)
[11:56:23] (update) dcaro: [components-smoke-test] add components-api conditional build and run tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[11:58:43] (update) dcaro: [components-smoke-test] add components-api conditional build and run tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[12:01:07] (update) dcaro: [components-smoke-test] add components-api conditional build and run tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[12:02:23] cloud-services-team, Cloud-VPS: Create OpenStack role that allows object storage access only - https://phabricator.wikimedia.org/T396594 (taavi) NEW
[12:02:36] (close) taavi: Adding loki [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/226 (https://phabricator.wikimedia.org/T386480) (owner: rook)
[12:04:48] (update) dcaro: [components-smoke-test] add components-api conditional build and run 
tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[12:05:44] Tool-python-toolforge: Support reading Wiki Replica/ToolsDB credentials from envvars - https://phabricator.wikimedia.org/T339940#10904052 (LucasWerkmeister) Open→Resolved Should be done now!
[12:06:22] (update) dcaro: [components-smoke-test] add components-api conditional build and run tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[12:07:47] Tool-python-toolforge: Support connecting to extension databases - https://phabricator.wikimedia.org/T396115#10904056 (LucasWerkmeister) Open→Resolved Should be implemented now: `lang=python from toolforge import connect conn = connect('wikidatawiki', extension='termstore') with conn.cursor() a...
[12:09:33] (update) dcaro: [components-smoke-test] add components-api conditional build and run tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[12:21:13] (update) dcaro: [components-smoke-test] add components-api conditional build and run tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[12:22:01] (update) dcaro: [components-smoke-test] add components-api conditional build and run tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[12:23:00] (update) dcaro: [components-smoke-test] add components-api conditional build and run tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[12:30:35] (open) dcaro: builds: show also the pending state builds [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/107
[12:31:58] (update) dcaro: [components-smoke-test] add components-api conditional build and run tests [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[12:32:29] (merge) dcaro: [deploy] skip build if refs are same [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/77 (https://phabricator.wikimedia.org/T389044) (owner: 
raymond-ndibe)
[12:32:31] (update) dcaro: [deploy] add force-build and force-run query params [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/80 (https://phabricator.wikimedia.org/T389044) (owner: raymond-ndibe)
[12:35:21] (open) group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: components-api: bump to 0.0.115-20250611123240-830d7be5 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/811 (https://phabricator.wikimedia.org/T389044 https://phabricator.wikimedia.org/T395533)
[12:40:48] (update) dcaro: builds: show also the pending state builds [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/107
[12:47:47] cloud-services-team (FY2024/2025-Q3-Q4), Data-Services, Data-Engineering, Data-Persistence, and 3 others: Migrate clouddb* hosts to MariaDB 10.11 - https://phabricator.wikimedia.org/T394372#10904329 (fnegri) @Alien333 thanks for reporting this! I don't think the assignments (`:=`) are the proble...
[12:53:56] cloud-services-team (FY2024/2025-Q3-Q4), Data-Services, Data-Persistence, Patch-For-Review: Migrate clouddb* hosts to MariaDB 10.11 - https://phabricator.wikimedia.org/T394372#10904351 (fnegri)
[12:55:27] cloud-services-team (FY2024/2025-Q3-Q4), Data-Services, Data-Persistence, Patch-For-Review: Migrate clouddb* hosts to MariaDB 10.11 - https://phabricator.wikimedia.org/T394372#10904379 (Volans) From MariaDB 10.7 according to https://mariadb.com/kb/en/reserved-words/ ;)
[12:57:33] cloud-services-team (FY2024/2025-Q3-Q4), Data-Services, Data-Persistence, Patch-For-Review: Migrate clouddb* hosts to MariaDB 10.11 - https://phabricator.wikimedia.org/T394372#10904382 (fnegri) > From MariaDB 10.7 according to https://mariadb.com/kb/en/reserved-words/ ;) That's some good sleuthi...
[12:58:10] 10cloud-services-team (FY2024/2025-Q3-Q4), 10Data-Services, 06Data-Persistence, 13Patch-For-Review: Migrate clouddb* hosts to MariaDB 10.11 - https://phabricator.wikimedia.org/T394372#10904389 (10Alien333) Thank you very much for the precision. [13:01:59] (03close) 10dcaro: [components-smoke-test] add components-api conditional build and run tests [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/797 (https://phabricator.wikimedia.org/T389044) (owner: 10raymond-ndibe) [13:06:24] (03update) 10dcaro: components-api: bump to 0.0.115-20250611123240-830d7be5 [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/811 (https://phabricator.wikimedia.org/T389044 https://phabricator.wikimedia.org/T395533) (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [13:12:07] !log dcaro@acme toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component components-api [13:12:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [13:14:35] 10Toolforge (Toolforge iteration 21): [components-api] Add basic prometheus metrics - https://phabricator.wikimedia.org/T394276#10904502 (10dcaro) Hmmm.... something looks suspicious... it says we are getting >300 req/s in tools, mostly to the health endpoint (the endpoint is ok, but 300/s seems too many as that... 
[13:15:13] !log dcaro@acme toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component components-api [13:15:16] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [13:17:33] 10Toolforge (Toolforge iteration 21): [components-api] Add basic prometheus metrics - https://phabricator.wikimedia.org/T394276#10904517 (10dcaro) The checks seem to be 1 every 3s: ` dcaro@tools-bastion-13:~$ kubectl-sudo -n components-api logs components-api-685b8cb7f8-28mhk -f --timestamps --tail 10 Defaulted... [13:21:21] 10Toolforge (Toolforge iteration 21): [components-api] Add basic prometheus metrics - https://phabricator.wikimedia.org/T394276#10904531 (10taavi) The values of those metrics seem to be flipping between two values: {F62292739} Almost as if there were two workers, both of which kept their own separate values in m... [13:23:01] (03open) 10chuckonwumelu: d/changelog: bump to 1.6.9 [repos/cloud/toolforge/toolforge-weld] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/80 (https://phabricator.wikimedia.org/T262562 https://phabricator.wikimedia.org/T394277) [13:25:51] !log chuckonwumelu@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component toolforge-weld [13:25:53] !log chuckonwumelu@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.component.deploy (exit_code=99) for component toolforge-weld [13:26:08] !log chuckonwumelu@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component toolforge-weld [13:32:13] !log chuckonwumelu@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component toolforge-weld [13:32:26] 10Toolforge (Toolforge iteration 21): [components-api] Add basic prometheus metrics - https://phabricator.wikimedia.org/T394276#10904634 (10dcaro) Yep, it's getting two values, one for each gunicorn worker, besides the high values. 
[13:33:44] !log chuckonwumelu@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component toolforge-weld [13:35:17] (03approved) 10fnegri: d/changelog: bump to 1.6.9 [repos/cloud/toolforge/toolforge-weld] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/80 (https://phabricator.wikimedia.org/T262562 https://phabricator.wikimedia.org/T394277) (owner: 10chuckonwumelu) [13:39:13] !log chuckonwumelu@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component toolforge-weld [13:41:57] 10Tool-extloc: Extloc gives 500 on using an unrecognized extension name - https://phabricator.wikimedia.org/T396616 (10SD0001) 03NEW [13:42:41] (03merge) 10chuckonwumelu: d/changelog: bump to 1.6.9 [repos/cloud/toolforge/toolforge-weld] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-weld/-/merge_requests/80 (https://phabricator.wikimedia.org/T262562 https://phabricator.wikimedia.org/T394277) [13:49:32] 10Toolforge (Toolforge iteration 21), 13Patch-For-Review: [components-cli,components-api, toolforge-weld] Add a warning message saying it's 'beta' - https://phabricator.wikimedia.org/T394277#10904692 (10Chuckonwumelu) 05In progress→03Resolved [13:56:51] 06cloud-services-team, 10Cloud-VPS: Create OpenStack role that allows object storage access only - https://phabricator.wikimedia.org/T396594#10904734 (10fnegri) p:05Triage→03Medium [13:57:12] 06cloud-services-team, 10Toolforge: [toolsbeta,tofu,infra] There's some discrepancy between the volumes in toolsbeta and tofu - https://phabricator.wikimedia.org/T396276#10904742 (10fnegri) p:05Triage→03High [13:57:46] 06cloud-services-team: HAProxyServiceUnavailable HAProxy service designate-api_backend has no available backends on cloudlb1002:9900 - https://phabricator.wikimedia.org/T396257#10904746 (10fnegri) 05Open→03Resolved a:03fnegri This is now resolved. 
[14:04:05] (03approved) 10dcaro: components-api: bump to 0.0.115-20250611123240-830d7be5 [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/811 (https://phabricator.wikimedia.org/T389044 https://phabricator.wikimedia.org/T395533) (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [14:04:10] (03merge) 10dcaro: components-api: bump to 0.0.115-20250611123240-830d7be5 [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/811 (https://phabricator.wikimedia.org/T389044 https://phabricator.wikimedia.org/T395533) (owner: 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620) [14:10:26] 06cloud-services-team, 06DC-Ops, 10ops-codfw, 06SRE: cloudcontrol2010-dev service implementation - https://phabricator.wikimedia.org/T396064#10904809 (10Andrew) 05Open→03Resolved [14:11:10] 06cloud-services-team, 10Toolforge: 2025-06-06 Toolforge NFS cleanup - https://phabricator.wikimedia.org/T396220#10904815 (10fnegri) 05Open→03Resolved a:03fnegri Still some things to clean up but we're below the threshold now. 
[14:11:29] 06cloud-services-team, 10Cloud-VPS: Review our handling of keystone 'member' role (previously known as 'projectadmin') - https://phabricator.wikimedia.org/T396016#10904820 (10fnegri) p:05Triage→03Medium [14:12:44] 06cloud-services-team, 10Cloud-VPS: YAML aliases do not seem to pass cleanly through from Horizon to the Puppetserver - https://phabricator.wikimedia.org/T395940#10904828 (10fnegri) p:05Triage→03Medium [14:12:45] 06cloud-services-team, 10Toolforge: 2025-06-06 Toolforge NFS cleanup - https://phabricator.wikimedia.org/T396220#10904829 (10taavi) a:05fnegri→03taavi [14:16:31] (03open) 10dcaro: toolsbeta:prometheus_vm:2: make volume 'a' 20G [repos/cloud/toolforge/tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/48 [14:18:07] (03update) 10dcaro: toolsbeta:prometheus_vm:2: make volume 'a' 20G [repos/cloud/toolforge/tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/48 [14:18:19] 06cloud-services-team: KernelErrors Server cloudvirt1045 logged kernel errors - https://phabricator.wikimedia.org/T395739#10904876 (10taavi) ` [Sat May 31 20:49:35 2025] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 4 [Sat May 31 20:49:35 2025] {1}[Hardware Error]: It has been corr... [14:18:45] 06cloud-services-team: KernelErrors Server cloudvirt1045 logged kernel errors - https://phabricator.wikimedia.org/T395739#10904882 (10fnegri) 05Open→03Resolved a:03fnegri This is not worrying unless it starts happening frequently. 
[14:19:22] 06cloud-services-team, 10Striker: Rotate StrikerBot GitLab PAT before it expires on 2025-07-29 - https://phabricator.wikimedia.org/T395694#10904888 (10taavi) p:05Triage→03High a:03taavi [14:20:24] 06cloud-services-team, 10Striker: Update StrikerBot Developer, SUL, and related accounts to email folks besides just bd808 - https://phabricator.wikimedia.org/T395697#10904890 (10fnegri) p:05Triage→03High [14:20:48] 06cloud-services-team, 10Striker: Update StrikerBot Developer, SUL, and related accounts to email folks besides just bd808 - https://phabricator.wikimedia.org/T395697#10904896 (10fnegri) p:05High→03Medium a:03taavi [14:22:55] 06cloud-services-team: PuppetFailure Puppet has failed on cloudcontrol2006-dev:9100 - https://phabricator.wikimedia.org/T395695#10904909 (10fnegri) 05Open→03Resolved a:03fnegri This keeps happening on and off because of failures with git syncing. Resolving for now. [14:23:25] 06cloud-services-team: NovafullstackSustainedFailures Novafullstack tests have been failing for more than 5hours in eqiad - https://phabricator.wikimedia.org/T395658#10904912 (10fnegri) 05Open→03Resolved a:03fnegri [14:24:03] 06cloud-services-team: SystemdUnitDown The systemd unit prometheus-node-textfile-wmcs-bastionless.service on node cloudcontrol1007 has been failing for more than two hours. - https://phabricator.wikimedia.org/T395515#10904915 (10fnegri) 05Open→03Resolved a:03fnegri Related to OpenStack upgrade and fail... [14:24:29] 06cloud-services-team: KernelErrors Server cloudcontrol1011 logged kernel errors - https://phabricator.wikimedia.org/T395509#10904931 (10fnegri) 05Open→03Resolved a:03fnegri Just a reboot [14:25:42] 06cloud-services-team, 10Data-Services: [wikireplicas] Create views for new wiki minwikibooks - https://phabricator.wikimedia.org/T395502#10904944 (10fnegri) p:05Triage→03Low Waiting for the wiki to be created. 
[14:27:13] 06cloud-services-team, 10Toolforge, 06Infrastructure-Foundations, 10netops: [infra] Reports of slow connectivity from APAC - https://phabricator.wikimedia.org/T395135#10904952 (10fnegri) [14:29:22] 06cloud-services-team, 10Toolforge: [build-service] remove legacy fagiani/apt 0.2.5 builder from `--use-latest-versions` stack - https://phabricator.wikimedia.org/T394466#10904957 (10fnegri) →14Duplicate dup:03T387141 [14:29:25] 06cloud-services-team, 10Toolforge: [builds-builder,apt] migrate from apt buildpack to Heroku's .deb packages buildpack - https://phabricator.wikimedia.org/T387141#10904959 (10fnegri) [14:30:36] 06cloud-services-team, 10Toolforge: [build-service] Allow custom tagging of build service generated images - https://phabricator.wikimedia.org/T369192#10904962 (10fnegri) p:05Triage→03Medium [14:38:41] FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [14:54:15] 06cloud-services-team, 10Toolforge, 07Documentation, 07Security, 05Vuln-Infoleak: Documented `toolforge envvars` usage exposes secrets via `ps` and others - https://phabricator.wikimedia.org/T396232#10905076 (10JJMC89) [14:59:40] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,cinder [14:59:47] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,cinder [15:04:34] 10Cloud-VPS (Project-requests), 10Continuous-Integration-Infrastructure (Zuul upgrade): Request creation of zuul-runners VPS project - https://phabricator.wikimedia.org/T396540#10905116 (10bd808) 05In progress→03Stalled >>! In T396540#10903420, @fnegri wrote: > I will wait for @bd808 to +1 or -1 the name c... 
[15:05:46] 06cloud-services-team, 10Toolforge, 10Pywikibot: unexpected root-owned files in /data/project/pywikibot/public_html - https://phabricator.wikimedia.org/T375279#10905125 (10Xqt) Stopped again last night, this time during pre-nightly. [15:11:18] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for all services [15:13:45] (03open) 10dcaro: run_functional_tests: add extra logs with filters/components [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/812 [15:18:20] PROBLEM - nova-compute proc minimum on cloudvirt1052 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:18:28] PROBLEM - nova-compute proc minimum on cloudvirt1049 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:18:46] PROBLEM - nova-compute proc minimum on cloudvirt1059 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:18:47] (03update) 10dcaro: run_functional_tests: add extra logs with filters/components [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/812 [15:18:56] PROBLEM - nova-compute proc minimum on cloudvirt1050 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:18:58] PROBLEM - nova-compute proc minimum on cloudvirt1053 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute 
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:19:12] PROBLEM - nova-compute proc minimum on cloudvirt1048 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:19:14] PROBLEM - nova-compute proc minimum on cloudvirt1056 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:19:30] PROBLEM - nova-compute proc minimum on cloudvirt1061 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:19:38] PROBLEM - nova-compute proc minimum on cloudvirt1054 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:19:46] RECOVERY - nova-compute proc minimum on cloudvirt1059 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:19:48] (03update) 10dcaro: run_functional_tests: add extra logs with filters/components [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/812 [15:19:56] PROBLEM - nova-compute proc minimum on cloudvirt1057 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:20:14] RECOVERY - nova-compute proc minimum on cloudvirt1056 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:20:30] RECOVERY - nova-compute proc minimum on 
cloudvirt1061 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:20:36] !log andrew@cloudcumin1001 admin END (ERROR) - Cookbook wmcs.openstack.restart_openstack (exit_code=97) on deployment eqiad1 for all services [15:20:38] RECOVERY - nova-compute proc minimum on cloudvirt1054 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:20:40] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for all services [15:20:56] RECOVERY - nova-compute proc minimum on cloudvirt1057 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:21:13] (03update) 10dcaro: run_functional_tests: add extra logs with filters/components [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/812 [15:22:20] PROBLEM - nova-compute proc maximum on cloudvirt1050 is CRITICAL: PROCS CRITICAL: 0 processes with PPID = 1, regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:22:20] PROBLEM - nova-compute proc maximum on cloudvirt1053 is CRITICAL: PROCS CRITICAL: 0 processes with PPID = 1, regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:22:47] PROBLEM - nova-compute proc maximum on cloudvirt1049 is CRITICAL: PROCS CRITICAL: 0 processes with PPID = 1, regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:22:56] PROBLEM - nova-compute proc maximum on cloudvirt1048 is CRITICAL: PROCS CRITICAL: 0 processes with PPID 
= 1, regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:22:56] PROBLEM - nova-compute proc maximum on cloudvirt1052 is CRITICAL: PROCS CRITICAL: 0 processes with PPID = 1, regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:22:58] RECOVERY - nova-compute proc minimum on cloudvirt1053 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:23:20] RECOVERY - nova-compute proc maximum on cloudvirt1053 is OK: PROCS OK: 1 process with PPID = 1, regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:23:28] RECOVERY - nova-compute proc minimum on cloudvirt1049 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:23:46] RECOVERY - nova-compute proc maximum on cloudvirt1049 is OK: PROCS OK: 1 process with PPID = 1, regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:23:56] RECOVERY - nova-compute proc minimum on cloudvirt1050 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:24:20] RECOVERY - nova-compute proc maximum on cloudvirt1050 is OK: PROCS OK: 1 process with PPID = 1, regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:24:56] RECOVERY - nova-compute proc maximum on cloudvirt1048 is OK: PROCS OK: 1 process with PPID = 1, regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute 
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:24:56] FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [15:25:12] RECOVERY - nova-compute proc minimum on cloudvirt1048 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:25:15] (03open) 10dcaro: functional_tests: use the right webservice tag for the tests [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/813 [15:25:20] RECOVERY - nova-compute proc minimum on cloudvirt1052 is OK: PROCS OK: 1 process with regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:25:23] (03update) 10dcaro: functional_tests: use the right webservice tag for the tests [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/813 [15:25:56] RECOVERY - nova-compute proc maximum on cloudvirt1052 is OK: PROCS OK: 1 process with PPID = 1, regex args ^/usr/bin/pytho[n].* /usr/bin/nova-compute https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting [15:30:24] (03open) 10dcaro: prometheus: use multiproc stats [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/85 (https://phabricator.wikimedia.org/T394275) [15:31:23] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for all services [15:38:07] 
(03update) 10dcaro: prometheus: use multiproc stats [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/85 (https://phabricator.wikimedia.org/T394275) [15:40:03] 10Tool-refill: Refill tool stuck "waiting for an available worker" - https://phabricator.wikimedia.org/T396631 (10skarz) 03NEW [15:44:14] 10Tool-refill: Refill tool stuck "waiting for an available worker" - https://phabricator.wikimedia.org/T396631#10905347 (10Curb_Safe_Charmer) a:03Curb_Safe_Charmer [15:44:24] 10Tool-refill: Refill tool stuck "waiting for an available worker" - https://phabricator.wikimedia.org/T396631#10905348 (10Curb_Safe_Charmer) 05Open→03In progress [15:44:59] 10Tool-refill: Refill tool stuck "waiting for an available worker" - https://phabricator.wikimedia.org/T396631#10905354 (10Curb_Safe_Charmer) Following process at https://en.wikipedia.org/wiki/Wikipedia:Refill/restart [15:51:26] RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [15:54:16] (03update) 10dcaro: prometheus: use multiproc stats [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/85 (https://phabricator.wikimedia.org/T394275) [15:56:31] 10Tool-refill: Refill tool stuck "waiting for an available worker" - https://phabricator.wikimedia.org/T396631#10905465 (10skarz) 05In progress→03Resolved Thanks! 
[15:56:32] (03approved) 10dcaro: [components-service] add components api alert [repos/cloud/toolforge/alerts] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/31 (https://phabricator.wikimedia.org/T394275) (owner: 10raymond-ndibe) [15:56:35] (03merge) 10dcaro: [components-service] add components api alert [repos/cloud/toolforge/alerts] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/31 (https://phabricator.wikimedia.org/T394275) (owner: 10raymond-ndibe) [15:58:57] 10Tool-refill: Refill tool stuck "waiting for an available worker" - https://phabricator.wikimedia.org/T396631#10905479 (10Curb_Safe_Charmer) Started working again after running webservice restart ./restart.sh [16:14:41] 06cloud-services-team, 10Toolforge, 10Pywikibot: unexpected root-owned files in /data/project/pywikibot/public_html - https://phabricator.wikimedia.org/T375279#10905533 (10Xqt) p:05Low→03Medium [16:17:04] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-54 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses [16:22:21] 10Toolforge (Toolforge iteration 21): [components-api] deploy on tools - https://phabricator.wikimedia.org/T394337#10905555 (10dcaro) >>! In T394337#10876418, @Raymond_Ndibe wrote: > we can probably mark this as resolved Still ongoing, there's a patch for review, and there will be some followup to catch up tool... [16:24:28] 10Toolforge (Toolforge iteration 21): [components-api] Add support for scheduled components - https://phabricator.wikimedia.org/T395065#10905567 (10dcaro) >>! 
In T395065#10872015, @Raymond_Ndibe wrote: > should probably wait until the jobs split PR (https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/m... [16:27:04] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-54 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses [16:51:08] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for service: project,cinder [16:51:18] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment codfw1dev for service: project,cinder [17:25:55] 10Cloud-VPS (Project-requests), 10Continuous-Integration-Infrastructure (Zuul upgrade): Request creation of zuul-runners VPS project - https://phabricator.wikimedia.org/T396540#10905853 (10thcipriani) 05Stalled→03Open >>! In T396540#10905116, @bd808 wrote: >>>! In T396540#10903420, @fnegri wrote: >> I will... 
[17:26:26] 10Cloud-VPS (Project-requests), 10Continuous-Integration-Infrastructure (Zuul upgrade): Request creation of zuul VPS project - https://phabricator.wikimedia.org/T396540#10905856 (10thcipriani) [17:27:24] (03PS12) 10Majavah: Switch username validation to Bitu API [labs/striker] - 10https://gerrit.wikimedia.org/r/1134724 (https://phabricator.wikimedia.org/T364605) (owner: 10Arendpieter) [17:27:24] (03PS1) 10Majavah: build: Upgrade Codex to 2.1.0 [labs/striker] - 10https://gerrit.wikimedia.org/r/1155739 [17:28:55] (03update) 10dcaro: [deploy] add force-build and force-run query params [repos/cloud/toolforge/components-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/80 (https://phabricator.wikimedia.org/T389044) (owner: 10raymond-ndibe) [17:29:59] (03CR) 10CI reject: [V:04-1] Switch username validation to Bitu API [labs/striker] - 10https://gerrit.wikimedia.org/r/1134724 (https://phabricator.wikimedia.org/T364605) (owner: 10Arendpieter) [17:30:24] (03CR) 10CI reject: [V:04-1] build: Upgrade Codex to 2.1.0 [labs/striker] - 10https://gerrit.wikimedia.org/r/1155739 (owner: 10Majavah) [17:31:05] 06cloud-services-team, 10Cloud-VPS (Project-requests), 10Continuous-Integration-Infrastructure (Zuul upgrade): Request creation of zuul VPS project - https://phabricator.wikimedia.org/T396540#10905897 (10bd808) [17:32:00] (03PS2) 10Majavah: build: Upgrade Codex to 2.1.0 [labs/striker] - 10https://gerrit.wikimedia.org/r/1155739 [17:32:00] (03PS13) 10Majavah: Switch username validation to Bitu API [labs/striker] - 10https://gerrit.wikimedia.org/r/1134724 (https://phabricator.wikimedia.org/T364605) (owner: 10Arendpieter) [17:33:49] 06cloud-services-team, 10Cloud-VPS (Project-requests), 10Continuous-Integration-Infrastructure (Zuul upgrade): Request creation of zuul VPS project - https://phabricator.wikimedia.org/T396540#10905909 (10bd808) [17:35:30] 10VPS-project-Codesearch, 06collaboration-services: Graduate codesearch to 
production - https://phabricator.wikimedia.org/T268199#10905916 (10Dzahn) Created a new gitlab project called [[ https://gitlab.wikimedia.org/repos/sre/sourcebot | sourcebot ]] under repos/sre to hold the docker files for letting CI bui... [17:38:59] (03approved) 10chuckonwumelu: toolsbeta:prometheus_vm:2: make volume 'a' 20G [repos/cloud/toolforge/tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/48 (owner: 10dcaro) [17:39:14] (03merge) 10dcaro: toolsbeta:prometheus_vm:2: make volume 'a' 20G [repos/cloud/toolforge/tofu-provisioning] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/48 [18:05:51] 10cloud-services-team (Hardware), 06DC-Ops, 10ops-eqiad, 06SRE: SSD firmware update for cloudcephosd10[35-41] - https://phabricator.wikimedia.org/T396651 (10RobH) 03NEW [18:07:34] 10cloud-services-team (Hardware), 06DC-Ops, 10ops-eqiad, 06SRE: SSD firmware update for cloudcephosd10[35-41] - https://phabricator.wikimedia.org/T396651#10906116 (10RobH) a:03Andrew @andrew, Would you be the best person to handle this or should I task it over to Joanna for assignment? Basically we need... [19:36:28] FIRING: InstanceDown: Project tools instance tools-k8s-worker-nfs-46 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [19:41:28] RESOLVED: InstanceDown: Project tools instance tools-k8s-worker-nfs-46 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [21:38:05] 06cloud-services-team, 10Toolforge: Provision object storage volumes for Loki - https://phabricator.wikimedia.org/T396574#10906740 (10Andrew) There is an API as well as a CLI: https://docs.ceph.com/en/latest/radosgw/adminops/ So it should be possible, although I have never tried to use it. I'm still a bit a... 
[21:55:03] 06cloud-services-team, 10Cloud-VPS: Support keystone role management with tofu-infra - https://phabricator.wikimedia.org/T396671 (10Andrew) 03NEW [21:55:09] 06cloud-services-team, 10Cloud-VPS: Support keystone role management with tofu-infra - https://phabricator.wikimedia.org/T396671#10906773 (10Andrew) p:05Triage→03Medium [21:55:55] 06cloud-services-team, 10Cloud-VPS: Create OpenStack role that allows object storage access only - https://phabricator.wikimedia.org/T396594#10906775 (10Andrew) In a perfect world this role would be managed with tofu (T396671) but it's trivial to create by hand, and the good bits are going to be in the ceph co... [22:04:52] 10Tool-python-toolforge: Add read_private() function to python-toolforge library - https://phabricator.wikimedia.org/T333728#10906800 (10LucasWerkmeister) cookiecutter-toolforge [also updated](https://github.com/lucaswerkmeister/cookiecutter-toolforge/commit/88bfd9c3224065f7b45bcdf058fcf1cdaf9fa840); with th... [22:13:40] 10cloud-services-team (Hardware), 06DC-Ops, 10ops-eqiad, 06SRE: SSD firmware update for cloudcephosd10[35-41] - https://phabricator.wikimedia.org/T396651#10906820 (10Andrew) Yes -- assuming that the cookbook works reliably for updating the firmware, these should either be managed by me or by the hypothe... [22:17:15] 10Tool-wdactle, 10MediaWiki-extensions-Wikibase-Repo, 10Wikibase Action API (WPP), 10Wikidata, 10MW-1.45-notes (1.45.0-wmf.5; 2025-06-10): wbformatentities with generate=text/plain HTML-escapes some characters - https://phabricator.wikimedia.org/T395731#10906823 (10LucasWerkmeister) 05Open→03Resol... [22:35:09] 10cloud-services-team (Hardware), 06DC-Ops, 10ops-eqiad, 06SRE: SSD firmware update for cloudcephosd10[35-41] - https://phabricator.wikimedia.org/T396651#10906864 (10RobH) The cookbook worked reliably for updating 4 of the 6 cirrussearch hosts (first couple were used in testing so had issues on the automat... 
[22:35:18] 10cloud-services-team (Hardware), 06DC-Ops, 10ops-eqiad, 06SRE: SSD firmware update for cloudcephosd10[35-41] - https://phabricator.wikimedia.org/T396651#10906865 (10RobH) [23:07:51] 10Tool-campwiz-nxt, 06translatewiki.net, 10LPL Essential (LPL Essential 2025 Apr-Jun: CX), 07Unplanned-Sprint-Work: Add CampWiz NXT to translatewiki.net - https://phabricator.wikimedia.org/T393850#10907030 (10Nokib_Sarkar)