[02:23:55] Release-Engineering-Team (Radar), cloud-services-team, Cloud-VPS: Magnum cluster stuck in DELETE_FAILED status - https://phabricator.wikimedia.org/T428312#11992004 (Andrew) Keystone says: ` raise exception.Forbidden(_('Trustee has no delegated roles.')) `
[03:38:28] Release-Engineering-Team (Radar), cloud-services-team, Cloud-VPS: Magnum cluster stuck in DELETE_FAILED status - https://phabricator.wikimedia.org/T428312#11992007 (Andrew) trust_role_id: e2dfeddb1c514120b467835266f4bc06 (k8s_admin) current_effective_trustor_roles: ['38676f30eaeb44518bf7e144a73c8da6'...
[04:39:46] Phabricator (Upstream): [Typo Report] on XUnitTestEngine.php:91 - https://phabricator.wikimedia.org/T428380#11992040 (Pppery)
[06:00:34] Phabricator (Upstream), Upstream: Typo in Arcanist's XUnitTestEngine.php:91 - https://phabricator.wikimedia.org/T428380#11992066 (Aklapper)
[06:28:10] Project-Admins, Community Tech CRS Support: Archive Community Tech CRS Support project - https://phabricator.wikimedia.org/T428382 (Bugreporter) NEW
[06:30:57] GitLab, Release-Engineering-Team (Radar), collaboration-services, Patch-For-Review: gitlab behind CDN - https://phabricator.wikimedia.org/T425441#11992117 (ABran-WMF)
[07:12:37] Project-Admins, Release-Engineering-Team (Doing 😎): Archive Community Tech CRS Support project - https://phabricator.wikimedia.org/T428382#11992228 (Aklapper) Open→Resolved a:Aklapper Done, thanks for reporting this!
[08:00:35] Project-Admins: Reconciliation project - Subproject to manage reconciliation service - https://phabricator.wikimedia.org/T426541#11992343 (DaxServer) Hello @Aklapper Thanks for providing the Mediawiki links. Yes, I'm referring to creating a subproject under the #reconciliation project. I wasn't sure to whom...
[08:14:54] GitLab, Release-Engineering-Team (Radar), collaboration-services, Patch-For-Review: gitlab behind CDN - https://phabricator.wikimedia.org/T425441#11992381 (ABran-WMF) After our meeting on Friday, we were leaning towards using a dedicated hostname for git SSH endpoint, I tried to compare https and...
[08:39:22] Release-Engineering-Team (Priority Backlog 📥), Essential-Work, Release, Train Deployments: 1.47.0-wmf.6 deployment blockers - https://phabricator.wikimedia.org/T423915#11992551 (Dreamy_Jazz)
[09:00:13] Diffusion, Phabricator, collaboration-services: Drop our mirroring of code to Diffusion and empty the repos - https://phabricator.wikimedia.org/T359549#11992697 (Aklapper) **Lacking a better place I'm gonna dump my notes on rendering of linked objects (like: T359549) after //uninstalling// the relate...
[09:14:31] Project-Admins: Reconciliation project - Subproject to manage reconciliation service - https://phabricator.wikimedia.org/T426541#11992761 (Aklapper) No problem. The only thing that I must point out is that #reconciliation currently has no subprojects. Please see https://www.mediawiki.org/wiki/Phabricator/Pro...
[09:19:32] FIRING: PuppetAgentFailure: Puppet agent failure detected on instance deployment-schema-3 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[09:19:38] Beta-Cluster-Infrastructure: Puppet agent failure detected on instance deployment-schema-3 in project deployment-prep - https://phabricator.wikimedia.org/T428410 (wmcs-alerts) NEW
[09:25:10] (PS5) Hashar: Zuul: Add Əkrəm to CI whitelist [integration/config] - https://gerrit.wikimedia.org/r/1298384 (owner: NMW03)
[09:26:13] (CR) Hashar: [C:+2] Zuul: Add Əkrəm to CI whitelist [integration/config] - https://gerrit.wikimedia.org/r/1298384 (owner: NMW03)
[09:29:53] (Merged) jenkins-bot: Zuul: Add Əkrəm to CI whitelist [integration/config] - https://gerrit.wikimedia.org/r/1298384 (owner: NMW03)
[09:33:46] GitLab, Release-Engineering-Team (Radar), collaboration-services, Patch-For-Review: gitlab behind CDN - https://phabricator.wikimedia.org/T425441#11992868 (ABran-WMF)
[09:33:53] GitLab, Release-Engineering-Team (Radar), collaboration-services, Patch-For-Review: gitlab behind CDN - https://phabricator.wikimedia.org/T425441#11992870 (ABran-WMF)
[09:34:32] RESOLVED: PuppetAgentFailure: Puppet agent failure detected on instance deployment-schema-3 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[09:34:48] Release-Engineering-Team (Priority Backlog 📥), Essential-Work, Release, Train Deployments: 1.47.0-wmf.6 deployment blockers - https://phabricator.wikimedia.org/T423915#11992884 (Dreamy_Jazz)
[09:35:01] GitLab, Release-Engineering-Team (Radar), collaboration-services, Patch-For-Review: gitlab behind CDN - https://phabricator.wikimedia.org/T425441#11992886 (ABran-WMF)
[09:41:30] Continuous-Integration-Config, Diffusion, Phabricator: integration-agent-docker machines excessively pull some Wikibase related Git repos in Diffusion - https://phabricator.wikimedia.org/T349921#11992931 (Aklapper) Updated numbers after T424098#11884831 / T397714#11884839: ` mysql:phstats@m3-slave.eq...
[10:58:14] (update) aghirelli: feat(T424210): add enum-description checks for requestBody and response [repos/ci-tools/wikimedia-spectral-ruleset] - https://gitlab.wikimedia.org/repos/ci-tools/wikimedia-spectral-ruleset/-/merge_requests/8
[11:28:31] GitLab (CI & Job Runners), Release-Engineering-Team, collaboration-services, Patch-For-Review: Remove buildkit helper image docker/dockerfile-copy from build pipeline - https://phabricator.wikimedia.org/T321316#11993540 (Jelto)
[11:46:03] (update) phedenskog: Add dashboard with CI time spent by repo/jobs [repos/releng/develstats] - https://gitlab.wikimedia.org/repos/releng/develstats/-/merge_requests/6
[12:36:18] Release-Engineering-Team (Radar), Scap: scap update-patch can abort and leave /srv/patches in a mess - https://phabricator.wikimedia.org/T428316#11993838 (jnuche) The pretrain is indeed trying to [[ https://gitlab.wikimedia.org/repos/releng/scap/-/blob/b3142be0175d0d8cd6319c3517863e7cb67d6ff1/scap/patche...
[12:38:15] (CR) Hashar: zuul: restore comments about disabled deps (2 comments) [integration/config] - https://gerrit.wikimedia.org/r/1294934 (owner: Hashar)
[12:38:25] Release-Engineering-Team (Radar), Scap: scap update-patch can abort and leave /srv/patches in a mess - https://phabricator.wikimedia.org/T428316#11993846 (taavi) It seems like the cleanest option for me would be to make the prep job abort in case `/srv/patches` is dirty (or has stashed changes, as those...
[12:45:32] Release-Engineering-Team (Radar), Scap: scap update-patch can abort and leave /srv/patches in a mess - https://phabricator.wikimedia.org/T428316#11993869 (jnuche) >>! In T428316#11993846, @taavi wrote: > It seems like the cleanest option for me would be to make the prep job abort in case `/srv/patches` i...
[12:52:09] Gerrit, collaboration-services, Incident Severity 3, Wikimedia-Incident: 2026-04-12 Gerrit Outage (was: DiskSpace) - https://phabricator.wikimedia.org/T423027#11993888 (MLechvien-WMF)
[12:54:52] Release-Engineering-Team (Radar), cloud-services-team, Cloud-VPS: Magnum cluster stuck in DELETE_FAILED status - https://phabricator.wikimedia.org/T428312#11993922 (Andrew) ...and after all that, I'm curious to hear if either of you (@dduvall or @bd808) can still create clusters. Or delete them, for...
[12:57:45] GitLab (Account Approval), Release-Engineering-Team: Requesting GitLab account activation for [YOUR DEVELOPER ACCOUNT USERNAME HERE] - https://phabricator.wikimedia.org/T428446 (Rainmonger) NEW
[12:58:09] GitLab (Account Approval), Release-Engineering-Team: Requesting GitLab account activation for Rainmonger - https://phabricator.wikimedia.org/T428446#11993962 (Rainmonger)
[13:05:32] FIRING: PuppetSyncFailure: Failed to update Puppet repository /srv/git/operations/puppet on instance deployment-puppetserver-1 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetSyncFailure
[13:05:38] Beta-Cluster-Infrastructure: Failed to update Puppet repository /srv/git/operations/puppet on instance deployment-puppetserver-1 in project deployment-prep - https://phabricator.wikimedia.org/T428447 (wmcs-alerts) NEW
[13:09:25] GitLab (Account Approval), Release-Engineering-Team: Requesting GitLab account activation for Rainmonger - https://phabricator.wikimedia.org/T428446#11994026 (Aklapper) Open→Resolved Hi, looks like your GitLab account has been already activated.
[13:25:32] (PS1) JMeybohm: helm-linter: Update k8s JSON schema [integration/config] - https://gerrit.wikimedia.org/r/1298782 (https://phabricator.wikimedia.org/T427069)
[13:25:34] (PS1) JMeybohm: helm-linter: Use version 0.8.1 [integration/config] - https://gerrit.wikimedia.org/r/1298783 (https://phabricator.wikimedia.org/T427069)
[13:29:31] postmerge seems to be having issues (running for 5hrs and still pending)
[13:29:35] anyone aware of that?
[13:42:08] Diffusion, Phabricator, Release-Engineering-Team (Priority Backlog 📥), collaboration-services, Patch-For-Review: Disable IO for diffusion repositories - https://phabricator.wikimedia.org/T405596#11994185 (Aklapper) FYI made some trivial merge requests not to link to a non-canonical Diffusion...
[13:49:28] GitLab (CI & Job Runners), Release-Engineering-Team, collaboration-services, Patch-For-Review: Remove buildkit helper image docker/dockerfile-copy from build pipeline - https://phabricator.wikimedia.org/T321316#11994195 (Jelto)
[13:51:07] GitLab (CI & Job Runners), Release-Engineering-Team, collaboration-services, Patch-For-Review: Remove buildkit helper image docker/dockerfile-copy from build pipeline - https://phabricator.wikimedia.org/T321316#11994215 (Jelto) I opened MRs and changes for all projects. Some maintainers already m...
[14:21:49] urbanecm: as far as I’m aware, it’s not *particularly* unusual – it looks like only one codehealth job can run at a time, and so when there are lots of localisation updates changes, they pile up
[14:21:55] (that doesn’t mean it shouldn’t be fixed, of courser
[14:21:56] hey folks, I see that train log triage this week conflicts with the new "essential learnings" meeting which is essentially "time to discuss the item bumped from the staff meeting" which includes the gemini pilot and the compensation cycle. does it make sense to move the triage?
[14:21:58] *course)
[14:33:21] (PS1) Gkyziridis: inference-services: Add liftwing-openapi-server CI/CD pipelines. [integration/config] - https://gerrit.wikimedia.org/r/1298804 (https://phabricator.wikimedia.org/T427902)
[14:54:11] (approved) kineticpelagic: feat(T424210): add enum-description checks for requestBody and response [repos/ci-tools/wikimedia-spectral-ruleset] - https://gitlab.wikimedia.org/repos/ci-tools/wikimedia-spectral-ruleset/-/merge_requests/8 (owner: aghirelli)
[14:54:33] (merge) kineticpelagic: feat(T424210): add enum-description checks for requestBody and response [repos/ci-tools/wikimedia-spectral-ruleset] - https://gitlab.wikimedia.org/repos/ci-tools/wikimedia-spectral-ruleset/-/merge_requests/8 (owner: aghirelli)
[15:16:17] (open) jnuche: patches.py: replace `shutil.copy2` with `shutil.copy` [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1200 (https://phabricator.wikimedia.org/T428316)
[16:20:32] RESOLVED: PuppetSyncFailure: Failed to update Puppet repository /srv/git/operations/puppet on instance deployment-puppetserver-1 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetSyncFailure
[16:24:00] (update) brennen: mr. widget locking: switch to getRuntimeCache() [repos/phabricator/extensions] (wmf/stable) - https://gitlab.wikimedia.org/repos/phabricator/extensions/-/merge_requests/56 (https://phabricator.wikimedia.org/T401160)
[16:24:18] (update) brennen: Draft: mr. widget locking: switch to getRuntimeCache() [repos/phabricator/extensions] (wmf/stable) - https://gitlab.wikimedia.org/repos/phabricator/extensions/-/merge_requests/56 (https://phabricator.wikimedia.org/T401160)
[16:43:24] Release-Engineering-Team (Radar), cloud-services-team, Cloud-VPS: Magnum cluster stuck in DELETE_FAILED status - https://phabricator.wikimedia.org/T428312#11995247 (dduvall) Thanks, @Andrew! I was able to delete the cluster template via `openstack coe cluster template delete` just now FWIW. I'll atte...
[16:55:28] Beta-Cluster-Infrastructure: Failed to update Puppet repository /srv/git/operations/puppet on instance deployment-puppetserver-1 in project deployment-prep - https://phabricator.wikimedia.org/T428447#11995288 (bd808) Open→Resolved a:bd808 `lang=shell-session bd808@deployment-puppetserver-1.de...
[17:10:49] (update) jnuche: patches.py: replace `shutil.copy2` with `shutil.copy` [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1200 (https://phabricator.wikimedia.org/T428316)
[17:10:49] (close) jnuche: patches.py: replace `shutil.copy2` with `shutil.copy` [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1200 (https://phabricator.wikimedia.org/T428316)
[17:14:59] (open) jnuche: patches.py: replace `shutil.copy2` with `shutil.copyfile` [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1201 (https://phabricator.wikimedia.org/T428316)
[17:15:49] (approved) thcipriani: patches.py: replace `shutil.copy2` with `shutil.copyfile` [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1201 (https://phabricator.wikimedia.org/T428316) (owner: jnuche)
[17:17:57] (merge) jnuche: patches.py: replace `shutil.copy2` with `shutil.copyfile` [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1201 (https://phabricator.wikimedia.org/T428316)
[17:19:46] (open) jnuche: Release 4.268.0 [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1202
[17:22:42] (merge) jnuche: Release 4.268.0 [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1202
[17:23:52] RECOVERY - jenkins_service_running on contint1003 is OK: PROCS OK: 1 process with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins
[17:24:22] (update) jnuche: Release 4.268.0 [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1203
[17:24:24] (open) jnuche: Release 4.268.0 [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1203
[17:26:52] PROBLEM - jenkins_service_running on contint1003 is CRITICAL: PROCS CRITICAL: 0 processes with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins
[17:28:21] (merge) jnuche: Release 4.268.0 [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1203
[17:33:52] RECOVERY - jenkins_service_running on contint1003 is OK: PROCS OK: 1 process with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins
[17:36:52] PROBLEM - jenkins_service_running on contint1003 is CRITICAL: PROCS CRITICAL: 0 processes with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins
[17:42:19] Beta-Cluster-Infrastructure: Puppet agent failure detected on instance deployment-schema-3 in project deployment-prep - https://phabricator.wikimedia.org/T428410#11995504 (bd808) Open→Invalid `lang=shell-session bd808@deployment-schema-3:~$ sudo -i puppet agent -tv Info: Using environment 'produc...
[18:01:40] Beta-Cluster-Infrastructure, Growth-Team, GrowthExperiments-NewcomerTasks, Discovery-Search (2026.06.01 - 2026.07.03): [beta-cluster] Fetching task suggestions failed: cirrussearch-backend-error - https://phabricator.wikimedia.org/T427196#11995672 (dcausse) a:dcausse Search should be back on...
[18:05:23] Beta-Cluster-Infrastructure, cloud-services-team, Cloud-VPS: horizon: hiera config reseted to an empty state for deployment-prep instances - https://phabricator.wikimedia.org/T262284#11995699 (bd808) Open→Declined Too old to do anything about
[18:06:51] Beta-Cluster-Infrastructure, Epic: Use infrastructure as code techniques to rebuild the Beta Cluster - https://phabricator.wikimedia.org/T394316#11995706 (bd808)
[18:22:51] (PS1) Jforrester: jjb: [catalyst-daily-Echo] Add Echo job [integration/config] - https://gerrit.wikimedia.org/r/1298868 (https://phabricator.wikimedia.org/T427007)
[18:22:57] (CR) Jforrester: [C:+2] jjb: [catalyst-daily-Echo] Add Echo job [integration/config] - https://gerrit.wikimedia.org/r/1298364 (https://phabricator.wikimedia.org/T427007) (owner: Vaughn Walters)
[18:26:25] (PS2) Jforrester: jjb: [catalyst-daily-Echo] Add Echo job [integration/config] - https://gerrit.wikimedia.org/r/1298364 (https://phabricator.wikimedia.org/T427007) (owner: Vaughn Walters)
[18:26:32] (Abandoned) Jforrester: jjb: [catalyst-daily-Echo] Add Echo job [integration/config] - https://gerrit.wikimedia.org/r/1298868 (https://phabricator.wikimedia.org/T427007) (owner: Jforrester)
[18:26:49] (CR) Jforrester: "…" [integration/config] - https://gerrit.wikimedia.org/r/1298364 (https://phabricator.wikimedia.org/T427007) (owner: Vaughn Walters)
[18:29:24] (Merged) jenkins-bot: jjb: [catalyst-daily-Echo] Add Echo job [integration/config] - https://gerrit.wikimedia.org/r/1298364 (https://phabricator.wikimedia.org/T427007) (owner: Vaughn Walters)
[18:29:55] Phabricator: Requests for changes of the automated weekly Phabricator data for Tech News - https://phabricator.wikimedia.org/T428290#11995754 (STei-WMF) 1A: yes, let's exclude all bot authored activities. With regards to adding the string, how and who does it? 1B and C: If the request would mean missing impo...
[18:30:35] GitLab (Account Approval), Release-Engineering-Team: Requesting GitLab account activation for dmiranda - https://phabricator.wikimedia.org/T428494 (dmiranda) NEW
[18:33:20] Release-Engineering-Team (Radar), cloud-services-team, Cloud-VPS: Magnum cluster stuck in DELETE_FAILED status - https://phabricator.wikimedia.org/T428312#11995793 (dduvall) Open→Resolved a:dduvall I tested a full create/destroy cycle via `tofu` and it worked. Thanks, @Andrew
[18:44:10] Phabricator: Requests for changes of the automated weekly Phabricator data for Tech News - https://phabricator.wikimedia.org/T428290#11995837 (STei-WMF) So the SSH key part is where I would get stuck. It looks like I need to spend more time figuring out which instruction I am not following well. But I will t...
[19:27:12] Phabricator: Requests for changes of the automated weekly Phabricator data for Tech News - https://phabricator.wikimedia.org/T428290#11995950 (Quiddity) @Aklapper Re: 1a.: Sounds reasonable, although I believe `LibUp-bot` is the only bot that ever creates tasks that show-up in this weekly summary. But I defe...
[19:30:02] Phabricator (Upstream), Upstream: Typo in Arcanist's XUnitTestEngine.php:91 - https://phabricator.wikimedia.org/T428380#11995957 (Pppery) This isn't exactly a typo - writing quotes like that was once an accepted way of doing it: https://en.wikipedia.org/wiki/Backtick#As_surrogate_of_apostrophe_or_(openin...
[19:32:12] Phabricator (Upstream), Upstream: Typo in Arcanist's XUnitTestEngine.php:91 - https://phabricator.wikimedia.org/T428380#11995962 (Pppery) https://we.phorge.it/D27063
[19:32:16] Phabricator (Upstream), Upstream: Typo in Arcanist's XUnitTestEngine.php:91 - https://phabricator.wikimedia.org/T428380#11995964 (Pppery) a:Pppery
[19:35:52] RECOVERY - jenkins_service_running on contint1003 is OK: PROCS OK: 1 process with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins
[19:38:52] PROBLEM - jenkins_service_running on contint1003 is CRITICAL: PROCS CRITICAL: 0 processes with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins
[20:47:50] (PS1) SD0001: Zuul: [TemplateStyles] Add Scribunto phan dependency [integration/config] - https://gerrit.wikimedia.org/r/1298905 (https://phabricator.wikimedia.org/T386436)
[21:26:11] Project-Admins: Consider archiving #server-side-upload-request - https://phabricator.wikimedia.org/T428508 (Pppery) NEW
[22:03:54] Release-Engineering-Team (Radar), Cloud-VPS (Quota-requests): Quota increase request for zuul - https://phabricator.wikimedia.org/T428515 (dduvall) NEW
[22:11:54] PROBLEM - zuul_merger_service_running on contint2002 is CRITICAL: PROCS CRITICAL: 2 processes with regex args bin/zuul-merger https://www.mediawiki.org/wiki/Continuous_integration/Zuul
[22:12:54] RECOVERY - zuul_merger_service_running on contint2002 is OK: PROCS OK: 1 process with regex args bin/zuul-merger https://www.mediawiki.org/wiki/Continuous_integration/Zuul
[23:19:43] Release-Engineering-Team (Radar), Cloud-VPS (Quota-requests): Quota increase request for zuul - https://phabricator.wikimedia.org/T428515#11996847 (bd808) @dduvall The `tls-server-name: 127.0.0.1` trick in the kubeconfig does not work for the executor access?