[00:15:03] maintenance-disconnect-full-disks build 703710 integration-agent-docker-1045 (/: 25%, /srv: 99%, /var/lib/docker: 32%): OFFLINE due to disk space
[00:20:03] maintenance-disconnect-full-disks build 703711 integration-agent-docker-1045 (/: 25%, /srv: 11%, /var/lib/docker: 30%): RECOVERY disk space OK
[03:22:31] Release-Engineering-Team (Doing 😎), Essential-Work, Release, Train Deployments: 1.45.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T392172#10846325 (MusikAnimal) Could we confirm T394891 is a blocker? I don't think it is, but I can get the fix backported if anyone disagrees. Cod...
[03:49:19] Release-Engineering-Team (Doing 😎), Essential-Work, Release, Train Deployments: 1.45.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T392172#10846355 (Aklapper) I'm fine removing it as a blocker. I set it as a blocker right after the group1 deployment yesterday, in the following h...
[03:49:25] Release-Engineering-Team (Doing 😎), Essential-Work, Release, Train Deployments: 1.45.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T392172#10846356 (Aklapper)
[04:27:37] Phabricator, Legalpad, SRE: Allow aklapper to view/edit L3 - https://phabricator.wikimedia.org/T394966 (Aklapper) NEW p:Triage→Low
[04:30:00] Phabricator, Phabricator Antivandalism Extension: AVA: Punish on mass-adding edges at once - https://phabricator.wikimedia.org/T394967 (Aklapper) NEW p:Triage→Low
[05:31:13] Phabricator, Release-Engineering-Team (Priority Backlog 📥): Decrease number of open Phab tickets with assignee field set for more than two years (aka cookie licking) (Q2/2025 edition) - https://phabricator.wikimedia.org/T380312#10846513 (Aklapper) Open→In progress a:Aklapper
[05:41:15] Phabricator, Release-Engineering-Team (Doing 😎): Decrease number of open Phab tickets with assignee field set for more than two years (aka cookie licking) (Q2/2025 edition) - https://phabricator.wikimedia.org/T380312#10846547 (Aklapper) Sent emails about 359 open tasks assigned for more than two years wi...
[07:14:55] WikimediaDebug, SRE, Traffic, Developer Productivity, Patch-For-Review: Let X-Analytics response header pass through with WikimediaDebug - https://phabricator.wikimedia.org/T305794#10846779 (Vgutierrez) In progress→Resolved CR got merged now, give it the usual ~30 minutes for pupp...
[07:27:31] (open) jelto: gitlab-runner: bump image version to alpine-v17.10.1 [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/471 (https://phabricator.wikimedia.org/T394953)
[07:35:27] Project-Admins, Release-Engineering-Team, MediaWiki-extensions-General, MediaWiki-General, and 2 others: Create a security pre-release Phabricator policy manageable by the Security Team - https://phabricator.wikimedia.org/T393403#10846815 (Aklapper) Let's first try without "hiding" the ACL projec...
[07:40:02] Gerrit: 403 Forbidden on Gerrit - https://phabricator.wikimedia.org/T394916#10846820 (hashar) Hi, I am one of the person maintaining Gerrit at the foundation. @Peachey88 is correct, the issue is we recently started blocking some old browsers. Part of the rationale is the Gerrit code review system (https://g...
[07:41:04] Gerrit: Users of ProvieIt gadget get a 403 Forbidden fetching i18n files from Gerrit/Gitiles - https://phabricator.wikimedia.org/T394916#10846826 (hashar)
[07:43:55] So, whats the best channel to talk to people about "MediaWiki-Quickstart" in?
[07:44:21] #wikimedia-qte perhaps! (answers for himself)
[07:58:50] addshore: ADDSHOREEEEEEEEEE
[07:59:18] I guess I should lurk that channel as well
[07:59:52] xD
[07:59:56] addshore: they are mostly on Slack though and iirc MW Quickstart is mostly done by Monte who is in the usa/west coast
[08:00:06] aaah, not awake yet :D
[08:00:30] had we hired you, that would not be an issue :b
[08:00:47] bwhahahaaaa
[08:17:09] MediaWiki-Releasing, AntiSpoof, Trust and Safety Product Team: Bundle AntiSpoof extension with MediaWiki - https://phabricator.wikimedia.org/T191736#10846913 (Bugreporter2)
[08:20:11] Phabricator, MediaWiki-Platform-Team: Should '#MediaWiki-Platform-Team (Roadmap)' be added to the exclusions for H425? - https://phabricator.wikimedia.org/T394936#10846931 (taavi) Open→Resolved a:taavi
[08:40:03] maintenance-disconnect-full-disks build 703811 integration-agent-docker-1050 (/: 25%, /srv: 95%, /var/lib/docker: 38%): OFFLINE due to disk space
[08:45:03] maintenance-disconnect-full-disks build 703812 integration-agent-docker-1050 (/: 25%, /srv: 61%, /var/lib/docker: 39%): RECOVERY disk space OK
[08:53:01] GitLab (Infrastructure), Ceph, collaboration-services, Data-Persistence-Backup, and 3 others: Migrate gitlab storage to apus (also: backups from S3?) - https://phabricator.wikimedia.org/T378922#10847120 (Jelto) >>! In T378922#10843403, @MatthewVernon wrote: > @Jelto both buckets deleted. Thanks...
[09:16:54] (approved) cgoubert: mwscript-mwcron: historical compatibility logging mode [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/176 (owner: swfrench)
[09:19:42] Release-Engineering-Team (Radar), Infrastructure-Foundations, serviceops-radar: Allow release engineering to delete images - https://phabricator.wikimedia.org/T354786#10847263 (Clement_Goubert) a:Clement_Goubert→None
[09:34:08] Continuous-Integration-Infrastructure (Zuul upgrade), collaboration-services, SRE, SRE-Access-Requests, Patch-For-Review: create new admin group for "zuul devs" - https://phabricator.wikimedia.org/T394819#10847319 (LSobanski) a:Dzahn
[09:36:28] GitLab (Infrastructure), Ceph, collaboration-services, Data-Persistence-Backup, and 3 others: Migrate gitlab storage to apus (also: backups from S3?) - https://phabricator.wikimedia.org/T378922#10847339 (MatthewVernon) Ah, the bucket is gone from `eqiad`, but `codfw` is still catching up: ` root@...
[09:57:18] GitLab needs a short maintenance break in one hour
[10:11:40] Phabricator (Upstream), Upstream: Phabricator workboard import: Array for %Ls conversion is empty. Query: projectPHID IN (%Ls) - https://phabricator.wikimedia.org/T392168#10847445 (Aklapper) I proposed a fix in upstream https://we.phorge.it/D26030
[10:15:06] Project-Admins, SRE: Disable #acl*sre_team workboard and update its project description - https://phabricator.wikimedia.org/T394654#10847452 (LSobanski) Open→Resolved a:LSobanski Done.
[10:24:13] Project-Admins, SRE: Disable #acl*sre_team workboard and update its project description - https://phabricator.wikimedia.org/T394654#10847477 (Aklapper) Thanks!!
[10:47:58] Release-Engineering-Team (Doing 😎), Essential-Work, Release, Train Deployments: 1.45.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T392172#10847549 (Aklapper) Open→Resolved This looks fine.
[10:49:33] Gerrit: Incorrect "CI has completed checks" popup appears when navigating from a change with tests in progress to one with no tests in progress - https://phabricator.wikimedia.org/T394485#10847552 (A_smart_kitten) Open→Resolved I can no longer reproduce this issue - I believe it has been resolved...
[10:52:03] Phabricator, Infrastructure-Foundations: Removing an offboarded user from privileged Phab ACL projects should not require admins - https://phabricator.wikimedia.org/T392111#10847566 (Aklapper) a:MoritzMuehlenhoff
[11:05:36] Phabricator, SecTeam-Processed: Change the dropdown in security ticket dropdown to not include WMF Product and WMF Technology as two separate departments - https://phabricator.wikimedia.org/T384243#10847604 (Aklapper) If this data is not being used, then I propose to decline this task, not to collect thi...
[11:07:02] GitLab maintenance finished
[11:13:42] Beta-Cluster-Infrastructure, CXServer: CXServer doesn't work on Beta Cluster - https://phabricator.wikimedia.org/T323417#10847648 (Nikerabbit) p:Triage→Low
[11:14:46] Gerrit: Incorrect "CI has completed checks" popup appears when navigating from a change with tests in progress to one with no tests in progress - https://phabricator.wikimedia.org/T394485#10847653 (hashar) Thank you for the very two reports and verifications! Eventually I should overhaul the code to make...
[11:18:26] GitLab (Infrastructure), collaboration-services: Check GitLab artifact retention time - https://phabricator.wikimedia.org/T395014 (Jelto) NEW
[11:26:27] my emojis change has been merged :) (including my change to allow plugins to customise it further)
[11:36:17] Gerrit, Wikimedia-GitHub: Changes in github.com mirror behavior - BlueSpiceSmart{L|l}ist - https://phabricator.wikimedia.org/T394903#10847713 (hashar) Hi, I have restarted Gerrit yesterday at 2025-05-21 07:24:48 UTC which triggers a full replication of all repositories. The corresponding replication logs...
[11:37:20] !log gerrit: changed parent of mediawiki/extensions/BlueSpiceSmartlist (lower case L) to All-Archived-Projects to prevent it from being replicated to GitHub | T394903
[11:37:22] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[11:37:22] T394903: Changes in github.com mirror behavior - BlueSpiceSmart{L|l}ist - https://phabricator.wikimedia.org/T394903
[11:39:38] !log Triggered replication of mediawiki/extensions/BlueSpiceSmartlist and mediawiki/extensions/BlueSpiceSmartList to fix https://github.com/wikimedia/mediawiki-extensions-BlueSpiceSmartlist | T394903
[11:39:40] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[11:43:05] Gerrit, Wikimedia-GitHub: Changes in github.com mirror behavior - BlueSpiceSmart{L|l}ist - https://phabricator.wikimedia.org/T394903#10847735 (hashar) I have replicated: The lower case version which now only replicates to gerrit-replica.wikimedia.org: ` $ ssh -p 29418 gerrit.wikimedia.org replication st...
[11:43:28] Gerrit, Wikimedia-GitHub: Changes in github.com mirror behavior - BlueSpiceSmartList & BlueSpiceSmartlist - https://phabricator.wikimedia.org/T394903#10847743 (hashar)
[12:14:39] Beta-Cluster-Infrastructure, NavigationTiming, SRE Observability: navtiming: Loss of Kafka connection fills multiple log files with identical stack traces - https://phabricator.wikimedia.org/T391273#10847883 (Krinkle) For what it's worth, there never has been a local Kafka there and it wouldn't help...
[12:19:41] (CR) Hashar: [C:+2] Gerrit 3.10.6 and rebuild plugins [software/gerrit] (deploy/wmf/stable-3.10) - https://gerrit.wikimedia.org/r/1148868 (https://phabricator.wikimedia.org/T390666) (owner: Hashar)
[12:20:21] (Merged) jenkins-bot: Gerrit 3.10.6 and rebuild plugins [software/gerrit] (deploy/wmf/stable-3.10) - https://gerrit.wikimedia.org/r/1148868 (https://phabricator.wikimedia.org/T390666) (owner: Hashar)
[12:28:35] Release-Engineering-Team, collaboration-services: ProbeDown - https://phabricator.wikimedia.org/T395022 (phaultfinder) NEW
[12:28:36] FIRING: [2x] ProbeDown: Service gerrit1003:443 has failed probes (http_gerrit_tls_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#gerrit1003:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[12:29:58] Gerrit (Gerrit 3.10), Patch-For-Review: Upgrade to Gerrit 3.10.6 - https://phabricator.wikimedia.org/T390666#10847915 (hashar) Open→Resolved
[12:33:31] RESOLVED: [2x] ProbeDown: Service gerrit1003:443 has failed probes (http_gerrit_tls_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#gerrit1003:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[12:34:51] Gerrit (Gerrit 3.10), Upstream: Gerrit editor replaces tabs with spaces - https://phabricator.wikimedia.org/T355816#10847949 (hashar) @Reedy & @Winston_Sung I have upgraded to Gerrit 3.10.6 some minutes ago, that supposedly fixes the issue.
[12:38:15] Gerrit (Gerrit 3.10), Upstream: Gerrit editor replaces tabs with spaces - https://phabricator.wikimedia.org/T355816#10847956 (Winston_Sung) Tested and confirmed already fixed.
[12:44:30] Gerrit: Users of ProvieIt gadget get a 403 Forbidden fetching i18n files from Gerrit/Gitiles - https://phabricator.wikimedia.org/T394916#10847962 (Sophivorus) Hi! I'm happy to hear that fetching localization messages from Gitiles is a valid use case, but I think I will replace it for a more standard approach...
[12:56:44] GitLab (Infrastructure), Ceph, collaboration-services, Data-Persistence-Backup, and 3 others: Migrate gitlab storage to apus (also: backups from S3?) - https://phabricator.wikimedia.org/T378922#10848007 (Jelto) >>! In T378922#10847339, @MatthewVernon wrote: > Ah, the bucket is gone from `eqiad`,...
[12:56:54] Gerrit (Gerrit 3.10), Upstream: Inverted search operators are not autocompleted - https://phabricator.wikimedia.org/T388672#10848010 (hashar) Open→Resolved a:Paladox After upgrading to Gerrit 3.10.6 I can confirm searching `-owner:a` popups a lists of users matching `a`. That solved it....
[13:06:37] GitLab (Infrastructure), Ceph, collaboration-services, Data-Persistence-Backup, and 3 others: Migrate gitlab storage to apus (also: backups from S3?) - https://phabricator.wikimedia.org/T378922#10848027 (jcrespo) I am working on setting up the dedicated gitlab/gerrit storage host, but at the mome...
[13:10:10] Gerrit, Wikimedia-GitHub: Changes in github.com mirror behavior - BlueSpiceSmartList & BlueSpiceSmartlist - https://phabricator.wikimedia.org/T394903#10848037 (Osnard) Thank you very much! > What I wonder is why we/I did not delete the lower case version. Possibly because we want to keep an history of t...
[13:11:21] Gerrit (Gerrit 3.10), Upstream: Gerrit editor replaces tabs with spaces - https://phabricator.wikimedia.org/T355816#10848047 (Jdforrester-WMF) Open→Resolved a:Paladox
[13:22:28] (open) taavi: push: Fix hashtag logic being the wrong way around [repos/ci-tools/libup] - https://gitlab.wikimedia.org/repos/ci-tools/libup/-/merge_requests/77
[13:22:33] (update) taavi: push: Fix hashtag logic being the wrong way around [repos/ci-tools/libup] - https://gitlab.wikimedia.org/repos/ci-tools/libup/-/merge_requests/77
[13:24:40] (approved) jforrester: push: Fix hashtag logic being the wrong way around [repos/ci-tools/libup] - https://gitlab.wikimedia.org/repos/ci-tools/libup/-/merge_requests/77 (owner: taavi)
[13:25:11] (update) taavi: push: Fix hashtag logic being the wrong way around [repos/ci-tools/libup] - https://gitlab.wikimedia.org/repos/ci-tools/libup/-/merge_requests/77
[13:27:48] (merge) taavi: push: Fix hashtag logic being the wrong way around [repos/ci-tools/libup] - https://gitlab.wikimedia.org/repos/ci-tools/libup/-/merge_requests/77
[14:09:02] (PS1) Slyngshede: New Docker image, dotnet version 8 [integration/config] - https://gerrit.wikimedia.org/r/1149413
[14:10:42] (CR) CI reject: [V:-1] New Docker image, dotnet version 8 [integration/config] - https://gerrit.wikimedia.org/r/1149413 (owner: Slyngshede)
[14:11:40] (PS2) Slyngshede: New Docker image, dotnet version 8 [integration/config] - https://gerrit.wikimedia.org/r/1149413
[14:14:21] (CR) Slyngshede: "I've attempted to test build this locally, but docker-pkg doesn't appear to support cross-platform builds." [integration/config] - https://gerrit.wikimedia.org/r/1149413 (owner: Slyngshede)
[14:15:34] Phabricator, SecTeam-Processed: Change the dropdown in security ticket dropdown to not include WMF Product and WMF Technology as two separate departments - https://phabricator.wikimedia.org/T384243#10848430 (sbassett) I don't mind declining this task. I'm not sure about entirely removing the field. The...
[14:21:26] Project-Admins, Release-Engineering-Team, MediaWiki-extensions-General, MediaWiki-General, and 2 others: Create a security pre-release Phabricator policy manageable by the Security Team - https://phabricator.wikimedia.org/T393403#10848462 (sbassett) >>! In T393403#10846815, @Aklapper wrote: > I e...
[14:21:56] (PS3) Slyngshede: New Docker image, dotnet version 8 [integration/config] - https://gerrit.wikimedia.org/r/1149413 (https://phabricator.wikimedia.org/T395036)
[14:22:05] Phabricator, Release-Engineering-Team (Doing 😎): Call to a member function updateDatasourceTokens() on null at PhabricatorProjectFulltextEngine.php:20 (due to DB corruption) - https://phabricator.wikimedia.org/T345563#10848473 (Aklapper) a:Aklapper
[14:23:49] Phabricator, Release-Engineering-Team (Doing 😎): Call to a member function updateDatasourceTokens() on null at PhabricatorProjectFulltextEngine.php:20 (due to DB corruption) - https://phabricator.wikimedia.org/T345563#10848487 (Aklapper) Open→Resolved This problem was created in T177787#90235...
[14:24:42] (CR) Jforrester: "Images in this repo are meant to be very sparing, as we have to maintain them. Can you not use the pipeline system to control your own CI," [integration/config] - https://gerrit.wikimedia.org/r/1149413 (https://phabricator.wikimedia.org/T395036) (owner: Slyngshede)
[14:31:14] (merge) dancy: gitlab-runner: bump image version to alpine-v17.10.1 [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/471 (https://phabricator.wikimedia.org/T394953) (owner: jelto)
[14:35:37] !log Upgrading gitlab-runner to v17.10.1 in gitlab-cloud-runner staging
[14:42:26] !log Upgrading gitlab-runner to v17.10.1 in gitlab-cloud-runner production
[14:42:39] Project-Admins, Release-Engineering-Team, MediaWiki-extensions-General, MediaWiki-General, and 2 others: Create a security pre-release Phabricator policy manageable by the Security Team - https://phabricator.wikimedia.org/T393403#10848638 (Aklapper) > I think we'd be more concerned with setting t...
[14:44:13] dancy: your messages have a space in front of them which means they're not actually getting logged
[14:44:27] (CR) Krinkle: "This image will replace the `dotnet-mono612` image (for the countervandalism/CVNBot repo) once this new job is passing. That one can then " [integration/config] - https://gerrit.wikimedia.org/r/1149413 (https://phabricator.wikimedia.org/T395036) (owner: Slyngshede)
[14:44:28] d'oh! Thanks Taavi.
[14:45:04] !log Upgrade gitlab-runner to v17.10.1 in gitlab-cloud-runner (staging and production) T394953
[14:45:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:45:21] (CR) Krinkle: New Docker image, dotnet version 8 (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/1149413 (https://phabricator.wikimedia.org/T395036) (owner: Slyngshede)
[14:45:57] (CR) Jforrester: "Yes, that one shouldn't be here either. See e.g. how we moved out rust etc. images as not really appropriate for this infrastructure." [integration/config] - https://gerrit.wikimedia.org/r/1149413 (https://phabricator.wikimedia.org/T395036) (owner: Slyngshede)
[14:54:11] (CR) Krinkle: "I don't understand what is being proposed. This is running linting and unit tests for a dotnet tool." [integration/config] - https://gerrit.wikimedia.org/r/1149413 (https://phabricator.wikimedia.org/T395036) (owner: Slyngshede)
[14:55:20] (open) dancy: branch.py: Increase GERRIT_TIMEOUT to 1.25 hour [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/177
[14:55:24] (update) dancy: branch.py: Increase GERRIT_TIMEOUT to 1.25 hour [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/177
[14:57:50] (update) dancy: branch.py: Increase GERRIT_TIMEOUT to 1.25 hour [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/177
[15:01:41] Gerrit, Release-Engineering-Team (Doing 😎), collaboration-services: Enable and explore Google search console for Gerrit - https://phabricator.wikimedia.org/T392669#10848798 (hashar) There is the task about having Google to crawl Gerrit code reviews and have Timo (& others) knowledge exposed to the wo...
[15:04:35] (CR) Jforrester: "> I don't understand what is being proposed." [integration/config] - https://gerrit.wikimedia.org/r/1149413 (https://phabricator.wikimedia.org/T395036) (owner: Slyngshede)
[15:16:43] Beta-Cluster-Infrastructure, CirrusSearch, Discovery-Search, Data-Platform-SRE (2025.05.02 - 2025.05.23), Puppet: Puppet failing on deployment-cirrussearch{12,13,14}.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T393924#10848905 (dancy) I want to attempt to make p...
[15:23:52] Beta-Cluster-Infrastructure, CirrusSearch, Discovery-Search, Data-Platform-SRE (2025.05.02 - 2025.05.23), Puppet: Puppet failing on deployment-cirrussearch{12,13,14}.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T393924#10848955 (dancy) I ended up adding an entry...
[15:41:00] FIRING: [3x] PuppetAgentNoResources: No Puppet resources found on instance deployment-cirrussearch12 on project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[15:46:00] FIRING: [3x] PuppetAgentNoResources: No Puppet resources found on instance deployment-cirrussearch12 on project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[15:47:55] Beta-Cluster-Infrastructure, CirrusSearch, Discovery-Search, Data-Platform-SRE (2025.05.02 - 2025.05.23), Puppet: Puppet failing on deployment-cirrussearch{12,13,14}.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T393924#10849133 (dancy) Notes: `modules/role/manif...
[15:51:00] RESOLVED: [3x] PuppetAgentNoResources: No Puppet resources found on instance deployment-cirrussearch12 on project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[15:59:01] Continuous-Integration-Infrastructure (Zuul upgrade), collaboration-services, SRE, SRE-Access-Requests, Patch-For-Review: create new admin group for "zuul devs" - https://phabricator.wikimedia.org/T394819#10849192 (thcipriani) Re-using `contint-roots` makes sense here. I note that that also p...
[16:17:09] (open) cgoubert: mediawiki-cli: Install fonts-freefont-ttf [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/178
[16:17:40] Beta-Cluster-Infrastructure, CirrusSearch, Discovery-Search, Data-Platform-SRE (2025.05.02 - 2025.05.23), Puppet: Puppet failing on deployment-cirrussearch{12,13,14}.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T393924#10849264 (dancy) I ran: ` $ for n in 12 13 1...
[16:22:28] FIRING: PuppetAgentFailure: Puppet agent failure detected on instance deployment-cirrussearch13 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[16:32:28] FIRING: [2x] PuppetAgentFailure: Puppet agent failure detected on instance deployment-cirrussearch12 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[16:37:28] FIRING: [2x] PuppetAgentFailure: Puppet agent failure detected on instance deployment-cirrussearch12 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[16:40:42] GitLab (CI & Job Runners), Release-Engineering-Team, Essential-Work: buildkit v0.21.1 released - https://phabricator.wikimedia.org/T393731#10849368 (dancy) Since the buildkit 0.21.1 upgrade there have been some cases where buildkit's storage directory has been filling up. Examples: * https://gi...
[16:41:16] Continuous-Integration-Infrastructure, Jenkins: CasC configuration for CI Jenkins - https://phabricator.wikimedia.org/T328920#10849371 (jnuche) Open→Declined //There's no CasC configuration for Jenkins CI, only Zuul//: https://phabricator.wikimedia.org/project/view/7592/ We are currently wor...
[16:41:49] Continuous-Integration-Infrastructure, Jenkins: CasC configuration for CI Jenkins - https://phabricator.wikimedia.org/T328920#10849377 (jnuche) a:jnuche→None
[16:43:19] Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team (Priority Backlog 📥), collaboration-services: Automate integration Jenkins deployment and config changes - https://phabricator.wikimedia.org/T319406#10849389 (jnuche) Open→Resolved Only remaining subtask has now...
[16:46:30] Release-Engineering-Team (Priority Backlog 📥), Scap, Patch-For-Review: backport is showing confusing prompt under certain conditions - https://phabricator.wikimedia.org/T360291#10849407 (jnuche) a:jnuche→None
[16:47:28] RESOLVED: [2x] PuppetAgentFailure: Puppet agent failure detected on instance deployment-cirrussearch12 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[17:11:39] (merge) swfrench: mwscript-mwcron: historical compatibility logging mode [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/176
[17:18:03] (CR) BryanDavis: [C:+2] jjb: Skip first notif for Selenium jobs on beta [integration/config] - https://gerrit.wikimedia.org/r/1147076 (https://phabricator.wikimedia.org/T394551) (owner: BryanDavis)
[17:19:30] (Merged) jenkins-bot: jjb: Skip first notif for Selenium jobs on beta [integration/config] - https://gerrit.wikimedia.org/r/1147076 (https://phabricator.wikimedia.org/T394551) (owner: BryanDavis)
[17:22:00] (update) swfrench: make-container-image: stop building the "publish" flavour [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/172 (https://phabricator.wikimedia.org/T391057)
[17:24:28] (merge) swfrench: make-container-image: stop building the "publish" flavour [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/172 (https://phabricator.wikimedia.org/T391057)
[17:25:44] !log `./jjb-update 'selenium-daily-beta*-MediaWiki'` to deploy updates to selenium-daily-beta-MediaWiki and selenium-daily-betacommons-MediaWiki failure notifications (T394551)
[17:25:45] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:25:47] T394551: Only alert when selenium-daily-{sitename}-MediaWiki tests fail on consecutive runs - https://phabricator.wikimedia.org/T394551
[17:31:43] Continuous-Integration-Config, Testing Support, Test-Platform (sesa): Only alert when selenium-daily-{sitename}-MediaWiki tests fail on consecutive runs - https://phabricator.wikimedia.org/T394551#10849638 (bd808) Open→Resolved a:bd808 https://integration.wikimedia.org/ci/view/seleniu...
[18:23:15] GitLab (CI & Job Runners), Release-Engineering-Team, Essential-Work: buildkit v0.21.1 released - https://phabricator.wikimedia.org/T393731#10850048 (dancy) >>! In T393731#10849368, @dancy wrote: > Since the buildkit 0.21.1 upgrade there have been some cases where buildkit's storage directory has...
[18:56:24] Continuous-Integration-Infrastructure (Zuul upgrade), collaboration-services, SRE, SRE-Access-Requests, Patch-For-Review: create new admin group for "zuul devs" - https://phabricator.wikimedia.org/T394819#10850226 (Dzahn) Thanks for this confirmation and the chat in the meeting today. In thi...
[19:10:34] (close) swfrench: mwscript-mwcron: Make logging optional [repos/releng/release] - https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/175 (owner: rzl)
[19:26:11] you probably are aware there are the 2 types of "docker compose", right? like the old one and the new one that is a plugin of docker. (v1 vs v2, Python vs Go). One, docker-compose, comes in standard Debian repos.. and the other, docker-compose-plugin, has a license incompatible with Debian.. there is a deb as well. but it only comes in docker.com upstream APT repo. Gotta decide again _HOW MUCH_
[19:26:18] it mattered to have the newer one.
[19:35:49] (open) brennen: Merge phorge/2024.35 into wmf/stable [repos/phabricator/arcanist] (wmf/stable) - https://gitlab.wikimedia.org/repos/phabricator/arcanist/-/merge_requests/4 (https://phabricator.wikimedia.org/T370266)
[19:38:08] (open) brennen: update submodules for upstream 2024.35 merge [repos/phabricator/deployment] (wmf/stable) - https://gitlab.wikimedia.org/repos/phabricator/deployment/-/merge_requests/69 (https://phabricator.wikimedia.org/T370266)
[19:39:21] Project-Admins, Release-Engineering-Team, MediaWiki-extensions-General, MediaWiki-General, and 2 others: Create a security pre-release Phabricator policy manageable by the Security Team - https://phabricator.wikimedia.org/T393403#10850405 (Mstyles) Since this would only be for a few tasks per yea...
[20:05:00] Gerrit, Release-Engineering-Team, collaboration-services: Rename gerrit2 unix user to gerrit and assign a fixed uid - https://phabricator.wikimedia.org/T338470#10850475 (Dzahn) a:Dzahn→None
[20:07:18] mutante: docker-compose v1 (the python script version) is dead tech.
[20:07:39] podman should have good docker-compose v2 support
[20:09:01] unfortunately podman was previously ruled out for $reasons
[20:09:48] if docker compose v2 is absolutely needed we need to figure something out to import it
[20:10:10] as the license is not compatible with Debian apparently
[20:10:36] in that case will talk to infra-sec
[20:11:11] !log devtools: phorge: test deploying work/merge-phorge-2024.35 changes
[20:11:12] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:12:55] (open) dancy: 0.7.0: Add a Grafana dashboard monitoring disk usage [repos/releng/buildkit-chart] - https://gitlab.wikimedia.org/repos/releng/buildkit-chart/-/merge_requests/2
[20:12:58] (update) dancy: 0.7.0: Add a Grafana dashboard monitoring disk usage [repos/releng/buildkit-chart] - https://gitlab.wikimedia.org/repos/releng/buildkit-chart/-/merge_requests/2
[20:19:13] mutante: I haven't seen the docker-compose file itself. It might be compatible with the v1 schemas, but the python script itself has been abandonware from upstream for like 3 years now.
[20:21:04] bd808: ACK! *nod* will find out
[20:26:42] (update) dancy: 0.7.0: Add a Grafana dashboard monitoring disk usage [repos/releng/buildkit-chart] - https://gitlab.wikimedia.org/repos/releng/buildkit-chart/-/merge_requests/2
[20:27:57] (merge) dancy: 0.7.0: Add a Grafana dashboard monitoring disk usage [repos/releng/buildkit-chart] - https://gitlab.wikimedia.org/repos/releng/buildkit-chart/-/merge_requests/2
[20:29:58] Release-Engineering-Team (Radar), collaboration-services, Traffic, Patch-For-Review: Separate Gerrit https and ssh/git hostnames - https://phabricator.wikimedia.org/T394271#10850552 (thcipriani) I'd really love to keep the same url to access gerrit regardless of protocol. That's the expected work...
[20:37:50] Release-Engineering-Team, collaboration-services: ProbeDown - gerrit1003 - https://phabricator.wikimedia.org/T395022#10850609 (Dzahn)
[20:54:27] Release-Engineering-Team, collaboration-services: ProbeDown - gerrit1003 - https://phabricator.wikimedia.org/T395022#10850649 (Dzahn) Open→Resolved a:Dzahn This was a Gerrit version upgrade where it was forgotten to add a downtime.
[20:55:31] GitLab (CI & Job Runners), Release-Engineering-Team: Modernize buildkitd's GC settings - https://phabricator.wikimedia.org/T395091 (dancy) NEW
[21:09:11] !log Added `block_help: "see https://wikitech.wikimedia.org/wiki/Beta/Blocked_help for more information."` under `profile::cache::varnish::frontend::fe_vcl_config` in both deployment-cache-text and deployment-cache-upload Prefix Puppet (T393404)
[21:09:13] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:09:13] T393404: Beta cluster IP block page should not point to noc@wikimedia.org - https://phabricator.wikimedia.org/T393404
[21:09:32] !log Cherry-picked https://gerrit.wikimedia.org/r/c/operations/puppet/+/1143602 (T393404)
[21:09:34] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:12:53] !log Forced Puppet run and restarted varnins-frontend on deployment-cache-text08 to pick up new config (T393404)
[21:12:54] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:15:52] !log Forced Puppet run and restarted varnins-frontend on deployment-cache-upload08 to pick up new config (T393404)
[21:15:54] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:15:54] T393404: Beta cluster IP block page should not point to noc@wikimedia.org - https://phabricator.wikimedia.org/T393404
[21:24:53] Continuous-Integration-Infrastructure (Zuul upgrade), collaboration-services, SRE, SRE-Access-Requests, Patch-For-Review: give contint-roots access to new zuul VMs (was: create new admin group for "zuul devs") - https://phabricator.wikimedia.org/T394819#10850765 (Dzahn) Open→In progress
[21:48:09] Continuous-Integration-Infrastructure, Jenkins, LDAP-Access-Requests, SRE, and 2 others: Grant Jenkins admin rights to Peter Hedenskog (QTE) - https://phabricator.wikimedia.org/T394749#10850823 (Dzahn) a:thcipriani
[21:58:11] Continuous-Integration-Infrastructure, Jenkins, LDAP-Access-Requests, SRE, and 2 others: Grant Jenkins admin rights to Peter Hedenskog (QTE) - https://phabricator.wikimedia.org/T394749#10850839 (thcipriani) a:thcipriani→None Approved as keeper of contint-admins. Also, I am @Peter's manag...
[22:16:34] Continuous-Integration-Infrastructure, Jenkins, LDAP-Access-Requests, SRE, and 2 others: Grant Jenkins admin rights to Peter Hedenskog (QTE) - https://phabricator.wikimedia.org/T394749#10850867 (Dzahn) @hashar So it requires 2 things, membership in LDAP group ciadmin and also shell access with co...
[22:17:04] Continuous-Integration-Infrastructure, Jenkins, LDAP-Access-Requests, SRE, and 2 others: Grant Jenkins admin rights to Peter Hedenskog (QTE) - https://phabricator.wikimedia.org/T394749#10850868 (Dzahn) I already did the LDAP group membership just now after Tyler's approval.
[22:34:09] Continuous-Integration-Infrastructure, Jenkins, LDAP-Access-Requests, SRE, SRE-Access-Requests: Grant Jenkins admin rights to Peter Hedenskog (QTE) - https://phabricator.wikimedia.org/T394749#10850897 (Dzahn) Open→Resolved a:Dzahn Done. Peter has a shell user on contint* machi...
[23:19:51] (03update) 10dduvall: spiderpig: Support MediaWiki train deployments [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/808 (https://phabricator.wikimedia.org/T392610) [23:19:58] (03update) 10dduvall: spiderpig: Support MediaWiki train deployments [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/808 (https://phabricator.wikimedia.org/T392610) [23:20:01] (03update) 10dduvall: spiderpig: Support MediaWiki train deployments [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/808 (https://phabricator.wikimedia.org/T392610) [23:22:14] (03open) 10dancy: 0.7.1: dashboard.json: Disable instant datapoints [repos/releng/buildkit-chart] - 10https://gitlab.wikimedia.org/repos/releng/buildkit-chart/-/merge_requests/3 [23:22:18] (03update) 10dancy: 0.7.1: dashboard.json: Disable instant datapoints [repos/releng/buildkit-chart] - 10https://gitlab.wikimedia.org/repos/releng/buildkit-chart/-/merge_requests/3 [23:23:33] (03merge) 10dancy: 0.7.1: dashboard.json: Disable instant datapoints [repos/releng/buildkit-chart] - 10https://gitlab.wikimedia.org/repos/releng/buildkit-chart/-/merge_requests/3 [23:38:31] 06Release-Engineering-Team: Recent incidents of buildkitd's storage volume filling up - https://phabricator.wikimedia.org/T395097 (10dancy) 03NEW [23:40:15] 10GitLab (CI & Job Runners), 06Release-Engineering-Team, 07Essential-Work: buildkit v0.21.1 released - https://phabricator.wikimedia.org/T393731#10850963 (10dancy) The buildkitd volume issue is moved to T395097. [23:45:30] 06Release-Engineering-Team: Recent incidents of buildkitd's storage volume filling up - https://phabricator.wikimedia.org/T395097#10850969 (10dancy) On a 7-day view, https://grafana.cloud.releng.team/d/demojg23eidq8c/buildkitd shows regular daily spikes where a buildkitd volume maxes out, starting 2025-05-19. T... 
[23:45:39] 10GitLab (CI & Job Runners), 06Release-Engineering-Team: Recent incidents of buildkitd's storage volume filling up - https://phabricator.wikimedia.org/T395097#10850970 (10dancy) [23:57:40] (03update) 10dduvall: spiderpig: Support MediaWiki train deployments [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/808 (https://phabricator.wikimedia.org/T392610) [23:57:45] (03update) 10dduvall: spiderpig: Support MediaWiki train deployments [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/808 (https://phabricator.wikimedia.org/T392610) [23:57:48] (03update) 10dduvall: spiderpig: Support MediaWiki train deployments [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/808 (https://phabricator.wikimedia.org/T392610) [23:58:02] 10Phabricator, 06Release-Engineering-Team, 06collaboration-services, 06DBA: Prepare a database test for m3 - https://phabricator.wikimedia.org/T390034#10850978 (10thcipriani) >>! In T390034#10820826, @Marostegui wrote: > @brennen @Aklapper @Dzahn what is the status of this? Any ETA on when the tests wi...