[00:17:42] Project beta-scap-sync-world build #177487: FAILURE in 2 min 23 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/177487/
[00:28:16] Yippee, build fixed!
[00:28:17] Project beta-scap-sync-world build #177488: FIXED in 2 min 37 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/177488/
[01:52:03] (approved) dancy: jobs: Clone local mediawiki/core reference repo with git commands [repos/releng/jenkins-deploy] - https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/94 (owner: dduvall)
[05:23:33] FIRING: DatasourceError: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[05:33:33] RESOLVED: DatasourceError: Queue (Jenkins jobs + Zuul functions) alert - https://grafana.wikimedia.org/alerting/grafana/iS0FSjJ4z/view - https://wikitech.wikimedia.org/wiki/Monitoring/DatasourceError - https://alerts.wikimedia.org/?q=alertname%3DDatasourceError
[07:02:04] Continuous-Integration-Infrastructure, SRE, SRE-Access-Requests, Patch-For-Review: Grant bd808 membership in the contint-roots and contint-docker groups - https://phabricator.wikimedia.org/T377792#10252751 (Bmueller) Approved - thanks @Dzahn!
[08:57:25] I think https://phabricator.wikimedia.org/T377912 needs more eyes on it (possibly an emergency)
[09:10:11] Continuous-Integration-Infrastructure, SRE, SRE-Access-Requests, Patch-For-Review: Grant bd808 membership in the contint-roots and contint-docker groups - https://phabricator.wikimedia.org/T377792#10253156 (hnowlan) Open→Resolved a:hnowlan Merged!
[09:32:32] (plz ignore my previous message, was an overreaction)
[09:49:41] Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments: 1.43.0-wmf.28 deployment blockers - https://phabricator.wikimedia.org/T375659#10253379 (Michael)
[09:55:55] ah nodejs
[09:55:58] well npm
[11:54:17] GitLab (Infrastructure), Release-Engineering-Team (Radar), ChangeProp, collaboration-services, and 10 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10253657 (jijiki)
[12:11:33] GitLab (Infrastructure), Release-Engineering-Team (Radar), ChangeProp, cloud-services-team, and 11 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10253695 (jijiki)
[12:35:46] (PS1) Lucas Werkmeister (WMDE): Zuul: [mediawiki/extensions/EntitySchema] Add WikibaseCirrusSearch phan dep [integration/config] - https://gerrit.wikimedia.org/r/1082460 (https://phabricator.wikimedia.org/T376250)
[12:38:50] (CR) Lucas Werkmeister (WMDE): "Not particularly urgent but would be nice to have :)" [integration/config] - https://gerrit.wikimedia.org/r/1082460 (https://phabricator.wikimedia.org/T376250) (owner: Lucas Werkmeister (WMDE))
[13:02:41] (CR) Urbanecm: [C:+1] "LGTM" [integration/config] - https://gerrit.wikimedia.org/r/1082263 (https://phabricator.wikimedia.org/T374428) (owner: Michael Große)
[13:35:01] GitLab (Infrastructure), Release-Engineering-Team (Radar), ChangeProp, cloud-services-team, and 11 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10254048 (bking) Forgive the drive-by comment, but at the 6-month anniversa...
[13:38:38] Release-Engineering-Team, Data Products, Data-Platform-SRE, Dumps-Generation, and 2 others: Migrate current-generation dumps to run from our containerized images - https://phabricator.wikimedia.org/T352650#10254054 (Joe) I think @BTullis' idea is great - there's a few unknowns regarding how to ke...
[13:41:19] Release-Engineering-Team, Data Products, Data-Platform-SRE, Dumps-Generation, and 2 others: Migrate current-generation dumps to run from our containerized images - https://phabricator.wikimedia.org/T352650#10254068 (akosiaris) >>! In T352650#10252263, @BTullis wrote: > I'm keen to hear your feedb...
[14:11:03] Release-Engineering-Team (Priority Backlog 📥), Scap, SRE Observability: Scap prometheus migration: Reduce the cardinality of scap timers/statsd metrics - https://phabricator.wikimedia.org/T377883#10254226 (lmata)
[14:17:48] WikimediaDebug, Patch-For-Review, SRE Observability (FY2024/2025-Q2), Wikimedia-production-error: On-demand excimer profiling does not work for URLs longer than 255 bytes - https://phabricator.wikimedia.org/T377433#10254278 (lmata)
[14:23:52] Release-Engineering-Team (Radar), CAS-SSO, Infrastructure-Foundations, SRE Observability: Document how to authenticate a bot account through CAS-SSO - https://phabricator.wikimedia.org/T377372#10254313 (lmata) Hi @bd808 during our team sync we discussed this and dont have a good answer. This feel...
[14:32:41] Continuous-Integration-Infrastructure, Jenkins, Castor, Upstream: castor-save-workspace-cache aborted during postbuild - https://phabricator.wikimedia.org/T352319#10254357 (hashar) I have updated the postbuild script on the CI Jenkins since my PR got merged and a release has been created. For th...
[14:43:20] (CR) Jaime Nuche: [C:+2] Fix stacktrace blame links [releng/phatality] - https://gerrit.wikimedia.org/r/1082242 (owner: Dduvall)
[14:43:46] (Merged) jenkins-bot: Fix stacktrace blame links [releng/phatality] - https://gerrit.wikimedia.org/r/1082242 (owner: Dduvall)
[14:50:35] (PS1) TechieNK: Add TechieNK to trusted users [integration/config] - https://gerrit.wikimedia.org/r/1082489
[15:09:24] oh joy
[15:11:05] Release-Engineering-Team, Data Products, Data-Platform-SRE, Dumps-Generation, and 2 others: Migrate current-generation dumps to run from our containerized images - https://phabricator.wikimedia.org/T352650#10254536 (Ottomata) +1, this is a great idea.
[15:12:07] (merge) dduvall: jobs: Clone local mediawiki/core reference repo with git commands [repos/releng/jenkins-deploy] - https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/94
[15:34:15] (open) dduvall: jobs: Abandon open changes when deleting `wmf/next` [repos/releng/jenkins-deploy] - https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/95
[15:34:22] (update) dduvall: jobs: Abandon open changes when deleting `wmf/next` [repos/releng/jenkins-deploy] - https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/95
[15:34:23] (update) dduvall: jobs: Abandon open changes when deleting `wmf/next` [repos/releng/jenkins-deploy] - https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/95
[15:38:25] Continuous-Integration-Infrastructure, Jenkins, Castor, Upstream: castor-save-workspace-cache aborted during postbuild - https://phabricator.wikimedia.org/T352319#10254732 (hashar) I have restarted the CI Jenkins | Parameterized Trigger | 806.vf6fff3e28c3e-31-g9f7e52b-pr400 | PostBuildScript |...
[15:41:17] (approved) dancy: jobs: Abandon open changes when deleting `wmf/next` [repos/releng/jenkins-deploy] - https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/95 (owner: dduvall)
[15:41:28] (merge) dduvall: jobs: Abandon open changes when deleting `wmf/next` [repos/releng/jenkins-deploy] - https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/95
[15:53:02] Project beta-code-update-eqiad build #518739: FAILURE in 10 min: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/518739/
[15:54:08] >16:53:01 15:53:01 prep failed: Failed to acquire lock after waiting for 10 minute(s); concurrent prep is locked by jenkins-deploy (pid 3319583) on Wed Oct 23 15:35:43 2024; reason is "beta-scap-sync-world (build #177576)".
[15:54:16] jenkins restart related?
[15:58:43] (PS1) Hashar: Refresh list of Jenkins plugins used for testing [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082502
[16:02:01] (CR) Hashar: "That is a sync up of the plugins currently deployed on https://integration.wikimedia.org/ci/ . I did the same thing last year with I3b2d26" [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082502 (owner: Hashar)
[16:03:03] Project beta-code-update-eqiad build #518740: STILL FAILING in 10 min: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/518740/
[16:03:44] well that build in the scap lock message is https://integration.wikimedia.org/ci/job/beta-scap-sync-world/177576/console
[16:04:01] started at 15:35:42
[16:04:01] 15:35:47 FATAL: command execution failed
[16:04:33] and 15:35:47 org.pircbotx.exception.DaoException: UNKNOWN_CHANNEL: #wikimedia-releng
[16:04:37] so we never got the notification
[16:04:47] cause I guess the plugin already disconnected from IRC
[16:05:15] I have no idea what has happened to the running process
[16:05:21] it got a SIGKILL maybe?
[16:05:27] in which case the lock is still there
[16:06:35] I don't even know where the lock is
[16:07:32] -rw-rw-rw- 1 jenkins-deploy wikidev 0 Aug 1 23:03 /var/lock/scap-global-lock
[16:09:43] (CR) Dduvall: [C:-1] "Strange. works for me. Also GNU make 4.3." [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082502 (owner: Hashar)
[16:13:03] Project beta-code-update-eqiad build #518741: STILL FAILING in 10 min: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/518741/
[16:13:14] The last Puppet run was at Fri Oct 4 07:34:56 UTC 2024 (27878 minutes ago).
[16:13:16] pff
[16:14:11] lol
[16:14:27] scap lock --unlock-all "Release all locks after Jenkins got restarted and SIGKILL deployment"
[16:14:33] 16:14:16 lock failed: Expecting value: line 1 column 1 (char 0) (scap version: 4.101.1-1)
[16:14:37] so yeah you now..
[16:14:40] know
[16:16:47] OH
[16:16:59] json one /var/lock/scap.srv_mediawiki-staging.lock
[16:17:32] !log deployment-prep: sudo rm /var/lock/scap-global-lock # it is an empty file, scap now expects a json payload
[16:17:33] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:17:59] !log deployment-prep: scap lock --unlock-all "Release all locks after Jenkins got restarted and SIGKILL deployment"
[16:18:19] which hm
[16:18:22] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:18:23] did not fucking remove the lock files
[16:18:25] ...
[16:18:56] !log deployment-prep: sudo rm /var/lock/scap*
[16:18:57] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:19:19] Project beta-code-update-eqiad build #518742: ABORTED in 6 min 15 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/518742/
[16:19:38] ah that made it
[16:19:42] it is running again
[16:19:45] Reedy: solved
[16:19:51] thanks to `strace -e trace=file`
[16:20:23] and `scap lock --unlock-all` gave me a:
[16:20:23] 16:17:44 No global lock set. Nothing to do
[16:20:35] :)
[16:20:38] which is written in GREEN and thus my brain did not register it
[16:20:58] cause `--unlock-all` is ONLY for the global lock
[16:21:07] if you want to unlock all locks you need to use `--all`
[16:21:09] ...
[16:21:11] * hashar eek
[16:21:35] Yippee, build fixed!
[16:21:35] Project beta-code-update-eqiad build #518743: FIXED in 2 min 7 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/518743/
[16:22:10] T95395
[16:22:21] Do not say "< wmf-insecte> Yippee, build fixed!"
[16:23:09] T95395: Do not say "< wmf-insecte> Yippee, build fixed!" - https://phabricator.wikimedia.org/T95395
[16:24:29] Yippee, build fixed!
[16:24:29] Project beta-scap-sync-world build #177577: FIXED in 2 min 54 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/177577/
[16:37:08] \o/
[16:37:12] thanks for flying with Antoine
[16:37:26] Reedy: beta jobs got fixed \o/
[17:03:07] Phabricator, Tool-ldap: https://ldap.toolforge.org/ integration assumes that `cn` and `uid` are equivalent - https://phabricator.wikimedia.org/T376769#10255342 (Legoktm) Open→Resolved The Phabricator part of this was deployed, so we should be all set here!
[17:03:42] Phabricator (2024-10-22), Tool-ldap: https://ldap.toolforge.org/ integration assumes that `cn` and `uid` are equivalent - https://phabricator.wikimedia.org/T376769#10255345 (Pppery)
[17:04:17] (PS2) Hashar: Refresh list of Jenkins plugins used for testing [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082502
[17:04:28] (CR) Hashar: "Of course `parameterized-trigger:806.vf6fff3e28c3e-31-g9f7e52b-pr400` is an unreleased fork I have made for T352319." [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082502 (owner: Hashar)
[17:10:49] (PS1) Hashar: Archive mediawiki/tools/cli [integration/config] - https://gerrit.wikimedia.org/r/1082519 (https://phabricator.wikimedia.org/T288502)
[17:23:58] (PS1) Dduvall: systemtests: Add /var/lib/git directories to git `safe.directories` [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082521
[17:24:48] (PS3) Dduvall: Refresh list of Jenkins plugins used for testing [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082502 (owner: Hashar)
[17:25:24] (Abandoned) Dduvall: Support promote step in systemtests [integration/pipelinelib] - https://gerrit.wikimedia.org/r/628239 (owner: Jeena Huneidi)
[17:25:52] dduvall: ahhrg https://gerrit.wikimedia.org/r/c/integration/pipelinelib/+/1082521/1
[17:26:03] yeah there is a whole lot of issues with that safe.directory thing :]
[17:26:26] users mismatch
[17:26:34] anyway, thanks for taking the trouble to run the systemtests !
[17:26:37] I am off for dinner
[17:26:50] (CR) Dduvall: [C:+2] systemtests: Add /var/lib/git directories to git `safe.directories` [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082521 (owner: Dduvall)
[17:27:11] (CR) Dduvall: [C:+2] Refresh list of Jenkins plugins used for testing [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082502 (owner: Hashar)
[17:27:19] \o/
[17:27:19] (Merged) jenkins-bot: systemtests: Add /var/lib/git directories to git `safe.directories` [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082521 (owner: Dduvall)
[17:27:34] hashar: thanks for the patch!
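Editorial note on the stale-lock incident logged between 15:54 and 16:21: the `Expecting value: line 1 column 1 (char 0)` error matches what Python's `json` module raises when asked to parse an empty string, which fits the zero-byte `/var/lock/scap-global-lock` file found in the listing. The sketch below is not scap's implementation; the function names `classify_lock_file`, `holder_pid` and `pid_is_alive` are hypothetical, and it makes no assumption about the fields inside a real scap lock payload. It only shows one way to tell an empty lock file from a JSON one, and to check whether the pid quoted in a "locked by ... (pid N)" message still exists before removing anything by hand.

```python
import json
import os
import re
from pathlib import Path


def classify_lock_file(path):
    """Return 'missing', 'empty', 'invalid-json' or 'json' for a lock file.

    Hypothetical helper: it inspects the file only and never deletes it.
    """
    p = Path(path)
    if not p.exists():
        return "missing"
    text = p.read_text()
    if not text.strip():
        # A zero-byte file is what makes json.loads() fail with
        # "Expecting value: line 1 column 1 (char 0)".
        return "empty"
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return "invalid-json"
    return "json"


def holder_pid(lock_message):
    """Extract N from a '(pid N)' fragment of an error message, else None."""
    match = re.search(r"\(pid (\d+)\)", lock_message)
    return int(match.group(1)) if match else None


def pid_is_alive(pid):
    """POSIX only: signal 0 tests for existence without sending a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process, so the lock holder is gone
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

A lock whose recorded pid is no longer alive, or whose file is empty where a JSON payload is expected, is a candidate for manual cleanup; whether to remove it remains an operator decision.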
[17:27:43] (Merged) jenkins-bot: Refresh list of Jenkins plugins used for testing [integration/pipelinelib] - https://gerrit.wikimedia.org/r/1082502 (owner: Hashar)
[17:27:54] \o/
[17:51:34] Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments: 1.43.0-wmf.28 deployment blockers - https://phabricator.wikimedia.org/T375659#10255528 (Urbanecm_WMF)
[18:12:06] Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, Release, Train Deployments: 1.43.0-wmf.28 deployment blockers - https://phabricator.wikimedia.org/T375659#10255683 (dancy)
[20:37:50] Project beta-scap-sync-world build #177602: FAILURE in 2 min 32 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/177602/
[20:38:38] 21:37:50 [46 hits] PHP Notice: Undefined variable: namespace
[20:38:41] Looks like it's back again
[20:48:21] Yippee, build fixed!
[20:48:22] Project beta-scap-sync-world build #177603: FIXED in 2 min 39 sec: https://integration.wikimedia.org/ci/job/beta-scap-sync-world/177603/
[21:03:44] 20:37:50 20:37:50 Logstash checker Counted 46 error(s) in the last 20 seconds. The threshold is 10.
[21:03:47] nice
[21:05:21] I filed a bug about that output in that specific CI job being... unhelpful :)
[21:09:15] Phabricator (2024-10-22), Tool-ldap: https://ldap.toolforge.org/ integration assumes that `cn` and `uid` are equivalent - https://phabricator.wikimedia.org/T376769#10256481 (bd808) >>! In T376769#10255342, @Legoktm wrote: > The Phabricator part of this was deployed, so we should be all set here! Tha...
[21:40:56] did you change your phabricator team tags recently?
[21:41:06] oh, nevermind. ignore that :)
[23:36:32] Gerrit, Release-Engineering-Team, collaboration-services: Rename gerrit2 unix user to gerrit and assign a fixed uid - https://phabricator.wikimedia.org/T338470#10256889 (Dzahn) On gerrit2003 (not in production yet): ` - Notice: /Stage[main]/Ssh::Server/File[/etc/ssh/userkeys/gerrit2]/ensure: remov...