[05:50:43] (03update) 10sg912: Adding CIM repo [repos/releng/gitlab-trusted-runner] - 10https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/84 [06:48:11] Project beta-code-update-eqiad build #501674: 04FAILURE in 5 min 10 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/501674/ [06:48:26] hello, is somebody around? gerrit seems to have issues [06:53:19] its back! [06:55:15] Yippee, build fixed! [06:55:15] Project beta-code-update-eqiad build #501675: 09FIXED in 2 min 14 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/501675/ [07:21:36] (03approved) 10jnuche: scap clean: Add --dry-run mode [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/363 (owner: 10dancy) [07:27:49] (03update) 10jnuche: branch-cut-test-patches: clean up MW checkouts [repos/releng/release] - 10https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/83 (https://phabricator.wikimedia.org/T368239) [07:27:54] (03update) 10jnuche: branch-cut-test-patches: clean up MW checkouts [repos/releng/release] - 10https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/83 (https://phabricator.wikimedia.org/T368239) [08:47:56] (03merge) 10aklapper: Revert "Add metadata to the gettasktransactions conduit method response" [repos/phabricator/phabricator] (wmf/stable) - 10https://gitlab.wikimedia.org/repos/phabricator/phabricator/-/merge_requests/61 (https://phabricator.wikimedia.org/T364728) [08:48:27] 10Phabricator (phabricator-next), 07Technical-Debt: Revert or upstream rPHABf2fd14dc1edeb41aa2874336548cfaa7fa0e87a0 (maniphest.gettasktransactions API) - https://phabricator.wikimedia.org/T364728#9920507 (10Aklapper) [08:48:53] 10Phabricator (phabricator-next), 07Technical-Debt: Revert or upstream rPHABf2fd14dc1edeb41aa2874336548cfaa7fa0e87a0 (maniphest.gettasktransactions API) - https://phabricator.wikimedia.org/T364728#9920508 (10Aklapper) a:03Aklapper [09:01:42] (03update) 10aklapper: 
Indent JSON files with tab instead of two spaces [repos/phabricator/arcanist] (wmf/stable) - 10https://gitlab.wikimedia.org/repos/phabricator/arcanist/-/merge_requests/3 (https://phabricator.wikimedia.org/T349989) [09:23:03] (03merge) 10aklapper: Indent JSON files with tab instead of two spaces [repos/phabricator/arcanist] (wmf/stable) - 10https://gitlab.wikimedia.org/repos/phabricator/arcanist/-/merge_requests/3 (https://phabricator.wikimedia.org/T349989) [09:23:27] 10Phabricator (phabricator-next), 06translatewiki.net: Phabricator and translatewiki disagree over how to indent JSON files - https://phabricator.wikimedia.org/T349989#9920674 (10Aklapper) [09:35:00] (03CR) 10Hashar: [C:03+2] "Congratulations!" [integration/config] - 10https://gerrit.wikimedia.org/r/1049251 (https://phabricator.wikimedia.org/T367220) (owner: 10Majavah) [09:36:36] (03Merged) 10jenkins-bot: zuul: [mediawiki/extensions/OpenStackManager] Mark as archived [integration/config] - 10https://gerrit.wikimedia.org/r/1049251 (https://phabricator.wikimedia.org/T367220) (owner: 10Majavah) [09:50:30] 10Release-Engineering-Team (Seen), 10MW-on-K8s, 06serviceops, 06SRE, and 2 others: Turn down api_appserver and appserver clusters - https://phabricator.wikimedia.org/T367949#9920844 (10ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=7ca43ab0-579a-4f82-97aa-11720f300bd7) set by cgoubert@cum... [09:54:13] 10Release-Engineering-Team (Seen), 10MW-on-K8s, 06serviceops, 06SRE, and 2 others: Turn down api_appserver and appserver clusters - https://phabricator.wikimedia.org/T367949#9920870 (10ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=046a1781-9fad-454c-b26b-ad2c96d2d8b2) set by cgoubert@cum... 
[10:25:01] 10Continuous-Integration-Infrastructure, 10ci-test-error (WMF-deployed Build Failure): ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.com" - https://phabricator.wikimedia.org/T362425#9921033 (10Lucas_Werkmeister_WMDE) I just filed the jQuery version at T... [10:41:05] 10Continuous-Integration-Infrastructure, 10ci-test-error (WMF-deployed Build Failure): ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.com" - https://phabricator.wikimedia.org/T362425#9921102 (10Lucas_Werkmeister_WMDE) BTW, I also remember occasionally ge... [11:32:37] (03CR) 10Arthur taylor: zuul: Enable PHPUnit parallel for all WMDE-maintained repos (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/1047919 (https://phabricator.wikimedia.org/T361190) (owner: 10Kosta Harlan) [11:33:37] (03CR) 10Kosta Harlan: zuul: Enable PHPUnit parallel for all WMDE-maintained repos (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/1047919 (https://phabricator.wikimedia.org/T361190) (owner: 10Kosta Harlan) [11:37:12] (03CR) 10Arthur taylor: zuul: Enable PHPUnit parallel for all WMDE-maintained repos (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/1047919 (https://phabricator.wikimedia.org/T361190) (owner: 10Kosta Harlan) [11:59:43] (03CR) 10Kosta Harlan: zuul: Enable PHPUnit parallel for all WMDE-maintained repos (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/1047919 (https://phabricator.wikimedia.org/T361190) (owner: 10Kosta Harlan) [12:00:20] (03PS4) 10Kosta Harlan: zuul: Enable PHPUnit parallel for all WMDE-maintained repos [integration/config] - 10https://gerrit.wikimedia.org/r/1047919 (https://phabricator.wikimedia.org/T361190) [12:08:17] (03CR) 10Arthur taylor: [C:03+1] "Looks good to me. I have a heads-up to the WMDE engineering team that this is coming, so we should be on the lookout for any breakages." 
[integration/config] - 10https://gerrit.wikimedia.org/r/1047919 (https://phabricator.wikimedia.org/T361190) (owner: 10Kosta Harlan) [12:29:55] ^ I think things have improved with regard to global state [12:30:03] let's see how it goes [12:30:45] we should set some notice in the Quibble output and/or to wikitech-l with a heads up to people, and let them know 1) what issues to watch out for and 2) where to report problems. codders is that something you could draft? [12:31:35] Sure - that makes sense. I'll try and write a couple of sentences [12:34:39] I will no more be available in a couple hours, I started rather early today [12:35:00] and I have planned a few repairs in the house this evening [12:35:09] so if a rollback is needed, you'd have to poke around here ;) [12:42:06] hashar: sounds good [12:42:14] codders: I can help with the wording, lmk [12:52:44] hashar: I added you to the doc we're working on [12:53:12] be bold! [12:53:21] codders: lgtm, I suggested removing one sentence but otherwise is fine [12:53:35] great [12:54:07] k. removed that sentence [12:54:56] should I go ahead and send that kostajh ? [12:56:58] codders: lgtm [13:00:28] I think for Quibble it would be great to have something that prints a message at the end of the job's console log to say that the test was executed with parallel enabled, and where the ticket is. I have to head off soon, but I can take care of that in the morning [13:02:40] hashar: are there some dashboards that we should keep an eye on? https://grafana.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board?orgId=1&var-project=integration perhaps? 
[13:05:19] yeah load and cpu waiting time would be the things to watch [13:05:42] and I guess tests failling [13:10:11] all these tasks like https://phabricator.wikimedia.org/T368390 would be more convenient to reproduce if PHPUnit supported multiple test arguments :S [13:10:15] like, I can run `composer phpunit:entrypoint extensions/Flow/tests/phpunit/PermissionsTest.php extensions/Flow/tests/phpunit/Formatter/RevisionFormatterTest.php` [13:10:41] and PHPUnit will think that I meant “run PermissionsTest and ignore RevisionFormatterTest” [13:22:48] Lucas_WMDE - yeah, it's not my favourite. [13:26:29] I gotta head off for today - fingers crossed things continue to run smoothly. see you tomorrow! [13:34:05] hashar: there is a backlog in gate-and-submit but don't think it's related to this change [13:40:16] kostajh: no, I think that’s mainly due to T368383 [13:40:17] T368383: MediaWiki core gate-and-submit test failure: ParserIntegrationTest::testParse with data set "wtEscaping.txt: Links 17. Link trails (T236183)" ('legacy'): Failed asserting that two strings are equal. - https://phabricator.wikimedia.org/T368383 [13:47:27] Lucas_WMDE: ack, thx [13:48:32] I once crafted a dashboard that showed when instances were starving for cpu [13:48:37] but I can't find it anymore [13:48:54] (given dashboards are randomly deleted ... :D ) [13:51:49] hashar: https://gerrit-review.googlesource.com/c/gerrit/+/430737 [13:51:54] I tested it and it works for me [13:52:03] ah for diff3 [13:52:08] well I am not qualified to review it :D [13:52:19] oh Edwin approved it! [13:52:38] https://phabricator.wikimedia.org/F55873734 [13:52:48] that is going to please Lucas_WMDE ;) [13:53:00] \o/ [13:53:03] nice [13:53:33] paladox: may you post a summary and that screenshot on our task at https://phabricator.wikimedia.org/T359821 ? [13:53:41] Sure [13:54:41] thanks!! 
[13:56:18] hmm seems wikibugs is down [14:00:33] hmm [14:00:57] every time I head on https://grafana.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board I feel like a graph got removed :D [14:01:56] o/ [14:01:59] hello folks! [14:02:27] I am trying to do some cleanup in the Docker Registry, I found a lot of releng-named images with "stretch" or "jessie" in their name [14:02:30] https://phabricator.wikimedia.org/T367427#9921815 [14:02:40] ok to drop those? [14:08:42] hashar: https://grafana.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board?orgId=1&from=now-12h&to=now&var-project=integration&var-instance=All&viewPanel=369 this seems spikier since the change rolled out, hard to say if that has anything to do with the parallel change tho [14:08:59] hi, is anyone here in power to wipe the gate-and-submit queue? https://phabricator.wikimedia.org/T368383#9921637 [14:09:19] build is broken on master and all these jobs are going to fail, but it will take like 1.5 hours [14:09:29] and the change fixing the build is currently at the very end of the queue [14:12:35] which change? [14:15:11] elukey: I have replied [14:15:27] MatmaRex: which change fixes it? [14:16:04] elukey, hashar: was about to reply to say dev/stretch* should be safe to remove, so +1. [14:16:20] brennen: well do please +1 as well on the task :] [14:16:40] cause I am not entirely familiar with those dev images, though at least I found out you did the removal of Stretch \o/ [14:20:31] thanks a lot folks! 
[14:22:55] hmm [14:22:57] zuul promote --pipeline gate-and-submit --changes 1049510,2 [14:22:59] but that fails :/ [14:23:09] Exception: Unable to find shared change queue for 1049510,2 [14:30:41] hashar: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1049510 hopefully [14:30:49] but i don't want to force-merge it [14:34:21] 10Deployments, 10MW-on-K8s: Pushing mediawiki-multiversion Docker image from deploy server takes 4 minutes - https://phabricator.wikimedia.org/T341441#9921976 (10Jdforrester-WMF) Looking at the numbers in logstash, the overall time for build-and-push-conainer-images appears to now be three times worse than whe... [14:35:26] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 07Zuul: zuul promote broken: Exception: Unable to find shared change queue for 534195,1 - https://phabricator.wikimedia.org/T231913#9921979 (10hashar) 05Resolved→03Declined That is still a thing: ` zuul.rpcclient.RPCFailure: Traceba... [14:36:20] so hmm [14:36:22] I can't promote it [14:36:26] that is broken [14:37:22] MatmaRex: I guess the best bet is to remove the CR+2 on all the changes in the queue [14:37:30] rebase them (which would clear them from the queue) [14:37:37] then +2 the one that is fixing things [14:38:42] can you clear the whole queue? [14:38:48] sorry i'm away in a meeting [14:38:49] nop [14:38:53] and I am about to leave [14:39:05] oh well, we can just wait for it [14:41:53] oh yeah and I should disable submit [14:41:57] eventually [14:42:37] I have removed some CR+2 votes and rebased changes [14:42:44] which removed them from the queues [14:44:05] thanks! [14:44:16] yeah it looks much better already [14:44:33] 14:44:15 Time: 24:36.686, Memory: 2.27 GB [14:44:34] .... 
[14:46:41] I’m wondering what could’ve been improved here [14:46:59] well [14:47:01] I have a long rant ;) [14:47:15] but more seriously I have a few ideas to improve things [14:47:21] when I initially saw MatmaRex’ +2s I thought “would’ve been better not to do all those” but I’m not sure that’s reasonable of me tbh [14:47:37] the thing is [14:47:40] maybe not all of them, but restarting a few of the failed gate-and-submits was fine and I might well have done the same [14:47:44] eventually Zuul would have managed to pass through all those changes [14:48:00] there is a `zuul promote` command which is intended to bump a fix up in the queue [14:48:02] but it is broken [14:48:04] :/ [14:48:05] ah :( [14:48:20] cause well at some point Gerrit json changed change numbers from string to int [14:48:22] or something like that [14:48:42] I think the most unlucky part is actually that the first build for my fix happened to fail for an unrelated reason, and apparently that was the *one* build that runs the parser tests [14:49:07] the thing I wonder is how we got a breakage in the first place :D [14:49:10] and I’m so used to every test being included in at least two builds (“quibble” and “vendor” IIRC?) that I didn’t even bother to recheck *or* check the test locally (which wouldn’t have been that hard) [14:49:31] oh [14:49:33] I just assumed that, if all but one builds succeeded, then the change must be fine. 
and it wasn’t [14:49:35] i18n [14:49:39] hashar: that’s easy, the i18n just changed upstream [14:49:52] (whether we should have tests that can be broken by that is another question, I suppose ^^) [14:49:58] ;) [14:50:54] I have manually killed the jobs for https://gerrit.wikimedia.org/r/c/mediawiki/extensions/BlueSpiceBookshelf/+/1049553 [14:50:57] which got force merged [14:51:03] and I should again remove that behavior [14:51:06] permission [14:51:17] but that means overhauling the access lists we have [14:51:38] that is it [14:51:41] https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1049510 is ahead in the queue [14:51:51] hopefully that will fix it and anything behind will pass [14:51:53] I am off [14:51:56] didn't James_F make a separate queue for BlueSpice things to make that exact issue go away? [14:52:09] Not a queue, just a different job set. [14:52:21] BlueSpice should be moved out entirely [14:52:22] well [14:52:35] anything not deployed on the WMF cluster should be moved out to another zuul queue really [14:52:53] Yeah. [14:56:39] well [14:56:47] https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1049510 has already failed :/ [14:56:58] 14:56:11 LogicException: Failed to download resource at https://code.jquery.com/jquery-3.7.1.js [14:57:00] .... 
[14:57:04] i need to be off [14:57:26] sorry [14:58:04] uggghhhhh [14:58:05] that shit again [14:58:23] that’s T368385, the second of two CI failures I filed today >.< [14:58:23] T368385: Occasional test failure: ForeignResourceStructureTest::testVerifyIntegrity: LogicException: Failed to download resource at https://code.jquery.com/qunit/qunit-2.20.0.js - https://phabricator.wikimedia.org/T368385 [14:58:29] (different file but same code.jquery.com domain) [14:59:32] I have flushed the queue again [14:59:41] and https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1049510 is now alone [14:59:49] no issues on the jquery side of that domain that I can see :D [15:00:01] that test which depends on a third party should be removed for sure [15:00:10] anyway I am off [15:03:10] (03merge) 10brennen: update submodules for 2024-06-25 deploy [repos/phabricator/deployment] (wmf/stable) - 10https://gitlab.wikimedia.org/repos/phabricator/deployment/-/merge_requests/46 (https://phabricator.wikimedia.org/T368392) [15:03:11] 10Phabricator (phabricator-next), 06Release-Engineering-Team, 13Patch-For-Review: Deploy Phabricator/Phorge 2024-06-25 - https://phabricator.wikimedia.org/T368392#9922088 (10ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=b34b3de0-17a0-4c1c-a3df-8275277dcba3) set by jelto@cumin1002 for 0:30:... [15:03:47] 10Phabricator (phabricator-next), 06Release-Engineering-Team, 13Patch-For-Review: Deploy Phabricator/Phorge 2024-06-25 - https://phabricator.wikimedia.org/T368392#9922091 (10ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=34d08d08-25e3-4acf-ae8e-18f759b0815b) set by jelto@cumin1002 for 0:30:... [15:16:31] (03PS1) 10Novem Linguae: doc: remove CollaborationKit [integration/docroot] - 10https://gerrit.wikimedia.org/r/1049585 (https://phabricator.wikimedia.org/T368092) [15:16:45] andre: sorry i missed yr original question on T360756 - seems pretty safe, fwiw. 
[15:16:46] T360756: Make config page display version information - https://phabricator.wikimedia.org/T360756 [15:17:27] brennen: heh, np. Thanks as usual for the deployment, I'll clean up the Phab tickets [15:18:55] (03CR) 10Novem Linguae: "My last patch to the integration repo ( https://gerrit.wikimedia.org/r/c/integration/config/+/1032475 ) didn't go so well, so adding a cou" [integration/docroot] - 10https://gerrit.wikimedia.org/r/1049585 (https://phabricator.wikimedia.org/T368092) (owner: 10Novem Linguae) [15:26:37] (03CR) 10WMDE-leszek: [C:03+1] zuul: Enable PHPUnit parallel for WikibaseQualityConstraints [integration/config] - 10https://gerrit.wikimedia.org/r/1049562 (https://phabricator.wikimedia.org/T361190) (owner: 10Kosta Harlan) [15:39:00] three more builds for my core fix failed with those fucking ForeignResourceManager errors 😭 [15:39:05] it’s never gonna get merged is it [15:40:11] Hmm, wikibugs not posting any Phab task updates since 15:03, it seems? For example I closed T368392 since then [15:40:12] T368392: Deploy Phabricator/Phorge 2024-06-25 - https://phabricator.wikimedia.org/T368392 [15:41:09] andre: it was restarted at 15:09 cause of the same issue [15:41:22] ah, thanks. so not related to Phab deployment I guess [15:41:57] andre: https://sal.toolforge.org/log/C0y7T5ABhuQtenzvWuG2 [15:42:12] So if it isn't working, maybe it didn't get restarted properly [15:42:44] I see lots of gerrit events [15:42:50] On irc [15:42:52] lately #wikimedia-dev does not give me the Phab task noise level I expected :P [15:42:59] So maybe the phab bit is still broken [15:44:36] * RhinosF1 tries a test comment [15:45:02] err it's starting to sound like we need to actually force-merge some changes [15:45:13] or we'll be here all day [15:45:19] one to fix the i18n thing [15:45:22] andre: nothing, there's definitely missing stuff from wikibugs [15:45:25] and one to disable the foreignresourcewhatever tests [15:45:27] Maybe restart it again? 
[15:45:39] I'm not sure who can tbh [15:46:22] MatmaRex: I really hope not but it’s starting to feel like it [15:46:30] andre: would you like to be able to do that? [15:46:41] it certainly feels lkie the ForeignResourceStructureTest things have picked up significantly [15:46:51] like, several failures in one gate-and-submit is far from normal [15:47:36] RhinosF1: uh damn. I now wonder if https://phabricator.wikimedia.org/T364728 was not a good idea [15:47:45] taavi, honestly: undecided if I want more hats :D [15:48:00] seems like it's crashing with [15:48:01] 2024-06-25T15:43:09+00:00 [phorge-7ffcbf8d99-8z648] File "/workspace/src/wikibugs2/phorge.py", line 235, in get_transaction_info [15:48:01] 2024-06-25T15:43:09+00:00 [phorge-7ffcbf8d99-8z648] if "core.create" in trans["meta"]: [15:48:01] 2024-06-25T15:43:09+00:00 [phorge-7ffcbf8d99-8z648] ~~~~~^^^^^^^^ [15:48:01] 2024-06-25T15:43:09+00:00 [phorge-7ffcbf8d99-8z648] KeyError: 'meta' [15:48:21] andre: can't look properly, stupid train internet that's not very consistent [15:49:09] taavi: urgh, I guess that's directly related to https://phabricator.wikimedia.org/T364728 / https://gitlab.wikimedia.org/repos/phabricator/phabricator/-/merge_requests/61 [15:49:17] sounds very possible [15:49:24] :( So much for finding out the hard way which custom downstream changes are used by what [15:49:58] brennen: I think I broken wikibugs by merging https://gitlab.wikimedia.org/repos/phabricator/phabricator/-/merge_requests/61 [15:50:56] andre: ah, plausible [15:51:25] andre: want to push a revert? [15:51:49] sigh.... 
if I could only travel back in time and add force folks to write commit messages explaining what the heck a downstream custom patch actually does [15:52:08] brennen: yeah, let me cook that up [15:52:26] fucking hell, so many more core changes keep showing up in gate-and-submit [15:52:43] and if https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1049510 fails again I (or someone) will have to kick all of them out of the queue again >.< [15:53:26] I almost feel like it’s time for a wikitech-l email… but maybe it would actually just be time for a force-merge before then [15:54:30] alright, let's force-merge things, surely it all can't be any more broken [15:55:11] done https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1049510 [15:55:11] (03update) 10dancy: scap clean: Add --dry-run mode [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/363 [15:55:17] is there a patch to disable the other filing stuff? [15:55:20] failing* [15:55:56] not that I’m aware of [15:56:09] there’s a patch that at least adds logging (Krinkle already +2ed it) [15:56:18] so that’ll make its way through gate-and-submit now [15:56:34] https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1049584/2 [15:56:47] basically GitHub is flaky. Probably not something we can do something about? [15:56:50] brennen: I pushed https://gitlab.wikimedia.org/repos/phabricator/phabricator/-/commit/7f611dbbef587f5aa69622bd6ae7e554a99e7c24 and added a comment [15:56:57] brennen: meh. how do we proceed? 
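For reference, the `KeyError: 'meta'` traceback pasted at 15:48 comes from indexing a key that, per the discussion above, apparently stopped being returned once the downstream `maniphest.gettasktransactions` patch was reverted. A minimal defensive sketch in Python, assuming a transaction is a plain dict; `is_task_creation` and the sample payloads are hypothetical illustrations, not the actual wikibugs2 code or the real API response shape:

```python
def is_task_creation(trans: dict) -> bool:
    """Return True if the transaction carries the "core.create" marker.

    Hypothetical helper: tolerates responses where the downstream-only
    "meta" key is missing (or null) instead of raising KeyError.
    """
    meta = trans.get("meta") or {}
    return "core.create" in meta


# Assumed shape while the custom metadata patch is deployed.
with_meta = {"transactionType": "core:create", "meta": {"core.create": True}}
# Assumed shape after the revert: no "meta" key at all.
without_meta = {"transactionType": "core:comment"}

print(is_task_creation(with_meta))
print(is_task_creation(without_meta))
```

A consumer written this way would treat a missing `meta` as "not a creation event" rather than crashing, at the cost of silently losing that signal; whether that trade-off is acceptable is for the wikibugs maintainers to judge.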
[15:56:58] but sure we can gain visibility on what problem they're having :) [15:57:08] foreign resource cache persists between CI jobs so that should absorb most of these [15:57:09] Krinkle: IMHO it’s unclear whether it’s GitHub or us [15:57:09] (03merge) 10dancy: scap clean: Add --dry-run mode [repos/releng/scap] - 10https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/363 [15:57:17] I’ve seen GitHub, npm and jQuery fail so far [15:57:27] ok [15:57:32] andre: i'll update the deploy repo then we'll figure out if we're clear for a deploy [15:57:37] I wonder if the cache got lost? [15:57:48] i submitted https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1049594 for it [15:57:49] brennen: thanks [15:57:50] it should only have to download entries that have changed, which usually is nothing. [15:57:52] verify works offline [15:58:02] i will self-merge it in a minute unless one of you stops me right now [15:58:07] o_O [15:58:11] then I misunderstood what it does [15:58:37] MatmaRex: I’d rather see some of those detail messages first tbh [15:58:44] but then I’m not waiting for any other particular patch to merge [15:58:49] (03approved) 10dancy: branch-cut-test-patches: clean up MW checkouts [repos/releng/release] - 10https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/83 (https://phabricator.wikimedia.org/T368239) (owner: 10jnuche) [15:58:49] unlike most people here, I assume ^^ [15:59:02] Lucas_WMDE: i think you can do that later by un-skipping it in your change? [15:59:47] i feel like the right solution is for this test to skip itself automatically if the remote resources are inaccessible [15:59:58] (and only fail if they are accessible but do not match the expected values) [16:00:05] I'm seeing composer install also fail on downloading stuff from github fwiw [16:00:09] but i'm not going to implement that right now [16:00:27] e.g. at https://gerrit.wikimedia.org/r/1043884 [16:01:25] I don't think that's going to help. 
Our CI won't pass anyway if it can't reach github, composer, packagist and npm [16:01:45] what is PS*2* of https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1049510 doing in the gate-and-submit queue o_O [16:01:51] (PS3 was force-merged) [16:02:07] I assume zuul will still sort itself out sooner or later [16:02:11] but looks like it got quite confused [16:02:17] our CI may fail less often if there are fewer tests doing stupid stuff [16:02:45] just because there are other problems, it doesn't mean we can't fix this problem [16:06:52] 1049510,2 has failed in the meantime with "LogicException: Failed to download resource at https://code.jquery.com/jquery-3.7.1.js" [16:07:49] that’s the one that shouldn’t have been in the queue in the first place [16:09:08] yeah, but still [16:09:37] I’ll abort the other ones, no point in wasting CPU time there [16:11:01] andre: revert deployed [16:11:14] brennen: thanks so much. and sorry! [16:11:45] no worries! i should have guessed that one would be an issue in advance, but little harm done. [16:12:58] well then, i'm gonna try to +2 some changes, go have dinner, and see if any of them went through in an hour or so [16:13:14] LogicException: Failed to download resource at https://cdnjs.cloudflare.com/ajax/libs/chosen/1.8.2/chosen-sprite%402x.png: HTTP request timed out. [16:13:17] that’s a four domain btw [16:13:37] i don't understand why we're sitting here watching this outage when we could attempt to fix it [16:13:44] but i'll let you watch if you want [16:15:46] tbh that feels to me like you’re just causing problems to prove a point [16:15:56] did it really have to be four core changes at once [16:16:45] eh, why not? 
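The proposal made at 15:59 (skip when the remote resource is unreachable, fail only when it is reachable but does not match) can be sketched as follows. This is an illustrative Python sketch, not MediaWiki's PHP `ForeignResourceStructureTest`; `verify_integrity`, the injected `fetch` callable, and the stub fetchers are all hypothetical names, and SHA-384 is chosen here only as an example hash.

```python
import hashlib
import unittest


def verify_integrity(fetch, url: str, expected_sha384: str) -> None:
    """Skip when the remote is unreachable; fail only on a hash mismatch.

    `fetch` is any callable that returns bytes for a URL and raises OSError
    (the base class of urllib's URLError and of TimeoutError) on network
    failure.
    """
    try:
        body = fetch(url)
    except OSError as exc:
        # Network trouble is not evidence that the resource changed.
        raise unittest.SkipTest(f"{url} not reachable: {exc}")
    actual = hashlib.sha384(body).hexdigest()
    if actual != expected_sha384:
        raise AssertionError(f"integrity mismatch for {url}")


def fetch_ok(url: str) -> bytes:
    """Stub standing in for a successful download."""
    return b"console.log('hi');"


def fetch_down(url: str) -> bytes:
    """Stub standing in for a flaky third-party host."""
    raise TimeoutError("HTTP request timed out")
```

Inside a `unittest.TestCase` method, raising `unittest.SkipTest` marks the test as skipped rather than failed. With a real network, `fetch` could be something like `lambda u: urllib.request.urlopen(u, timeout=10).read()`. As noted in the channel, this would not help CI steps such as `composer install` that genuinely need the network.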
[16:16:55] sorry, i'm not trying to make the situation worse [16:17:03] but it looked like everything previously in the queue was cancelled [16:17:10] so i thought this would not hurt [16:17:31] there were already two Wikibase changes, so we’re now at six queued changes [16:17:39] and i don't want to sit here watching them constantly [16:17:40] (okay, one of them just failed, so make that five >.<) [16:17:57] and i am expecting some of them to fail, so it helps to have more data points [16:17:59] (“Failed to clone https://github.com/wikimedia/oauth2-server.git via https”, so network issue in different CI step I guess) [16:18:08] (what Krinkle said earlier) [16:18:09] i will actually go eat now. be back later [16:18:26] I’ll probably have signed off by then, so have a nice evening ^^ [16:18:40] i am hoping to actually merge some changes today [16:18:53] i did 4 but i wanted to do about 10 :) [16:29:19] (03merge) 10jnuche: branch-cut-test-patches: clean up MW checkouts [repos/releng/release] - 10https://gitlab.wikimedia.org/repos/releng/release/-/merge_requests/83 (https://phabricator.wikimedia.org/T368239) [16:58:52] i'm back. so far, it looks like everything failed [17:01:01] and there are a few more patches in the queue that other people +2'd [17:03:26] oh, thanks for approving my thing. 
let's see if it helps [17:03:42] yeah, I’m (unfortunately) still here ^^ [17:03:46] but about to sign off, I think [17:04:03] and will only be back tomorrow afternoon-ish (CEST), so I’ll miss whatever fun happens in the morning too [17:24:27] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team: Rebuild integration-cumin to get rid of Debian Buster - https://phabricator.wikimedia.org/T360784#9922958 (10thcipriani) [17:24:29] 06Release-Engineering-Team, 10Cloud-VPS (Debian Buster Deprecation): Cloud VPS "integration" project Buster deprecation - https://phabricator.wikimedia.org/T367534#9922959 (10thcipriani) [17:27:14] 06Release-Engineering-Team, 06Infrastructure-Foundations, 10Cloud-VPS (Debian Buster Deprecation): Cloud VPS "integration" project Buster deprecation - https://phabricator.wikimedia.org/T367534#9922964 (10thcipriani) Hrm, those pkgbuilder hosts are used for the Jenkins debian glue jobs—testing debian package... [17:39:32] “Failed to clone https://github.com/wikimedia/oauth2-server.git via https” hrm, I just tried to clone that repo on an integration machine via https and it works, so something else fishy is going on, is there a task for this? [17:41:32] 100% anecdotal but... I am seeing some weird connectivity issues with various sites not loading on first attempt, so maybe the internet's just having a bad day [17:47:45] thcipriani: no task for today's outage afaics, but there is https://phabricator.wikimedia.org/T362426 and https://phabricator.wikimedia.org/T362095 and probably more [17:51:35] and with castor failing, we are at the mercy of internet weather [17:53:37] thcipriani: if that was a bullseye VM where it happened, then this should be fixed by https://lists.debian.org/debian-security-announce/2024/msg00128.html [17:58:23] happening within a non-bullseye container image within CI, composer not installed from debian there (but I'm unsure how it's installed at the moment). 
May be related tho, thanks cdanis [18:04:08] 10Continuous-Integration-Infrastructure, 10ci-test-error (WMF-deployed Build Failure), 10MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), 13Patch-For-Review: ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.c... - https://phabricator.wikimedia.org/T362425#9923224 [18:35:17] interesting error on https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1027277 : [18:35:21] 19:58:09 git.exc.GitCommandError: Cmd('git') failed due to: exit code(128) [18:35:21] 19:58:09 cmdline: git remote prune --dry-run origin [18:35:21] 19:58:09 stderr: 'fatal: unable to access 'https://gerrit.wikimedia.org/r/mediawiki/extensions/Elastica/': The requested URL returned error: 502' [18:41:58] Cloning into 'Elastica'... [18:41:58] remote: Total 2428 (delta 0), reused 2428 (delta 0) [18:41:59] Receiving objects: 100% (2428/2428), 875.55 KiB | 1.71 MiB/s, done. [18:49:02] 10Phabricator, 13Patch-For-Review: Make config page display version information - https://phabricator.wikimedia.org/T360756#9923352 (10Dzahn) Please add reviewers on Gerrit that are service owners instead. Only fall back to puppet request window if there is a problem getting a reaction from them on a constant... [18:52:35] 10Phabricator, 13Patch-For-Review: Make config page display version information - https://phabricator.wikimedia.org/T360756#9923399 (10Dzahn) The change is good. It's just in the wrong profile. That "phorge" profile was just used to setup a test instance of phorge. While the production class is still called ph... [19:33:17] Lucas_WMDE: so for ping late in your time, can we merge stuff in core now? [19:35:41] 10Phabricator (2024-06-25), 06translatewiki.net, 13Patch-For-Review: Phabricator and translatewiki disagree over how to indent JSON files - https://phabricator.wikimedia.org/T349989#9923560 (10Izno) > I guess we have a trade-off here between reading JSON on smaller screens versus making translatewiki.net... 
[19:35:57] Phabricator: Disable Herald rule H99 (adding #Internet-Archive)? - https://phabricator.wikimedia.org/T367650#9923562 (Aklapper) p:Triage→Medium
[19:36:35] Diffusion, Phabricator, Release-Engineering-Team (Priority Backlog 📥): Unify gitlab migration scripts - https://phabricator.wikimedia.org/T346191#9923564 (Aklapper) p:Triage→Low
[19:38:10] Phabricator (2024-06-25), translatewiki.net, Patch-For-Review: Phabricator and translatewiki disagree over how to indent JSON files - https://phabricator.wikimedia.org/T349989#9923565 (Izno) Also, this seems like something that should be less opinionated in a CMS and otherwise dictated by a langu...
[19:41:08] Phabricator (Upstream), Mobile, Upstream: Allow contracting (collapsing and expanding) of columns on project workboard in mobile view - https://phabricator.wikimedia.org/T208434#9923574 (Aklapper) a:Aklapper
[19:44:18] Amir1: yes
[19:44:58] at least you can try :) some changes have merged
[19:50:35] Phabricator, Release-Engineering-Team: Update to Phorge upstream 2024.19 release - https://phabricator.wikimedia.org/T368453 (Aklapper) NEW p:Triage→Low
[19:51:38] Phabricator, Release-Engineering-Team: Update to Phorge upstream 2024.19 release - https://phabricator.wikimedia.org/T368453#9923596 (Aklapper)
[19:51:38] Phabricator (Upstream), Upstream: Address FIXMEs in Phabricator-translations - https://phabricator.wikimedia.org/T365853#9923597 (Aklapper)
[19:51:40] Phabricator (Upstream), Regression, Upstream: PhabricatorDataNotAttachedException when rendering project hovercard with username mentioned in project description - https://phabricator.wikimedia.org/T360530#9923599 (Aklapper)
[19:51:41] Phabricator (Upstream), Upstream: Phabricator-translations should not extract strings from test cases - https://phabricator.wikimedia.org/T363364#9923598 (Aklapper)
[19:51:42] Phabricator (Upstream), Upstream: Allow "lipsum" data generation tool to create projects with empty descriptions - https://phabricator.wikimedia.org/T355966#9923600 (Aklapper)
[19:51:43] Phabricator (Upstream), Upstream: Exception: Map returned by "newPagingMapFromCursorObject()" in class "ManiphestTaskQuery" omits required key - https://phabricator.wikimedia.org/T344241#9923601 (Aklapper)
[19:51:47] Phabricator (Upstream), Upstream: Disable Audit application in Phabricator - https://phabricator.wikimedia.org/T330794#9923603 (Aklapper)
[19:51:51] Phabricator (Upstream), Upstream: Remove "Action Has No Effect" warning dialog when removing auto assigned user - https://phabricator.wikimedia.org/T337017#9923602 (Aklapper)
[19:52:37] Phabricator (Upstream), Upstream: Remove "Action Has No Effect" warning dialog when removing auto assigned user - https://phabricator.wikimedia.org/T337017#9923607 (Aklapper)
[19:52:39] Phabricator: Disable auto-assign/claim feature - https://phabricator.wikimedia.org/T261483#9923608 (Aklapper)
[19:53:25] Phabricator (Upstream), Regression, Upstream: PhabricatorDataNotAttachedException when rendering project hovercard with username mentioned in project description - https://phabricator.wikimedia.org/T360530#9923609 (Aklapper) Open→Stalled
[19:54:39] Release-Engineering-Team, Trust and Safety Product Team, Patch-For-Review, Temporary accounts (Release train CI and infrastructure), Trust and Safety Product Sprint: [Epic] Make PHPUnit extension and core, Selenium, and API testing tests pass wi... - https://phabricator.wikimedia.org/T355879#9923622
[19:59:36] (CR) Hashar: [C:+2] doc: remove CollaborationKit [integration/docroot] - https://gerrit.wikimedia.org/r/1049585 (https://phabricator.wikimedia.org/T368092) (owner: Novem Linguae)
[20:00:53] (Merged) jenkins-bot: doc: remove CollaborationKit [integration/docroot] - https://gerrit.wikimedia.org/r/1049585 (https://phabricator.wikimedia.org/T368092) (owner: Novem Linguae)
[20:02:37] Phabricator: Disable Herald rule H99 (adding #Internet-Archive)? - https://phabricator.wikimedia.org/T367650#9923650 (Aklapper) a:Aklapper
[20:10:58] Gerrit (Gerrit 3.9): Configure Gerrit to use conflictStyle diff3 - https://phabricator.wikimedia.org/T359821#9923662 (LucasWerkmeister) That sounds amazing, thank you!
[20:52:55] Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments: 1.43.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T366956#9923810 (daniel) ##### Risky Patch! 🚂🔥 * **Change**: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1005589 * **Summary**: REST: Make module...
[21:02:02] does scap create any git config for the deploy repos it creates?
[21:02:18] like config per repo that isn't global
[21:10:04] Phabricator: Automate weekly request for Phabricator data for potential Tech News entries - https://phabricator.wikimedia.org/T368460 (Quiddity) NEW
[21:15:56] mutante: good question
[21:16:47] #git says I should tell the deployment tool to set the git config :)
[21:17:25] that seems like an ok idea. it might be something that should be configurable in scap?
[21:17:33] (or maybe it should just do it, not sure.)
[21:18:41] learns about https://en.wikipedia.org/wiki/Security_Content_Automation_Protocol from googling scap
[21:21:29] mutante: what is it about?
[21:21:51] there are some bits created by the Puppet provider
[21:21:55] which definitely set some git config
[21:22:22] i see some stuff for lfs as well, not sure what else, but it looks like there are places it could be hooked in
[21:22:28] the core of the issue is that the actual deployment dir changes on every deployment
[21:22:45] so only scap knows that path, puppet does not
[21:22:54] so the proper way would be that scap configures git for that dir
[21:23:03] instead of using puppet and apply it to *
[21:23:22] example:
[21:23:35] the "deploy dir" is /srv/phab but that's not real, it's just a symlink
[21:23:39] the real one is: fatal: detected dubious ownership in repository at '/srv/deployment/phabricator/deployment-cache/revs/72ad841a0bf22b0dd1aca0d53af6c84ab044e94a'
[21:23:48] oh
[21:23:49] and I can't know that full path
[21:23:54] so you are hitting the issue with git itself
[21:24:09] there is a long task about that broken feature
[21:24:15] I can either just set * on the entire machine as safedir and move on
[21:24:20] or do it the "right" way
[21:24:21] how about this:
[21:24:29] which would be that scap does this config setting
[21:24:42] actual_phab_path=$(readlink -f /srv/phab)
[21:24:59] in modules/phabricator/templates/phab_deploy_finalize.sh.erb
[21:25:19] git -C "$actual_phab_path" config ...
[21:25:43] yea.. hm...I could also.. write a command like that above.. and let puppet execute it
[21:25:52] it's just all annoying in different ways :)
[21:26:06] the deploy's already running the finalize script, so it wouldn't be much hackier than anything else we're doing there
[21:26:10] I was about to say just set * and it's ok
[21:26:19] then the guy in #git told me it's so evil :) heheh
[21:26:32] but there are no manual git users here
[21:26:32] it would probably be reasonable to have scap know about this, but waiting on that will take longer
[21:27:12] yea, I think I like the idea
[21:28:13] thanks, I think I will upload a patch like that
[21:38:13] well meanwhile
[21:38:31] 1990's way to add a checkbox to a web page:
[21:39:36] 2024's way: "I am still figuring out how to attach the WebComponent element to a in a shadow dom"
[21:39:52] web development is an entire new thing ;)
[21:40:16] (open) egardner: releases: Bump Codex to 1.8.0 [repos/ci-tools/libup-config] - https://gitlab.wikimedia.org/repos/ci-tools/libup-config/-/merge_requests/24
[21:43:11] * hashar sleeps to get ready for the Gerrit 3.10 upgrade
[21:43:16] (approved) catrope: releases: Bump Codex to 1.8.0 [repos/ci-tools/libup-config] - https://gitlab.wikimedia.org/repos/ci-tools/libup-config/-/merge_requests/24 (owner: egardner)
[21:43:19] (merge) catrope: releases: Bump Codex to 1.8.0 [repos/ci-tools/libup-config] - https://gitlab.wikimedia.org/repos/ci-tools/libup-config/-/merge_requests/24 (owner: egardner)
[21:51:52] Gerrit, Release-Engineering-Team: Delete All-Projects-In-Phabricator.git Gerrit project - https://phabricator.wikimedia.org/T355070#9924028 (hashar) Hurrah! antoine-approve
[22:47:25] Project beta-code-update-eqiad build #501770: FAILURE in 4 min 24 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/501770/
[22:48:46] yea, nothing is simple anymore. always 10 levels of frameworks and abstractions
[22:55:10] Project beta-code-update-eqiad build #501771: STILL FAILING in 2 min 9 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/501771/
[22:55:18] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), Patch-For-Review: ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.c... - https://phabricator.wikimedia.org/T362425#9924168
[23:05:15] Yippee, build fixed!
[23:05:15] Project beta-code-update-eqiad build #501772: FIXED in 2 min 14 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/501772/
[23:31:17] Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), Patch-For-Review: ForeignResourceStructureTest flaky in CI due to "Failed to download resource at https://codeload.github.c... - https://phabricator.wikimedia.org/T362425#9924317
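
Note on the 21:24-21:25 exchange above: a minimal sketch of the idea discussed there, resolving the /srv/phab symlink in the finalize script and marking only the resolved directory as safe instead of setting * machine-wide. The function name mark_deploy_dir_safe is hypothetical, and the use of git's safe.directory setting is an assumption based on the "dubious ownership" error quoted in the log; the log itself elides the actual config arguments. As far as the git documentation states, safe.directory is honoured only in system or global config, so the sketch writes to the global config of whichever user runs it rather than to the repository's own config.

```shell
#!/bin/bash
# Sketch only, not the patch that was uploaded.
# mark_deploy_dir_safe is a hypothetical helper name.

# Resolve a deploy symlink (e.g. /srv/phab) to its real directory and
# register that directory as a git safe.directory for the current user.
mark_deploy_dir_safe() {
    local link="$1"
    local real_path

    # Follow the symlink chain to the per-deployment revision directory.
    real_path=$(readlink -f "$link") || return 1
    [ -d "$real_path" ] || return 1

    # Skip if this exact path is already registered, so repeated runs
    # for the same revision do not add duplicate entries.
    if git config --global --get-all safe.directory 2>/dev/null \
            | grep -Fxq "$real_path"; then
        return 0
    fi

    git config --global --add safe.directory "$real_path"
}
```

It would be called as `mark_deploy_dir_safe /srv/phab` from the finalize script. Since the real directory changes on every deployment, entries for old revisions would accumulate over time; pruning them is left out of this sketch.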