[08:57:10] morning
[08:57:21] morning
[08:57:27] welcome back :)
[08:57:51] thx :) i'm catching up + testing/merging pre-commit/poetry updates
[08:59:09] let me know if you have anything that needs more immediate attention!
[09:00:04] quick review: https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/48
[09:09:36] morning
[09:11:26] blancadesal: done, let me know if you want some help with the chore, otherwise I'll let you do the merges (or we will start stepping on each other's toes xd)
[09:13:05] dcaro: it's a good task while catching up on email and such as it doesn't require a lot of brainpower :)
[09:34:44] yep, it's good to do some of those from time to time, blancadesal found an interesting bug in the api-gateway https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/49
[09:34:55] I was going crazy getting a different version every time I called it xd
[09:35:24] I was able to generate an async python client too btw: https://gitlab.wikimedia.org/dcaro/toolforge_cli_gen/-/tree/main/gen/python-async?ref_type=heads
[09:36:22] dcaro: oooh, nice!
[09:38:37] dcaro: similar for toolforge-cli https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-cli/-/merge_requests/33
[09:39:54] blancadesal: hmm, for toolforge-cli I think we still have to support python3.7 :/
[09:40:19] I was hoping that wasn't the case xd
[09:40:54] soon I hope, not sure what's the current status, looking
[10:26:16] dcaro: what is still using python 3.7?
[10:26:28] login-buster.toolforge.org
[10:58:58] I found this link in the "Weekly portfolio report" and I'm very impressed by all the information and how well it's organized https://miro.com/app/board/uXjVKigAcLU=/?moveToWidget=3458764604138192821&cot=14
[10:59:36] wikilove to Sarai and everyone who was involved in creating it
[11:00:11] is Sarai on IRC btw?
[11:00:42] not much, she's mostly in slack
[11:00:59] I have another session today with her to go over it
[11:01:21] cool, please give her my very positive feedback when you talk to her :)
[11:01:32] 👍
[11:34:17] dcaro: I think we can set the silence on alerts.wikimedia.org instead of prometheus-alerts.wmcloud?
[11:57:22] yes yes
[12:09:55] @dcaro i'll be in collab around 2pm
[12:10:34] blancadesal: okok, I have a meeting with sarai to go over the miro at 14:30, but we can sync up a bit during collab
[12:10:45] sounds good
[12:49:24] dcaro: I think calico-typha can be scheduled on control nodes because it has CriticalAddonsOnly toleration, and "Priority Class Name: system-cluster-critical"
[12:56:33] 👍
[13:59:23] dcaro: when you have a moment: https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-cli/-/merge_requests/33
[14:12:11] https://www.irccloud.com/pastebin/xWvDxAWz/
[14:12:16] could someone please run the unit test for envvars-cli to see if you can reproduce this error?
[14:12:31] heads up, codfw1dev seems to be just not working as expected. I was about to do some testing in there, but it seems to have a DB issue, prior to my actions
[14:13:20] * arturo nursery run
[14:28:38] nevermind – the problem came from tox using Python 3.12 for the unit test run
[14:31:07] 👍
[14:35:16] argh, tox is defaulting to the latest python it can find in my pyenv installations, ignoring my local/global settings
[14:35:53] envvars-cli doesn't work with 3.12. 3.11 and under is fine though
[14:36:44] we are using 3.11 in CI right now. Would it be okay to set basepython in tox to this version, dcaro?
[14:37:15] hmm, should be I think
[15:30:35] blancadesal: restish -v ..., for really nice debugging info :)
[15:59:36] dhinus: T378995
[15:59:37] T378995: openstack codfw1dev: API endpoints errors 2024-11-04 - https://phabricator.wikimedia.org/T378995
[16:06:02] arturo: thanks!
[16:06:31] I had some live-hacks on cloudlb2001-dev going
[16:06:45] I reverted them and the APIs seem more healthy now
[16:06:57] I am now able to run the restart_openstack cookbook
[16:09:02] slyngs: hi there, I see this puppet error on cloudinfra-idp-1:
[16:09:04] https://www.irccloud.com/pastebin/lWArQ17V/
[16:11:02] dcaro: nice!
[16:12:46] quick review: https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-cli/-/merge_requests/33
[16:15:46] arturo: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1087203
[16:16:22] slyngs: +1'd LGTM, thanks
[16:16:28] I'm starting to question if the profiles variables in hieradata actually do something :-)
[16:18:00] even worse, the lookup may behave differently (diff preference) in prod vs cloud
[16:18:12] That is worse :-)
[16:18:20] which leads to https://bash.toolforge.org/quip/AWd6Sq7ffM03vZ1oWjy4
[16:18:40] :-)
[16:21:15] blancadesal: added a comment
[16:22:20] other than that +1
[16:27:37] dcaro: would you mind testing again? I made some additional changes
[16:28:00] blancadesal: testing
[16:28:50] blancadesal: works for me :)
[16:29:08] thanks!
[16:31:46] dhinus: I found the problem in codfw1dev
[16:31:49] https://usercontent.irccloud-cdn.com/file/4d62IcTH/image.png
[16:33:45] hmm, automated tests still randomly failing in tools
[16:34:05] I thought it was due to the upgrade, but it's finished now
[16:38:47] is anyone running them at the same time?
[16:38:56] what's the error?
[16:39:41] hmm, I think it should be already checking that it's not running in parallel :/
[16:40:00] https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/blob/main/utils/run_functional_tests.sh?ref_type=heads#L60
[16:43:52] the last two runs failed on the same step
[16:43:57] https://www.irccloud.com/pastebin/CrPlE3lx/
[16:44:43] the run before failed here:
[16:44:48] https://www.irccloud.com/pastebin/Z3IQ4JhI/
[16:46:05] the lock seems to be working for me:
[16:46:09] https://www.irccloud.com/pastebin/RCBoN4LT/
[16:47:32] then the problem might be something else... need to go afk, will investigate a bit later (unless you find the issue)
[16:47:47] ack, cya, thanks!
[16:48:56] I got ` INFO: job 'test-1879' completed (and already deleted)`, that Raymond_Ndibe saw also during the upgrade
[16:51:40] hmpf. there's another instance of the functional tests running :/, not sure why the lockfile did not work
[16:56:21] hmm... not sure how the lockfile did not prevent the parallel runs, I've let the current running tests get to the next loop (so they recreate the lockfile), and now I can't start them myself
[16:56:27] https://www.irccloud.com/pastebin/DOztbxGg/
[16:56:46] https://www.irccloud.com/pastebin/8cBWa5F0/
[16:59:56] * arturo offline
[18:16:32] Got the first deploy using token auth! \o/
[18:16:36] time to call it a day
[18:16:41] * dcaro offline
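
Note on the lockfile discussion (16:39 to 16:56): the log does not show how run_functional_tests.sh takes its lock, so the following is only a generic sketch, not the script's actual logic. It illustrates one common way to keep two runs from starting at once: creating the lockfile with an atomic "create only if absent" call, so that checking and creating cannot be interleaved by another process. The function names and the lockfile name are hypothetical and chosen for illustration.

```python
import os
import tempfile


def acquire_lock(path: str) -> bool:
    """Try to create the lockfile atomically; return True if we now hold it."""
    try:
        # O_CREAT | O_EXCL makes "check and create" one atomic step:
        # the call fails if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    # Record the owner's PID so a stuck lock can be traced to a process.
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return True


def release_lock(path: str) -> None:
    """Remove the lockfile; do nothing if it is already gone."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass


if __name__ == "__main__":
    # Hypothetical lockfile location, used only for this demonstration.
    lock_path = os.path.join(tempfile.mkdtemp(), "functional-tests.lock")
    print(acquire_lock(lock_path))  # first caller gets the lock
    print(acquire_lock(lock_path))  # second caller is refused
    release_lock(lock_path)
    print(acquire_lock(lock_path))  # lock is free again after release
    release_lock(lock_path)
```

With this pattern a second run is refused for as long as the file exists. If a lock is instead removed and recreated between loop iterations, there is a window in which another run can start, which would be one possible (unconfirmed) explanation for the parallel runs seen above.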