[06:31:00] godog, dcaro: spicerack-shell is installed on the cloudcumin hosts, hence available there. Because T319450 was never done the modules under wmcs_libs are not automatically available but just a simple ``` sys.path.append('/srv/deployment/wmcs-cookbooks') ``` away. We could also patch spicerack-shell to add that line on the cloudcumin hosts.
[06:31:01] T319450: [cookbooks] Refactor the specific wmcs libraries into spicerack module - https://phabricator.wikimedia.org/T319450
[06:32:15] same goes for test-cookbook that is available on the cloudcumin hosts so you can make changes, send them to gerrit and be able to test them (and also make local modifications while testing)
[06:32:51] related docs: https://wikitech.wikimedia.org/wiki/Spicerack#Explore_Spicerack and https://wikitech.wikimedia.org/wiki/Spicerack/Cookbooks#Test_before_merging
[07:04:51] volans: ack, thank you for the context
[07:54:42] morning!
[07:56:51] can you send patches to gerrit from cloudcumin?
[07:57:34] hey dcaro
[07:57:50] quick review about reducing the default request: https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/215 should avoid paging during the weekend, I'll also look into the alerts today if possible
[08:01:18] gotta run an errand shortly, bbiab
[08:03:22] dcaro: that was in reply to previous chats about fil.ippo that asked to test openstack api calls. Is that related directly to gerrit patches?
[08:04:27] as for sending them I don't think there is any specific limitation, it will ofc need a system user and the patches will appear created by a bot and not by a specific user
[08:09:28] morning!
[08:09:52] welcome back dhinus !
[08:10:23] running the errand, bbiab
[08:11:08] volans: I asked because you said "same goes for test-cookbook that is available on the cloudcumin hosts so you can make changes, send them to gerrit and be able to test them"
[08:11:26] and I was not sure if test-cookbook was able to send the patches xd
[08:11:40] (would be nice)
[08:12:09] ah no, you do your patch locally, send it to gerrit, then you can call test-cookbook with the CR ID and it will check it out in your home so you can test it
[08:12:49] then if while testing you need to do some adjustments you can do them locally on the cloudcumin host and re-test the cookbook with local modifications (test-cookbook will alert you that local modifications are present)
[08:13:26] no, right now it doesn't have a direct way to send the modifications back to gerrit, but it's an rsync/scp away from your laptop :D
[08:13:38] if we get a system user it could also send them back to gerrit indeed
[08:39:40] dhinus: welcome back!
[08:47:42] hmmm... I think paws is misbehaving
[08:48:02] T405183 happens with my user too
[08:48:03] T405183: [feature] Can not log into PAWS today due to timeout - https://phabricator.wikimedia.org/T405183
[08:48:07] anyone can test to log in?
[08:48:18] (from emails I can see it crashed during the weekend again)
[08:51:49] * volans trying to login
[08:52:12] volans: I see your pod flying by :)
[08:52:16] dcaro: mine started fine
[08:52:19] I think it's node-3 that's misbehaving
[08:52:28] (got it cordoned, will restart)
[08:52:34] ack
[08:54:00] does logout free up the resources?
[08:54:32] I think not specially, it will stop/delete your pod anyhow I think if it's not running/in use
[08:54:56] I didn't see a specific stop/delete button for the pod
[08:55:35] there's a stop server somewhere
[08:56:01] maybe you have to go to 'hub control panel' https://hub-paws.wmcloud.org/hub/home
[08:58:07] got it, thx
[08:58:43] btw https://hub-paws.wmcloud.org/hub/logo returns 404
[08:58:54] so there is no logo on the top-left ;)
[08:59:51] yep, I noticed :/
[09:00:07] not sure when it disappeared
[09:02:13] https://www.irccloud.com/pastebin/bYjALjOH/
[09:02:22] https://www.irccloud.com/pastebin/neTO3CkZ/
[09:02:25] not there :/
[09:03:05] ooohhh, it's a pvc mounted there, so the file that's inside the image gets trampled
[09:04:43] curled it inside the pvc, though on next rebuild it will fail, should be moved some other place
[09:05:43] lol
[09:08:43] https://github.com/toolforge/paws/pull/499
[09:10:07] paws is developed on GH?
[09:10:23] currently yes
[09:10:53] commented but I can't approve :)
[09:13:04] thanks!
[09:23:43] * volans quick errand to run, brb
[09:40:10] when you get a chance, I'm seeking feedback on https://gerrit.wikimedia.org/r/q/topic:%22bug/T404584%22 and specifically https://gerrit.wikimedia.org/r/c/cloud/wmcs-cookbooks/+/1189792 and https://gerrit.wikimedia.org/r/c/cloud/wmcs-cookbooks/+/1189868
[09:47:19] I should also say, currently a blocker for T404584
[09:47:19] T404584: [tools,nfs,infra] Address tools NFS getting stuck with processes in D state - https://phabricator.wikimedia.org/T404584
[10:14:47] quick review https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/38 to avoid pages for now until we sort out capacity
[10:15:40] LGTM
[10:25:20] I'll continue reviewing after lunch
[10:30:14] ack, thank you! going to lunch too
[12:07:37] godog: Added a question there, as we handle the DNS record for tools and toolsbeta using tofu, your cookbook might need extra steps or manual steps for those two projects to flip the ips
[12:09:33] fantastic, thank you dcaro !
[12:35:41] re: "if we move everything on to tofu" in https://gerrit.wikimedia.org/r/c/cloud/wmcs-cookbooks/+/1189867/comment/adf05625_4dd6be6a/
[12:35:47] how likely is that to happen?
[12:36:15] I get the impression we have tofu/non-tofu diaspora and we'll have to deal with it
[12:36:45] I think that's correct xd
[12:36:49] (your impression)
[12:37:05] https://phabricator.wikimedia.org/T385604
[12:37:12] T385604
[12:37:12] T385604: Decision Request - How openstack projects relate to tofu-infra - https://phabricator.wikimedia.org/T385604
[12:37:24] that's the task to decide if we move all projects management to tofu or not
[12:38:02] as I've said many times, in general things should be managed either by cookbooks, or in tofu, but mixing those two for the same thing results in a mess
[12:38:12] for our projects, the overall agreement was to move to tofu
[12:40:50] mmhh ok thank you for the context, FWIW I agree a mix of tofu and cookbooks is the worst scenario to be in
[12:41:18] I think that there's also the question of "should we use tofu from the cookbooks?" (ex. create patches from the cookbook)
[12:41:28] that so far has been working quite ok for project creation
[12:41:30] (imo)
[12:42:39] if tofu is authoritative then yes I don't see any problems if what the cookbook does is effectively create tofu patches
[12:43:01] for the things in tofu it is (as it will override anything else xd)
[12:44:45] is there a list of projects somewhere for which tofu is authoritative?
[12:45:02] sort of a migration tracker for option 1 if you wish
[12:47:59] we have all the projects "basic" stuff in tofu https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/tree/main/resources/eqiad1-r?ref_type=heads
[12:48:57] then tools and toolsbeta have a specific tofu repo for their own things https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning
[12:49:03] that adds the dns and such
[12:49:47] and metricsinfra project also has its own separate repo https://gitlab.wikimedia.org/repos/cloud/metricsinfra/tofu-provisioning
[12:50:10] so afaik there's no list
[12:51:38] ack thank you, I'm reading create_project and AFAICS for example quotas are not in tofu but set directly in openstack ?
[12:51:51] what a mess :)
[12:51:53] yep
[12:52:11] I'll take a break, brb
[12:52:13] it's a work in progress xd
[13:00:36] (btw. wink wink if you want to tackle it)
[13:06:05] heheh kind offer and I'll decline
[13:06:39] xd
[13:10:13] I'm sure I'm stating the obvious though without a plan for either tofu or no tofu we're going to live with this mess for the time being
[13:11:47] I'll make a note to bring it up on thurs
[13:13:34] I think that we need the resources more than the plan, once we have someone that is going to work on it planning can be part of the work
[13:13:52] (otherwise we plan, but don't execute, something we have done many times already)
[13:14:24] dcaro: do you see any reason for me not to start moving the other ceph racks to bookworm + reef? Everything looks good to me but that's what I thought with bookworm/pacific :/
[13:14:29] (ex. all the non-implemented cookbook tasks)
[13:15:12] dcaro: maybe godog /is/ those resources :D
[13:15:22] andrewbogott: let me have a look and try to find anything suspicious
[13:15:26] (hopefully not)
[13:15:28] thank you!
[13:15:38] I'm currently not in the mood for jokes in case that wasn't clear
[13:16:08] andrewbogott: I don't want to push anyone to do anything they don't want to do, there's more than enough critical work to go around
[13:17:13] ok! Sorry, haven't read the backscroll yet
[13:18:06] np, happy to talk and understand more about this on thurs though
[13:18:18] 👍
[13:45:10] adapted https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/215 to the current conversation in T404726, looking for reviewers
[13:45:11] T404726: [tools,infra,k8s] scale up the cluster, specifically CPU - https://phabricator.wikimedia.org/T404726
[15:33:26] andrewbogott: ceph nodes look ok, they seem quite underutilized memory-wise at least
[15:33:35] but stable :)
[15:37:31] yeah, I tried setting a node in codfw1dev to automatic memory allocation and it used even less
[15:37:42] there's clearly more to learn there
[15:58:20] * dhinus off
[17:09:40] * dcaro off
[17:09:42] cya tomorrow!