[10:26:02] * dcaro lunch
[11:28:49] * arturo online now
[13:04:59] I see this puppet error for cloudcephosd1040/1041:
[13:05:03] aborrero@cloudcephosd1040:~ 130 $ sudo run-puppet-agent
[13:05:03] Error: The CRL issued by 'CN=Wikimedia_Internal_Root_CA,OU=Cloud Services,O=Wikimedia Foundation\, Inc,L=San Francisco,ST=California,C=US' has expired, verify time is synchronized
[13:05:03] Error: The CRL issued by 'CN=Wikimedia_Internal_Root_CA,OU=Cloud Services,O=Wikimedia Foundation\, Inc,L=San Francisco,ST=California,C=US' has expired, verify time is synchronized
[13:12:28] * dcaro back
[13:13:23] arturo: yep, that's because they are running puppet7 but the ceph roles default to puppet 5 and such, they need reimage to do the puppet7 from scratch, moritzm is doing the others so we will only have puppet7 soon-ish
[13:14:56] indeed, specifically they were initially installed with P7 as insetup::wmcs and then switched to role::cloudcephosd which defaults to P5 since there's buster nodes
[13:15:06] clean reimage with -p 7 will fix this
[13:15:46] I'll do
[13:19:22] ok thanks
[13:23:21] the task is T372814 (fyi)
[13:23:22] T372814: Put cloudcephosd10[39-41] into service - https://phabricator.wikimedia.org/T372814
[14:14:35] moritzm: hmm... raid partitioning failed on cloudcephosd1039 :/
[14:15:12] in a meeting, can have a look later
[14:16:38] ack, let me see if I can get it working before that :)
[14:35:12] btullis: ping, hey, you did some changes in the partitioning for osd nodes right?
[14:37:00] Yes, not all that recently.
[14:37:18] I think it was T372783 and we probably haven't reimaged any host since
[14:37:19] T372783: Verify that cephosd* server reimages work without adversely affecting cluster availability - https://phabricator.wikimedia.org/T372783
[14:38:41] That's right. This was the thing I changed most recently. https://gerrit.wikimedia.org/r/c/operations/puppet/+/1065180
[14:39:42] And yes, this would affect cloudcephosd hosts: https://github.com/wikimedia/operations-puppet/blob/production/modules/install_server/files/autoinstall/scripts/partman_early_command.sh#L115
[14:43:09] ack, I'll take a look
[14:45:02] We can exclude cloudcephosd servers from those that call this `remove_os_md`, if that helps.
[15:53:51] * dcaro back from meetings :)
[16:14:16] * dcaro off
[16:14:19] vaway
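
Note on the 14:45:02 suggestion: a minimal sketch of what a hostname-based exclusion might look like, assuming the script decides by hostname prefix. The helper name `should_remove_os_md` and the matching logic are hypothetical; they are not taken from partman_early_command.sh, and only `remove_os_md` itself is named in the log.

```shell
#!/bin/sh
# Hypothetical helper (not from the real script): returns 0 when the
# software-RAID cleanup should run for the given hostname, 1 when the
# host is excluded from it.
should_remove_os_md() {
    case "$1" in
        cloudcephosd*) return 1 ;;  # assumed exclusion: skip the cleanup on these hosts
        *)             return 0 ;;  # every other caller keeps the current behaviour
    esac
}

# Assumed call site, shown as a comment because remove_os_md lives in the
# real script and is not reproduced here:
#
#   if should_remove_os_md "$(hostname)"; then
#       remove_os_md
#   fi
```

With this shape the exclusion stays in one place, so dropping it later (once cloudcephosd reimages are known to work with the cleanup) would be a one-line change.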