[13:35:40] Hi folks - I'd like a +1 on https://gerrit.wikimedia.org/r/c/operations/puppet/+/1143821 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/1143824 please, for the next bit of hardware work. I won't actually merge until Monday, but it'll save me bugging you then if you've a few minutes now :)
[14:03:37] Emperor: the first is now approved - I have zero context to approve the second one, but if you can give me some pointers I can review it
[14:05:18] FWIW I see the hosts pinging
[14:05:23] federico3: it's really "remove all references to these depooled nodes from puppet so they can be decommissioned"; https://wikitech.wikimedia.org/wiki/Swift/How_To#Remove_a_failed_proxy_node_from_the_cluster
[14:05:42] is there a way I can help you with the second - not in terms of the specific patch, but general background?
[14:05:55] federico3: ping> yes, they're depooled but still on; they need to be taken out of puppet so e.g. other nodes don't try to talk to them before I turn them off
[14:06:08] I would prefer to teach you to have the confidence for all future patches rather than for a single one
[14:06:15] want me to double-check the depooling?
[14:06:40] that's bonus points, but yes please :) [I check again before actually decomming]
[14:07:47] One thing I think we discussed (not with you, but with anyone) is to qualify review statements if unsure, e.g. "I only checked the syntax" or "ok with the idea, but I didn't check the syntax"
[14:08:13] that way the submitter can read that and consider whether it is enough or not
[14:10:42] is this "proof of depooling" enough? https://phabricator.wikimedia.org/T391352#10764031 - can I actively check the pooling status?
[14:10:52] federico3: you can check in 2 ways:
[14:11:00] sudo confctl select name=thanos-fe2003.codfw.wmnet get
[14:11:05] (and similar for the other hosts)
[14:11:25] sudo confctl select cluster=thanos,dc=codfw get
[14:11:27] (and read the output)
[14:11:50] I think you are handling it, but remember you can ping me at any time, federico3, for anything
[14:12:51] OK, an off-by-one: also by browsing https://config-master.wikimedia.org/pybal/codfw/thanos-swift but I think that is the least ergonomic
[14:14:09] thanks - CR updated with the command output :)
[14:14:13] thanks jynus
[14:15:10] thank you federico3 :)
[14:17:41] another trick I use is: "what's the worst thing that could happen if this was merged?" Basically, risk management. Being too risk-averse will result in not enough agility, while being too risky means outages; there is a healthy middle ground. Pooling a new host is usually a very low-risk change; updating apache configuration is a high-risk one.
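
Editor's note on the 14:11 confctl check: the "read the output" step can be sketched in Python for when several hosts need checking at once. This is only a rough sketch under an assumption that is not confirmed by the log: that `confctl ... get` prints one JSON object per line, keyed by host name, with a "pooled" field and a separate "tags" key. The function names, the sample text, its weight values and the host thanos-fe2004 are hypothetical and purely illustrative.

```python
import json

# Illustrative sample only. The line format, the weights and the host
# thanos-fe2004 are assumptions, not output copied from a real cluster.
SAMPLE = """\
{"thanos-fe2003.codfw.wmnet": {"weight": 100, "pooled": "no"}, "tags": "dc=codfw,cluster=thanos,service=thanos-swift"}
{"thanos-fe2004.codfw.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=codfw,cluster=thanos,service=thanos-swift"}
"""


def parse_pooled_status(text):
    """Map each host name to its 'pooled' value, one JSON object per line."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and any non-JSON chatter
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        for key, value in obj.items():
            # The "tags" key describes the object; every other key is a host.
            if key != "tags" and isinstance(value, dict) and "pooled" in value:
                status[key] = value["pooled"]
    return status


def not_confirmed_depooled(text, hosts):
    """Return the hosts that are not shown as depooled ('no' or 'inactive').

    Hosts missing from the output are returned too, so they get a manual look.
    """
    status = parse_pooled_status(text)
    return [h for h in hosts if status.get(h) not in ("no", "inactive")]


if __name__ == "__main__":
    to_decommission = ["thanos-fe2003.codfw.wmnet"]
    print(not_confirmed_depooled(SAMPLE, to_decommission))
```

In practice the text would be the saved output of `sudo confctl select cluster=thanos,dc=codfw get`; an empty list means every listed host shows as depooled. Treating missing hosts as "not confirmed" is a deliberately cautious choice, in the spirit of the risk-management remark at 14:17.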