Skip to content

Conversation

@jiaqiluo
Copy link
Member

@jiaqiluo jiaqiluo commented Jul 17, 2025

Issue:

rancher/rancher#50987

Problem

While creating a node driver or custom RKE2/k3s cluster, if the secret used for saving etcd backups to S3 does not exist, then the cluster will be stuck waiting for node ref, and Rnacher will generate empty plans for nodes.

Note that this bug can only be triggered when creating or updating the cluster programmatically or via editing YAML.

Solution

Add validation for the etcd S3 cloud credential to reject the request for creating or updating the provisioning cluster if the provided secret does not exist.

CheckList

  • Test
  • Docs

@snasovich snasovich requested a review from a team July 28, 2025 20:02
@jiaqiluo jiaqiluo force-pushed the validate-s3-secret branch 2 times, most recently from 6e65218 to 84d8969 Compare July 28, 2025 23:43
@jiaqiluo jiaqiluo marked this pull request as ready for review July 28, 2025 23:46
@jiaqiluo jiaqiluo requested a review from a team as a code owner July 28, 2025 23:46
Copy link
Collaborator

@crobby crobby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what's here seems good from a frameworks perspective. lg to me once Jake's question is addressed.

@jiaqiluo jiaqiluo requested a review from jakefhyde August 7, 2025 17:35
@jiaqiluo jiaqiluo force-pushed the validate-s3-secret branch from 84d8969 to 0aef5a2 Compare August 7, 2025 18:37
@jiaqiluo jiaqiluo requested a review from a team August 7, 2025 18:55

// validateS3Secret checks if the S3 cloud credential secret referenced in the cluster exists.
func (p *provisioningAdmitter) validateS3Secret(oldCluster, cluster *v1.Cluster) (*admissionv1.AdmissionResponse, error) {
if cluster.Name == localCluster {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need a check here if oldCluster == nil, so that you still validate on create, otherwise it will likely panic.

Copy link
Member Author

@jiaqiluo jiaqiluo Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not happen.
I thought about this case while writing the function, and it turned out that the oldCluster and cluster will be pointers to an empty cluster object, &v1.Cluster{}. They are from the ClusterOldAndNewFromRequest function:
https://github.com/rancher/webhook/blob/main/pkg/generated/objects/provisioning.cattle.io/v1/objects.go#L19-L20

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that makes sense, approved.

@jiaqiluo jiaqiluo requested review from a team and jakefhyde August 7, 2025 19:17
@jiaqiluo jiaqiluo requested a review from a team August 8, 2025 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants