vSphere troubleshooting

PV/PVC cannot be created (Unable to find VM by UUID)

This may happen if the virtual machines in the cluster were re-created, e.g. to recover after a VM failure.

The cluster-manager logs on the master will show a message similar to the following: Unable to find VM by UUID. VM UUID: 4215cbe6-2a4f-b8a8-9178-6219df59cd40.

The problem is due to the way kubelet registers nodes in the Kubernetes master.

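After a node is registered in the master for the first time, the providerID field of the Kubernetes node object cannot be changed. A re-created VM gets a new UUID, but the node object keeps the providerID that points to the old VM.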

To confirm that this is the error you are facing, run the following command:

$ kubectl get nodes -o custom-columns=NAME:.metadata.name,PROVIDER_ID:.spec.providerID,UUID:.status.nodeInfo.systemUUID

NAME                               PROVIDER_ID                                      UUID
cluster-247-vsp1-group1-worker-0   vsphere://42152218-c14d-f00d-dc13-84176e986471   42152218-C14D-F00D-DC13-84176E986471
cluster-247-vsp1-group1-worker-1   vsphere://42155ab0-e5a1-7bc3-e769-8d8cb56f2c2b   42155AB0-E5A1-7BC3-E769-8D8CB56F2C2B
cluster-247-vsp1-master-0          vsphere://42155d4b-9ee2-344e-e610-b80db41130f0   42155D4B-9EE2-344E-E610-B80DB41130F0
cluster-247-vsp1-master-1          vsphere://4215cbe6-2a4f-b8a8-9178-6219df59cd40   4215DE4E-975F-1DCC-57DD-67330B1653D5
cluster-247-vsp1-master-2          vsphere://4215f891-37f9-8cf3-8bb4-8fde7762db6b   42150441-AB5B-A87E-A2EA-A01AB7560430

Note that the cluster-247-vsp1-master-1 and cluster-247-vsp1-master-2 nodes have different UUID values (ignoring letter case) in the .spec.providerID and .status.nodeInfo.systemUUID fields.
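On a larger cluster, comparing the two columns by eye is error-prone. The following is a minimal sketch of how the comparison could be automated. The function name find_mismatched_nodes is hypothetical, and the sketch assumes that every providerID has the form vsphere://<uuid>, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or Kublr).
# Reads "NAME PROVIDER_ID UUID" rows on stdin and prints the names of nodes
# whose providerID UUID differs from the system UUID, ignoring letter case.
find_mismatched_nodes() {
  awk '{
    id = $2
    sub(/^vsphere:\/\//, "", id)               # strip the vsphere:// prefix
    if (tolower(id) != tolower($3)) print $1   # case-insensitive comparison
  }'
}

# Usage against a live cluster (requires kubectl access):
# kubectl get nodes --no-headers \
#   -o custom-columns=NAME:.metadata.name,PROVIDER_ID:.spec.providerID,UUID:.status.nodeInfo.systemUUID \
#   | find_mismatched_nodes
```

For the output shown above, this should list only cluster-247-vsp1-master-1 and cluster-247-vsp1-master-2.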

To fix it:

  1. Remove the Kubernetes nodes with incorrect UUID values by running the following commands:

    $ kubectl delete node cluster-247-vsp1-master-1
    $ kubectl delete node cluster-247-vsp1-master-2

  2. Restart the kubelet on these nodes via SSH by running the following command, or just restart the nodes:

    # systemctl restart kublr-kubelet
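
When several nodes are affected, step 1 can be scripted. The sketch below covers only the node deletion; the kubelet restart in step 2 still has to be done on each node. The function name delete_nodes and the KUBECTL variable are hypothetical and exist only so that the command can be swapped for a dry run.

```shell
#!/bin/sh
# Hypothetical helper: deletes each node object named on the command line.
# KUBECTL is an assumed override; set it to "echo kubectl" for a dry run
# that only prints the commands.
delete_nodes() {
  for node in "$@"; do
    ${KUBECTL:-kubectl} delete node "$node"
  done
}

# Usage (dry run first, then for real):
# KUBECTL="echo kubectl" delete_nodes cluster-247-vsp1-master-1 cluster-247-vsp1-master-2
# delete_nodes cluster-247-vsp1-master-1 cluster-247-vsp1-master-2
```

After the kubelet restarts, each node registers again, and the kubectl get nodes command above can be re-run to check that both fields now agree.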