No vMotion possible after ESXi host BIOS update

I was working on some ESXi upgrades recently. We’re currently preparing everything to make the upgrade to vSphere 7 somewhen smooth as silk. That means that we’re rolling out vSphere 6.7 on all of our systems. Recently, I was tasked to upgrade some hosts in a facility some hundred miles away. The task itself was super easy, managing that with vSphere Update Manager was working like a charm. But before the vSphere upgrade, I had to upgrade the BIOS and server firmware to make sure that we’re fine with the VMware HCL.

The second host was done within one hour and received the complete care package. But the first host took a bit longer due to unforeseen troubleshooting. I’d like to share some helpful tips (hopefully they’re helpful).

What happened?

As mentioned, upgrading the ESXi host through the vSphere Update Manager worked like a charm. But before that, I booted the server remotely with the Service Pack for ProLiant ISO image to upgrade the BIOS and firmware of that server. Also, that went very well and. As there are two ESXi hosts at this location, we had shared storage available and we were able to move the VMs from one host to the other without further issues. One host placed into maintenance mode, upgrade, remove from maintenance mode, and the same for the second server. That was the idea.

Read more

VMware – Clone a VM with snapshots (and consolidate it)

Recently we had a weird issue in our office. We had one virtual machine with a snapshot. That by itself isn’t an issue, a snapshot is nothing special. But someone created that snapshot before a software upgrade and forget to delete it. So this snapshot was growing and growing. We found out that there is a snapshot when the VM or service owner requested some additional disk space. We weren’t able to add disk space because of that snapshot. So we scheduled a maintenance window to delete the snapshot. Faster said than done.

The VM went offline because of disk consolidation. That could happen, depending on snapshot size and storage system. But the VM not only went offline for some time, but unexpectedly for hours. Together with VMware support we were able to stop the snapshot deletion. The VM came back online but with the known “Disk Consolidation Needed status”. We found out that this snapshot was about 400 GB in size. What a bummer! So we scheduled another maintenance window to consolidate that snapshot. Unfortunately that didn’t work well. Consolidation timed at around 96%, not sure why. “Error communicating with the host” isn’t very helpful in that moment.

Some research and again having a chat with VMware support led us towards cloning the disk files. During the cloning of a disk file the snapshot will be consolidated. And as you’re doing a disk clone locally on the ESXi host with “vmkfstools” and not withing vCenter, there shouldn’t be a timeout either. So we had out action plan. And we scheduled another maintenance window with the service owner.

Read more