No vMotion possible after ESXi host BIOS update

I was working on some ESXi upgrades recently. We’re currently preparing everything to make the eventual upgrade to vSphere 7 as smooth as silk, which means rolling out vSphere 6.7 on all of our systems. Recently, I was tasked with upgrading some hosts in a facility a few hundred miles away. The task itself was super easy; managing it with vSphere Update Manager worked like a charm. But before the vSphere upgrade, I had to upgrade the BIOS and server firmware to make sure we stayed within the VMware HCL.

The second host was done within an hour and received the complete care package. But the first host took a bit longer due to some unforeseen troubleshooting. I’d like to share what we learned, in the hope that it’s helpful to someone else.

What happened?

As mentioned, upgrading the ESXi host through vSphere Update Manager worked like a charm. But before that, I booted the server remotely with the Service Pack for ProLiant ISO image to upgrade the BIOS and firmware of that server, and that also went very well. As there are two ESXi hosts at this location with shared storage available, we were able to move the VMs from one host to the other without further issues. Place one host into maintenance mode, upgrade it, take it out of maintenance mode again, then do the same for the second server. That was the idea.
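For reference, that maintenance mode round trip can also be scripted with PowerCLI. This is just a minimal sketch; the vCenter and host names are placeholders, not our actual environment:

[code lang="powershell"]
# Connect to vCenter (placeholder name, you'll be prompted for credentials)
Connect-VIServer -Server vcenter.domain.com

# Put the host into maintenance mode; with DRS in fully automated mode
# the running VMs are vMotioned away automatically
Get-VMHost -Name esx01.domain.com | Set-VMHost -State Maintenance

# ... BIOS/firmware and ESXi upgrade happen here ...

# Take the host out of maintenance mode again
Get-VMHost -Name esx01.domain.com | Set-VMHost -State Connected
[/code]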

But unfortunately, the gods of IT had something different in mind. After upgrading the first host, we tried to move the VMs back to it to prepare the upgrade of the second host. Well, some VMs could be moved there, some could not. But why?

When we moved certain VMs back from the second host to the upgraded one, we received the error below:

That made us curious. Why did this happen? When checking the tasks and events of that host and of the affected VM in vCenter, we didn’t find much information. After some internet research, we found some possible causes, but none of them quite fit our issue. So we had to dig a bit deeper. Thanks to the vmware.log file, located in the VM’s folder, we were able to find the following:

OK, it sounds funny that the VMX has left the building, but we weren’t sure why. Seems to have been a boring party…
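If you want to pull the same log yourself: the vmware.log sits next to the .vmx file in the VM’s folder on the datastore. A minimal PowerCLI sketch to grab and search it could look like this (datacenter, datastore, VM folder, and local paths are just examples):

[code lang="powershell"]
# Copy the affected VM's vmware.log from the datastore to the local machine
# (the vmstore: drive is created automatically by PowerCLI after Connect-VIServer)
Copy-DatastoreItem -Item "vmstore:\YourDatacenter\your_datastore\YourVM\vmware.log" `
                   -Destination "C:\Temp\vmware.log"

# Search for the interesting messages
Select-String -Path "C:\Temp\vmware.log" -Pattern "left the building|migrate"
[/code]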

Some more digging brought some more helpful information:

Obviously, the vMotion failed because some CPU features differed between the first (already updated) host and the second one. But wait: the two servers are the same model with the same hardware configuration. How can that be?
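A quick first sanity check is to compare what both hosts report through PowerCLI. It won’t show the individual CPU feature bits that vMotion complains about, but version, build, and EVC differences show up at a glance (host names below are placeholders):

[code lang="powershell"]
# Compare CPU model, ESXi version/build and maximum supported EVC mode of both hosts
Get-VMHost -Name esx01.domain.com, esx02.domain.com |
    Select-Object Name, Version, Build, ProcessorType, MaxEVCMode |
    Format-Table -AutoSize
[/code]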

The solution

That led us to the conclusion that it must be something with the VM compatibility level. But wait again: some of the movable VMs were on VM HW version 8, and the VMs with the failed vMotion were also on VM HW version 8. To be honest, we weren’t able to pin down the exact difference. But it left us with two solutions: either upgrade the VM HW version or install an ESXi patch. We decided to install the patch, as we didn’t want to reboot some of the VMs (although we did it later anyway).
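If you go the other route and upgrade the virtual hardware version instead, a minimal PowerCLI sketch looks like the following. The VM name and the target version are just examples, the VM has to be powered off, and upgrading VMware Tools beforehand is generally recommended (newer PowerCLI releases also offer a -HardwareVersion parameter instead of -Version):

[code lang="powershell"]
# Shut the guest down cleanly before touching the virtual hardware
Get-VM -Name "YourVM" | Shutdown-VMGuest -Confirm:$false

# Wait until the VM is actually powered off
while ((Get-VM -Name "YourVM").PowerState -ne "PoweredOff") { Start-Sleep -Seconds 5 }

# Raise the virtual hardware version (v14 corresponds to ESXi 6.7) and power the VM back on
Set-VM -VM "YourVM" -Version v14 -Confirm:$false
Start-VM -VM "YourVM"
[/code]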

And before you complain now: yes, I’m aware that the patch in the linked KB article is not the most recent ESXi build. It’s somewhat historical. When we started the global vSphere 6.7 rollout, vSphere 6.7 Update 2 was the latest version available. And yes, we’re currently planning new rollouts again, as patching was sadly neglected in the past. But you know, time, human resources, and change requests…

VMware vSphere – How to script vMotion for your VMs

vMotion Script

VMware vMotion is a great feature for availability and load balancing in your vSphere environment. Today I created a vMotion script to help me back up my VMs with my backup software.

As so often, this blog post is the result of a problem I had and for which I needed a solution. Today is no different. I was working in my vSphere homelab, where I had created some virtual machines and installed my backup software of choice. My idea was to have a backup before doing any work on the VMs, just in case I screwed something up, so I could easily go back to a known good state of a VM and try again. But this task wasn’t so easy.

As a vExpert, VMCE, MVP, trainer, or member of various other tech communities, you can request an NFR license key for Veeam Availability Suite. So I did. The NFR key was delivered quickly to my mailbox and installed in Veeam even faster. But there was a catch: my NFR license is limited to two sockets, though with no limit on protected VMs, and it comes with a full 1-year retention period instead of the 30 days of the regular trial.

So I had to deal with the fact that only one host (I’ve got three hosts in my lab with two sockets each) is protected by Veeam. This limitation woke the hunter in me, because I had to find a solution. My goal was to back up all my VMs with only two licensed sockets. The approach I chose was to set vSphere DRS to manual, vMotion all VMs to the host which holds the Veeam license, run the backup, and set DRS back to fully automated afterwards. If you are working with resource pools you shouldn’t disable DRS completely, because that removes the resource pools. There is a workaround for that too, but instead of creating a new problem I took the easy way and just set DRS to manual.

How to get the vMotion script

If you’re familiar with GitHub you can download my script from there:

https://github.com/driftar/vSphere

For everyone else, I’ll provide the script directly here:

[code lang="powershell" gutter="true"]
# .SYNOPSIS
# This script starts a vMotion of all virtual machines on a specified datastore to a specified ESXi host.
# If you are working with backup software that is licensed to a specific host, this will probably help you.
# Only recommended in smaller environments, or if you have enough resources on this host.

# .DESCRIPTION
# The script loads a PSSnapin, sets some PowerCLI options, and connects to your vCenter Server with the given credentials.
# It then collects all VMs on the datastore into an array and starts a host vMotion of each of them to the specified ESXi host.

# .NOTES
# File Name : pre-backup.ps1
# Version : 1.0
# Author : Karl Widmer (info@)
# Prerequisite : PowerShell v2 or later (Windows Vista and newer) / VMware PowerCLI 6
# Tested on : Windows Server 2012 R2
# with PowerCLI : PowerCLI 6.3 Release 1 build 3737840
# with PowerShell: 4.0
# Copyright 2016 – Karl Widmer / driftar's Blog (www.)

# .LINK
# Script posted over: https://www.driftar.ch

# Load PowerCLI cmdlets
Add-PSSnapin VMware.VimAutomation.Core -ErrorAction "SilentlyContinue"

# Set PowerCLI behaviour regarding invalid certificates and deprecation warnings
Set-PowerCLIConfiguration -InvalidCertificateAction Ignore -DisplayDeprecationWarnings:$false -Confirm:$false

# Define vCenter connection details, source datastore, cluster and target host
$vcHost = 'vcenter.domain.com'
$vcUser = 'administrator@domain.com'
$vcPass = 'password'
$datastore = 'your_datastore'
$cluster = 'your_cluster'
$targetHostName = 'yourhost.domain.com'

# Connect to vCenter
Connect-VIServer $vcHost -User $vcUser -Password $vcPass

# Resolve the target host (this has to happen after Connect-VIServer)
$targetHost = Get-VMHost -Name $targetHostName

# Get VMs (pass an array of VMs to $VMs, for example 'Get-Datastore test | Get-VM')
$VMs = Get-Datastore $datastore | Get-VM

# Set DRS automation level on the cluster to Manual for the backup window
Set-Cluster $cluster -DrsAutomationLevel Manual -Confirm:$false

Foreach ($vm in $VMs) {
    Write-Host ("Start host vMotion for VM '" + $vm.Name + "'")

    # Start the vMotion asynchronously and keep the task object
    $task = Move-VM -VM $vm -Destination $targetHost -RunAsync

    Write-Host ("Waiting…")

    # Wait for the vMotion task to complete before reporting success
    Wait-Task -Task $task

    Write-Host ("Host vMotion for VM '" + $vm.Name + "' finished")
}

# This last script step should probably be executed in a post-backup script step.
# It sets the DRS automation level back to fully automated. Your VMs will then load-balance across your hosts again.

# Set DRS on the cluster back to FullyAutomated after the backup window
Set-Cluster $cluster -DrsAutomationLevel FullyAutomated -Confirm:$false
[/code]
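In my case the idea is to call this as a pre-job script from the backup software (Veeam offers pre-job and post-job scripts in the job’s advanced settings), with the DRS reset moved into the post-job script. A call could look like this (the path is a placeholder):

[code lang="powershell"]
# Example pre-job call (path is a placeholder)
powershell.exe -ExecutionPolicy Bypass -File "C:\Scripts\pre-backup.ps1"
[/code]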

Update 07.11.2016

After updating my ESXi hosts to 6.0.0 Build 4510822, my script stopped working. So I simplified the script and released version 2.0.