Is your VMware vCenter unavailable? Here’s what to do
Some time ago, we had a networking failure at one of our customers. Because of this failure, the storage became inaccessible, resulting in all the VMs going down. Eventually, the network came back up and we noticed that we were unable to start the VMware vCenter Server appliance. The vCenter file system turned out to be corrupt.
As this customer works with virtual desktop infrastructure (VDI) and VMware Horizon View, the vCenter Server Appliance is a vital component. It manages the deployment of all virtual machines (VMs) needed in the View environment. When vCenter is unavailable, Horizon View is unable to create desktops for users.
How did we solve the corrupted vCenter? We deployed a new vCenter Server Appliance and restored a backup. When no backup is available, vCenter has to be reconfigured from scratch, which is very time consuming. Fortunately, there is an easier way thanks to the native high availability of VMware vCenter 6.5!
Native High Availability for vCenter
A new feature in vCenter 6.5 is high availability (only available for vCenter Server Appliance 6.5).
vCenter High Availability (HA) removes a single point of failure from your environment. When you follow the basic HA setup, two clones of the existing vCenter are created. The first clone acts as the passive node and is kept in sync with the active vCenter. The second clone is the witness node. It is used to avoid split-brain situations when a network partition prevents the active and passive nodes from communicating with each other.
When there is a failure of the active node, the public IP will be moved to the passive node and all vCenter services will start within several minutes.
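The role of the witness can be illustrated with a small sketch. This is a simplified quorum model written for this article, not VMware's actual implementation: the function name `serving_node` and its three link parameters are hypothetical, and the rule that the active node keeps priority whenever it can still reach another member is an assumption made to keep the example short.

```python
from typing import Optional


def serving_node(
    active_passive_up: bool,
    active_witness_up: bool,
    passive_witness_up: bool,
) -> Optional[str]:
    """Return which node should hold the public IP, or None if no node has quorum.

    Each argument says whether the link between two cluster members works.
    A failed node is modelled as a node whose links are all down.
    """
    # Assumption: the active node keeps serving as long as it can still
    # reach at least one other member (passive or witness).
    if active_passive_up or active_witness_up:
        return "active"
    # The active node is isolated or down. The passive node may take over
    # only if the witness confirms it, which prevents split-brain.
    if passive_witness_up:
        return "passive"
    # Every node is isolated: nobody may serve.
    return None


if __name__ == "__main__":
    # Active node failed: passive and witness still see each other.
    print(serving_node(False, False, True))   # passive
    # Only the link between active and passive is broken.
    print(serving_node(False, True, True))    # active
    # Full partition: no node has quorum.
    print(serving_node(False, False, False))  # None
```

Because the function returns at most one node for any combination of broken links, two nodes can never claim the public IP at the same time in this model, which is the point of adding a third member.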
Other features that ensure the HA of your VMs
On top of the new vCenter High Availability, there are existing features that ensure the HA of your virtual machines. In the case of a hardware failure, VMware High Availability protects your VMs by restarting them on a different host, which makes them available again.
A feature like VMware Fault Tolerance creates a secondary VM that is in constant sync with its primary VM. If a host goes down, the secondary VM instantly takes over and the user doesn’t experience any downtime. These features provide HA at the host level, but they do not protect against failures inside the guest OS. If the operating system inside a VM crashes or becomes corrupt, VMware HA in its default configuration doesn’t recognize this as a failure. Fault Tolerance synchronizes all changes to the secondary VM, so the same issue appears there as well.
In our case, where the vCenter Server file system became corrupt, features like VMware HA and VMware Fault Tolerance wouldn’t have been able to provide a functional vCenter. vCenter HA, on the other hand, would have started the passive vCenter with a minimum amount of downtime. Removing the single point of failure from vCenter minimizes vCenter downtime, so Horizon can continue to provide users with desktops and applications.