9.27.2016

Keeping Your Cloud Online with VMware Fault Tolerance

Last updated:
9.16.2020
12.18.2020

Green House Data provides a 100% SLA, which means your cloud infrastructure is guaranteed to be online 24/7. But if you manage your virtual machines yourself, errors in application deployment, cyber attacks, configuration mishaps, heavy network traffic, and other issues can still cause them to crash. One tool in the arsenal for fighting cloud downtime is VMware Fault Tolerance.

Fault Tolerance (FT) increases the availability of virtual machines by creating an identical copy of the production VM that is continuously updated and ready to replace the original VM in the event of downtime. VMware FT is part of vSphere High Availability and works with it to keep the backup VM in step with the original.

FT is often used for applications that require constant availability, especially if they have continual or near-constant client connections, or for custom applications that require clustering.

 

How does Fault Tolerance work in vSphere?

VMware fault tolerance keeps VMs active when a host goes down

FT is enabled manually for individual virtual machines. The second VM resides on a separate host in your cluster so it does not go down with the production VM during a downtime event. Because the VMs run in lockstep, or in parallel, on separate hosts, vMotion compatibility is required.

The servers continuously exchange heartbeats, each monitoring the other's status to ensure FT is maintained. The ultimate goal is no user interruptions and zero data loss. In addition, FT uses atomic file locking so that only one side of the failover keeps running. This avoids a potential problem in which two active, identical VMs run into storage and configuration conflicts when the original VM is restored.
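As a rough illustration of why atomic file locking leaves only one side running, here is a minimal Python sketch. It is not VMware's implementation: the lock file name `ft.lock`, the helper names, and the temporary directory standing in for shared storage are all hypothetical. It only shows the general technique, where creating the lock file is a single all-or-nothing step, so two contenders can never both succeed.

```python
import os
import tempfile


def try_acquire(lock_path, owner):
    """Atomically create the lock file; only one caller can succeed."""
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists,
        # so "check" and "create" happen as one atomic step.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(owner)
    return True


def lock_owner(lock_path):
    """Return the name written by whichever side won the lock."""
    with open(lock_path) as f:
        return f.read()


def demo():
    # A temporary directory stands in for shared storage (an assumption).
    with tempfile.TemporaryDirectory() as shared:
        lock = os.path.join(shared, "ft.lock")  # hypothetical file name
        secondary_won = try_acquire(lock, "secondary")
        primary_won = try_acquire(lock, "primary")
        return secondary_won, primary_won, lock_owner(lock)


if __name__ == "__main__":
    print(demo())
```

Whichever VM takes the lock first stays active; the other one sees the failed attempt and stands down instead of running as a duplicate.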

vCenter Server 4.x and 5.x support one vCPU per VM for FT, while vCenter Server 6 supports up to four vCPUs. Both your cluster and your virtual machines must also meet specific requirements before FT can be enabled.
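The vCPU limits mentioned above can be captured in a small pre-flight check. This is a sketch only: the function names are hypothetical, it encodes just the limits stated in this post (one vCPU for 4.x and 5.x, four for version 6), and it does not cover any other cluster or VM requirement.

```python
def max_ft_vcpus(vcenter_version):
    """Return the FT vCPU limit for a version string such as '5.5' or '6.0'."""
    major = int(vcenter_version.split(".")[0])
    if major in (4, 5):
        return 1
    if major == 6:
        return 4
    # Other versions are outside what this post describes.
    raise ValueError("no FT vCPU limit known for version " + vcenter_version)


def can_enable_ft(vcenter_version, vm_vcpus):
    """True if the VM's vCPU count fits within the FT limit."""
    return vm_vcpus <= max_ft_vcpus(vcenter_version)


if __name__ == "__main__":
    for version, vcpus in [("5.5", 2), ("6.0", 4)]:
        print(version, vcpus, can_enable_ft(version, vcpus))
```

A check like this is handy when auditing an inventory for VMs that would need to be resized before FT can be turned on.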

The Difference Between High Availability and Fault Tolerance

HA is designed to prevent downtime due to the loss of the physical server, loss of a VM, or even loss of an application within a VM. With HA turned on, surviving servers detect downtime and the master node assigns failed VMs to other nodes. If the physical server does not go down, the VM is restarted on the original host.
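The HA restart behavior described above can be modeled in a few lines of Python. This is a toy model, not how vSphere actually chooses hosts: the function name and host names are made up, and the round-robin assignment to surviving hosts is an assumption chosen to keep the example short.

```python
def plan_restarts(failed_vms, hosts_up):
    """Decide where each failed VM restarts.

    failed_vms maps VM name -> original host.
    hosts_up maps host name -> True if the host is still running.
    """
    survivors = sorted(h for h, up in hosts_up.items() if up)
    if not survivors:
        raise RuntimeError("no surviving hosts to restart VMs on")
    plan = {}
    next_idx = 0
    for vm, original in sorted(failed_vms.items()):
        if hosts_up.get(original, False):
            # The host itself is fine, so restart the VM where it was.
            plan[vm] = original
        else:
            # The host is down: hand the VM to a surviving host
            # (round-robin is an assumption of this sketch).
            plan[vm] = survivors[next_idx % len(survivors)]
            next_idx += 1
    return plan


if __name__ == "__main__":
    vms = {"app": "esx1", "db": "esx2", "web": "esx1"}
    hosts = {"esx1": False, "esx2": True, "esx3": True}
    print(plan_restarts(vms, hosts))
```

Note that every VM in the plan still goes through a full restart, which is the key difference from FT.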

Although FT works with HA, they do not achieve the same failover results. FT depends on HA, but HA does not depend on FT, so HA should be configured first.

Both work to reduce downtime by transitioning VM workloads to a new host when the original ESX/ESXi host fails. High Availability is better suited to VMs for which 24/7 uptime is not essential. It allows the VM to fail fully before launching it on another host. With Fault Tolerance, the failover is instantaneous, and the VM that takes over is in turn copied to another server and kept in lockstep with it.
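The FT failover sequence can be sketched as a small state change. The `FtPair` type, the `fail_over` function, and the host names are hypothetical, and picking the first available spare host is an assumption; handling a failure of the secondary's host the same way is also an assumption of this sketch rather than something stated above.

```python
from typing import List, NamedTuple


class FtPair(NamedTuple):
    primary_host: str
    secondary_host: str


def fail_over(pair: FtPair, failed_host: str, cluster_hosts: List[str]) -> FtPair:
    """Return the FT pair after a host failure."""
    if failed_host == pair.primary_host:
        # The secondary takes over immediately as the new primary.
        new_primary = pair.secondary_host
    elif failed_host == pair.secondary_host:
        # Assumption: the primary keeps running and only the copy is replaced.
        new_primary = pair.primary_host
    else:
        return pair  # The failure did not touch this pair.
    # A fresh copy goes to some other host so the pair stays protected.
    spares = [h for h in cluster_hosts if h not in (failed_host, new_primary)]
    if not spares:
        raise RuntimeError("no spare host available for a new secondary")
    return FtPair(primary_host=new_primary, secondary_host=spares[0])


if __name__ == "__main__":
    pair = FtPair("esx1", "esx2")
    print(fail_over(pair, "esx1", ["esx1", "esx2", "esx3"]))
```

Unlike an HA restart, no step here waits for a VM to boot: the workload is already running on the surviving host when the failure is detected.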

 

For mission-critical VMs, VMware Fault Tolerance can be a great way to minimize or eliminate downtime, as in the vast majority of cases, workloads are not interrupted at all. Read some best practices for Fault Tolerance from the VMware vSphere 6.0 Documentation Center before you configure your environment to ensure your FT works properly.
