Live migration

Live migration, also called migration, refers to the process of moving a running virtual machine (VM) between different physical machines in such a way that the VM and the applications running within it are mostly unaffected. The memory, storage, and network connectivity of the VM are transferred from the source host to the destination host. The time between stopping the VM or application on the source and resuming it on the destination is called "downtime" or "blackout".

Live migration basics

Live migration requires the VM's entire state to be transferred from the source host to the target host. This includes the state of every component of the VM, for example, the register contents of the virtual CPUs (vCPUs). The state of the VM is not stable until the VM is paused. For example, until the vCPUs have been paused, the contents of the virtual registers may still be changing.

The simplest way to implement live migration is to pause the VM, serialize and send the state to the target host, and resume the VM on the target host. Serializing and sending certain state, like the state of the VM's main memory or locally-attached storage devices, can be slow.
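The pause, serialize, send, and resume sequence can be illustrated with a minimal sketch. Everything here is hypothetical: `PausableVM`, `serialize`, `deserialize`, and `stop_and_copy` are toy names, a JSON blob stands in for the bytes sent over the network, and a real hypervisor would transfer far more state than registers and memory.

```python
import json

class PausableVM:
    """Toy VM: a few registers, some memory, and a paused flag (illustrative only)."""
    def __init__(self, registers, memory, paused=False):
        self.registers = registers
        self.memory = memory
        self.paused = paused

def serialize(vm):
    # The state is only stable once the VM is paused, so refuse to serialize earlier.
    if not vm.paused:
        raise RuntimeError("VM must be paused before its state is serialized")
    return json.dumps({"registers": vm.registers, "memory": vm.memory.hex()}).encode()

def deserialize(blob):
    state = json.loads(blob)
    return PausableVM(state["registers"], bytes.fromhex(state["memory"]), paused=True)

def stop_and_copy(source):
    source.paused = True          # 1. pause the VM on the source host
    blob = serialize(source)      # 2. serialize (and, in reality, send) the state
    target = deserialize(blob)    # 3. rebuild the VM on the target host
    target.paused = False         # 4. resume the VM on the target host
    return target

source = PausableVM({"rip": 4096, "rax": 7}, b"\x00\x01\x02")
target = stop_and_copy(source)
```

In this sketch the blackout covers steps 1 through 4, which is why the cost of serializing and sending large state matters.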

Memory live migration

If the entire contents of the VM's main memory must be copied while the VM is paused, the VM will have to remain paused for an extended period of time, depending on the size of the VM's main memory. There are several techniques for reducing blackout for large-memory VMs.
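As a rough illustration of how blackout scales with memory size, the sketch below divides the memory size by the link throughput. The function name and the example figures (64 GiB of memory, a link sustaining 1.25e9 bytes per second) are assumptions chosen for illustration, and the estimate ignores serialization overhead and all other VM state.

```python
def stop_and_copy_blackout_seconds(memory_bytes, bandwidth_bytes_per_s):
    """Lower bound on blackout if all memory is copied while the VM is paused."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return memory_bytes / bandwidth_bytes_per_s

GIB = 1024 ** 3

# Hypothetical VM with 64 GiB of memory and a link sustaining 1.25e9 bytes/s.
blackout = stop_and_copy_blackout_seconds(64 * GIB, 1.25e9)
print(f"{blackout:.1f} s")
```

Under these assumed numbers the VM would be paused for about 55 seconds, which is the kind of blackout the techniques below try to avoid.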

Pre-copy memory migration

Pre-copy memory migration requires the hypervisor to track which pages the guest is writing to. With this ability, the following strategy is possible^[1]:

1. Copy all of the VM's main memory pages to the target host.
2. Before copying any particular page, start tracking subsequent writes to that page (e.g., by write-protecting the corresponding SLAT entry).
3. After all pages have been copied, start a new copying pass, but only copy pages that were written to since they were last copied to the target host (i.e., pages that are "dirty").
4. Repeat step 3 until the set of remaining dirty pages is small.
5. Pause the VM, and transfer the remaining dirty pages to the target host.

This is the basic pre-copy strategy. Other optimizations can be applied, like not re-sending pages that will likely be dirtied again quickly.^[2]
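The basic strategy can be simulated in a few lines. This is a sketch under stated assumptions, not a hypervisor implementation: `ToyVM`, `pre_copy_migrate`, and `busy_guest` are hypothetical names, a Python set stands in for write-protection-based dirty tracking, and the guest is modelled as running between copying passes rather than truly concurrently with them.

```python
class ToyVM:
    """Toy VM whose memory is a list of page values (illustrative only)."""
    def __init__(self, num_pages):
        self.pages = [0] * num_pages
        self.dirty = set()
        self.paused = False

    def write(self, page, value):
        if self.paused:
            raise RuntimeError("a paused VM cannot write to memory")
        self.pages[page] = value
        self.dirty.add(page)  # stands in for a write-protection fault being recorded

def pre_copy_migrate(vm, run_guest, threshold, max_rounds=10):
    """Return (target_pages, total_pages_sent, pages_sent_during_blackout)."""
    target = [None] * len(vm.pages)
    vm.dirty.clear()                       # start tracking writes before copying
    to_send = set(range(len(vm.pages)))    # first pass: every page
    sent = 0
    for _ in range(max_rounds):
        if len(to_send) <= threshold:      # remaining set is small enough
            break
        for page in to_send:
            target[page] = vm.pages[page]
        sent += len(to_send)
        run_guest(vm)                      # guest keeps writing while we copy
        to_send, vm.dirty = vm.dirty, set()  # next pass: only pages dirtied since
    vm.paused = True                       # blackout begins
    for page in to_send:
        target[page] = vm.pages[page]
    sent += len(to_send)
    return target, sent, len(to_send)

def busy_guest(vm):
    # Hypothetical workload that keeps rewriting the same four hot pages.
    for page in range(4):
        vm.write(page, vm.pages[page] + 1)

vm = ToyVM(num_pages=16)
target, sent, blackout_pages = pre_copy_migrate(vm, busy_guest, threshold=4)
```

In this run the first pass copies all 16 pages, the guest dirties 4 of them, and those 4 are copied again while the VM is paused, so 20 page copies are made in total but only 4 fall inside the blackout.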

The guest is usually able to write to memory faster than it can be copied to the target, which means that pre-copy might never converge to a small dirty set. If the hypervisor does not throttle writes to guest memory, blackout time may remain large, even with pre-copy memory migration.
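A simple model makes the convergence condition concrete. Assume, purely for illustration, that the guest dirties distinct pages at a constant rate and that pages are copied at a constant rate; then each pass takes time proportional to the pages it sends, and the next dirty set is the dirty rate multiplied by that time. The function name and all figures below are hypothetical.

```python
def rounds_to_converge(total_pages, dirty_rate, copy_rate, threshold, max_rounds=100):
    """Number of copying passes before the dirty set is <= threshold, or None.

    dirty_rate and copy_rate are in pages per second (assumed constant).
    """
    remaining = float(total_pages)
    for rounds in range(max_rounds + 1):
        if remaining <= threshold:
            return rounds
        pass_time = remaining / copy_rate
        # Pages dirtied during this pass; cannot exceed the VM's memory size.
        remaining = min(total_pages, dirty_rate * pass_time)
    return None

# Guest dirties pages at a quarter of the copy rate: the dirty set shrinks 4x per pass.
converging = rounds_to_converge(1_000_000, 50_000, 200_000, threshold=1_000)

# Guest dirties pages faster than they can be copied: the dirty set never shrinks.
diverging = rounds_to_converge(1_000_000, 300_000, 200_000, threshold=1_000)
```

In this model `converging` is 5 and `diverging` is `None`: the dirty set shrinks by the ratio of dirty rate to copy rate on each pass, so pre-copy only converges when that ratio is below one, for example because writes are throttled.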

Post-copy memory migration

Post-copy memory migration is another strategy for copying the VM's main memory outside of blackout. It can be used with pre-copy memory migration, as it can limit the amount of time the VM must stay paused during memory migration even if pre-copy doesn't converge on a small set of dirty pages.

Post-copy memory migration requires the hypervisor to be able to intercept guest accesses to the VM's memory. With this capability, the hypervisor can resume the VM on the target host without a complete copy of the VM's memory. When the guest accesses a page that is not yet present on the target (i.e., a page that was still dirty at the end of pre-copy, if pre-copy was done), the hypervisor fetches that page from the source on demand.^[3]

To ensure that all dirty pages are eventually transferred, a hypervisor implementing post-copy live migration will also fetch outstanding dirty pages in the background, concurrently with on-demand fetching of dirty pages.
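The combination of on-demand and background fetching can be sketched as follows. The class and method names are hypothetical, a Python list stands in for the source host's memory, and the interception of guest accesses is modelled as an explicit check in `read` and `write` rather than a real page fault.

```python
class PostCopyTarget:
    """Toy target-side memory for post-copy migration (illustrative only)."""
    def __init__(self, source_pages, present):
        self.source = source_pages    # stands in for fetching over the network
        self.pages = dict(present)    # pages already on the target, e.g. from pre-copy
        self.missing = set(range(len(source_pages))) - set(self.pages)
        self.fetches = 0

    def _fetch(self, page):
        self.pages[page] = self.source[page]
        self.missing.discard(page)
        self.fetches += 1

    def read(self, page):
        if page in self.missing:      # intercepted access to an absent page
            self._fetch(page)
        return self.pages[page]

    def write(self, page, value):
        if page in self.missing:      # real writes may be partial, so fetch first
            self._fetch(page)
        self.pages[page] = value

    def background_step(self, batch=1):
        # Pull a few outstanding pages even if the guest never touches them.
        for page in sorted(self.missing)[:batch]:
            self._fetch(page)

source = [10, 11, 12, 13, 14, 15]
target = PostCopyTarget(source, present={0: 10, 1: 11})
value = target.read(4)                # on-demand fetch of page 4
while target.missing:
    target.background_step(batch=2)   # background fetch of the rest
```

Each of the four initially missing pages is fetched exactly once in this run, either on demand or in the background, and the loop ends only when nothing remains on the source.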

Post-copy sends each page exactly once over the network, whereas pre-copy can transfer the same page multiple times if the page is dirtied repeatedly at the source during migration. On the other hand, pre-copy retains an up-to-date copy of the VM's state at the source during migration, whereas during post-copy the VM's state is split across the source and the destination. If the destination fails during pre-copy live migration, the live migration can be restarted, but if the destination fails during post-copy live migration, the VM cannot be recovered.