In computing, virtualization (v12n) is a series of technologies that allows dividing of physical computing resources into a series of virtual machines, operating systems, processes or containers.[1]
Virtualization began in the 1960s with IBM CP/CMS.[1] The control program CP provided each user with a simulated stand-alone System/360 computer.
In hardware virtualization, the host machine is the machine that is used by the virtualization and the guest machine is the virtual machine. The words host and guest are used to distinguish the software that runs on the physical machine from the software that runs on the virtual machine. The software or firmware that creates a virtual machine on the host hardware is called a hypervisor or virtual machine monitor.[2] Hardware virtualization is not the same as hardware emulation. Hardware-assisted virtualization facilitates building a virtual machine monitor and allows guest OSes to be run in isolation.
Desktop virtualization is the concept of separating the logical desktop from the physical machine.
Operating-system-level virtualization, also known as containerization, refers to an operating system feature in which the kernel allows the existence of multiple isolated user-space instances.
The usual goal of virtualization is to centralize administrative tasks while improving scalability and overall hardware-resource utilization.
History
A form of virtualization was first demonstrated with IBM's CP-40 research system in 1967, then distributed via open source in CP/CMS in 1967–1972, and re-implemented in IBM's VM family from 1972 to the present. Each CP/CMS user was provided a simulated, stand-alone computer. Each such virtual machine had the complete capabilities of the underlying machine, and (for its user) the virtual machine was indistinguishable from a private system. This simulation was comprehensive, and was based on the Principles of Operation manual for the hardware. It thus included such elements as an instruction set, main memory, interrupts, exceptions, and device access. The result was a single machine that could be multiplexed among many users.
Hardware-assisted virtualization first appeared on the IBM System/370 in 1972, for use with VM/370, the first virtual machine operating system. IBM added virtual memory hardware to the System/370 series in 1972 which is not the same as Intel VT-x Rings providing a higher privilege level for Hypervisor to properly control Virtual Machines requiring full access to Supervisor and Program or User modes.
With the increasing demand for high-definition computer graphics (e.g. CAD), virtualization of mainframes lost some attention in the late 1970s, when the upcoming minicomputers fostered resource allocation through distributed computing, encompassing the commoditization of microcomputers.
The increase in compute capacity per x86 server (and in particular the substantial increase in modern networks' bandwidths) rekindled interest in data-center based computing which is based on virtualization techniques. The primary driver was the potential for server consolidation: virtualization allowed a single server to cost-efficiently consolidate compute power on multiple underutilized dedicated servers. The most visible hallmark of a return to the roots of computing is cloud computing, which is a synonym for data center based computing (or mainframe-like computing) through high bandwidth networks. It is closely connected to virtualization.
equivalence: a program running under the virtual machine monitor (VMM) should exhibit a behavior essentially identical to that demonstrated when running on an equivalent machine directly
resource control (also called safety): the VMM must be in complete control of the virtualized resources
efficiency: a statistically dominant fraction of machine instructions must be executed without VMM intervention
This made it difficult to implement a virtual machine monitor for this type of processor. Specific limitations included the inability to trap on some privileged instructions.[3] Therefore, to compensate for these architectural limitations, designers accomplished virtualization of the x86 architecture through two methods: full virtualization or paravirtualization.[4] Both create the illusion of physical hardware to achieve the goal of operating system independence from the hardware but present some trade-offs in performance and complexity.
In 2005 and 2006, Intel and AMD (working independently) created new processor extensions to the x86 architecture called Intel VT-x and AMD-V, respectively. On the Itanium architecture, hardware-assisted virtualization is known as VT-i. The first generation of x86 processors to support these extensions were released in late 2005 early 2006:
On November 13, 2005, Intel released two models of Pentium 4 (Model 662 and 672) as the first Intel processors to support VT-x.
On May 23, 2006, AMD released the Athlon 64 ("Orleans"), the Athlon 64 X2 ("Windsor") and the Athlon 64 FX ("Windsor") as the first AMD processors to support this technology.
Hardware virtualization (or platform virtualization) pools computing resources across one or more virtual machines. A virtual machine implements functionality of a (physical) computer with an operating system. The software or firmware that creates a virtual machine on the host hardware is called a hypervisor or virtual machine monitor.[2]
Software executed on these virtual machines is separated from the underlying hardware resources. For example, a computer that is running Arch Linux may host a virtual machine that looks like a computer with the Microsoft Windows operating system; Windows-based software can be run on the virtual machine.[5][6]
Different types of hardware virtualization include:
Full virtualization – Almost complete virtualization of the actual hardware to allow software environments, including a guest operating system and its apps, to run unmodified.
Paravirtualization – The guest apps are executed in their own isolated domains, as if they are running on a separate system, but a hardware environment is not simulated. Guest programs need to be specifically modified to run in this environment.
Hybrid virtualization – Mostly full virtualization but utilizes paravirtualization drivers to increase virtual machine performance.
Full virtualization
Full virtualization employs techniques that pools physical computer resources into one or more instances; each running a virtual environment where any software or operating system capable of execution on the raw hardware can be run in the virtual machine. Two common full virtualization techniques are typically used: (a) binary translation and (b) hardware-assisted full virtualization.[1] Binary translation automatically modifies the software on-the-fly to replace instructions that "pierce the virtual machine" with a different, virtual machine safe sequence of instructions.[7] Hardware-assisted virtualization allows guest operating systems to be run in isolation with virtually no modification to the (guest) operating system.
Full virtualization requires that every salient feature of the hardware be reflected into one of several virtual machines – including the full instruction set, input/output operations, interrupts, memory access, and whatever other elements are used by the software that runs on the bare machine, and that is intended to run in a virtual machine.
This approach was pioneered in 1966 with the IBM CP-40 and CP-67, predecessors of the VM family.
Binary translation
In binary translation, instructions are translated to match the emulated hardware architecture.[1] A piece of hardware imitates another while in hardware assisted virtualization, a hypervisor (a piece of software) imitates a particular piece of computer hardware or the entire computer. Furthermore, a hypervisor is not the same as an emulator; both are computer programs that imitate hardware, but their domain of use in language differs.[8]
Hardware assisted
Hardware-assisted virtualization (or accelerated virtualization; Xen calls it hardware virtual machine (HVM), and Virtual Iron calls it native virtualization) is a way of improving overall efficiency of hardware virtualization using help from the host processors. A full virtualization is used to emulate a complete hardware environment, or virtual machine, in which an unmodified guest operating system (using the same instruction set as the host machine) effectively executes in complete isolation.
Hardware-assisted virtualization was first introduced on the IBM 308X processors in 1980, with the Start Interpretive Execution (SIE) instruction.[9] It was added to x86 processors (Intel VT-x, AMD-V or VIA VT) in 2005, 2006 and 2010[10] respectively.
Hardware-assisted virtualization reduces the maintenance overhead of paravirtualization as it reduces (ideally, eliminates) the changes needed in the guest operating system. It is also considerably easier to obtain better performance.
Paravirtualization
Paravirtualization is a virtualization technique that presents a software interface to the virtual machines which is similar, yet not identical, to the underlying hardware–software interface. Paravirtualization improves performance and efficiency, compared to full virtualization, by having the guest operating system communicate with the hypervisor. By allowing the guest operating system to indicate its intent to the hypervisor, each can cooperate to obtain better performance when running in a virtual machine.
The intent of the modified interface is to reduce the portion of the guest's execution time spent performing operations which are substantially more difficult to run in a virtual environment compared to a non-virtualized environment. The paravirtualization provides specially defined 'hooks' to allow the guest(s) and host to request and acknowledge these tasks, which would otherwise be executed in the virtual domain (where execution performance is worse). A successful paravirtualized platform may allow the virtual machine monitor (VMM) to be simpler (by relocating execution of critical tasks from the virtual domain to the host domain), and/or reduce the overall performance degradation of machine execution inside the virtual guest.
Paravirtualization requires the guest operating system to be explicitly ported for the para-API – a conventional OS distribution that is not paravirtualization-aware cannot be run on top of a paravirtualizing VMM. However, even in cases where the operating system cannot be modified, components may be available that enable many of the significant performance advantages of paravirtualization. For example, the Xen Windows GPLPV project provides a kit of paravirtualization-aware device drivers, that are intended to be installed into a Microsoft Windows virtual guest running on the Xen hypervisor.[11] Such applications tend to be accessible through the paravirtual machine interface environment. This ensures run-mode compatibility across multiple encryption algorithm models, allowing seamless integration within the paravirtual framework.[12]
History
The term "paravirtualization" was first used in the research literature in association with the Denali Virtual Machine Manager.[13] The term is also used to describe the Xen, L4, TRANGO, VMware, Wind River and XtratuMhypervisors. All these projects use or can use paravirtualization techniques to support high performance virtual machines on x86 hardware by implementing a virtual machine that does not implement the hard-to-virtualize parts of the actual x86 instruction set.[14]
In 2005, VMware proposed a paravirtualization interface, the Virtual Machine Interface (VMI), as a communication mechanism between the guest operating system and the hypervisor. This interface enabled transparent paravirtualization in which a single binary version of the operating system can run either on native hardware or on a hypervisor in paravirtualized mode.
The first appearance of paravirtualization support in Linux occurred with the merge of the ppc64 port in 2002,[15] which supported running Linux as a paravirtualized guest on IBM pSeries (RS/6000) and iSeries (AS/400) hardware.
At the USENIX conference in 2006 in Boston, Massachusetts, a number of Linux development vendors (including IBM, VMware, Xen, and Red Hat) collaborated on an alternative form of paravirtualization, initially developed by the Xen group, called "paravirt-ops".[16] The paravirt-ops code (often shortened to pv-ops) was included in the mainline Linux kernel as of the 2.6.23 version, and provides a hypervisor-agnostic interface between the hypervisor and guest kernels. Distribution support for pv-ops guest kernels appeared starting with Ubuntu 7.04 and RedHat 9. Xen hypervisors based on any 2.6.24 or later kernel support pv-ops guests, as does VMware's Workstation product beginning with version 6.[17]
Hybrid virtualization
Hybrid virtualization combines full virtualization techniques with paravirtualized drivers to overcome limitations with hardware-assisted full virtualization.[18]
A hardware-assisted full virtualization approach uses an unmodified guest operating system that involves many VM traps producing high CPU overheads limiting scalability and the efficiency of server consolidation.[19] The hybrid virtualization approach overcomes this problem.
Desktop virtualization separates the logical desktop from the physical machine.
One form of desktop virtualization, virtual desktop infrastructure (VDI), can be thought of as a more advanced form of hardware virtualization. Rather than interacting with a host computer directly via a keyboard, mouse, and monitor, the user interacts with the host computer using another desktop computer or a mobile device by means of a network connection, such as a LAN, Wireless LAN or even the Internet. In addition, the host computer in this scenario becomes a server computer capable of hosting multiple virtual machines at the same time for multiple users.[20]
Companies like HP and IBM provide a hybrid VDI model with a range of virtualization software and delivery models to improve upon the limitations of distributed client computing.[21] Selected client environments move workloads from PCs and other devices to data center servers, creating well-managed virtual clients, with applications and client operating environments hosted on servers and storage in the data center. For users, this means they can access their desktop from any location, without being tied to a single client device. Since the resources are centralized, users moving between work locations can still access the same client environment with their applications and data.[21] For IT administrators, this means a more centralized, efficient client environment that is easier to maintain and able to more quickly respond to the changing needs of the user and business.[22]
Another form, session virtualization, allows multiple users to connect and log into a shared but powerful computer over the network and use it simultaneously. Each is given a desktop and a personal folder in which they store their files.[20] With multiseat configuration, session virtualization can be accomplished using a single PC with multiple monitors, keyboards, and mice connected.
Thin clients, which are seen in desktop virtualization, are simple and/or cheap computers that are primarily designed to connect to the network. They may lack significant hard disk storage space, RAM or even processing power, but many organizations are beginning to look at the cost benefits of eliminating "thick client" desktops that are packed with software (and require software licensing fees) and making more strategic investments.[23]
Desktop virtualization simplifies software versioning and patch management, where the new image is simply updated on the server, and the desktop gets the updated version when it reboots. It also enables centralized control over what applications the user is allowed to have access to on the workstation.
Moving virtualized desktops into the cloud creates hosted virtual desktops (HVDs), in which the desktop images are centrally managed and maintained by a specialist hosting firm. Benefits include scalability and the reduction of capital expenditure, which is replaced by a monthly operational cost.[24]
Operating-system-level virtualization, also known as containerization, refers to an operating system feature in which the kernel allows the existence of multiple isolated user-space instances. Such instances, called containers,[25] partitions, virtual environments (VEs) or jails (FreeBSD jail or chroot jail), may look like real computers from the point of view of programs running in them. A computer program running on an ordinary operating system can see all resources (connected devices, files and folders, network shares, CPU power, quantifiable hardware capabilities) of that computer. However, programs running inside a container can only see the container's contents and devices assigned to the container.
This provides many of the benefits that virtual machines have such as standardization and scalability, while using less resources as the kernel is shared between containers.[26]
Containerization started gaining prominence in 2014, with the introduction of Docker.[27][28]
Virtual file system: an abstraction layer on top of a more concrete file system, allowing client applications to access different types of concrete file systems in a uniform way
Storage hypervisor: the software that manages storage virtualization and combines physical storage resources into one or more flexible pools of logical storage[29]
Data virtualization: the presentation of data as an abstract layer, independent of underlying database systems, structures and storage
Database virtualization: the decoupling of the database layer, which lies between the storage and application layers within the application stack over all
Virtual private network (VPN): a network protocol that replaces the actual wire or other physical media in a network with an abstract layer, allowing a network to be created over the Internet
Network Protocol Virtualization: decoupling networking layers in order to accelerate the deployment and management of networks
Benefits and disadvantages
Virtualization, in particular, full virtualization has proven beneficial for:
sharing a computer system among multiple users;
isolating users from each other (and from the control program);
emulating new hardware to achieve improved reliability, security, and productivity.
A common goal of virtualization is to centralize administrative tasks while improving scalability and overall hardware-resource utilization. With virtualization, several operating systems can be run in parallel on a single central processing unit (CPU). This parallelism tends to reduce overhead costs and differs from multitasking, which involves running several programs on the same OS. Using virtualization, an enterprise can better manage updates and rapid changes to the operating system and applications without disrupting the user. "
Ultimately, virtualization dramatically improves the efficiency and availability of resources and applications in an organization. Instead of relying on the old model of "one server, one application" that leads to underutilized resources, virtual resources are dynamically applied to meet business needs without any excess fat".[30]
Virtual machines running proprietary operating systems require licensing, regardless of the host machine's operating system. For example, installing Microsoft Windows into a VM guest requires its licensing requirements to be satisfied.[31][32][33]
^Chris Barclay, New approach to virtualizing x86s, Network World, 20 October 2006
^Turban, E; King, D; Lee, J; Viehland, D (2008). "Chapter 19: Building E-Commerce Applications and Infrastructure". Electronic Commerce A Managerial Perspective. Prentice-Hall. p. 27.
^"Virtualization in education"(PDF). IBM. October 2007. Retrieved 6 July 2010. A virtual computer is a logical representation of a computer in software. By decoupling the physical hardware from the operating system, virtualization provides more operational flexibility and increases the utilization rate of the underlying physical hardware.
^"Installing signed GPLPV drivers in Windows Xen instances". Univention Wiki. Retrieved 2013-04-10. The GPLPV driver is a driver for Microsoft Windows, which enables Windows DomU systems virtualised in Xen to access the network and block drivers of the Xen Dom0. This provides a significant performance and reliability gain over the standard devices emulated by Xen/Qemu/Kvm.
^Armstrong, D (2011). "Performance issues in clouds: An evaluation of virtual image propagation and I/O paravirtualization". The Computer Journal. 54 (6): 836–849. doi:10.1093/comjnl/bxr011.
^Foley, Mary Jo (5 July 2012). "Microsoft goes public with Windows Server 2012 versions, licensing". ZDNet. CBS Interactive. Finn explained that Standard covers 2 CPUs in a host, and goes from one VOSE (virtual operating system environment - 1 free Std install in a VM on that host) to two, and 'now has all the features and scalability of Datacenter.' He noted there will be a small price increase, but said he thought that wouldn't matter, as it 'should be virtualized anyway and the VOSE rights doubling will compensate. Windows Server Datacenter was a minimum of two 1-CPU licenses with unlimited VOSEs. 'Now it is a simpler SKU that covers two CPUs in a host with unlimited VOSEs,' Finn said.