Sunday, April 26, 2009

AMD and Intel I/O Virtualization

Virtualization now faces an I/O barrier where consolidated applications must vie for increasingly limited I/O resources. Early virtualization techniques - both software and hardware assisted - concentrated on process isolation and coarse-grained context switching to accelerate the "bulk" of the virtualization process: running multiple virtual machines without significant processing degradation.

As consolidation potential is greatly enhanced by new processors with many more execution contexts (threads and cores), the limitations imposed on I/O - software translation and emulation of device communication - begin to degrade performance. This degradation further limits consolidation, especially where significant network traffic (over 3Gbps of non-storage VM traffic per virtual server) or specialized device access comes into play.

I/O Virtualization - The Next Step-Up


Intrinsic to AMD-V in revision "F" Opterons and newer AM2 processors is a limited form of I/O virtualization: hardware-assisted memory management in the form of the Graphics Aperture Remapping Table (GART) and the Device Exclusion Vector (DEV). These two facilities translate I/O device accesses to a limited range of the system physical address space and provide basic I/O device classification and memory protection.
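A rough way to picture the DEV is as a bitmap over system physical memory, one bit per 4KB page, consulted on each upstream (device-to-memory) access. The C sketch below is only a conceptual illustration, not AMD's actual data structures or register interface; the names and the "set bit means protected" convention are assumptions for clarity.

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT 12  /* 4 KB pages */

/* Conceptual Device Exclusion Vector: one bit per physical 4KB page.
 * In this sketch a set bit marks a page that devices may NOT touch
 * (an illustrative assumption; see the AMD specification for the real
 * layout and bit polarity). */
struct dev_bitmap {
    uint8_t  *bits;      /* bitmap backing store in system memory  */
    uint64_t  num_pages; /* number of 4KB pages the bitmap covers  */
};

/* Returns true if an upstream device access to 'phys_addr' would be
 * permitted by this exclusion vector. */
static bool dev_access_allowed(const struct dev_bitmap *dev, uint64_t phys_addr)
{
    uint64_t page = phys_addr >> PAGE_SHIFT;

    if (page >= dev->num_pages)
        return false;                        /* outside covered range: deny */

    uint8_t byte = dev->bits[page / 8];
    return (byte & (1u << (page % 8))) == 0; /* clear bit => access allowed */
}
```

Note that this only answers "may the device touch this page at all" - it offers protection, not translation of device addresses on a per-VM basis.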

Combined with specialized software, GART and DEV provided primitive I/O virtualization but were limited to the confines of the memory map. Direct interaction with devices and hardware virtualization of device contexts are not efficiently possible in this approach, as VMs must still rely on the hypervisor to mediate device access. AMD defined its I/O virtualization strategy as AMD IOMMU in 2006 (now AMD-Vi) and has continued to improve it through 2009.
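What a full IOMMU adds, conceptually, is a per-device remapping step: DMA addresses issued by a device assigned to a VM are translated through hypervisor-managed tables into system physical addresses. The single-level lookup below is a simplified sketch of that idea - real AMD-Vi and VT-d implementations use multi-level I/O page tables selected by the PCI requester ID - and all names here are illustrative.

```c
#include <stdint.h>

#define IO_PAGE_SHIFT 12
#define IO_PAGE_SIZE  (1ULL << IO_PAGE_SHIFT)

/* One entry of a (simplified, single-level) I/O page table. */
struct io_pte {
    uint64_t sys_phys;    /* system physical page base address */
    unsigned present:1;   /* mapping valid?                    */
    unsigned writable:1;  /* device may write?                 */
};

/* Per-device translation context the hypervisor would maintain; real
 * hardware selects it by PCI requester ID (bus/device/function). */
struct io_domain {
    struct io_pte *table;     /* flat table indexed by device page number */
    uint64_t       num_pages;
};

/* Translate a DMA address issued by the device into a system physical
 * address. Returns 0 on fault (an illustrative convention only). */
static uint64_t iommu_translate(const struct io_domain *dom,
                                uint64_t dma_addr, int is_write)
{
    uint64_t page = dma_addr >> IO_PAGE_SHIFT;
    uint64_t off  = dma_addr & (IO_PAGE_SIZE - 1);

    if (page >= dom->num_pages)
        return 0;
    const struct io_pte *pte = &dom->table[page];
    if (!pte->present || (is_write && !pte->writable))
        return 0;                  /* would raise an I/O page fault */
    return pte->sys_phys + off;
}
```

Because this translation happens in hardware on every DMA, a guest can be handed a device directly while the hypervisor retains control over which memory that device can reach.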

With the release of new motherboard chipsets (the AMD SR5690) in 2009, end-to-end I/O virtualization will bring significant I/O performance gains to the platform. Motherboard refreshes based on the SR5690 should enable Shanghai and Istanbul processors to take advantage of the full AMD IOMMU specification (now AMD-Vi).

Intel's VT-d approach likewise combines chipset and CPU features to solve the same problem. Because Intel's memory controller was architecturally separate from the CPU, earlier processors not only had to carry the additional instruction enhancements but also had to be paired with northbridge chipsets that contained the supporting logic. This feature was initially available in the Intel Q35 desktop chipset in Q3/2007.

I/O Virtualization: Nod to Intel


Both Intel Nehalem and AMD Opteron I/O virtualization (IOV) technologies require a combination of CPU instruction enhancements and motherboard (I/O hub and southbridge) support to implement end-to-end IOV. Intel's Nehalem-EP platform is the first to bring both pieces together on a modern platform, giving the IOV performance edge to Intel.

Likewise, given Intel's earlier introduction of VT-d technology in legacy Xeon systems, there is a ready-made software ecosystem with the technology baked in. AMD's burden will be to work effectively with software vendors like Xen and VMware to bake in its technology in advance of the SR5690 general release.

Limited Value Proposition (Today)


The performance enhancements of hardware IOV do not yet reach a broad market within the virtualization segment. Moreover, IOV is only beneficial where the acceleration of I/O is necessary at the VM level - not at the hypervisor level. The majority of today's workloads benefit more from memory management improvements like EPT (Intel's Extended Page Tables) and RVI (AMD's Rapid Virtualization Indexing), which are tied to memory and context switching, not I/O.

Next-generation virtual workloads requiring significant I/O performance and/or direct hardware access are experimental or highly specialized today. In the not-so-distant future, the ability to fully utilize 10Gbps Ethernet and 40Gbps InfiniBand will be required to virtualize extreme workloads. Such improvements will also have to be supported by the device hardware and its drivers.

Interestingly, direct hardware access could allow GPU-based computing to infiltrate the virtualization space. Giving a hypervisor like Xen or VMware the ability to off-load processor-intensive calculations to vector processors from NVIDIA or ATI could open a new range of applications to the virtual space: facial and voice recognition, video compression and processing, 3D graphics acceleration, etc. Such capabilities could increase the effectiveness of virtualization farms dramatically.

As the number of processor sockets and cores continues to increase, IOV will create opportunities to incorporate "appliance" workloads that are today restricted to specialized or dedicated hardware into a general virtualization platform. Such applications might include firewalls, routers, VoIP applications, SSL gateways, HPC nodes, advanced VDI, etc., where a moderate-to-high ratio of I/O to compute exists. Until this type of application is more commonplace in very high-traffic environments, IOV will present a limited value proposition.
