ELI:Bare-MetalPerformanceforI/OVirtualization资源-CSDN下载

需积分: 35 146 浏览量 2013-03-18 00:55:18 上传评论收藏 722KB PDF 举报

### ELI：裸金属性能下的I/O虚拟化 #### 概述 ELI（Exit-Less Interrupts）是一项旨在提高I/O密集型工作负载在虚拟机环境中的性能的技术方案。传统上，在虚拟化环境中，即使采用了直接设备分配（Direct Device Assignment），即允许虚拟机直接与物理I/O设备通信，仍无法达到裸金属系统的性能水平。这主要是因为宿主机仍然需要拦截所有中断，包括由已分配设备产生的用于通知虚拟机I/O请求完成的中断。这种中断处理方式导致了频繁的上下文切换，从而显著降低了I/O密集型工作负载的性能。 #### 技术背景与挑战在虚拟化环境中，硬件资源被多个虚拟机共享。为了实现资源共享，通常采用虚拟化层或管理程序来管理和调度这些资源。然而，这种方式会带来额外的开销，尤其是在处理I/O操作时。例如，当一个设备完成了一个I/O操作并向系统发出中断信号时，这个信号首先被宿主机捕获并处理。然后，宿主机再将该事件传递给相应的虚拟机。这一过程涉及多次上下文切换，每次切换都需要保存当前虚拟机的状态并恢复宿主机的状态，然后再恢复虚拟机的状态继续执行。这些额外的操作不仅增加了延迟，还消耗了大量的计算资源。 #### ELI技术详解 ELI技术的核心在于软件层面改进了中断处理机制，使得中断可以直接在虚拟机内部处理而无需宿主机介入。具体来说： 1. **直接中断处理**：通过修改虚拟机内的驱动程序或操作系统内核，使其能够识别和处理来自物理设备的中断信号。这样，当设备完成I/O操作时，它直接向虚拟机发送中断信号，虚拟机内部的驱动程序负责处理这些中断。 2. **安全性保证**：为了确保系统的安全性和稳定性，ELI还需要解决几个关键问题： - 防止恶意虚拟机滥用中断信号来干扰其他虚拟机或宿主机。 - 确保虚拟机正确处理中断信号，并不会导致数据损坏或其他安全漏洞。 3. **性能提升**：通过消除宿主机在中断处理路径中的参与，ELI能够显著减少上下文切换次数，进而大幅提高I/O密集型工作负载的吞吐量和降低延迟。实验结果显示，对于未经过修改的、不可信的虚拟机，ELI能够使其性能提升1.3到1.6倍，甚至可以达到接近97%至100%的裸金属性能水平。 #### 实现方法与应用场景 ELI技术主要通过以下步骤实现： 1. **硬件支持**：虽然ELI是一种纯软件解决方案，但其有效性取决于底层硬件的支持。例如，某些现代处理器提供了特定的指令集来支持虚拟化和中断处理。 2. **软件修改**：对虚拟机的操作系统内核进行必要的修改，使其能够直接处理来自物理设备的中断信号。此外，还需要对设备驱动程序进行调整，以便它们能够在虚拟机环境中正常工作。 3. **安全措施**：为了确保系统的安全性和稳定性，需要实现一系列的安全控制措施。例如，可以使用访问控制列表（ACLs）来限制虚拟机访问特定的中断信号，或者实施更高级别的监控机制来检测潜在的安全威胁。 ELI技术适用于各种需要高性能I/O操作的场景，例如： - **高性能计算**：如科学计算、大数据分析等。 - **云计算平台**：提供高带宽、低延迟的网络服务。 - **游戏服务器**：为在线游戏提供流畅的游戏体验。 ELI技术通过改进虚拟化环境中的中断处理机制，有效提升了I/O密集型应用的性能表现，为虚拟化技术的应用带来了新的可能性。

资源推荐

资源详情

资源评论

ELI: Bare-Metal Performance for I/O Virtualization

Abel Gordon

1⋆

Nadav Amit

2⋆

Nadav Har’El

Muli Ben-Yehuda

Alex Landau

Assaf Schuster

Dan Tsafrir

IBM Research—Haifa

Technion—Israel Institute of Technology

{abelg,nyh,lalex}@il.ibm.com {namit,muli,assaf,dan}@cs.technion.ac.il

Abstract

Direct device assignment enhances the performance of guest virtual

machines by allowing them to communicate with I/O devices with-

out host involvement. But even with device assignment, guests are

still unable to approach bare-metal performance, because the host

intercepts all interrupts, including those interrupts generated by as-

signed devices to signal to guests the completion of their I/O requests.

The host involvement induces multiple unwarranted guest/host con-

text switches, which signiﬁcantly hamper the performance of I/O

intensive workloads. To solve this problem, we present ELI (Exit-

Less Interrupts), a software-only approach for handling interrupts

within guest virtual machines directly and securely. By removing

the host from the interrupt handling path, ELI manages to improve

the throughput and latency of unmodiﬁed, untrusted guests by 1.3x–

1.6x, allowing them to reach 97%–100% of bare-metal performance

even for the most demanding I/O-intensive workloads.

Categories and Subject Descriptors C.0 [General]:

Hardware/software interfaces

General Terms Performance

Keywords

SR-IOV, I/O Virtualization, Interrupts, I/O Performance,

Device Assignment

1. Introduction

I/O activity is a dominant factor in the performance of virtualized

environments [32, 33, 47, 51], motivating direct device assignment

where the host assigns physical I/O devices directly to guest virtual

machines. Examples of such devices include disk controllers, net-

work cards, and GPUs. Direct device assignment provides superior

performance relative to alternative I/O virtualization approaches,

because it almost entirely removes the host from the guest’s I/O path.

Without direct device assignment, I/O-intensive workloads might

suffer unacceptable performance degradation [

Still, direct access does not allow I/O-intensive workloads to ap-

proach bare-metal (non-virtual) performance [

limiting it to 60%–65% of the optimum by our measurements. We

∗

Both authors contributed equally.

Permission to make digital or hard copies of all or part of this work for personal or

classroom use is granted without fee provided that copies are not made or distributed

for proﬁt or commercial advantage and that copies bear this notice and the full citation

on the ﬁrst page. To copy otherwise, to republish, to post on servers or to redistribute

to lists, requires prior speciﬁc permission and/or a fee.

ASPLOS’12, March 3–7, 2012, London, England, UK.

 2012 ACM 978-1-4503-0759-8/12/03. . . $10.00

guest/host context switch (exits and entries)

handling cost (handling physical interrupts and their completions)

bare-metal

baseline

guest

host

time

ELI

delivery

guest

host

ELI

delivery &

completion

guest

host

physical

interrupt

completion

(a)

(b)

(c)

interrupt

injection

interrupt

completion

(d)

Figure 1. Exits during interrupt handling

ﬁnd that nearly the entire performance difference is induced by

interrupts of assigned devices.

I/O devices generate interrupts to asynchronously communicate

to the CPU the completion of I/O operations. In virtualized settings,

each device interrupt triggers a costly exit [

], causing the

guest to be suspended and the host to be resumed, regardless of

whether or not the device is assigned. The host ﬁrst signals to

the hardware the completion of the physical interrupt as mandated

by the x86 speciﬁcation. It then injects a corresponding (virtual)

interrupt to the guest and resumes the guest’s execution. The guest

in turn handles the virtual interrupt and, like the host, signals

completion, believing that it directly interacts with the hardware.

This action triggers yet another exit, prompting the host to emulate

the completion of the virtual interrupt and to resume the guest

again. The chain of events for handling interrupts is illustrated in

Figure 1(a).

The guest/host context switches caused by interrupts induce a

tolerable overhead price for non-I/O-intensive workloads, a fact

that allowed some previous virtualization studies to claim they

achieved bare-metal performance [

]. But our measurements

indicate that this overhead quickly ceases to be tolerable, adversely

affecting guests that require throughput of as little as 50 Mbps.

Notably, many previous studies improved virtual I/O by relaxing

protection [

] or by modifying guests [

], whereas we

focus on the most challenging virtualization scenario of guests that

are untrusted and unmodiﬁed.

Many previous studies identiﬁed interrupts as a major source

of overhead [

], and many proposed techniques to reduce

it, both in bare-metal settings [

] and in virtualized

settings [

]. In principle, it is possible to tune devices

and their drivers to generate fewer interrupts, thereby reducing the

411

related overhead. But doing so in practice is far from trivial [8, 44]

and can adversely affect both latency and throughput. We survey

these approaches and contrast them with ours in Section 2.

Our approach rests on the observation that the high interrupt

rates experienced by a core running an I/O-intensive guest are

mostly generated by devices assigned to the guest. Indeed, we

measure rates of over 150K physical interrupts per second, even

while employing standard techniques to reduce the number of

interrupts, such as interrupt coalescing [

] and hybrid

polling [

]. As noted, the resulting guest/host context switches

are nearly exclusively responsible for the inferior performance

relative to bare metal. To eliminate these switches, we propose

ELI (ExitLess Interrupts), a software-only approach for handling

physical interrupts directly within the guest in a secure manner.

With ELI, physical interrupts are delivered directly to guests,

allowing them to process their devices’ interrupts without host in-

volvement; ELI makes sure that each guest forwards all other inter-

rupts to the host. With x86 hardware, interrupts are delivered using

a software-controlled table of pointers to functions, such that the

hardware invokes the

-th function whenever an interrupt of type

k ﬁres. Instead of utilizing the guest’s table, ELI maintains, manip-

ulates, and protects a “shadow table”, such that entries associated

with assigned devices point to the guest’s code, whereas the other

entries are set to trigger an exit to the host. We describe x86 interrupt

handling relevant to ELI and ELI itself in

Section 3

and

Section 4

respectively. ELI leads to a mostly exitless execution as depicted in

Figure 1(c).

We experimentally evaluate ELI in

Section 5

with micro and

macro benchmarks. Our baseline conﬁguration employs standard

techniques to reduce (coalesce) the number of interrupts, demon-

strating ELI’s beneﬁt beyond the state-of-the-art. We show that ELI

improves the throughput and latency of guests by 1.3x–1.6x. No-

tably, whereas I/O-intensive guests were so far limited to 60%–65%

of bare-metal throughput, with ELI they reach performance that is

within 97%–100% of the optimum. Consequently, ELI makes it

possible to, e.g., consolidate traditional data-center workloads that

nowadays remain non-virtualized due to unacceptable performance

loss.

Section 6

we describe how ELI protects the aforementioned

table, maintaining security and isolation while still allowing guests

to handle interrupts directly. In

Section 7

we discusses potential

hardware support that would simplify ELI’s design and implementa-

tion. Finally, in

Section 8

we discuss the applicability of ELI and

our future work directions, and in Section 9 we conclude.

2. Motivation and Related Work

For the past several decades, interrupts have been the main method

by which hardware devices can send asynchronous events to the

operating system [

]. The main advantage of using interrupts

to receive notiﬁcations from devices over polling them is that

the processor is free to perform other tasks while waiting for an

interrupt. This advantage applies when interrupts happen relatively

infrequently [

], as has been the case until high performance

storage and network adapters came into existence. With these

devices, the CPU can be overwhelmed with interrupts, leaving no

time to execute code other than the interrupt handler [

]. When

the operating system is run in a guest, interrupts have a higher cost

since every interrupt causes multiple exits [2, 9, 26].

In the remainder of this section we introduce the existing ap-

proaches to reduce the overheads induced by interrupts, and we

highlight the novelty of ELI in comparison to these approaches. We

subdivide the approaches into two categories.

2.1 Generic Interrupt Handling Approaches

We now survey approaches that equally apply to bare metal and

virtualized environments.

Polling disables interrupts entirely and polls the device for new

events at regular intervals. The beneﬁt is that handling device events

becomes synchronous, allowing the operating system to decide

when to poll and thus limit the number of handler invocations. The

drawbacks are added latency and wasted cycles when no events are

pending. If polling is done on a different core, latency is improved,

yet a core is wasted. Polling also consumes power since the processor

cannot enter an idle state.

hybrid

approach for reducing interrupt-handling overhead

is to dynamically switch between using interrupts and polling [

]. Linux uses this approach by default through the NAPI

mechanism [

]. Switching between interrupts and polling does

not always work well in practice, partly due to the complexity of

predicting the number of interrupts a device will issue in the future.

Another approach is

interrupt coalescing

[

], in which

the OS programs the device to send one interrupt in a time interval

or one interrupt per several events, as opposed to one interrupt per

event. As with the hybrid approaches, coalescing delays interrupts

and hence might suffer from the same shortcomings in terms of

latency. In addition, coalescing has other adverse effects and cannot

be used as the only interrupt mitigation technique. Zec et al. [

]

show that coalescing can burst TCP trafﬁc that was not bursty

beforehand. It also increases latency [

], since the operating

system can only handle the ﬁrst packet of a series when the last

coalesced interrupt for the series arrived. Deciding on the right

model and parameters for coalescing is complex and depends on the

workload, particularly when the workload runs within a guest [

Getting it right for a wide variety of workloads is hard if not

impossible [

]. Unlike coalescing, ELI does not reduce the

number of interrupts; instead it streamlines the handling of interrupts

targeted at virtual machines. Coalescing and ELI are therefore

complementary: coalescing reduces the number of interrupts, and

ELI reduces their price. Furthermore, with ELI, if a guest decides

to employ coalescing, it can directly control the interrupt rate and

latency, leading to predictable results. Without ELI, the interrupt rate

and latency cannot be easily manipulated by changing the coalescing

parameters, since the host’s involvement in the interrupt path adds

variability and uncertainty.

All evaluations in Section 5 were performed with the default

Linux conﬁguration, which combines the hybrid approach (via

NAPI) and coalescing.

2.2 Virtualization-Speciﬁc Approaches

Using an emulated or paravirtual [

] device provides much ﬂexi-

bility on the host side, but its performance is much lower than that of

device assignment, not to mention bare metal. Liu [

] shows that

device assignment of SR-IOV devices [16] can achieve throughput

close to bare metal at the cost of as much as 2x higher CPU utiliza-

tion. He also demonstrates that interrupts have a great impact on

performance and are a major expense for both the transmit and re-

ceive paths. For this reason, although applicable to the emulated and

paravirtual case as well, ELI’s main focus is on improving device

assignment.

Interrupt overhead is ampliﬁed in virtualized environments. The

Turtles project [

] shows interrupt handling to cause a 25% increase

in CPU utilization for a single-level virtual machine when compared

with bare metal, and a 300% increase in CPU utilization for a nested

virtual machine. There are software techniques [

] to reduce the

number of exits by ﬁnding blocks of exiting instructions and exiting

only once for the whole block. These techniques can increase the

efﬁciency of running a virtual machine when the main reason for

the overhead is in the guest code. When the reason is in external

412

剩余11页未读，继续阅读

评论收藏

内容反馈

luo_brian

粉丝: 40

ELI: Bare-Metal Performance for I/O Virtualization

virtualization

Virtualization-x86-64-2.3.1-8926 群晖.spk

如何开启笔记本的Virtualization_Technology虚拟化技术功能

tensorflow_gpu-2.3.1-cp38-cp38.zip

WinNTSetup_v2.3.1_x86_x64

可道云（SPK）- 群晖NAS 版

ELI Bare-Metal Performance for IO Virtualization

Virtualization-x86_64-2.2.2-8493.spk

KVM Virtualization Cookbook (True PDF)-Packt Publishing(2017) [英文]

群晖助手SynologyAssistantSetup-4.3-4359

pandoc-2.3.1-windows-x86_64.msi

亲测可用X86-nanoboot- x86 32位 黑群晖含已经配置好的引导和安装软件完整版

getif-2.2 getif-2.3.1 完整版本

i.MX_Virtualization_User's_Guide.pdf

群辉的virtualbox_x64 软件包

H-ui.admin_v2.3.1

VMware workstation Pro 17安装包

max30102实战资料，全部免费开源，包含硬件设计，下位机程序，上位机程序，结构设计

VMware Workstation 16虚拟机安装包

openssl3.4.1,openssh10.0p1,zlib-1.3.1

FinalShell安装包，让用户通过SSH、Telnet或者RDP等协议连接到远程服务器或设备，实现远程控制和管理

centos7的openssh9.8p1rpm包

Kylin-Server-V10-SP3-General-Release-2212-X86-64.7z.001

MCGS昆仑通态HMI专用工具包V5.5至V6.3

黑白群晖 DSM7.X 监控套件 SurveillanceStation-x86-64-9.1.1-10728 学习版

docker-compose-linux-aarch64（v2.17.2）

exploit （Linux 内核CVE-2024-1086漏洞复现脚本）

mysql-connector-java-8.0.27

银河麒麟系统硬盘分区挂载

银河麒麟Kylin桌面操作系统 V10 (SP1) X86-64

【C语言学习】getchar和putchar

Android WindowManagerService（WMS）框架深度解析

最新资源

亲测可用X86-nanoboot- x86 32位黑群晖含已经配置好的引导和安装软件完整版