Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Why use Xen for large scale Enterprise
Deployments?
Konrad Rzeszutek Wilk
Software Developer Manager
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle’s products remains at the sole discretion of Oracle.
• A bit of history
• Where does the code come from?
• Distributions and kernels
• Features
• The end result
Unbreakable Enterprise Kernel and Oracle Linux purpose:
• Red Hat and Oracle split:
– Oracle supports a distribution based on RHEL, but with our own kernel: the
Unbreakable Enterprise Kernel (UEK).
We want better performance for customers. The kernel is updated more often,
with features and improvements that Oracle products can take advantage of.
– As such, an Oracle Linux distribution is offered along with UEK kernels.
The UEK kernel is also used in other products, such as OVM.
Oracle’s virtualization product (OVM):
We use Xen as the hypervisor and UEK as the kernel. In the past (OVM 2) we
had a SLES-based kernel.
• OVM 2 (Xen 3.4)
– Linux 2.6.32 based on SLES Xen patches (classic)
The newer releases are based on paravirt ops (pvops):
• OVM 3 (Xen 4.1)
– UEK2 kernel (2.6.39)
• OVM 3.3 (Xen 4.3)
– UEK3 kernel (3.8.13)
Kernels (UEK: 2.6.39, 3.8).
• Oracle’s approach is
– Available for anybody (https://oss.oracle.com/git/).
– Make features available for everybody.
• Best way is to have it upstream so every distribution can have it.
• The end goal is for applications to run as well as they can.
• A large set of patches (a big divergence from upstream) inhibits this, as there is
a lot of complexity in them. The classic Xen patches are an example of this.
Developers' approach to patches:
• We forget what we did after 6 months (more or less).
• Want the code in one place (one repository).
• Want to develop new features against code to make it better and faster.
Don’t want to retouch the old code over and over.
• Want to fix new bugs in new shiny code.
• Big patches are scary.
Quality Assurance approach to patches:
• Want to find the bug and have it fixed.
– Don't want bugs to re-appear later in a new version of kernel (aka regressions).
• Want to catch new bugs, expose new scenarios, not find old bugs.
• Ideal situation:
– new hardware = new bugs
– not new hardware = old bugs.
Unbreakable Enterprise Kernel origin
Linux kernel:       2.6.32 (…)  3.0 (…)  3.8 (…)  3.11 (…)  3.15
Linux stable tree:  2.6.32 LT   3.0 LT   3.8 LT
Unbreakable Linux:  UEK1        UEK2     UEK3
Backporting patches from upstream (Linus's tree) for new functionality.
Long-term kernels are where the community puts in the fixes and features
deemed necessary by maintainers. The version number gives an idea of
origin; for example, 2.6.39 was 3.0, but some of the code is from 3.11.
The process to make this work:
• Patches MUST go upstream (Linus’s tree).
• New functionality developed against upstream kernel.
• Bug-fixes also developed against upstream kernel (where applicable as
some code had been re-worked).
• In some instances, where they do not make sense to go upstream, we keep
them in our tree.
• The problem we had with OVM2 was that it had a huge patchset of Xen
code – and not in any way easy to review.
Upstreaming Xen in Linus’s tree
We started by slowly integrating the pieces, one on top of another.
• Linux 3.0 had the initial domain support (but no backend drivers).
• Later versions gained different backend drivers (block, network, etc).
• For Xen (hypervisor) we did not have a huge patch set, so it was much easier.
• What we ended up doing was:
Linus tree → UEK tree → OVM and Oracle Linux
Xen upstream → OVM
The “problem” with Linus’s tree and Xen tree:
• High quality of code.
– Code has to go through numerous reviews before it is accepted. It takes time.
• The end result is:
– High quality and beautiful code.
– Performance driven (no maintainer wants code that slows things down).
– Improve the existing code.
• A fantastic side effect is that other distributions and users gain these
features right out of the box (such as Fedora Core, Debian, Red Hat, etc)
Linux features that we are developing:
• Data safety
– DIF and DIX (Data Integrity), hardening ext4 and XFS against fuzzing attacks and
corrupt filesystems.
– DIRECT_IO (the O_DIRECT open flag): bypass caches so that data goes directly to the disk.
– Expose this via the AIO system calls for applications.
• Better use of CPU and memory for:
– Making fsck work faster.
– Deduplication on various filesystems (btrfs).
– Faster snapshotting.
– Quota calculations on XFS.
• dtrace
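To make the DIF/DIX item on the previous slide a little more concrete: T10 DIF attaches 8 bytes of protection information to each 512-byte sector (a 16-bit CRC guard tag, a 16-bit application tag and a 32-bit reference tag), so that corruption anywhere between the application and the disk can be detected. The sketch below is only an illustration in Python, not kernel code. The function names, the sample sector and the choice of the low 32 bits of the LBA as reference tag are assumptions made for the example.

```python
import struct

T10_DIF_POLY = 0x8BB7  # CRC-16 polynomial used for the T10 DIF guard tag


def crc16_t10dif(data: bytes) -> int:
    """Bitwise (slow but simple) CRC-16, poly 0x8BB7, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protection_info(sector: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Build the 8-byte tuple: guard tag, application tag, reference tag."""
    return struct.pack(">HHI", crc16_t10dif(sector), app_tag, lba & 0xFFFFFFFF)


def verify(sector: bytes, lba: int, pi: bytes) -> bool:
    """Recompute the guard tag and check the reference tag against the LBA."""
    guard, _app_tag, ref = struct.unpack(">HHI", pi)
    return guard == crc16_t10dif(sector) and ref == (lba & 0xFFFFFFFF)


# Hypothetical 512-byte payload written to (hypothetical) LBA 1234.
sector = bytes(range(256)) * 2
pi = protection_info(sector, lba=1234)

# Flip a single bit in the payload, as a faulty cable or buffer might.
corrupted = bytes([sector[0] ^ 0x01]) + sector[1:]

print("intact sector verifies:   ", verify(sector, 1234, pi))
print("corrupted sector verifies:", verify(corrupted, 1234, pi))
print("misdirected write verifies:", verify(sector, 1235, pi))
```

The reference tag is what catches a write that lands on the wrong block: the data and its guard tag are internally consistent, but the tag no longer matches the address being read.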
Linux features we have been developing:
• NFS/RDMA (InfiniBand), NFS v4.0, support for NFS client using ZFS storage
and Solaris NFS.
• Security fixes before Linux gets released (and after, too).
• Xen:
– The initial domain support and hardware features to match classic Xen support.
– Features in block and frontend to improve I/O.
– Lower latency for PCI passthrough devices.
– Near bare metal performance of guests.
– Continuous upstream presence to catch and fix regressions during Linus's merge
window.
– 'perf' support for Xen and more.
In the Xen ecosystem (hypervisor and toolstack):
• Xen Advisory Board where we collaborate with other companies using Xen
– To do more testing across all vendors' workloads.
– Get more developers.
– Companies work together on features (Xen block subsystem).
• OASIS VirtIO workgroup to define the VirtIO specification.
• Faster boot, faster deallocation/allocation for huge guests.
• Faster performance on NUMA machines.
• Faster guests – replacing PV with PVH.
In the Xen ecosystem (hypervisor and toolstack)
• 'perf' support
– For full stack (hypervisor, guests, etc) performance view of what they are running and
performance bottlenecks.
• Xen hypervisor debugger – to troubleshoot in the field.
• Lower interrupt latencies for PCI passthrough.
• Transcendent memory (cooperative memory ballooning with benefits)
– An answer to memory overcommit – where Linux balloons out pages it does not think
it will use often but which can take a lot of memory space. The hypervisor can
deduplicate and compress those pages across different guests. The end result is that
we can fit more guests on a machine and still have good performance (sometimes even
a 4% benefit!)
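The "deduplicate + compress across guests" idea can be sketched as a toy model. This is not how Xen's transcendent memory is implemented. It is a minimal Python illustration, where the `PagePool` class, the guest names and the sample pages are all made up, and SHA-256 plus zlib stand in for whatever the hypervisor actually uses.

```python
import hashlib
import zlib

PAGE_SIZE = 4096


class PagePool:
    """Toy hypervisor-side pool: identical pages are stored once, compressed."""

    def __init__(self):
        self._blobs = {}   # content hash -> compressed page
        self._index = {}   # (guest, page number) -> content hash

    def put(self, guest: str, pfn: int, page: bytes) -> None:
        assert len(page) == PAGE_SIZE
        digest = hashlib.sha256(page).digest()
        if digest not in self._blobs:          # deduplicate across guests
            self._blobs[digest] = zlib.compress(page)
        self._index[(guest, pfn)] = digest

    def get(self, guest: str, pfn: int) -> bytes:
        return zlib.decompress(self._blobs[self._index[(guest, pfn)]])

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self._blobs.values())


pool = PagePool()
zero_page = bytes(PAGE_SIZE)
text_page = (b"SELECT * FROM orders;\n" * 200)[:PAGE_SIZE]

# Three hypothetical guests hand over pages with identical content.
for guest in ("guest-a", "guest-b", "guest-c"):
    pool.put(guest, 0, zero_page)
    pool.put(guest, 1, text_page)

logical = 3 * 2 * PAGE_SIZE
print(f"logical {logical} bytes, stored {pool.stored_bytes()} bytes")
```

Six logical pages collapse into two stored blobs, each of which is also compressed, which is the effect that lets more guests fit on one machine.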
Exadata Database Machine (have X4-2, X4-4, X4-8).
X4-8:
[Figure from the Sun Server X4-8 Service Manual]
Under the hood we have:
• NUMA
– 2, 4 or 8 sockets (CPU)
– Each socket has its own local memory.
– PCIe slots off sockets (I/O NUMA) with InfiniBand or flash in them.
– All sockets connected via QuickPath Interconnect (QPI).
• For best performance we don’t want to use QPI excessively. A solution is:
– Partitioning per socket.
– We have various size guests that reside within their NUMA node.
• Combined with intelligent software (GRID, Oracle RAC), this gives top-notch
performance.
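Per-socket partitioning amounts to a packing problem: every guest must fit, CPUs and memory together, inside a single NUMA node so that it never has to reach across QPI. A minimal sketch of one way to do this follows, using first-fit decreasing in Python. The node sizes, guest names and the `place` helper are hypothetical and say nothing about how OVM actually places guests.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    free_cpus: int
    free_mem_gb: int
    guests: list = field(default_factory=list)


@dataclass(frozen=True)
class Guest:
    name: str
    vcpus: int
    mem_gb: int


def place(guests, nodes):
    """First-fit decreasing: biggest guests first, each kept within one node."""
    unplaced = []
    for guest in sorted(guests, key=lambda g: (g.mem_gb, g.vcpus), reverse=True):
        for node in nodes:
            if guest.vcpus <= node.free_cpus and guest.mem_gb <= node.free_mem_gb:
                node.free_cpus -= guest.vcpus
                node.free_mem_gb -= guest.mem_gb
                node.guests.append(guest.name)
                break
        else:
            unplaced.append(guest.name)   # would have to span sockets
    return unplaced


# Hypothetical 2-socket host: 12 cores and 128 GB of local memory per socket.
nodes = [Node(0, 12, 128), Node(1, 12, 128)]
guests = [Guest("db1", 8, 96), Guest("db2", 8, 96),
          Guest("app1", 4, 32), Guest("app2", 4, 32), Guest("big", 16, 64)]

unplaced = place(guests, nodes)
for node in nodes:
    print(f"node {node.node_id}: {node.guests}")
print("does not fit in one node:", unplaced)
```

A guest that ends up in `unplaced` is larger than any single socket, so it can only run by spanning nodes and paying the interconnect cost.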
Networking – 40G and more:
• Multiple ways of having better performance:
– PCIe passthrough (InfiniBand or Network Interface Cards) – SR-IOV – what we
concentrate on for best performance for Engineered Systems. But no migration!
– Intel Data Plane Development Kit (DPDK). Low latency, but no migration!
– Improving Xen netback and netfront (Citrix driven, they are the maintainers of the Linux
Xen netback driver).
• Want the guest to run without invoking the hypervisor for privileged
operations (aka fewer VMEXITs):
– Interrupts go directly to the guest (posted interrupts). Improvement in Linux to use
vAPIC instead of event channels for PCIe interrupts.
– Lower the latency of interrupt delivery if we have to go through hypervisor.
Storage: More IOPS!
• Classic OVM deployment is OCFS2 shared across different hosts.
• We have SSDs, now PCIe flash, and in the future NVMe.
• For better performance we do:
– Improve Xen block frontend and backend. Joint projects with Citrix on increasing
throughput and lowering latency.
– SR-IOV for even higher throughput and low latency (but no migration) for Engineered
Systems.
Guest improvements:
• ParaVirtualized (PV) guests problem:
– Page table updates and syscalls require a context switch to the hypervisor.
– ParaVirtualized Hardware (PVH) uses the hardware to do page table updates and syscalls
instead of requiring the guest to do the hypercalls. End result is removal of bottlenecks in PV.
Xen hypervisor bottlenecks:
• Identify them using ‘perf’ to visualize and get full system stack (hypervisor
and guests).
Xen transcendent memory.
• Memory is becoming a bottleneck in virtualized systems – we want
more! However, we have memory-inefficient workloads.
End goal
• Performance, high quality, stability and security for all different workloads.
• Push patches upstream to benefit everybody.
Oracle is hiring!
konrad.wilk@oracle.com

LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszutek Wilk , Oracle

  • 1. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Why use Xen for large scale Enterprise Deployments? Konrad Rzeszutek Wilk Software Developer Manager
  • 2. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle. 2
  • 3. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | 3  A bit of history Where does the code come from? Distributions and kernels Features The end result
  • 4. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Unbreakable Enterprise Kernel and Oracle Linux purpose: • Red Hat and Oracle split: – Oracle supports a kernel based on RHEL distribution but with our own kernel - Unbreakable Enterprise Kernel (UEK). We want better performance for customers. The kernel is being updated more often and with features and benefits to take advantage of Oracle products. – As such an Oracle Linux Distribution along with UEK kernels is offered. The UEK kernel is used in other products – OVM. 4
  • 5. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Oracle’s virtualization product (OVM): We use Xen for hypervisor. For kernel we use UEK – in the past (OVM 2) we had SLES based kernel. • OVM 2 (Xen 3.4) – Linux 2.6.32 based on SLES Xen Patches (classic) While the newer ones are based on paravirt (pvops): • OVM 3 (Xen 4.1) – UEK2 kernel (2.6.39) • OVM 3.3 (Xen 4.3) – UEK3 kernel (3.8.13) 5
  • 6. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Kernels (UEK: 2.6.39, 3.8). • Oracle’s approach is – Available for anybody (https://oss.oracle.com/git/). – Make features available for everybody. • Best way is to have it upstream so every distribution can have it. • The end goal is for applications to run as best as they can. • Large set of patches (big divergence from upstream) inhibit this as there is a lot of complexity in them. Classic Xen patches is an example of this. 6
  • 7. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Developers approach to patches: • We forget what we did after 6 months (more or less). • Want the code in one place (one repository). • Want to develop new features against code to make it better and faster. Don’t want to retouch the old code over and over. • Want to fix new bugs in new shinny code. • Big patches are scary. 7
  • 8. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Quality Assurance approach to patches: • Want to find the bug and have it fixed. – Don't want bugs to re-appear later in a new version of kernel (aka regressions). • Want to catch new bugs, expose new scenarios, not find old bugs. • Ideal situation: – new hardware = new bugs – not new hardware = old bugs. 8
  • 9. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Linux kernel: 2.6.32 (…) 3.0 (…) 3.8 (…) 3.11 (…) 3.15 Linux stable tree: 2.6.32LT 3.0 LT 3.8 LT Unbreakable Linux UEK1 UEK2 UEK3 Unbreakable Enterprise Kernel origin Backporting patches from upstream (Linus's tree) for new functionality. Long-term kernels is where the community puts in the fixes and features deemed necessary by maintainers. The version number gives an idea of origin, for example 2.6.39 was 3.0 but some of the code is from 3.11. 9
  • 10. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | The process to make this work: • Patches MUST go upstream (Linus’s tree). • New functionality developed against upstream kernel. • Bug-fixes also developed against upstream kernel (where applicable as some code had been re-worked). • In some instances, where they do not make sense to go upstream, we keep them in our tree. • The problem we had with OVM2 was that it had a huge patchset of Xen code – and not in any way easy to review. 10
  • 11. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Upstreaming Xen in Linus’s tree We started with slowly integrating pieces and pieces, one on top of each other.  Linux 3.0 had the initial domain support (but no backend drivers).  Later versions gained different backend drivers (block, network, etc).  For Xen (hypervisor) we did not have a huge set so much easier.  What we ended up doing was: Linus tree UEK tree OVM and Oracle Linux Xen upstream OVM 11
  • 12. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | The “problem” with Linus’s tree and Xen tree: • High quality of code. – Code has to go through numerous reviews before accepted. It takes time. • The end result is: – High quality and beautiful code. – Performance driven (no maintainer wants code that slows things down). – Improve the existing code. • A fantastic side effect is that other distributions and users gain these features right out of the box (such as Fedora Core, Debian, Red Hat, etc) 12
  • 13. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Linux features that we are developing:  Data safety  DIF and DIX (Data Integretty), hardening ext4 and XFS against fuzzing attacks and corrupt filesystems.  DIRECT_IO - bypass caches so that data goes directly to the disk.  Expose this via the AIO system call for applications.  Better use of CPU and memory for:  Making fsck work faster.  De duplication of various filesystems (btrfs).  Faster snapshotting.  Quota calculations on XFS.  dtrace 13
  • 14. Linux features we have been developing: • NFS/RDMA (InfiniBand), NFS v4.0, support for the NFS client using ZFS storage and Solaris NFS. • Security fixes before Linux gets released (and after too). • Xen: – The initial domain support and hardware features to match classic Xen support. – Features in the block backend and frontend to improve I/O. – Lower latency for PCI passthrough devices. – Near bare metal performance of guests. – Continuous upstream presence to catch and fix regressions during Linus's merge window. – ‘perf’ support for Xen and more.
  • 15. In the Xen ecosystem (hypervisor and toolstack): • Xen Advisory Board, where we collaborate with other companies using Xen – To do more testing across all vendors' workloads. – Get more developers. – Companies work together on features (Xen block subsystem). • OASIS VirtIO workgroup to define the VirtIO specification. • Faster boot, faster deallocation/allocation for huge guests. • Faster performance on NUMA machines. • Faster guests – replacing PV with PVH.
  • 16. In the Xen ecosystem (hypervisor and toolstack): • ‘perf’ support – For a full stack (hypervisor, guests, etc) view of what is running and where the performance bottlenecks are. • Xen hypervisor debugger – to troubleshoot in the field. • Lower interrupt latencies for PCI passthrough. • Transcendent memory (cooperative memory ballooning with benefits) – An answer to memory overcommit, where Linux balloons out pages it does not think it will use often but which can take a lot of memory space. The hypervisor can deduplicate and compress those pages across different guests. The end result is that we can fit more guests on a machine and still have good performance (sometimes even a 4% benefit!)
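The deduplicate-and-compress idea behind transcendent memory can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyPagePool` and its `put`/`get`/`flush` methods are hypothetical names invented for illustration, not the Xen tmem interface, and SHA-256 plus zlib stand in for whatever the hypervisor actually uses.

```python
import hashlib
import zlib

PAGE_SIZE = 4096  # assumed page size for this illustration


class ToyPagePool:
    """Toy hypervisor-side pool: identical pages are stored once, compressed."""

    def __init__(self):
        self._blobs = {}  # content digest -> compressed page bytes
        self._refs = {}   # content digest -> number of (guest, key) references
        self._index = {}  # (guest_id, key) -> content digest

    def put(self, guest_id, key, page):
        """Store a page a guest has given up; identical content is shared."""
        if len(page) != PAGE_SIZE:
            raise ValueError("page must be exactly PAGE_SIZE bytes")
        self.flush(guest_id, key)  # drop any older copy under the same key
        digest = hashlib.sha256(page).digest()
        if digest not in self._blobs:
            self._blobs[digest] = zlib.compress(page)
            self._refs[digest] = 0
        self._refs[digest] += 1
        self._index[(guest_id, key)] = digest

    def get(self, guest_id, key):
        """Return the page bytes, or None if the pool does not hold the page."""
        digest = self._index.get((guest_id, key))
        if digest is None:
            return None
        return zlib.decompress(self._blobs[digest])

    def flush(self, guest_id, key):
        """Forget a guest's page; free the blob once nobody references it."""
        digest = self._index.pop((guest_id, key), None)
        if digest is None:
            return
        self._refs[digest] -= 1
        if self._refs[digest] == 0:
            del self._refs[digest]
            del self._blobs[digest]

    def unique_pages(self):
        return len(self._blobs)

    def stored_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())
```

If two guests put the same page content, the pool keeps a single compressed copy, which is how more guests can fit in the same physical memory.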
  • 17. Exadata Database Machine (have X4-2, X4-4, X4-8).
  • 18. X4-8: (figure from the Sun Server X4-8 Service Manual)
  • 19. Under the hood we have: • NUMA – 2, 4 or 8 sockets (CPU) – Each socket has its own local memory. – PCIe slots off sockets (I/O NUMA) with InfiniBand or flash in them. – All sockets connected via QuickPath Interconnect (QPI). • For best performance we don’t want to use QPI excessively; one solution is: – Partitioning per socket. – We have guests of various sizes that reside within their NUMA node. • Combined with intelligent software (GRID, Oracle RAC), this gives top-notch performance.
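The per-socket partitioning described above can be sketched as a simple placement problem: each guest must fit entirely inside one NUMA node so that it never has to cross QPI for its own memory. This is a toy first-fit illustration, not how OVM or the Xen toolstack places guests; `Node`, `place_guests`, and the sizes used are hypothetical.

```python
class Node:
    """One NUMA node (socket) with its free CPUs and local memory."""

    def __init__(self, node_id, free_cpus, free_mem_gb):
        self.node_id = node_id
        self.free_cpus = free_cpus
        self.free_mem_gb = free_mem_gb
        self.guests = []


def place_guests(nodes, guests):
    """Place each (name, cpus, mem_gb) guest wholly inside a single node.

    Largest guests are placed first (first-fit decreasing). Raises ValueError
    if a guest cannot fit in any one node, because spanning nodes is exactly
    what this partitioning scheme is meant to avoid.
    """
    placement = {}
    ordered = sorted(guests, key=lambda g: (g[1], g[2]), reverse=True)
    for name, cpus, mem_gb in ordered:
        for node in nodes:
            if node.free_cpus >= cpus and node.free_mem_gb >= mem_gb:
                node.free_cpus -= cpus
                node.free_mem_gb -= mem_gb
                node.guests.append(name)
                placement[name] = node.node_id
                break
        else:
            raise ValueError(f"guest {name!r} does not fit in a single node")
    return placement
```

For example, with two nodes of 8 CPUs and 64 GB each, a guest needing 8 CPUs takes one node by itself and two 4-CPU guests share the other.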
  • 20. Networking – 40G and more: • Multiple ways of getting better performance: – PCIe passthrough (InfiniBand or Network Interface Cards). – SR-IOV: what we concentrate on for best performance for Engineered Systems. But no migration! – Intel Data Plane Development Kit (DPDK). Low latency, but no migration! – Improving Xen netback and netfront (Citrix driven; they are the maintainers of the Linux Xen netback driver). • We want the guest to run without invoking the hypervisor for privileged operations (aka fewer VMEXITs): – Interrupts go directly to the guest (posted interrupts). Improvement in Linux to use vAPIC instead of event channels for PCIe interrupts. – Lower the latency of interrupt delivery if we have to go through the hypervisor.
  • 21. Storage: More IOPS! • The classic OVM deployment is OCFS2 shared across different hosts. • We have SSDs, now PCIe flash, and in the future NVMe. • For better performance we: – Improve the Xen block frontend and backend. Joint projects with Citrix on increasing throughput and lowering latency. – Use SR-IOV for even higher throughput and low latency (but no migration) for Engineered Systems.
  • 22. Guest improvements: • The problem with ParaVirtualized (PV) guests: – Page updates and syscalls require a context switch to the hypervisor. – ParaVirtualized Hardware (PVH) uses the hardware to do page updates and syscalls instead of requiring the guest to make hypercalls. The end result is the removal of bottlenecks in PV.
  • 23. Xen hypervisor bottlenecks: • Identify them using ‘perf’ to visualize the full system stack (hypervisor and guests).
  • 24. Xen transcendent memory: • Memory is becoming a bottleneck in virtualized systems – we want more! However, we have memory-inefficient workloads.
  • 25. End goal: • Performance, high quality, stability and security for all different workloads. • Push patches upstream to benefit everybody.
  • 26. Oracle is hiring! [email protected]