A Practice Guide to
vCNS and VXLAN
Technical Overview and Design Guide
Prasenjit Sarkar – VMware
Hongjun Ma – HP
Andy Grant – HP
Agenda
What we will focus on
High-level overview of how VXLAN works
VXLAN implementation using vCNS, including
• Infrastructure Components
• Packet Flow
Deployment Prerequisites
Network Considerations
• Multicast requirements
• Multicast implementation
VTEP Performance and Overhead
• HP Virtual Connect & load-balancing
VXLAN Introduction
Target Audience
Architects, Engineers, Consultants, Admins responsible for Data Center Infrastructure and
VMware virtualization technologies

What is VXLAN
VXLAN - Virtual eXtensible Local Area Network is a network overlay that encapsulates
layer 2 traffic within layer 3
• Submitted to the IETF by Cisco, VMware, Citrix, Red Hat, Broadcom, & Arista
• Coined "network virtualization" or "virtual wires" by VMware

Competing Solutions?
NVGRE - Network Virtualization using Generic Routing Encapsulation
• Submitted to the IETF by Microsoft, Arista, Intel, Dell, HP, Broadcom, Emulex
STT - Stateless Transport Tunneling
• Submitted to the IETF by Nicira (VMware)
VXLAN Introduction
Why VXLAN?
• Ability to manage overlapping addresses between multiple tenants
• Decoupling of the virtual topology provided by the tunnels from the physical topology of the network
• Support for virtual machine mobility independent of the physical network
• Support for essentially unlimited numbers of virtual networks (in contrast to VLANs, for example)
• Decoupling of the network service provided to servers from the technology used in the physical network (e.g. providing an L2 service over an L3 fabric)
• Isolating the physical network from the addressing of the virtual networks, thus avoiding issues such as MAC table size in physical switches
• VXLAN provides up to 16 million virtual networks, in contrast to the 4094 limit of VLANs
• Application agnostic; all work is performed in the ESXi host

Where are we today?
• VXLAN is still in experimental (draft) status at the IETF
• Primarily targeted at vCloud environments, but a standalone product is available
VXLAN Introduction
How VXLAN?
• VMware vSphere ESXi 5.1 AND
– vCloud Networking and Security 5.1 Edge
OR
– Cisco Nexus 1000V
VMware vCloud Networking and Security Edge
• Available vCNS deployment options
– Standalone (licensed per VM)
– AutoDeploy
• Deploying VXLAN through Auto Deploy
– vCloud Director 5.1 (licensed in vCloud Suite)
• Currently tested to support 5000 VXLAN segments
– vCloud Networking and Security 5.1 Edge configuration limits and throughput
Cisco Nexus 1000V
• Currently tested to support 2000 VXLAN segments
– Deploying the VXLAN Feature in Cisco Nexus 1000V Series Switches
Network Virtualization Conceptual View
Analogy between computer virtualization and network virtualization (overlay
transport)
vCloud Networking and Security - Edge
What is vCloud Networking and Security Edge?
Part of the VMware vCloud Networking and Security suite
• Previously known as the vShield suite.
• Provides gateway services including
– VPN
– DHCP
– DNS
– NAT
– Firewall (5 tuple)
– VXLAN & inter-VXLAN routing
– Load-Balancing (Advanced License)
– High Availability (Advanced License)

Licensing Options
– Standalone per-VM Standard or Advanced licensing
– Bundled with vCloud Suite
VXLAN: How it works
• Encapsulation
– Performed by a kernel module installed on each ESXi host
– The host acts as the VXLAN Tunnel End Point (VTEP)
– Adds a 24-bit identifier (the VNI) and 50 bytes to the packet size (see the sketch below)
• MAC in UDP + IP
– Why MAC in IP scales better than vCNI (MAC in MAC)
• Multicast
– Where it is used, and how this impacts scalability
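To make the "MAC in UDP + IP" point concrete, here is a minimal Python/Scapy sketch of the framing a VTEP produces. It is an illustration under assumed placeholder values (the MACs, IPs, VNI, and UDP port are not from a real deployment) and is not vCNS code.

```python
# Minimal sketch of VXLAN's MAC-in-UDP encapsulation using Scapy (assumes a recent
# Scapy release that ships the VXLAN layer). All addresses, the VNI, and the UDP
# port are illustrative placeholders.
from scapy.all import Ether, IP, UDP, Raw
from scapy.layers.vxlan import VXLAN

# Inner frame: the guest VM's original layer 2 traffic on the virtual wire.
inner = (Ether(src="00:50:56:aa:00:01", dst="00:50:56:aa:00:02") /
         IP(src="10.0.0.1", dst="10.0.0.2") /
         UDP(sport=12345, dport=80) /
         Raw(b"guest payload"))

# Outer headers added by the VTEP kernel module: MAC + IP + UDP + VXLAN (24-bit VNI).
outer = (Ether(src="00:50:56:bb:00:01", dst="00:50:56:bb:00:02") /
         IP(src="192.168.100.11", dst="192.168.100.12") /   # VTEP vmkernel addresses
         UDP(sport=45000, dport=8472) /                     # port commonly used by vCNS-era VTEPs
         VXLAN(vni=5001) /
         inner)

overhead = len(outer) - len(inner)
print(f"encapsulation overhead: {overhead} bytes")   # 14 + 20 + 8 + 8 = 50
print(f"VNI carried in the VXLAN header: {outer[VXLAN].vni}")
```

The 50-byte delta printed here is exactly the overhead quoted on this slide, and the 24-bit VNI field is what gives the 16 million segment ceiling mentioned earlier.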
vCNS + Edge + VXLAN: Prerequisites
What is vCloud Networking and Security Edge?
Part of the VMware vCloud Networking and Security suite
• Previously known as the vShield suite.
• Highly integrated with vCloud, but vCD is not necessary with standalone licenses.

VXLAN + vCNS Edge requires:
• Physical network components
– MTU increase (1550 minimum; see the MTU check sketch after this list)
– Multicast enabled (depending on topology; more to come)
• VMware components
– vDS 5.1 (implies vSphere Enterprise Plus licensing & vCenter)
– A vCNS Manager
– A vCNS Edge

VMware recommends
• a single vDS across all clusters
• isolating your VTEP traffic from VM VLANs
• EtherChannel or LACP to your host for the VXLAN transport Port Group
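As a convenience for the MTU prerequisite above, here is a hypothetical pyVmomi sketch that reads the configured MTU of each vDS from vCenter and flags anything below 1550. The hostname and credentials are placeholders, this is not part of vCNS, and SSL handling may differ between pyVmomi versions.

```python
# Hypothetical pyVmomi sketch (not part of vCNS): list every vDS known to vCenter
# and flag any whose MTU cannot carry 1550-byte VXLAN frames.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

REQUIRED_MTU = 1550  # 1500-byte guest frame plus the VXLAN encapsulation headers

si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="changeme",
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.DistributedVirtualSwitch], True)
    for dvs in view.view:
        # maxMtu is exposed on the VMware vDS config; fall back to 1500 if absent.
        mtu = getattr(dvs.config, "maxMtu", None) or 1500
        status = "OK" if mtu >= REQUIRED_MTU else "TOO SMALL for VXLAN"
        print(f"{dvs.name}: MTU {mtu} -> {status}")
finally:
    Disconnect(si)
```

Remember that the physical switches between VTEPs must also be raised to at least 1550; checking only the vDS is not sufficient.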
Multicast
What needs to be enabled on HP or Cisco switches?
What are the multicast design considerations?
• Limits of physical network hardware platforms using multicast
– Cisco Nexus 7000 supports 15,000 L2 IGMP entries
(http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9402/ps9512/brochure_mulitcast_with_cisco_nexus_7000.pdf)
– Cisco Nexus 7000 supports 32,000 multicast entries (15K with vPC)
(http://www.cisco.com/en/US/docs/switches/datacenter/sw/verified_scalability/b_Cisco_Nexus_7000_Series_NXOS_Verified_Scalability_Guide.html#reference_04BA8513CF3140D2A2A6C5E5B4E7C60C)
– Check HP gear limits.
– So what do these limits mean?
– VMware recommends one multicast group per VXLAN 'virtual wire', so does that cap us at 15K or 32K virtual wires?
• If we don't follow this recommendation, how badly does a single VM broadcast flood other VTEPs with multicast traffic? (A conceptual group-join sketch follows this slide.)
• Is it better to use IGMP snooping/querier (L2 topology) or PIM (L3 topology)?
– How does this impact Data Center Interconnects (DCI) and stretched VXLAN implementations?
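The group-join behaviour these limits hinge on can be illustrated with a small, conceptual Python socket sketch. This is not vCNS code: it only shows that every VTEP hosting a VM on a given virtual wire joins that wire's multicast group, and that join is precisely what creates the IGMP snooping / PIM state counted against the switch limits above. Group, port, and interface addresses are placeholders.

```python
# Conceptual sketch only (not vCNS code): how a VTEP-like endpoint joins the
# multicast group assigned to one virtual wire so it receives flooded traffic.
import socket
import struct

VXLAN_PORT = 8472              # UDP port commonly used by vCNS-era VTEPs (verify locally)
SEGMENT_GROUP = "239.1.1.100"  # multicast group mapped to one virtual wire (one per VNI)
VTEP_IP = "192.168.100.11"     # local VTEP vmkernel address

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", VXLAN_PORT))

# The IGMP membership report triggered here is what builds the snooping / PIM
# state counted against the hardware limits quoted above.
mreq = struct.pack("4s4s", socket.inet_aton(SEGMENT_GROUP), socket.inet_aton(VTEP_IP))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# Broadcast / unknown-unicast frames on the virtual wire arrive here encapsulated;
# one group per virtual wire is why multicast table sizes bound the design.
data, addr = sock.recvfrom(9000)
print(f"received {len(data)} encapsulated bytes from VTEP {addr[0]}")
```

If multiple virtual wires share one group (departing from the one-group-per-wire recommendation), every member VTEP receives flooded traffic for segments it does not host, which is the flooding concern raised in the bullets above.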
VXLAN Logical View
Packet flow across virtual wires on the same layer 2 VXLAN transport network
[Diagram: virtual wires on a vDS sharing a layer 2 VXLAN fabric]
Multicast configuration options
• IGMP snooping/querier
Design considerations
• E.g. broadcast storms
VXLAN Logical View
Packet flow across virtual wires on different layer 3 VXLAN transport networks
[Diagram: virtual wires on a vDS spanning a routed (layer 3) VXLAN fabric]
Multicast configuration options
• PIM
Design considerations
• E.g. broadcast storms
High Level Physical Deployment
[Diagram: four ESXi hosts on a vSphere Distributed Switch, each with a VTEP vmkernel adapter, connected to the VXLAN fabric]
Solution Components
• vDS 5.1
• VXLAN virtual fabric
• VTEP (vmk adapter in a dedicated Port Group)
• vCNS Edge 5.1
• vCNS Manager 5.1
Physical Deployment – A Closer Look
[Diagram: two ESXi hosts with VTEPs on a vSphere Distributed Switch attached to the VXLAN fabric]
• vCNS Manager manages the vCNS deployment and supports many Edge devices
• The VTEP is a single vmkernel interface per host, automatically created on the VXLAN vDS Port Group
• LACP, EtherChannel, or (static) failover are the only supported load-balancing methods
• VLAN 'trunking' or virtual switch tagging (VST) is not recommended; dedicate 'access' physical uplinks to the VXLAN Port Groups
• The vCNS Edge virtual appliance provides gateway services
Physical Deployment – Intra-Host Packet Flow
[Diagram: two ESXi hosts with VTEPs on a vSphere Distributed Switch attached to the VXLAN fabric]
VM Packet Flow
1. VM sends a packet to a destination VM on the same virtual wire
2. The packet hits the vDS and is forwarded directly to the destination VM on the same host
Physical Deployment – Inter-Host Packet Flow
[Diagram: two ESXi hosts with VTEPs on a vSphere Distributed Switch attached to the VXLAN fabric]
VM Packet Flow
1. VM sends a packet to a remote destination on the same virtual wire
2. The destination VM is on another host, so the packet must traverse the VXLAN network
3. The ESXi host encapsulates the packet and transmits it via the VTEP vmkernel adapter
4. The target ESXi host running the destination VM receives the packet on its VTEP and forwards it to the VM (a conceptual sketch of this forwarding logic follows)
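The steps above can be summarised as a tiny conceptual model, sketched below in Python. It is not VMware's implementation, only an illustration of the idea: the sending VTEP looks up the inner destination MAC, encapsulates known MACs unicast to the owning VTEP's IP, floods unknown or broadcast destinations to the virtual wire's multicast group, and the receiving VTEP learns the inner-source-MAC-to-VTEP mapping from what it decapsulates. All VNIs, MACs, groups, and IPs are placeholders.

```python
# Conceptual model (not VMware's implementation) of the VTEP forwarding decision.
BROADCAST = "ff:ff:ff:ff:ff:ff"

class VirtualWire:
    def __init__(self, vni: int, mcast_group: str):
        self.vni = vni
        self.mcast_group = mcast_group
        self.mac_to_vtep: dict[str, str] = {}   # inner MAC -> remote VTEP IP

    def learn(self, inner_src_mac: str, remote_vtep_ip: str) -> None:
        """Step 4 on the receiving host: remember which VTEP owns this VM MAC."""
        self.mac_to_vtep[inner_src_mac] = remote_vtep_ip

    def outer_destination(self, inner_dst_mac: str) -> str:
        """Steps 2-3 on the sending host: choose the outer IP destination."""
        if inner_dst_mac == BROADCAST or inner_dst_mac not in self.mac_to_vtep:
            return self.mcast_group              # flood: relies on IGMP/PIM in the fabric
        return self.mac_to_vtep[inner_dst_mac]   # known MAC: unicast to the owning VTEP

wire = VirtualWire(vni=5001, mcast_group="239.1.1.100")
wire.learn("00:50:56:aa:00:02", "192.168.100.12")
print(wire.outer_destination("00:50:56:aa:00:02"))  # 192.168.100.12 (unicast)
print(wire.outer_destination(BROADCAST))            # 239.1.1.100 (multicast flood)
```

The flood path in this model is where the multicast dependency (and its scaling limits) enters the design; there is no separate control plane distributing MAC reachability.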
Physical Deployment – Routed Packet Flow
[Diagram: two ESXi hosts with VTEPs on a vSphere Distributed Switch; the Edge appliance connects the VXLAN fabric to the physical network]
VM Packet Flow
1. VM transmits a packet to a remote destination
2. The VTEP kernel module in the ESXi host encapsulates the packet and transmits it on the VXLAN network
3. The ESXi host running the Edge device receives the packet and processes it through the rule engine
4. The packet is processed by the firewall/NAT/routing rules and sent out the external interface of the Edge device
5. The packet hits the physical network infrastructure
Comparison of vSphere NIC Teaming
Load Distribution vs Load Balancing vs Active/Standby
vCNS Edge supports LACP & EtherChannel or Failover (aka Active/Standby) NIC teaming options

Load Distribution of IP flows (LACP) – attempts to evenly distribute IP traffic flows (conversations); bandwidth is NOT a consideration (illustrated as uplinks at 90% and 20% load; a simplified flow-hash sketch follows this slide)
Load Balancing of bandwidth (LBT) – attempts to evenly distribute bandwidth capacity (illustrated as uplinks at 55% and 40% load)
Active/Standby – single active link, no automatic load distribution or balancing (illustrated as uplinks at 100% and 0% load)
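The distinction in the first column can be shown with a small, simplified Python sketch of flow-based hashing. The hash function, uplink names, and addresses are stand-ins rather than the actual vDS/switch algorithm: the point is only that every conversation is pinned to a single uplink regardless of how much bandwidth it consumes, so one heavy flow can saturate a link while another sits nearly idle.

```python
# Simplified illustration of flow-based load distribution (the LACP/EtherChannel column).
# The hash below is a stand-in, not the actual switch/vDS algorithm; uplink names and
# addresses are placeholders.
import zlib

UPLINKS = ["vmnic0", "vmnic1"]

def uplink_for_flow(src_ip: str, dst_ip: str) -> str:
    """Pin a conversation to one uplink by hashing its address pair."""
    key = f"{src_ip}-{dst_ip}".encode()
    return UPLINKS[zlib.crc32(key) % len(UPLINKS)]

# Three flows: the first might be a bandwidth-heavy elephant flow, yet it still
# monopolises whichever single uplink its hash selects.
flows = [("192.168.100.11", "192.168.100.12"),
         ("192.168.100.11", "192.168.100.13"),
         ("192.168.100.11", "192.168.100.14")]

for src, dst in flows:
    print(f"{src} -> {dst}: {uplink_for_flow(src, dst)}")
```

Load Based Teaming, by contrast, watches actual uplink utilisation and moves port assignments when a link stays busy, which is why it lands in the "bandwidth" column even though it is not supported for the VXLAN transport Port Group.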
VXLAN with HP Virtual Connect Interconnects
Virtual Connect Advantage
East/West fencing (VTEP) traffic stays in the VC domain using cross-connect or stacking links, reducing North/South bandwidth requirements.

Virtual Connect Disadvantage
Virtual Connect does not support downstream server EtherChannel or LACP connectivity.
• Limited to the vCNS teaming policy of "Failover"
– Effectively an Active/Standby configuration
– Cuts North/South bandwidth efficiency in half due to the idle link
– This is not as bad as it sounds due to the East/West traffic savings from cross-connects/stacking links

Possible Solutions?
• VC Tunnel Mode? – Does it pass link aggregation control traffic? Looks to be a NO
• Multiple Edge devices using alternating Active/Standby teaming on the VXLAN Port Group?
– Static load-distribution sucks!
• Other?
VXLAN Performance
Encapsulation Overhead
VXLAN introduces an additional layer of packet processing at the hypervisor level. For
each packet on the VXLAN network, the hypervisor needs to add protocol headers on the
sender side (encapsulation) and remove these headers (decapsulation) on the receiver
side. This causes the CPU additional work for each packet.
Apart from this CPU overhead, some of the offload capabilities of the NIC cannot be used
because the inner packet is no longer accessible. The physical NIC hardware offload
capabilities (for example, checksum offloading and TCP segmentation offload (TSO)) have
been designed for standard (non-encapsulated) packet headers, and some of these
capabilities cannot be used for encapsulated packets. In such a case, a VXLAN enabled
packet will require CPU resources to perform a task that otherwise would have been done
more efficiently by physical NIC hardware. There are certain NIC offload capabilities that
can be used with VXLAN, but they depend on the physical NIC and the driver being used.
As a result, the performance may vary based on the hardware used when VXLAN is configured.
http://www.vmware.com/files/pdf/techpaper/VMware-vSphere-VXLAN-Perf.pdf
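As a rough illustration of the bandwidth side of this overhead (the lost hardware offloads are a separate, workload-dependent CPU cost, which the paper above measures), the arithmetic below assumes a standard 1500-byte guest MTU and the header sizes described earlier. It also shows where the 1550-byte transport MTU prerequisite comes from.

```python
# Back-of-the-envelope sketch of the on-the-wire cost only (CPU impact is separate),
# assuming a standard 1500-byte guest MTU and the header sizes described above.
GUEST_IP_MTU = 1500               # largest inner IP packet the VM sends
INNER_ETHERNET = 14               # inner MAC header carried inside the tunnel
OUTER_IP_UDP_VXLAN = 20 + 8 + 8   # outer IP + UDP + VXLAN headers
OUTER_ETHERNET = 14               # outer MAC header on the transport network

transport_mtu_needed = GUEST_IP_MTU + INNER_ETHERNET + OUTER_IP_UDP_VXLAN
wire_frame = transport_mtu_needed + OUTER_ETHERNET
overhead = wire_frame - (GUEST_IP_MTU + INNER_ETHERNET)

print(f"transport MTU needed: {transport_mtu_needed}")       # 1550, matching the prerequisite
print(f"added headers per full-size guest frame: {overhead} bytes "
      f"({overhead / wire_frame:.1%} of the wire frame)")     # 50 bytes, roughly 3.2%
```

The roughly 3% figure is only the header tax on the wire; in practice the measurable impact is dominated by which NIC offloads survive encapsulation, as described above.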
VXLAN Isn’t Perfect
Compared to MAC-in-MAC encapsulation (vCNI), VXLAN (MAC in UDP) moves in the right direction for broadcast scalability
• "Broadcasts on internal networks ('protected' with vCDNI) get translated into global broadcasts. This behavior totally destroys scalability. In VLAN-based designs, the number of hosts and VMs affected by a broadcast is limited by the VLAN configuration... unless you stretch VLANs all across the data center (but then you ask for trouble)." – Ivan Pepelnjak

VXLAN fenced networks communicate via the VXLAN vmk adapter, which uses only a single NetQueue NIC queue. This limits scalability by concentrating CPU pressure on a single pCPU in the host.
The vCNS teaming policy in conjunction with Virtual Connect: VC has no downstream EtherChannel/LACP support, so VXLAN will always effectively be Active/Passive going out of the chassis. You will be limited to the bandwidth of a single upstream link per vCNS Edge device (typically per cluster).
The lack of control-plane virtualization and the reliance on the physical network for MAC propagation introduce limits imposed by multicast.
– Multicast administrator expertise (not your typical data center protocol)
– Multicast segment support limits of physical network infrastructure
Thank you
