Clustering and Load Balancing
Outline
• Introduction
• Linux Virtual Server
• Microsoft load balancing solution
Introduction
• Explosive Growth of the Internet
• 100% annual growth rate
• Sites receiving unprecedented workload
• Yahoo! 625 million views per day
• AOL Web cache system receiving 5 billion requests per day
Introduction
• Load balancing is a technique to spread work between many
computers, processes, disks or other resources in order to get
optimal resource utilization and decrease computing time.
• A load balancer can be used to increase the capacity of a
server farm beyond that of a single server.
• It can also allow the service to continue even in the face of server
downtime due to server failure or server maintenance.
• A load balancer consists of a virtual server (also referred to as
vserver or VIP) which, in turn, consists of an IP address and port.
• The virtual server is bound to a number of physical services running on
the physical servers in a server farm.
• A client sends a request to the virtual server, which in turn selects a
physical server in the server farm and directs this request to the
selected physical server.
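As a rough illustration of the binding described above, the following Python sketch models a virtual server as an IP address and port bound to a list of physical services. The class name, the addresses, and the simple rotation used to pick a service are assumptions made for the example; they do not describe any particular product.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Iterator, List, Tuple

Service = Tuple[str, int]  # (real server IP, port)


@dataclass
class VirtualServer:
    """Hypothetical model of a VIP (IP address + port) bound to physical services."""
    vip: str
    port: int
    services: List[Service]
    _rotation: Iterator[Service] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # A plain rotation stands in for whichever load balancing method is configured.
        self._rotation = cycle(self.services)

    def dispatch(self, request: str) -> Tuple[Service, str]:
        """Select a physical service and direct the request to it."""
        return next(self._rotation), request


vserver = VirtualServer("192.0.2.10", 80, [("10.0.0.1", 8080), ("10.0.0.2", 8080)])
first = vserver.dispatch("GET /")
second = vserver.dispatch("GET /")
third = vserver.dispatch("GET /")   # wraps around to the first service again
```

The client only ever addresses 192.0.2.10:80; which physical service answers is decided inside `dispatch`.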
Introduction (cont.)
• Different virtual servers can be configured for different sets of
physical services, such as TCP and UDP services in general.
• Application-specific virtual servers may exist to support HTTP, FTP, SSL,
DNS, etc.
• The load balancing methods manage the selection of an appropriate
physical server in a server farm.
• Persistence can be configured on a virtual server; once a server is
selected, subsequent requests from the client are directed to the
same server.
• Persistence is sometimes necessary in applications where client state
is maintained on the server, but the use of persistence can cause
problems in failure and other situations.
• A more common method of managing persistence is to store state
information in a shared database, which can be accessed by all real
servers, and to link this information to a client with a small token such
as a cookie, which is sent in every client request.
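The shared-state approach in the last bullet can be sketched as follows. This is a minimal illustration: the in-memory dictionary stands in for the shared database, and the function names and server names are invented for the example.

```python
import secrets
from typing import Dict, Optional

# Stand-in for a shared database that every real server can reach (assumption).
SHARED_SESSIONS: Dict[str, Dict[str, str]] = {}


def login(user: str) -> str:
    """Store client state centrally and return the token to send back as a cookie."""
    token = secrets.token_hex(16)
    SHARED_SESSIONS[token] = {"user": user}
    return token


def handle_request(server_name: str, cookie: Optional[str]) -> str:
    """Any real server can answer, because state is looked up by the cookie token."""
    session = SHARED_SESSIONS.get(cookie or "")
    if session is None:
        return f"{server_name}: anonymous"
    return f"{server_name}: hello {session['user']}"


token = login("alice")
reply_a = handle_request("web1", token)   # served by one real server
reply_b = handle_request("web2", token)   # a different server sees the same state
```

Because no server holds the state locally, the load balancer is free to send each request to any server, and a server failure does not lose the session.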
Introduction (cont.)
• Load balancers also perform server monitoring of services
in a web server farm.
• In case of failure of a service, the load balancer continues to
perform load balancing across the remaining services that
are UP.
• In case of failure of all the servers bound to a virtual
server, requests may be sent to a backup virtual server (if
configured) or optionally redirected to a configured URL.
• In Global Server Load Balancing (GSLB) the load balancer
distributes load to a geographically distributed set of
server farms based on health, server load or proximity.
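The failover behaviour described on this slide can be sketched as a small selection function. The function name, the health map, and the `redirect:` prefix are assumptions for the example; real load balancers gather health through their own monitors.

```python
from typing import Dict, List, Optional


def eligible_targets(
    services: List[str],
    health: Dict[str, bool],
    backup_vserver: Optional[str] = None,
    redirect_url: Optional[str] = None,
) -> List[str]:
    """Return where requests may go, given the monitored state of each service."""
    up = [name for name in services if health.get(name, False)]
    if up:
        return up                              # balance across services that are UP
    if backup_vserver is not None:
        return [backup_vserver]                # all services down: use the backup
    if redirect_url is not None:
        return [f"redirect:{redirect_url}"]    # otherwise redirect, if configured
    return []                                  # nothing available


farm = ["web1", "web2", "web3"]
partial = eligible_targets(farm, {"web1": True, "web2": False, "web3": True})
all_down = eligible_targets(farm, {}, backup_vserver="backup-vip")
```

A service with no recorded health result is treated as down here, which is a conservative choice made for the sketch.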
Introduction (cont.)
• Load balancing methods:
• Least connections
• Round robin
• Least response time
• Least bandwidth
• Least packets
• URL hashing
• Domain name hashing
• Source IP address
• Destination IP address
• Source IP - destination IP
• Static proximity, used for GSLB
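Three of the methods listed above are simple enough to sketch in a few lines. The code below is an illustrative sketch, assuming servers are identified by plain names and that connection counts are already known; the use of SHA-256 for source IP hashing is a choice made here, not something the slides specify.

```python
import hashlib
from typing import Callable, Dict, List


def make_round_robin(servers: List[str]) -> Callable[[], str]:
    """Round robin: hand out servers in a fixed rotation."""
    position = 0

    def pick() -> str:
        nonlocal position
        server = servers[position % len(servers)]
        position += 1
        return server

    return pick


def least_connections(active: Dict[str, int]) -> str:
    """Least connections: pick the server with the fewest open connections."""
    return min(active, key=lambda name: active[name])


def source_ip_hash(servers: List[str], client_ip: str) -> str:
    """Source IP hashing: the same client address always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


servers = ["s1", "s2", "s3"]
pick = make_round_robin(servers)
rotation = [pick() for _ in range(4)]                        # s1, s2, s3, s1
quietest = least_connections({"s1": 5, "s2": 2, "s3": 7})    # s2
sticky = source_ip_hash(servers, "198.51.100.7")
```

Hashing gives a form of persistence without a connection table, at the cost of remapping many clients whenever the server list changes.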
Web Server Load Balancing
• One major issue for large Internet sites is how to handle the load of
the large number of visitors they get.
• This is routinely encountered as a scalability problem as a site
grows.
• There are several ways to accomplish load balancing.
• For example, Wikimedia balanced its load as follows:
• Round robin DNS distributed page requests evenly to one of three Squid
cache servers.
• Squid cache servers used response time measurements to distribute
page requests between seven web servers.
• In addition, the Squid servers cached pages and delivered about 75% of
all pages without ever asking a web server for help.
• The PHP scripts that ran on the web servers distributed load to one of
several database servers depending on the type of request, with updates
going to a master database server and some database queries going to
one or more slave database servers.
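The database tier in this example splits traffic by request type. A minimal sketch of that routing decision is shown below; the function name, the list of write verbs, and the server names are assumptions, and the sketch sends every read to a replica whereas the slide says only some queries go there.

```python
import random
from typing import List, Optional

WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")


def choose_database(
    sql: str,
    master: str,
    replicas: List[str],
    rng: Optional[random.Random] = None,
) -> str:
    """Send updates to the master; spread read queries over the replicas."""
    words = sql.split()
    verb = words[0].upper() if words else ""
    if verb in WRITE_VERBS or not replicas:
        return master
    return (rng or random).choice(replicas)


write_target = choose_database("UPDATE page SET touched = 1", "db-master", ["db-r1", "db-r2"])
read_target = choose_database("SELECT * FROM page", "db-master", ["db-r1", "db-r2"])
```

Reads that must see the latest write would still need to go to the master, which this sketch does not model.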
Server Load Balancing and Redundancy
• Alternative methods include the use of a Layer 4 router.
• Linux Virtual Server, an advanced open source load balancing
solution for network services.
• Network Load Balancing Services, Microsoft's load balancing
solution for Windows servers.
• Many sites are turning to multi-homing: having multiple
connections to the Internet via multiple providers to provide a
reliable, high-throughput service.
Linux Virtual Server
• Started in 1998, the Linux Virtual Server (LVS) project
combines multiple physical servers into one virtual server,
eliminating single points of failure (SPOF).
• Built with off-the-shelf components, LVS is already in use
in some of the highest-trafficked sites on the Web.
• Requirements for LVS:
• The service must scale: when the service workload increases, the
system must scale up to meet the requirements.
• The service must always be on and available, despite transient
partial hardware and software failures.
• The system must be cost-effective: the whole system must be
economical to build and expand.
• Although the whole system may be big in physical size, it should be
easy to manage.
LVS
• In LVS, a cluster of Linux servers appears as a single
(virtual) server on a single IP address.
• Client applications interact with the cluster as if it were a
single, high-performance, and highly-available server.
• Inside the virtual server, LVS directs incoming network
connections to the different servers according to
scheduling algorithms.
• Scalability is achieved by transparently adding or
removing nodes in the cluster.
• High availability is provided by detecting node or daemon
failures and reconfiguring the system accordingly, on the fly.
• For transparency, scalability, availability and
manageability, LVS is designed around a three-tier
architecture, as illustrated in the next figure.
LVS architecture
• The load balancer, servers, and shared storage are usually connected by a
high-speed network, such as 100 Mbps Ethernet or Gigabit Ethernet, so that
the intranetwork does not become a bottleneck of the system as the cluster
grows.
Three ways to balance load
• IPVS supports the following three ways to balance load:
• Virtual Server via NAT (VS/NAT)
• Virtual Server via Tunneling (VS/TUN)
• Virtual Server via Direct Routing (VS/DR)
Virtual Server via NAT (VS/NAT)
VS/NAT Workflow
1. When a user accesses a virtual service provided by the server
cluster, a request packet destined for the virtual IP address (the IP
address to accept requests for virtual service) arrives at the load
balancer.
2. The load balancer examines the packet's destination address and
port number. If they match a virtual service in the virtual server
rule table, a real server is selected from the cluster by a scheduling
algorithm and the connection is added to the hash table that records
connections. Then, the destination address and the port of the
packet are rewritten to those of the selected server, and the
packet is forwarded to the server. When an incoming packet
belongs to an established connection, the connection can be found
in the hash table and the packet is rewritten and forwarded to the
right server.
3. The request is processed by one of the physical servers.
4. When response packets come back, the load balancer rewrites the
source address and port of the packets to those of the virtual
service. When a connection terminates or times out, the
connection record is removed from the hash table.
5. A reply is sent back to the user.
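The workflow above can be sketched as a small address-rewriting table. This is an illustrative model only: the class and field names are invented, the addresses come from documentation ranges, the scheduler is a plain rotation, and connections are keyed by client address and port rather than by the full set of fields a real implementation would track.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass(frozen=True)
class Packet:
    src: Endpoint
    dst: Endpoint


class NatBalancer:
    """Toy VS/NAT balancer: rewrites destinations inbound and sources outbound."""

    def __init__(self, virtual_service: Endpoint, real_servers: List[Endpoint]) -> None:
        self.virtual_service = virtual_service
        self.real_servers = real_servers
        self.connections: Dict[Endpoint, Endpoint] = {}  # client -> chosen real server
        self._turn = 0

    def inbound(self, pkt: Packet) -> Packet:
        if pkt.dst != self.virtual_service:
            return pkt                       # no matching virtual service rule
        server = self.connections.get(pkt.src)
        if server is None:                   # new connection: schedule and record it
            server = self.real_servers[self._turn % len(self.real_servers)]
            self._turn += 1
            self.connections[pkt.src] = server
        return replace(pkt, dst=server)      # rewrite destination address and port

    def outbound(self, pkt: Packet) -> Packet:
        if self.connections.get(pkt.dst) == pkt.src:
            return replace(pkt, src=self.virtual_service)  # hide the real server
        return pkt

    def close(self, client: Endpoint) -> None:
        self.connections.pop(client, None)   # connection terminated or timed out


vip = ("203.0.113.5", 80)
balancer = NatBalancer(vip, [("10.0.0.2", 8000), ("10.0.0.3", 8000)])
client = ("198.51.100.7", 3456)
forwarded = balancer.inbound(Packet(src=client, dst=vip))
answer = balancer.outbound(Packet(src=forwarded.dst, dst=client))
```

In this run the request leaves the balancer addressed to 10.0.0.2:8000, and the reply reaches the client with 203.0.113.5:80 as its source, so the client never sees a private address.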
An example of Virtual Server via NAT
Packet rewriting flow
• A packet for the web service arrives at the load balancer.
• The load balancer chooses a real server, rewrites the packet,
and forwards it to that server.
• Replies come back to the load balancer.
• The load balancer rewrites each reply and forwards it back to
the client.
VS-NAT advantages and disadvantages
• Advantages:
• Real servers can run any OS that supports TCP/IP
• Only one IP address is needed for the load balancer;
real servers can use private IP addresses
• Disadvantages:
• The maximum number of server nodes is limited,
because both request and response packets are
rewritten by the load balancer. When the number of
server nodes increases to around 20, the load balancer
will probably become a new bottleneck
Virtual Server via IP Tunneling (VS/TUN)
• IP tunneling (also called IP encapsulation) is a
technique to encapsulate IP datagrams within IP
datagrams, which allows datagrams destined for
one IP address to be wrapped and redirected to
another IP address.
• This technique can also be used to build a virtual
server: the load balancer tunnels the request
packets to the different servers, the servers
process the requests, and return the results to
the clients directly. Thus, the service appears as
a virtual service on a single IP address.
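The encapsulation step can be illustrated with nested datagrams. The sketch below is a simplified model under stated assumptions: the `Datagram` class and helper names are invented, headers are reduced to source and destination addresses, and the addresses are documentation examples.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Datagram:
    src: str
    dst: str
    payload: Union[bytes, "Datagram"]


def tunnel(inner: Datagram, balancer_ip: str, real_server_ip: str) -> Datagram:
    """Wrap the client's datagram, unchanged, inside one addressed to a real server."""
    return Datagram(src=balancer_ip, dst=real_server_ip, payload=inner)


def decapsulate(outer: Datagram) -> Datagram:
    """On the real server: strip the outer header to recover the original request."""
    if not isinstance(outer.payload, Datagram):
        raise ValueError("not an encapsulated datagram")
    return outer.payload


def reply(request: Datagram, body: bytes) -> Datagram:
    """The real server answers from the virtual IP, directly to the client."""
    return Datagram(src=request.dst, dst=request.src, payload=body)


request = Datagram(src="198.51.100.7", dst="203.0.113.5", payload=b"GET /")
outer = tunnel(request, balancer_ip="192.0.2.1", real_server_ip="192.0.2.20")
response = reply(decapsulate(outer), b"200 OK")
```

The inner datagram still carries the virtual IP as its destination, which is why the real server can reply with that address as the source without the reply passing back through the load balancer.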
VS/TUN architecture
VS-TUN workflow
VS-TUN advantages and disadvantages
• Advantages:
• Real servers send response packets to clients directly,
which can follow different network routes
• Real servers can be in different networks, LAN/WAN
• Scalability of the virtual server is greatly increased
• Disadvantages:
• Real servers must support the IP tunneling protocol
Virtual Server via Direct Routing (VS/DR)
• The load balancer and the real servers must have one of their
interfaces physically linked by an uninterrupted segment of LAN
such as an Ethernet switch.
• The virtual IP address is shared by real servers and the load
balancer.
• Each real server has a non-ARPing, loopback alias interface
configured with the virtual IP address, and the load balancer has an
interface configured with the virtual IP address to accept incoming
packets.
• The workflow of VS/DR is similar to that of VS/NAT or VS/TUN. In
VS/DR, the load balancer directly routes a packet to the selected
server (the load balancer simply changes the MAC address of the
data frame to that of the server and retransmits it on the LAN).
• When the server receives the forwarded packet, the server
determines that the packet is for the address on its loopback alias
interface, processes the request, and finally returns the result
directly to the user.
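The key point of direct routing is that only the link-layer destination changes. The sketch below models that with a reduced frame; the class, the helper names, and the MAC and IP addresses are assumptions for illustration, and only the destination MAC is modelled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_ip: str
    dst_ip: str


def direct_route(frame: Frame, server_mac: str) -> Frame:
    """Load balancer: change only the destination MAC; the IP packet is untouched."""
    return replace(frame, dst_mac=server_mac)


def server_accepts(frame: Frame, own_mac: str, loopback_vip: str) -> bool:
    """Real server: accept frames sent to its MAC whose IP destination is the VIP
    configured on its non-ARPing loopback alias."""
    return frame.dst_mac == own_mac and frame.dst_ip == loopback_vip


VIP = "203.0.113.5"
BALANCER_MAC = "02:00:00:00:00:01"
SERVER_MAC = "02:00:00:00:00:02"

incoming = Frame(dst_mac=BALANCER_MAC, src_ip="198.51.100.7", dst_ip=VIP)
forwarded = direct_route(incoming, SERVER_MAC)
accepted = server_accepts(forwarded, SERVER_MAC, VIP)   # True
```

Because the source and destination IP addresses are preserved, the server can reply from the virtual IP straight to the client. The loopback alias must not answer ARP, otherwise real servers would compete with the load balancer for the virtual IP on the LAN.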
VS/DR architecture
VS-DR workflow
VS-DR advantages and disadvantages
• Advantages:
• Real servers send response packets to clients directly, which can
follow different network routes
• No tunneling overhead
• Disadvantages:
• Servers must have a non-ARP alias interface
• The load balancer and the servers must have one of their interfaces
in the same LAN segment
Comparison
                 VS/NAT          VS/TUN       VS/DR
Server           any             tunneling    non-ARP device
Server network   private         LAN/WAN      LAN
Server number    low (10~20)     high (100)   high (100)
Server gateway   load balancer   own router   own router
Note: these numbers are estimates based on the
assumption that the load balancer and backend servers
have the same hardware configuration.
Scheduling algorithms
• Round-Robin
• Weighted Round-Robin
• Least-Connection
• Weighted Least-Connection
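The weighted variants can be sketched as follows. This is a hedged illustration: weights are assumed to be positive integers, the server names are invented, and the interleaving used for weighted round-robin is one possible scheme, not necessarily the one IPVS implements.

```python
from functools import reduce
from math import gcd
from typing import Dict, Iterator


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Yield servers in rotation, each in proportion to its positive integer weight."""
    names = list(weights)
    step = reduce(gcd, weights.values())
    highest = max(weights.values())
    index, threshold = -1, 0
    while True:
        index = (index + 1) % len(names)
        if index == 0:
            # Start of a pass: lower the threshold, wrapping back to the top weight.
            threshold -= step
            if threshold <= 0:
                threshold = highest
        if weights[names[index]] >= threshold:
            yield names[index]


def weighted_least_connection(active: Dict[str, int], weights: Dict[str, int]) -> str:
    """Pick the server with the fewest active connections relative to its weight."""
    return min(active, key=lambda name: active[name] / weights[name])


schedule = weighted_round_robin({"A": 4, "B": 3, "C": 2})
one_cycle = "".join(next(schedule) for _ in range(9))   # "AABABCABC"
target = weighted_least_connection({"A": 8, "B": 3, "C": 2}, {"A": 4, "B": 1, "C": 2})  # "C"
```

With weights 4, 3 and 2, one cycle of nine picks contains A four times, B three times and C twice. Plain round-robin and least-connection are the special case where every weight is equal.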
References
• Wikipedia
• http://www.linux-vs.org
