High Availability of Azure Applications (PaaS)
Himanshu Sahu
Mindfire Solutions
himanshus@mindfiresolutions.com
Agenda
Introduction
Windows Azure Role Architecture
Fault Domains in Windows Azure
Update Domains in Windows Azure
Windows Azure Host OS Updates
Windows Azure Guest OS Updates
Techniques for High Availability
High Availability in Azure
Introduction
ALWAYS ON
Reliability and Scalability
Design for failure
Implement separation of function
Use a service-oriented architecture
Windows Azure Role Architecture
Fault Domains in Windows Azure
Fault Domains
A fault domain is a physical unit of failure, closely tied to the
physical infrastructure in the data centers. In Windows Azure a rack
can be considered a fault domain, although there is no 1:1 mapping
between fault domains and racks.
The Windows Azure Fabric is responsible for deploying the instances
of your application in different fault domains. At present the Fabric
ensures that your application uses at least 2 (two) fault domains.
As a developer, you have no direct control over how many fault
domains your application will use.
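As a rough illustration of the placement idea above, the sketch below spreads role instances across fault domains round-robin. The function name `place_instances`, the instance names, and the round-robin rule are assumptions made for illustration; the slides do not describe the Fabric's actual placement algorithm.

```python
from collections import defaultdict


def place_instances(instance_count: int, fault_domain_count: int = 2) -> dict[int, list[str]]:
    """Spread role instances across fault domains round-robin (illustrative only)."""
    if instance_count < 1 or fault_domain_count < 1:
        raise ValueError("counts must be positive")
    placement: dict[int, list[str]] = defaultdict(list)
    for i in range(instance_count):
        # Instance i lands in fault domain i mod N, so consecutive
        # instances never share a domain while N > 1.
        placement[i % fault_domain_count].append(f"WebRole_IN_{i}")  # hypothetical name
    return dict(placement)


if __name__ == "__main__":
    for domain, instances in sorted(place_instances(4).items()):
        print(f"fault domain {domain}: {instances}")
```

With at least two instances and two fault domains, no single domain holds every instance of the role.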
Update Domains in Windows Azure
Update Domains
An upgrade domain (also called an update domain) is a logical unit
that determines how a particular service will be upgraded.
The default number of upgrade domains configured for your
application is 5 (five). You can control how many upgrade domains
your application uses through the upgradeDomainCount attribute
in your service definition file (CSDEF).
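To make the effect of the upgrade domain count concrete, the sketch below estimates how many instances go offline together when one upgrade domain is updated at a time. The mapping of instance index to domain (index modulo the domain count) is an assumption for illustration, not a documented rule, and both function names are hypothetical.

```python
import math


def upgrade_domain_of(instance_index: int, upgrade_domain_count: int = 5) -> int:
    """Assumed mapping: instance i belongs to upgrade domain i mod count."""
    return instance_index % upgrade_domain_count


def max_offline_during_update(instance_count: int, upgrade_domain_count: int = 5) -> int:
    """Largest number of instances taken down together when one domain updates at a time."""
    if instance_count < 1 or upgrade_domain_count < 1:
        raise ValueError("counts must be positive")
    # With fewer instances than domains, only some domains are populated.
    used_domains = min(instance_count, upgrade_domain_count)
    return math.ceil(instance_count / used_domains)


if __name__ == "__main__":
    for count in (2, 5, 12):
        print(count, "instances ->", max_offline_during_update(count), "offline at most")
```

Under this assumption, fewer upgrade domains mean a faster rollout but a larger share of capacity offline at each step.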
Windows Azure Host Updates
When and Why
Windows Azure deploys updates to the host OS approximately
once per month. This ensures that Windows Azure provides a
reliable, efficient and secure platform for hosting your applications.
The host agent (HA) consists of multiple subcomponents, such as the
Network Agent (NA) that manages virtual machine VLANs and the
virtual disk driver that connects virtual machine disks to the blobs
containing their data in Windows Azure Storage. Azure therefore
updates the HA and its subcomponents at different intervals,
depending on when a fix or new functionality is ready.
Windows Azure Host Updates
How
A host OS update reboots the instances on that server, and the fabric
controller ensures that instances from only one upgrade domain are
rebooted at a time.
Virtual machines on the server that have an Input Endpoint in their
role’s service model are removed from the load balancer rotation, so
new requests go to other instances of that role according to the Azure
load-balancing policies.
Each virtual machine hosting a Web or Worker Role receives a Stopping
event, whereas VM Roles receive a standard Windows shutdown event.
Web, Worker, and VM Roles are allowed five minutes to respond to the
stopping or shutdown event before they are forcibly stopped.
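The five-minute window suggests that a role should finish or hand off in-flight work as soon as it is told to stop. The sketch below is a language-neutral illustration in Python of draining a work queue against a deadline; it is not the Azure SDK API, and `drain_on_stopping`, `pending`, and `handle` are hypothetical names.

```python
import time
from collections import deque
from typing import Callable, Deque

STOP_BUDGET_SECONDS = 5 * 60  # the five-minute window described above


def drain_on_stopping(
    pending: Deque[str],
    handle: Callable[[str], None],
    budget_seconds: float = STOP_BUDGET_SECONDS,
    clock: Callable[[], float] = time.monotonic,
) -> list[str]:
    """Finish queued work after a stopping event; return items left unprocessed."""
    deadline = clock() + budget_seconds
    while pending and clock() < deadline:
        handle(pending.popleft())
    # Anything still queued would be lost when the instance is forcibly
    # stopped, so a real role should persist or re-queue it.
    return list(pending)


if __name__ == "__main__":
    processed: list[str] = []
    leftover = drain_on_stopping(deque(["job-1", "job-2"]), processed.append)
    print("processed:", processed, "leftover:", leftover)
```

The `clock` parameter exists only so the deadline logic can be exercised without waiting in real time.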
Windows Azure Host Updates
How
After all guest virtual machines are stopped, the root partition OS
shuts down and the server reboots.
The updated root partition OS starts.
The virtual machines hosted on the server boot and start their
application code.
Virtual machines hosting service roles with Input Endpoints
reconnect to the load balancer, enabling them to receive client
requests.
Windows Azure Guest Updates
Once the Host OS has finished upgrading across the datacenter, the
Guest OS is upgraded for services that are configured to use
automatic Guest OS versions. This upgrade proceeds using the
standard upgrade domain rules for your service.
Your VM will be rebooted and the Windows partition (the D drive)
will be reimaged with the upgraded OS.
The Guest OS update process is much faster than the Host OS
update, since the fabric only has to coordinate the update within
your hosted service and your upgrade domains.
Availability
An available application considers the availability of its underlying
infrastructure and dependent services. Available applications
remove single points of failure through redundancy and resilient
design.
Azure SLA
More Instances in Azure
Make Guest OS Update Manual
Availability
Scalability directly affects availability—an application that fails
under increased load is no longer available. Scalable applications
are able to meet increased demand with consistent results in
acceptable time windows.
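A threshold rule is one simple way to picture auto scaling. The sketch below is illustrative only: the CPU thresholds, the instance limits, and the function name are assumptions, not Azure defaults. The minimum of two instances reflects the redundancy theme of this deck.

```python
def desired_instance_count(
    current: int,
    avg_cpu_percent: float,
    scale_out_above: float = 75.0,  # assumed threshold
    scale_in_below: float = 25.0,   # assumed threshold
    minimum: int = 2,               # keep at least two instances for redundancy
    maximum: int = 10,              # assumed cost ceiling
) -> int:
    """Return the instance count a simple threshold rule would ask for."""
    if avg_cpu_percent > scale_out_above:
        target = current + 1
    elif avg_cpu_percent < scale_in_below:
        target = current - 1
    else:
        target = current
    # Clamp so scaling in never removes redundancy and scaling out stays bounded.
    return max(minimum, min(maximum, target))


if __name__ == "__main__":
    for cpu in (10.0, 50.0, 90.0):
        print(f"cpu={cpu}% -> {desired_instance_count(4, cpu)} instances")
```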
Auto Scaling in Azure
Availability
Protection against hardware failures
Because every application is made up of multiple instances of each
role, hardware failures—a disk crash, a network fault, or the death
of a server machine—won’t take down the application. To help
with this, the fabric controller doesn’t choose machines for an
application’s instances at random. Instead, different instances of
the same role are placed in different fault domains. A fault domain
is a set of hardware—computers, switches, and more—that share a
single point of failure. (For example, all of the computers in a single
fault domain might rely on the same switch to connect to the
network.) Because of this, a single hardware failure can’t take
down an entire application. The application might temporarily lose
some instances, but it will continue to behave correctly.
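The claim that a single hardware failure cannot take down the application can be checked mechanically for any given placement. The sketch below assumes a placement is given as a mapping from fault domain number to the instance names in it; that representation and the function name are hypothetical.

```python
def survives_any_single_domain_failure(placement: dict[int, list[str]]) -> bool:
    """True if losing any one fault domain still leaves at least one instance running."""
    total = sum(len(instances) for instances in placement.values())
    if total == 0:
        return False  # nothing deployed, so nothing can survive
    # Remove each domain in turn and check that some instance remains.
    return all(total - len(instances) > 0 for instances in placement.values())


if __name__ == "__main__":
    spread = {0: ["WebRole_IN_0"], 1: ["WebRole_IN_1"]}
    packed = {0: ["WebRole_IN_0", "WebRole_IN_1"]}
    print(survives_any_single_domain_failure(spread))
    print(survives_any_single_domain_failure(packed))
```

A role with a single instance fails this check no matter where it is placed, which is why running at least two instances per role matters.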
Availability
Protection against software failures
The fabric controller can also detect failures caused by software. If
the code in an instance crashes or the VM in which it’s running
goes down, the fabric controller will start either just the code or, if
necessary, a new VM for that role. While any work the instance
was doing when it failed will be lost, the new instance will become
part of the application as soon as it starts running.
Availability
The ability to update applications with no application
downtime
When a new version of the application needs to be deployed, the
fabric controller can shut down the instances in just one update
domain, update the code for these, then create new instances
from that new code. Once those instances are running, it can do
the same thing to instances in the next update domain, and so on.
While users might see different versions of the application during
this process, depending on which instance they happen to interact
with, the application as a whole remains continuously available.
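The domain-by-domain walk described above can be simulated in a few lines. The sketch below is a simplified model, not the fabric controller's implementation: the function name, the version labels, and the instance names are hypothetical. It records, for each step, how many instances keep serving and which versions users might see.

```python
def rolling_update(domains: dict[int, list[str]], old: str = "v1", new: str = "v2"):
    """Update one domain at a time; return per-step serving info and final versions."""
    versions = {name: old for members in domains.values() for name in members}
    steps = []
    for domain in sorted(domains):
        offline = set(domains[domain])
        # Instances outside the current domain keep serving traffic.
        serving = {n: v for n, v in versions.items() if n not in offline}
        steps.append((domain, len(serving), sorted(set(serving.values()))))
        for name in offline:
            versions[name] = new  # the domain comes back on the new version
    return steps, versions


if __name__ == "__main__":
    steps, final = rolling_update({0: ["a", "b"], 1: ["c", "d"], 2: ["e"]})
    for domain, serving_count, seen in steps:
        print(f"updating domain {domain}: {serving_count} serving, versions {seen}")
```

In this model the serving count never reaches zero as long as instances sit in at least two update domains, and the middle steps show the mixed-version window mentioned above.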
Availability
The ability to update Windows and other supporting
software with no application downtime.
The answer is update domains. :)
Resources
https://msdn.microsoft.com/enus/library/azure/dn251004.aspx
http://blogs.msdn.com/b/kwill/archive/2011/05/05/windows-azure-r
http://blog.toddysm.com/2010/04/upgrade-domains-and-fault-doma
http://blogs.msdn.com/b/kwill/archive/2012/09/19/role-instance-restarts-due-to-os-upgrades.aspx
Questions?
Thank you!
