Stephane Lapointe, Orckestra - stephane@stephanelapointe.net / @s_lapointe
Guy Barrette, freelance Architect/Developer - guy@guybarrette.com / @GuyBarrette
Francois Boucher, Lixar IT - fboucher@frankysnotes.com / @fboucheros
Alexandre Brisebois, Microsoft – alexandre.brisebois@microsoft.com / @brisebois
Stephane Lapointe, Frank Boucher & Alexandre Brisebois: Les micro-services et Azure Service Fabric (Global Azure Bootcamp 2016)
Today we’re going to learn about how:
• Microservices enable development and management flexibility
• Service Fabric is the platform for building applications with a microservices design approach
• Service Fabric is battle tested and provides a rich platform for both development and management of services at scale
Azure Momentum
• 1 trillion messages delivered every month with Event Hubs
• 100,000 new Azure customer subscriptions/month
• 20 million SQL database hours used every day
• >5 trillion storage transactions every month
• 60 billion hits to websites run on Azure Web App Service
• 425 million Azure Active Directory users
• 57% of Fortune 500 companies use Microsoft Azure
• >50 trillion storage objects in Azure
• 1.4 million SQL databases deployed in Azure
“Microsoft is growing its cloud revenue faster than Amazon” – Business Insider 2016
AWS revenue grew about 69% but Microsoft Azure revenue grew by 127%
What do these have in common?
Microservices
Monolithic application approach
• A monolith app contains domain-specific functionality and is normally divided by functional layers such as web, business and data
• Scales by cloning the app on multiple servers/VMs/containers
Microservices application approach
• A microservice application separates functionality into separate smaller services.
• Scales out by deploying each service independently, creating instances of these services across servers/VMs/containers
State in Monolithic approach
• Single monolithic database
• Tiers of specific technologies
State in Microservices approach
• Graph of interconnected microservices
• State typically scoped to the microservice
• Variety of technologies used
• Remote storage for cold data
[Diagram labels: stateless presentation services; stateless services; stateful services; stateless services with separate stores]
[Diagram: application lifecycle loop with four numbered stages (Plan, Develop + Test, Release, Monitor + Learn), spanning Development and Production: Design/Develop, Upgrade, Operate]
A Microservice Platform
[Diagram: Public Cloud, Other Clouds, On Premises / Private cloud]
What Is Azure Service Fabric?
• Next generation of PaaS on Azure
• Elastic scale, OS updates, SF updates
• Microservices platform for Windows and Linux
• DevOps, rolling upgrades, etc.
• Polycloud including on-premises
• Programming models
  • Stateless Win32 apps written in any language (some features not supported)
  • Reliable Services: stateless & stateful (for hot data; gives low-latency reads)
  • OWIN/ASP.NET Core*
• Service Fabric is free of charge
• SDK: http://aka.ms/ServiceFabricSDK
Cloud Services vs Service Fabric
Azure Cloud Services (Web & Worker Roles)
• 1 role instance per VM
• Uneven utilization
• Low density
• Slow deployment & upgrade (bound to VM)
• Slow scaling and failure recovery
• Limited fault tolerance
Azure Service Fabric (Services)
• Many microservices per VM
• Even utilization (by default, customizable)
• High density (customizable)
• Fast deployment & upgrade
• Fast scaling of independent microservices
• Tunable fast fault tolerance
Microsoft Azure Service Fabric
A platform for reliable, hyperscale, microservice-based applications
Runs on:
• Azure (Windows Server, Linux)
• Hosted Clouds (Windows Server, Linux)
• Private Clouds (Windows Server, Linux)
Capabilities:
• High Availability
• Hyper-Scale
• Hybrid Operations
• High Density
• Rolling Upgrades
• Stateful services
• Low Latency
• Fast startup & shutdown
• Container Orchestration & lifecycle management
• Replication & Failover
• Simple programming models
• Load balancing
• Self-healing
• Data Partitioning
• Automated Rollback
• Health Monitoring
• Placement Constraints
[Diagram label: microservices]
Service Fabric Subsystems
• Service discovery
• Reliability, Availability, Replication, Service Orchestration
• Application lifecycle
• Fault Inject, Test in production
• Federates a set of nodes to form a consistent scalable fabric
• Secure point-to-point communication
• Deployment, Upgrade and Monitoring
[Diagram label: microservices]
Cluster
• Set of OS instances (real or virtual) stitched together to form a pool of resources
• Cluster can scale to 1000s of machines, is self-repairing, and scales up or down
• Acts as environment-independent abstraction layer
[Diagram: six Fabric Nodes, each running on its own Windows OS instance]
[Diagram: Datacenter (Azure, On Premises, Other Clouds). A Load Balancer fronts PC/VM #1 to PC/VM #5, each running Service Fabric and your code, etc. Management traffic to deploy your code uses port 19080; app web requests use port 80/443/?]
Service Fabric’s Infrastructure Services
• Cluster Manager (ports 19080 [REST] & 19000 [TCP]): performs cluster REST & PowerShell/FabricClient operations
• Failover Manager: rebalances resources as nodes come/go
• Naming: maps service instances to endpoints
• Image store (not on OneBox): contains your Application packages
• Upgrade Service (Azure only): coordinates upgrading SF itself with Azure’s SFRP
[Diagram: replicas of the Cluster Manager (C), Failover Manager (F), Naming (N), Image store (I) and Upgrade Service (U) spread across Node #1 to Node #5]
Microservices with Azure Service Fabric
Service Fabric Microservices
[Diagram: App Type Packages App1 and App2 deployed onto Service Fabric Cluster VMs]
Guest Executables
• Bring any exe
• Any language
• Any programming model
• Packaged as Application
• Gets versioning, upgrade, monitoring, health, etc.
Reliable Services
• Stateless & stateful services
• Concurrent, granular state changes
• Use of the Reliable Collections
• Transactions across collections
• Full platform integration
Reliable Actors
• Stateless & stateful actor objects
• Simplified programming model
• Single-threaded model
• Great for scaled-out compute and state
Programming models: Reliable Services
• Reliable collections make it easy to build stateful services
• An evolution of .NET collections, for the cloud
• ReliableDictionary<T1,T2> and ReliableQueue<T>
Collections
• Single machine
• Single-threaded
Concurrent Collections
• Single machine
• Multi-threaded
Reliable Collections
• Multi-machine
• Replicated (HA)
• Persistence (durable)
• Asynchronous
• Transactional
protected override async Task RunAsync(CancellationToken cancellationToken)
{
    // Get (or create on first use) the reliable collections owned by this service.
    var requestQueue = await this.StateManager.GetOrAddAsync<IReliableQueue<CustomerRecord>>("requests");
    var locationDictionary = await this.StateManager.GetOrAddAsync<IReliableDictionary<Guid, LocationInfo>>("locs");
    var personDictionary = await this.StateManager.GetOrAddAsync<IReliableDictionary<Guid, Person>>("ppl");
    var customerListDictionary = await this.StateManager.GetOrAddAsync<IReliableDictionary<Guid, object>>("customers");
    while (true)
    {
        cancellationToken.ThrowIfCancellationRequested();
        Guid customerId = Guid.NewGuid();
        bool processed = false;
        // Every operation that shares tx is committed together or not at all.
        using (var tx = this.StateManager.CreateTransaction())
        {
            var customerRequestResult = await requestQueue.TryDequeueAsync(tx);
            if (customerRequestResult.HasValue)
            {
                await customerListDictionary.AddAsync(tx, customerId, new object());
                await personDictionary.AddAsync(tx, customerId, customerRequestResult.Value.person);
                await locationDictionary.AddAsync(tx, customerId, customerRequestResult.Value.locInfo);
                await tx.CommitAsync();
                processed = true;
            }
        }
        // Queue was empty: back off briefly instead of spinning.
        if (!processed)
        {
            await Task.Delay(TimeSpan.FromMilliseconds(100), cancellationToken);
        }
    }
}
Everything happens or nothing happens!
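The all-or-nothing behaviour can be illustrated without the Service Fabric SDK. The following is a minimal Python sketch, not the Reliable Collections implementation: the `Transaction` class and the plain dicts standing in for reliable dictionaries are assumptions made for illustration. Writes are buffered and only reach the dictionaries on commit, so a failure partway through leaves every collection untouched.

```python
class Transaction:
    """Hypothetical stand-in: buffers writes and applies them only on commit."""

    def __init__(self):
        self._pending = []  # list of (target_dict, key, value)

    def add(self, target, key, value):
        # Loosely mirrors an "add" that fails when the key already exists.
        already_pending = any(t is target and k == key for t, k, _ in self._pending)
        if key in target or already_pending:
            raise KeyError(f"duplicate key: {key!r}")
        self._pending.append((target, key, value))

    def commit(self):
        for target, key, value in self._pending:
            target[key] = value
        self._pending.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the block without commit() discards every buffered write.
        self._pending.clear()
        return False


people = {}
locations = {"c2": "Montreal"}

# Both writes succeed, so both become visible at commit.
with Transaction() as tx:
    tx.add(people, "c1", "Ada")
    tx.add(locations, "c1", "Quebec")
    tx.commit()

# The second write fails, so the first one is never applied either.
try:
    with Transaction() as tx:
        tx.add(people, "c2", "Bob")
        tx.add(locations, "c2", "Ottawa")
        tx.commit()
except KeyError:
    pass
```

After the failed transaction, `people` still holds only the customer from the first, committed transaction.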
Programming models: Reliable Actors
• Independent units of compute and state
• Large number of them executing in parallel
• Communicates using asynchronous messaging
• Single threaded execution
• Automatically created and dehydrated as necessary
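A rough Python sketch of the single-threaded (turn-based) actor idea follows. It is not the Reliable Actors API: `CounterActor`, `ActorRuntime` and the per-actor `asyncio.Lock` are assumptions chosen to show that calls to one actor are processed one at a time while different actors run independently, and that actors are created on first use.

```python
import asyncio


class CounterActor:
    """Hypothetical actor: only one call is processed at a time per actor."""

    def __init__(self, actor_id):
        self.actor_id = actor_id
        self.count = 0
        self._turn = asyncio.Lock()  # enforces turn-based execution

    async def increment(self):
        async with self._turn:
            current = self.count
            await asyncio.sleep(0)  # yield, as a real actor might during I/O
            self.count = current + 1
            return self.count


class ActorRuntime:
    """Creates actors on first use, keyed by actor id."""

    def __init__(self):
        self._actors = {}

    def get(self, actor_id):
        if actor_id not in self._actors:
            self._actors[actor_id] = CounterActor(actor_id)
        return self._actors[actor_id]


async def main():
    runtime = ActorRuntime()
    calls = [runtime.get("player-1").increment() for _ in range(50)]
    calls += [runtime.get("player-2").increment() for _ in range(20)]
    await asyncio.gather(*calls)
    return runtime.get("player-1").count, runtime.get("player-2").count


result = asyncio.run(main())
```

Without the per-actor lock, the read, yield, write sequence in `increment` would lose updates under concurrent calls; with it, each actor's state stays consistent without the caller managing any concurrency.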
Comparing Reliable Actors & Reliable Services
Reliable Actors APIs
• Your problem space involves many small independent units of state and logic
• You want to work with single-threaded objects while still being able to scale and maintain consistency
• You want the framework to manage the concurrency and granularity of state
• You want the platform to manage communication for you
Reliable Services APIs
• You need to maintain logic across multiple components
• You want to use reliable collections (like .NET Dictionary and Queue) to store and manage your state
• You want to control the granularity and concurrency of your state
• You want to manage the communication and control the partitioning scheme for your service
http://bit.ly/sf-setup
http://bit.ly/sf-lab-2
Application Packaging & Deployment
<ServiceManifest Name="QueueService" Version="1.0">
  <ServiceTypes>
    <StatefulServiceType ServiceTypeName="QueueServiceType" HasPersistedState="true" />
  </ServiceTypes>
  <CodePackage Name="Code" Version="1.0">
    <EntryPoint>
      <ExeHost>
        <Program>ServiceHost.exe</Program>
      </ExeHost>
    </EntryPoint>
  </CodePackage>
  <ConfigPackage Name="Config" Version="1.0" />
  <DataPackage Name="Data" Version="1.0" />
</ServiceManifest>
[Diagram: a Cluster holds the eStore App Type (Gallery Svc Type, Payment Svc Type) and two named applications created from it, the “Fabrikam” eStore App and the “Contoso” eStore App, each with a “G” Gallery Svc and a “P” Payment Svc]
<ApplicationManifest
    ApplicationTypeName="eStoreAppType"
    ApplicationTypeVersion="1.0" ...>
  <ServiceManifestImport>
    <ServiceManifestRef
        ServiceManifestName="GalleryServicePkg"
        ServiceManifestVersion="1.0" ... />
    <ServiceManifestRef
        ServiceManifestName="PaymentServicePkg"
        ServiceManifestVersion="1.0" ... />
    ...
  </ServiceManifestImport>
</ApplicationManifest>
C:\eStoreAppTypePkg
│   ApplicationManifest.xml
│
├───GalleryServicePkg
│   │   ServiceManifest.xml
│   │
│   └───CodePkg
│           Gallery.exe
│           GalleryLib.dll
│           Setup.bat
│
└───PaymentServicePkg
    │   ServiceManifest.xml
    │
    └───CodePkg
            Payment.exe
<ServiceManifest Name="GalleryServicePkg" Version="1.0">
  <ServiceTypes>
    <StatelessServiceType ServiceTypeName="GalleryServiceType" ... >
    </StatelessServiceType>
  </ServiceTypes>
  <CodePackage Name="CodePkg" Version="1.0">
    <EntryPoint>
      <ExeHost>
        <Program>Gallery.exe</Program>
      </ExeHost>
    </EntryPoint>
  </CodePackage>
  <Resources>
    <Endpoints>
      <Endpoint Name="GalleryEndpoint" Type="Input" Protocol="http" Port="8080" />
    </Endpoints>
  </Resources>
</ServiceManifest>
C:\eStoreAppTypePkg
│   ApplicationManifest.xml
│
├───GalleryServicePkg
│   │   ServiceManifest.xml
│   │
│   └───CodePkg
│           Gallery.exe
│           GalleryLib.dll
│
└───PaymentServicePkg
    │   ServiceManifest.xml
    │
    └───CodePkg
            Payment.exe
Cluster: Management, Billing (VMs), Geolocation, Multitenancy
  1+ Named Applications: Isolation, Multitenancy, Unit of versioning/config
    1+ Named Services: Code package(s), Multitenancy (w/o isolation)
      Stateless: 1 Partition (no value)
        1+ Instances: Scale, Availability
      Stateful: 1+ Partitions (Addressability, Scale)
        1+ Replicas: Availability
• You can dynamically start/remove named apps/services and instances; not partitions.
• The # of instances is set per named service; all partitions have the same # of instances
App Type | App Version | App Name
“A” | 1.0 | fabric:/A1

App Name | Service Type | Service Name | # Partitions | # Instances
fabric:/A1 | “S” | fabric:/A1/S1 | 1 | 3
fabric:/A1 | “S” | fabric:/A1/S2 | 2 | 2

[Diagram: instances placed across Node #1 to Node #5: f:/A1/S1, P1, I1; f:/A1/S1, P1, I2; f:/A1/S1, P1, I3; f:/A1/S2, P1, I1; f:/A1/S2, P1, I2; f:/A1/S2, P2, I1; f:/A1/S2, P2, I2]
NOTE: When using SF programming models, instances from the same named app/service are in the same process
[Diagram: Named App “fabric:/Contoso” contains the stateful Named Svc “fabric:/Contoso/Payment” (Partition-1 and Partition-2, each with Replica-1 to Replica-3, plus one Replica-4) and the stateless Named Svc “fabric:/Contoso/Gallery” (Partition-1 with Instance-1 and Instance-2)]
Deploy Application Type & Create App Instance
PowerShell App Pkg & Named App/Service Ops
• Copy-ServiceFabricApplicationPackage (to image store)
• Register-ServiceFabricApplicationType (in image store)
• Remove-ServiceFabricApplicationPackage (from image store)
• New-ServiceFabricApplication (named app)
• New-ServiceFabricService (named svc)
• Remove-ServiceFabricService (named svc)
• Remove-ServiceFabricApplication (named app & its named svcs)
• Unregister-ServiceFabricApplicationType (from image store)
  • No named app can be running
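The ordering constraints behind these operations can be modelled in a few lines. This is a toy Python model rather than the PowerShell module or any Service Fabric API: `ClusterSketch` and its method names are hypothetical, and the package path and names are borrowed from the eStore example for illustration. It shows that a type must be registered from a copied package before a named app is created, and that unregistering is refused while a named app of that type exists.

```python
class ClusterSketch:
    """Toy model of the image store and named apps (not the real API)."""

    def __init__(self):
        self.packages = set()     # package paths copied to the image store
        self.app_types = set()    # registered (type name, version) pairs
        self.named_apps = {}      # app name -> (type name, version)

    def copy_package(self, path):
        self.packages.add(path)

    def register_type(self, path, type_name, version):
        if path not in self.packages:
            raise RuntimeError("copy the package to the image store first")
        self.app_types.add((type_name, version))

    def new_application(self, app_name, type_name, version):
        if (type_name, version) not in self.app_types:
            raise RuntimeError("application type is not registered")
        self.named_apps[app_name] = (type_name, version)

    def remove_application(self, app_name):
        del self.named_apps[app_name]

    def unregister_type(self, type_name, version):
        if (type_name, version) in self.named_apps.values():
            raise RuntimeError("a named app still uses this type")
        self.app_types.discard((type_name, version))


cluster = ClusterSketch()
cluster.copy_package("eStoreAppTypePkg")
cluster.register_type("eStoreAppTypePkg", "eStoreAppType", "1.0")
cluster.new_application("fabric:/Contoso", "eStoreAppType", "1.0")

# Unregistering while fabric:/Contoso exists is refused.
try:
    cluster.unregister_type("eStoreAppType", "1.0")
    blocked = False
except RuntimeError:
    blocked = True

# Removing the named app first makes the unregister succeed.
cluster.remove_application("fabric:/Contoso")
cluster.unregister_type("eStoreAppType", "1.0")
```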
http://bit.ly/sf-lab-3
Add a web front-end to your application
http://bit.ly/sf-lab-4
Running Microservices at Scale!
[Diagram: partitions P1 to P4 and replicas marked S spread across Node 1 to Node 6]
• Services can be partitioned for scale-out.
• You can choose your own partitioning scheme.
• Service partitions are striped across machines in the cluster.
• Replicas automatically scale out & in on cluster changes
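One common way to picture a partitioning scheme is a ranged split of the 64-bit integer key space. The Python sketch below is an illustration only: the function names, the use of SHA-256 to derive a key from a string, and the choice of four partitions are all assumptions, not something the slides prescribe.

```python
import hashlib

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def partition_key(name: str) -> int:
    """Hash a string to a signed 64-bit key (the hash choice is an assumption)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def partition_ranges(count: int):
    """Split the Int64 key space into `count` contiguous, near-equal ranges."""
    span = INT64_MAX - INT64_MIN + 1
    bounds = [INT64_MIN + (span * i) // count for i in range(count + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(count)]


def find_partition(key: int, ranges) -> int:
    """Return the index of the range that contains `key`."""
    for index, (low, high) in enumerate(ranges):
        if low <= key <= high:
            return index
    raise ValueError("key outside the Int64 range")


ranges = partition_ranges(4)
placement = {
    name: find_partition(partition_key(name), ranges)
    for name in ["alice", "bob", "carol"]
}
```

Every key maps to exactly one partition, so a client can compute which partition owns a given customer before resolving that partition's endpoint.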
Detailed System Optics
Performance and stress response
• Rich built-in metrics for Actors and Services programming models
• Easy to add custom application performance metrics
Health status monitoring
• Built-in health status for cluster and services
• Flexible and extensible health store for custom app health reporting
• Allows continuous monitoring for real-time alerting on problems in production
• Repair suggestions. Examples: slow RunAsync cancellations, RunAsync failures
• All important events logged. Examples: app creation, deploy and upgrade records; all Actor method calls.
Custom Application Tracing
• ETW == fast industry-standard logging technology
• Works across environments. Same tracing code runs on devbox and also on production clusters on Azure.
• Easy to add, and the system appends all the needed metadata such as node, app, service, and partition.
Choice of Tools
• Visual Studio Diagnostics Events Viewer
• Windows Event Viewer
• Windows Azure Diagnostics + Operational Insights
• Easy to plug in your preferred tools: Kibana, Elasticsearch and more
Scalability
High Availability
Reliability
Resiliency
Durability
[Diagram: cluster node ring over time. Time = t0: nodes 83, 76, 64, 50, 46; nodes failed. Time = t1: failures detected, cluster reconfigured. Time = t2: new node 61 arrived; ring shows 83, 61, 50, 46]
Stateful Microservices - Replication
[Diagram: Service Fabric Cluster VMs. A Write reaches the Primary (P), which replicates it to the Secondaries (S); each Secondary returns an Ack, and the Write is then Acked. A Read returns the Value.]
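The write path in this diagram can be sketched as a quorum acknowledgement. The Python below is a simplified model, not Service Fabric's replicator: the `Replica` and `ReplicaSet` classes are hypothetical, the majority rule is the assumed quorum definition, and the sketch does not roll back writes that fail to reach a quorum.

```python
class Replica:
    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def apply(self, key, value):
        """Store the write and return True (an ack) if the replica is up."""
        if not self.up:
            return False
        self.data[key] = value
        return True


class ReplicaSet:
    """One primary plus secondaries; a write succeeds on a majority of acks."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = secondaries

    def write(self, key, value):
        replicas = [self.primary] + self.secondaries
        quorum = len(replicas) // 2 + 1
        acks = sum(1 for replica in replicas if replica.apply(key, value))
        return acks >= quorum

    def read(self, key):
        # Reads are served by the primary in this sketch.
        return self.primary.data.get(key)


primary = Replica("P")
secondaries = [Replica(f"S{i}") for i in range(1, 5)]
replica_set = ReplicaSet(primary, secondaries)

ok_all_up = replica_set.write("k", 1)        # 5 of 5 acks
secondaries[0].up = secondaries[1].up = False
ok_two_down = replica_set.write("k", 2)      # 3 of 5 acks, still a quorum
secondaries[2].up = False
ok_three_down = replica_set.write("k", 3)    # 2 of 5 acks, no quorum
```

With one primary and four secondaries, the set tolerates two failed replicas before writes stop being acknowledged.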
Handling Machine Failures
[Diagram: App Type Packages App1 and App2 on Service Fabric Cluster VMs, with replicas marked P, S and B; two machines are marked Failed]
Must be safe in the presence of cascading failures
Monitor Health
http://bit.ly/sf-lab-5
[Diagram: health entity hierarchy: Cluster; Nodes; Applications; Services; Partitions; Instances/Replicas; Deployed Applications; Deployed Service Packages]
<FabricSettings>
  <Section Name="HealthManager/ClusterHealthPolicy">
    <Parameter Name="MaxPercentUnhealthyApplications" Value="0"/>
    <Parameter Name="MaxPercentUnhealthyNodes" Value="20"/>
  </Section>
</FabricSettings>
<Policies>
  <HealthPolicy MaxPercentUnhealthyDeployedApplications="20">
    <DefaultServiceTypeHealthPolicy
        MaxPercentUnhealthyServices="0"
        MaxPercentUnhealthyPartitionsPerService="10"
        MaxPercentUnhealthyReplicasPerPartition="0"/>
    <ServiceTypeHealthPolicy ServiceTypeName="FrontEndSvcType"
        MaxPercentUnhealthyServices="0"
        MaxPercentUnhealthyPartitionsPerService="20"
        MaxPercentUnhealthyReplicasPerPartition="0"/>
  </HealthPolicy>
</Policies>
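To make the `MaxPercentUnhealthy...` thresholds concrete, here is a small Python sketch of how a parent's health could be derived from its children. It is an illustration under stated assumptions, not the health manager's code: the function names are hypothetical, and the rule that a tolerated but non-zero number of unhealthy children yields Warning is modelled in simplified form.

```python
def exceeds(unhealthy: int, total: int, max_percent: int) -> bool:
    """True when the unhealthy share is strictly above the tolerated percentage."""
    if total == 0:
        return False
    return unhealthy * 100 > max_percent * total


def aggregate(child_states, max_percent_unhealthy):
    """Roll child health states ("Ok", "Warning", "Error") up to the parent."""
    unhealthy = sum(1 for state in child_states if state == "Error")
    if exceeds(unhealthy, len(child_states), max_percent_unhealthy):
        return "Error"
    if unhealthy or "Warning" in child_states:
        return "Warning"
    return "Ok"


# MaxPercentUnhealthyNodes = 20 on a five-node cluster.
state_one_down = aggregate(["Ok", "Ok", "Ok", "Ok", "Error"], 20)
state_two_down = aggregate(["Ok", "Ok", "Ok", "Error", "Error"], 20)

# MaxPercentUnhealthyReplicasPerPartition = 0: any failed replica is an error.
replicas_state = aggregate(["Ok", "Ok", "Error"], 0)
```

With a 20% tolerance, one unhealthy node out of five stays within policy, while two out of five push the cluster into Error.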
Health Policies & Timeouts
• Health Policies
  MaxPercentUnhealthyServices, MaxPercentUnhealthyDeployedApplications, ConsiderWarningAsError
• UpgradeTimeout
  If the entire upgrade hits this timeout, the upgrade fails.
• UpgradeDomainTimeout
  If upgrading a UD hits this timeout, the upgrade fails.
• HealthCheckWaitDuration
  After a UD is upgraded, wait for this time before checking the health of nodes in that UD.
• HealthCheckStableDuration
  Even if the last health check passed, keep checking the health for this duration to ensure the upgrade is stable. If stable, upgrade the next UD.
• UpgradeHealthCheckInterval
  Keep checking health periodically at this interval until HealthCheckStableDuration is hit.
• HealthCheckRetryTimeout
  Once this timeout is hit, stop checking health and fail the upgrade.
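How these timers interact for a single upgrade domain can be sketched with a simulated clock. The Python below is one possible reading of the descriptions above, not Service Fabric's actual algorithm: `monitor_upgrade_domain`, its parameter names and the `health_at` callback are hypothetical, and times are plain seconds since the UD finished upgrading.

```python
def monitor_upgrade_domain(health_at, wait, stable, interval, retry_timeout):
    """Decide whether to proceed to the next UD or fail the upgrade.

    health_at(t) returns True when the UD is healthy at simulated second t.
    """
    t = wait                      # HealthCheckWaitDuration: first check happens here
    healthy_since = None
    while True:
        if health_at(t):
            if healthy_since is None:
                healthy_since = t
            # HealthCheckStableDuration: must stay healthy this long.
            if t - healthy_since >= stable:
                return ("proceed", t)
        else:
            healthy_since = None
            # HealthCheckRetryTimeout: give up after retrying this long.
            if t - wait >= retry_timeout:
                return ("fail", t)
        t += interval             # UpgradeHealthCheckInterval


recovers = monitor_upgrade_domain(
    health_at=lambda t: t >= 25, wait=10, stable=20, interval=5, retry_timeout=60
)
never_healthy = monitor_upgrade_domain(
    health_at=lambda t: False, wait=10, stable=20, interval=5, retry_timeout=60
)
```

In the first run the UD becomes healthy at second 25 and must stay healthy for 20 more seconds before the upgrade moves on; in the second, the retry timeout expires and the upgrade fails.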
Submitting a Health Report
Mandatory Data | Description
Entity | Cluster, Node, App, Service, Partition, Replica, Deployed App, Deployed Service Pkg
SourceId | String that uniquely identifies the reporter
Property | Category (ex: “Storage” or “Connectivity”)
HealthState | Ok, Warning, Error

Optional Data | Default | Description
Description | “” | Human-readable info
TimeToLive | Infinite | # seconds before report is expired
RemoveWhenExpired | False | Useful if TTL != Infinite. If false, report’s entity is in Error; else report removed after expiration.
SequenceNumber | Auto-generated | Increasing integer. Use to replace old reports when reporting state transitions.

Property | Description
HealthInformation | The original health report
SourceUtcTimestamp | The time the health report was originally submitted
LastModifiedUtcTimestamp | The last time the report was modified
IsExpired | True if TTL expired and RemoveWhenExpired=false
LastOkTransitionAt, LastWarningTransitionAt, LastErrorTransitionAt | These give a history of the event’s health states. Ex: Alert if !Ok > 5 minutes
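The interplay of TimeToLive, RemoveWhenExpired and SequenceNumber is easier to see in a small model. This Python sketch is an assumption-laden illustration, not the health store itself: `HealthReport` and `HealthStore` are hypothetical classes, reports are keyed by (SourceId, Property) for a single entity, and time is a plain number of seconds.

```python
from dataclasses import dataclass


@dataclass
class HealthReport:
    source_id: str
    prop: str
    state: str                          # "Ok", "Warning" or "Error"
    sequence_number: int
    time_to_live: float = float("inf")  # seconds
    remove_when_expired: bool = False
    reported_at: float = 0.0


class HealthStore:
    """Hypothetical store for the reports of one entity."""

    def __init__(self):
        self._reports = {}  # (source_id, prop) -> HealthReport

    def submit(self, report):
        key = (report.source_id, report.prop)
        current = self._reports.get(key)
        if current is not None and report.sequence_number <= current.sequence_number:
            return False  # stale: an equal or newer report is already stored
        self._reports[key] = report
        return True

    def evaluate(self, now):
        states = []
        for key, report in list(self._reports.items()):
            expired = now - report.reported_at > report.time_to_live
            if expired and report.remove_when_expired:
                del self._reports[key]      # expired report simply disappears
            elif expired:
                states.append("Error")      # expired and kept: entity is in Error
            else:
                states.append(report.state)
        for level in ("Error", "Warning"):
            if level in states:
                return level
        return "Ok"


store = HealthStore()
first = store.submit(HealthReport("watchdog", "Storage", "Ok", 1, time_to_live=30))
stale = store.submit(HealthReport("watchdog", "Storage", "Error", 1))
store.submit(HealthReport("probe", "Connectivity", "Warning", 1,
                          time_to_live=5, remove_when_expired=True))

at_3 = store.evaluate(now=3)     # probe still reports Warning
at_10 = store.evaluate(now=10)   # probe expired and was removed
at_31 = store.evaluate(now=31)   # watchdog expired and was kept
```

A watchdog that stops reporting is therefore treated as a failure unless it explicitly asked for its report to be removed on expiry.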
Report
http://bit.ly/sf-lab-6
Real Customers
Real Workloads
Independent games studio specializing in massively multiplayer games
http://web.ageofascent.com/category/development/service-fabric/
Testability
Testability in Service Fabric
• Two main test scenarios provided out of the box
  • Chaos tests
  • Failover tests
• Tools
  • C# APIs (System.Fabric.Testability.dll)
  • PowerShell cmdlets (runtime required)
What do we get from this Testability
• Generates faults across the entire Service Fabric cluster
• Compresses faults generally seen in months or years to a few hours
• Combination of interleaved faults with the high fault rate finds corner cases that are otherwise missed
• Leads to a significant improvement in the code quality of the service
Testability Actions
Action | Description | Managed API | PowerShell Cmdlet | Graceful/Ungraceful Faults
CleanTestState | Removes all the test state from the cluster in case of a bad shutdown of the test driver. | CleanTestStateAsync | Remove-ServiceFabricTestState | Not Applicable
InvokeDataLoss | Induces data loss into a service partition. | InvokeDataLossAsync | Invoke-ServiceFabricPartitionDataLoss | Graceful
InvokeQuorumLoss | Puts a given stateful service partition into quorum loss. | InvokeQuorumLossAsync | Invoke-ServiceFabricQuorumLoss | Graceful
Move Primary | Moves the specified primary replica of a stateful service to the specified cluster node. | MovePrimaryAsync | Move-ServiceFabricPrimaryReplica | Graceful
Move Secondary | Moves the current secondary replica of a stateful service to a different cluster node. | MoveSecondaryAsync | Move-ServiceFabricSecondaryReplica | Graceful
RemoveReplica | Simulates a replica failure by removing a replica from a cluster. This will close the replica and will transition it to role 'None', removing all of its state from the cluster. | RemoveReplicaAsync | Remove-ServiceFabricReplica | Graceful
RestartDeployedCodePackage | Simulates a code package process failure by restarting a code package deployed on a node in a cluster. This aborts the code package process, which will restart all the user service replicas hosted in that process. | RestartDeployedCodePackageAsync | Restart-ServiceFabricDeployedCodePackage | Ungraceful
RestartNode | Simulates a Service Fabric cluster node failure by restarting a node. | RestartNodeAsync | Restart-ServiceFabricNode | Ungraceful
RestartPartition | Simulates a data center blackout or cluster blackout scenario by restarting some or all replicas of a partition. | RestartPartitionAsync | Restart-ServiceFabricPartition | Graceful
RestartReplica | Simulates a replica failure by restarting a persisted replica in a cluster, closing the replica and then reopening it. | RestartReplicaAsync | Restart-ServiceFabricReplica | Graceful
StartNode | Starts a node in a cluster which is already stopped. | StartNodeAsync | Start-ServiceFabricNode | Not Applicable
StopNode | Simulates a node failure by stopping a node in a cluster. The node will stay down until StartNode is called. | StopNodeAsync | Stop-ServiceFabricNode | Ungraceful
ValidateApplication | Validates the availability and health of all Service Fabric services within an application, usually after inducing some fault into the system. | ValidateApplicationAsync | Test-ServiceFabricApplication | Not Applicable
ValidateService | Validates the availability and health of a Service Fabric service, usually after inducing some fault into the system. | ValidateServiceAsync | Test-ServiceFabricService | Not Applicable
Testability
• Stateless:
  • Stop node (ungraceful)
  • Start node (N/A)
  • Restart node (ungraceful)
  • Validate application (N/A)
  • Validate service (N/A)
  • RestartDeployedCodePackage (ungraceful)
  • Restart partition (graceful)
  • Restart replica (graceful)
  • CleanTestState (N/A)
  • Failover/chaos tests
• Stateful:
  • Move primary replica (graceful)
  • Move secondary replica (graceful)
  • Remove Replica (graceful)
  • InvokeQuorumLoss (graceful)
  • InvokeDataLoss (graceful)
http://bit.ly/sf-lab-7
Simulate
http://bit.ly/sf-lab-8
Upgrading a Named Application
Updating Your App’s Service’s Code
1. Put new code in code package
2. Update ver strings (#s are not required)
3. Copy new app package to image store
4. Register new app type/version
5. Select named app(s) to upgrade to new version
<ServiceManifest Name="WebServer" Version="2.0">
  <ServiceTypes>
    <StatelessServiceType ServiceTypeName="WebServer" ...>
      <Extensions> ... </Extensions>
    </StatelessServiceType>
  </ServiceTypes>
  <CodePackage Name="CodePkg" Version="1.1">
    <EntryPoint> ... </EntryPoint>
  </CodePackage>
  <Resources><Endpoints> ... </Endpoints></Resources>
</ServiceManifest>
<ApplicationManifest ApplicationTypeName="DemoAppType"
    ApplicationTypeVersion="3.0" ...>
  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="WebServer"
        ServiceManifestVersion="2.0" .../>
  </ServiceManifestImport>
</ApplicationManifest>
[Diagram labels: A, B1, C, B2]
Upgrade Domains
• Prevent complete service outage while upgrading
• More UDs → less loss of scale but more time to upgrade
• # of UDs is set when the cluster is created, via cluster manifest or ARM template
• Default = 5; 20% down at a time
• IMPORTANT: 2 versions of your code run side-by-side simultaneously
  • Beware of data/schema/protocol changes; use 2-phase upgrade
• Below shows 9 nodes spread across 5 UDs
[Diagram: Node-1 to Node-9 spread across UD #1 to UD #5]
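A rolling upgrade over upgrade domains can be sketched in a few lines of Python. This is a simplified model rather than Service Fabric's upgrade orchestration: the round-robin assignment of nodes to UDs, the function names and the `is_healthy` callback are assumptions, and rollback is modelled as restoring every node's previous version.

```python
def assign_upgrade_domains(nodes, ud_count):
    """Spread nodes over UDs round-robin (the real layout is an assumption)."""
    domains = [[] for _ in range(ud_count)]
    for index, node in enumerate(nodes):
        domains[index % ud_count].append(node)
    return domains


def rolling_upgrade(domains, versions, new_version, is_healthy):
    """Upgrade one UD at a time; roll everything back if a health check fails."""
    previous = dict(versions)
    for ud_index, domain in enumerate(domains):
        for node in domain:
            versions[node] = new_version   # only this UD is down at this point
        if not all(is_healthy(node) for node in domain):
            versions.clear()
            versions.update(previous)
            return f"rolled back at UD #{ud_index + 1}"
    return "completed"


nodes = [f"Node-{i}" for i in range(1, 10)]   # 9 nodes
domains = assign_upgrade_domains(nodes, 5)    # 5 UDs
largest_share_down = max(len(d) for d in domains) / len(nodes)

versions = {node: "1.0" for node in nodes}
outcome = rolling_upgrade(domains, versions, "2.0", lambda node: True)

versions_bad = {node: "1.0" for node in nodes}
outcome_bad = rolling_upgrade(
    domains, versions_bad, "2.0", lambda node: node != "Node-3"
)
```

Between the first and last UD, nodes on the old and new versions coexist, which is why data, schema and protocol changes need care.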
Fault Domains
• Isolate cluster from a single point of hardware failure (fault)
• Determined by hardware topology (datacenter, rack, blade)
fd:/DC1/R1/B1
fd:/DC1/R1/B2
fd:/DC1/R1/B3
fd:/DC1/R2/B1
fd:/DC1/R2/B2
fd:/DC1/R2/B3
fd:/DC2/R1/B1
fd:/DC2/R1/B2
fd:/DC2/R1/B3
fd:/DC2/R2/B1
fd:/DC2/R2/B2
fd:/DC2/R2/B3
…
[Diagram: topology tree]
DC1: R1 (B1, B2, B3), R2 (B1, B2, B3)
DC2: R1 (B1, B2, B3), R2 (B1, B2, B3)
DC3: R1 (B1, B2, B3), R2 (B1, B2, B3)
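The hierarchical `fd:/` paths make it possible to reason about how much hardware two nodes share. The Python sketch below shows one greedy way to spread replicas so they share as little of the path as possible; it is an illustration only, not Service Fabric's placement algorithm, and both function names are hypothetical.

```python
def shared_depth(a: str, b: str) -> int:
    """Number of leading path segments two fault domain paths have in common."""
    depth = 0
    for left, right in zip(a.split("/"), b.split("/")):
        if left != right:
            break
        depth += 1
    return depth


def spread_replicas(fault_domains, count):
    """Greedily pick domains that overlap least with those already chosen."""
    chosen = []
    candidates = sorted(fault_domains)
    while candidates and len(chosen) < count:
        best = min(
            candidates,
            key=lambda fd: (
                max((shared_depth(fd, c) for c in chosen), default=0),
                fd,  # tie-break on name so the result is deterministic
            ),
        )
        chosen.append(best)
        candidates.remove(best)
    return chosen


fault_domains = [
    f"fd:/DC{dc}/R{rack}/B{blade}"
    for dc in (1, 2)
    for rack in (1, 2)
    for blade in (1, 2, 3)
]
placement = spread_replicas(fault_domains, 3)
```

With two datacenters, the first two replicas land in different datacenters and the third goes to a rack that no earlier replica uses.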
Start-ServiceFabricApplicationUpgrade
Parameter | Default | Description
ApplicationName | N/A | Application instance name
TargetApplicationTypeVersion | N/A | The version string you want to upgrade to
FailureAction | N/A | Rollback (to last version) or Manual (stop upgrade & switch to manual)
UpgradeDomainTimeoutSec | Infinite | If any UD takes more than this time, FailureAction is applied
UpgradeTimeout | Infinite | If all UDs take more than this time, FailureAction is applied
HealthCheckWaitDurationSec | 0 | After a UD, SF waits this long before initiating the health check
UpgradeHealthCheckInterval | 60 | If a health check fails, SF waits this long before checking again (set in cluster manifest; not PowerShell)
HealthCheckRetryTimeoutSec | 600 | Maximum time SF waits for the app to be healthy
HealthCheckStableDurationSec | 0 | How long the app must be healthy before upgrading the next UD
Optional Health Criteria Policies
Parameter | Default | Description
ConsiderWarningAsError | False | Warning health events are considered errors, stopping the upgrade
MaxPercentUnhealthyDeployedApplications | 0 | Max deployed applications unhealthy before app is declared unhealthy
MaxPercentUnhealthyServices | 0 | Max service instances unhealthy before app is declared unhealthy
MaxPercentUnhealthyPartitionsPerService | 0 | Max partitions unhealthy before service instance is declared unhealthy
MaxPercentUnhealthyReplicasPerPartition | 0 | Max partition replicas unhealthy before partition is declared unhealthy
UpgradeReplicaSetCheckTimeout | Infinite; 900 (rollback) | Stateless: how long SF waits for target instances before next UD. Stateful: how long SF waits for quorum before next UD
ForceRestart | False | Forces service restart when updating config/data
Managing Named Application Upgrades
• Get progress via Get-ServiceFabricApplicationUpgrade
• Most problems are timing related
  • Instances/replicas not going down quickly
  • UDs not coming up in time
  • Failing health checks
• If FailureAction is “Manual”, you can:
  Action | PowerShell Command
  Rollback | Start-ServiceFabricApplicationRollback
  Start next UD | Resume-ServiceFabricApplicationUpgrade
  Resume monitored upgrade | Update-ServiceFabricApplicationUpgrade
• Optional: After all named apps upgrade, unregister old app type
[Diagram: six Fabric Nodes, each on a Windows OS instance, hosting instances of App A v1, App B v2 and App C v1. The App Repository holds App A v1, App C v1, App B v2 and App C v2; App C v2 instances appear on the nodes as the upgrade rolls out.]
Perform
http://bit.ly/sf-lab-9
Clone repository in VS
https://github.com/Azure-Samples/service-fabric-dotnet-getting-started.git
StatefulVisualObjectActor.cs is now VisualObjectActor.cs
Updates Since //Build 2015
• Now Globally Available
• Create Clusters via ARM & Portal
• Hosted Clusters in Azure
• Many Performance, Density, & Scale Improvements
• Many API Improvements
• New Previews
  • Linux Support
  • Java Support
  • Docker & Windows Containers
  • On Premises Clusters
http://aka.ms/ServiceFabricSDK
http://aka.ms/ServiceFabricWS2012R2
http://aka.ms/ServiceFabricSamples
http://aka.ms/SFlinuxpreview
http://aka.ms/ServiceFabricForum
• Learn from the tutorials and videos
• http://aka.ms/ServiceFabricDocs
More Related Content

PPTX
Azure service fabric: a gentle introduction
Alessandro Melchiori
 
PPTX
Service Fabric – building tomorrows applications today
BizTalk360
 
PDF
Serverless Stream Processing with Bill Bejeck
confluent
 
PDF
Global Azure Bootcamp 2017 - Why I love S2D for MSSQL on Azure
Karim Vaes
 
PPTX
Azure Service Fabric and the Actor Model: when did we forget Object Orientation?
João Pedro Martins
 
PPTX
App fabric hybrid computing
Hammad Rajjoub
 
PPTX
Azure Automation and Update Management
Udaiappa Ramachandran
 
PDF
Experiences using CouchDB inside Microsoft's Azure team
Brian Benz
 
Azure service fabric: a gentle introduction
Alessandro Melchiori
 
Service Fabric – building tomorrows applications today
BizTalk360
 
Serverless Stream Processing with Bill Bejeck
confluent
 
Global Azure Bootcamp 2017 - Why I love S2D for MSSQL on Azure
Karim Vaes
 
Azure Service Fabric and the Actor Model: when did we forget Object Orientation?
João Pedro Martins
 
App fabric hybrid computing
Hammad Rajjoub
 
Azure Automation and Update Management
Udaiappa Ramachandran
 
Experiences using CouchDB inside Microsoft's Azure team
Brian Benz
 

What's hot (20)

PPTX
App fabric introduction
Dennis van der Stelt
 
PDF
Going serverless with azure functions
gjuljo
 
PDF
Microservices with Java, Spring Boot and Spring Cloud
Eberhard Wolff
 
PPTX
Presentation Tier optimizations
Anup Hariharan Nair
 
PDF
Develop enterprise-ready applications for Microsoft Teams
Markus Moeller
 
PPTX
Azure Logic Apps
Azure Riyadh User Group
 
PPTX
Innovations of .NET and Azure (Recaps of Build 2017 selected sessions)
Jeff Chu
 
PPTX
Azure in Developer Perspective
rizaon
 
PPTX
Lets talk about: Azure Kubernetes Service (AKS)
Pedro Sousa
 
DOC
Praveen Kumar Resume
praveen Kothuri.Praveen
 
PPTX
Deep dive into service fabric after 2 years
Tomasz Kopacz
 
PPTX
Automating Your Microsoft Azure Environment (DevLink 2014)
Michael Collier
 
PPTX
Devteach 2016: A practical overview of actors in service fabric
Brisebois
 
PDF
Spring Cloud Netflix OSS
Steve Hall
 
PDF
Network security with Azure PaaS services by Erwin Staal from 4DotNet at Azur...
DevClub_lv
 
PPTX
Using Windows Azure for Solving Identity Management Challenges
Michael Collier
 
PPTX
.NET microservices with Azure Service Fabric
Davide Benvegnù
 
PPTX
2011.05.31 super mondays-servicebus-demo
daveingham
 
PDF
Using Azure Managed Identities for your App Services by Jan de Vries from 4Do...
DevClub_lv
 
PPTX
Container on azure
Vishwas N
 
App fabric introduction
Dennis van der Stelt
 
Going serverless with azure functions
gjuljo
 
Microservices with Java, Spring Boot and Spring Cloud
Eberhard Wolff
 
Presentation Tier optimizations
Anup Hariharan Nair
 
Develop enterprise-ready applications for Microsoft Teams
Markus Moeller
 
Azure Logic Apps
Azure Riyadh User Group
 
Innovations of .NET and Azure (Recaps of Build 2017 selected sessions)
Jeff Chu
 
Azure in Developer Perspective
rizaon
 
Lets talk about: Azure Kubernetes Service (AKS)
Pedro Sousa
 
Praveen Kumar Resume
praveen Kothuri.Praveen
 
Deep dive into service fabric after 2 years
Tomasz Kopacz
 
Automating Your Microsoft Azure Environment (DevLink 2014)
Michael Collier
 
Devteach 2016: A practical overview of actors in service fabric
Brisebois
 
Spring Cloud Netflix OSS
Steve Hall
 
Network security with Azure PaaS services by Erwin Staal from 4DotNet at Azur...
DevClub_lv
 
Using Windows Azure for Solving Identity Management Challenges
Michael Collier
 
.NET microservices with Azure Service Fabric
Davide Benvegnù
 
2011.05.31 super mondays-servicebus-demo
daveingham
 
Using Azure Managed Identities for your App Services by Jan de Vries from 4Do...
DevClub_lv
 
Container on azure
Vishwas N
 
Ad

Stephane Lapointe, Frank Boucher & Alexandre Brisebois: Les micro-services et Azure Service Fabric (Global Azure Bootcamp 2016)

  • 1. Stephane Lapointe, Orckestra - stephane@stephanelapointe.net / @s_lapointe
       Guy Barrette, freelance Architect/Developer - guy@guybarrette.com / @GuyBarrette
       Francois Boucher, Lixar IT - fboucher@frankysnotes.com / @fboucheros
       Alexandre Brisebois, Microsoft - alexandre.brisebois@microsoft.com / @brisebois
  • 3. Today we’re going to learn about how Microservices enable development and management flexibility Service Fabric is the platform for building applications with a microservices design approach Service Fabric is battle tested and provides a rich platform for both development and management of services at scale
  • 5. Azure Momentum
      • 1 trillion messages delivered every month with Event Hubs
      • 100,000 new Azure customer subscriptions per month
      • 20 million SQL database hours used every day
      • >5 trillion storage transactions every month
      • 60 billion hits to websites run on Azure Web App Service
      • 425 million Azure Active Directory users
      • 57% of Fortune 500 companies use Microsoft Azure
      • >50 trillion storage objects in Azure
      • 1.4 million SQL databases deployed in Azure
      • "Microsoft is growing its cloud revenue faster than Amazon" - Business Insider 2016. AWS revenue grew about 69% but Microsoft Azure revenue grew by 127%.
  • 6. What do these have in common?
  • 8. Monolithic application approach
      • A monolith app contains domain-specific functionality and is normally divided by functional layers such as web, business and data.
      • Scales by cloning the app on multiple servers/VMs/containers.
     Microservices application approach
      • A microservice application separates functionality into separate smaller services.
      • Scales out by deploying each service independently, creating instances of these services across servers/VMs/containers.
  • 9. State in Monolithic approach
      • Single monolithic database
      • Tiers of specific technologies
     State in Microservices approach
      • Graph of interconnected microservices
      • State typically scoped to the microservice
      • Variety of technologies used
      • Remote storage for cold data
     [Diagram labels: stateless services with separate stores; stateful services; stateless presentation services; stateless services]
  • 10. [Diagram: four numbered steps (Plan, Develop + Test, Release, Monitor + Learn) spanning Development and Production]
  • 13. A Microservice Platform: Public Cloud, Other Clouds, On Premises Private cloud
  • 14. What Is Azure Service Fabric?
  • 15. Service Fabric is
      • Next generation of PaaS on Azure
          • Elastic scale, OS updates, SF updates
      • Microservices platform for Windows and Linux
          • DevOps, rolling upgrades, etc.
      • Polycloud including on-premises
      • Programming models
          • Stateless Win32 apps written in any language (some features not supported)
          • Reliable Services: stateless & stateful (for hot data; gives low-latency reads)
          • OWIN/ASP.NET Core*
      • Service Fabric is free of charge
          • SDK: http://aka.ms/ServiceFabricSDK
  • 16. Cloud Services vs Service Fabric
     Azure Cloud Services (Web & Worker Roles)
      • 1 role instance per VM
      • Uneven utilization
      • Low density
      • Slow deployment & upgrade (bound to VM)
      • Slow scaling and failure recovery
      • Limited fault tolerance
     Azure Service Fabric (Services)
      • Many microservices per VM
      • Even utilization (by default, customizable)
      • High density (customizable)
      • Fast deployment & upgrade
      • Fast scaling of independent microservices
      • Tunable fast fault tolerance
  • 17. Microsoft Azure Service Fabric: a platform for reliable, hyperscale, microservice-based applications
      • Runs on: Azure (Windows Server, Linux), Hosted Clouds (Windows Server, Linux), Private Clouds (Windows Server, Linux)
      • Capabilities: High Availability, Hyper-Scale, Hybrid Operations, High Density microservices, Rolling Upgrades, Stateful services, Low Latency, Fast startup & shutdown, Container Orchestration & lifecycle management, Replication & Failover, Simple programming models, Load balancing, Self-healing, Data Partitioning, Automated Rollback, Health Monitoring, Placement Constraints
  • 18. Service Fabric Subsystems Service discovery Reliability, Availability, Replication, Service Orchestration Application lifecycle Fault Inject, Test in production Federates a set of nodes to form a consistent scalable fabric Secure point-to-point communication Deployment, Upgrade and Monitoring microservices
  • 19. Cluster
      • Set of OS instances (real or virtual) stitched together to form a pool of resources
      • Cluster can scale to 1000s of machines, is self-repairing, and scales up or down
      • Acts as environment-independent abstraction layer
     [Diagram: six Windows OS machines, each running a Fabric Node]
  • 20. Datacenter (Azure, On Premises, Other Clouds ) Load Balancer PC/VM #1 Service Fabric Your code, etc. PC/VM #2 Service Fabric Your code, etc. PC/VM #3 Service Fabric Your code, etc. PC/VM #4 Service Fabric Your code, etc. PC/VM #5 Service Fabric Your code, etc. Management to deploy your code, etc. (Port: 19080) App Web Request (Port: 80/443/?)
  • 21. Service Fabric's Infrastructure Services
      • Cluster Manager (ports 19080 [REST] & 19000 [TCP]): performs cluster REST & PowerShell/FabricClient operations
      • Failover Manager: rebalances resources as nodes come/go
      • Naming: maps service instances to endpoints
      • Image store (not on OneBox): contains your Application packages
      • Upgrade Service (Azure only): coordinates upgrading SF itself with Azure's SFRP
     [Diagram: Node #1 to Node #5, each hosting replicas of these services, labelled C, F, N, I and U]
  • 22. Microservices with Azure Service Fabric
  • 23. App1 App2 Service Fabric Microservices App Type Packages Service Fabric Cluster VMs
  • 24. Guest Executables • Bring any exe • Any language • Any programming model • Packaged as Application • Gets versioning, upgrade, monitoring, health, etc. Reliable Services • Stateless & stateful services • Concurrent, granular state changes • Use of the Reliable Collections • Transactions across collections • Full platform integration Reliable Actors • Stateless & stateful actor objects • Simplified programming model • Single Threaded model • Great for scaled out compute and state
  • 25. Programming models: Reliable Services
      • Reliable collections make it easy to build stateful services
      • An evolution of .NET collections, for the cloud
      • ReliableDictionary<T1,T2> and ReliableQueue<T>
     Collections: single machine, single-threaded
     Concurrent Collections: single machine, multi-threaded
     Reliable Collections: multi-machine, replicated (HA), persistence (durable), asynchronous, transactional
  • 27. Everything happens or nothing happens!

     protected override async Task RunAsync(CancellationToken cancellationToken)
     {
         // Reliable collections are looked up (or created) by name.
         var requestQueue = await this.StateManager.GetOrAddAsync<IReliableQueue<CustomerRecord>>("requests");
         var locationDictionary = await this.StateManager.GetOrAddAsync<IReliableDictionary<Guid, LocationInfo>>("locs");
         var personDictionary = await this.StateManager.GetOrAddAsync<IReliableDictionary<Guid, Person>>("ppl");
         var customerListDictionary = await this.StateManager.GetOrAddAsync<IReliableDictionary<Guid, object>>("customers");

         while (true)
         {
             cancellationToken.ThrowIfCancellationRequested();
             Guid customerId = Guid.NewGuid();

             // The dequeue and the three inserts share one transaction.
             using (var tx = this.StateManager.CreateTransaction())
             {
                 var customerRequestResult = await requestQueue.TryDequeueAsync(tx);

                 // The queue may be empty; only write when a request was dequeued.
                 if (customerRequestResult.HasValue)
                 {
                     await customerListDictionary.AddAsync(tx, customerId, new object());
                     await personDictionary.AddAsync(tx, customerId, customerRequestResult.Value.person);
                     await locationDictionary.AddAsync(tx, customerId, customerRequestResult.Value.locInfo);
                     await tx.CommitAsync();
                 }
             }
         }
     }
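The "everything happens or nothing happens" guarantee on slide 27 can be illustrated without the Service Fabric SDK. The following is a minimal in-memory sketch in Python; the `Transaction` class and its methods are hypothetical names invented for illustration and are not the Reliable Collections API. It buffers writes until commit and puts dequeued items back if the block ends without a commit.

```python
from collections import deque


class Transaction:
    """Hypothetical model: writes are buffered and applied together on commit."""

    def __init__(self):
        self._pending = []   # deferred dictionary writes
        self._undo = []      # compensations for dequeues already performed

    def dequeue(self, queue):
        if not queue:
            return None
        item = queue.popleft()
        # If the transaction is abandoned, the item returns to the front.
        self._undo.append(lambda: queue.appendleft(item))
        return item

    def add(self, dictionary, key, value):
        if key in dictionary:
            raise KeyError(f"duplicate key: {key}")
        self._pending.append((dictionary, key, value))

    def commit(self):
        for dictionary, key, value in self._pending:
            dictionary[key] = value
        self._pending.clear()
        self._undo.clear()

    def rollback(self):
        for undo in reversed(self._undo):
            undo()
        self._pending.clear()
        self._undo.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Anything not committed by the end of the block is rolled back.
        self.rollback()
        return False
```

If any `add` raises inside a `with Transaction() as tx:` block, no dictionary is modified and the dequeued request is still in the queue afterwards.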
  • 28. Programming models: Reliable Actors • Independent units of compute and state • Large number of them executing in parallel • Communicates using asynchronous messaging • Single threaded execution • Automatically created and dehydrated as necessary
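The single-threaded execution model on slide 28 can be sketched with a mailbox that is drained one message at a time. This is an illustrative Python model only; `CounterActor` and its methods are hypothetical and do not mirror the Reliable Actors API. Because a single worker owns the state, many concurrent callers cannot interleave their updates.

```python
import asyncio


class CounterActor:
    """Hypothetical actor: private state plus a mailbox processed one turn at a time."""

    def __init__(self):
        self._count = 0
        self._mailbox = None
        self._worker = None

    async def _run(self):
        while True:
            amount, reply = await self._mailbox.get()
            # Only this loop touches _count, so no lock is needed.
            current = self._count
            await asyncio.sleep(0)   # yields, yet no other turn can start meanwhile
            self._count = current + amount
            reply.set_result(self._count)

    async def add(self, amount):
        if self._worker is None:     # the actor is "activated" on first use
            self._mailbox = asyncio.Queue()
            self._worker = asyncio.create_task(self._run())
        reply = asyncio.get_running_loop().create_future()
        await self._mailbox.put((amount, reply))
        return await reply

    async def deactivate(self):
        if self._worker is not None:
            self._worker.cancel()
            try:
                await self._worker
            except asyncio.CancelledError:
                pass
            self._worker = None


async def main():
    actor = CounterActor()
    results = await asyncio.gather(*(actor.add(1) for _ in range(100)))
    await actor.deactivate()
    return max(results)
```

Running `asyncio.run(main())` issues 100 concurrent increments; since each one is handled as its own turn, none is lost.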
  • 29. Comparing Reliable Actors & Reliable Services
     Reliable Actors APIs
      • Your problem space involves many small independent units of state and logic
      • You want to work with single-threaded objects while still being able to scale and maintain consistency
      • You want the framework to manage the concurrency and granularity of state
      • You want the platform to manage communication for you
     Reliable Services APIs
      • You need to maintain logic across multiple components
      • You want to use reliable collections (like .NET Dictionary and Queue) to store and manage your state
      • You want to control the granularity and concurrency of your state
      • You want to manage the communication and control the partitioning scheme for your service
  • 31. Application Packaging & Deployment
  • 32. <ServiceManifest Name="QueueService" Version="1.0">
         <ServiceTypes>
           <StatefulServiceType ServiceTypeName="QueueServiceType" HasPersistedState="true" />
         </ServiceTypes>
         <CodePackage Name="Code" Version="1.0">
           <EntryPoint>
             <ExeHost>
               <Program>ServiceHost.exe</Program>
             </ExeHost>
           </EntryPoint>
         </CodePackage>
         <ConfigPackage Name="Config" Version="1.0" />
         <DataPackage Name="Data" Version="1.0" />
       </ServiceManifest>
  • 34. [Diagram: a cluster hosting two named applications of the eStore App Type, "Fabrikam" and "Contoso". Each eStore App runs a Gallery Svc "G" (Gallery Svc Type) and a Payment Svc "P" (Payment Svc Type).]
  • 35. <ApplicationManifest ApplicationTypeName="eStoreAppType" ApplicationTypeVersion="1.0" ...>
         <ServiceManifestImport>
           <ServiceManifestRef ServiceManifestName="GalleryServicePkg" ServiceManifestVersion="1.0" ... />
           <ServiceManifestRef ServiceManifestName="PaymentServicePkg" ServiceManifestVersion="1.0" ... />
           ...
         </ServiceManifestImport>
       </ApplicationManifest>

       C:\eStoreAppTypePkg
       │   ApplicationManifest.xml
       │
       ├───GalleryServicePkg
       │   │   ServiceManifest.xml
       │   │
       │   └───CodePkg
       │           Gallery.exe
       │           GalleryLib.dll
       │           Setup.bat
       │
       └───PaymentServicePkg
           │   ServiceManifest.xml
           │
           └───CodePkg
                   Payment.exe
  • 36. <ServiceManifest Name="GalleryServicePkg" Version="1.0">
         <ServiceTypes>
           <StatelessServiceType ServiceTypeName="GalleryServiceType" ... >
           </StatelessServiceType>
         </ServiceTypes>
         <CodePackage Name="CodePkg" Version="1.0">
           <EntryPoint>
             <ExeHost>
               <Program>Gallery.exe</Program>
             </ExeHost>
           </EntryPoint>
         </CodePackage>
         <Resources>
           <Endpoints>
             <Endpoint Name="GalleryEndpoint" Type="Input" Protocol="http" Port="8080" />
           </Endpoints>
         </Resources>
       </ServiceManifest>

       C:\eStoreAppTypePkg
       │   ApplicationManifest.xml
       │
       ├───GalleryServicePkg
       │   │   ServiceManifest.xml
       │   │
       │   └───CodePkg
       │           Gallery.exe
       │           GalleryLib.dll
       │
       └───PaymentServicePkg
           │   ServiceManifest.xml
           │
           └───CodePkg
                   Payment.exe
  • 37. Cluster: management, billing (VMs), geolocation, multitenancy
      • 1+ Named Applications: isolation, multitenancy, unit of versioning/config
          • 1+ Named Services: code package(s), multitenancy (w/o isolation)
              • Stateless: 1 partition (no value), 1+ instances (scale, availability)
              • Stateful: 1+ partitions (addressability, scale), 1+ replicas (availability)
     Notes:
      • You can dynamically start/remove named apps/services and instances; not partitions.
      • The # of instances is set per named service; all partitions have the same # of instances.
  • 38. App Type | App Version | App Name
       "A"      | 1.0         | fabric:/A1

       App Name   | Service Type | Service Name  | # Partitions | # Instances
       fabric:/A1 | "S"          | fabric:/A1/S1 | 1            | 3
       fabric:/A1 | "S"          | fabric:/A1/S2 | 2            | 2

       Placements spread across Node #1 to Node #5: f:/A1/S1, P1, I1; f:/A1/S2, P1, I1; f:/A1/S1, P1, I2; f:/A1/S1, P1, I3; f:/A1/S2, P1, I2; f:/A1/S2, P2, I2; f:/A1/S2, P2, I1
       NOTE: When using SF programming models, instances from the same named app/service are in the same process.
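The table on slide 38 implies a fixed number of placements: every partition of a named service gets the same number of instances. A short Python sketch (the `NamedService` type and `placements` helper are hypothetical, introduced only for this illustration) enumerates them and reproduces the seven entries the slide shows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamedService:
    name: str         # e.g. "fabric:/A1/S1"
    partitions: int
    instances: int    # per partition; identical for every partition


def placements(service):
    """List every (service, partition, instance) triple the cluster must host."""
    return [
        (service.name, f"P{p}", f"I{i}")
        for p in range(1, service.partitions + 1)
        for i in range(1, service.instances + 1)
    ]


services = [
    NamedService("fabric:/A1/S1", partitions=1, instances=3),
    NamedService("fabric:/A1/S2", partitions=2, instances=2),
]
all_placements = [p for s in services for p in placements(s)]
```

S1 contributes 1 x 3 placements and S2 contributes 2 x 2, for seven in total. Which node hosts each one is decided by the platform and is not modelled here.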
  • 39. “fabric:/Contoso” Named App “fabric:/Contoso/Payment” Named Svc (Stateful) “fabric:/Contoso/Gallery” Named Svc (Stateless) Partition-1 Partition-2 Replica-1 Replica-2 Replica-3 Replica-1 Replica-2 Replica-3 Partition-1 Instance-1 Instance-2 Replica-4
  • 41. PowerShell App Pkg & Named App/Service Ops
      • Copy-ServiceFabricApplicationPackage (to image store)
      • Register-ServiceFabricApplicationType (in image store)
      • Remove-ServiceFabricApplicationPackage (from image store)
      • New-ServiceFabricApplication (named app)
      • New-ServiceFabricService (named svc)
      • Remove-ServiceFabricService (named svc)
      • Remove-ServiceFabricApplication (named app & its named svcs)
      • Unregister-ServiceFabricApplicationType (from image store)
          • No named app can be running
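The ordering constraints in the cmdlet list on slide 41 can be captured in a toy state model. The Python sketch below is an assumption-laden illustration: `ImageStoreModel` and its methods are hypothetical stand-ins, not wrappers around the real cmdlets, and they encode only the rules the slide states (register after copy, create after register, unregister only when no named app uses the type).

```python
class ImageStoreModel:
    """Toy model of the package / type / named-app lifecycle (not the real cmdlets)."""

    def __init__(self):
        self.packages = set()   # copied application packages
        self.types = set()      # registered (type, version) pairs
        self.apps = {}          # named app -> (type, version)

    def copy_package(self, package):
        self.packages.add(package)

    def register_type(self, package, app_type, version):
        if package not in self.packages:
            raise RuntimeError("package must be copied to the image store first")
        self.types.add((app_type, version))

    def remove_package(self, package):
        self.packages.discard(package)

    def new_application(self, name, app_type, version):
        if (app_type, version) not in self.types:
            raise RuntimeError("application type is not registered")
        self.apps[name] = (app_type, version)

    def remove_application(self, name):
        self.apps.pop(name)

    def unregister_type(self, app_type, version):
        if (app_type, version) in self.apps.values():
            raise RuntimeError("a named app still uses this type")
        self.types.discard((app_type, version))
```

Calling `unregister_type` while `fabric:/Contoso` still exists raises; removing the named app first lets it succeed, matching the "No named app can be running" note.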
  • 42. http://bit.ly/sf-lab-3 Add a web front-end to your application http://bit.ly/sf-lab-4
  • 43. Running Microservices at Scale!
  • 45. • Services can be partitioned for scale-out.
      • You can choose your own partitioning scheme.
      • Service partitions are striped across machines in the cluster.
      • Replicas automatically scale out & in on cluster changes.
     [Diagram: primary replicas P1 to P4 and their secondaries (S) spread across Node 1 to Node 6]
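One common partitioning scheme splits a 64-bit integer key space into equal ranges. The Python sketch below shows the arithmetic; the SHA-256 based hash and both function names are assumptions made for illustration, and a real service is free to map keys to partitions differently.

```python
import hashlib

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def key_to_int64(key: str) -> int:
    """Stable signed 64-bit hash of a string key (the hash choice is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def partition_for(value: int, partition_count: int) -> int:
    """Index of the equal-width range of the Int64 key space that contains value."""
    span = 2**64
    return (value - INT64_MIN) * partition_count // span
```

With four partitions, `partition_for(INT64_MIN, 4)` is 0 and `partition_for(INT64_MAX, 4)` is 3, so every key lands in exactly one range and the same key always reaches the same partition.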
  • 46. Performance and stress response • Rich built-in metrics for Actors and Services programming models • Easy to add custom application performance metrics Health status monitoring • Built-in health status for cluster and services • Flexible and extensible health store for custom app health reporting • Allows continuous monitoring for real-time alerting on problems in production
  • 47. Detailed System Optics
      • Repair suggestions. Examples: slow RunAsync cancellations, RunAsync failures.
      • All important events logged. Examples: app creation, deploy and upgrade records; all Actor method calls.
     Custom Application Tracing
      • ETW == fast, industry-standard logging technology
      • Works across environments: the same tracing code runs on the devbox and on production clusters on Azure.
      • Easy to add; the system appends all the needed metadata such as node, app, service, and partition.
     Choice of Tools
      • Visual Studio Diagnostics Events Viewer
      • Windows Event Viewer
      • Windows Azure Diagnostics + Operational Insights
      • Easy to plug in your preferred tools: Kibana, Elasticsearch and more
  • 49. [Diagram: node ring over time. Time = t0: nodes 83, 76, 64, 50, 46. Time = t1: nodes failed, failures detected, cluster reconfigured. Time = t2: new node 61 arrived; ring is 83, 61, 50, 46.]
  • 50. Stateful Microservices - Replication Service Fabric Cluster VMs Primary Secondary Replication
  • 52. App1 App2 Handling Machine Failures App Type Packages Service Fabric Cluster VMs
  • 53. Must be safe in the presence of cascading failures
     [Diagram: a primary (P) and several secondary (S) replicas, with two replicas marked Failed]
  • 57. <FabricSettings>
         <Section Name="HealthManager/ClusterHealthPolicy">
           <Parameter Name="MaxPercentUnhealthyApplications" Value="0"/>
           <Parameter Name="MaxPercentUnhealthyNodes" Value="20"/>
         </Section>
       </FabricSettings>

       <Policies>
         <HealthPolicy MaxPercentUnhealthyDeployedApplications="20">
           <DefaultServiceTypeHealthPolicy MaxPercentUnhealthyServices="0" MaxPercentUnhealthyPartitionsPerService="10" MaxPercentUnhealthyReplicasPerPartition="0"/>
           <ServiceTypeHealthPolicy ServiceTypeName="FrontEndSvcType" MaxPercentUnhealthyServices="0" MaxPercentUnhealthyPartitionsPerService="20" MaxPercentUnhealthyReplicasPerPartition="0"/>
         </HealthPolicy>
       </Policies>
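To make the `MaxPercentUnhealthy...` thresholds on slide 57 concrete, here is a small Python sketch of how such a percentage might be evaluated. It rests on two assumptions that the slide does not state: that the tolerated count rounds up (so one failure is tolerated on small counts) and that unhealthy children within tolerance yield Warning rather than Ok. The function names are hypothetical.

```python
def tolerated(total: int, max_percent: int) -> int:
    """Largest unhealthy count still within policy, rounded up (assumed rounding rule)."""
    return (total * max_percent + 99) // 100


def evaluate(unhealthy: int, total: int, max_percent: int) -> str:
    """Aggregate a parent's health from its children under a percentage policy."""
    if unhealthy == 0:
        return "Ok"
    if unhealthy > tolerated(total, max_percent):
        return "Error"
    return "Warning"   # some children unhealthy, but within the tolerated share
```

With `MaxPercentUnhealthyNodes` at 20 and five nodes, one unhealthy node evaluates to Warning and two to Error. With a threshold of 0, any unhealthy child is an Error.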
  • 58. Health Policies & Timeouts
      • Health Policies: MaxPercentUnhealthyServices, MaxPercentUnhealthyDeployedApplications, ConsiderWarningAsError
      • UpgradeTimeout: if an entire upgrade hits this timeout, the upgrade is failed.
      • UpgradeDomainTimeout: if upgrading a UD hits this timeout, the upgrade is failed.
      • HealthCheckWaitDuration: after a UD is upgraded, wait for this time before checking the health of nodes in that UD.
      • HealthCheckStableDuration: even if the last health check passed, keep checking the health for this duration to ensure the upgrade is stable. If stable, upgrade the next UD.
      • UpgradeHealthCheckInterval: keep checking health periodically with this interval until HealthCheckStableDuration is hit.
      • HealthCheckRetryTimeout: once this timeout is hit, stop checking health and fail the upgrade.
  • 66. Mandatory Data | Description
       Entity         | Cluster, Node, App, Service, Partition, Replica, Deployed App, Deployed Service Pkg
       SourceId       | String that uniquely identifies the reporter
       Property       | Category (ex: "Storage" or "Connectivity")
       HealthState    | Ok, Warning, Error

       Optional Data     | Default        | Description
       Description       | ""             | Human-readable info
       TimeToLive        | Infinite       | # seconds before report is expired
       RemoveWhenExpired | False          | Useful if TTL != Infinite. If false, report's entity is in Error; else report removed after expiration.
       SequenceNumber    | Auto-generated | Increasing integer. Use to replace old reports when reporting state transitions.
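The interaction between `TimeToLive` and `RemoveWhenExpired` on slide 66 is easy to misread, so the rule is sketched below in Python. The `HealthReport` dataclass and `effective_state` function are hypothetical models of the table, not the Service Fabric health API, and time is a plain number of seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HealthReport:
    source_id: str
    prop: str
    state: str                      # "Ok", "Warning" or "Error"
    reported_at: float              # seconds
    ttl: Optional[float] = None     # None models the "Infinite" default
    remove_when_expired: bool = False


def effective_state(report: HealthReport, now: float) -> Optional[str]:
    """State contributed by this report at time `now`; None once the report is removed."""
    expired = report.ttl is not None and now - report.reported_at > report.ttl
    if not expired:
        return report.state
    # An expired report either disappears or flags its entity as Error.
    return None if report.remove_when_expired else "Error"
```

A watchdog that reports Ok with a finite TTL and `remove_when_expired=False` therefore turns its entity to Error if it stops reporting, which is the usual reason to set a TTL.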
  • 67. Property                 | Description
       HealthInformation        | The original health report
       SourceUtcTimestamp       | The time the health report was originally submitted
       LastModifiedUtcTimestamp | The last time the report was modified
       IsExpired                | True if TTL expired and RemoveWhenExpired=false
       LastOkTransitionAt, LastWarningTransitionAt, LastErrorTransitionAt | These give a history of the event's health states. Ex: alert if !Ok > 5 minutes
  • 70. Real Customers Real Workloads
  • 74. Independent games studio specializing in massively multiplayer games http://web.ageofascent.com/category/development/service-fabric/
  • 76. Testability in Service Fabric
      • Two main test scenarios provided out of the box
          • Chaos tests
          • Failover tests
      • Tools
          • C# APIs (System.Fabric.Testability.dll)
          • PowerShell cmdlets (runtime required)
  • 77. What do we get from this Testability
      • Generates faults across the entire Service Fabric cluster
      • Compresses faults generally seen in months or years to a few hours
      • Combination of interleaved faults with the high fault rate finds corner cases that are otherwise missed
      • Leads to a significant improvement in the code quality of the service
  • 78. Testability Actions (each entry: Action | Managed API | PowerShell cmdlet | Graceful/Ungraceful fault, followed by its description)
      • CleanTestState | CleanTestStateAsync | Remove-ServiceFabricTestState | Not applicable. Removes all the test state from the cluster in case of a bad shutdown of the test driver.
      • InvokeDataLoss | InvokeDataLossAsync | Invoke-ServiceFabricPartitionDataLoss | Graceful. Induces data loss into a service partition.
      • InvokeQuorumLoss | InvokeQuorumLossAsync | Invoke-ServiceFabricQuorumLoss | Graceful. Puts a given stateful service partition into quorum loss.
      • Move Primary | MovePrimaryAsync | Move-ServiceFabricPrimaryReplica | Graceful. Moves the specified primary replica of a stateful service to the specified cluster node.
      • Move Secondary | MoveSecondaryAsync | Move-ServiceFabricSecondaryReplica | Graceful. Moves the current secondary replica of a stateful service to a different cluster node.
      • RemoveReplica | RemoveReplicaAsync | Remove-ServiceFabricReplica | Graceful. Simulates a replica failure by removing a replica from a cluster. This closes the replica and transitions it to role 'None', removing all of its state from the cluster.
      • RestartDeployedCodePackage | RestartDeployedCodePackageAsync | Restart-ServiceFabricDeployedCodePackage | Ungraceful. Simulates a code package process failure by restarting a code package deployed on a node in a cluster. This aborts the code package process, which restarts all the user service replicas hosted in that process.
      • RestartNode | RestartNodeAsync | Restart-ServiceFabricNode | Ungraceful. Simulates a Service Fabric cluster node failure by restarting a node.
      • RestartPartition | RestartPartitionAsync | Restart-ServiceFabricPartition | Graceful. Simulates a data center blackout or cluster blackout scenario by restarting some or all replicas of a partition.
      • RestartReplica | RestartReplicaAsync | Restart-ServiceFabricReplica | Graceful. Simulates a replica failure by restarting a persisted replica in a cluster, closing the replica and then reopening it.
      • StartNode | StartNodeAsync | Start-ServiceFabricNode | Not applicable. Starts a node in a cluster which is already stopped.
      • StopNode | StopNodeAsync | Stop-ServiceFabricNode | Ungraceful. Simulates a node failure by stopping a node in a cluster. The node stays down until StartNode is called.
      • ValidateApplication | ValidateApplicationAsync | Test-ServiceFabricApplication | Not applicable. Validates the availability and health of all Service Fabric services within an application, usually after inducing some fault into the system.
      • ValidateService | ValidateServiceAsync | Test-ServiceFabricService | Not applicable. Validates the availability and health of a Service Fabric service, usually after inducing some fault into the system.
  • 79. Testability
     Stateless:
      • Stop node (ungraceful)
      • Start node (N/A)
      • Restart node (ungraceful)
      • Validate application (N/A)
      • Validate service (N/A)
      • RestartDeployedCodePackage (ungraceful)
      • Restart partition (graceful)
      • Restart replica (graceful)
      • CleanTestState (N/A)
      • Failover/chaos tests
     Stateful:
      • Move primary replica (graceful)
      • Move secondary replica (graceful)
      • Remove Replica (graceful)
      • InvokeQuorumLoss (graceful)
      • InvokeDataLoss (graceful)
  • 82. Updating Your App's Service's Code
      1. Put new code in code package
      2. Update ver strings (#s are not required)
      3. Copy new app package to image store
      4. Register new app type/version
      5. Select named app(s) to upgrade to new version

       <ServiceManifest Name="WebServer" Version="2.0">
         <ServiceTypes>
           <StatelessServiceType ServiceTypeName="WebServer" ...>
             <Extensions> ... </Extensions>
           </StatelessServiceType>
         </ServiceTypes>
         <CodePackage Name="CodePkg" Version="1.1">
           <EntryPoint> ... </EntryPoint>
         </CodePackage>
         <Resources><Endpoints> ... </Endpoints></Resources>
       </ServiceManifest>

       <ApplicationManifest ApplicationTypeName="DemoAppType" ApplicationTypeVersion="3.0" ...>
         <ServiceManifestImport>
           <ServiceManifestRef ServiceManifestName="WebServer" ServiceManifestVersion="2.0" .../>
         </ServiceManifestImport>
       </ApplicationManifest>
  • 83. Upgrade Domains
      • Prevent complete service outage while upgrading
      • More UDs: less loss of scale but more time to upgrade
      • # of UDs is set when the cluster is created via cluster manifest; ARM template
          • Default=5; 20% down at a time
      • IMPORTANT: 2 versions of your code run side-by-side simultaneously
          • Beware of data/schema/protocol changes; use 2-phase upgrade
      • The diagram shows 9 nodes spread across 5 UDs
     [Diagram: Node-1 to Node-9 distributed over upgrade domains UD #1 to UD #5]
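The "9 nodes across 5 UDs" example on slide 83 can be simulated to show why two code versions coexist during an upgrade. In the Python sketch below, the round-robin assignment is an assumption (the slide's actual node-to-UD mapping is not recoverable), and both function names are hypothetical.

```python
def assign_upgrade_domains(nodes, ud_count):
    """Round-robin placement; a real mapping comes from the cluster configuration."""
    domains = [[] for _ in range(ud_count)]
    for index, node in enumerate(nodes):
        domains[index % ud_count].append(node)
    return domains


def rolling_upgrade(domains, old, new):
    """Yield a snapshot of every node's version after each UD has been upgraded."""
    versions = {node: old for ud in domains for node in ud}
    for ud in domains:
        for node in ud:   # only the nodes of this UD are down while they upgrade
            versions[node] = new
        yield dict(versions)


nodes = [f"Node-{i}" for i in range(1, 10)]
domains = assign_upgrade_domains(nodes, 5)
steps = list(rolling_upgrade(domains, "1.0", "2.0"))
```

After the first step the cluster runs both "1.0" and "2.0", which is why the slide warns about data, schema and protocol changes. At most two of the nine nodes are unavailable at any time under this assignment.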
  • 84. Fault Domains
    - Isolate cluster from a single point of hardware failure (fault)
    - Determined by hardware topology (datacenter, rack, blade)
    fd:/DC1/R1/B1  fd:/DC1/R1/B2  fd:/DC1/R1/B3
    fd:/DC1/R2/B1  fd:/DC1/R2/B2  fd:/DC1/R2/B3
    fd:/DC2/R1/B1  fd:/DC2/R1/B2  fd:/DC2/R1/B3
    fd:/DC2/R2/B1  fd:/DC2/R2/B2  fd:/DC2/R2/B3
    …
    [Diagram: tree of datacenters DC1 to DC3, each with racks R1 and R2, each rack with blades B1 to B3]
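The fault domain paths above encode the hardware hierarchy, so nodes that share a failure point can be found by comparing path prefixes. A minimal Python sketch, assuming the three-level datacenter/rack/blade layout shown on the slide; the helper names are hypothetical and not a Service Fabric API.

```python
from collections import defaultdict


def parse_fault_domain(path):
    """Split 'fd:/DC1/R1/B1' into its levels: ('DC1', 'R1', 'B1')."""
    prefix = "fd:/"
    if not path.startswith(prefix):
        raise ValueError(f"not a fault domain path: {path!r}")
    return tuple(path[len(prefix):].split("/"))


def group_by_level(paths, depth):
    """Group paths that share their first `depth` levels (1 = datacenter, 2 = rack)."""
    groups = defaultdict(list)
    for path in paths:
        groups[parse_fault_domain(path)[:depth]].append(path)
    return dict(groups)


# The twelve paths listed on the slide: 2 datacenters x 2 racks x 3 blades.
paths = [f"fd:/DC{d}/R{r}/B{b}" for d in (1, 2) for r in (1, 2) for b in (1, 2, 3)]
racks = group_by_level(paths, depth=2)

# Blades that would all be lost if rack R1 in DC1 failed.
print(racks[("DC1", "R1")])
```

Replicas placed in different groups at a given depth do not share that level's failure point, which is the isolation the slide describes.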
  • 85. Start-ServiceFabricApplicationUpgrade
    Parameter | Default | Description
    ApplicationName | N/A | Application instance name
    TargetApplicationTypeVersion | N/A | The version string you want to upgrade to
    FailureAction | N/A | Rollback (to last version) or Manual (stop upgrade and switch to manual)
    UpgradeDomainTimeoutSec | Infinite | If any single UD takes longer than this, FailureAction is applied
    UpgradeTimeout | Infinite | If the whole upgrade (all UDs) takes longer than this, FailureAction is applied
    HealthCheckWaitDurationSec | 0 | After a UD is upgraded, SF waits this long before initiating the health check
    UpgradeHealthCheckInterval | 60 | If the health check fails, SF waits this long before checking again (set in cluster manifest; not PowerShell)
    HealthCheckRetryTimeoutSec | 600 | Maximum time SF waits for the app to be healthy
    HealthCheckStableDurationSec | 0 | How long the app must be healthy before upgrading the next UD
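How the four health check timing parameters interact is easier to see in a small simulation. The Python sketch below is a simplified model built only from the descriptions in the table, not Service Fabric's actual algorithm. `check_upgrade_domain` and the `is_healthy_at` probe are hypothetical names, and the exact rule for when the retry timeout fires is an assumption.

```python
def check_upgrade_domain(is_healthy_at, wait_sec=0, retry_timeout_sec=600,
                         stable_sec=0, interval_sec=60):
    """Return 'proceed' or 'failure_action' for one upgraded UD.

    is_healthy_at(t) is a hypothetical probe: True if the app is healthy
    t seconds after the upgrade domain finished upgrading.
    """
    t = wait_sec                              # HealthCheckWaitDurationSec
    deadline = wait_sec + retry_timeout_sec   # HealthCheckRetryTimeoutSec
    healthy_since = None
    while True:
        if is_healthy_at(t):
            if healthy_since is None:
                healthy_since = t
            if t - healthy_since >= stable_sec:   # HealthCheckStableDurationSec
                return "proceed"
        else:
            healthy_since = None              # the stability window restarts
            if t >= deadline:                 # assumed: timeout fires on a failed check
                return "failure_action"
        t += interval_sec                     # UpgradeHealthCheckInterval


# App becomes healthy two minutes after the UD is upgraded: the upgrade moves on.
print(check_upgrade_domain(lambda t: t >= 120))   # → proceed
# App never becomes healthy: FailureAction (Rollback or Manual) applies.
print(check_upgrade_domain(lambda t: False))      # → failure_action
```

Under this model, raising HealthCheckStableDurationSec delays the move to the next UD even for an app that is healthy from the start, which is the trade-off between upgrade speed and confidence.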
  • 86. Optional Health Criteria Policies
    Parameter | Default | Description
    ConsiderWarningAsError | False | Warning health events are considered errors, stopping the upgrade
    MaxPercentUnhealthyDeployedApplications | 0 | TODO: Max unhealthy before app is declared unhealthy
    MaxPercentUnhealthyServices | 0 | Max service instances unhealthy before app is declared unhealthy
    MaxPercentUnhealthyPartitionsPerService | 0 | Max partitions unhealthy before service instance is declared unhealthy
    MaxPercentUnhealthyReplicasPerPartition | 0 | Max partition replicas unhealthy before partition is declared unhealthy
    UpgradeReplicaSetCheckTimeout | Infinite; 900 (rollback) | Stateless: how long SF waits for target instances before next UD. Stateful: how long SF waits for quorum before next UD
    ForceRestart | False | Forces service restart when updating config/data
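The MaxPercentUnhealthy* thresholds roll up from replicas to partitions to services to the application. The Python sketch below models that roll-up using the parameter names from the table. It is an illustration only: the function names are hypothetical, the strict "greater than" comparison is an assumption, and MaxPercentUnhealthyDeployedApplications is left out.

```python
def exceeds(unhealthy, total, max_percent):
    """True when the unhealthy share of children is above the allowed percentage."""
    if total == 0:
        return False
    return unhealthy * 100 / total > max_percent


def partition_unhealthy(replicas, policy):
    bad = sum(1 for ok in replicas if not ok)
    return exceeds(bad, len(replicas), policy["MaxPercentUnhealthyReplicasPerPartition"])


def service_unhealthy(partitions, policy):
    bad = sum(1 for replicas in partitions if partition_unhealthy(replicas, policy))
    return exceeds(bad, len(partitions), policy["MaxPercentUnhealthyPartitionsPerService"])


def application_unhealthy(services, policy):
    bad = sum(1 for partitions in services if service_unhealthy(partitions, policy))
    return exceeds(bad, len(services), policy["MaxPercentUnhealthyServices"])


# Defaults from the table: any unhealthy child fails its parent.
DEFAULT_POLICY = {
    "MaxPercentUnhealthyReplicasPerPartition": 0,
    "MaxPercentUnhealthyPartitionsPerService": 0,
    "MaxPercentUnhealthyServices": 0,
}

# One service, two partitions, three replicas each; a single replica is unhealthy.
services = [[[True, True, False], [True, True, True]]]

print(application_unhealthy(services, DEFAULT_POLICY))   # → True
tolerant = dict(DEFAULT_POLICY, MaxPercentUnhealthyReplicasPerPartition=34)
print(application_unhealthy(services, tolerant))         # → False
```

With the default of 0 at every level, one unhealthy replica is enough to mark the whole application unhealthy and stop the upgrade; allowing 34% of replicas per partition lets the same state pass.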
  • 87. Managing Named Application Upgrades
    - Get progress via Get-ServiceFabricApplicationUpgrade
    - Most problems are timing related:
      - Instances/replicas not going down quickly
      - UDs not coming up in time
      - Failing health checks
    - If FailureAction is “Manual”, you can:
      Action | PowerShell Command
      Rollback | Start-ServiceFabricApplicationRollback
      Start next UD | Resume-ServiceFabricApplicationUpgrade
      Resume monitored upgrade | Update-ServiceFabricApplicationUpgrade
    - Optional: After all named apps upgrade, unregister old app type
  • 88. [Diagram: six Windows OS machines, each running a Fabric Node; the nodes host instances of App A v1, App B v2, App C v1 and App C v2, deployed from an App Repository]
  • 89. Perform http://bit.ly/sf-lab-9
    Clone repository in VS: https://github.com/Azure-Samples/service-fabric-dotnet-getting-started.git
    Note: StatefulVisualObjectActor.cs is now VisualObjectActor.cs
  • 90. Updates Since //Build 2015
    - Now Globally Available
    - Create Clusters via ARM & Portal
    - Hosted Clusters in Azure
    - Many Performance, Density, & Scale Improvements
    - Many API Improvements
    - New Previews:
      - Linux Support
      - Java Support
      - Docker & Windows Containers
      - On Premises Clusters
  • 92. Stephane Lapointe, Orckestra - stephane@stephanelapointe.net / @s_lapointe
    Guy Barrette, freelance Architect/Developer - guy@guybarrette.com / @GuyBarrette
    Francois Boucher, Lixar IT - fboucher@frankysnotes.com / @fboucheros
    Alexandre Brisebois, Microsoft – alexandre.brisebois@microsoft.com / @brisebois