High-throughput data streaming in Azure
Alexander Laysha
Solution Architect at EPAM Systems & Microsoft Azure MVP
2
A few words about myself…
I’m Alexander Laysha
• Solution Architect at EPAM Systems & Microsoft Azure MVP
• Focused on backend, high-load and cloud solutions
• Leader of the Belarus Azure Community
• Speaker at local and external meetups and conferences
My contacts
• Email: layshaalex@gmail.com
• Twitter: @layshaalexander
• Facebook: alexander.laysha
3
Which topics will we cover?
• Business needs for real-time analytics
• Use-cases & architecture approaches
• Basics of real-time data streaming platforms
• Azure Event Hub capabilities & constraints
• Pricing calculations for multiple data ingestion scenarios based on Event Hub
• Summary
4
Business needs for data analytics
Past world
• Capture data for later analysis
• Reports and analytics with X days of latency
Current days
• Dealing with tons of data
• Offline reporting and analysis is no longer enough (but still important)
• Businesses want immediate insights from captured data, with X seconds/minutes of latency
5
Just a few use cases…
IoT - device operational intelligence and proactive alerts
Gaming industry - real-time leaderboard with game leaders and scores
E-commerce - online recommendation engines and proactive care
Operations - analyze real-time data to respond to dynamic environments and take immediate action
Financial - monitor financial transactions in real time to detect fraudulent activity
6
High-level real-time streaming architecture
Collection - data captured from multiple sources
Streaming (OUR FOCUS) - high-throughput data pipeline systems such as Kafka, Kinesis, Event Hub
Processing - stream processing platforms that perform a certain task to produce output
Serving - apps that consume the stream processing output: UI, posts, DB, report viewers, APIs
7
High-level view of Lambda Architecture
Persistence and batch - data is stored in a persistence layer, from which it is periodically ingested and processed by the batch layer (may include stream processing for on-the-fly ETL)
Speed layer - handles the portion of the data that has not yet been processed by the batch layer (includes stream processing and storage)
Serving layer - consolidates both by merging the output of the batch layer and the speed layer
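To make the three Lambda layers concrete, here is a minimal in-memory sketch. It assumes the "computation" is a simple per-key event count; the function names, the event shape and the sample device ids are hypothetical and only illustrate the merge step done by the serving layer.

```python
from collections import Counter

def batch_view(persisted_events):
    """Batch layer: periodic full recomputation over everything persisted so far."""
    return Counter(event["key"] for event in persisted_events)

def speed_view(recent_events):
    """Speed layer: incremental counts for events the batch run has not seen yet."""
    view = Counter()
    for event in recent_events:
        view[event["key"]] += 1
    return view

def serve(batch, speed):
    """Serving layer: merge the batch output with the speed output."""
    return batch + speed

# Hypothetical sample data: three events already persisted, two still "in flight".
persisted = [{"key": "device-1"}, {"key": "device-2"}, {"key": "device-1"}]
recent = [{"key": "device-1"}, {"key": "device-3"}]

merged = serve(batch_view(persisted), speed_view(recent))
print(dict(merged))
```

In a real system the two views live in different stores and the merge has to handle overlap between them; the sketch assumes the two inputs are disjoint.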
8
High-level view of Kappa Architecture
Persistence - stores the initial raw data for historical purposes; it can be used to replay computations from the initial data stream
Speed layer - the basic idea is not to recompute all data periodically in a batch layer, but to do all computation in the stream processing system alone, and to recompute only when the business logic changes, by replaying historical data
9
Popular platforms for data streaming
Kinesis
Event Hub
10
Idea of Kafka/EventHub/Kinesis
11
Common terminology
Concept        | Kafka          | Kinesis                     | Event Hub
Producer       | Producer       | Producer                    | Publisher
Consumer       | Consumer       | Kinesis Stream Applications | Consumer
Stream         | Topic          | Stream                      | Event Hub
Partition      | Partition      | Shard                       | Partition
Index          | Offset         | Sequence Number             | Offset
Consumer Group | Consumer Group | Application                 | Consumer Group
12
Common characteristics
• Designed to handle very large quantities of small messages
• Horizontally scalable through partitions and consumer groups
• Reliable and fault-tolerant
• Configurable data replication
• Configurable message TTL (stream level)
• Supports at-least-once delivery
• Logical data organization using partitions
• Separate data view for each consumer through consumer groups and indexes
• Ability to replay messages
• Messages with the same key are sent to the same partition
• Message order is guaranteed within a partition
• Integrated with modern stream processing platforms (Stream Analytics, Storm, Spark, etc.)
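Two of the characteristics above (same key goes to the same partition, order is kept within a partition) can be sketched with a tiny in-memory log. This is only an illustration: the `PartitionedLog` class and the SHA-256 based `partition_for` function are assumptions made for the example, and each real platform uses its own hash function.

```python
import hashlib

def partition_for(key, partition_count):
    """Map a key to a partition with a stable hash (illustrative choice of hash)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

class PartitionedLog:
    """Append-only log split into partitions; each partition keeps arrival order."""

    def __init__(self, partition_count):
        self.partitions = [[] for _ in range(partition_count)]

    def append(self, key, value):
        """Store the value and return (partition, offset) of the new entry."""
        partition = partition_for(key, len(self.partitions))
        self.partitions[partition].append((key, value))
        return partition, len(self.partitions[partition]) - 1

log = PartitionedLog(partition_count=4)
placements = [log.append("device-42", reading) for reading in ("t=1", "t=2", "t=3")]
print(placements)  # same partition every time, offsets 0, 1, 2
```

Ordering is only promised inside one partition; readings sent with different keys may land in different partitions and be consumed in any relative order.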
13
Let’s take a closer look at Azure Event Hub
[Diagram labels: Event Producers; > 1M Producers; > 1 GB/sec Aggregate Throughput; Direct; PartitionKey Hash; Namespace]
Throughput Units:
• 1 ≤ TUs ≤ Partition Count
• TU: 1 MB/s writes, 2 MB/s reads
14
Event Hub Publishers
Ways to publish (individual event or batch):
• Round Robin
• Partition Id
• Partition Key
Supported protocols:
• HTTPS - short-lived (low throughput)
• AMQP 1.0 - long-lived (high throughput)
Publisher Policy - a run-time feature designed to support large numbers of independent event publishers, each with a unique identifier and a virtual endpoint:
//[my namespace].servicebus.windows.net/[event hub name]/publishers/[my publisher name]
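The three ways to publish and the publisher endpoint template can be sketched as follows. This is a local simulation, not the Event Hub client library: `PartitionRouter`, `publisher_endpoint`, the SHA-256 hash and the sample names (`my-namespace`, `telemetry`, `device-42`) are all assumptions made for illustration.

```python
import hashlib
import itertools

def publisher_endpoint(namespace, event_hub, publisher):
    """Fill in the virtual endpoint template shown on the slide."""
    return "//{}.servicebus.windows.net/{}/publishers/{}".format(
        namespace, event_hub, publisher)

class PartitionRouter:
    """Chooses a partition: explicit id, hash of a key, or round robin."""

    def __init__(self, partition_count):
        self.partition_count = partition_count
        self._round_robin = itertools.cycle(range(partition_count))

    def route(self, partition_id=None, partition_key=None):
        if partition_id is not None and partition_key is not None:
            raise ValueError("choose either partition_id or partition_key")
        if partition_id is not None:
            if not 0 <= partition_id < self.partition_count:
                raise ValueError("unknown partition")
            return partition_id
        if partition_key is not None:
            # Illustrative hash; the service applies its own hash function.
            digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
            return int.from_bytes(digest[:4], "big") % self.partition_count
        return next(self._round_robin)

router = PartitionRouter(partition_count=4)
round_robin = [router.route() for _ in range(5)]
pinned = router.route(partition_id=2)
keyed = [router.route(partition_key="device-42") for _ in range(3)]
endpoint = publisher_endpoint("my-namespace", "telemetry", "device-42")
print(round_robin, pinned, keyed, endpoint)
```

Round robin spreads load evenly, a partition id pins events to one partition, and a partition key keeps related events together without the sender needing to know the partition layout.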
15
Event Hub Consumers
Event listening - a consumer connects to a partition over the AMQP 1.0 protocol and listens for incoming events
Consumer groups - a consumer group is a view (state, position, or offset) of an entire event hub. Consumer groups let multiple consuming applications each have a separate view of the event stream and read it independently, at their own pace and with their own offsets
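The "separate view per consumer group" idea can be shown with a small sketch. It assumes a single partition and keeps each group's offset inside the stream object for brevity; in practice the consuming application is responsible for tracking its own position. The `EventStream` class and the group names are hypothetical.

```python
class EventStream:
    """One partition of events plus an independent read position per consumer group."""

    def __init__(self):
        self.events = []
        self.offsets = {}  # consumer group name -> next offset to read

    def publish(self, event):
        self.events.append(event)

    def read(self, group, max_events=10):
        """Return the next events for this group and advance only its offset."""
        start = self.offsets.get(group, 0)
        batch = self.events[start:start + max_events]
        self.offsets[group] = start + len(batch)
        return batch

    def rewind(self, group, offset=0):
        """Move a group back to an earlier offset to replay events."""
        self.offsets[group] = offset

stream = EventStream()
for n in range(5):
    stream.publish("event-{}".format(n))

dashboard_batch = stream.read("dashboard", max_events=2)  # slow reader
archiver_batch = stream.read("archiver", max_events=5)    # fast reader
stream.rewind("archiver")                                 # replay from the start
replayed = stream.read("archiver", max_events=5)
print(dashboard_batch, archiver_batch, replayed)
```

Reading never removes events, which is why one group falling behind or replaying does not affect any other group.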
16
Event Hub Security
• The security model is based on Shared Access Signature (SAS) tokens
• A shared access policy (key) supports the following claims: Send, Listen, Manage
• A shared access policy (key):
  • can be created at the namespace or event hub level
  • includes Primary and Secondary keys
  • Primary and Secondary keys can be revoked
• SAS tokens can be created at the namespace, event hub or publisher level
• Granular control over event publishers through publisher policies (the publisher name should be the same as the partition key, and the SAS token should be issued for the publisher endpoint)
• An event publisher can be revoked if it uses a publisher-specific SAS token
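As an illustration of how a SAS token is derived from a shared access policy key, here is a sketch using only the Python standard library. It follows the documented token layout (`sr`, `sig`, `se`, `skn`, with an HMAC-SHA256 signature over the URL-encoded resource URI and the expiry time). The namespace, hub, policy name and key below are placeholders, and the `now` parameter is an addition that only exists to make the example reproducible.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote, quote_plus

def make_sas_token(resource_uri, key_name, key, ttl_seconds=3600, now=None):
    """Build a SAS token string for the given resource URI and policy key."""
    issued_at = time.time() if now is None else now
    expiry = str(int(issued_at + ttl_seconds))
    encoded_uri = quote_plus(resource_uri)
    # The signature covers the encoded URI and the expiry, separated by a newline.
    string_to_sign = (encoded_uri + "\n" + expiry).encode("utf-8")
    digest = hmac.new(key.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")
    return "SharedAccessSignature sr={}&sig={}&se={}&skn={}".format(
        encoded_uri, signature, expiry, key_name)

# Placeholder values; a fixed clock (now=0) keeps the output stable for the demo.
token = make_sas_token(
    "https://my-namespace.servicebus.windows.net/telemetry",
    key_name="send-policy",
    key="not-a-real-key",
    ttl_seconds=3600,
    now=0,
)
print(token)
```

Pointing `resource_uri` at a publisher endpoint instead of the hub narrows the token to that single publisher, which is what makes per-publisher revocation possible.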
17
Event Hub Capturing
• Automatic persistence of ingested events from Event Hub in Apache Avro format
• Supported storage:
  • Azure Storage
  • Azure Data Lake
• Configurable size & time windows per partition
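The size and time windows can be pictured as a per-partition buffer that is written out as soon as either limit is reached. The sketch below is a simplified model, not the service's implementation: `CaptureBuffer` is a hypothetical name, the thresholds are arbitrary, and the time window is only checked when a new event arrives.

```python
class CaptureBuffer:
    """Buffers events for one partition; flushes on size or age, whichever comes first."""

    def __init__(self, max_bytes, max_seconds):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.events = []
        self.size = 0
        self.opened_at = None

    def add(self, payload, now):
        """Add one event; return the flushed batch if a window closed, else None."""
        if self.opened_at is None:
            self.opened_at = now
        self.events.append(payload)
        self.size += len(payload)
        size_reached = self.size >= self.max_bytes
        time_reached = now - self.opened_at >= self.max_seconds
        if size_reached or time_reached:
            flushed = self.events
            self.events, self.size, self.opened_at = [], 0, None
            return flushed
        return None

buffer = CaptureBuffer(max_bytes=10, max_seconds=60)
results = [
    buffer.add(b"aaaa", now=0),    # 4 bytes buffered
    buffer.add(b"bbbb", now=1),    # 8 bytes buffered
    buffer.add(b"cccc", now=2),    # 12 bytes: size window closes
    buffer.add(b"d", now=100),     # new window opens
    buffer.add(b"e", now=161),     # 61 seconds old: time window closes
]
print(results)
```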
18
Event Hub Monitoring & Disaster Recovery
Monitoring
• Integrated with Azure Monitor
• Types of diagnostics data: archive logs, operational logs, auto-scale logs, all metrics
• Diagnostic logs can be sent to: storage account, event hub, Log Analytics
Availability & Disaster Recovery
• SLA of 99.9% for operations on Event Hub
• HA is provided by replication and availability sets
• If one partition fails, the other partitions remain available
• No built-in options for disaster recovery of Event Hub between regions (custom solution: use event capturing with geo-redundant storage and custom code to populate an Event Hub in another region)
19
Event Hub Scalability
• Throughput Unit (TU) - the unit of scalability, shared across all event hubs in a namespace
• TUs are set manually or programmatically
• 1 TU = 1 MB/sec or 1000 events/sec on ingress, 2 MB/sec on egress; max 100 TU for Standard Tier (contact support team)
• Dedicated Event Hub: 1 CU = ~200 TU, max 8 CU
• Enable Auto-Inflate to scale TUs up automatically, with the ability to specify limits
• Partition count: 2-32. The count cannot be changed and must be specified at creation (a higher count can be requested by contacting Microsoft)
• Consumer group count: up to 20 per Event Hub
• Max 5 concurrent readers on a partition per consumer group (one active receiver on a partition per consumer group is recommended)
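A rough sizing helper based on the per-TU limits above might look like this. It is a back-of-the-envelope sketch: `required_tus` is a hypothetical function, it treats 1000 KB as 1 MB as the slides do, and it assumes every consumer group reads the full stream.

```python
import math

TU_INGRESS_MB_PER_SEC = 1
TU_INGRESS_EVENTS_PER_SEC = 1000
TU_EGRESS_MB_PER_SEC = 2

def required_tus(events_per_sec, event_size_kb, consumer_groups=1):
    """Smallest TU count that satisfies the ingress and egress limits of one TU."""
    ingress_mb_per_sec = events_per_sec * event_size_kb / 1000
    by_bytes = math.ceil(ingress_mb_per_sec / TU_INGRESS_MB_PER_SEC)
    by_events = math.ceil(events_per_sec / TU_INGRESS_EVENTS_PER_SEC)
    # Assumption: each consumer group reads every event once.
    egress_mb_per_sec = ingress_mb_per_sec * consumer_groups
    by_egress = math.ceil(egress_mb_per_sec / TU_EGRESS_MB_PER_SEC)
    return max(1, by_bytes, by_events, by_egress)

print(required_tus(1000, 1))                      # 1 KB events at 1000/sec
print(required_tus(100_000, 1))                   # 1 KB events at 100000/sec
print(required_tus(1000, 1, consumer_groups=3))   # egress becomes the bottleneck
```

Whichever limit is hit first (bytes in, events in, or bytes out) decides the TU count, so many small events or several consumer groups can raise the figure even when the ingress volume in MB/sec looks modest.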
20
Dedicated Event Hub
• Single-tenant hosting with no noise from other tenants, available to customers with an enterprise agreement
• Repeatable performance every time
• No additional charge for incoming messages
• Message size limit increases to 1 MB, compared to 256 KB for Standard and Basic
• Scalable between 1 and 8 capacity units (CU), providing up to 2 million ingress events per second
• CUs manage the scale for Event Hubs Dedicated: 1 CU = ~200 TU, max 8 CU
• Zero maintenance: load balancing, OS updates, security patches, and partitioning are managed for you
• Fixed pricing: ~720$ per day for 1 CU (pricing & CU size will change starting from October 2017: ¼ CU for 5000$ per month)
21
Other Event Hub constraints not covered above
• Max 10 Event Hubs per namespace
• A single partition is limited to 1 TU of throughput
• Number of AMQP connections per namespace: 5000
• Azure deployment only; no Azure Stack support yet
• No SAS at the consumer group level; no built-in encryption or compression of the event body
• No functionality to drain an Event Hub (a custom drainer is needed)
• No local emulator
22
Event Hub pricing for ingestion
!!! COSTS JUST FOR INCOMING TRAFFIC WITHOUT STORAGE COSTS
1.000 msg/sec, 1 KB in size, 1 MB/sec, 24/7:
1 TU Standard Pricing Tier * 22.32$ + (1.000 * 60sec * 60min * 24hrs * 31d)/1.000.000 * 0.028$ ≈ 22.32$ + 75$ ≈ 97.3$ per month - GOOD
100.000 msg/sec, 1 KB in size, 100 MB/sec, 24/7:
100 TU (MAX) Standard Pricing Tier * 22.32$ + (100.000 * 60sec * 60min * 24hrs * 31d)/1.000.000 * 0.028$ ≈ 2.232$ + 7.500$ ≈ 9.730$ per month - GOOD
1.000.000 msg/sec, 1 KB in size, 1 GB/sec, 24/7:
4 CU Dedicated Pricing Tier * 720$ * 31d = 89.280$ per month - TOO MUCH!
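The same arithmetic can be reproduced in a few lines, which makes it easy to plug in other message rates. The prices are the ones quoted on the slide (they reflect the time of the talk, not current pricing), the function names are hypothetical, and each 1 KB message is counted as one billable ingress event, as in the slide's calculation.

```python
DAYS_PER_MONTH = 31
TU_PRICE_PER_MONTH = 22.32               # Standard tier, per TU, from the slide
INGRESS_PRICE_PER_MILLION_EVENTS = 0.028
CU_PRICE_PER_DAY = 720                   # Dedicated tier, per CU, from the slide

def standard_monthly_cost(msgs_per_sec, throughput_units):
    """TU charge plus ingress-event charge for a 31-day month of 24/7 traffic."""
    events = msgs_per_sec * 60 * 60 * 24 * DAYS_PER_MONTH
    ingress = events / 1_000_000 * INGRESS_PRICE_PER_MILLION_EVENTS
    return throughput_units * TU_PRICE_PER_MONTH + ingress

def dedicated_monthly_cost(capacity_units):
    """Flat per-day CU charge; incoming messages are not billed separately."""
    return capacity_units * CU_PRICE_PER_DAY * DAYS_PER_MONTH

small = standard_monthly_cost(1_000, throughput_units=1)
medium = standard_monthly_cost(100_000, throughput_units=100)
large = dedicated_monthly_cost(capacity_units=4)
print(round(small, 2), round(medium, 2), large)
```

The unrounded results are 97.32$, 9731.52$ and 89280$, which match the rounded figures on the slide.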
23
Summary
• Azure Event Hub can handle mid-load scenarios (100.000 msg/sec or 100 MB/sec) in a cost-effective manner and provides good feature parity
• For high-load scenarios (1.000.000+ msg/sec or 1+ GB/sec) or big-data scenarios it seems too expensive (an Apache Kafka cluster is cheaper but requires investment in tuning & maintenance)
• Always consider the quality attribute requirements of your system before moving forward with technology decisions. PaaS is not always the right choice for high-load scenarios
24
Q & A
THANK YOU!



Editor's Notes

  • #5: https://www.slideshare.net/hadooparchbook/streaming-architecture-patterns
  • #11: http://container-solutions.com/introduction-stream-processing-systems-kafka-aws-kinesis-azure-event-hubs/
  • #13: https://blogs.msdn.microsoft.com/opensourcemsft/2015/08/08/choosing-between-azure-event-hub-and-kafka-what-you-need-to-know/ http://container-solutions.com/introduction-stream-processing-systems-kafka-aws-kinesis-azure-event-hubs/