3PAR Streaming Journey
Chris McDermott
January 2020
Agenda
1. The 3PAR use case
2. Batch to Streaming transition
3. Reactive architecture, micro-services and event sourcing
The 3PAR Use case
• 50K 3PAR Storage Arrays (SAs)
• Different data types have different arrival rates
• 500 GB/day on average
• Need to process 10x bursts
• 10K sensors per SA
• ~10 sources of enhancement data
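As a rough back-of-the-envelope check on the volumes above, the average and burst ingest rates can be estimated as follows. This is only a sketch: it assumes decimal units (1 GB = 1000 MB) and an evenly spread daily volume, neither of which the slide states, and the function name is made up for illustration.

```python
# Back-of-the-envelope ingest rates from the figures on this slide.
# Assumptions (not from the deck): decimal units and an evenly spread daily volume.
SECONDS_PER_DAY = 24 * 60 * 60

def ingest_rate_mb_per_s(gb_per_day: float, burst_factor: float = 1.0) -> float:
    """Return the sustained rate in MB/s for a daily volume and a burst factor."""
    megabytes_per_day = gb_per_day * 1000
    return megabytes_per_day * burst_factor / SECONDS_PER_DAY

average = ingest_rate_mb_per_s(500)      # 500 GB/day on average
burst = ingest_rate_mb_per_s(500, 10)    # the same volume at a 10x burst
print(f"average {average:.1f} MB/s, burst {burst:.1f} MB/s")
```

Under these assumptions the pipeline has to sustain a little under 6 MB/s on average and a little under 60 MB/s during a 10x burst.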
The 3PAR Use case (continued)
• Joins!
• Analytics
• Statistical aggregations
• Prediction
• Projections
• Automated case management
• Legacy Integrations
• Create event sources from non-eventing services using Reactive facades
Batch Vs Streaming
[Diagram: two timelines. Batch: input data for Day 1 … Day N and the corresponding processing runs, separated by a time lag. Streaming: input data for T 1 … Time N and the corresponding processing, with no time lag.]
Batch Processing
• Serial processing of all data on a regular cadence
• Pipeline stages: Gather (1 hour), Process (4 hours), Push* (30 hours)
• As the amount of history and the number of systems increase, each stage of the pipeline takes longer to run
• Failures take a long time to recover from (Push failure at the 20-hour mark…)
• Large quanta problem: repeating failed work takes a long time
• Based on Spark, so monitoring is sub-par
* Push = project data to new Elasticsearch index
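The cost of a late failure in the batch pipeline can be sketched with the stage durations from this slide. The restart rule is an assumption (a failed stage starts over from its beginning, losing the hours already spent), and `run_hours` is a hypothetical helper, not part of the HPE pipeline.

```python
# Pipeline stage durations in hours, as given on the slide.
STAGES = [("Gather", 1), ("Process", 4), ("Push", 30)]

def run_hours(stages, failed_stage=None, failed_after_hours=0):
    """Total wall-clock hours for one run, optionally with one mid-stage failure.

    Assumption (not stated on the slide): a failed stage restarts from its
    beginning, so the hours spent before the failure are lost.
    """
    total = 0
    for name, hours in stages:
        if name == failed_stage:
            total += failed_after_hours  # work thrown away by the failure
        total += hours                   # the (re)run that completes
    return total

clean_run = run_hours(STAGES)                # no failures
failed_run = run_hours(STAGES, "Push", 20)   # Push fails at the 20-hour mark
print(clean_run, failed_run)
```

With these numbers a clean run takes 35 hours, and a single Push failure at the 20-hour mark stretches it to 55 hours.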
Streaming Processing
• Parallel processing of (relatively) small chunks of data (per SA) as soon as the data is received
• Data lag (data freshness) is consistently less than 5 minutes. New data is processed as soon as it is available.
• Failure recovery is extremely fast
  o When running at 10x line rate, full recovery takes roughly 10% of the outage time (a 2-day outage is recovered in ~5 hours)
  o Can dynamically apply more resources to increase processing performance
• Based on Lagom/Akka, which provides built-in metrics and reporting framework
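The recovery figure can be sanity-checked with a simple model. While the system is catching up, new data keeps arriving at 1x line rate, so a system processing at 10x drains the backlog at a net 9x. The sketch below assumes a constant arrival rate; the function name is illustrative only.

```python
def catch_up_hours(outage_hours: float, speedup: float) -> float:
    """Hours needed to drain the backlog built up during an outage.

    Assumes data arrives at a constant 1x line rate and is processed at
    `speedup` times that rate, so the backlog shrinks at (speedup - 1)x.
    """
    if speedup <= 1:
        raise ValueError("speedup must exceed 1, otherwise the backlog never drains")
    return outage_hours / (speedup - 1)

recovery = catch_up_hours(48, 10)   # 2-day outage, 10x line rate
fraction = recovery / 48            # recovery time as a share of the outage
print(f"{recovery:.1f} hours, {fraction:.0%} of the outage")
```

Under this model a 2-day outage drains in about 5.3 hours, roughly 11% of the outage, which is consistent with the "roughly 10%" and "~5 hours" quoted on the slide.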
Reactive Architecture and Technologies
Reactive Architecture
“Systems built as Reactive Systems are more flexible, loosely-coupled and scalable. This makes them easier to develop
and amenable to change. They are significantly more tolerant of failure and when a failure does occur they meet it with
elegance rather than disaster. Reactive Systems are highly responsive, giving users effective interactive feedback.”
Reactive Systems Are:
• Responsive: The system responds in a timely manner if at all possible.
• Resilient: The system stays responsive in the face of failure. This applies not only to highly-available,
mission-critical systems — any system that is not resilient will be unresponsive after a failure.
• Elastic: The system stays responsive under varying workload. Reactive Systems can react to changes
in the input rate by increasing or decreasing the resources allocated to service these inputs.
• Message Driven: Reactive Systems rely on asynchronous message-passing to establish a boundary
between components that ensures loose coupling, isolation and location transparency.
Summarized from the Reactive Manifesto
3PAR Streaming – Technology Stack
Apache Kafka – durable, elastic, fault-tolerant, log-based message bus
Apache Cassandra – durable, elastic, fault-tolerant NoSQL database
Elasticsearch – durable, elastic, fault-tolerant document store optimized for search
Apache NiFi – dataflow automation with graphical programming interface
Akka – a toolkit for building highly concurrent, distributed, and resilient message-driven applications
Play – web based (REST) application framework based on Akka
3PAR Streaming – Technology Stack
Lightbend Commercial Components
• Split Brain Resolver
• Telemetry
• Thread Starvation Detector
3PAR Simplified Streaming Component Architecture
[Diagram. Labeled components: Ingest, Support Tickets, Entitlement, ML Application, NiFi, Support Lagom, Entitlement Lagom, ES Projector, Elasticsearch, StoreServ API, StoreServ Akka, InfoSight UI]
Akka
StoreServ Akka
• Device shadow model
• Stores raw data in Cassandra (data lake)
• Stores Actor State in Cassandra
• Actors cache most recent data in memory for very low latency
• Actors are rehydrated from State in Cassandra
• Actors are not passivated
• Scale out by adding more instances when running out of heap
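The device shadow idea above can be sketched in a few lines. This is a language-neutral illustration in Python, not the Akka implementation: a plain dict stands in for Cassandra, and the class, method and sensor names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DeviceShadow:
    """In-memory stand-in for one storage array: keeps the latest value per sensor."""
    array_id: str
    latest: dict = field(default_factory=dict)

    def apply(self, sensor: str, value: float, store: dict) -> None:
        """Update the cached reading and persist a snapshot of the state."""
        self.latest[sensor] = value
        store[self.array_id] = dict(self.latest)  # the deck persists state in Cassandra

    @classmethod
    def rehydrate(cls, array_id: str, store: dict) -> "DeviceShadow":
        """Rebuild a shadow from persisted state, for example after a restart."""
        return cls(array_id, dict(store.get(array_id, {})))

store = {}                                   # hypothetical stand-in for the state table
shadow = DeviceShadow("sa-1")
shadow.apply("temp_c", 41.5, store)
shadow.apply("fan_rpm", 5200, store)
restarted = DeviceShadow.rehydrate("sa-1", store)
print(restarted.latest)
```

Reads are served from `latest` in memory, which is where the low latency comes from; the persisted snapshot is only needed to rehydrate the shadow.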
Akka vs Lagom
Various stateful micro services written using Lagom
• Lagom makes sense for event driven micro services
  o If you can store the entire event history and rebuild the read-side from it in a reasonable amount of time (event sourcing)
  o Most use cases fall into this category.
• Plain Akka makes more sense if you can’t afford to save the entire event history, if rebuilding the read-side from the event history is too expensive, or if you simply don’t need the entire event history to rebuild the read-side.
  o Persisted the entire read-side (Kafka)
  o Still CQRS (but not event-sourced)
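The trade-off between the two approaches can be illustrated with a minimal sketch of event replay. It is plain Python rather than Lagom or Akka Persistence, and the event shape and function names are assumptions made for the example.

```python
from functools import reduce

def apply_event(state: dict, event: dict) -> dict:
    """Pure update: return the read-side view after applying one event."""
    updated = dict(state)
    updated[event["sensor"]] = event["value"]
    return updated

def rebuild_read_side(events: list) -> dict:
    """Event sourcing: replay the full history to recover the current view."""
    return reduce(apply_event, events, {})

# Hypothetical event history for one storage array.
events = [
    {"sensor": "temp_c", "value": 40.0},
    {"sensor": "fan_rpm", "value": 5200},
    {"sensor": "temp_c", "value": 41.5},
]
view = rebuild_read_side(events)
print(view)
```

Replay cost grows with the length of the history, which is the expense the slide refers to. The non-event-sourced alternative persists `view` itself and drops the history, so a rebuild is a single read but past states cannot be reconstructed.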
ES Projector
• Akka Streams application
• Reads full data model
• Creates “role” based projections of data into Elasticsearch
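One plausible reading of a role-based projection is that each role gets its own reduced view of the full data model. The sketch below assumes that reading; the role names, field names and the `project` helper are invented for illustration and do not come from the deck.

```python
# Hypothetical mapping from role to the fields that role may see.
ROLE_FIELDS = {
    "customer": {"array_id", "capacity_used_pct"},
    "support": {"array_id", "capacity_used_pct", "firmware", "open_cases"},
}

def project(record: dict, role: str) -> dict:
    """Return the subset of a full record that is visible to the given role."""
    allowed = ROLE_FIELDS[role]
    return {key: value for key, value in record.items() if key in allowed}

full_record = {
    "array_id": "sa-1",
    "capacity_used_pct": 72,
    "firmware": "3.3.1",
    "open_cases": 2,
}
customer_doc = project(full_record, "customer")
support_doc = project(full_record, "support")
print(customer_doc)
print(support_doc)
```

Each projected document would then be indexed separately, so a query never has to filter fields at read time.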
StoreServ API
• Uses Play Framework
• Basically provides a thin wrapper over Elasticsearch queries
• Modifies client queries to enforce access control (both tenancy and role restrictions)
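Query rewriting for access control can be sketched by wrapping the client's query in an Elasticsearch `bool` query whose `filter` clauses restrict the results. The `bool`, `must`, `filter`, `term` and `terms` constructs are standard Query DSL; the field names `tenant_id` and `visible_to_roles` and the function itself are assumptions, not the StoreServ API's actual schema.

```python
def enforce_access(client_query: dict, tenant_id: str, roles: list) -> dict:
    """Wrap a client query so it can only match documents the caller may see.

    The client's query is kept intact under `must`; the tenancy and role
    restrictions are added as non-scoring `filter` clauses.
    """
    return {
        "bool": {
            "must": [client_query],
            "filter": [
                {"term": {"tenant_id": tenant_id}},         # hypothetical field
                {"terms": {"visible_to_roles": roles}},     # hypothetical field
            ],
        }
    }

client_query = {"match": {"model": "8450"}}
secured = enforce_access(client_query, "tenant-42", ["customer"])
print(secured)
```

Because the restrictions are added on the server side, a client cannot widen its own access by changing the query it sends.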
Results
What was gained?
• Responsive: InfoSight is updated in near real-time
• Reduced lag gives customers greater confidence
• Allows automated support (problems can be remediated sooner: outages are prevented)
• Resilient: InfoSight is more reliable
• Microservice isolation means the system degrades instead of failing completely.
• Elastic: Containerization and Scale-out technologies
• The system can easily be scaled to account for growth and new processing
• Message Driven: Well defined boundaries and client managed message consumption
• Isolated components are more easily understood.
• New components can be added without changing the underlying architecture.
Bottom Line
• Customer satisfaction is increased
• HPE costs are decreased
Futures
Data Platform
Goals
• Allow the on-boarding of a disparate set of product lines quickly and efficiently
• Data-lake to share data across internal organizations
• Support exploratory analytics – Data democracy
• Access Control and Multitenancy (RBAC)
• Uniform Data Access API
• Support for ML workflows
Data Platform
• Scala
• Kubernetes
• S3
• Delta-Lake
• Spark
• Akka
• Kafka
• Cassandra/Elasticsearch/PostgreSQL
Are you ready?
HPE is Hiring
Click on the links below to see job descriptions – note: the URLs are subject to change.
For the latest information, visit https://careers.hpe.com
https://careers.hpe.com/job/Hewlett-Packard-Enterprise-Andover-Massachusetts/91589424
https://careers.hpe.com/job/Hewlett-Packard-Enterprise-Andover-Massachusetts/91589425
https://careers.hpe.com/job/Hewlett-Packard-Enterprise-Andover-Massachusetts/91589426
https://careers.hpe.com/job/Hewlett-Packard-Enterprise-Andover-Massachusetts/91589427
https://careers.hpe.com/job/Hewlett-Packard-Enterprise-Andover-Massachusetts/91589423
You may apply through the career site or send resumes directly to victor.volpe@hpe.com

Lessons From HPE: From Batch To Streaming For 20 Billion Sensors With Lightbend Platform


Editor's Notes

  • #4: 3PARs are high end storage arrays
  • #7: Batch failures are a “very large quanta” problem. A failure in a batch stage requires reprocessing of that very large quanta. Push needs to copy the Vertica index and pull all historical data, plus all newly processed data, into a new index. Spark monitoring is poor because much of it sits behind Yarn. We also have no ssh-level access to the HDP nodes.
  • #8: Streaming failures are a “small quanta” problem. Recovery requires reprocessing only that small quanta. Streaming push sends new data as it becomes available. There is no “dump truck” of data that needs to be indexed, which leads to a more available and stable Elasticsearch cluster. There is no index rolling and no copying of data between an old index and a new index.
  • #10: Reactive Architecture arose from the rejection of a model that tried to disguise remote communication as local, e.g. RMI, CORBA, DCE. Reactive Architecture fully accepts the realities of distributed systems by never treating anything as local. There are no synchronous messages. Everything is location transparent and assumed to be remote. Failures do occur, and the system must still remain as available and as functional as possible.
  • #11: Akka supports multiple processing paradigms including streaming. Lagom is an opinionated API framework built on top of Akka.
  • #14: Kafka provides a centralized message bus comprising many channels (topics). Many more data sources are adapted to Kafka by Akka Streaming/Lagom: iBase, BL/WL, CFST (crash file search tool), DSPN Listener. We are using Cassandra for all stateful Akka-based micro-services, but other than storeserv_lagom, they use it only for state persistence. For storeserv_lagom it also provides “random access” to the raw files that also live in Kafka, and it stores perform-deltas, which are too “large” for Elasticsearch.
  • #20: Near real-time analytics allows us to fix customer problems before they result in outages