Webinar | June 24, 2019
Adam Frank
Senior Product Manager
Mick Miller
Senior DevOps Architect
How KeyBank Liberated its IT Ops
from Rules-Based Event Management
Breaking the
Rules
15 States
1,200+ Branches
1,500+ ATMs
20,000 Employees
$135B Assets
$5B Revenue
2 Datacenters
Systems have become more…
IT System
Complexity
• Modular
• Distributed
• Dynamic
• Ephemeral
What is
Driving Digital
Transformation?
• Increased demand
• Increasing change velocity
• Customer expectation
• Customer mobility
• Customer choice
What is Driving
Change Velocity?
• Expansion of digital services
• Emergence of containers
• High availability architectures
• Volume: 100k+ logins per
second, etc.
Increased monitoring breaks down
legacy approaches…
• Increasing staff does not scale with the
rate of data ingestion
• Legacy systems do not learn
Keeping
Customers…
…and attracting new ones through
improved customer experience (Cx)
• Near 100% uptime has become
expected
• Restoration of services is measured
in seconds, not hours
• Capturing click-level events to
discover how customers are using
your systems
• Continuous delivery
The Weight
• Legacy rules-based filtering (if,
then, else, etc.) won’t scale with
exponential growth
• Too many interdependencies
between complex systems and
rules supporting the telemetry
Legacy Monitoring Can’t Scale
Obsolescence: Planned
and Unplanned
• Software/Hardware: at the core of
ideas, which change as we advance
information/data/technology
• Languages: Over 25 languages in 60
years (1948–2009)
• Data: Flat files -> ISAM -> Relational ->
No-SQL -> Clusters -> etc.
• Software: ad hoc -> Structured
programming -> Object -> Functional
-> etc.
• IT Operations: ad hoc -> ITIL v1-3 ->
ITIL v4 -> DevOps -> etc.
• And on, and on …
The Only
Constant is
Change
If you don’t like change, you are
really going to hate extinction.
New Strategy Required for IT System
Monitoring
[Figure: monitoring maturity model for IT operations, based on the StackState monitoring maturity model. Vertical axis: visibility and intelligence. Horizontal axis: maturity level, from level 1 (individual component monitoring) through level 2 (full-breadth monitoring) and level 3 (end-to-end monitoring and correlation) to level 4 (AIOps). Labels: reactive, proactive, and predictive monitoring; rules-based vs. AIOps.]
• As the number of systems increases,
so does the volume of data. The
number of rules then increases as
well, and complexity grows
exponentially or faster.
• A growing rule set becomes
unpredictable and untestable.
This Parrot Is
No More
Rules-based event correlation is
past its time
This Parrot Is No More
• Multiple rules interacting is a factorial problem:
n! = n × (n−1)!
o 5 rules: 5! = 120 possible orderings
o 6 rules: 6! = 720 possible orderings
o 10 rules: 10! = 3,628,800 possible orderings
o 100 rules: 100! ≈ 9.33 × 10^157
(a 158-digit number) possible
orderings
• While easy to understand and implement,
rules-based monitoring implodes at enterprise
scale as complexity increases
Rules-based event correlation is past its time
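The factorial figures on this slide can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name `rule_orderings` is made up for illustration and is not part of any product.

```python
import math

def rule_orderings(n: int) -> int:
    """Number of distinct orders in which n rules can be evaluated (n!)."""
    return math.factorial(n)

for n in (5, 6, 10):
    print(f"{n} rules -> {rule_orderings(n):,} orderings")

# 100! is too long to read comfortably, so report its size instead.
digits = len(str(rule_orderings(100)))
print(f"100 rules -> a {digits}-digit number of orderings")
```

Python integers have arbitrary precision, so the value for 100 rules is computed exactly rather than as a floating-point approximation.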
n n!
0 1
1 1
2 2
3 6
4 24
5 120
6 720
7 5,040
8 40,320
9 362,880
10 3,628,800
11 39,916,800
12 479,001,600
13 6,227,020,800
14 87,178,291,200
15 1,307,674,368,000
16 20,922,789,888,000
17 355,687,428,096,000
18 6,402,373,705,728,000
19 121,645,100,408,832,000
20 2,432,902,008,176,640,000
21 51,090,942,171,709,440,000
22 1,124,000,727,777,607,680,000
23 25,852,016,738,884,976,640,000
24 620,448,401,733,239,439,360,000
25 15,511,210,043,330,985,984,000,000
26 403,291,461,126,605,635,584,000,000
27 10,888,869,450,418,352,160,768,000,000
Growth in Rule Interactions Is
Not Linear
• Trying to understand and
test all the relationships
between rules is not
possible.
• Computer scientists compare this
to “NP-complete” problems (no
efficient general algorithm is
known, so large cases are out of
reach of current compute capability)
• Virtually impossible to
understand the effect of alert
exceptions in a collection
of rules, even at 10 rules.
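To make the testing problem concrete, here is a small sketch in which three rules give different results depending on the order they run in. The rules, field names, and sample alert are all hypothetical and chosen only for illustration; they are not taken from any real rule set.

```python
from itertools import permutations

# Hypothetical rules: each takes an alert dict and returns a (possibly) modified copy.
def escalate_db(alert):
    if alert["source"] == "db" and alert["severity"] >= 3:
        return {**alert, "severity": 5}
    return alert

def suppress_maintenance(alert):
    if alert["in_maintenance"]:
        return {**alert, "severity": 0}
    return alert

def page_on_critical(alert):
    if alert["severity"] >= 5:
        return {**alert, "paged": True}
    return alert

RULES = [escalate_db, suppress_maintenance, page_on_critical]

def run(rules, alert):
    """Apply the rules one after another, in the given order."""
    for rule in rules:
        alert = rule(alert)
    return alert

alert = {"source": "db", "severity": 3, "in_maintenance": True, "paged": False}

# Try every ordering of the rules and collect the distinct end states.
outcomes = set()
for order in permutations(RULES):
    result = run(order, alert)
    outcomes.add((result["severity"], result["paged"]))

print(f"{len(outcomes)} distinct outcomes from {len(RULES)} rules: {sorted(outcomes)}")
```

With only three rules, one ordering pages an engineer during a maintenance window and the other five do not. Finding such cases by exhaustive testing means checking every ordering, which stops being practical after a handful of rules.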
You don’t know
what you don’t
know
• You can’t predict unusual events (events
that no rule was written to catch)
• Rules-based approaches need to change
to AIOps
• ML and AI: all event data can be
processed
• Modern AIOps uses algorithms to identify
when something is unusual
In data science a “black swan event” is
something you can’t predict.
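As an illustration of what "identify when something is unusual" can mean without writing a rule per event, here is a minimal sketch using a trailing-window z-score. This is a generic statistical technique chosen for the example, not a description of any vendor's algorithm; the function name, window size, threshold, and sample data are all assumptions.

```python
from statistics import mean, stdev

def find_unusual(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the mean of the preceding
    `window` values by more than `threshold` standard deviations."""
    unusual = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # Flat history: any change at all is unusual.
            if values[i] != mu:
                unusual.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            unusual.append(i)
    return unusual

# Hypothetical logins-per-second samples with one spike.
logins_per_sec = [100, 102, 98, 101, 99, 100, 103, 97, 400, 101]
print(find_unusual(logins_per_sec))
```

The baseline is learned from the data itself, so nobody has to decide in advance that "400 logins per second" is the threshold worth alerting on.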
Whodunit?
• Rules-based approaches cannot
decide on root cause of system
failures
• Random nature of real-world failures
in highly distributed systems can
have multiple root causes
• Unlike rules-based systems, AIOps
platforms have built-in learning
models. You don’t need to
constantly add new rules
Root cause probability
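One simple way a "root cause probability" could be learned from operator feedback, rather than hard-coded in rules, is a smoothed relative frequency. The sketch below is an assumption made for illustration only: the function name, the feedback format, and the alert types are hypothetical, and real products may work quite differently.

```python
from collections import Counter

def root_cause_probabilities(feedback, alerts):
    """Score each alert type in an incident by how often operators previously
    marked that type as the root cause (Laplace-smoothed), then normalize
    the scores so they sum to 1."""
    marked = Counter(kind for kind, was_root in feedback if was_root)
    seen = Counter(kind for kind, _ in feedback)
    scores = {a: (marked[a] + 1) / (seen[a] + 2) for a in alerts}
    total = sum(scores.values())
    return {a: s / total for a, s in scores.items()}

# Hypothetical history: (alert type, did the operator mark it as root cause?)
feedback = [
    ("disk_full", True), ("disk_full", True), ("disk_full", False),
    ("http_5xx", False), ("http_5xx", False), ("http_5xx", False),
]

probs = root_cause_probabilities(feedback, ["disk_full", "http_5xx", "link_down"])
for kind, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{kind}: {p:.2f}")
```

The smoothing means an alert type that has never been seen before (`link_down` here) still gets a middling score instead of zero, and the ranking shifts as more feedback arrives without anyone editing a rule.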
Take the red pill…
• Deceptively simple
• Expensive
• Unpredictable
• Undecidable
Rules-based systems cannot meet
the demands of complex
distributed computing
Take the red pill…
• Processes all events
• Does not separate data from systems
• Algorithms are deterministic
• Algorithms don’t care about order
• Single algorithm can replace
hundreds of rules
AIOps (AI and ML) liberates IT
from the limitations of
rules-based systems
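The points about determinism and order can be illustrated with a toy clustering function: it groups events that arrive close together in time, and it returns the same groups however the input is shuffled. This is a deliberately simple stand-in, not the algorithm any AIOps product actually uses; the function name, the 60-second gap, and the sample events are assumptions.

```python
import random

def cluster_by_time(events, gap=60):
    """Group events whose timestamp is within `gap` seconds of the previous
    event in the group. Events are sorted first, so input order is irrelevant."""
    ordered = sorted(events, key=lambda e: (e["ts"], e["host"]))
    clusters = []
    for event in ordered:
        if clusters and event["ts"] - clusters[-1][-1]["ts"] <= gap:
            clusters[-1].append(event)
        else:
            clusters.append([event])
    return clusters

# Hypothetical events: timestamp in seconds and originating host.
events = [
    {"ts": 0, "host": "web-1"},
    {"ts": 20, "host": "db-1"},
    {"ts": 45, "host": "web-2"},
    {"ts": 600, "host": "cache-1"},
    {"ts": 630, "host": "web-1"},
]

baseline = cluster_by_time(events)

# Shuffling the input does not change the result.
shuffled = events[:]
random.shuffle(shuffled)
same_result = cluster_by_time(shuffled) == baseline

print([[e["host"] for e in c] for c in baseline], same_result)
```

One function with a single parameter covers every host and event type here, where a rules-based setup would typically need a correlation rule per combination.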
Start Today
• Now is the time to start your AIOps journey
• Move beyond legacy rules-based systems
• Start using the modern machine learning of
AIOps to deliver continuous service
assurance to your enterprise
Get Started by Reading the AIOps Manifesto:
• https://www.aiops-exchange.org/wp-content/uploads/2019/05/aiops-manifesto.pdf
• https://www.moogsoft.com/resources/aiops/ebook/aiops-liberates-it
DEMO Moogsoft AIOps
Continuous Service Delivery, Optimal Business Agility
TIME: Quickly focus on and resolve the most critical issues, at scale
COST: Improve your economics by making teams faster, smarter, and more productive
RISK: Get real-time visibility into your existing data sources, tools and workflow
All data, any scale
Purpose-Built AI for IT and DevOps
Moogsoft Is the Platform for Agile and Proactive Event
Resolution Workflow
• Industrialized data ingestion from multiple sources
• Proactively and automatically detects incidents and probable root causes (reduces MTTD)
• Triggers automation to restore services
• Predictive insights (reduces support escalations and MTTR)
• Enables collaborative workflows (reduces MTTR and adverse business impact)
• Automatically resolves signals from alert noise
Early detection, fewer tickets, reduced MTTR
[Diagram: AIOps at the center of existing tools. Event sources: application, network, infrastructure, custom alerts, logs/syslogs, synthetic, and cloud (labels include AMW, ESX, NetApp, AWS X-Ray). Outputs: notifications, incidents (incident cases), diagnostics, runbook automation (custom scripts, existing runbooks), orchestration (NSO), continuous deployment, known remediation.]
Seamless Integration With Your Existing Tools
and Workflows
Mick Miller
mick_miller@keybank.com
Adam Frank
adam.frank@moogsoft.com
Q & A
