Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
Oracle Autonomous Database
Sandesh Rao
VP - Autonomous Health & Machine Learning
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle’s products remains at the sole discretion of Oracle.
Confidential – Oracle Restricted
Theme
1. Tools or features which provide some function
2. Automation around some of these tools or features
3. Components or products which use machine learning to solve some use-cases
4. Additional ML tools which can be used on 1, 2, or the results of 3 to develop different outcomes
– People who know data science
– People who want to use it – prebuilt models
Agenda
1. Journey to Autonomous Database
2. Machine learning basics & use cases
Oracle’s Vision for Autonomous Database
• Self-Driving
– User defines service levels, database makes them happen
• Self-Securing
– Protection from both external attacks and malicious internal users
• Self-Repairing
– Automated protection from all downtime
Oracle Database 9i, 10g
• Automatic Storage Management (ASM)
• Automatic Memory Management
• Automatic DB Diagnostic Monitor (ADDM)
• Automatic Workload Repository (AWR)
• Automatic Undo tablespaces
• Automatic Segment Space Management
• Automatic Statistics Gathering
• Automatic Standby Management (Broker)
• Automatic Query Rewrite
Oracle Database 11g, 12c
• Automatic SQL Tuning
• Automatic Workload Replay
• Automatic Capture of SQL Monitor
• Automatic Data Optimization
• Automatic Storage Indexes
• Automatic Columnar Cache
• Automatic Diagnostic Framework
• Automatic Refresh of Database Cloning
• Autonomous Health Framework
Journey to Autonomous Database
• Oracle has been developing sophisticated database automation for decades
Database Operations Runtime Management
• Solving these challenges requires a holistic approach
– Prevent problems and optimize solutions in real-time
– Recover from failures and identify root cause quickly with minimal intervention
• Human reactions come too late and do not scale
• Manual triage and floods of notifications do not scale
• Applied machine learning techniques respond effectively in real time and
without huge impact to operations
Prevention and Recovery Pillars
Journey to Autonomous Database
• Cloud enables Oracle to deliver a Fully Autonomous Database
– Expanded Database Automation
– Integrated with complete infrastructure automation
– With additional automation for operations, HA, security, etc.
One Autonomous Database – Optimized by Use Case
[Figure: Oracle Autonomous Database by use case: Enterprise OLTP and Mixed Workloads; Data Warehousing; Departments and Developers. Timeline labels: 2017, 2018, Now]
Autonomous Database Cloud For Data Warehouse
• Easy
– Automatically optimizes Analytic workloads
– Simply “load and go”
– Database tunes itself - No need to define indexes, partitions, materialized views, etc.
– Works with any BI analytics tool
• Fast
– Based on Exadata technology
– Performance matches or exceeds most hand-tuned Data Warehouses
• Elastic
– Instant scaling of compute or storage with no downtime
– Pay for compute only when in use
Expected CY 2017
Autonomous Database Cloud For OLTP or Mixed Workloads
Expected CY 2018
• Easy
– Configured for Mission Critical workloads
• Full Maximum Availability Architecture with scale-out clustering and disaster recovery
– Or Configured for Low Cost
• Single server for non-critical workloads or test/dev
• Fast
– Based on Exadata technology
• Elastic
– Instant scaling of compute or storage with no downtime
Full End-to-End Automation
• Must handle a large number of tasks
– Provisioning complex scale-out clusters with disaster recovery
– Patching, upgrading, and backing up online
– Monitoring, scaling, diagnosing performance, tuning, optimizing
– Testing and change management of complex applications and workloads
– Automatically handling failures and errors
• Autonomous Database brings difficult trade-offs:
– Best Performance vs. Consistent Performance
– Simplicity vs. Completeness
Autonomous Health via Machine Learning
Journey to Autonomous Database Cloud
Timeline labels: 2014, 2015, 2016, 2017, 2018+
• Real-time Health Monitoring of compliance, performance, availability & capacity
• Automated analysis & Anomaly detection
• Automated & targeted diagnostic collections (50+ top areas & growing)
• Automated Health Checks
• Log masking, reduction & diagnostic collections
• Automated repair
• Automated log lifecycle management
• Preemptive fault prediction & correction
• Automated environment correlation for fault prioritization & flood control
• Automated workload forecasting
• Integration of database support tools
Machine Learning Use Cases
1. Machine learning basics
2. Log reduction & Anomaly timeline
3. Maintenance slot identification
4. Detect Performance Problems
5. Problem Signatures from Event Paths
6. Discover duplicate bugs and correlated issues
3 Key Areas of Machine Learning
• Analytics
– Knowledge discovery
• Machine Learning
– Learn & get better from experience
• Artificial Intelligence
– Simulate human intelligence
Examples of Machine Learning Problem Types
• Classifiers
– Predict a label (classification)
– Example: Classify if a particular log entry is normal or not
• Regression
– Predict a value
– Example: Predict when a system will run out of memory
• Clustering
– Form groups by discovering recurring patterns
– Example: Group incidents into collections of similar ones that share some common attributes
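The regression example above ("predict when a system will run out of memory") can be sketched with an ordinary least-squares line fit. This is a minimal illustration, not the method used by any Oracle tool: the function names, the sample data, and the assumption that memory grows roughly linearly are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until_exhaustion(hours, used_mb, capacity_mb):
    """Extrapolate the fitted line to capacity_mb.

    Returns the number of hours after the last sample at which the fitted
    line reaches capacity, or None if usage is not growing.
    """
    slope, intercept = fit_line(hours, used_mb)
    if slope <= 0:
        return None
    hit_time = (capacity_mb - intercept) / slope
    return max(0.0, hit_time - hours[-1])


# Hypothetical samples: memory grows by 100 MB per hour starting at 1000 MB.
hours = [0, 1, 2, 3, 4]
used_mb = [1000, 1100, 1200, 1300, 1400]
remaining = hours_until_exhaustion(hours, used_mb, capacity_mb=2000)
print(remaining)
```

A real predictor would need to handle noisy, non-linear growth; the point here is only that regression outputs a value (time to exhaustion) rather than a label.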
Machine Learning Categories
• Supervised Learning
– Predict future outcomes with the help of training data provided by human experts
• Semi-Supervised Learning
– Discover patterns within raw data and make predictions, which are then reviewed by human experts, who provide feedback which is used to improve the model accuracy
• Unsupervised Learning
– Find patterns without any external input other than the raw data
• Reinforcement Learning
– Take decisions based on past rewards for this type of action
Autonomous Health Platform ML Technologies
Real-time Prevention
• Data Ingestion
– Kernel Smoothing and Moving Average
– Interpolation and Imputation
• Prediction and Pattern Recognition
– Multivariate and Auto-Associative Regression
– Clustering, Similarity Operators and Bayes Networks
• Fault and Anomaly Detection
– Sequential Probability Ratio Tests
– Conditional Probability Filters & Hidden Markov Models
• Prognosis and Diagnosis
– Bayesian Belief Networks and Probabilistic Inference
– Remaining Useful Life Regression and GPM Models
Rapid Recovery
• Data Ingestion
– ELK
– Lucene
• Prediction and Pattern Recognition
– TF-IDF and Bag-of-Words modelling
– Sequence Matcher
– K-nearest Neighbour
• Fault and Anomaly Detection
– Decision Trees and Random Forest
– Sequential Pattern Mining
• Prognosis and Diagnosis
– Recurrent Neural Network
– Long Short-Term Memory Predictive Analysis
Log reduction & Anomaly timeline
Remove the noise from thousands of log
events and metrics to identify key events
revealing what happened, in what order
and why
[Figure: nine-step log anomaly detection pipeline with Training and Real-time phases. Steps shown: Log Cleansing; Entry Feature Creation; Entry Clustering; Model Generation; Knowledge Base Creation; Knowledge Base Indexing; Log File Processing; Anomaly Detection; Timestamp Correlation & Ranking. Inputs shown: Expert Input, Feedback, Batch Feedback]
Anomaly Detection – High Level
[Figure: a log collection is split by file type (File Type 1, File Type 2, ... File Type n). Each log file is filtered into known normal log entries (discard) and probable anomalous lines (collect). The probable anomalies feed the Anomaly Timeline]
ML & Statistical Algorithms & Concepts Used
• Log entry features are used to build a decision tree of good / bad entries
• Entropy algorithm used for feature clustering
• SequenceMatcher library used for log entry clustering
• Expert input used in a feedback loop with ML output
• Functional rules defined for initial good / bad mapping, then feedback only required for results with a standard deviation beyond ±2
• Features are extracted from log entries and used for good / bad modeling
• TF-IDF used for weighted knowledge matching & performance
Techniques: Bag of words, Semi-supervised ML, Decision Tree, K-Nearest Neighbour
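To make the SequenceMatcher bullet concrete, here is a minimal sketch of similarity-based log entry clustering. It assumes the slide refers to Python's `difflib.SequenceMatcher`; the greedy first-match strategy, the 0.8 threshold, and the sample log lines are hypothetical choices for illustration, not the production algorithm.

```python
from difflib import SequenceMatcher


def cluster_entries(entries, threshold=0.8):
    """Greedy clustering of log lines.

    Each entry joins the first cluster whose representative (its first
    member) is at least `threshold` similar; otherwise it starts a new one.
    """
    clusters = []
    for entry in entries:
        for cluster in clusters:
            if SequenceMatcher(None, cluster[0], entry).ratio() >= threshold:
                cluster.append(entry)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([entry])
    return clusters


# Hypothetical log lines: three differ only in the node name.
logs = [
    "Connection to node rac01 established",
    "Connection to node rac02 established",
    "Voting file /dev/disk1 is offline",
    "Connection to node rac03 established",
]
clusters = cluster_entries(logs)
for cluster in clusters:
    print(len(cluster), cluster[0])
```

Lines that differ only in variable fields (node names, PIDs, timestamps) collapse into one cluster, which is what lets frequent, known-normal entries be discarded in bulk.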
Autonomous Health Analysis - Ex: Trace File Analyzer
[Figure: Trace File Analyzer screenshot with auto recommendation]
Autonomous Health – TFA Anomaly Timeline
Maintenance slot identification
Find the next best window of time
maintenance can be performed
with minimal service impact
Maintenance Slot Identification
• Input
– 30 days of hourly average active session data
• Approach
– Calculate periodicity using pandas dataframe
– Apply log transformation to penalize higher values more than smaller ones
– Trend using varying mean over time, applying a convolution filter
– Calculate season component as average for each period
• Output
– Seasonal decomposition showing the cyclic nature of the seasonal component, which helps identify relative usage highs and lows
Maintenance slot identification
• Use case
– Identify appropriate maintenance window for performing maintenance activity based
on historical workload patterns.
• Inputs (Training Data)
– Average Active Sessions (AAS) in sliding window format. The metric is important because it is the best representation of your database system load. Preferably use the last 30 days of data points before making the prediction.
• AAS = (DB Time / Elapsed Time)
• In other words, AAS is a time-normalized DB Time
• From DB Tables :
– V$ACTIVE_SESSION_HISTORY => COUNT(*) = DB Time in seconds {Cyclic buffer ~4 Hours}
– DBA_HIST_ACTIVE_SESS_HISTORY => 10 * (COUNT(*)) = DB Time in seconds {Since one in 10 samples}
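The AAS formula above is simple arithmetic once the sample count is known. The sketch below assumes, as the slide states, that each V$ACTIVE_SESSION_HISTORY row stands for 1 second of DB Time and each DBA_HIST_ACTIVE_SESS_HISTORY row for 10 seconds. The function name and the sample counts are hypothetical.

```python
def average_active_sessions(sample_count, elapsed_seconds, seconds_per_sample=1):
    """AAS = DB Time / Elapsed Time.

    seconds_per_sample is 1 for V$ACTIVE_SESSION_HISTORY rows and 10 for
    DBA_HIST_ACTIVE_SESS_HISTORY rows (one in 10 samples is retained).
    """
    db_time_seconds = sample_count * seconds_per_sample
    return db_time_seconds / elapsed_seconds


# Hypothetical: 7200 in-memory ASH rows over one hour -> 2 sessions active on average.
print(average_active_sessions(7200, 3600))

# Hypothetical: 3112 DBA_HIST rows over one hour -> 31120 s of DB Time.
print(round(average_active_sessions(3112, 3600, seconds_per_sample=10), 2))
```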
Maintenance slot identification
• Implementation
– Depending on the size of the maintenance window slot, we choose the granularity of the data considered and the length of the prediction block
– A transformation (log, square root, or cube root) is applied to the time series to reduce trend
– Seasonality of the time series is extracted
• Fourier Transformation helps in the identification of seasonality period (periodogram)
• Seasonal decomposition helps identify the nature of seasonality (curve)
• The cycle identified from this seasonality helps in identifying what are low and high workload periods
in the cycles.
• We use this information to identify the next best maintenance window.
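The periodogram step can be sketched as follows: compute the discrete Fourier transform of the mean-centered series and pick the frequency with the most power. This is an illustrative, dependency-free version (a direct O(n²) DFT); the function name and the synthetic hourly load are assumptions, and a real implementation would more likely use an FFT library.

```python
import cmath
import math


def dominant_period(series):
    """Return the period (in samples) with the largest periodogram power."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # removes the zero-frequency term
    best_k, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        # DFT coefficient at frequency k cycles per n samples.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2 / n
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Hypothetical hourly load with a daily cycle, sampled for 10 days.
hourly = [10 + 5 * math.sin(2 * math.pi * t / 24) for t in range(240)]
print(dominant_period(hourly))
```

With the seasonality period known (24 hourly samples here), the series can then be decomposed to see where in each cycle the load is low.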
Maintenance slot identification
• Seasonal Decomposition
– Using an observed time series extract a number of component series where each of
these has a certain characteristic or type of behavior.
– Time Series Decomposition
• Trend
– The trend component at time t, which reflects the long-term progression of the series (secular variation)
– A trend exists when there is a persistent increasing or decreasing direction in the data
• Seasonality
– The seasonal component at time t, reflecting seasonality (seasonal variation)
– Seasonality occurs over a fixed and known period (e.g., the quarter of the year, the month, or day of the
week)
• Residual
– The irregular component (or "noise") at time t, which describes random, irregular influences
– It represents the residuals or remainder of the time series after the other components have been removed.
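A minimal sketch of the decomposition described above, assuming an additive model on the log-transformed series: the trend is a centered moving average (a simple convolution filter), the seasonal component is the average detrended value at each position in the cycle, and the residual is what remains. The function name and synthetic data are hypothetical, and the deck does not say which library or exact filter the production tool uses.

```python
import math


def decompose(series, period):
    """Additive decomposition of log(series) into trend, seasonal, residual.

    Trend and residual are None where the moving-average window does not fit.
    """
    n = len(series)
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")
    logged = [math.log(x) for x in series]
    half = period // 2

    # Centered moving-average weights. For an even period the two end
    # points get half weight so the window stays symmetric.
    if period % 2 == 0:
        weights = [0.5] + [1.0] * (period - 1) + [0.5]
    else:
        weights = [1.0] * period
    weights = [w / period for w in weights]

    trend = [None] * n
    for i in range(half, n - half):
        window = logged[i - half:i + half + 1]
        trend[i] = sum(w * x for w, x in zip(weights, window))

    # Seasonal component: mean detrended value at each phase of the cycle,
    # shifted so the seasonal values average to zero over one period.
    buckets = [[] for _ in range(period)]
    for i, t in enumerate(trend):
        if t is not None:
            buckets[i % period].append(logged[i] - t)
    phase_means = [sum(b) / len(b) for b in buckets]
    offset = sum(phase_means) / period
    seasonal = [phase_means[i % period] - offset for i in range(n)]

    residual = [None if t is None else logged[i] - t - seasonal[i]
                for i, t in enumerate(trend)]
    return trend, seasonal, residual


# Hypothetical 3 days of hourly load with a daily cycle.
series = [math.exp(1 + 0.5 * math.sin(2 * math.pi * t / 24)) for t in range(72)]
trend, seasonal, residual = decompose(series, period=24)
quietest_hour = min(range(24), key=lambda h: seasonal[h])
print(quietest_hour)
```

The lowest seasonal value marks the point in the cycle where load is typically lightest, which is the candidate for maintenance.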
Maintenance Slot Identification
1. Original observation data
START_TIME            CNT
2018-04-11 15:00:00   290
2018-04-11 16:00:00   31120
2018-04-11 17:00:00   21530
2018-04-11 18:00:00   26240
2018-04-11 19:00:00   40520
2018-04-11 20:00:00   54270
2018-04-11 21:00:00   51460
2018-04-11 22:00:00   44310
2018-04-11 23:00:00   25690
2. Apply convolution filter & average
START_TIME
2018-04-11 15:00:00   5.669881
2018-04-11 16:00:00   10.345606
2018-04-11 17:00:00   9.977203
2018-04-11 18:00:00   10.175040
2018-04-11 19:00:00   10.609551
2018-04-11 20:00:00   10.901727
2018-04-11 21:00:00   10.848560
2018-04-11 22:00:00   10.698966
2018-04-11 23:00:00   10.153857
3. Calculate seasonality
START_TIME
2018-04-11 15:00:00   -0.226098
2018-04-11 16:00:00   -0.069821
2018-04-11 17:00:00   -0.350088
2018-04-11 18:00:00   -0.187483
2018-04-11 19:00:00   -0.513240
2018-04-11 20:00:00   0.019737
2018-04-11 21:00:00   0.059213
2018-04-11 22:00:00   -0.011312
2018-04-11 23:00:00   -0.179156
4. Use seasonality to predict best maintenance window
Current Date : 2018-05-12 15:00:00
Current Position in Seasonality : -0.22609829742533585
Best Maintenance Period in next Cycle : 2018-05-12 19:00:00
Worst Maintenance Period in next Cycle : 2018-05-13 08:00:00
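The final step, picking the best slot from the seasonal component, can be sketched as a search over the next cycle for the lowest seasonal value. The function name is hypothetical, and the profile below contains only the nine hourly seasonality values printed on the slide (15:00 to 23:00), so hours outside that range are ignored in this illustration.

```python
from datetime import datetime, timedelta


def next_best_slot(seasonal_by_hour, now, cycle_hours=24):
    """Return the datetime within the next cycle (starting at `now`)
    whose hour of day has the lowest seasonal value."""
    candidates = []
    for offset in range(cycle_hours):
        slot = now + timedelta(hours=offset)
        if slot.hour in seasonal_by_hour:
            candidates.append((seasonal_by_hour[slot.hour], slot))
    return min(candidates)[1]


# Seasonality values for 15:00 to 23:00 as shown on the slide.
profile = {
    15: -0.226098, 16: -0.069821, 17: -0.350088,
    18: -0.187483, 19: -0.513240, 20: 0.019737,
    21: 0.059213, 22: -0.011312, 23: -0.179156,
}
best = next_best_slot(profile, datetime(2018, 5, 12, 15, 0, 0))
print(best)
```

For these values the result agrees with the slide's "Best Maintenance Period in next Cycle". The worst period reported on the slide (08:00 the next day) falls outside the hours shown, so it cannot be reproduced from this partial profile.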
Anomaly Detection with OS and ASH Data
Detect performance
problems
Cluster Health Advisor – Applied Machine Learning
• Fault-data-driven model development
• Applies purpose-built ML for knowledge extraction
• Expert Dev team scrubs data
• Generates Bayesian Network-based diagnostic root-cause models
• Uses BN-based run-time models to perform real-time prognostics
Discovers Potential Cluster & DB Problems
[Figure: CHA model development flow. Elements: ASH, Scrub Data, CHA Dev Team, ML Knowledge Extraction, BN Models, Expert Supervision, CHA Runtime Model, Feedback, CHA]
Data Flow Overview
Cluster Health Advisor
Models Capture the Dynamic Behavior of all Normal Operation
Models Capture all Normal Operating Modes
[Figure: time-series chart (y-axis 0 to 40000; x-axis 10:00, 2:00, 6:00) of IOPS, user commits (/sec), log file parallel write (usec), and log file sync (usec) across three load phases]
• Release ships with conservative models to minimize false warnings
• A model captures the normal load phases and their statistics over time, and thus the characteristics for all load
intensities and profiles. During monitoring, any data point similar to one of the vectors is NORMAL.
• One could say that the model REMEMBERS the normal operational dynamics over time
In-Memory Reference Matrix (Part of “Normality” Model)
IOPS                      ####   2500    4900    800     ####
User Commits              ####   10000   21000   4400    ####
Log File Parallel Write   ####   2350    4100    22050   ####
Log File Sync             ####   5100    9025    4024    ####
…                         …      …       …       …       …
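The idea that "any data point similar to one of the vectors is NORMAL" can be illustrated with a nearest-reference check against the three columns of the matrix above. This is only a stand-in: the similarity measure (root-mean-square relative difference), the 0.15 tolerance, and the function names are assumptions, and the deck does not describe the similarity operator Cluster Health Advisor actually uses.

```python
import math

# Reference vectors from the matrix above, in the order
# (IOPS, user commits, log file parallel write, log file sync).
REFERENCE = [
    (2500, 10000, 2350, 5100),
    (4900, 21000, 4100, 9025),
    (800, 4400, 22050, 4024),
]


def relative_distance(observation, reference):
    """Root-mean-square of the per-metric relative differences."""
    squared = [((x - r) / r) ** 2 for x, r in zip(observation, reference)]
    return math.sqrt(sum(squared) / len(squared))


def is_normal(observation, references=REFERENCE, tolerance=0.15):
    """NORMAL if the observation is close to at least one remembered vector."""
    return any(relative_distance(observation, ref) <= tolerance
               for ref in references)


# Hypothetical observations.
print(is_normal((2600, 10200, 2300, 5000)))    # close to the first vector
print(is_normal((2500, 10000, 22050, 9025)))   # mixes phases; matches none
```

The second observation has individually plausible values, but the combination never occurred together in the model, which is the kind of case a per-metric threshold would miss.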
Problem Signatures from Event Paths
Identify a series of events as
connected and representing the
signature of a problem
Longest Common Subsequence of Anomalous Entries
1. Start by classifying a problem such as an important
ORA or CRS error
2. Find occurrences of the problem across many different
log files
3. Identify anomalous entries and lifecycle events in
chronological order within a predefined time window
around the occurrence of the problem in all the logs
– Time window depends on frequency of message logging
(e.g. 10 mins window for Clusterware)
4. Compare the repeating anomalous / lifecycle entries
to identify the longest common subsequence of
anomalous entries
– These represent the problem signature
– Sequence of events are correlated by component, log file, host &
thread
Find the Finite State Automaton (FSA)
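Step 4 relies on the classic longest common subsequence (LCS) algorithm. The sketch below is the standard dynamic-programming version applied to two sequences of event codes. The two example sequences are hypothetical orderings built from CRS codes that appear elsewhere in this deck; they are not real incident data.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of sequences a and b."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])

    # Walk back through the table to recover the subsequence itself.
    result = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]


# Hypothetical anomalous-entry sequences from two occurrences of a problem.
occurrence_1 = ["CRS-1649", "CRS-1615", "CRS-5008", "CRS-1656"]
occurrence_2 = ["CRS-1649", "CRS-1608", "CRS-1615", "CRS-1656"]
signature = longest_common_subsequence(occurrence_1, occurrence_2)
print(signature)
```

Entries that appear in only one occurrence drop out, leaving the ordered events common to both as the candidate problem signature.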
Improvement by constraining LCS identification
1. Constraints are applied over multiple features recorded during analysis:
• LCS must include the major checkpoints present in the knowledge base. Other non-fatal CRS and ORA log
signatures that generally appear in correlation with these fatal events must also be present.
• LCS are computed at two levels:
– Overall sequence of anomalous entries, i.e. in correlation with all components and products (sequential
knowledge IDs for ocssd, gipc, database alert log, ASM alert log, etc.)
– Sequence of anomalies specific to file type.
• We record categorization of log entries specific to host.
• Predefined knowledge of log structure also helps refine the entries down to thread level.
– E.g. Structure of Clusterware log file ([Timestamp]:[Component Name]:[Thread Id]:[Msg])
Example signatures and their analysis
• Sample Central Event : 2017-01-19 16:51:20.562 [OCSSD(24862)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /tools/list/grid/orabase/diag/crs/ur102ora3502c/crs/trace/ocssd.trc
Knowledge Id                      Sample Line (States in FSA for central event)
52CC1E8631FC2674E053B580E80AB08D  2016-10-16 21:22:36.520+CRS-5008: Invalid attribute value: en4 for the network interface
52CC1E8632082674E053B580E80AB08D  2016-10-16 21:25:11.516 [OCSSD(6816354)]CRS-1608: This node was evicted by node 3, rwsbs03; details at (:CSSNM00005:) in /u01/app/crsusr/diag/crs/rwsbs02/crs/trace/ocssd.trc.
52CC1E8632212674E053B580E80AB08D  2016-10-16 21:25:17.927 [OCSSD(18219406)]CRS-1654: Clean up of CRSD resources finished successfully.
52CC1E8631EC2674E053B580E80AB08D  2016-10-16 21:25:17.927 [OCSSD(18219406)]CRS-1655: CSSD on node rwsbs01 detected a problem and started to shutdown.
52CC1E8632272674E053B580E80AB08D  2016-10-16 21:25:19.431 [OCSSD(18219406)]CRS-8503: Oracle Clusterware process OCSSD with operating system process ID 18219406 experienced fatal signal or exception code 6.
52CC1E8632202674E053B580E80AB08D  2016-10-16 21:25:21.788 [CRSD(44696012)]CRS-0805: Cluster Ready Service aborted due to failure to communicate with Cluster Synchronization Service with error [3]. Details at (:CRSD00109:) in /u01/app/crsusr/diag/crs/rwsbs01/crs/trace/crsd.trc.
52CC1E86208C2674E053B580E80AB08D  2016-10-18 02:02:00.835 : CSSD:6684: (:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
52CC1E861F132674E053B580E80AB08D  CLSB:6684: Oracle Clusterware infrastructure error in OCSSD (OS PID 12452524): Fatal signal 6 has occurred in program ocssd thread 6684; nested signal count is 1
52CC1E861E552674E053B580E80AB08D  Incident 393 created, dump file: /u01/app/crsusr/diag/crs/rwsbs02/crs/incident/incdir_393/ocssd_i393.trc
52CC1E861F332674E053B580E80AB08D  2016-10-18 02:02:07.113 : SKGFD:5655: ERROR: -9(Error 27041, OS Error (IBM AIX RISC System/6000 Error: 47: Write-protected media
52CC1E86207C2674E053B580E80AB08D  2016-10-18 02:02:07.774 : CSSD:5655: clssnmvDiskCreate: Cluster guid ea34893b9442ef79ff642d70699aff9d found in voting disk /dev/rbs01_100G_asm1 does not match with the cluster guid 7b63590c34fa5f44bf6944aefa4ee85d obtained from the GPnP profile
52CC1E863DB82674E053B580E80AB08D  2017-01-19 16:48:01.057 [OCSSD(24862)]CRS-1649: An I/O error occurred for voting file: /dev/rdsk/c1d16; details at (:CSSNM00059:) in /tools/list/grid/orabase/diag/crs/ur102ora3502c/crs/trace/ocssd.trc.
52CC1E863DBC2674E053B580E80AB08D  2017-01-19 16:49:40.550 [OCSSD(24862)]CRS-1615: No I/O has completed after 50% of the maximum interval. Voting file /dev/rdsk/c1d16 will be considered not functional in 99508 milliseconds
Example signatures and their analysis
[Figure: timeline of knowledge IDs spanning 5 minutes before the central event to 5 minutes after it. IDs shown, in slide order:]
52CC1E8631FC2674E053B580E80AB08D
52CC1E86207C2674E053B580E80AB08D
52CC1E861F332674E053B580E80AB08D
52CC1E861E552674E053B580E80AB08D
52CC1E861F132674E053B580E80AB08D
52CC1E86208C2674E053B580E80AB08D
52CC1E8632202674E053B580E80AB08D
52CC1E8632272674E053B580E80AB08D
52CC1E8631EC2674E053B580E80AB08D
52CC1E8632212674E053B580E80AB08D
52CC1E8632082674E053B580E80AB08D
52CC1E863DBC2674E053B580E80AB08D
52CC1E863DB82674E053B580E80AB08D
52CC1E86722C2674E053B580E80AB08D
Generalizing event signatures over the scope of bug
• BugSignature Repository
– Node Eviction bug 243645 Timeline: Event Signatures 35, 3435, 494, 3948, 292, 434933
– Node Eviction bug 2747747 Timeline: Event Signatures 3434, 3435, 4344, 3048, 202, 434983
• New Signature
– Event Signatures 35, 3435, 3048, 3948, 292, 434933
– Check for weighted probabilistic match
Bug Duplicate Identification
Discovers Duplicate Bugs,
Correlated Issues and Prioritizes
Based Upon Customer Impact
Applied Machine Learning – Adaptive Bug Search
• ABS is internally offered in BugDB and MOS
GUI for Dev and Support teams
• ABS helps find problems in the same space
• Allows engineers to get the full context of
past known problems
• Provides debugging clues to help diagnose the
reported problem
• Identifies developers who worked in this
space in the past
Discovers Duplicate Bugs and Correlated Issues
Applied Machine Learning – Adaptive Bug Search
• Bugs are submitted from over 400 Oracle
products
• Performs ML Logistic Regression on
training set of bugs to generate model
• Displays up to 8 possible duplicates per
bug or SR
• Feedback improves model accuracy
– Direct from developers
– Indirect from bug updates
Discovers Duplicate Bugs and Correlated Issues
[Figure: ABS model flow. Elements: Bugs, Scrub Data, ABS Dev Team, ML Logistic Regression Model Generation, Expert Supervision, ABS Runtime Model, Dev Feedback, Bug Submission, ABS Service, Bug and Duplicates Together]
Adaptive Bug Search (ABS) – High Level Flow
• Issues parsed into different features
– Error stack, Trace data, Problem description, etc.
• Issues represented as a cluster of features
– i.e. All bugs in a bug tree contribute towards the feature set
• Logistic Regression applied to build a model
– Model defines the significance of each feature
• Similarity between issues computed using the model
– Identifies the root of the cluster (aka bug tree)
• Feedback used to improve the model
– Feedback is automatically derived based on how the bug gets closed
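The flow above (features per issue, logistic regression to weight them, similarity scored with the model) can be sketched in a few lines. Everything concrete here is assumed for illustration: the three pairwise similarity features, the tiny training set, and the plain stochastic gradient descent trainer. The deck does not describe how ABS encodes features or trains its model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=2000, learning_rate=0.5):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            predicted = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = predicted - y
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def duplicate_probability(x, weights, bias):
    """Model score that a pair of issues are duplicates."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical pairwise similarities between two issues:
# (error stack, trace data, problem description), each in [0, 1].
pairs = [(0.9, 0.8, 0.7), (0.8, 0.9, 0.4), (0.1, 0.2, 0.6), (0.2, 0.1, 0.3)]
is_duplicate = [1, 1, 0, 0]

weights, bias = train(pairs, is_duplicate)
print(weights)  # learned significance of each feature
print(duplicate_probability((0.85, 0.85, 0.5), weights, bias))
print(duplicate_probability((0.10, 0.10, 0.5), weights, bias))
```

The learned weights play the role the slide describes: they define how much each feature contributes when deciding whether two issues belong to the same bug tree.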
Applied Machine Learning – ABS Bug Priority
• Analyze customer and internal bugs
submitted within past 5 years
• Performs ML Linear Regression on training
set of bugs to generate model
• Highlight bugs for Development Managers
to review
– Bugs affecting large numbers of customers
– Bugs with significant impact to high profile
customers
• Feedback improves model accuracy
– Direct from development manager
– Indirect from bug fixes
Discovers Bugs that have the most Customer Impact
[Figure: ABS Bug Priority model flow. Elements: Cust. Bugs, Scrub Data, ABS Dev Team, ML Linear Regression Model Generation, Expert Supervision, ABS BP Model, Dev Feedback, Customer files SR and results in a new Bug, ABS BP Service, Bug Priority with Score]
• Thank you for your feedback!!
• Please continue to reach out to us via
social media
Twitter @sandeshr
LinkedIn
https://www.linkedin.com/in/raosandesh/
Questions ?

Presentation: AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should know

Sandesh Rao
 
Analysis of Database Issues using AHF and Machine Learning v2 - SOUG
Sandesh Rao
 
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
Sandesh Rao
 
15 Troubleshooting tips and Tricks for Database 21c - KSAOUG
Sandesh Rao
 
Top 20 FAQs on the Autonomous Database
Sandesh Rao
 
How to Use EXAchk Effectively to Manage Exadata Environments
Sandesh Rao
 
TFA Collector - what can one do with it
Sandesh Rao
 
Introduction to Machine learning - DBA's to data scientists - Oct 2020 - OGBEmea
Sandesh Rao
 
Troubleshooting tips and tricks for Oracle Database Oct 2020
Sandesh Rao
 
20 tips and tricks with the Autonomous Database
Sandesh Rao
 
TFA, ORAchk and EXAchk 20.2 - What's new
Sandesh Rao
 
Troubleshooting Tips and Tricks for Database 19c ILOUG Feb 2020
Sandesh Rao
 
Troubleshooting Tips and Tricks for Database 19c - Sangam 2019
Sandesh Rao
 
Ad

Recently uploaded (20)

PPTX
From Sci-Fi to Reality: Exploring AI Evolution
Svetlana Meissner
 
PDF
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
PDF
UPDF - AI PDF Editor & Converter Key Features
DealFuel
 
PDF
Transcript: Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
PDF
“NPU IP Hardware Shaped Through Software and Use-case Analysis,” a Presentati...
Edge AI and Vision Alliance
 
PDF
What’s my job again? Slides from Mark Simos talk at 2025 Tampa BSides
Mark Simos
 
PDF
Kit-Works Team Study_20250627_한달만에만든사내서비스키링(양다윗).pdf
Wonjun Hwang
 
PPTX
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
PPTX
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
PDF
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
PDF
UiPath DevConnect 2025: Agentic Automation Community User Group Meeting
DianaGray10
 
PPTX
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
PPTX
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
PPT
Ericsson LTE presentation SEMINAR 2010.ppt
npat3
 
PDF
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
PPTX
Agentforce World Tour Toronto '25 - MCP with MuleSoft
Alexandra N. Martinez
 
PDF
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
PDF
NLJUG Speaker academy 2025 - first session
Bert Jan Schrijver
 
PPTX
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
PDF
NASA A Researcher’s Guide to International Space Station : Physical Sciences ...
Dr. PANKAJ DHUSSA
 
From Sci-Fi to Reality: Exploring AI Evolution
Svetlana Meissner
 
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
UPDF - AI PDF Editor & Converter Key Features
DealFuel
 
Transcript: Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
“NPU IP Hardware Shaped Through Software and Use-case Analysis,” a Presentati...
Edge AI and Vision Alliance
 
What’s my job again? Slides from Mark Simos talk at 2025 Tampa BSides
Mark Simos
 
Kit-Works Team Study_20250627_한달만에만든사내서비스키링(양다윗).pdf
Wonjun Hwang
 
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
UiPath DevConnect 2025: Agentic Automation Community User Group Meeting
DianaGray10
 
MuleSoft MCP Support (Model Context Protocol) and Use Case Demo
shyamraj55
 
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
Ericsson LTE presentation SEMINAR 2010.ppt
npat3
 
POV_ Why Enterprises Need to Find Value in ZERO.pdf
darshakparmar
 
Agentforce World Tour Toronto '25 - MCP with MuleSoft
Alexandra N. Martinez
 
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
NLJUG Speaker academy 2025 - first session
Bert Jan Schrijver
 
The Project Compass - GDG on Campus MSIT
dscmsitkol
 
NASA A Researcher’s Guide to International Space Station : Physical Sciences ...
Dr. PANKAJ DHUSSA
 

AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should know

  • 1. Copyright © 2017, Oracle and/or its affiliates. All rights reserved. | Oracle Autonomous Database Sandesh Rao VP - Autonomous Health & Machine Learning
  • 2. Safe Harbor Statement: The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  • 3. Theme: (1) Tools or features which provide some function. (2) Automation around some of these tools or features. (3) Components or products which use machine learning to solve some use cases. (4) Additional ML tools which can be used on 1, 2, or the results of 3 to develop different outcomes, for (a) people who know data science and (b) people who want to use it through prebuilt models.
  • 4. Agenda: (1) Journey to Autonomous Database. (2) Machine learning basics & use cases.
  • 5. Oracle’s Vision for Autonomous Database • Self-Driving – User defines service levels, database makes them happen • Self-Securing – Protection from both external attacks and malicious internal users • Self-Repairing – Automated protection from all downtime
  • 6. Journey to Autonomous Database • Oracle has been developing sophisticated database automation for decades
      Oracle Database 9i, 10g: Automatic Storage Management (ASM); Automatic Memory Management; Automatic DB Diagnostic Monitor (ADDM); Automatic Workload Repository (AWR); Automatic Undo tablespaces; Automatic Segment Space Management; Automatic Statistics Gathering; Automatic Standby Management (Broker); Automatic Query Rewrite
      Oracle Database 11g, 12c: Automatic SQL Tuning; Automatic Workload Replay; Automatic Capture of SQL Monitor; Automatic Data Optimization; Automatic Storage Indexes; Automatic Columnar Cache; Automatic Diagnostic Framework; Automatic Refresh of Database Cloning; Autonomous Health Framework
  • 7. Database Operations Runtime Management: Prevention and Recovery Pillars • Solving these challenges requires a holistic approach – Prevent problems and optimize solutions in real time – Recover from failures and identify root cause quickly with minimal intervention • Human reactions come too late and do not scale • Manual triage and floods of notifications do not scale • Applied machine learning techniques respond effectively in real time and without huge impact to operations
  • 8. Journey to Autonomous Database • Cloud enables Oracle to deliver a Fully Autonomous Database – Expanded Database Automation – Integrated with complete infrastructure automation – With additional automation for operations, HA, security, etc.
  • 9. One Autonomous Database – Optimized by Use Case [Figure: Oracle Autonomous Database offerings for Enterprise OLTP and Mixed Workloads, Data Warehousing, and Departments and Developers, on a 2017 / 2018 / Now timeline]
  • 10. Autonomous Database Cloud For Data Warehouse (Expected CY 2017) • Easy – Automatically optimizes analytic workloads – Simply “load and go” – Database tunes itself: no need to define indexes, partitions, materialized views, etc. – Works with any BI analytics tool • Fast – Based on Exadata technology – Performance matches or exceeds most hand-tuned data warehouses • Elastic – Instant scaling of compute or storage with no downtime – Pay for compute only when in use
  • 11. Autonomous Database Cloud For OLTP or Mixed Workloads (Expected CY 2018) • Easy – Configured for mission-critical workloads: full Maximum Availability Architecture with scale-out clustering and disaster recovery – Or configured for low cost: single server for non-critical workloads or test/dev • Fast – Based on Exadata technology • Elastic – Instant scaling of compute or storage with no downtime
  • 12. Full End-to-End Automation • Must handle a large number of tasks – Provisioning complex scale-out clusters with disaster recovery – Patching, upgrading, and backing up online – Monitoring, scaling, diagnosing performance, tuning, optimizing – Testing and change management of complex applications and workloads – Automatically handling failures and errors • Autonomous Database brings difficult trade-offs: – Best Performance vs. Consistent Performance – Simplicity vs. Completeness
  • 13. Autonomous Health via Machine Learning: Journey to Autonomous Database Cloud [Timeline figure spanning 2014, 2015, 2016, 2017, 2018+] • Real-time health monitoring of compliance, performance, availability & capacity • Automated analysis & anomaly detection • Automated & targeted diagnostic collections (50+ top areas & growing) • Automated health checks • Log masking, reduction & diagnostic collections • Automated repair • Automated log lifecycle management • Preemptive fault prediction & correction • Automated environment correlation for fault prioritization & flood control • Automated workload forecasting • Integration of database support tools
  • 14. Machine Learning Use Cases: (1) Machine learning basics (2) Log reduction & anomaly timeline (3) Maintenance slot identification (4) Detect performance problems (5) Problem signatures from event paths (6) Discover duplicate bugs and correlated issues
  • 15. 3 Key Areas of Machine Learning • Analytics: knowledge discovery • Machine Learning: learn & get better from experience • Artificial Intelligence: simulate human intelligence
  • 16. Examples of Machine Learning Problem Types • Classifiers: predict a label classification. Example: classify if a particular log entry is normal or not • Regression: predict a value. Example: predict when a system will run out of memory • Clustering: form groups by discovering recurring patterns. Example: group incidents into collections of similar ones that share some common attributes
  • 17. Machine Learning Categories • Supervised Learning: predict future outcomes with the help of training data provided by human experts • Semi-Supervised Learning: discover patterns within raw data and make predictions, which are then reviewed by human experts, whose feedback is used to improve model accuracy • Unsupervised Learning: find patterns without any external input other than the raw data • Reinforcement Learning: take decisions based on past rewards for this type of action
  • 18. Autonomous Health Platform ML Technologies
      Real-time Prevention • Data Ingestion – Kernel Smoothing and Moving Average – Interpolation and Imputation • Prediction and Pattern Recognition – Multivariate and Auto-Associative Regression – Clustering, Similarity Operators and Bayes Networks • Fault and Anomaly Detection – Sequential Probability Ratio Tests – Conditional Probability Filters & Hidden Markov Models • Prognosis and Diagnosis – Bayesian Belief Networks and Probabilistic Inference – Remaining Useful Life Regression and GPM Models
      Rapid Recovery • Data Ingestion – ELK – Lucene • Prediction and Pattern Recognition – TF-IDF and Bag-of-Words modelling – Sequence Matcher – K-Nearest Neighbour • Fault and Anomaly Detection – Decision Trees and Random Forest – Sequential Pattern Mining • Prognosis and Diagnosis – Recurrent Neural Network – Long Short-Term Memory
      Predictive Analysis
  • 19. Log reduction & Anomaly timeline: Remove the noise from thousands of log events and metrics to identify key events revealing what happened, in what order and why
  • 20. [Figure: numbered log-processing pipeline, steps 1 to 9. Training and Knowledge Base Creation labels: Log Cleansing, Entry Feature Creation, Entry Clustering, Model Generation, Knowledge Base Indexing, Expert Input, Feedback. Real-time labels: Real-time Log File Processing, Anomaly Detection, Timestamp Correlation & Ranking, Batch Feedback]
  • 21. Anomaly Detection – High Level [Figure: a log collection of file types 1 to n is processed per log file; known normal log entries are discarded, probable anomalous lines are collected, and the probable anomalies form the anomaly timeline]
  • 22. ML & Statistical Algorithms & Concepts Used • Log entry features are used to build a decision tree of good / bad entries • Entropy algorithm used for feature clustering • SequenceMatcher library used for clustering log entries • Expert input used in a feedback loop with ML output • Functional rules defined for the initial good / bad mapping; feedback is then only required for results deviating by more than ±2 standard deviations • Features are extracted from log entries and used for good / bad modeling • TF-IDF used for weighted knowledge matching & performance • Techniques: Bag of words, Semi-supervised ML, Decision Tree, K-Nearest Neighbour
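The slide names the SequenceMatcher library for clustering log entries. As a minimal sketch of that idea, assuming Python's standard `difflib.SequenceMatcher` and a simple greedy grouping (the `cluster_entries` helper, the 0.8 threshold, and the sample lines are illustrative assumptions, not the product's implementation):

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two log lines."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_entries(entries, threshold=0.8):
    """Greedy clustering (hypothetical helper): put each entry in the first
    cluster whose representative (its first member) is similar enough,
    otherwise start a new cluster."""
    clusters = []
    for entry in entries:
        for cluster in clusters:
            if similarity(entry, cluster[0]) >= threshold:
                cluster.append(entry)
                break
        else:
            clusters.append([entry])
    return clusters


# Illustrative log lines, loosely modeled on Clusterware messages.
logs = [
    "CRS-1608: This node was evicted by node 3",
    "CRS-1608: This node was evicted by node 2",
    "CRS-1654: Clean up of CRSD resources finished successfully",
]
clusters = cluster_entries(logs)
for group in clusters:
    print(len(group), group[0])
```

Entries that differ only in variable fields such as node numbers fall into one cluster, which is what lets a reviewer label a whole group as normal or anomalous at once.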
  • 23. Autonomous Health Analysis - Ex: Trace File Analyzer Auto Recommendation
  • 24. Autonomous Health – TFA Anomaly Timeline
  • 25. Maintenance slot identification: Find the next best window of time in which maintenance can be performed with minimal service impact
  • 26. Maintenance Slot Identification
      Input: 30 days of hourly average active session data
      Approach: Calculate periodicity using a pandas dataframe; apply a log transformation to penalize higher values more than smaller ones; compute the trend as a varying mean over time by applying a convolution filter; calculate the seasonal component as the average for each period
      Output: Seasonal decomposition showing the cyclic nature of the seasonal component, which helps identify relative usage highs and lows
  • 27. Maintenance slot identification • Use case – Identify an appropriate window for performing maintenance activity based on historical workload patterns. • Inputs (Training Data) – The Average Active Sessions (AAS) metric in sliding-window format; AAS matters because it is the best representation of database system load. Preferably use the last 30 days of data points before making the prediction. • AAS = (DB Time / Elapsed Time) • In other words, AAS is a time-normalized DB Time • From DB tables: – V$ACTIVE_SESSION_HISTORY => COUNT(*) = DB Time in seconds {cyclic buffer, ~4 hours} – DBA_HIST_ACTIVE_SESS_HISTORY => 10 * COUNT(*) = DB Time in seconds {since one in 10 samples is kept}
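The AAS arithmetic above can be sketched in a few lines. The function name and the sample counts below are hypothetical; only the formula AAS = DB Time / Elapsed Time and the 1-second versus 10-second sample weights come from the slide.

```python
def average_active_sessions(sample_count, elapsed_seconds, seconds_per_sample=1):
    """AAS = DB Time / Elapsed Time.

    Each ASH row stands for `seconds_per_sample` seconds of DB time:
    1 for V$ACTIVE_SESSION_HISTORY, 10 for DBA_HIST_ACTIVE_SESS_HISTORY
    (only one in 10 samples is persisted).
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    db_time_seconds = sample_count * seconds_per_sample
    return db_time_seconds / elapsed_seconds


# Hypothetical counts for a one-hour window.
print(average_active_sessions(7200, 3600))                       # 2.0
print(average_active_sessions(3112, 3600, seconds_per_sample=10))
```

An AAS of 2.0 means that, on average, two sessions were active at any instant during the hour.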
  • 28. Maintenance slot identification • Implementation – Depending on the size of the maintenance window slot, we choose the granularity at which data is considered and the length of the prediction block – A transformation (log, square root, or cube root) is applied to the time series to reduce trend – Seasonality of the time series is extracted • Fourier transformation helps identify the seasonality period (periodogram) • Seasonal decomposition helps identify the nature of the seasonality (curve) • The cycle identified from this seasonality shows which periods in the cycle have low and high workload • We use this information to identify the next best maintenance window.
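To illustrate how a periodogram reveals the seasonality period, here is a minimal sketch that computes a discrete Fourier transform directly with the standard library and returns the period with the most spectral power. The function name and the synthetic hourly series are assumptions for illustration; a real implementation would more likely use an FFT routine from a numerical library.

```python
import cmath
import math


def dominant_period(series):
    """Return the period (in samples) of the strongest periodogram peak."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the zero-frequency term
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Synthetic workload: 7 days of hourly samples with a daily cycle.
hourly_load = [10 + 5 * math.sin(2 * math.pi * t / 24) for t in range(24 * 7)]
print(dominant_period(hourly_load))
```

For this synthetic input the strongest frequency corresponds to 7 cycles over 168 samples, i.e. a 24-hour period.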
  • 29. Maintenance slot identification • Seasonal Decomposition – From an observed time series, extract a number of component series, each of which has a certain characteristic or type of behavior. – Time Series Decomposition • Trend – The trend component at time t, which reflects the long-term progression of the series (secular variation) – A trend exists when there is a persistent increasing or decreasing direction in the data • Seasonality – The seasonal component at time t, reflecting seasonality (seasonal variation) – Seasonality occurs over a fixed and known period (e.g., the quarter of the year, the month, or day of the week) • Residual – The irregular component (or "noise") at time t, which describes random, irregular influences – It represents the residuals or remainder of the time series after the other components have been removed.
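The decomposition described above can be sketched without any third-party library: log-transform the series, estimate the trend with a centered moving average (a simple convolution filter), and average the detrended values at each position in the cycle. The function names and the synthetic data are assumptions; the slides mention pandas, and this standard-library version is only meant to make the steps concrete.

```python
import math


def centered_moving_average(values, period):
    """Trend estimate: centered moving average over one full period.
    For an even period the two end points get half weight."""
    half = period // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        total = sum(window)
        if period % 2 == 0:
            total -= 0.5 * (window[0] + window[-1])
        trend[i] = total / period
    return trend


def seasonal_profile(series, period):
    """Average detrended log value for each position in the cycle.
    Needs strictly positive values and at least two full periods."""
    logged = [math.log(v) for v in series]
    trend = centered_moving_average(logged, period)
    buckets = [[] for _ in range(period)]
    for i, (value, t) in enumerate(zip(logged, trend)):
        if t is not None:
            buckets[i % period].append(value - t)
    profile = [sum(b) / len(b) for b in buckets]
    mean = sum(profile) / period
    return [p - mean for p in profile]  # center the profile on zero


def best_and_worst_slot(profile):
    """Positions in the cycle with the lowest and highest seasonal load."""
    positions = range(len(profile))
    return min(positions, key=profile.__getitem__), max(positions, key=profile.__getitem__)


# Synthetic example: 5 days of hourly load with a daily cycle.
load = [100 + 50 * math.sin(2 * math.pi * t / 24) for t in range(24 * 5)]
best, worst = best_and_worst_slot(seasonal_profile(load, 24))
print(best, worst)
```

The position with the lowest seasonal value is the candidate maintenance slot for the next cycle; the residual component is ignored in this sketch.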
  • 30. Maintenance Slot Identification
      Steps: (1) Original observation data (2) Apply convolution filter & average (3) Calculate seasonality (4) Use seasonality to predict best maintenance window
      START_TIME             CNT      ln(CNT)      Seasonality
      2018-04-11 15:00:00      290     5.669881    -0.226098
      2018-04-11 16:00:00    31120    10.345606    -0.069821
      2018-04-11 17:00:00    21530     9.977203    -0.350088
      2018-04-11 18:00:00    26240    10.175040    -0.187483
      2018-04-11 19:00:00    40520    10.609551    -0.513240
      2018-04-11 20:00:00    54270    10.901727     0.019737
      2018-04-11 21:00:00    51460    10.848560     0.059213
      2018-04-11 22:00:00    44310    10.698966    -0.011312
      2018-04-11 23:00:00    25690    10.153857    -0.179156
      Current Date : 2018-05-12 15:00:00
      Current Position in Seasonality : -0.22609829742533585
      Best Maintenance Period in next Cycle : 2018-05-12 19:00:00
      Worst Maintenance Period in next Cycle : 2018-05-13 08:00:00
  • 31. Detect performance problems: Anomaly Detection with OS and ASH Data
  • 32. Cluster Health Advisor – Applied Machine Learning: Discovers Potential Cluster & DB Problems • Fault-data-driven model development • Applies purpose-built ML for knowledge extraction • Expert Dev team scrubs data • Generates Bayesian Network (BN)-based diagnostic root-cause models • Uses BN-based run-time models to perform real-time prognostics [Figure labels: CHA Dev Team, ASH, Scrub Data, ML Knowledge Extraction, BN Models, Expert Supervision, CHA Runtime Model, Feedback]
  • 33. Cluster Health Advisor: Data Flow Overview
  • 34. Models Capture all Normal Operating Modes: Models Capture the Dynamic Behavior of all Normal Operation • Release ships with conservative models to minimize false warnings • A model captures the normal load phases and their statistics over time, and thus the characteristics for all load intensities and profiles. During monitoring, any data point similar to one of the vectors is NORMAL. • One could say that the model REMEMBERS the normal operational dynamics over time
      [Chart: IOPS, user commits (/sec), log file parallel write (usec), and log file sync (usec), with time-axis labels 10:00, 2:00, 6:00]
      In-Memory Reference Matrix (part of the “Normality” model):
      IOPS                       ####     2500     4900      800   ####
      User Commits               ####    10000    21000     4400   ####
      Log File Parallel Write    ####     2350     4100    22050   ####
      Log File Sync              ####     5100     9025     4024   ####
      …
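The statement that "any data point similar to one of the vectors is NORMAL" can be illustrated with a small sketch. The three reference vectors are the columns of the matrix on the slide; the similarity measure (root-mean-square relative difference) and the 15% tolerance are assumptions chosen for illustration, not the similarity operator Cluster Health Advisor actually uses.

```python
import math

# Columns of the slide's reference matrix, as
# (IOPS, user commits, log file parallel write, log file sync).
REFERENCE_VECTORS = [
    (2500, 10000, 2350, 5100),
    (4900, 21000, 4100, 9025),
    (800, 4400, 22050, 4024),
]


def relative_distance(point, reference):
    """Root-mean-square relative difference (an assumed similarity measure)."""
    squared = [((p - r) / r) ** 2 for p, r in zip(point, reference)]
    return math.sqrt(sum(squared) / len(reference))


def is_normal(point, references=REFERENCE_VECTORS, tolerance=0.15):
    """A point is NORMAL if it is close to any remembered operating mode."""
    return any(relative_distance(point, ref) <= tolerance for ref in references)


print(is_normal((2600, 10200, 2300, 5000)))   # close to the first mode
print(is_normal((2500, 10000, 22050, 5100)))  # slow log writes under normal load
```

The second point combines an ordinary load with a log file parallel write latency that was only ever seen in a different operating mode, so it matches none of the remembered vectors.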
  • 35. Problem Signatures from Event Paths: Identify a series of events as connected and representing the signature of a problem
  • 36. Longest Common Subsequence of Anomalous Entries: Find the Finite State Automata (FSA) 1. Start by classifying a problem such as an important ORA or CRS error 2. Find occurrences of the problem across many different log files 3. Identify anomalous entries and lifecycle events in chronological order within a predefined time window around the occurrence of the problem in all the logs – Time window depends on frequency of message logging (e.g. 10-minute window for Clusterware) 4. Compare the repeating anomalous / lifecycle entries to identify the longest common subsequence of anomalous entries – These represent the problem signature – Sequences of events are correlated by component, log file, host & thread
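Step 4 above can be sketched with the classic dynamic-programming longest common subsequence (LCS). The event sequences below are invented for illustration (they reuse CRS message codes that appear in the slides, but the orderings are hypothetical), and folding the pairwise LCS across occurrences is a simplifying heuristic, not necessarily how the product derives signatures.

```python
def longest_common_subsequence(a, b):
    """Classic O(len(a) * len(b)) dynamic-programming LCS of two sequences."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of a[i:] and b[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table to recover one LCS.
    result, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


def problem_signature(occurrences):
    """Fold pairwise LCS over all occurrences (a heuristic: the result is a
    common subsequence, but not guaranteed to be the longest one overall)."""
    signature = list(occurrences[0])
    for sequence in occurrences[1:]:
        signature = longest_common_subsequence(signature, sequence)
    return signature


# Hypothetical anomalous-entry sequences around three occurrences of a problem.
occurrences = [
    ["CRS-5008", "CRS-1649", "CRS-1615", "CRS-1656", "CRS-8503"],
    ["CRS-1649", "CRS-1608", "CRS-1615", "CRS-1656"],
    ["CRS-1649", "CRS-1615", "CRS-1654", "CRS-1656", "CRS-0805"],
]
print(problem_signature(occurrences))
```

Entries that appear in only some occurrences drop out, leaving the ordered core of events shared by every occurrence.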
  • 37. Improving LCS identification with constraints 1. Constraints over multiple features are recorded during analysis: • The LCS must include the major checkpoints present in the knowledge base. Other non-fatal CRS and ORA log signatures that generally appear in correlation with these fatal events must also be present. • LCS are computed at two levels: – Overall sequence of anomalous entries, i.e. in correlation with all components and products (sequential knowledge IDs for ocssd, gipc, database alert log, ASM alert log, etc.) – Sequence of anomalies specific to file type. • We record the categorization of log entries specific to each host. • Predefined knowledge of log structure also helps in optimizing the entries down to thread level. – E.g. structure of a Clusterware log file ([Timestamp]:[Component Name]:[Thread Id]:[Msg])
  • 38. Example signatures and their analysis • Sample Central Event : 2017-01-19 16:51:20.562 [OCSSD(24862)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /tools/list/grid/orabase/diag/crs/ur102ora3502c/crs/trace/ocssd.trc
      Knowledge Id, followed by Sample Line (States in FSA for central event):
      52CC1E8631FC2674E053B580E80AB08D  2016-10-16 21:22:36.520+CRS-5008: Invalid attribute value: en4 for the network interface
      52CC1E8632082674E053B580E80AB08D  2016-10-16 21:25:11.516 [OCSSD(6816354)]CRS-1608: This node was evicted by node 3, rwsbs03; details at (:CSSNM00005:) in /u01/app/crsusr/diag/crs/rwsbs02/crs/trace/ocssd.trc.
      52CC1E8632212674E053B580E80AB08D  2016-10-16 21:25:17.927 [OCSSD(18219406)]CRS-1654: Clean up of CRSD resources finished successfully.
      52CC1E8631EC2674E053B580E80AB08D  2016-10-16 21:25:17.927 [OCSSD(18219406)]CRS-1655: CSSD on node rwsbs01 detected a problem and started to shutdown.
      52CC1E8632272674E053B580E80AB08D  2016-10-16 21:25:19.431 [OCSSD(18219406)]CRS-8503: Oracle Clusterware process OCSSD with operating system process ID 18219406 experienced fatal signal or exception code 6.
      52CC1E8632202674E053B580E80AB08D  2016-10-16 21:25:21.788 [CRSD(44696012)]CRS-0805: Cluster Ready Service aborted due to failure to communicate with Cluster Synchronization Service with error [3]. Details at (:CRSD00109:) in /u01/app/crsusr/diag/crs/rwsbs01/crs/trace/crsd.trc.
      52CC1E86208C2674E053B580E80AB08D  2016-10-18 02:02:00.835 : CSSD:6684: (:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
      52CC1E861F132674E053B580E80AB08D  CLSB:6684: Oracle Clusterware infrastructure error in OCSSD (OS PID 12452524): Fatal signal 6 has occurred in program ocssd thread 6684; nested signal count is 1
      52CC1E861E552674E053B580E80AB08D  Incident 393 created, dump file: /u01/app/crsusr/diag/crs/rwsbs02/crs/incident/incdir_393/ocssd_i393.trc
      52CC1E861F332674E053B580E80AB08D  2016-10-18 02:02:07.113 : SKGFD:5655: ERROR: -9(Error 27041, OS Error (IBM AIX RISC System/6000 Error: 47: Write-protected media
      52CC1E86207C2674E053B580E80AB08D  2016-10-18 02:02:07.774 : CSSD:5655: clssnmvDiskCreate: Cluster guid ea34893b9442ef79ff642d70699aff9d found in voting disk /dev/rbs01_100G_asm1 does not match with the cluster guid 7b63590c34fa5f44bf6944aefa4ee85d obtained from the GPnP profile
      52CC1E863DB82674E053B580E80AB08D  2017-01-19 16:48:01.057 [OCSSD(24862)]CRS-1649: An I/O error occurred for voting file: /dev/rdsk/c1d16; details at (:CSSNM00059:) in /tools/list/grid/orabase/diag/crs/ur102ora3502c/crs/trace/ocssd.trc.
      52CC1E863DBC2674E053B580E80AB08D  2017-01-19 16:49:40.550 [OCSSD(24862)]CRS-1615: No I/O has completed after 50% of the maximum interval. Voting file /dev/rdsk/c1d16 will be considered not functional in 99508 milliseconds
  • 39. Example signatures and their analysis [Figure: timeline of knowledge IDs from 5 minutes before the central event to 5 minutes after it. IDs in extracted order: 52CC1E8631FC2674E053B580E80AB08D, 52CC1E86207C2674E053B580E80AB08D, 52CC1E861F332674E053B580E80AB08D, 52CC1E861E552674E053B580E80AB08D, 52CC1E861F132674E053B580E80AB08D, 52CC1E86208C2674E053B580E80AB08D, 52CC1E8632202674E053B580E80AB08D, 52CC1E8632272674E053B580E80AB08D, 52CC1E8631EC2674E053B580E80AB08D, 52CC1E8632212674E053B580E80AB08D, 52CC1E8632082674E053B580E80AB08D, 52CC1E863DBC2674E053B580E80AB08D, 52CC1E863DB82674E053B580E80AB08D, 52CC1E86722C2674E053B580E80AB08D]
  • 40. Generalizing event signatures over the scope of a bug • Bug Signature Repository
      Node Eviction bug 243645 timeline: Event Signatures 35, 3435, 494, 3948, 292, 434933
      Node Eviction bug 2747747 timeline: Event Signatures 3434, 3435, 4344, 3048, 202, 434983
      New Signature: Event Signatures 35, 3435, 3048, 3948, 292, 434933
      Check for weighted probabilistic match
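The slide does not say how the weighted probabilistic match is computed. As a stand-in, the sketch below scores a new signature against each stored bug timeline with a weighted Jaccard overlap, using the event signature numbers from the slide. The scoring function, the uniform default weights, and the function names are all assumptions made for illustration.

```python
# Event signature timelines per bug, as listed on the slide.
BUG_SIGNATURES = {
    "243645": [35, 3435, 494, 3948, 292, 434933],
    "2747747": [3434, 3435, 4344, 3048, 202, 434983],
}


def weighted_overlap(new_events, known_events, weights=None):
    """Weighted Jaccard score (assumed stand-in for the slide's match):
    weight of shared events divided by weight of all events seen in either."""
    weights = weights or {}
    new_set, known_set = set(new_events), set(known_events)
    shared = sum(weights.get(e, 1.0) for e in new_set & known_set)
    total = sum(weights.get(e, 1.0) for e in new_set | known_set)
    return shared / total if total else 0.0


def best_match(new_events, repository=BUG_SIGNATURES, weights=None):
    """Return (bug_id, score) for the closest stored signature."""
    scores = {
        bug: weighted_overlap(new_events, events, weights)
        for bug, events in repository.items()
    }
    bug = max(scores, key=scores.get)
    return bug, scores[bug]


new_signature = [35, 3435, 3048, 3948, 292, 434933]
bug_id, score = best_match(new_signature)
print(bug_id, round(score, 3))
```

Passing a `weights` dictionary lets rarer or more diagnostic event signatures count for more than events that occur around many unrelated bugs.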
  • 41. Bug Duplicate Identification: Discovers Duplicate Bugs, Correlated Issues and Prioritizes Based Upon Customer Impact
  • 42. Applied Machine Learning – Adaptive Bug Search (ABS): Discovers Duplicate Bugs and Correlated Issues • ABS is internally offered in BugDB and MOS GUI for Dev and Support teams • ABS helps find problems in the same space • Allows engineers to get the full context of past known problems • Provides debugging clues to help diagnose the reported problem • Identifies developers who worked in this space in the past
  • 43. Applied Machine Learning – Adaptive Bug Search: Discovers Duplicate Bugs and Correlated Issues • Bugs are submitted from over 400 Oracle products • Performs ML Logistic Regression on a training set of bugs to generate the model • Displays up to 8 possible duplicates per bug or SR • Feedback improves model accuracy – Direct from developers – Indirect from bug updates [Figure labels: BUG DB, Bugs, Scrub Data, ABS Dev Team, ML Logistic Regression Model Generation, Expert Supervision, ABS Runtime Model, Dev Feedback, Bug Submission, ABS Service, Bug and Duplicates Together]
  • 44. Adaptive Bug Search (ABS) – High Level Flow • Issues parsed into different features – Error stack, Trace data, Problem description, etc. • Issues represented as a cluster of features – i.e. All bugs in a bug tree contribute towards the feature set • Logistic Regression applied to build a model – Model defines the significance of each feature • Similarity between issues computed using the model – Identifies the root of the cluster (aka bug tree) • Feedback used to improve the model – Feedback is automatically derived based on how the bug gets closed
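The flow above (features per issue, logistic regression to weigh them, similarity scored by the model) can be sketched with a tiny hand-rolled logistic regression. Everything concrete here is hypothetical: the three pairwise features, the training pairs, and the hyperparameters are invented to show the mechanics and are not ABS's actual feature set or training procedure.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, epochs=2000, learning_rate=0.5):
    """Plain stochastic gradient descent on the log-loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            predicted = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = predicted - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def duplicate_probability(features, weights, bias):
    """Model score that two issues are duplicates."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


# Hypothetical features for a pair of bugs:
# (error-stack overlap, description similarity, same component flag)
training_pairs = [
    (0.9, 0.8, 1), (0.8, 0.6, 1), (0.7, 0.9, 0),   # known duplicates
    (0.1, 0.2, 1), (0.2, 0.1, 0), (0.0, 0.3, 1),   # known non-duplicates
]
training_labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic_regression(training_pairs, training_labels)
print(duplicate_probability((0.85, 0.7, 1), weights, bias))
print(duplicate_probability((0.10, 0.1, 0), weights, bias))
```

The learned weights play the role of "the significance of each feature"; feedback from how bugs get closed would simply become additional labeled pairs for the next training run.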
  • 45. Applied Machine Learning – ABS Bug Priority: Discovers Bugs that have the most Customer Impact • Analyzes customer and internal bugs submitted within the past 5 years • Performs ML Linear Regression on a training set of bugs to generate the model • Highlights bugs for Development Managers to review – Bugs affecting large numbers of customers – Bugs with significant impact to high-profile customers • Feedback improves model accuracy – Direct from development manager – Indirect from bug fixes [Figure labels: BUG DB, Cust. Bugs, Scrub Data, ABS Dev Team, ML Linear Regression Model Generation, Expert Supervision, ABS BP Model, Dev Feedback, Customer files SR and results in a new Bug, ABS BP Service, Bug Priority with Score]
  • 46. Questions? • Thank you for your feedback! • Please continue to reach out to us via social media: Twitter @sandeshr, LinkedIn https://www.linkedin.com/in/raosandesh/