Oracle Advanced Analytics:
insurance claim fraud detection
Oracle Innovation Days 2015, Riga
• Established in November, 2007
• 100+ employees
• Customers in the Nordics, Latvia, Russia and the USA
• Provides systems integration services (CRM, Decision Support Systems)
• Develops original products (Micromiles, Debessmana)
Who we are
• Defining needs
• Collecting data
• Generating and evaluating options
• Selecting the best possible option
• Applying and using
• Getting feedback and following up
Decision-Making Process Is …
Data Mining is
• the computational process of discovering patterns in large data sets
• the analysis step of Knowledge Discovery in Databases (KDD)
What is Data Mining?
Financial Services
- Credit risk analysis
- Cross-LOB up-selling
- Fraud detection
- Retail banking personalization
- “Best customer” prediction & profiling
Retail
- Product recommendations
- Customer segmentation
- Customer profiling
- Market Basket Analysis
Telecommunications
- Churn prevention
- Social network analysis
- Network monitoring
- Customer handling time reduction
Transportation and logistics
- Anticipate bottlenecks
- Proactive resource planning
- Improved preventative maintenance strategies
Data Mining use cases
Cross-Industry Standard Process for Data Mining (CRISP-DM)
Business Understanding
• Business Objectives
• Success Criteria
• Project plan
• Deliverables
Data Understanding
• Initial Data Collection
• Data Description
• Data Exploration
Data Preparation
• Data cleaning
• Sampling
• Normalization
• Feature Selection
Modeling
• Select modeling techniques
• Build/train model
• Prediction
Evaluation
• Model validation
• Review results
• Success criteria evaluation
Deployment
• Results visualization
• Report creation
Business Understanding
Fraud detection analysis for insurance claims (car insurance)
Business Objectives
The goal of this analysis is to create a tool that helps identify fraudulent claims in auto insurance (KASKO).
Deliverables
• Possible fraud prediction
• Descriptive analysis
Data Understanding
Initial Data Collection
• 250 attributes
• 404k claims
• 4% fraud, 96% normal
Source: Oracle Siebel CRM
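The class imbalance on this slide can be made concrete with a little arithmetic. The sketch below assumes "404 k" is a round figure of 404,000 claims and that every claim is either fraud or normal; the variable names are illustrative only. It shows why raw accuracy is a weak yardstick here: a model that labels every claim as normal would already score about 96%.

```python
# Figures from the slide; 404_000 is an assumed rounding of "404 k claims".
total_claims = 404_000
fraud_rate = 0.04

# Approximate class sizes implied by the 4% fraud rate.
fraud_claims = round(total_claims * fraud_rate)
normal_claims = total_claims - fraud_claims

# Accuracy of a trivial model that predicts "normal" for every claim.
baseline_accuracy = normal_claims / total_claims

print(fraud_claims, normal_claims, f"{baseline_accuracy:.0%}")
```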
Data preprocessing
Activities:
• normalization
• imputing missing data
• attribute selection
• stratified sampling
• 70% training dataset
• 30% test dataset
Final data set
150 of 250 attributes selected
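The preprocessing activities listed above can be sketched in plain Python. This is a minimal illustration, not the Oracle Data Miner workflow used in the project: mean imputation and min-max scaling are assumed choices, "stratified sampling" is assumed to mean that the fraud ratio is preserved in both parts of the 70/30 split, and the function names and toy data are hypothetical.

```python
import random
from collections import defaultdict


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def stratified_split(rows, label_key, train_fraction=0.7, seed=42):
    """Split rows into train/test sets, keeping the label ratio in both."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)
    rng = random.Random(seed)
    train, test = [], []
    for group in by_label.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


# Hypothetical toy data: 1,000 claims, 4% of them labelled as fraud.
claims = [{"claim_id": i, "is_fraud": i < 40} for i in range(1000)]
train_set, test_set = stratified_split(claims, "is_fraud")
print(len(train_set), len(test_set))
```

Splitting each class separately is what keeps the rare fraud class represented in both the training and the test set.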
Data Mining techniques
• Classification
• Clustering
Data mining tools: Oracle Data Miner
Modeling
Copyright © 2014 Oracle and/or its affiliates. All rights reserved.
– In-database data mining algorithms and open source R algorithms
– SQL, PL/SQL, R languages
– Scalable, parallel in-database execution
– Workflow GUI and IDEs
– Integrated component of Database
– Enables enterprise analytical applications
Key Features
Oracle Advanced Analytics
Fastest Way to Deliver Scalable Enterprise-wide Predictive Analytics
OBIEE
Oracle Database Enterprise Edition
Oracle Advanced Analytics Architecture
Oracle Advanced Analytics
Native SQL Data Mining/Analytic Functions + High-performance R Integration for Scalable, Distributed, Parallel Execution
SQL Developer, Applications, R Client
Function, algorithms and applicability
Classification
• Logistic Regression (GLM): classical statistical technique
• Decision Trees: popular / rules / transparency
• Naïve Bayes: embedded app
• Support Vector Machines (SVM): wide / narrow data / text
Regression
• Linear Regression (GLM): classical statistical technique
• Support Vector Machine (SVM): wide / narrow data / text
Anomaly Detection
• One Class SVM: unknown fraud cases or anomalies
Attribute Importance
• Minimum Description Length (MDL), Principal Components Analysis (PCA): attribute reduction, reduce data noise
Association Rules
• Apriori: market basket analysis / Next Best Offer
Clustering
• Hierarchical k-Means, Hierarchical O-Cluster, Expectation-Maximization Clustering (EM): product grouping / text mining, gene and protein analysis
Feature Extraction
• Nonnegative Matrix Factorization (NMF), Singular Value Decomposition (SVD): text analysis / feature reduction
Oracle Advanced Analytics
In-Database Data Mining Algorithms—SQL & R & GUI Access
• Automated data preprocessing (normalizing, cleaning)
• Workflow type modeling
• Build several models in parallel
Modeling
Classification modeling using Oracle Data Miner
Model comparison and validation (confusion matrix)
Classification modeling evaluation
Model  Actual  Predicted Y  Predicted N  Accuracy
SVM    Y       66%          34%          69%
       N       29%          71%
DT     Y       66%          34%          66%
       N       33%          67%
GLM    Y       70%          30%          70%
       N       30%          70%
Where
Y – Fraud cases
N – Normal cases
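The percentages in the table are row-wise rates: for each actual class, the share of cases predicted as Y or N. A minimal sketch of how such rates and an overall accuracy can be computed from actual and predicted labels follows. The function name and the toy labels are hypothetical, and the toy set is balanced only for illustration; the slides do not state the class mix of the test set.

```python
def confusion_rates(actual, predicted, labels=("Y", "N")):
    """Return row-normalized confusion rates and the overall accuracy."""
    counts = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    rates = {}
    for a in labels:
        row_total = sum(counts[a].values())
        rates[a] = {
            p: counts[a][p] / row_total if row_total else 0.0 for p in labels
        }
    # Accuracy is the share of cases on the diagonal of the matrix.
    correct = sum(counts[label][label] for label in labels)
    accuracy = correct / len(actual)
    return rates, accuracy


# Hypothetical labels: 10 fraud (Y) and 10 normal (N) claims.
actual = ["Y"] * 10 + ["N"] * 10
predicted = ["Y"] * 7 + ["N"] * 3 + ["N"] * 7 + ["Y"] * 3
rates, accuracy = confusion_rates(actual, predicted)
print(rates, accuracy)
```

Reading the Y row gives the fraud detection rate, which matters more for this use case than the overall accuracy.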
Cluster evaluation
% of fraud vs normal cases
The top left quadrant is our goal
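One way to evaluate clusters by their share of fraud versus normal cases is to count labels per cluster. The sketch below assumes each claim already carries a cluster id and a fraud label; the function name and the toy data are hypothetical, and the chart axes from the original slide are not reproduced.

```python
from collections import Counter


def fraud_share_by_cluster(assignments):
    """Map each cluster id to the fraction of its claims labelled as fraud."""
    totals, frauds = Counter(), Counter()
    for cluster_id, is_fraud in assignments:
        totals[cluster_id] += 1
        if is_fraud:
            frauds[cluster_id] += 1
    return {c: frauds[c] / totals[c] for c in totals}


# Hypothetical (cluster_id, is_fraud) pairs for eight claims.
assignments = [
    (1, True), (1, True), (1, False), (1, True),
    (2, False), (2, False), (2, False), (2, True),
]
shares = fraud_share_by_cluster(assignments)
print(shares)
```

Clusters with a high fraud share are the ones worth describing in more detail.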
Cluster analysis OBIEE dashboard
Fraudulent claims prediction
Output:
- List of possible fraudulent cases
- Probabilities
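The output described above, a list of possible fraud cases with probabilities, could be produced from scored claims roughly as follows. The 0.5 threshold, the claim ids and the function name are assumptions for illustration; the slides do not state which cut-off was used.

```python
def flag_possible_fraud(scored_claims, threshold=0.5):
    """Return (claim_id, probability) pairs at or above the threshold,
    sorted from the highest probability to the lowest."""
    flagged = [(cid, p) for cid, p in scored_claims if p >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical model output: (claim_id, predicted fraud probability).
scored = [("C-001", 0.12), ("C-002", 0.91), ("C-003", 0.55), ("C-004", 0.49)]
possible_fraud = flag_possible_fraud(scored)
print(possible_fraud)
```

Sorting by probability lets investigators start with the most suspicious claims.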
Contacts
• Web: www.ideaportriga.lv
• Blog: blog.ideaportriga.lv
• Email: jurijs.jefimovs@ideaportriga.lv
• LinkedIn: lv.linkedin.com/in/jurijsj
Find out more
Q&A
