Tame Big Data with Oracle Data Integration
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Data Integration: CON7922
Tame Big Data with Oracle Data Integration
Alex Kotopoulis
Senior Principal Product Manager
Oracle Fusion Middleware, Data Integration Solutions
Michael Rainey
Principal Consultant
Rittman Mead
Oracle OpenWorld 2014 2
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information
purposes only, and may not be incorporated into any contract. It is not a commitment to deliver
any material, code, or functionality, and should not be relied upon in making purchasing decisions.
The development, release, and timing of any features or functionality described for Oracle’s
products remains at the sole discretion of Oracle.
Agenda
1. Oracle Data Integration Overview
2. Customer Cases and Best Practices
3. Big Data Demo
4. Q&A and For More Information
• OOW Data Integration Sessions and Additional Resources
Oracle Data Integration Solutions and Proven Benefits
Improve Agility
• Deploy Projects Faster
• Reliable Real-Time
Reduce Risk
• Popular, Proven Tools
• Open, Not Proprietary
Reduce Costs
• Better Productivity
• Eliminate ETL Servers
Analytic Data Integration
• Big Data Integration & Governance
• Data Warehouse Integration
• Business Intelligence Applications
Enterprise Data Integration and Governance
• Enterprise Data Quality and Profiling
• Comprehensive, Heterogeneous Data Integration
• Business Glossary and Metadata Management
Business Continuity
• Active-Active for Maximum Availability
• Zero Downtime Migrations
• Data Consolidation / Application Modernization
24 x 7 x 365
Comprehensive Data Integration & Governance Capabilities
Real-Time Data Movement
– Low impact capture, stage in Hadoop
– Continuous data availability
Data Transformation
– Bulk data movement
– Pushdown data processing
Data Federation
– Virtualized Data Services
Data Quality & Verification
– Fix quality at the source
– Verify data consistency
Metadata Management
– Lineage and Impact Analysis
– Business Glossary Semantics
Data Governance
Foundation
Oracle Data Integrator
(Transformation)
Enterprise Data Quality
(Profile, Cleanse, Match and De-duplicate)
Fast
Load
Oracle GoldenGate
(Movement)
Enterprise Metadata Management & Business Glossary
(Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
Data Service Integrator
(Federation)
GoldenGate Veridata
(Online Data Verification)
ELT Processing
on Hadoop or SQL
Continuous Availability
Data Governance
Foundation
Differentiated Technical Approach
Dynamic Data Movement
– Real-time CDC by default, not ETL
– Least invasive on sources
– Proven best performance
– Integrated Oracle capture/apply
No ETL Engines
– Take the processing to the data;
don’t move the data to the process
– Leverage your data engines for the
workloads (Hadoop or SQL)
Most Heterogeneous
– Leverage open source Hadoop, not
proprietary distributions
– Hadoop is the Hub, not ETL tools
– Open metadata standards
Oracle Data Integrator
(Transformation)
Enterprise Data Quality
(Profile, Cleanse, Match and De-duplicate)
Fast
Load
Oracle GoldenGate
(Movement)
Enterprise Metadata Management & Business Glossary
(Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
Data Service Integrator
(Federation)
GoldenGate Veridata
(Online Data Verification)
ELT Processing
on Hadoop or SQL
Continuous Availability
Data Reservoir Use Case with Oracle Data Integration
Oracle Confidential – Internal/Restricted/Highly Restricted 8
Oracle Data
Integrator
Logs
OLTP Databases
Social
Media
Sensor
Data
Data Warehouses,
Datamarts
Pig
Sqoop Initial Load Sqoop Load
OLH / OSCH
Big Data SQL
File Load
CDC to HDFS, Hive,
Flume, HBase
Oracle GoldenGate
Oracle Enterprise
Metadata Management
Oracle Data Service
Integrator
Federated Queries
Oracle Enterprise
Data Quality
Impala
Transformations with
HDFS, Hive, HBase, Pig
Logical and Physical Design with ODI
Logical
Design
Oracle
MySQL
Hive
Physical
Design
Sqoop
Sqoop
IKM
LKM
LKM
Oracle
Hive
MySQL
Hive
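The split between logical and physical design can be sketched in a few lines of Python. This is only an illustration of the idea, not ODI's actual API: the `LogicalMapping` class, the `LOADERS` table, and the step labels are all hypothetical names chosen for this example. The logical mapping names sources and a target; the physical plan is derived by picking a loading step per source technology and one integration step on the target engine.

```python
from dataclasses import dataclass

# Hypothetical lookup: (source technology, target technology) -> loading step.
# These labels are illustrative only, not real ODI knowledge module names.
LOADERS = {
    ("oracle", "hive"): "LKM via Sqoop",
    ("mysql", "hive"): "LKM via Sqoop",
}


@dataclass(frozen=True)
class LogicalMapping:
    sources: dict            # datastore name -> technology
    target: str              # target datastore name
    target_technology: str   # engine that runs the transformation


def physical_plan(mapping):
    """Derive one loading step per remote source, then one integration step."""
    steps = []
    for name, tech in mapping.sources.items():
        if tech == mapping.target_technology:
            continue  # already local to the target engine, nothing to load
        loader = LOADERS[(tech, mapping.target_technology)]
        steps.append(
            f"{loader}: {name} ({tech}) -> staging on {mapping.target_technology}"
        )
    steps.append(f"IKM: staging -> {mapping.target} ({mapping.target_technology})")
    return steps


mapping = LogicalMapping(
    sources={"customers": "oracle", "orders": "mysql", "clicks": "hive"},
    target="sales_fact",
    target_technology="hive",
)
for step in physical_plan(mapping):
    print(step)
```

The same logical mapping could be given a different physical plan simply by changing `target_technology` and the lookup table, which is the point of keeping the two designs separate.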
Design Once, Run Anywhere
• Use native technologies for any data
source
– Data Locality
– Optimal performance, reduced
network traffic
• No proprietary middle tier
– Reduced infrastructure cost and
maintenance effort
• Declarative design
– Simplified development
– Reusable across technologies
Hive
Agent
Languages and Tools
Runtime
Environments
Sqoop
Big Data
SQL
Future
Languages
Future Runtime
Engines
OLH
OSCH
Oracle GoldenGate Adapter – Big Data Use Cases
Java
Adapter
HDFS
file
Capture
Parameter
File
Adapter
Property file
Adapter
Jar file
Source
Database
Pump
Parameter file
Hive
HBase
Flume
Source Channel Sink
Other
Custom
Targets
Log File Pump
Trail
File
Capture
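As a rough sketch of the last hop in this flow (change records arriving from the trail and being written out as a file for HDFS), the following Python shows one way to render changes as delimited text. It does not use any GoldenGate API: the record layout (`op`, `table`, `after`) and the function name `format_changes` are assumptions made purely for illustration.

```python
import csv
import io


def format_changes(changes, columns, delimiter="|"):
    """Render change records as delimited text, one line per operation.

    Each record is assumed to be a dict with keys 'op' (I/U/D),
    'table', and 'after' (column name -> value). This layout is
    hypothetical and not the GoldenGate trail format.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    for change in changes:
        row = [change["op"], change["table"]]
        row += [change["after"].get(col, "") for col in columns]
        writer.writerow(row)
    return buf.getvalue()


changes = [
    {"op": "I", "table": "MOVIE", "after": {"id": 1, "title": "Up"}},
    {"op": "U", "table": "MOVIE", "after": {"id": 1, "title": "Up (2009)"}},
]
print(format_changes(changes, ["id", "title"]), end="")
```

A delimited layout like this is convenient because a Hive external table can be defined directly over the resulting files.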
Agenda
1. Oracle Data Integration Overview
2. Customer Cases and Best Practices
3. Big Data Demo
4. Q&A and For More Information
• OOW Data Integration Sessions and Additional Resources
Introduction
• Michael Rainey
• Principal Consultant - Rittman Mead
• Oracle Data Integration expert
– Oracle Data Integrator and Oracle GoldenGate
• Oracle ACE
• Twitter: @mRainey
About Rittman Mead
• Oracle Gold partner
– World leading specialist partner for technical excellence, solutions delivery and
innovation in Oracle BI
– Provide consulting, training, managed services for customers worldwide
• 120+ consultants including 1 Oracle ACE Director, 3 Oracle ACEs and 1
Oracle ACE Associate
– All expert in Oracle BI, DW, EPM and Analytics tech
– Skills in broad range of supporting Oracle tools: OBIEE, OBIA, ODIEE, Essbase, Oracle
OLAP, GoldenGate, Exadata, Endeca
• Blog: www.rittmanmead.com/blog Twitter: @rittmanmead
Customer Challenge
• Company has subscribers with in-home devices
• Company wishes to improve customer experience
• Log data can potentially help identify issues, but difficult to access and read
• …and there’s a lot of data!
Big Data Solution
• 6 Node Big Data Appliance (BDA)
bin/hadoop dfs -copyFromLocal
Process scheduled via cron jobs
Extract data
from XML logs
via python script
Load data to
HDFS using
copyFromLocal
command
Filter, format,
sort data using
Oracle R
Aggregate &
transform data
using python
scripts & HiveQL
Load to Oracle
DB via Sqoop
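The first two steps of this script-based pipeline (extract from XML logs with Python, then load to HDFS with copyFromLocal) might look roughly like the sketch below. The XML layout (`<event>` elements with `device` and `ts` attributes) and all function names are assumptions for illustration; the customer's real log format is not shown on the slides.

```python
import xml.etree.ElementTree as ET


def extract_events(xml_text):
    """Pull (device, timestamp, message) tuples out of an XML log.

    The <event device=... ts=...> layout is a hypothetical example.
    """
    root = ET.fromstring(xml_text)
    return [
        (e.get("device", ""), e.get("ts", ""), (e.text or "").strip())
        for e in root.iter("event")
    ]


def to_tsv(rows):
    """Format rows as tab-separated lines, ready to be staged in HDFS."""
    return "".join("\t".join(row) + "\n" for row in rows)


def copy_command(local_path, hdfs_dir):
    """Build the argument list for the copyFromLocal load step."""
    return ["bin/hadoop", "dfs", "-copyFromLocal", local_path, hdfs_dir]


sample = (
    '<log>'
    '<event device="d1" ts="2014-09-30T17:00:00">reboot</event>'
    '<event device="d2" ts="2014-09-30T17:01:00"> low signal </event>'
    '</log>'
)
rows = extract_events(sample)
print(to_tsv(rows), end="")
print(" ".join(copy_command("/tmp/events.tsv", "/data/logs")))
```

In a cron-driven setup, the command list would typically be handed to `subprocess.run`; it is only printed here so the sketch runs without a Hadoop installation.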
Wait, this looks familiar…
• Looks like a standard data integration project!
• Scripts written to extract, load, and transform data
• Source data and transformations evolving
• But something is missing
– Scheduling, process flow, monitoring, data quality
– Standardization and maintainability
Transition to an ETL tool
• Initial thought…Informatica
– Client has experience with product
• Why Oracle Data Integrator?
– Extensibility - “Design Once…”
– No middle ETL engine
– Data Quality
• And…it’s licensed with their BDA!
ODI Procedure
IKM Hive Transform
IKM File-Hive to SQL (SQOOP)
Big Data Solution using ODI 12c
bin/hadoop dfs -copyFromLocal
Extract data
from XML logs
via python script
Load data to
HDFS using
copyFromLocal
command
Filter, format,
sort data using
Oracle R
Aggregate &
transform data
using python
scripts & HiveQL
Load to Oracle
DB via Sqoop
IKM Hive Control Append
What we learned along the way…
• HiveQL <> Oracle SQL
– When using Hive KMs, check the Generate ANSI Syntax checkbox: Hive expects table joins
in ANSI format rather than the “Oracle” format.
• Begin with scripts, but have ODI Application Adapters for Hadoop in mind
• Utilize the skills your available resources have
– Not everyone can write MapReduce code
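To make the join-syntax lesson concrete, the sketch below builds the same two-table join in both styles. The helper names and the `activity` / `customer` tables are hypothetical; the point is only the difference between an explicit ANSI `JOIN ... ON` clause, which the slide says Hive expects, and the comma-separated "Oracle" style with the join condition in the `WHERE` clause.

```python
def ansi_join(left, right, key):
    """ANSI-style join: the condition lives in an explicit JOIN ... ON clause."""
    return f"SELECT * FROM {left} l JOIN {right} r ON l.{key} = r.{key}"


def oracle_style_join(left, right, key):
    """Comma-style join: tables listed in FROM, condition in WHERE."""
    return f"SELECT * FROM {left} l, {right} r WHERE l.{key} = r.{key}"


print(ansi_join("activity", "customer", "cust_id"))
print(oracle_style_join("activity", "customer", "cust_id"))
```

Both statements describe the same result set on a database that accepts both forms, which is why the choice can be left to a single code-generation option.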
Agenda
1. Oracle Data Integration Overview
2. Customer Cases and Best Practices
3. Big Data Demo
4. Q&A and For More Information
• OOW Data Integration Sessions and Additional Resources
Data Integration Demo
Oracle Data
Integrator
Oracle
GoldenGate
Flume
Process Activity
(Hive)
Application
Logs
Activity
Load Oracle
Big Data SQL
ActivityClean CountrySales
Load Oracle
OLH/OSCH
MySQL DB
SQOOP
OGG
(HDFS/Flume)
MovieMovie MovieRating MovieRating
Customer
Calculate Rating
(Hive)
Sessionize Activity
(Pig OS Call)
Customer SessionStats
Calc Purchases
(Oracle)
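The "Sessionize Activity" step in the demo runs in Pig; its script is not shown on the slides. As a rough illustration of what sessionizing means, the Python sketch below assigns a session number to each activity record per customer, starting a new session after a period of inactivity. The 30-minute gap, the record layout, and the function name are all assumptions, not details taken from the demo.

```python
from datetime import datetime, timedelta


def sessionize(events, gap=timedelta(minutes=30)):
    """Assign a per-customer session number to each (customer_id, timestamp).

    A new session starts when the time since the customer's previous
    event exceeds `gap`. The 30-minute default is an assumed value.
    """
    out = []
    last_seen = {}   # customer_id -> timestamp of previous event
    session_no = {}  # customer_id -> current session number
    for cust, ts in sorted(events):
        prev = last_seen.get(cust)
        if prev is None or ts - prev > gap:
            session_no[cust] = session_no.get(cust, 0) + 1
        last_seen[cust] = ts
        out.append((cust, session_no[cust], ts))
    return out


events = [
    (1, datetime(2014, 9, 30, 10, 0)),
    (2, datetime(2014, 9, 30, 10, 5)),
    (1, datetime(2014, 9, 30, 10, 10)),
    (1, datetime(2014, 9, 30, 11, 0)),
]
for cust, session, ts in sessionize(events):
    print(cust, session, ts.isoformat())
```

Per-session statistics such as those suggested by the SessionStats target could then be computed by grouping on the customer and session number.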
Agenda
1. Oracle Data Integration Overview
2. Customer Cases and Best Practices
3. Big Data Demo
4. Q&A and For More Information
• OOW Data Integration Sessions and Additional Resources
2014 Oracle Excellence Award Ceremony
for Fusion Middleware Innovation
ORACLE FUSION MIDDLEWARE:
CELEBRATE THIS YEAR'S MOST INNOVATIVE
CUSTOMER SOLUTIONS
Tuesday, September 30, 2014 5:00-5:45pm
YBCA Theater (next to Moscone North)
Session ID: CON7029
Resources
Oracle Data Integration
Oracle Data Integration
OracleGoldenGate
ORCL DataIntegration
blogs.oracle.com/dataintegration
Oracle Data
Integrator
Oracle
GoldenGate
Oracle
Enterprise
Data Quality
Oracle Enterprise
Metadata
Management
Oracle Data
Services
Integrator
http://www.oracle.com/us/products/middleware/data-integration/overview/index.html
Data Integration
Questions and Answers
Oracle DIS Session @ OOW ’14 – Oracle GoldenGate
2:45PM – CON7717 Oracle GoldenGate New Features & Options Product Update
4:00PM – CON7716 Oracle GoldenGate 12c for Oracle Database 12c
5:15PM – CON7719 Enabling Real-Time Data Integration for Big Data
10:45AM – CON7715 Oracle Active Data Guard & Oracle GoldenGate for HA
12:00PM – CON7328 Near-Zero Downtime Unicode Migration for Oracle
12:00PM – CON774 Oracle GoldenGate for Cloud
6:00PM – BOF9597 International Oracle GoldenGate User Group Meeting
3:30PM – CON7934 Tapping into the Big Data Reserve with All Data
4:45PM – CON7922 Tame Big Data with Oracle Data Integration
4:45PM – CON7773 Oracle GoldenGate Performance Tuning for Oracle Database
10:45AM – CON7655 Achieving Zero Downtime During Oracle Application Upgrades & System Migrations
1:15PM – CON7718 Managing & Monitoring Oracle GoldenGate
Days: MON, TUE, WED, THU
Oracle DIS Session @ OOW ’14 – Oracle Data Integrator
4:00PM – CON7899 Oracle Data Integrator: Product Update and Future Strategy
5:00PM – CON7820 Making the Move from Oracle Warehouse Builder to Oracle Data Integrator
3:30PM – CON7934 Tapping into the Big Data Reserve with All Data
4:45PM – CON7922 Tame Big Data with Oracle Data Integration
9:30AM – CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration
10:45AM – CON7923 Oracle Data Integration & Metadata Management for Seamless Enterprise
2:30PM – CON7921 Insight into Action: Business Intelligence Applications and Oracle Data Integrator
Days: MON, TUE, WED, THU
Oracle DIS Session @ OOW ’14 – Enterprise Data Quality
11:45AM – CON7776 Data Quality Maturity Journey: Building Toward Strong Enterprise Data Quality
10:45AM – CON7780 Oracle Enterprise Data Quality: Product Overview and Roadmap
2:00PM – CON7775 The Essential Core of Data Governance with Oracle Enterprise Data Quality
3:30PM – CON7934 Tapping into the Big Data Reserve with All Data
4:45PM – CON7922 Tame Big Data with Oracle Data Integration
12:00PM – CON7931 Solving Big Data’s Big Problem with Data Preparation & Enrichment in the Cloud
Days: MON, TUE, WED, THU
Oracle DIS Hands-on Labs @ OOW ’14
Tuesday 3:45PM – HOL9439
• Oracle Data Integrator 12c New
Features Deep Dive
Tuesday 5:15PM – HOL9414
• Oracle Data Integrator for Big Data
Hotel Nikko
Nikko Ballroom II
222 Mason Street
Monday 1:15PM – HOL9437
• Oracle GoldenGate 12c New
Features Deep Dive
Wednesday 4:15PM – HOL9436
• Pushing Transactions to JCache with
Coherence and GoldenGate
Thursday 10AM – HOL9413
• Oracle GoldenGate Heterogeneous
Replication
Monday 2:45PM – HOL9438
• Oracle Enterprise Data Quality
Introduction
OGG
ODI
EDQ

More Related Content

What's hot (20)

PPTX
Oracle GoldenGate, Streams, and Data Integrator
Fumiko Yamashita
 
PPTX
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
DataWorks Summit/Hadoop Summit
 
PPTX
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
DataWorks Summit/Hadoop Summit
 
PPTX
Biwa summit 2015 oaa oracle data miner hands on lab
Charlie Berger
 
PPTX
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
DataWorks Summit
 
PDF
Open Innovation with Power Systems
IBM Power Systems
 
PDF
Replacing Oracle CDC with Oracle GoldenGate
Stewart Bryson
 
PPTX
Understanding Oracle GoldenGate 12c
IT Help Desk Inc
 
PDF
Implementing a Data Lake with Enterprise Grade Data Governance
Hortonworks
 
PPTX
Don't Let Security Be The 'Elephant in the Room'
Hortonworks
 
PPTX
Edw Optimization Solution
Hortonworks
 
PPTX
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
jdijcks
 
PDF
Oracle RAC - Roadmap for New Features
Markus Michalewicz
 
PPTX
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Seetharam Venkatesh
 
PPTX
The DAP - Where YARN, HBase, Kafka and Spark go to Production
DataWorks Summit/Hadoop Summit
 
PDF
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
Mark Rittman
 
PPTX
Oracle GoldenGate for Disaster Recovery
Fumiko Yamashita
 
PDF
Intro to Spark & Zeppelin - Crash Course - HS16SJ
DataWorks Summit/Hadoop Summit
 
PDF
The Top 5 Reasons to Deploy Your Applications on Oracle RAC
Markus Michalewicz
 
PPTX
Internet of things Crash Course Workshop
DataWorks Summit
 
Oracle GoldenGate, Streams, and Data Integrator
Fumiko Yamashita
 
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
DataWorks Summit/Hadoop Summit
 
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
DataWorks Summit/Hadoop Summit
 
Biwa summit 2015 oaa oracle data miner hands on lab
Charlie Berger
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
DataWorks Summit
 
Open Innovation with Power Systems
IBM Power Systems
 
Replacing Oracle CDC with Oracle GoldenGate
Stewart Bryson
 
Understanding Oracle GoldenGate 12c
IT Help Desk Inc
 
Implementing a Data Lake with Enterprise Grade Data Governance
Hortonworks
 
Don't Let Security Be The 'Elephant in the Room'
Hortonworks
 
Edw Optimization Solution
Hortonworks
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
jdijcks
 
Oracle RAC - Roadmap for New Features
Markus Michalewicz
 
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Seetharam Venkatesh
 
The DAP - Where YARN, HBase, Kafka and Spark go to Production
DataWorks Summit/Hadoop Summit
 
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
Mark Rittman
 
Oracle GoldenGate for Disaster Recovery
Fumiko Yamashita
 
Intro to Spark & Zeppelin - Crash Course - HS16SJ
DataWorks Summit/Hadoop Summit
 
The Top 5 Reasons to Deploy Your Applications on Oracle RAC
Markus Michalewicz
 
Internet of things Crash Course Workshop
DataWorks Summit
 

Similar to Tame Big Data with Oracle Data Integration (20)

PDF
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Jeffrey T. Pollock
 
PDF
Tapping into the Big Data Reservoir (CON7934)
Jeffrey T. Pollock
 
PDF
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Rittman Analytics
 
PDF
Oracle Data Integration CON9737 at OpenWorld
Jeffrey T. Pollock
 
PDF
Oracle Data Integration - Overview
Jeffrey T. Pollock
 
PDF
Oracle Big Data Governance Webcast Charts
Jeffrey T. Pollock
 
PPTX
Hortonworks Oracle Big Data Integration
Hortonworks
 
PPTX
Transform Your Data Integration Platform From Informatica To ODI
Jade Global
 
PPTX
Data warehouse migration to oracle data integrator 11g
Michael Rainey
 
PDF
One Slide Overview: ORCL Big Data Integration and Governance
Jeffrey T. Pollock
 
PDF
Big Data and Enterprise Data - Oracle -1663869
Edgar Alejandro Villegas
 
PDF
Oracle Unified Information Architeture + Analytics by Example
Harald Erb
 
PPTX
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data
avanttic Consultoría Tecnológica
 
PPTX
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...
DataWorks Summit
 
PDF
Meetup Oracle Database BCN: 2.1 Data Management Trends
avanttic Consultoría Tecnológica
 
PPTX
Insights into Real-world Data Management Challenges
DataWorks Summit
 
PDF
Oracle Warehouse Builder to Oracle Data Integrator 12c Migration Utility
Noel Sidebotham
 
PPTX
Oracle Big Data Appliance and Big Data SQL for advanced analytics
jdijcks
 
PPTX
Bridging Oracle Database and Hadoop by Alex Gorbachev, Pythian from Oracle Op...
Alex Gorbachev
 
PPTX
Delicious : EDQ, OGG and ODI over Exadata for Perfection
Gurcan Orhan
 
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Jeffrey T. Pollock
 
Tapping into the Big Data Reservoir (CON7934)
Jeffrey T. Pollock
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Rittman Analytics
 
Oracle Data Integration CON9737 at OpenWorld
Jeffrey T. Pollock
 
Oracle Data Integration - Overview
Jeffrey T. Pollock
 
Oracle Big Data Governance Webcast Charts
Jeffrey T. Pollock
 
Hortonworks Oracle Big Data Integration
Hortonworks
 
Transform Your Data Integration Platform From Informatica To ODI
Jade Global
 
Data warehouse migration to oracle data integrator 11g
Michael Rainey
 
One Slide Overview: ORCL Big Data Integration and Governance
Jeffrey T. Pollock
 
Big Data and Enterprise Data - Oracle -1663869
Edgar Alejandro Villegas
 
Oracle Unified Information Architeture + Analytics by Example
Harald Erb
 
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data
avanttic Consultoría Tecnológica
 
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...
DataWorks Summit
 
Meetup Oracle Database BCN: 2.1 Data Management Trends
avanttic Consultoría Tecnológica
 
Insights into Real-world Data Management Challenges
DataWorks Summit
 
Oracle Warehouse Builder to Oracle Data Integrator 12c Migration Utility
Noel Sidebotham
 
Oracle Big Data Appliance and Big Data SQL for advanced analytics
jdijcks
 
Bridging Oracle Database and Hadoop by Alex Gorbachev, Pythian from Oracle Op...
Alex Gorbachev
 
Delicious : EDQ, OGG and ODI over Exadata for Perfection
Gurcan Orhan
 
Ad

More from Michael Rainey (19)

PDF
Data Warehouse - Incremental Migration to the Cloud
Michael Rainey
 
PDF
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
Michael Rainey
 
PPTX
SQL on Hadoop for the Oracle Professional
Michael Rainey
 
PPTX
Going Serverless - an Introduction to AWS Glue
Michael Rainey
 
PDF
Offload, Transform, and Present - the New World of Data Integration
Michael Rainey
 
PDF
Oracle Data Integrator 12c - Getting Started
Michael Rainey
 
PDF
Streaming with Oracle Data Integration
Michael Rainey
 
PDF
Oracle data integrator 12c - getting started
Michael Rainey
 
PDF
Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming
Michael Rainey
 
PDF
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration
Michael Rainey
 
PDF
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Michael Rainey
 
PDF
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Michael Rainey
 
PDF
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
Michael Rainey
 
PDF
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
Michael Rainey
 
PPT
Real-Time Data Replication to Hadoop using GoldenGate 12c Adaptors
Michael Rainey
 
PDF
Real-time Data Warehouse Upgrade – Success Stories
Michael Rainey
 
PDF
A Picture Can Replace A Thousand Words
Michael Rainey
 
PDF
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration
Michael Rainey
 
PDF
KScope14 - Real-Time Data Warehouse Upgrade - Success Stories
Michael Rainey
 
Data Warehouse - Incremental Migration to the Cloud
Michael Rainey
 
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
Michael Rainey
 
SQL on Hadoop for the Oracle Professional
Michael Rainey
 
Going Serverless - an Introduction to AWS Glue
Michael Rainey
 
Offload, Transform, and Present - the New World of Data Integration
Michael Rainey
 
Oracle Data Integrator 12c - Getting Started
Michael Rainey
 
Streaming with Oracle Data Integration
Michael Rainey
 
Oracle data integrator 12c - getting started
Michael Rainey
 
Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming
Michael Rainey
 
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration
Michael Rainey
 
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Michael Rainey
 
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Michael Rainey
 
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
Michael Rainey
 
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
Michael Rainey
 
Real-Time Data Replication to Hadoop using GoldenGate 12c Adaptors
Michael Rainey
 
Real-time Data Warehouse Upgrade – Success Stories
Michael Rainey
 
A Picture Can Replace A Thousand Words
Michael Rainey
 
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration
Michael Rainey
 
KScope14 - Real-Time Data Warehouse Upgrade - Success Stories
Michael Rainey
 
Ad

Recently uploaded (20)

PPT
Ericsson LTE presentation SEMINAR 2010.ppt
npat3
 
PDF
Newgen 2022-Forrester Newgen TEI_13 05 2022-The-Total-Economic-Impact-Newgen-...
darshakparmar
 
PDF
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
PPTX
From Sci-Fi to Reality: Exploring AI Evolution
Svetlana Meissner
 
PDF
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
PDF
Transcript: Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
PDF
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
PDF
Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
PDF
AI Agents in the Cloud: The Rise of Agentic Cloud Architecture
Lilly Gracia
 
PDF
Kit-Works Team Study_20250627_한달만에만든사내서비스키링(양다윗).pdf
Wonjun Hwang
 
PDF
Staying Human in a Machine- Accelerated World
Catalin Jora
 
PPTX
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
PDF
“Computer Vision at Sea: Automated Fish Tracking for Sustainable Fishing,” a ...
Edge AI and Vision Alliance
 
PDF
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
PPTX
Seamless Tech Experiences Showcasing Cross-Platform App Design.pptx
presentifyai
 
PDF
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
PPTX
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
PDF
“NPU IP Hardware Shaped Through Software and Use-case Analysis,” a Presentati...
Edge AI and Vision Alliance
 
PDF
What’s my job again? Slides from Mark Simos talk at 2025 Tampa BSides
Mark Simos
 
PDF
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
Ericsson LTE presentation SEMINAR 2010.ppt
npat3
 
Newgen 2022-Forrester Newgen TEI_13 05 2022-The-Total-Economic-Impact-Newgen-...
darshakparmar
 
The Rise of AI and IoT in Mobile App Tech.pdf
IMG Global Infotech
 
From Sci-Fi to Reality: Exploring AI Evolution
Svetlana Meissner
 
“Voice Interfaces on a Budget: Building Real-time Speech Recognition on Low-c...
Edge AI and Vision Alliance
 
Transcript: Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
CIFDAQ Market Wrap for the week of 4th July 2025
CIFDAQ
 
Book industry state of the nation 2025 - Tech Forum 2025
BookNet Canada
 
AI Agents in the Cloud: The Rise of Agentic Cloud Architecture
Lilly Gracia
 
Kit-Works Team Study_20250627_한달만에만든사내서비스키링(양다윗).pdf
Wonjun Hwang
 
Staying Human in a Machine- Accelerated World
Catalin Jora
 
Designing_the_Future_AI_Driven_Product_Experiences_Across_Devices.pptx
presentifyai
 
“Computer Vision at Sea: Automated Fish Tracking for Sustainable Fishing,” a ...
Edge AI and Vision Alliance
 
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
Seamless Tech Experiences Showcasing Cross-Platform App Design.pptx
presentifyai
 
Bitcoin for Millennials podcast with Bram, Power Laws of Bitcoin
Stephen Perrenod
 
COMPARISON OF RASTER ANALYSIS TOOLS OF QGIS AND ARCGIS
Sharanya Sarkar
 
“NPU IP Hardware Shaped Through Software and Use-case Analysis,” a Presentati...
Edge AI and Vision Alliance
 
What’s my job again? Slides from Mark Simos talk at 2025 Tampa BSides
Mark Simos
 
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 

Tame Big Data with Oracle Data Integration

  • 2. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Data Integration: CON7922 Tame Big Data with Oracle Data Integration Alex Kotopoulis Senior Principal Product Manager Oracle Fusion Middleware, Data Integration Solutions Michael Rainey Principal Consultant Rittman Mead Oracle OpenWorld 2014 2
  • 3. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle. Oracle OpenWorld 2014 3
  • 4. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Agenda Oracle OpenWorld 2014 4 Oracle Data Integration Overview Customer Cases and Best Practices Big Data Demo Q&A and For More Information • OOW Data Integration Sessions and Additional Resources 3 4 1 2
  • 5. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Data Integration Solutions and Proven Benefits Oracle OpenWorld 2014 5  Improve Agility • Deploy Projects Faster • Reliable Real-Time  Reduce Risk • Popular, Proven Tools • Open, Not Proprietary  Reduce Costs • Better Productivity • Eliminate ETL Servers Analytic Data Integration • Big Data Integration & Governance • Data Warehouse Integration • Business Intelligence Applications Enterprise Data Integration and Governance • Enterprise Data Quality and Profiling • Comprehensive, Heterogeneous Data Integration • Business Glossary and Metadata Management Business Continuity • Active-Active for Maximum Availability • Zero Downtime Migrations • Data Consolidation / Application Modernization 24 x 7 x 365
  • 6. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Comprehensive Data Integration & Governance Capabilities Oracle OpenWorld 2014 6 Real-Time Data Movement – Low impact capture, stage in Hadoop – Continuous data availability Data Transformation – Bulk data movement – Pushdown data processing Data Federation – Virtualized Data Services Data Quality & Verification – Fix quality at the source – Verify data consistency Metadata Management – Lineage and Impact Analysis – Business Glossary Semantics Data Governance Foundation Oracle Data Integrator (Transformation) Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate) Fast Load Oracle GoldenGate (Movement) Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance) Data Service Integrator (Federation) GoldenGate Veridata (Online Data Verification) ELT Processing on Hadoop or SQL Continuous Availability
  • 7. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Data Governance Foundation Differentiated Technical Approach Oracle OpenWorld 2014 7 Dynamic Data Movement – Real-time CDC is by default, not ETL – Least invasive on sources – Proven best performance – Integrated Oracle capture/apply No ETL Engines – Take the processing to the data; don’t move the data to the process – Leverage your data engines for the workloads (Hadoop or SQL) Most Heterogeneous – Leverage open source Hadoop, not proprietary distributions – Hadoop is the Hub, not ETL tools – Open metadata standards Oracle Data Integrator (Transformation) Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate) Fast Load Oracle GoldenGate (Movement) Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance) Data Service Integrator (Federation) GoldenGate Veridata (Online Data Verification) ELT Processing on Hadoop or SQL Continuous Availability
  • 8. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Data Reservoir Use Case with Oracle Data Integration Oracle Confidential – Internal/Restricted/Highly Restricted 8 Oracle Data Integrator Logs OLTP Databases Social Media Sensor Data Data Warehouses, Datamarts Pig Sqoop Initial Load Sqoop Load OLH / OSCH Big Data SQL File Load CDC to HDFS, Hive, Flume, HBase Oracle GoldenGate Oracle Enterprise Metadata Management Oracle Data Service Integrator Federated Queries Oracle Enterprise Data Quality Impala Transformations with HDFS, Hive, Hbase, Pig
  • 9. Logical and Physical Design with ODI. Logical Design: Oracle, MySQL, Hive. Physical Design: Sqoop, Sqoop, IKM, LKM, LKM; Oracle, Hive, MySQL, Hive
  • 10. Design Once, Run Anywhere • Use native technologies for any data source – Data Locality – Optimal performance, reduced network traffic • No proprietary middle tier – Reduced infrastructure cost and maintenance effort • Declarative design – Simplified development – Reusable across technologies. Diagram labels: Languages and Tools; Runtime Environments; Hive; Agent; Sqoop; Big Data SQL; OLH; OSCH; Future Languages; Future Runtime Engines
  • 11. Oracle GoldenGate Adapter – Big Data Use Cases. Diagram labels: Source Database; Log File; Capture; Capture Parameter File; Trail File; Pump; Pump Parameter file; Java Adapter; Adapter Property file; Adapter Jar file; HDFS file; Hive; HBase; Flume (Source, Channel, Sink); Other Custom Targets
  • 12. Agenda: 1. Oracle Data Integration Overview; 2. Customer Cases and Best Practices; 3. Big Data Demo; 4. Q&A and For More Information • OOW Data Integration Sessions and Additional Resources
  • 13. Introduction • Michael Rainey • Principal Consultant - Rittman Mead • Oracle Data Integration expert – Oracle Data Integrator and Oracle GoldenGate • Oracle ACE • Twitter: @mRainey
  • 14. About Rittman Mead • Oracle Gold partner – World leading specialist partner for technical excellence, solutions delivery and innovation in Oracle BI – Provide consulting, training, managed services for customers worldwide • 120+ consultants including 1 Oracle ACE Director, 3 Oracle ACEs and 1 Oracle ACE Associate – All expert in Oracle BI, DW, EPM and Analytics tech – Skills in broad range of supporting Oracle tools: OBIEE, OBIA, ODIEE, Essbase, Oracle OLAP, GoldenGate, Exadata, Endeca • Blog: www.rittmanmead.com/blog Twitter: @rittmanmead
  • 15. Customer Challenge • Company has subscribers with in-home devices • Company wishes to improve customer experience • Log data can potentially help identify issues, but is difficult to access and read • …and there’s a lot of data!
  • 16. Big Data Solution • 6 Node Big Data Appliance (BDA) • Process scheduled via cron jobs: (1) Extract data from XML logs via python script; (2) Load data to HDFS using copyFromLocal command (bin/hadoop dfs -copyFromLocal); (3) Filter, format, sort data using Oracle R; (4) Aggregate & transform data using python scripts & HiveQL; (5) Load to Oracle DB via Sqoop
  • 17. Wait, this looks familiar… • Looks like a standard data integration project! • Scripts written to extract, load, and transform data • Source data and transformations evolving • But something is missing – Scheduling, process flow, monitoring, data quality – Standardization and maintainability
  • 18. Transition to an ETL tool • Initial thought…Informatica – Client has experience with product • Why Oracle Data Integrator? – Extensibility - “Design Once…” – No middle ETL engine – Data Quality • And…it’s licensed with their BDA!
  • 19. Big Data Solution using ODI 12c. ODI components shown: ODI Procedure; IKM Hive Transform; IKM Hive Control Append; IKM File-Hive to SQL (SQOOP). Steps: Extract data from XML logs via python script; Load data to HDFS using copyFromLocal command (bin/hadoop dfs -copyFromLocal); Filter, format, sort data using Oracle R; Aggregate & transform data using python scripts & HiveQL; Load to Oracle DB via Sqoop
  • 20. What we learned along the way… • HiveQL <> Oracle SQL – For Hive KMs, check the Generate ANSI Syntax checkbox; Hive expects table joins in ANSI format rather than the “Oracle” format • Begin with scripts, but have ODI Application Adapters for Hadoop in mind • Utilize the skills your available resources have – Not everyone can write MapReduce code
  • 21. Agenda: 1. Oracle Data Integration Overview; 2. Customer Cases and Best Practices; 3. Big Data Demo; 4. Q&A and For More Information • OOW Data Integration Sessions and Additional Resources
  • 22. Data Integration Demo. Diagram labels: Oracle Data Integrator; Oracle GoldenGate; Flume; Application Logs; Activity; Process Activity (Hive); ActivityClean; Sessionize Activity (Pig OS Call); SessionStats; MySQL DB; SQOOP; OGG (HDFS/Flume); Movie; Movie; MovieRating; MovieRating; Calculate Rating (Hive); Customer; Customer; Calc Purchases (Oracle); CountrySales; Load Oracle Big Data SQL; Load Oracle OLH/OSCH
  • 23. Agenda: 1. Oracle Data Integration Overview; 2. Customer Cases and Best Practices; 3. Big Data Demo; 4. Q&A and For More Information • OOW Data Integration Sessions and Additional Resources
  • 24. 2014 Oracle Excellence Award Ceremony for Fusion Middleware Innovation. Oracle Fusion Middleware: Celebrate This Year's Most Innovative Customer Solutions. Tuesday, September 30, 2014, 5:00-5:45pm, YBCA Theater (next to Moscone North). Session ID: CON7029
  • 25. Resources. Oracle Data Integration; OracleGoldenGate; ORCL DataIntegration; blogs.oracle.com/dataintegration. Products: Oracle Data Integrator; Oracle GoldenGate; Oracle Enterprise Data Quality; Oracle Enterprise Metadata Management; Oracle Data Services Integrator. Data Integration: http://www.oracle.com/us/products/middleware/data-integration/overview/index.html
  • 26. Questions and Answers
  • 27. Oracle DIS Sessions @ OOW ’14 – Oracle GoldenGate (MON, TUE, WED, THU): 2:45PM – CON7717 Oracle GoldenGate New Features & Options Product Update; 4:00PM – CON7716 Oracle GoldenGate 12c for Oracle Database 12c; 5:15PM – CON7719 Enabling Real-Time Data Integration for Big Data; 10:45AM – CON7715 Oracle Active Data Guard & Oracle GoldenGate for HA; 12:00PM – CON7328 Near-Zero Downtime Unicode Migration for Oracle; 12:00PM – CON774 Oracle GoldenGate for Cloud; 6:00PM – BOF9597 International Oracle GoldenGate User Group Meeting; 3:30PM – CON7934 Tapping into the Big Data Reserve with All Data; 4:45PM – CON7922 Tame Big Data with Oracle Data Integration; 4:45PM – CON7773 Oracle GoldenGate Performance Tuning for Oracle Database; 10:45AM – CON7655 Achieving Zero Downtime During Oracle Application Upgrades & System Migrations; 1:15PM – CON7718 Managing & Monitoring Oracle GoldenGate
  • 28. Oracle DIS Sessions @ OOW ’14 – Oracle Data Integrator (MON, TUE, WED, THU): 4:00PM – CON7899 Oracle Data Integrator: Product Update and Future Strategy; 5:00PM – CON7820 Making the Move from Oracle Warehouse Builder to Oracle Data Integrator; 3:30PM – CON7934 Tapping into the Big Data Reserve with All Data; 4:45PM – CON7922 Tame Big Data with Oracle Data Integration; 9:30AM – CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration; 10:45AM – CON7923 Oracle Data Integration & Metadata Management for Seamless Enterprise; 2:30PM – CON7921 Insight into Action: Business Intelligence Applications and Oracle Data Integrator
  • 29. Oracle DIS Sessions @ OOW ’14 – Enterprise Data Quality (MON, TUE, WED, THU): 11:45AM – CON7776 Data Quality Maturity Journey: Building Toward Strong Enterprise Data Quality; 10:45AM – CON7780 Oracle Enterprise Data Quality: Product Overview and Roadmap; 2:00PM – CON7775 The Essential Core of Data Governance with Oracle Enterprise Data Quality; 3:30PM – CON7934 Tapping into the Big Data Reserve with All Data; 4:45PM – CON7922 Tame Big Data with Oracle Data Integration; 12:00PM – CON7931 Solving Big Data’s Big Problem with Data Preparation & Enrichment in the Cloud
  • 30. Oracle DIS Hands-on Labs @ OOW ’14 (Hotel Nikko, Nikko Ballroom II, 22 Mason Street). ODI: Tuesday 3:45PM – HOL9439 • Oracle Data Integrator 12c New Features Deep Dive; Tuesday 5:15PM – HOL9414 • Oracle Data Integrator for Big Data. OGG: Monday 1:15PM – HOL9437 • Oracle GoldenGate 12c New Features Deep Dive; Wednesday 4:15PM – HOL9436 • Pushing Transactions to JCache with Coherence and GoldenGate; Thursday 10AM – HOL9413 • Oracle GoldenGate Heterogeneous Replication. EDQ: Monday 2:45PM – HOL9438 • Oracle Enterprise Data Quality Introduction
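
The first two steps of the script-based pipeline on slide 16 (extract data from XML logs with a Python script, then load it to HDFS with copyFromLocal) could look roughly like the sketch below. The slides do not show the real log schema or file paths, so the `<event>` layout, the field names and both paths are hypothetical. The load command is only built as an argument list here, not executed.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical log layout: the deck does not show the real XML schema.
SAMPLE_LOG = """<log>
  <event device="d1" ts="2014-09-30T10:00:00" code="E42"/>
  <event device="d2" ts="2014-09-30T10:00:05" code="OK"/>
</log>"""


def extract_rows(xml_text):
    """Flatten each <event> element into a (device, ts, code) tuple."""
    root = ET.fromstring(xml_text)
    return [(e.get("device"), e.get("ts"), e.get("code"))
            for e in root.iter("event")]


def to_tsv(rows):
    """Serialize rows as tab-delimited text, a format Hive tables can read."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


def copy_from_local_cmd(local_path, hdfs_dir):
    """Build (but do not run) the HDFS load command named on the slide."""
    return ["bin/hadoop", "dfs", "-copyFromLocal", local_path, hdfs_dir]


rows = extract_rows(SAMPLE_LOG)
tsv = to_tsv(rows)
# Both paths are placeholders chosen for illustration.
cmd = copy_from_local_cmd("/tmp/events.tsv", "/data/logs/")
```

In a cron-driven setup such as the one the slide describes, a wrapper would write `tsv` to the local path and pass `cmd` to `subprocess.run`. The deck's point is that this kind of glue has no built-in scheduling, monitoring or data quality handling, which is what the move to ODI addresses.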

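Slide 20 notes that Hive expects ANSI join syntax rather than the “Oracle” format. The sketch below assumes that “Oracle format” means a comma-separated FROM list with the join predicate in the WHERE clause. It uses an in-memory SQLite database only as a stand-in that accepts both styles, not Hive or Oracle, and the table and column names are invented for illustration.

```python
import sqlite3

# SQLite is a stand-in engine here; it is not Hive or Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER, name TEXT);
CREATE TABLE activity (cust_id INTEGER, movie TEXT);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO activity VALUES (1, 'Heat'), (1, 'Alien'), (2, 'Up');
""")

# Assumed "Oracle" style: tables listed with commas, join condition in WHERE.
ORACLE_STYLE = """
SELECT c.name, a.movie
FROM customer c, activity a
WHERE c.id = a.cust_id
ORDER BY c.name, a.movie
"""

# ANSI style: explicit JOIN ... ON, the shape the slide says Hive expects.
ANSI_STYLE = """
SELECT c.name, a.movie
FROM customer c
JOIN activity a ON c.id = a.cust_id
ORDER BY c.name, a.movie
"""

oracle_rows = conn.execute(ORACLE_STYLE).fetchall()
ansi_rows = conn.execute(ANSI_STYLE).fetchall()
```

Both queries return the same rows on an engine that accepts both forms. According to the slide, the Generate ANSI Syntax checkbox is what makes ODI emit the second form for Hive.
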
Editor's Notes

  • #3: Big Data is the New Fuel for the Enterprise. It’s a clean fuel that comes from renewable sources. It’s perishable if not used regularly. It’s combustible and can have explosive impacts.