© Hortonworks Inc. 2015
Protecting Enterprise Data in Apache Hadoop
April 2015
Owen O’Malley
owen@hortonworks.com
@owen_omalley
Security
Security Architecture
Attack Vectors
Attack Vectors
Threat: Accidental Damage
Threat: Remote Access
Threat: Eavesdropping
Threat: User accesses private data
Threat: Physical access
Threat: Hadoop Admin in Cluster
HDFS Encryption
KeyProvider API
Encryption Scheme
Threat: User reads private columns
ORC File Layout
[Diagram: an ORC file is a sequence of 256 MB stripes followed by a File Footer and a Postscript. Each stripe holds Index Data, Row Data, and a Stripe Footer. The index and row data are laid out by column (Column 1 through Column 8), and each column is stored as a set of streams (Stream 2.1 through Stream 2.4 for Column 2).]
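
The layout labelled in the diagram can be sketched as a small data model. This is only an illustration under stated assumptions, not the ORC library's API: the class names (`OrcFile`, `Stripe`, `Column`, `Stream`) and the helpers `make_column` and `build_example_file` are hypothetical, and the choice of three stripes, eight columns, and four streams per column simply mirrors the labels on the slide. The ordering of stripes before the file footer and postscript follows the ORC file format.

```python
from dataclasses import dataclass, field
from typing import List

STRIPE_SIZE_MB = 256  # stripe size as labelled on the slide


@dataclass
class Stream:
    name: str  # e.g. "Stream 2.1" (hypothetical naming taken from the diagram)


@dataclass
class Column:
    number: int
    streams: List[Stream] = field(default_factory=list)


@dataclass
class Stripe:
    index_data: List[Column]  # per-column index entries
    row_data: List[Column]    # per-column data, each split into streams
    footer: str = "Stripe Footer"


@dataclass
class OrcFile:
    stripes: List[Stripe]
    file_footer: str = "File Footer"
    postscript: str = "Postscript"

    def sections(self) -> List[str]:
        """List the top-level sections in on-disk order."""
        out: List[str] = []
        for i, _ in enumerate(self.stripes, start=1):
            out.append(f"Stripe {i}: Index Data")
            out.append(f"Stripe {i}: Row Data")
            out.append(f"Stripe {i}: Stripe Footer")
        out.append(self.file_footer)
        out.append(self.postscript)
        return out


def make_column(number: int, n_streams: int) -> Column:
    """Build a column whose streams are named like the diagram labels."""
    streams = [Stream(f"Stream {number}.{k}") for k in range(1, n_streams + 1)]
    return Column(number, streams)


def build_example_file() -> OrcFile:
    """Assumed example: 3 stripes, 8 columns, 4 streams per column."""
    columns = [make_column(n, 4) for n in range(1, 9)]
    stripes = [Stripe(index_data=columns, row_data=columns) for _ in range(3)]
    return OrcFile(stripes)


if __name__ == "__main__":
    for section in build_example_file().sections():
        print(section)
```

Modelling streams as children of a column reflects the point of the diagram: because each column's data sits in its own streams, a reader can in principle fetch or skip columns individually within a stripe.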
Threat: User reads hidden values
Threat: Shadow Security
Resources
Thank You!