©2015 MFMER | slide-1
Securing Enterprise Healthcare Big Data
by the Combination of Knox/F5, Ranger,
TFA and Kerberos Coupled With
Enterprise Active Directory and LDAP
Dequan Chen, Ph.D.
Mayo Clinic Big Data Technology Services Team
chen.dequan@mayo.edu; 507-208-1599
San Jose McEnery Convention Center
June 13, 2017
Outline
• Data Security – Critical to the Success of Healthcare Business at Mayo Clinic
  ➢ Enterprise Healthcare Big Data at Mayo Clinic
  ➢ Personally Identifiable Information (PII) Data
  ➢ Protected Health Information (PHI) Data
• Mayo Clinic Big Data Clusters & Evolution
• Securing Enterprise Healthcare Big Data on Mayo Clinic Hadoop Clusters
• On-going & Future Direction
• Conclusion
Data Security – Critical to the Success of
Healthcare Business at Mayo Clinic
Mayo Clinic Healthcare - Integrated with Research and Education
• World’s largest integrated not-for-profit healthcare system, with > 70 hospitals and clinics
  ➢ Enterprise core value: The Needs of the Patient Come First
• Provides care annually for > 1 million patients (1,317,900 in 2014) from all 50 states & > 150 countries
• Generates large amounts of EHR (EMR) data daily
  ➢ Structured
  ➢ Semi-Structured
  ➢ Unstructured
Mayo Clinic in Rochester, Minn., was ranked #1 on the U.S. News & World Report "Best Hospitals" list for 2014-2015 and 2016-2017
Enterprise Healthcare Big Data at Mayo Clinic
• Enterprise Healthcare Big Data on Mayo Clinic Hadoop Clusters
  ➢ HL7 messages, their parsed data, or their JSON derivatives – a mix of semi-structured and unstructured EHR data
  ➢ Enterprise-level clinical usage (diagnosis, treatment, prevention, or clinical reporting)
  ➢ Enterprise-level non-clinical usage (research, business intelligence, or health information exchange)
Enterprise Healthcare Big Data Security Needs
• With Personally Identifiable Information (PII) Data
  ➢ Any data that can be used to contact, locate or identify a specific individual, either by itself or combined with other easily accessed sources
  ➢ May include information that is linked to an individual through financial, medical, educational or employment records
  ➢ Identifying data elements can include fingerprints, other biometric data, a name, telephone number, email address or social security number
  ➢ Federal laws requiring secure handling of PII data: HIPAA, Privacy Act, GLBA, FERPA, COPPA, and FCRA
• With Protected Health Information (PHI) Data
  ➢ Any individually identifiable health information created or received by a covered entity: a health care provider, a health plan, or a health care clearinghouse
  ➢ May relate to an individual’s past, present or future physical or mental health or condition
  ➢ Maintained or transmitted in any form, including oral, paper, or electronic
  ➢ Excludes education records covered by the Family Educational Rights and Privacy Act (FERPA) and employment records maintained by a covered entity
  ➢ Federal law requiring secure handling of PHI data: HIPAA
Mayo Clinic Big Data Clusters & Evolution
Mayo Clinic Big Data Clusters
• Teradata Appliance with SUSE Linux Enterprise Server
• Each Hadoop Cluster Coupled with One ElasticSearch (ES) Cluster on Selected Edge Nodes
• Separate HDF (NiFi) Clusters (not presented here)
• (Hadoop + ES) Clusters
  Normal: Sandbox, Dev, Test(Int)* and Prod
  Disaster Recovery (DR): Dev and Prod
• Data Storage on (Hadoop + ES) Clusters
  Permanent
  ➢ HDFS folders/files
  ➢ HBase tables
  ➢ ES indexes
  Temporary/Permanent
  ➢ Hive tables
Mayo Clinic (Hadoop + ES) Clusters Evolution
• Hadoop/ES Cluster HDP + ES Evolution
  ➢ TDH/HDP 1.3.2 + ES (v1.0.0) (Un-Kerberized)
  ➢ TDH/HDP 2.1.2 + ES (v1.3.2..v1.5.2) (Un-Kerberized)
  ➢ HDP 2.1.11 + ES (v1.5.2) (Secured: Local KDC + ES Shield via AD/LDAP)
  ➢ HDP 2.3.4 + ES (v1.5.2..v1.7.2..v2.1.2..v2.3.2..v2.4.1..v2.4.4) (Secured: AD/LDAP etc.)
  ➢ HDP 2.5.3 + ES (v2.4.4) (Secured: AD/LDAP etc.)
Securing Enterprise Healthcare Big Data
on Mayo Clinic Hadoop Clusters
Security Adopted on Mayo Clinic Hadoop
Clusters
• Kerberos Security
  o Coupled with enterprise Active Directory (AD) using AD KDC
• Coupled with Lightweight Directory Access Protocol (LDAP) over SSL
  o Critical HDP services + ElasticSearch service
• Two-Factor Authentication (TFA) Login and Sudo Capability Post OS-Hardening
  o Only for a limited number of authorized users / applications on local entry node(s)
  o Root login disabled
• Ranger Plugins and Policies
• HDFS/Hive/HBase Data Ops - Knox Gateway/F5
  o The majority of users or applications
Kerberos with Active Directory
• Kerberized Using Enterprise Active Directory (AD) KDC
  o Provides a host of extensions and conveniences, such as password expiration and account lockout
  o Authentication and authorization
  o AD user (principal) name + password
  o AD user (principal) name + keytab
  o auth_to_local rules needed for HDFS, Oozie, Storm, Kafka, Ranger KMS, and Atlas
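The effect of auth_to_local rules can be illustrated with a small sketch. It is not Hadoop's rule syntax or implementation; it only mimics the outcome of mapping a Kerberos principal to a local short name. The realm EXAMPLE.COM, the service-to-user table, the lowercasing step, and the function name to_local_user are all assumptions made for illustration.

```python
# Hypothetical realm and service mapping; the deck does not list the real ones.
DEFAULT_REALM = "EXAMPLE.COM"
SERVICE_USERS = {"nn": "hdfs", "dn": "hdfs", "hive": "hive", "oozie": "oozie"}


def to_local_user(principal, realm=DEFAULT_REALM, service_users=SERVICE_USERS):
    """Map 'primary[/instance]@REALM' to a local short name.

    User principals (no instance part) map to the lowercased primary.
    Service principals (with an instance, usually a host name) map through
    the service table, falling back to the primary itself.
    """
    name, _, princ_realm = principal.partition("@")
    if princ_realm != realm:
        raise ValueError(f"principal {principal!r} is not in realm {realm}")
    primary, _, instance = name.partition("/")
    if instance:
        return service_users.get(primary, primary)
    # Assumption: directory account names may be mixed case, local names lowercase.
    return primary.lower()


if __name__ == "__main__":
    print(to_local_user("JDoe@EXAMPLE.COM"))
    print(to_local_user("nn/master1.example.com@EXAMPLE.COM"))
```

In a real cluster this mapping is expressed as rules in the Hadoop configuration rather than in application code; the sketch only shows why components that resolve principals to local accounts need such rules.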
LDAP + SSL == LDAPS
• User Authentication/Authorization Also Uses the LDAP Protocol for Some Hadoop Component Services
  o Ambari, Ranger / Ranger KMS, Knox, Grafana, Atlas, Hue, ES
  o LDAP over SSL (LDAPS) certificate – Mayo Clinic Comodo certs
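As a rough illustration of what "LDAP over SSL" implies on the client side, the sketch below opens a TLS-wrapped connection and reads the server certificate using only the Python standard library. The port 636 is the conventional LDAPS port and is an assumption here (the deck does not state the port), as are the function names and the optional CA bundle path.

```python
import socket
import ssl

LDAPS_PORT = 636  # conventional LDAPS port; assumed, not stated in the deck


def make_ldaps_context(ca_file=None):
    """Return a TLS context that verifies the server certificate and host name.

    ca_file is an optional path to a CA bundle (hypothetical); when omitted,
    the system trust store is used.
    """
    return ssl.create_default_context(cafile=ca_file)


def fetch_server_certificate(host, port=LDAPS_PORT, ca_file=None, timeout=10.0):
    """Connect over TLS and return the validated peer certificate as a dict."""
    context = make_ldaps_context(ca_file)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()
```

The handshake fails if the certificate chain or host name does not validate, which is the protection the certificate provides before any LDAP credentials are sent.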
TFA & Sudo Capability Post OS-Hardening
• TFA Used for Local Users on Cluster Nodes
  o Root login disabled
  o Node-specific local user name + password
  o Passcode generated on-the-fly from user’s mobile device – RSA
  o Sudo capability only for a limited number of users
  o Post TFA login, Kerberos authentication against AD is required
Ranger Plugins and Policies
• Ranger Policies Control Which Users or Groups Are Authorized to Operate (CRUD etc.) on the Specified Data or Services
  o Data in HDFS files/folders, Hive databases/tables, HBase namespaces/tables, (Solr collections/documents), Atlas metadata
  o Services of YARN, Storm, Kafka, Knox
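The idea of a policy that ties users or groups to permitted operations on a resource can be sketched as follows. This is a simplified model for illustration only, not Ranger's policy engine: real Ranger policies also support deny items, exclusions and wildcards. The class name, paths, group names and access types are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HdfsPolicy:
    """A toy allow-policy: who may do what under which HDFS path."""
    path: str
    recursive: bool
    accesses: frozenset
    users: frozenset = frozenset()
    groups: frozenset = frozenset()


def covers(policy, path):
    """True if the policy's resource path applies to the requested path."""
    base = policy.path.rstrip("/") or "/"
    if path == base:
        return True
    prefix = base if base.endswith("/") else base + "/"
    return policy.recursive and path.startswith(prefix)


def is_allowed(policies, user, user_groups, path, access):
    """Allow only if some policy covers the path, the access type, and the caller."""
    groups = frozenset(user_groups)
    return any(
        covers(p, path)
        and access in p.accesses
        and (user in p.users or bool(p.groups & groups))
        for p in policies
    )


if __name__ == "__main__":
    policies = [
        HdfsPolicy(
            path="/data/hl7",
            recursive=True,
            accesses=frozenset({"read", "execute"}),
            groups=frozenset({"clinical_analysts"}),
        )
    ]
    print(is_allowed(policies, "jdoe", ["clinical_analysts"], "/data/hl7/2017/msg.json", "read"))
    print(is_allowed(policies, "jdoe", ["clinical_analysts"], "/data/hl7/2017/msg.json", "write"))
```

Anything not matched by a policy is denied in this sketch, which mirrors the deck's point that access is granted per user or group and per resource.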
Ranger Plugins and Policies (cont'd)
  o Example list of HDFS policies: [screenshot not reproduced]
  o Example of an HDFS policy: [screenshot not reproduced]
• Ranger Also Performs Data or Service Auditing
  o Example of the hdfs user accessing the Hive service: [screenshot not reproduced]
Knox Gateway - HDFS/Hive/HBase Data Ops
• Required for the Majority of Mayo Clinic Big Data Clients (Users/Applications)
• Non-Secure Hive Shell Has Been Disabled
  o Hive CLI ops are forced to use beeline
• No Keytabs Issued for HDFS/Hive/HBase Data Ops by a Client Application Outside Mayo Clinic Hadoop Clusters
  o Regular HDFS remote Java client using keytabs – Not allowed
  o Hive JDBC remote using keytabs – Not allowed
  o HBase remote Java client using keytabs – Not allowed
F5/Knox - HDFS/Hive/HBase Data Ops
• WebHDFS, WebHCat/Knox-Hive JDBC and WebHBase for Data Ops of HDFS, Hive and HBase, Respectively, via Knox Gateway
  o HA (high availability) is achieved using 2 or more WebHDFS, WebHCat and WebHBase (Stargate HBase) services on the master nodes
  o Knox-Hive JDBC requires the user’s AD user name & password
• F5 Balancer Over Two or More Knox Gateway Services for Each Hadoop Cluster Is Used to Achieve Knox Gateway Service HA (+ More Protection)
• HA Example – WebHBase: [screenshot not reproduced]
Data Ops via F5 Balancer / Knox Gateway
• Example – WebHDFS Data via Web Browser or Curl Cmd
  o Web UI & results
    https://f5balanceryyyy:port1/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN
    90000.0,125000.0,Texas,120000.0,45500.0,250000.0,110000.0,140000.0,7,3,113642.85714285714,
    .
    https://knoxgatewayzzz1:ddd7/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN
    or
    https://knoxgatewayzzz2:ddd7/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN
    90000.0,125000.0,Texas,120000.0,45500.0,250000.0,110000.0,140000.0,7,3,113642.85714285714,
    .
  o Curl Cmd & results
    curl -i -k -u xxxxxxxx -X GET -L \
      "https://f5balanceryyyy:port1/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN"
    90000.0,125000.0,Texas,120000.0,45500.0,250000.0,110000.0,140000.0,7,3,113642.85714285714,
    .
    curl -i -k -u xxxxxxxx -X GET -L \
      "https://knoxgatewayzzz1:ddd7/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN"
    or
      "https://knoxgatewayzzz2:ddd7/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN"
    90000.0,125000.0,Texas,120000.0,45500.0,250000.0,110000.0,140000.0,7,3,113642.85714285714,
    .
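The same WebHDFS OPEN request can be issued from a program. The following is a minimal sketch using only the Python standard library; the gateway address, topology name, file path and credentials are placeholders, and the helper names are hypothetical. Unlike the curl examples, which pass -k to skip certificate checks, this sketch leaves certificate verification on.

```python
import base64
import urllib.request
from urllib.parse import quote, urlencode


def build_webhdfs_url(gateway, topology, hdfs_path, op="OPEN"):
    """Build a Knox-style WebHDFS URL of the shape shown in the examples above.

    gateway is 'host:port' of the F5 balancer or a Knox instance (placeholder).
    """
    path = quote(hdfs_path.lstrip("/"))
    return f"https://{gateway}/gateway/{topology}/webhdfs/v1/{path}?{urlencode({'op': op})}"


def open_file(url, user, password, timeout=30.0):
    """Fetch the file content using HTTP Basic authentication.

    urllib follows HTTP redirects by default, comparable to curl's -L flag.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Placeholder values; substitute a real gateway, topology and path.
    print(build_webhdfs_url("f5.example.org:8443", "default", "/user/zzzzz/test/result.txt"))
```

Pointing the client at the balancer address rather than at an individual Knox host keeps the client unaware of which gateway instance serves the request.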
How Do WebHDFS Data Ops via Knox Gateway Work?
  o Complex authentication and authorization process
    ▪ Ranger HDFS plugin allows access to the HDFS data
    ▪ Ranger Knox plugin allows access to the HDFS interface through Knox
    ▪ The access path is likely to be:
      ✓ A user connects/authenticates to Knox via HTTPS (SSL protects credentials)
      ✓ Knox checks the user’s credentials via LDAPS (SSL protects credentials)
      ✓ Ranger Knox plugin allows access to WebHDFS (Ranger Knox service-level authorization)
      ✓ Knox authenticates to AD (Kerberos protects Knox credentials)
      ✓ AD grants Knox a ticket-granting ticket (TGT)
      ✓ Knox requests a WebHDFS service ticket from AD
      ✓ AD grants Knox a service ticket (ST) for WebHDFS
      ✓ Knox passes the user as a proxyuser to WebHDFS
      ✓ The user tries to access the HDFS file XYZ
        ➢ Ranger HDFS plugin checks whether a policy exists for the user and the HDFS file XYZ (Ranger HDFS authorization); HDFS checks native authorization for the user and the HDFS file XYZ (HDFS authorization) → authorization is granted or denied
        ➢ Only when access is authorized is the data in the HDFS file XYZ retrieved and returned to the client application (Web Browser or CLI)
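The order of the checks in this access path can be summarized in a short sketch. It models only the decision sequence, with each check passed in as an already-known outcome; the Kerberos ticket exchange between Knox and AD is omitted. How the Ranger HDFS check and the native HDFS check combine is an assumption here (Ranger decides when a matching policy exists, otherwise native HDFS permissions decide), and all names are hypothetical.

```python
from enum import Enum


class RangerDecision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NO_POLICY = "no_policy"  # no Ranger HDFS policy matches the user and file


def authorize_webhdfs_read(ldaps_login_ok, knox_policy_allows, ranger_hdfs, native_hdfs_allows):
    """Return (allowed, reason) following the order of checks in the access path."""
    if not ldaps_login_ok:
        return False, "Knox: LDAPS authentication failed"
    if not knox_policy_allows:
        return False, "Ranger Knox plugin: WebHDFS access denied"
    if ranger_hdfs is RangerDecision.ALLOW:
        return True, "Ranger HDFS plugin: policy allows"
    if ranger_hdfs is RangerDecision.DENY:
        return False, "Ranger HDFS plugin: policy denies"
    # Assumed fallback: with no matching Ranger policy, native HDFS permissions decide.
    if native_hdfs_allows:
        return True, "HDFS native permissions allow"
    return False, "HDFS native permissions deny"


if __name__ == "__main__":
    print(authorize_webhdfs_read(True, True, RangerDecision.NO_POLICY, False))
```

The sketch makes the layering visible: a request rejected at the gateway never reaches the HDFS-level checks.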
Access Secured ElasticSearch (ES) Data
• Using REST API via Web UI or Curl Cmd
    https://elasticsearchuixxxx:ddd6/estest/greeting/_search?pretty=true&q=title:Hello
    curl -v --user xxxxxxx -XGET "https://elasticsearchuixxxx:ddd6/estest/greeting/_search?pretty=true&q=title:Hello"
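For the REST route, a small sketch of how such a search URL and its Basic authentication header might be assembled programmatically. The host, index, type and credentials are placeholders and the helper names are hypothetical. The index/type/_search URL shape follows the ES 1.x/2.x versions named earlier in the deck, and query parameters are joined with `&`.

```python
import base64
from urllib.parse import urlencode


def build_search_url(host, index, doc_type, query, pretty=True):
    """Build an index/type/_search URL with a URI-search 'q' parameter."""
    params = {"pretty": "true" if pretty else "false", "q": query}
    # safe=":" keeps 'field:value' readable instead of percent-encoding the colon.
    return f"https://{host}/{index}/{doc_type}/_search?{urlencode(params, safe=':')}"


def basic_auth_header(user, password):
    """Return the HTTP Basic Authorization header for the given credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Placeholder host and credentials.
    print(build_search_url("es.example.org:9200", "estest", "greeting", "title:Hello"))
    print(basic_auth_header("someuser", "secret"))
```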
Access Secured ElasticSearch (ES) Data (cont'd)
• Using Java API via Transport-TCP [code screenshot not reproduced]
On-going & Future Direction
• Big Data Network Segmentation - Whitelisting
  o List of URLs or IP addresses allowed for Mayo Clinic healthcare business
    ➢ Hadoop cluster-specific
    ➢ User/application client computer-specific
  o Implemented by Mayo Network team and Mayo Clinic BDTS team
  o Permits data/service operations by any user / client via the allowed list of IP connections while blocking all others
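The allow-list rule (permit listed addresses, block everything else) can be sketched with the Python standard library. The networks and addresses below are made-up examples, and in practice such rules are enforced on network devices rather than in application code.

```python
import ipaddress


def is_permitted(client_ip, allowed_networks):
    """Return True only if client_ip falls inside one of the allowed networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(net) for net in allowed_networks)


if __name__ == "__main__":
    # Hypothetical allow-list: one client subnet and one single host.
    allowed = ["10.20.0.0/16", "192.0.2.15/32"]
    for ip in ("10.20.30.40", "192.0.2.15", "198.51.100.7"):
        print(ip, "permitted" if is_permitted(ip, allowed) else "blocked")
```

Keeping separate lists per Hadoop cluster and per client computer, as the slide describes, would amount to calling such a check with a different `allowed_networks` value for each side.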
• Big Data At-Rest Encryption Enhancement
  o Drive (disk) encryption
  o Client-managed encrypted HDFS, Hive, HBase or ES data
  o Ranger KMS-managed encryption keys and data decryption for HDFS, Hive and HBase data in the encryption zones
    ➢ For HDP v2.3.4 and earlier versions, HDFS data in the encryption zones can be retrieved only by CLI, not via Knox Gateway
Conclusion
• Enterprise Healthcare Big Data on Mayo Clinic Hadoop Clusters Have Been Successfully Protected by
  o Enterprise Kerberos
  o Active Directory
  o LDAP Over SSL
  o OS Hardening/TFA
  o Ranger
  o Knox Gateway/F5 Balancer
• Underway at Mayo Clinic - Enhancement of Enterprise Healthcare Big Data Security by
  o Network Segmentation
  o Data-At-Rest Encryption via Ranger KMS
• Achieved Successful Data Ops on Enterprise-Secured Healthcare Big Data
References
• Mayo Clinic: http://www.mayoclinic.org/
• PII: http://privacyoffice.med.miami.edu/faq/privacy-faqs/what-is-personally-identifiable-information-pii
• PHI: https://www.hipaa.com/hipaa-protected-health-information-what-does-phi-include/
• Hadoop Stack: http://hortonworks.com
• ElasticSearch: https://www.elastic.co/
• CS Paper of Top IEEE Journal: Chen et al. Real-Time or Near Real-Time Persisting Daily Healthcare Data Into HDFS and ElasticSearch Index Inside a Big Data Platform. IEEE Transactions on Industrial Informatics, vol. 13, no. 2, pp. 595-606, April 2017
Questions & Discussion
chen.dequan@mayo.edu; 507-208-1599
Personal Email: dequanchen2007@gmail.com
LinkedIn: https://www.linkedin.com/in/dequan-chen-5b0a37bb/

Securing Enterprise Healthcare Big Data by the Combination of Knox/F5, Ranger, TFA and Kerberos Coupled with Enterprise Active Directory and LDAP

  • 1. ©2015 MFMER | slide-1 Securing Enterprise Healthcare Big Data by the Combination of Knox/F5, Ranger, TFA and Kerberos Coupled With Enterprise Active Directory and LDAP Dequan Chen, Ph.D. Mayo Clinic Big Data Technology Services Team [email protected]; 507-208-1599 San Jose McEnery Convention Center June 13, 2017
  • 2. ©2015 MFMER | slide-2 Outlines ‱ Data Security – Critical to the Success of Healthcare Business at Mayo Clinic  Enterprise Healthcare Big Data at Mayo Clinic  Personally Identifiable Information (PII) Data  Protected Health Information (PHI) Data ‱ Mayo Clinic Big Data Clusters & Evolution ‱ Securing Enterprise Healthcare Big Data on Mayo Clinic Hadoop Clusters ‱ On-going & Future Direction ‱ Conclusion
  • 3. ©2015 MFMER | slide-3 Data Security – Critical to the Success of Healthcare Business at Mayo Clinic
  • 4. ©2015 MFMER | slide-4 Mayo Clinic Healthcare - Integrated with Research and Education ‱ World’s largest integrated not-for-profit healthcare system – > 70 hospitals and clinics  Enterprise core value: The Needs of the Patient Come First ‱ Provides care for > 1m (1,317,900 in 2014) patients from all 50 states & > 150 countries annually ‱ Daily generates large amounts of EHR (EMR) Data  Structured  Semi-Structured  Unstructured
  • 5. ©2015 MFMER | slide-5 Mayo Clinic Rochester, Minn. recognized as the #1 in the "Best Hospitals“ list of USA for 2014-2015, and 2016-2017 by U.S. News & World Report
  • 6. ©2015 MFMER | slide-6 Enterprise Healthcare Big Data at Mayo Clinic ‱ Enterprise Healthcare Big Data on Mayo Clinic Hadoop Clusters  HL7 messages or their parsed data or their json derivatives – mix of semi- and un-structured EHR data  Enterprise-level clinical usage (diagnosis, treatment, prevention, or clinical reporting)  Enterprise-level non-clinical usage (research, business intelligence, or health information exchange)
  • 7. ©2015 MFMER | slide-7 Enterprise Healthcare Big Data Security Needs ‱ With Personally Identifiable Information (PII) Data  Any data that can be used to contact, locate or identify a specific individual, either by itself or combined with other sources that are easily accessed  May include information that is linked to an individual through financial, medical, educational or employment records  Some of the data elements that might be used to identify a certain person could consist of fingerprints, biometric data, a name, telephone number, email address or social security number  Federal laws required to handle PII data securely: HIPAA, Privacy Act, GLBA, FERPA, COPPA, and FCRA
  • 8. ©2015 MFMER | slide-8 ‱ With Protected Health Information (PHI) Data  Any health information that is individually identifiable, and created or received by a covered entity - provider of health care, a health plan operator, or health clearing house  May relate to an individual’s present, past or future health, either in physical or mental terms, or the current condition of a person  Either maintained or transmitted in any given form, including speech, paper, or electronics  Exclude the education records covered by the educational family rights and privacy act or any employment records maintained by a covered entity  Federal law required to handle PHI data securely: HIPAA Enterprise Healthcare Big Data Security Needs
  • 9. ©2015 MFMER | slide-9 Mayo Clinic Big Data Clusters & Evolution
  • 10. ©2015 MFMER | slide-10 Mayo Clinic Big Data Clusters ‱ Teradata Appliance with SUSE Linux Enterprise Server ‱ Each Hadoop Cluster Coupled with One ElasticSearch (ES) Cluster on Selected Edge Nodes ‱ Separated HDF (Nifi) Clusters (Not to be presented) ‱ (Hadoop + ES) Clusters Normal: Sandbox, Dev, Test(Int)* and Prod Disaster Recovery (DR): Dev and Prod
  • 11. Mayo Clinic Big Data Clusters
    • Data Storage on (Hadoop + ES) Clusters
      o Permanent
        - HDFS folders/files
        - HBase tables
        - ES indexes
      o Temporary/Permanent
        - Hive tables
  • 12. Mayo Clinic (Hadoop + ES) Clusters Evolution
    • Hadoop/ES Cluster HDP + ES Evolution
      o TDH/HDP 1.3.2 + ES (v1.0.0) (Un-Kerberized)
      o TDH/HDP 2.1.2 + ES (v1.3.2..v1.5.2) (Un-Kerberized)
      o HDP 2.1.11 + ES (v1.5.2) (Secured: Local KDC + ES Shield via AD/LDAP)
      o HDP 2.3.4 + ES (v1.5.2..v1.7.2..v2.1.2..v2.3.2..v2.4.1..v2.4.4) (Secured: AD/LDAP etc.)
      o HDP 2.5.3 + ES (v2.4.4) (Secured: AD/LDAP etc.)
  • 13. Securing Enterprise Healthcare Big Data on Mayo Clinic Hadoop Clusters
  • 14. Security Adopted on Mayo Clinic Hadoop Clusters
  • 15. Security Adopted on Mayo Clinic Hadoop Clusters
    • Kerberos Security
      o Coupled with enterprise Active Directory (AD) using the AD KDC
    • Coupled with Lightweight Directory Access Protocol (LDAP) over SSL
      o Critical HDP services + ElasticSearch service
    • Two-Factor Authentication (TFA) Login and Sudo Capability Post OS-Hardening
      o Only for a limited number of authorized users/applications on local entry node(s)
      o Root login disabled
    • Ranger Plugins and Policies
    • HDFS/Hive/HBase Data Ops via Knox Gateway/F5
      o The majority of users or applications
  • 16. Kerberos with Active Directory
    • Kerberized Using the Enterprise Active Directory (AD) KDC
      o Provides a host of extensions and conveniences, such as password expiration and account lockout
      o Authentication and authorization
      o AD user (principal) name + password
  • 17. Kerberos with Active Directory
    • Kerberized Using the Enterprise Active Directory (AD) KDC (cont'd)
      o AD user (principal) name + keytab
      o auth_to_local rules needed for HDFS, Oozie, Storm, Kafka, Ranger KMS, and Atlas
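To make the auth_to_local bullet concrete, here is a minimal sketch of how a rule of the form `RULE:[n:format](match)s/pattern/replacement/` maps a Kerberos principal to a local user name. It is a simplified emulation for illustration, not Hadoop's own implementation; the realm `EXAMPLE.ORG`, the principals, and the two rules are assumptions and do not come from the slides.

```python
import re

def split_principal(principal):
    """Split 'primary/instance@REALM' into (name components, realm)."""
    name, _, realm = principal.partition("@")
    return name.split("/"), realm

def apply_rule(principal, num_components, fmt, match, sed_pattern, sed_repl):
    """Emulate one RULE:[n:fmt](match)s/pattern/repl/ entry.

    Returns the local name, or None when the rule does not apply.
    """
    components, realm = split_principal(principal)
    if len(components) != num_components:
        return None
    # $0 stands for the realm, $1..$n for the name components.
    candidate = fmt.replace("$0", realm)
    for i, comp in enumerate(components, start=1):
        candidate = candidate.replace(f"${i}", comp)
    # The match expression must cover the whole formatted string.
    if not re.fullmatch(match, candidate):
        return None
    return re.sub(sed_pattern, sed_repl, candidate, count=1)

# Hypothetical rule set (realm and service names are assumptions):
RULES = [
    # RULE:[2:$1@$0](nn@EXAMPLE.ORG)s/.*/hdfs/
    (2, "$1@$0", r"nn@EXAMPLE\.ORG", r".*", "hdfs"),
    # RULE:[1:$1@$0](.*@EXAMPLE.ORG)s/@.*//
    (1, "$1@$0", r".*@EXAMPLE\.ORG", r"@.*", ""),
]

def to_local(principal):
    """Return the local name from the first rule that applies."""
    for rule in RULES:
        local = apply_rule(principal, *rule)
        if local is not None:
            return local
    raise ValueError(f"no rule matched {principal}")
```

With these assumed rules, a two-component NameNode service principal maps to the local `hdfs` account, and a one-component AD user principal is reduced to its short name.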
  • 18. LDAP + SSL == LDAPS
    • User Authentication/Authorization Also Uses the LDAP Protocol for Some Hadoop Component Services
      o Ambari, Ranger / Ranger KMS, Knox, Grafana, Atlas, Hue, ES
      o LDAP over SSL (LDAPS) certificate: Mayo Clinic Comodo certs
  • 19. LDAP + SSL == LDAPS
      o LDAP over SSL (LDAPS) certificate: Mayo Clinic Comodo certs (cont'd)
  • 20. TFA & Sudo Capability Post OS-Hardening
    • TFA Used for Local Users on Cluster Nodes
      o Root login disabled
      o Node-specific local user name + password
      o Passcode generated on the fly on the user's mobile device (RSA)
      o Sudo capability only for a limited number of users
      o After TFA login, Kerberos authentication against AD is still required
  • 21. Ranger Plugins and Policies
    • Ranger Policies Control Which Individual Users or Groups Are Authorized to Operate (CRUD etc.) on the Specified Data or Services
      o Data in HDFS files/folders, Hive databases/tables, HBase namespaces/tables, (Solr collections/documents), Atlas metadata
      o Services of YARN, Storm, Kafka, Knox
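The idea of a policy tying users or groups to operations on a resource can be sketched with a small in-memory model. This is an illustrative simplification, not Ranger's API or policy engine (it ignores deny conditions, recursion flags, and policy priority); the `Policy` class, the path `/data/hl7/*`, and the user and group names are all hypothetical.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase

@dataclass
class Policy:
    """A toy allow-policy: who may perform which accesses on which resource."""
    resource: str                      # wildcard pattern, e.g. an HDFS path
    users: set = field(default_factory=set)
    groups: set = field(default_factory=set)
    accesses: set = field(default_factory=set)

def is_allowed(policies, user, user_groups, path, access):
    """Return True if any policy grants `access` on `path` to the user or one of their groups."""
    for policy in policies:
        if not fnmatchcase(path, policy.resource):
            continue
        if access not in policy.accesses:
            continue
        if user in policy.users or policy.groups & set(user_groups):
            return True
    return False

# Hypothetical policy list for illustration only.
POLICIES = [
    Policy("/data/hl7/*", users={"etl_svc"}, groups={"clinical_analysts"},
           accesses={"read", "execute"}),
    Policy("/data/hl7/*", users={"etl_svc"}, accesses={"write"}),
]
```

In this model an analyst in `clinical_analysts` can read under `/data/hl7/` but cannot write, while the hypothetical service account `etl_svc` can do both.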
  • 22. Ranger Plugins and Policies
      o Example list of HDFS policies:
  • 23. Ranger Plugins and Policies
      o Example of an HDFS policy:
  • 24. Ranger Plugins and Policies
    • Ranger Also Performs Data or Service Auditing
      o Example of the hdfs user accessing the Hive service:
  • 25. Knox Gateway - HDFS/Hive/HBase Data Ops
    • Required for the Majority of Mayo Clinic Big Data Clients (Users/Applications)
    • Non-Secure Hive Shell Has Been Disabled
      o Hive CLI ops are forced to use Beeline
    • No Keytabs Issued for HDFS/Hive/HBase Data Ops by a Client Application Outside Mayo Clinic Hadoop Clusters
      o Regular HDFS remote Java client using keytabs: not allowed
      o Hive JDBC remote using keytabs: not allowed
      o HBase remote Java client using keytabs: not allowed
  • 26. F5/Knox - HDFS/Hive/HBase Data Ops
    • WebHDFS, WebHCat/Knox-Hive JDBC, and WebHBase Are Used for Data Ops on HDFS, Hive, and HBase, Respectively, via Knox Gateway
      o HA (high availability) is achieved by running 2 or more WebHDFS, WebHCat, and WebHBase (Stargate HBase) services on the master nodes
      o Knox-Hive JDBC requires the user's AD user name and password
    • An F5 Balancer in Front of Two or More Knox Gateway Services per Hadoop Cluster Provides Knox Gateway HA (+ More Protection)
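As an illustration of the Knox-Hive JDBC bullet, the sketch below assembles a HiveServer2 JDBC URL that goes through a Knox (or F5) endpoint over HTTPS. The URL shape follows the general form documented for Knox, but the host, port, topology name, and truststore path are placeholders, and the exact parameters should be checked against the Knox and Hive versions in use. The AD user name and password are not part of the URL; the client passes them to the JDBC driver separately.

```python
def knox_hive_jdbc_url(gateway_host, gateway_port, topology,
                       truststore_path, truststore_password):
    """Build a Hive JDBC URL routed through a Knox gateway (HTTP transport over SSL).

    All argument values are supplied by the caller; nothing here is specific
    to any real cluster.
    """
    return (
        f"jdbc:hive2://{gateway_host}:{gateway_port}/;"
        "ssl=true;"
        f"sslTrustStore={truststore_path};"
        f"trustStorePassword={truststore_password};"
        "transportMode=http;"
        f"httpPath=gateway/{topology}/hive"
    )

# Hypothetical values for illustration only.
EXAMPLE_URL = knox_hive_jdbc_url(
    "f5balancer.example.org", 8443, "default",
    "/etc/security/clientTrust.jks", "changeit",
)
```

Pointing the URL at the load balancer rather than at a single Knox host is what lets a client keep working when one gateway instance is down.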
  • 27. F5/Knox - HDFS/Hive/HBase Data Ops
    • HA Example: WebHBase
  • 28. Data Ops via F5 Balancer / Knox Gateway
    • Example: WebHDFS Data via Web Browser or Curl Cmd
      o Web UI & results
        https://f5balanceryyyy:port1/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN
        90000.0,125000.0,Texas,120000.0,45500.0,250000.0,110000.0,140000.0,7,3,113642.85714285714,
        .
        https://knoxgatewayzzz1:ddd7/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN
        or
        https://knoxgatewayzzz2:ddd7/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN
        90000.0,125000.0,Texas,120000.0,45500.0,250000.0,110000.0,140000.0,7,3,113642.85714285714,
        .
  • 29. Data Ops via F5 Balancer / Knox Gateway
      o Curl Cmd & results
        curl -i -k -u xxxxxxxx -X GET -L https://f5balanceryyyy:port1/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN
        90000.0,125000.0,Texas,120000.0,45500.0,250000.0,110000.0,140000.0,7,3,113642.85714285714,
        .
        curl -i -k -u xxxxxxxx -X GET -L https://knoxgatewayzzz1:ddd7/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN
        or
        https://knoxgatewayzzz2:ddd7/gateway/YYYYYYYYY/webhdfs/v1/user/zzzzz/test/Solr/solr_curl_query_result1.txt?op=OPEN
        90000.0,125000.0,Texas,120000.0,45500.0,250000.0,110000.0,140000.0,7,3,113642.85714285714,
        .
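The same WebHDFS OPEN call can be prepared from a program. The sketch below builds the request with Python's standard library and HTTP Basic credentials, mirroring the `-u` option of the curl command; it only constructs the request and does not send it. The gateway host, topology name, file path, and credentials in the example are placeholders.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request

def webhdfs_open_request(gateway, topology, hdfs_path, user, password):
    """Build (but do not send) a WebHDFS OPEN request routed through Knox or F5.

    `gateway` is 'host:port' of the F5 balancer or of one Knox instance.
    """
    url = (
        f"https://{gateway}/gateway/{topology}/webhdfs/v1"
        f"{quote(hdfs_path)}?{urlencode({'op': 'OPEN'})}"
    )
    # HTTP Basic credentials; Knox validates them against LDAPS.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(url, headers={"Authorization": f"Basic {token}"}, method="GET")

# Hypothetical values for illustration only.
EXAMPLE_REQUEST = webhdfs_open_request(
    "f5balancer.example.org:8443", "default",
    "/user/zzzzz/test/result.txt", "alice", "secret",
)
```

To execute the request, pass it to `urllib.request.urlopen`. Unlike the `-k` flag in the curl examples, `urlopen` verifies the server certificate by default, so the client needs to trust the issuing CA.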
  • 30. How Do WebHDFS Data Ops via Knox Gateway Work?
      o Complex authentication and authorization process
        - Ranger HDFS Plugin allows access to the HDFS data
        - Ranger Knox Plugin allows access to the HDFS interface through Knox
        - The access path is likely to be:
          · A user connects/authenticates to Knox via HTTPS (SSL protects credentials)
          · Knox checks the user's credentials via LDAPS (SSL protects credentials)
          · Ranger Knox plugin allows access to WebHDFS (Ranger Knox service-level authorization)
  • 31. How Do WebHDFS Data Ops via Knox Gateway Work?
          · Knox authenticates to AD (Kerberos protects Knox credentials)
          · AD grants Knox a ticket-granting ticket (TGT)
          · Knox requests a WebHDFS service ticket from AD
          · AD grants Knox a service ticket (ST) for WebHDFS
          · Knox passes the user as a proxyuser to WebHDFS
          · The user tries to access the HDFS file XYZ
            - Ranger HDFS Plugin checks whether a policy exists for the user and the HDFS file XYZ (Ranger HDFS authorization); HDFS checks native authorization for the user and the HDFS file XYZ (HDFS authorization)
            - Authorization is granted or denied
            - Only when authorized, data in the HDFS file XYZ is retrieved and returned to the client application (web browser or CLI)
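The order of the checks in this access path can be summarized as a small decision function. This is a sketch of the flow only: the Kerberos ticket exchange between Knox and AD is omitted, each check is reduced to a boolean supplied by the caller, and it assumes that a matching Ranger HDFS policy grants access while native HDFS permissions decide otherwise. The function and stage names are hypothetical.

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"

def webhdfs_read_decision(ldap_authenticated, knox_service_allowed,
                          ranger_hdfs_allowed, hdfs_native_allowed):
    """Return (decision, stage) for one read request.

    `stage` names the check that produced the final decision.
    """
    # Knox validates the user's credentials against LDAPS.
    if not ldap_authenticated:
        return Decision.DENY, "knox-ldaps-authentication"
    # The Ranger Knox plugin decides whether the user may use WebHDFS at all.
    if not knox_service_allowed:
        return Decision.DENY, "ranger-knox-authorization"
    # Assumption: a Ranger HDFS policy or native HDFS permissions can grant access.
    if ranger_hdfs_allowed or hdfs_native_allowed:
        return Decision.ALLOW, "hdfs-authorization"
    return Decision.DENY, "hdfs-authorization"
```

Returning the stage alongside the decision makes it easier to tell, for a denied request, whether the gateway or the HDFS layer rejected it.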
  • 32. Access Secured ElasticSearch (ES) Data
    • Using REST API via Web UI or Curl Cmd
        https://elasticsearchuixxxx:ddd6/estest/greeting/_search?pretty=true&q=title:Hello
        curl -v --user xxxxxxx -XGET 'https://elasticsearchuixxxx:ddd6/estest/greeting/_search?pretty=true&q=title:Hello'
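When the search URL is built in code, letting a library encode the query string avoids separator and quoting mistakes, since URI search parameters are conventionally joined with `&`. The sketch below builds an ES URI-search URL of the same shape as the slide's example; the host name is a placeholder and the helper name is hypothetical.

```python
from urllib.parse import urlencode

def es_search_url(host, index, doc_type, query, pretty=True):
    """Build an ElasticSearch URI-search URL: /{index}/{type}/_search?q=..."""
    params = {"q": query}
    if pretty:
        params["pretty"] = "true"
    # urlencode joins parameters with '&' and percent-encodes reserved characters.
    return f"https://{host}/{index}/{doc_type}/_search?{urlencode(params)}"

# Hypothetical host; index, type, and query mirror the slide's example.
EXAMPLE_SEARCH_URL = es_search_url("es.example.org:9200", "estest", "greeting", "title:Hello")
```

The colon in `title:Hello` is percent-encoded as `%3A`, which the server decodes back before parsing the query.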
  • 33. Access Secured ElasticSearch (ES) Data
    • Using Java API via Transport-TCP
  • 34. On-going & Future Direction
  • 35. On-going & Future Direction
    • Big Data Network Segmentation - Whitelisting
      o List of URLs or IP addresses allowed for Mayo Clinic healthcare business
        - Hadoop cluster-specific
        - User/application client computer-specific
      o Implemented by the Mayo Network team and the Mayo Clinic BDTS team
      o Permits data/service operations by any user/client over the allowed list of IP connections while blocking all others
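The whitelisting rule (permit listed addresses, block everything else) can be illustrated with Python's `ipaddress` module. In practice this is enforced by network equipment rather than application code; the network ranges below are made-up documentation and private addresses, not Mayo Clinic values.

```python
import ipaddress

# Hypothetical allowlist: one client subnet and one single host.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.20.0.0/16"),
    ipaddress.ip_network("192.0.2.15/32"),
]

def is_client_allowed(client_ip, allowed=ALLOWED_NETWORKS):
    """Return True only if the client address falls inside an allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in allowed)
```

Anything not explicitly listed is rejected, which is the default-deny behavior the slide describes.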
  • 36. On-going & Future Direction
    • Big Data At-Rest Encryption Enhancement
      o Drive (disk) encryption
      o Client-managed encrypted HDFS, Hive, HBase, or ES data
      o Ranger KMS-managed encryption keys and data decryption for HDFS, Hive, and HBase data in the encryption zones
        - For HDP v2.3.4 and earlier versions, HDFS data in the encryption zones can be retrieved only via the CLI, not via Knox Gateway
  • 37. Conclusion
  • 38. Conclusion
    • Enterprise Healthcare Big Data on Mayo Clinic Hadoop Clusters Have Been Successfully Protected by
      o Enterprise Kerberos
      o Active Directory
      o LDAP Over SSL
      o OS Hardening/TFA
      o Ranger
      o Knox Gateway/F5 Balancer
    • Underway at Mayo Clinic: Enhancement of Enterprise Healthcare Big Data Security by
      o Network Segmentation
      o Data-At-Rest Encryption via Ranger KMS
    • Achieved Successful Data Ops on Enterprise-Secured Healthcare Big Data
  • 39. References
    • Mayo Clinic: http://www.mayoclinic.org/
    • PII: http://privacyoffice.med.miami.edu/faq/privacy-faqs/what-is-personally-identifiable-information-pii
    • PHI: https://www.hipaa.com/hipaa-protected-health-information-what-does-phi-include/
    • Hadoop Stack: http://hortonworks.com
    • ElasticSearch: https://www.elastic.co/
    • CS Paper of Top IEEE Journal: Chen et al. Real-Time or Near Real-Time Persisting Daily Healthcare Data Into HDFS and ElasticSearch Index Inside a Big Data Platform. IEEE Transactions on Industrial Informatics, vol. 13, no. 2, pp. 595-606, April 2017
  • 40. Questions & Discussion
    chen.dequan@mayo.edu; 507-208-1599
    LinkedIn: https://www.linkedin.com/in/dequan-chen-5b0a37bb/