1 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Troubleshooting Kerberos in Hadoop: Taming the Beast
DataWorks Summit
Sept 2017
Author Profile
Vipin Rathor
Sr. Product Specialist (HDP Security)
Contributed to Kerberos, Apache Zeppelin, Apache Atlas
vrathor@hortonworks.com / @VipinRathor46
Agenda
• Why Kerberos?
• Where is Kerberos used across the Hadoop Stack?
• What is Kerberos & how does it work
• Realms, Principals and Keytabs
• Systematic Approach to Kerberos Nirvana
• Tools available in Hadoop
• Native Kerberos Tools / Debug Options
• Kerberos Checklist
• Most Common Kerberos Error Messages (& their meanings)
Why Kerberos?
• Universal Authentication mechanism for Hadoop stack
• Integrates with enterprise user management (e.g. Active Directory)
Solves:
• How can parts of a cluster trust each other (NameNodes, DataNodes, YARN, HBase, ZooKeeper...)?
• How can users trust the system?
• How can the system trust users?
• Foundation for: how can users delegate rights to applications?
• Without Kerberos: your cluster has NO security
Hadoop clusters are some of the largest Kerberos systems ever!!
Where is Kerberos used across the Hadoop Stack?
• Ubiquitous End-User / Hadoop Service Authentication mechanism
• Hadoop DelegationToken (Delegated authentication to NameNode)
• != Kerberos Tickets
• Bootstrapped with Kerberos authentication token
• HTTP Authentication
• Using SPNEGO (RFC 4559)
• Via Browsers / cURL (curl --negotiate)
• RPC Authentication
• Using Simple Authentication & Security Layer aka SASL (!= SSL)
• Java API Based Kerberos login
• Using JGSS / JAAS
• GSS-API (RFC 2743)
What is Kerberos
• Open source, developed by MIT
• Password is NEVER transmitted over the wire
• Central trusted authority – Key Distribution Center (KDC)
• Symmetric key (common shared key)
• Flavors:
• MIT Kerberos
• Active Directory
• Heimdal Kerberos (OS X)
How does Kerberos work
End User
- Does kinit (1 & 2)
- Runs HDFS command (3 - 6)
Hadoop NameNode
- Starts up with nn.service.keytab
- Verifies user and gives access to HDFS
KDC
- Provisions user keys and service keytabs (e.g. nn.service.keytab)
- Provides TGTs and service tickets (issued by the TGS)
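Seen from the end user's side, the flow above can be sketched as a short shell session. This is a minimal sketch, assuming a hypothetical principal user1@HWX.COM and a client machine with the MIT Kerberos tools and the HDFS CLI installed. The `run` helper and the `DRY_RUN` variable are illustrative names; with `DRY_RUN=1` (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the client-side Kerberos flow. The principal and the HDFS path
# are illustrative assumptions, not values from a real cluster.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

kerberos_flow() {
    # Steps 1 & 2: authenticate to the KDC and obtain a TGT.
    run kinit user1@HWX.COM
    # Inspect the credential cache; the TGT shows up as krbtgt/REALM@REALM.
    run klist
    # Steps 3 - 6: the HDFS client requests a service ticket for the
    # NameNode principal and presents it to the NameNode.
    run hdfs dfs -ls /
}

kerberos_flow
```

Setting `DRY_RUN=0` on a Kerberized client would run the three commands for real; if the last one fails while `klist` shows a valid TGT, the problem is more likely on the service side.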
Realms, Principals and Keytabs
• Realm
• User Principal
• E.g. user1@HWX.COM
• ken/admin@HWX.COM
• ken/sandbox.hortonworks.com@HWX.COM
• Service Principal
• E.g. HTTP/sandbox.hortonworks.com@HWX.COM
• nn/node1.hortonworks.com@HWX.COM
• dn/node2.hortonworks.com@HWX.COM
• dn/_HOST@HWX.COM
• Keytabs
• Service keytabs (for service)
• Headless keytabs (for user)
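The principal forms listed above all follow the pattern primary[/instance]@REALM. The sketch below splits a principal into those parts and expands the `_HOST` placeholder, which Hadoop replaces with the node's lower-cased FQDN when a service starts. The function names are hypothetical helpers, the input is assumed to always contain `@REALM`, and the FQDN is passed in explicitly so the example does not depend on the local machine.

```shell
#!/bin/sh
# Split a principal of the form primary[/instance]@REALM into its parts.
# Subshell function bodies "( ... )" keep the variables local.

principal_realm() (
    echo "${1##*@}"          # text after the last "@"
)

principal_primary() (
    name="${1%@*}"           # drop "@REALM"
    echo "${name%%/*}"       # keep everything before the first "/"
)

principal_instance() (
    name="${1%@*}"
    case "$name" in
        */*) echo "${name#*/}" ;;   # text after the first "/"
        *)   echo "" ;;             # no instance component
    esac
)

# Expand the _HOST placeholder: substitute the host's FQDN, lower-cased.
# On a real node the FQDN would typically come from `hostname -f`.
expand_host() (
    fqdn="$(echo "$2" | tr '[:upper:]' '[:lower:]')"
    echo "$1" | sed "s/_HOST/$fqdn/"
)

principal_primary  "nn/node1.hortonworks.com@HWX.COM"
principal_instance "nn/node1.hortonworks.com@HWX.COM"
principal_realm    "nn/node1.hortonworks.com@HWX.COM"
expand_host        "dn/_HOST@HWX.COM" "Node2.hortonworks.com"
```

A mismatch between the expanded `_HOST` value and the instance stored in the keytab is a frequent cause of service start-up failures, which is why the FQDN and name-resolution checks matter.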
Systematic Approach to Kerberos Nirvana
• Identify the involved parties (user, service, keytabs, nodes)
• Identify the stage where Kerberos is failing
• Based on the stage & error message, narrow it down to a client-side or service-side issue
• Check & verify configurations for correctness using the appropriate tools
• Repeat as necessary
Kerberos Tools Available in Hadoop
• KDiag
• Runs a series of diagnostic checks & gives suggestions
• hadoop org.apache.hadoop.security.KDiag
Kerberos Tools Available in Hadoop (contd.)
• HadoopKerberosName
• Checks auth_to_local rules (Kerberos principal to Unix user name conversion)
• hadoop org.apache.hadoop.security.HadoopKerberosName nn/bali1.openstacklocal@LAB.HORTONWORKS.NET
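To make the principal-to-user conversion concrete, the sketch below simulates how a single auth_to_local rule is evaluated. The rule `RULE:[2:$1@$0](nn@LAB.HORTONWORKS.NET)s/.*/hdfs/` and the target user `hdfs` are assumptions chosen for illustration, not rules taken from a real cluster, and `apply_rule` is a hypothetical helper rather than part of Hadoop.

```shell
#!/bin/sh
# Simulate the evaluation of one auth_to_local rule:
#   RULE:[2:$1@$0](nn@LAB.HORTONWORKS.NET)s/.*/hdfs/
# The rule and the resulting user name are illustrative assumptions.

apply_rule() (
    principal="$1"
    realm="${principal##*@}"      # $0 in the rule syntax
    name="${principal%@*}"
    primary="${name%%/*}"         # $1 in the rule syntax

    # "[2:" means the rule only applies to two-component principals.
    case "$name" in
        */*/*) exit 1 ;;          # three or more components
        */*)   ;;                 # exactly two components
        *)     exit 1 ;;          # a single component
    esac

    # Step 1: build the candidate string from the format "$1@$0".
    candidate="$primary@$realm"

    # Step 2: the rule fires only if the candidate fully matches the filter.
    if echo "$candidate" | grep -Eq '^nn@LAB\.HORTONWORKS\.NET$'; then
        # Step 3: apply the sed-style substitution s/.*/hdfs/.
        echo "$candidate" | sed 's/.*/hdfs/'
    else
        exit 1                    # no match: the next rule would be tried
    fi
)

apply_rule "nn/bali1.openstacklocal@LAB.HORTONWORKS.NET"
```

Running HadoopKerberosName against the same principal on a cluster shows which short name the configured rules actually produce, which is the authoritative answer.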
Native Kerberos Tools / Debug Options
• via command line
• kinit
• klist -eaf / klist -kte
• kvno
• kdestroy
• export KRB5_TRACE=/tmp/krb5-curl.out
curl -ivL --negotiate -u: "http://namenode-host:50070/webhdfs/v1/?op=LISTSTATUS"
• via debug messages
• export HADOOP_JAAS_DEBUG=true
• export HADOOP_ROOT_LOGGER=DEBUG,console
• via Java library
• -Dsun.security.krb5.debug=true
• -Dsun.security.spnego.debug=true
• export OPTS="$OPTS -Dsun.security.krb5.debug=true"
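The debug switches above can be collected into one helper that is sourced before reproducing a failure. This is a minimal sketch: `enable_kerberos_debug` is a hypothetical function name, the trace file path is arbitrary, and `HADOOP_OPTS` is assumed to be the variable through which your Hadoop scripts pass options to the JVM (other services may use a different variable).

```shell
#!/bin/sh
# Turn on the Kerberos debug switches for the current shell session.
# The default trace path and the use of HADOOP_OPTS are assumptions.

enable_kerberos_debug() {
    # Native (MIT) library trace, honoured by kinit, curl and similar tools.
    export KRB5_TRACE="${1:-/tmp/krb5-trace.out}"
    # Hadoop-side JAAS and logger debugging.
    export HADOOP_JAAS_DEBUG=true
    export HADOOP_ROOT_LOGGER=DEBUG,console
    # JVM-level Kerberos and SPNEGO debugging.
    export HADOOP_OPTS="${HADOOP_OPTS:-} -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true"
}

enable_kerberos_debug
echo "KRB5_TRACE=$KRB5_TRACE"
echo "HADOOP_OPTS=$HADOOP_OPTS"
```

The output is verbose, so it is usually best enabled only in the session that reproduces the problem.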
Kerberos Checklist
• FQDN
• Name Resolution
• If DNS is configured, then check reverse lookup
• Date/Time sync (< 5 minutes)
• Configuration file - /etc/krb5.conf
• Principal Names
• Stale Keytabs (via kvno)
• Credential Cache location (JAAS config)
• Which Java suite, JCE policy
• KDC log file - /var/log/kerberos/krb5kdc.log
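As one example of turning a checklist item into a repeatable check, the sketch below tests the date/time item: whether two clocks are within the 5-minute tolerance. `skew_ok` is a hypothetical helper that takes two timestamps in epoch seconds; how the remote timestamp is obtained (for example from the KDC host) is site-specific and left out here.

```shell
#!/bin/sh
# Check whether two clocks are within the Kerberos clock-skew tolerance.
# Arguments: local epoch seconds, remote epoch seconds, optional max skew.
# The 300 s default mirrors the "< 5 minutes" checklist item.

skew_ok() (
    local_ts="$1"
    remote_ts="$2"
    max_skew="${3:-300}"
    diff=$(( $local_ts - $remote_ts ))
    # Take the absolute value of the difference.
    [ "$diff" -lt 0 ] && diff=$(( 0 - $diff ))
    # The function's exit status is the result of this comparison.
    [ "$diff" -lt "$max_skew" ]
)

now="$(date +%s)"
if skew_ok "$now" "$now"; then
    echo "clock skew within tolerance"
else
    echo "clock skew too great"
fi
```

The other checklist items (FQDN, reverse lookup, keytab key versions) can be scripted in the same style with `hostname -f`, a DNS lookup tool, and `klist -kte`.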
Most Common Kerberos Error Messages (& their meaning)
• <unknown client> for <unknown service> 
• Decrypt Integrity Check Failed
• AES256 EncType not supported
• Clock skew too great
• Kerberos service principal not found in the database
• Client not found in the database
• No valid initial credential found
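The slide lists the messages without their meanings, so the sketch below pairs most of them with a usual first suspect. These are common causes drawn from general Kerberos experience rather than from the deck, they are not exhaustive, and `explain_krb_error` is a hypothetical helper. The "<unknown client> for <unknown service>" message is deliberately left to the fallback branch.

```shell
#!/bin/sh
# Map a Kerberos error message to a usual first suspect.
# The suggested causes are common ones, not a complete diagnosis.

explain_krb_error() (
    msg="$(echo "$1" | tr '[:upper:]' '[:lower:]')"
    case "$msg" in
        *"clock skew too great"*)
            echo "Clocks differ by more than the allowed skew; check NTP/time sync" ;;
        *"decrypt integrity check failed"*)
            echo "Wrong password or stale keytab; compare key versions with kvno and klist -kte" ;;
        *"enctype"*"not supported"*|*"encryption type"*"not supported"*)
            echo "Encryption type unavailable; check the JCE policy and krb5.conf enctypes" ;;
        *"service principal not found"*|*"server not found"*)
            echo "Service principal missing in the KDC or hostname/FQDN mismatch" ;;
        *"client not found"*)
            echo "User principal does not exist in the KDC or the realm is wrong" ;;
        *"no valid"*"credential"*)
            echo "No usable TGT in the credential cache; run kinit and check klist" ;;
        *)
            echo "Unrecognized message; enable debug output and check the KDC log" ;;
    esac
)

explain_krb_error "Clock skew too great"
explain_krb_error "AES256 EncType not supported"
```

Matching is case-insensitive and substring-based, so the helper can be fed a full log line rather than the bare message.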
References
• http://web.mit.edu/kerberos/
• http://www.kerberos.org/software/tutorial.html
• https://github.com/steveloughran/kerberos_and_hadoop
Thank you!

Editor's Notes

  • #9: Realm = Domain in Active Directory. The KDC makes no distinction between user principals and service principals; the same goes for keytabs.