Streamline Hadoop DevOps with Apache Ambari
Jayush Luniya
Hadoop Summit, Tokyo
© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Speaker
Jayush Luniya
Staff Software Engineer @ Hortonworks
Apache Ambari PMC
jluniya@apache.org
Open-source platform to provision, manage and monitor Hadoop clusters
Why Ambari?
[Chart: Ambari Releases. Number of JIRAs per release, split into GA, #.#.1, and #.#.2, for Ambari 2.0 (April '15), Ambari 2.1 (Jul - Sept '15), Ambari 2.2 (Dec '15 - Feb '16), and Ambari 2.4 (Aug '16 - Sept '16).]
Ambari Architecture
Exciting Enterprise Features in Ambari 2.4
• New Services: Log Search, Zeppelin, Hive LLAP
• Role Based Access Control
• Management Packs
• Grafana UI for Ambari Metrics System
• New Views: Zeppelin, Storm
Operations - Lifecycle: Deploy, Secure, Config, Upgrade, Monitor, Extend
Ease-of-Use
Deploy
Deploy On Premise
Ambari handles all of these combinations and
makes recommendations based on host specs.
Deploy In The Cloud
Certified environments
Sysprepped VMs
Hundreds of similar clusters
Deploy with Blueprints
• Systematic way of defining a cluster
• Export existing cluster into blueprint:
  /api/v1/clusters/:clusterName?format=blueprint
[Diagram labels: Configs, Topology, Hosts, Cluster]
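The export call can be sketched in Python as below. This is a minimal sketch under assumptions: the server address, port, credentials, and the helper name `build_export_request` are hypothetical, and the request is only built, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_export_request(base_url, cluster_name, user, password):
    """Build (but do not send) a GET request that exports a cluster as a blueprint."""
    path = "/api/v1/clusters/" + urllib.parse.quote(cluster_name, safe="")
    url = base_url.rstrip("/") + path + "?format=blueprint"
    # HTTP Basic authentication header (assumed auth scheme for this sketch).
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", "Basic " + token)
    return request


# Hypothetical server and credentials, for illustration only.
req = build_export_request("http://ambari.example.com:8080", "MyCluster", "admin", "admin")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the call; the response body could then be saved and reused as a blueprint.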
Create Cluster with Blueprints
1. POST /api/v1/blueprints/my-blueprint
{
  "configurations" : [
    {
      "hdfs-site" : {
        "dfs.datanode.data.dir" : "/hadoop/1,/hadoop/2,/hadoop/3"
      }
    }
  ],
  "host_groups" : [
    {
      "name" : "master-host",
      "components" : [
        { "name" : "NAMENODE" },
        { "name" : "RESOURCEMANAGER" },
        …
      ],
      "cardinality" : "1"
    },
    {
      "name" : "worker-host",
      "components" : [
        { "name" : "DATANODE" },
        { "name" : "NODEMANAGER" },
        …
      ],
      "cardinality" : "1+"
    }
  ],
  "Blueprints" : {
    "stack_name" : "HDP",
    "stack_version" : "2.5"
  }
}
2. POST /api/v1/clusters/my-cluster
{
  "blueprint" : "my-blueprint",
  "host_groups" : [
    {
      "name" : "master-host",
      "hosts" : [
        {
          "fqdn" : "master001.ambari.apache.org"
        }
      ]
    },
    {
      "name" : "worker-host",
      "hosts" : [
        {
          "fqdn" : "worker001.ambari.apache.org"
        },
        {
          "fqdn" : "worker002.ambari.apache.org"
        },
        …
        {
          "fqdn" : "worker099.ambari.apache.org"
        }
      ]
    }
  ]
}
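The two-step flow (register the blueprint, then create a cluster from it) can be sketched in Python as follows. This is an illustrative sketch, not a definitive client: the server address, the helper names, and the three-worker host list are assumptions, only the components named on the slide are included, authentication is omitted, and the requests are built but not sent. The `X-Requested-By` header is set because Ambari expects it on modifying requests.

```python
import json
import urllib.request

AMBARI = "http://ambari.example.com:8080"  # hypothetical server address


def make_blueprint():
    """Blueprint: configuration plus host groups, with no concrete hosts."""
    return {
        "configurations": [
            {"hdfs-site": {"dfs.datanode.data.dir": "/hadoop/1,/hadoop/2,/hadoop/3"}}
        ],
        "host_groups": [
            {
                "name": "master-host",
                "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}],
                "cardinality": "1",
            },
            {
                "name": "worker-host",
                "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
                "cardinality": "1+",
            },
        ],
        "Blueprints": {"stack_name": "HDP", "stack_version": "2.5"},
    }


def make_cluster_template(blueprint_name, master_fqdns, worker_fqdns):
    """Cluster creation template: maps concrete hosts onto the blueprint's host groups."""
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": "master-host", "hosts": [{"fqdn": f} for f in master_fqdns]},
            {"name": "worker-host", "hosts": [{"fqdn": f} for f in worker_fqdns]},
        ],
    }


def build_post(path, body):
    """Build (but do not send) a POST request with a JSON body."""
    data = json.dumps(body).encode("utf-8")
    request = urllib.request.Request(AMBARI + path, data=data, method="POST")
    request.add_header("X-Requested-By", "ambari")
    return request


# Three workers only, as an illustrative host list.
workers = [f"worker{n:03d}.ambari.apache.org" for n in range(1, 4)]
steps = [
    build_post("/api/v1/blueprints/my-blueprint", make_blueprint()),
    build_post(
        "/api/v1/clusters/my-cluster",
        make_cluster_template("my-blueprint", ["master001.ambari.apache.org"], workers),
    ),
]
for step in steps:
    print(step.get_method(), step.full_url, len(step.data), "bytes")
```

Keeping the blueprint free of host names is what lets the same definition be reused for many clusters; only the second payload changes per cluster.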
Blueprints for Large Scale
• Kerberos, secure out-of-the-box
• High Availability is set up initially for NameNode, YARN, Hive, Oozie, etc.
• Host Discovery allows Ambari to automatically install services for a Host when it comes online
• Stack Advisor recommendations
Blueprint Host Discovery
POST /api/v1/clusters/MyCluster/hosts
[
  {
    "blueprint" : "single-node-hdfs-test2",
    "host_groups" : [
      {
        "host_group" : "slave",
        "host_count" : 3,
        "host_predicate" : "Hosts/cpu_count>1"
      },
      {
        "host_group" : "super-slave",
        "host_count" : 5,
        "host_predicate" : "Hosts/cpu_count>2&Hosts/total_mem>3000000"
      }
    ]
  }
]
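To illustrate what a host predicate such as `Hosts/cpu_count>2&Hosts/total_mem>3000000` selects, here is a deliberately simplified Python sketch. It is not Ambari's predicate parser: it only understands `>` comparisons joined by `&`, and the host records and function names are hypothetical.

```python
import re

# Matches a single clause of the form Hosts/<field>><integer>.
CLAUSE = re.compile(r"^Hosts/(\w+)>(\d+)$")


def parse_predicate(predicate):
    """Split a predicate into (field, threshold) pairs; only '>' and '&' are supported."""
    clauses = []
    for part in predicate.split("&"):
        match = CLAUSE.match(part.strip())
        if match is None:
            raise ValueError(f"unsupported clause: {part!r}")
        clauses.append((match.group(1), int(match.group(2))))
    return clauses


def matches(host, predicate):
    """True if the host satisfies every clause of the predicate."""
    return all(host.get(name, 0) > limit for name, limit in parse_predicate(predicate))


def select_hosts(hosts, predicate, host_count):
    """Pick up to host_count hosts that satisfy the predicate, in registration order."""
    return [h for h in hosts if matches(h, predicate)][:host_count]


# Hypothetical hosts that have registered with the server.
registered = [
    {"name": "h1", "cpu_count": 1, "total_mem": 2000000},
    {"name": "h2", "cpu_count": 2, "total_mem": 4000000},
    {"name": "h3", "cpu_count": 4, "total_mem": 8000000},
    {"name": "h4", "cpu_count": 4, "total_mem": 2000000},
]
slaves = select_hosts(registered, "Hosts/cpu_count>1", 3)
super_slaves = select_hosts(registered, "Hosts/cpu_count>2&Hosts/total_mem>3000000", 5)
print([h["name"] for h in slaves], [h["name"] for h in super_slaves])
```

A group can end up with fewer hosts than `host_count` when not enough matching hosts have come online yet, which is the situation host discovery is meant to handle over time.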
Secure
Comprehensive Security
LDAP/AD
• User Auth.
• Sync
Kerberos
• MIT KDC
• Keytab Management
Atlas
• Governance
• Compliance
• Data Classify
• Lineage & History
Ranger
• Security policies
• Audit
• Authorization
Knox
• Perimeter Sec.
• LDAP/AD
• Sec. REST/HTTP
• SSL
Kerberos
Ambari manages Kerberos principals and keytabs.
Works with an existing MIT KDC or Active Directory.
Once Kerberized, handles:
• Adding Services
• Adding Hosts
• Adding Host Components
• Moving Host Components
Role Based Access Control (RBAC)
As Ambari & organizations grow, so do security needs.
Ambari integrates with external authentication systems & LDAP.
RBAC Terms
Users belong to groups
A group has a role
Users can also have additional roles
Roles are applied to Resources, e.g., Ambari, a particular Cluster, a particular View
Roles have permissions, e.g., add services to cluster
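The relationships between these terms can be sketched as a small Python data model. This is only an illustration: the class names, the resource keys such as `"cluster:c1"`, and the permission strings are hypothetical, not Ambari's internal model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    name: str
    permissions: frozenset


@dataclass
class Group:
    name: str
    # Role held by the group on each resource, e.g. {"cluster:c1": Role(...)}.
    roles: dict = field(default_factory=dict)


@dataclass
class User:
    name: str
    groups: list = field(default_factory=list)
    # Additional roles granted directly to the user, keyed by resource.
    roles: dict = field(default_factory=dict)


def effective_permissions(user, resource):
    """Union of permissions from the user's own roles and the roles of their groups."""
    permissions = set()
    for holder in [user, *user.groups]:
        role = holder.roles.get(resource)
        if role is not None:
            permissions |= role.permissions
    return permissions


viewer = Role("viewer", frozenset({"view"}))
installer = Role("installer", frozenset({"add_services"}))
operators = Group("operators", roles={"cluster:c1": viewer})
alice = User("alice", groups=[operators], roles={"cluster:c1": installer})
print(sorted(effective_permissions(alice, "cluster:c1")))
```

Keying roles by resource reflects that the same user can hold different roles on Ambari itself, on a particular cluster, and on a particular view.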
New RBAC Roles
Role | Permissions
Ambari Admin | all
Cluster Admin | ↑, except manage permissions
Cluster Op | ↑, except add services, Kerberos, manage alerts & upgrades
Service Admin | ↑, except alter cluster topology or install components
Service Op | ↑, except change configs
Read-Only | only view
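Reading the slide as a ladder in which each role has the permissions of the role above it minus a few exceptions (Ambari Admin at the top with everything, Read-Only at the bottom with view only), the role set can be derived by successive subtraction. The sketch below assumes that reading; the permission identifiers are hypothetical shorthand for the phrases on the slide, and `service_operations` is an invented placeholder for whatever Service Op retains beyond viewing.

```python
# Hypothetical permission identifiers, abbreviating the slide's wording.
ALL_PERMISSIONS = frozenset({
    "view", "service_operations", "change_configs", "alter_topology",
    "install_components", "add_services", "kerberos", "manage_alerts",
    "manage_upgrades", "manage_permissions",
})

# Each entry: (role, permissions removed relative to the role above it).
LADDER = [
    ("Ambari Admin", set()),
    ("Cluster Admin", {"manage_permissions"}),
    ("Cluster Op", {"add_services", "kerberos", "manage_alerts", "manage_upgrades"}),
    ("Service Admin", {"alter_topology", "install_components"}),
    ("Service Op", {"change_configs"}),
    ("Read-Only", {"service_operations"}),
]


def build_roles(all_permissions, ladder):
    """Derive each role's permission set by subtracting its exceptions from the role above."""
    roles = {}
    current = set(all_permissions)
    for name, removed in ladder:
        current = current - removed
        roles[name] = frozenset(current)
    return roles


ROLES = build_roles(ALL_PERMISSIONS, LADDER)
for role_name, _ in LADDER:
    print(role_name, len(ROLES[role_name]))
```

Because every step only removes permissions, each role is guaranteed to be a subset of the one above it.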
Config
Config Management
• Config Groups
  - Different config settings for individual host components
• Config Versioning
  - Revert back to old configs
• Smart Configs
  - Highlight most important configs
• Stack Advisor
  - Recommend configurations
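Config versioning and config groups can be illustrated with a small in-memory Python sketch. The class and function names are hypothetical, and the choice to record a revert as a new version (rather than deleting history) is a design assumption of this sketch.

```python
class ConfigHistory:
    """Append-only history of configuration versions, numbered from 1."""

    def __init__(self, initial):
        self._versions = [dict(initial)]

    @property
    def version(self):
        return len(self._versions)

    @property
    def current(self):
        return dict(self._versions[-1])

    def update(self, changes):
        """Apply changes on top of the current version and return the new version number."""
        self._versions.append({**self._versions[-1], **changes})
        return self.version

    def revert(self, version):
        """Make an old version current again by copying it into a new version."""
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"unknown version: {version}")
        self._versions.append(dict(self._versions[version - 1]))
        return self.version


def effective_config(defaults, group_overrides):
    """Config group: overrides for a subset of hosts layered over the default settings."""
    return {**defaults, **group_overrides}


history = ConfigHistory({"dfs.datanode.data.dir": "/hadoop/1"})
history.update({"dfs.datanode.data.dir": "/hadoop/1,/hadoop/2"})
history.revert(1)
# Hypothetical override for hosts with a different disk layout.
big_disk_hosts = effective_config(history.current, {"dfs.datanode.data.dir": "/data/1,/data/2"})
print(history.version, history.current, big_disk_hosts)
```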
Smart Configs
Widgets
- Sliders
- Combos
- Toggles
- Spinners
- Lists
Stack Advisor
Kerberos, HTTPS, Zookeeper Servers, Memory Settings, …, High Availability
Example Configurations:
atlas.rest.address = http(s)://host:port
# Atlas Servers
atlas.enableTLS = true|false
atlas.server.http.port = 21000
atlas.server.https.port = 21443
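The Atlas example suggests how one recommended value can be derived from other settings: the scheme and port of the REST address follow from the TLS flag and the configured ports. The Python sketch below shows that kind of derivation; the function name is hypothetical, and joining several servers with commas is an assumption of this sketch, not something the slide states.

```python
def recommend_atlas_rest_address(hosts, enable_tls, http_port=21000, https_port=21443):
    """Derive a REST address from the TLS flag, the ports, and the list of Atlas hosts."""
    scheme = "https" if enable_tls else "http"
    port = https_port if enable_tls else http_port
    # Assumption: multiple servers are expressed as a comma-separated list.
    return ",".join(f"{scheme}://{host}:{port}" for host in hosts)


# Hypothetical host names, for illustration only.
plain = recommend_atlas_rest_address(["atlas1.example.com"], enable_tls=False)
secure = recommend_atlas_rest_address(
    ["atlas1.example.com", "atlas2.example.com"], enable_tls=True
)
print(plain)
print(secure)
```

Deriving dependent values this way keeps them consistent when an administrator toggles TLS or adds a server.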
Upgrade
Background: Upgrade Terminology
• Manual Upgrade: The user follows instructions to upgrade the stack. Incurs downtime.
• Rolling Upgrade: Automated. Upgrades one component per host at a time. Preserves cluster operation and minimizes service impact.
• Express Upgrade: Automated. Runs in parallel across hosts. Incurs downtime.
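The difference between the two automated modes can be sketched as two ways of batching the same work. This Python sketch is only an illustration of the scheduling idea: the host and component names and the function names are hypothetical, and real upgrades also involve ordering between services, which is ignored here.

```python
def component_order(components_by_host):
    """Components in order of first appearance across hosts."""
    seen = []
    for components in components_by_host.values():
        for component in components:
            if component not in seen:
                seen.append(component)
    return seen


def express_plan(components_by_host):
    """Express: one batch per component, run on all of its hosts in parallel."""
    return [
        [(host, component) for host, comps in components_by_host.items() if component in comps]
        for component in component_order(components_by_host)
    ]


def rolling_plan(components_by_host):
    """Rolling: one component on one host per batch, so the rest keep serving."""
    return [[step] for batch in express_plan(components_by_host) for step in batch]


layout = {
    "worker001": ["DATANODE", "NODEMANAGER"],
    "worker002": ["DATANODE", "NODEMANAGER"],
}
print(len(express_plan(layout)), "express batches")
print(len(rolling_plan(layout)), "rolling batches")
```

Rolling needs more batches for the same cluster, which is the price paid for keeping the cluster operational during the upgrade.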
Automated Upgrade: Rolling/Express
1. Check Prerequisites: Review the prereqs to confirm your cluster configs are ready.
2. Prepare: Take backups of critical cluster metadata.
3. Register + Install: Register the HDP repository and install the target HDP version on the cluster.
4. Perform Upgrade: Perform the HDP upgrade. The steps depend on upgrade method: Rolling or Express.
5. Finalize: Finalize the upgrade, making the target version the current version.
Process: Rolling Upgrade
Services: ZooKeeper, Ranger, Hive, Oozie, Falcon, Kafka, Knox, Storm, Slider, Flume
Core Masters and Core Slaves: HDFS, YARN, HBase
Clients: HDFS, YARN, MR, Tez, HBase, Pig, Hive, etc.
Final step: Finalize or Downgrade
Monitor
Alerting Framework
Alert Type | Description | Thresholds (units)
WEB | Connects to a Web URL. Alert status is based on the HTTP response code | Response Code (n/a); Connection Timeout (seconds)
PORT | Connects to a port. Alert status is based on response time | Response (seconds)
METRIC | Checks the value of a service metric. Units vary, based on the metric being checked | Metric Value (units vary); Connection Timeout (seconds)
AGGREGATE | Aggregates the status for another alert | % Affected (percentage)
SCRIPT | Executes a script to handle the alert check | Varies
SERVER | Executes a server-side runnable class to handle the alert check | Varies
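How thresholds turn a measurement into an alert state can be sketched for two of the alert types. This is a minimal Python sketch, not Ambari's implementation: the function names and the default threshold values are assumptions chosen for illustration.

```python
def port_alert_state(response_seconds, warning=1.5, critical=5.0):
    """PORT-style check: classify by response time; None means the connection failed."""
    if response_seconds is None or response_seconds >= critical:
        return "CRITICAL"
    if response_seconds >= warning:
        return "WARNING"
    return "OK"


def aggregate_alert_state(child_states, warning_pct=10.0, critical_pct=30.0):
    """AGGREGATE-style check: classify by the percentage of child alerts that are not OK."""
    if not child_states:
        return "UNKNOWN"
    affected = sum(1 for state in child_states if state != "OK")
    percent = 100.0 * affected / len(child_states)
    if percent >= critical_pct:
        return "CRITICAL"
    if percent >= warning_pct:
        return "WARNING"
    return "OK"


# Hypothetical DataNode port checks: nine healthy, one unreachable.
datanode_states = [port_alert_state(0.2)] * 9 + [port_alert_state(None)]
print(aggregate_alert_state(datanode_states))
```

An aggregate over per-host checks like this lets a single alert summarize many hosts without paging on every individual failure.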
Alert UI
Grafana for Ambari Metrics
FEATURES
• Grafana as a “Native UI” for Ambari Metrics
• Pre-built Dashboards: Host-level, Service-level
• Supports HTTPS
DASHBOARDS
• System Home, Servers
• HDFS Home, NameNodes, DataNodes
• YARN Home, Applications, Job History Server
• HBase Home, Performance
Grafana includes pre-built
dashboards for visualizing the most
important cluster metrics.
The HDFS NameNode
dashboard highlights
file system activity.
Storm Monitoring View
Log Search
Search and index Hadoop logs!
Capabilities
• Rapid Search of all Hadoop component logs
• Search across time ranges, log levels, and for keywords
[Diagram labels: Solr, Logsearch, Ambari]
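The search capabilities (time range, log level, keyword) can be illustrated with an in-memory Python stand-in. The real service indexes logs in Solr; the record shape, function name, and sample entries below are hypothetical and only show how the three filters combine.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEntry:
    time: datetime
    level: str
    component: str
    message: str


def search(entries, start=None, end=None, levels=None, keyword=None):
    """Return entries that satisfy every filter that was supplied."""
    results = []
    for entry in entries:
        if start is not None and entry.time < start:
            continue
        if end is not None and entry.time > end:
            continue
        if levels is not None and entry.level not in levels:
            continue
        if keyword is not None and keyword.lower() not in entry.message.lower():
            continue
        results.append(entry)
    return results


# Hypothetical log lines from two components.
logs = [
    LogEntry(datetime(2016, 10, 26, 9, 0), "INFO", "NAMENODE", "Startup complete"),
    LogEntry(datetime(2016, 10, 26, 9, 5), "ERROR", "DATANODE", "Disk failure on /hadoop/2"),
    LogEntry(datetime(2016, 10, 26, 11, 0), "ERROR", "DATANODE", "Heartbeat timeout"),
]
hits = search(
    logs,
    start=datetime(2016, 10, 26, 9, 0),
    end=datetime(2016, 10, 26, 10, 0),
    levels={"ERROR", "FATAL"},
    keyword="disk",
)
print([hit.message for hit in hits])
```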
Log Search
[Architecture diagram components: Worker Node, Log Feeder, Solr, Log Search UI, Ambari]
[Diagram notes: Java Process, Multi-output Support, Grok filters, Solr Cloud, Local Disk Storage]
Extend
Management Packs
• Improved Release Management: Decouple Ambari core from stack releases
• Support Add-ons:
  – Release vehicle for 3rd party services, views
  – Self-contained release artifacts
  – Stack is an overlay of multiple management packs
Overlay of Management Packs
inherits from 2.3
inherits from 2.4
inherits from 2.5
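The overlay idea, where each stack version inherits from the previous one and only supplies what it adds or changes, can be sketched with Python's `collections.ChainMap`. The stack contents and the `resolve_stack` helper are hypothetical placeholders; only the inheritance chain between versions comes from the slide.

```python
from collections import ChainMap


def resolve_stack(version, definitions, parents):
    """Layer a stack version over its ancestors; nearer layers win on lookup."""
    layers = []
    current = version
    while current is not None:
        layers.append(definitions[current])
        current = parents.get(current)
    return ChainMap(*layers)


# Hypothetical per-version service definitions (placeholder strings).
DEFINITIONS = {
    "2.3": {"HDFS": "hdfs-def-2.3", "YARN": "yarn-def-2.3"},
    "2.4": {"YARN": "yarn-def-2.4"},
    "2.5": {"ZEPPELIN": "zeppelin-def-2.5"},
}
# Inheritance chain: 2.5 inherits from 2.4, which inherits from 2.3.
PARENTS = {"2.5": "2.4", "2.4": "2.3"}

stack_25 = resolve_stack("2.5", DEFINITIONS, PARENTS)
print(dict(stack_25))
```

With this layering, an add-on pack only has to ship its own layer, and anything it does not define falls through to the stack it overlays.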
Management Pack++
Short Term Goals (Ambari 2.4)
• Retrofit in Stack Processing Framework
• Enable 3rd party to ship add-on services
Future Goals
• Management Pack Framework
• Include Views
Service Level Extensions
• Service Role Command Order
• Service Advisor
• Service Repos
• Service Upgrade Packs
Future
Future of Ambari
• Cloud Focus
• Multiple Service Instances (Two ZK quorums)
• Multiple Service Versions (Spark 1.6 & Spark 2.0)
• YARN Assemblies
• Granular Upgrades: Patch, Component, Service
• Ambari High Availability
Thank You

More Related Content

PPTX
Apache Ambari Stack Extensibility
Jayush Luniya
 
PPTX
Manage Add-on Services in Apache Ambari
Jayush Luniya
 
PPTX
Apache Ambari: Simplified Hadoop Cluster Operation & Troubleshooting
Jayush Luniya
 
PPTX
Managing Enterprise Hadoop Clusters with Apache Ambari
Jayush Luniya
 
PPTX
Streamline Hadoop DevOps with Apache Ambari
Alejandro Fernandez
 
PPTX
Streamline Hadoop DevOps with Apache Ambari
DataWorks Summit/Hadoop Summit
 
PPTX
Simplified Cluster Operation & Troubleshooting
DataWorks Summit/Hadoop Summit
 
PDF
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks
 
Apache Ambari Stack Extensibility
Jayush Luniya
 
Manage Add-on Services in Apache Ambari
Jayush Luniya
 
Apache Ambari: Simplified Hadoop Cluster Operation & Troubleshooting
Jayush Luniya
 
Managing Enterprise Hadoop Clusters with Apache Ambari
Jayush Luniya
 
Streamline Hadoop DevOps with Apache Ambari
Alejandro Fernandez
 
Streamline Hadoop DevOps with Apache Ambari
DataWorks Summit/Hadoop Summit
 
Simplified Cluster Operation & Troubleshooting
DataWorks Summit/Hadoop Summit
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks
 

What's hot (20)

PPTX
Managing 2000 Node Cluster with Ambari
DataWorks Summit
 
PDF
Discover.hdp2.2.ambari.final[1]
Hortonworks
 
PPTX
Apache Ambari - What's New in 1.7.0
Hortonworks
 
PPTX
Ambari Views - Overview
Hortonworks
 
PPTX
Apache Ambari: Past, Present, Future
Hortonworks
 
PPTX
Ambari Meetup: Ambari Futures
Hortonworks
 
PPTX
Ambari Meetup: Architecture and Demo
Hortonworks
 
PPTX
Apache Ambari BOF - APIs - Hadoop Summit 2013
Hortonworks
 
PPTX
May 2013 HUG: Apache Sqoop 2 - A next generation of data transfer tools
Yahoo Developer Network
 
PPTX
Apache Ambari BOF - Blueprints + Azure - Hadoop Summit 2013
Hortonworks
 
PPTX
Ambari Meetup: APIs and SPIs of Ambari
Hortonworks
 
PPTX
Apache Ambari - What's New in 2.0.0
Hortonworks
 
PPTX
Apache Ambari - What's New in 1.5.0
Hortonworks
 
PPTX
Apache Ambari - What's New in 1.4.3
Hortonworks
 
PDF
Impala 2.0 Update #impalajp
Cloudera Japan
 
PPTX
Apache Ambari - What's New in 1.4.2
Hortonworks
 
PPTX
Apache Falcon DevOps
Sanjeev Tripurari
 
PPTX
Hive analytic workloads hadoop summit san jose 2014
alanfgates
 
PPTX
Apache Ambari - What's New in 2.4
Hortonworks
 
PDF
HiveServer2 for Apache Hive
Carl Steinbach
 
Managing 2000 Node Cluster with Ambari
DataWorks Summit
 
Discover.hdp2.2.ambari.final[1]
Hortonworks
 
Apache Ambari - What's New in 1.7.0
Hortonworks
 
Ambari Views - Overview
Hortonworks
 
Apache Ambari: Past, Present, Future
Hortonworks
 
Ambari Meetup: Ambari Futures
Hortonworks
 
Ambari Meetup: Architecture and Demo
Hortonworks
 
Apache Ambari BOF - APIs - Hadoop Summit 2013
Hortonworks
 
May 2013 HUG: Apache Sqoop 2 - A next generation of data transfer tools
Yahoo Developer Network
 
Apache Ambari BOF - Blueprints + Azure - Hadoop Summit 2013
Hortonworks
 
Ambari Meetup: APIs and SPIs of Ambari
Hortonworks
 
Apache Ambari - What's New in 2.0.0
Hortonworks
 
Apache Ambari - What's New in 1.5.0
Hortonworks
 
Apache Ambari - What's New in 1.4.3
Hortonworks
 
Impala 2.0 Update #impalajp
Cloudera Japan
 
Apache Ambari - What's New in 1.4.2
Hortonworks
 
Apache Falcon DevOps
Sanjeev Tripurari
 
Hive analytic workloads hadoop summit san jose 2014
alanfgates
 
Apache Ambari - What's New in 2.4
Hortonworks
 
HiveServer2 for Apache Hive
Carl Steinbach
 
Ad

Viewers also liked (9)

PDF
A Generative Method for Infrastructure Emergence
whichlight
 
PPTX
A complete hadoop stack
Abhra Pal
 
PPTX
"Who Moved my Data? - Why tracking changes and sources of data is critical to...
Cask Data
 
PDF
Introducing Athena: 08/19 Big Data Application Meetup, Talk #3
Cask Data
 
PDF
ACID Transactions in Apache Phoenix with Apache Tephra™ (incubating), by Poor...
Cask Data
 
PDF
Webinar: What's new in CDAP 3.5?
Cask Data
 
PPTX
The DAP - Where YARN, HBase, Kafka and Spark go to Production
DataWorks Summit/Hadoop Summit
 
PDF
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Cloudera, Inc.
 
PDF
Transactions Over Apache HBase
Cask Data
 
A Generative Method for Infrastructure Emergence
whichlight
 
A complete hadoop stack
Abhra Pal
 
"Who Moved my Data? - Why tracking changes and sources of data is critical to...
Cask Data
 
Introducing Athena: 08/19 Big Data Application Meetup, Talk #3
Cask Data
 
ACID Transactions in Apache Phoenix with Apache Tephra™ (incubating), by Poor...
Cask Data
 
Webinar: What's new in CDAP 3.5?
Cask Data
 
The DAP - Where YARN, HBase, Kafka and Spark go to Production
DataWorks Summit/Hadoop Summit
 
Hadoop World 2011: The Hadoop Stack - Then, Now and in the Future - Eli Colli...
Cloudera, Inc.
 
Transactions Over Apache HBase
Cask Data
 
Ad

Similar to Streamline Hadoop DevOps with Apache Ambari (20)

PPTX
Streamline Hadoop DevOps with Apache Ambari
DataWorks Summit/Hadoop Summit
 
PPTX
Streamline Hadoop DevOps with Apache Ambari
Alejandro Fernandez
 
PPTX
Simplified Cluster Operation and Troubleshooting
DataWorks Summit/Hadoop Summit
 
PDF
Hortonworks Technical Workshop: Apache Ambari
Hortonworks
 
PPTX
Apache Ambari - What's New in 1.6.0
Hortonworks
 
PPTX
Managing Enterprise Hadoop Clusters with Apache Ambari
Hortonworks
 
PDF
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Spain
 
PDF
GeekAustin DevOps
Matt Ray
 
PPTX
Manage Add-On Services with Apache Ambari
DataWorks Summit
 
PPTX
Apache Spark and Object Stores
Steve Loughran
 
PPTX
Calcite meetup-2016-04-20
Josh Elser
 
PDF
Spark Summit EU talk by Steve Loughran
Spark Summit
 
PDF
Making Sense out of Amazon ECS
WhiteHedge Technologies Inc.
 
PDF
Leveraging the Latest OSGi R7 Specifications - C Ziegeler & D Bosschaert
mfrancis
 
PPTX
S3Guard: What's in your consistency model?
Hortonworks
 
PDF
Automated infrastructure is on the menu
jtimberman
 
PPTX
Apache Spark and Object Stores —for London Spark User Group
Steve Loughran
 
PPTX
Apache Tez - A unifying Framework for Hadoop Data Processing
DataWorks Summit
 
PDF
Oracle Cloud - Infrastruktura jako kód
MarketingArrowECS_CZ
 
PPTX
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Data Con LA
 
Streamline Hadoop DevOps with Apache Ambari
DataWorks Summit/Hadoop Summit
 
Streamline Hadoop DevOps with Apache Ambari
Alejandro Fernandez
 
Simplified Cluster Operation and Troubleshooting
DataWorks Summit/Hadoop Summit
 
Hortonworks Technical Workshop: Apache Ambari
Hortonworks
 
Apache Ambari - What's New in 1.6.0
Hortonworks
 
Managing Enterprise Hadoop Clusters with Apache Ambari
Hortonworks
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Spain
 
GeekAustin DevOps
Matt Ray
 
Manage Add-On Services with Apache Ambari
DataWorks Summit
 
Apache Spark and Object Stores
Steve Loughran
 
Calcite meetup-2016-04-20
Josh Elser
 
Spark Summit EU talk by Steve Loughran
Spark Summit
 
Making Sense out of Amazon ECS
WhiteHedge Technologies Inc.
 
Leveraging the Latest OSGi R7 Specifications - C Ziegeler & D Bosschaert
mfrancis
 
S3Guard: What's in your consistency model?
Hortonworks
 
Automated infrastructure is on the menu
jtimberman
 
Apache Spark and Object Stores —for London Spark User Group
Steve Loughran
 
Apache Tez - A unifying Framework for Hadoop Data Processing
DataWorks Summit
 
Oracle Cloud - Infrastruktura jako kód
MarketingArrowECS_CZ
 
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Data Con LA
 

Recently uploaded (20)

PPTX
cloud computing vai.pptx for the project
vaibhavdobariyal79
 
PDF
Cloud-Migration-Best-Practices-A-Practical-Guide-to-AWS-Azure-and-Google-Clou...
Artjoker Software Development Company
 
PDF
Advances in Ultra High Voltage (UHV) Transmission and Distribution Systems.pdf
Nabajyoti Banik
 
PDF
Orbitly Pitch Deck|A Mission-Driven Platform for Side Project Collaboration (...
zz41354899
 
PDF
SparkLabs Primer on Artificial Intelligence 2025
SparkLabs Group
 
PDF
AI-Cloud-Business-Management-Platforms-The-Key-to-Efficiency-Growth.pdf
Artjoker Software Development Company
 
PPTX
AI in Daily Life: How Artificial Intelligence Helps Us Every Day
vanshrpatil7
 
PDF
Structs to JSON: How Go Powers REST APIs
Emily Achieng
 
PPTX
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
PDF
CIFDAQ's Market Wrap : Bears Back in Control?
CIFDAQ
 
PDF
OFFOFFBOX™ – A New Era for African Film | Startup Presentation
ambaicciwalkerbrian
 
PDF
A Strategic Analysis of the MVNO Wave in Emerging Markets.pdf
IPLOOK Networks
 
PDF
Accelerating Oracle Database 23ai Troubleshooting with Oracle AHF Fleet Insig...
Sandesh Rao
 
PDF
Automating ArcGIS Content Discovery with FME: A Real World Use Case
Safe Software
 
PPTX
Dev Dives: Automate, test, and deploy in one place—with Unified Developer Exp...
AndreeaTom
 
PDF
Make GenAI investments go further with the Dell AI Factory
Principled Technologies
 
PDF
Event Presentation Google Cloud Next Extended 2025
minhtrietgect
 
PDF
The Future of Mobile Is Context-Aware—Are You Ready?
iProgrammer Solutions Private Limited
 
PDF
The Future of Artificial Intelligence (AI)
Mukul
 
PDF
Get More from Fiori Automation - What’s New, What Works, and What’s Next.pdf
Precisely
 
cloud computing vai.pptx for the project
vaibhavdobariyal79
 
Cloud-Migration-Best-Practices-A-Practical-Guide-to-AWS-Azure-and-Google-Clou...
Artjoker Software Development Company
 
Advances in Ultra High Voltage (UHV) Transmission and Distribution Systems.pdf
Nabajyoti Banik
 
Orbitly Pitch Deck|A Mission-Driven Platform for Side Project Collaboration (...
zz41354899
 
SparkLabs Primer on Artificial Intelligence 2025
SparkLabs Group
 
AI-Cloud-Business-Management-Platforms-The-Key-to-Efficiency-Growth.pdf
Artjoker Software Development Company
 
AI in Daily Life: How Artificial Intelligence Helps Us Every Day
vanshrpatil7
 
Structs to JSON: How Go Powers REST APIs
Emily Achieng
 
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
CIFDAQ's Market Wrap : Bears Back in Control?
CIFDAQ
 
OFFOFFBOX™ – A New Era for African Film | Startup Presentation
ambaicciwalkerbrian
 
A Strategic Analysis of the MVNO Wave in Emerging Markets.pdf
IPLOOK Networks
 
Accelerating Oracle Database 23ai Troubleshooting with Oracle AHF Fleet Insig...
Sandesh Rao
 
Automating ArcGIS Content Discovery with FME: A Real World Use Case
Safe Software
 
Dev Dives: Automate, test, and deploy in one place—with Unified Developer Exp...
AndreeaTom
 
Make GenAI investments go further with the Dell AI Factory
Principled Technologies
 
Event Presentation Google Cloud Next Extended 2025
minhtrietgect
 
The Future of Mobile Is Context-Aware—Are You Ready?
iProgrammer Solutions Private Limited
 
The Future of Artificial Intelligence (AI)
Mukul
 
Get More from Fiori Automation - What’s New, What Works, and What’s Next.pdf
Precisely
 

Streamline Hadoop DevOps with Apache Ambari

  • 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Speaker Jayush Luniya Staff Software Engineer @ Hortonworks Apache Ambari PMC [email protected]
  • 3. 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Open-source platform to provision, manage and monitor Hadoop clusters
  • 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Why Ambari?
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Why Ambari?
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
  • 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved 1690 1864 797 2251 277 206 34379 488 0 500 1000 1500 2000 2500 3000 Ambari 2.0 Ambari 2.1 Ambari 2.2 Ambari 2.4 April '15 Jul - Sept'15 Dec'15 - Feb'16 Aug'16 - Sept '16 #.#.2 #.#.1 GA No.ofJIRAs Ambari Releases
  • 8. 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Ambari Architecture
  • 9. 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Exciting Enterprise Features in Ambari 2.4  New Services: Log Search, Zeppelin, Hive LLAP  Role Based Access Control  Management Packs  Grafana UI for Ambari Metrics System  New Views: Zeppelin, Storm
  • 10. 10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Deploy Secure Config Upgrade Monitor Extend Operations - Lifecycle Ease-of-Use
  • 11. 11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Deploy
  • 12. 12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Deploy On Premise Ambari handles all of these combinations and makes recommendations based on host specs.
  • 13. 13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Deploy In The Cloud Certified environments Sysprepped VMs Hundreds of similar clusters
  • 14. 14 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Deploy with Blueprints  Systematic way of defining a cluster  Export existing cluster into blueprint /api/v1/clusters/:clusterName?format=blueprint Configs Topology Hosts Cluster
  • 15. 15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Create Cluster with Blueprints { "configurations" : [ { "hdfs-site" : { "dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" } } { "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org" } ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org" }, { "fqdn" : "worker002.ambari.apache.org" }, … { "fqdn" : "worker099.ambari.apache.org" } ] } ] } 1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
  • 16. 16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Create Cluster with Blueprints { "configurations" : [ { "hdfs-site" : { "dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" } } { "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org" } ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org" }, { "fqdn" : "worker002.ambari.apache.org" }, … { "fqdn" : "worker099.ambari.apache.org" } ] } ] } 1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
  • 17. 17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Create Cluster with Blueprints { "configurations" : [ { "hdfs-site" : { "dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" } } { "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org" } ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org" }, { "fqdn" : "worker002.ambari.apache.org" }, … { "fqdn" : "worker099.ambari.apache.org" } ] } ] } 1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
  • 18. 18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Create Cluster with Blueprints { "configurations" : [ { "hdfs-site" : { "dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" } } { "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org" } ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org" }, { "fqdn" : "worker002.ambari.apache.org" }, … { "fqdn" : "worker099.ambari.apache.org" } ] } ] } 1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
  • 19. 19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Blueprints for Large Scale  Kerberos, secure out-of-the-box  High Availability is setup initially for NameNode, YARN, Hive, Oozie, etc  Host Discovery allows Ambari to automatically install services for a Host when it comes online  Stack Advisor recommendations
  • 20. 20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved POST /api/v1/clusters/MyCluster/hosts [ { "blueprint" : "single-node-hdfs-test2", "host_groups" :[ { "host_group" : "slave", "host_count" : 3, "host_predicate" : "Hosts/cpu_count>1” }, { "host_group" : "super-slave", "host_count" : 5, "host_predicate" : "Hosts/cpu_count>2& Hosts/total_mem>3000000" } ] } ] Blueprint Host Discovery
  • 21. 21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Secure
  • 22. 22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Comprehensive Security LDAP/AD • User Auth. • Sync Kerberos • MIT KDC • Keytab Management Atlas • Governance • Compliance • Data Classify • Lineage & History Ranger • Security policies • Audit • Authorization Knox • Perimeter Sec. • LDAP/AD • Sec. REST/HTTP • SSL
  • 23. 23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Kerberos Ambari manages Kerberos principals and keytabs Works with existing MIT KDC or Active Directory Once Kerberized, handles  Adding Services  Adding Hosts  Adding Host Components  Moving Host Components
  • 24. 24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Role Based Access Control (RBAC) As Ambari & organizations grow, so do security needs Ambari integrates with external authentication systems & LDAP
  • 25. 25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved RBAC Terms Users belong to groups A group has a role Users can also have additional roles Roles are applied to Resources. E.g., Ambari, particular Cluster, particular View Roles have permissions e.g., add services to cluster
  • 26. 26 © Hortonworks Inc. 2011 – 2016. All Rights Reserved New RBAC Roles only view ↑, except change configs ↑, except alter cluster topology or install components Ambari Admin Cluster Admin Cluster Op Service Admin Service Op Read-Only ↑, except add services, Kerberos, manage alerts & upgrades ↑, except manage permissions all
  • 27. Config
  • 28. Config Management: Config Groups (different config settings for individual host components); Config Versioning (revert back to old configs); Smart Configs (highlight most important configs); Stack Advisor (recommend configurations)
  • 29. Smart Configs Widgets: Sliders, Combos, Toggles, Spinners, Lists
  • 30. Stack Advisor: Kerberos, HTTPS, ZooKeeper Servers, Memory Settings, …, High Availability. Example Configurations: atlas.rest.address = http(s)://host:port # Atlas Servers; atlas.enableTLS = true|false; atlas.server.http.port = 21000; atlas.server.https.port = 21443
  • 31. Upgrade
  • 32. Background: Upgrade Terminology. Manual Upgrade: the user follows instructions to upgrade the stack; incurs downtime.
  • 33. Background: Upgrade Terminology. Manual Upgrade: the user follows instructions to upgrade the stack; incurs downtime. Rolling Upgrade: automated; upgrades one component per host at a time; preserves cluster operation and minimizes service impact.
  • 34. Background: Upgrade Terminology. Manual Upgrade: the user follows instructions to upgrade the stack; incurs downtime. Rolling Upgrade: automated; upgrades one component per host at a time; preserves cluster operation and minimizes service impact. Express Upgrade: automated; runs in parallel across hosts; incurs downtime.
  • 35. Automated Upgrade: Rolling/Express. (1) Check Prerequisites: review the prereqs to confirm your cluster configs are ready. (2) Prepare: take backups of critical cluster metadata. (3) Register + Install: register the HDP repository and install the target HDP version on the cluster. (4) Perform Upgrade: perform the HDP upgrade; the steps depend on the upgrade method, Rolling or Express. (5) Finalize: finalize the upgrade, making the target version the current version.
  • 36. Process: Rolling Upgrade. Services: ZooKeeper, Ranger, Hive, Oozie, Falcon, Kafka, Knox, Storm, Slider, Flume. Core Masters and Core Slaves: HDFS, YARN, HBase. Clients: HDFS, YARN, MR, Tez, HBase, Pig, Hive, etc. Last step: Finalize or Downgrade.
  • 37. Monitor
  • 38. Alerting Framework. Each entry gives the Alert Type, its Description, and its Thresholds (units):
      - WEB: Connects to a Web URL. Alert status is based on the HTTP response code. Thresholds: Response Code (n/a), Connection Timeout (seconds)
      - PORT: Connects to a port. Alert status is based on response time. Thresholds: Response (seconds)
      - METRIC: Checks the value of a service metric. Units vary, based on the metric being checked. Thresholds: Metric Value (units vary), Connection Timeout (seconds)
      - AGGREGATE: Aggregates the status for another alert. Thresholds: % Affected (percentage)
      - SCRIPT: Executes a script to handle the alert check. Thresholds: Varies
      - SERVER: Executes a server-side runnable class to handle the alert check. Thresholds: Varies
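To make the threshold column concrete, here is a minimal sketch, assuming Python, of how a PORT alert (status from response time) and an AGGREGATE alert (status from the percentage of affected alerts) could be evaluated. The function names, the state names, and the default threshold values are assumptions for illustration and are not taken from the slides.

```python
def port_alert_state(response_seconds, warning=1.5, critical=5.0):
    """Map a port response time to a state; None means the connection failed."""
    if response_seconds is None or response_seconds >= critical:
        return "CRITICAL"
    if response_seconds >= warning:
        return "WARNING"
    return "OK"


def aggregate_alert_state(states, warning_pct=10.0, critical_pct=30.0):
    """Summarize other alerts by the percentage that are not OK."""
    if not states:
        return "UNKNOWN"
    affected_pct = 100.0 * sum(1 for s in states if s != "OK") / len(states)
    if affected_pct >= critical_pct:
        return "CRITICAL"
    if affected_pct >= warning_pct:
        return "WARNING"
    return "OK"


# Hypothetical response times (seconds) from ten hosts; None = no connection.
samples = [0.2, 0.3, 0.1, 0.4, 0.2, 0.3, 0.2, 0.1, 0.3, None]
per_host = [port_alert_state(s) for s in samples]
print(per_host)
print(aggregate_alert_state(per_host))
```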
  • 39. Alert UI
  • 40. Grafana for Ambari Metrics. Features: Grafana as a “Native UI” for Ambari Metrics; Pre-built Dashboards (Host-level, Service-level); Supports HTTPS. Dashboards: System (Home, Servers); HDFS (Home, NameNodes, DataNodes); YARN (Home, Applications, Job History Server); HBase (Home, Performance)
  • 41. Grafana includes pre-built dashboards for visualizing the most important cluster metrics.
  • 42. The HDFS NameNode dashboard highlights file system activity.
  • 43. Storm Monitoring View
  • 44. Log Search: Search and index Hadoop logs! Capabilities: rapid search of all Hadoop component logs; search across time ranges, log levels, and for keywords. Components: Solr, Logsearch, Ambari
  • 45. Log Search architecture: Worker Node with Log Feeder (Java process, multi-output support, Grok filters); Solr (Solr Cloud, local disk storage); Log Search UI; Ambari
  • 46. Extend
  • 47. Management Packs. Improved Release Management: decouple Ambari core from stack releases. Support Add-ons: release vehicle for 3rd party services and views; self-contained release artifacts; a stack is an overlay of multiple management packs
  • 48. Overlay of Management Packs: inherits from 2.3, inherits from 2.4, inherits from 2.5
  • 49. Management Pack++. Short Term Goals (Ambari 2.4): Retrofit in Stack Processing Framework; Enable 3rd party to ship add-on services. Future Goals: Management Pack Framework; Include Views
  • 50. Service Level Extensions: Service Role Command Order; Service Advisor; Service Repos; Service Upgrade Packs
  • 51. Future
  • 52. Future of Ambari: Cloud Focus; Multiple Service Instances (Two ZK quorums); Multiple Service Versions (Spark 1.6 & Spark 2.0); YARN Assemblies; Granular Upgrades: Patch, Component, Service; Ambari High Availability
  • 53. Thank You

Editor's Notes

  • #5: Compare Hadoop to a Nissan GTR
  • #6: Takumi. "Nissan's Legendary Takumi: Four Master Craftsmen Who Hand-Build Every Nissan GT-R Engine", http://nissannews.com/en-US/nissan/usa/channels/nissan360/releases/nissan-s-legendary-takumi-four-master-craftsmen-who-hand-build-every-nissan-gt-r-engine. The four are Takumi Kurosawa, Tsunemi Ooyama, Izumi Shioya and Nobumitsu Gozu.
  • #7: Takumi for Hadoop: Services, Config, Provision on the cloud, Metrics, Alerts, Security, Host management, Views framework
  • #8: One of the most active Apache projects. Cadence is 2-3 major releases per year, with follow-up maintenance releases in the months after. http://jsfiddle.net/mp8rqq5x/2/
  • #10: Log Search: Solr, Logfeeder (similar to Logstash), and Grafana UI. Zeppelin: a web-based notebook for interactive data analytics; data exploration and visualization that can plug in to multiple data back ends. Role Based Access Control.
  • #11: Deploy: Blueprints with Host Discovery. Secure: Kerberos, LDAP sync. Smart Configs: stack advisor; it is painful to configure a thousand related knobs. E.g., changing the ZooKeeper quorum has an effect on several services, and changing the log folder affects Log Search. Upgrade: Rolling and Express Upgrade, get patches. Monitor: Ambari Alerts, Ambari Metrics. Analyze, Scale, Extend: Views, Management Packs.
  • #14: Cloudbreak can install on Amazon EC2, MSFT Azure. Cluster install takes 5-10 mins, mostly downloading packages, installing bits, and starting services.
  • #15: Used by HDInsight (Microsoft Azure) and Hortonworks QA. Allow cluster creation or scaling to be started via the REST API prior to all/any hosts being available. As hosts register with the Ambari server, they will be matched to request host groups and provisioned according to the requested topology. Allow host predicates to be specified along with host count to provide more flexibility in matching hosts to host groups; this will allow for host flavors, where different host groups are matched to different host flavors. Break up the current monolithic provisioning request into a request for each host operation, for example: install on host A, start on host A, install on host B, etc. This will allow hosts to make progress even when another host encounters a failure. Allow a host count to be specified in the cluster creation template instead of host names. This is documented in https://issues.apache.org/jira/browse/AMBARI-6275. Install a cluster with two API calls.
  • #16-#19: The blueprint contains the configs, the assignment of topology to host groups, and the stack version. The cluster creation request is what actually assigns hosts to each host group.
  • #20: Dynamic availability. Allow host_count to be specified instead of host_names. As hosts register, they will be matched to the request host groups and provisioned according to the requested topology. When specifying a host_count, a predicate can also be specified for finer-grained control.
  • #21: Dynamic availability. Allow host_count to be specified instead of host_names. As hosts register, they will be matched to the request host groups and provisioned according to the requested topology. When specifying a host_count, a predicate can also be specified for finer-grained control. 3 Terabytes, since the units are in MB.
  • #23: Hadoop is a business-critical data platform for enterprises. These corporations require a layered security model focusing on various aspects of security: authentication, authorization, auditing, data protection and data governance. Kerberos: strong authentication for users and services. LDAP/AD: external user authentication service vs. authenticated users in the Ambari DB. Services: Ranger, Atlas, Knox. Ranger: set up security policies on who can access what; authorization of audit files; plugins for other services like HDFS, Hive, Storm, etc. Atlas: data governance and tracking the lineage of data to ensure compliance, especially in health care and financial institutions. Knox: perimeter security for HTTP and REST calls in the Hadoop services; works with SSL and Kerberos. Kerberos Key Distribution Center, so we can define service principals and keytabs.
  • #24: Can use an existing KDC (key distribution center) or install one for Hadoop. Hadoop uses a rule-based system to create mappings between service principals and their related UNIX usernames.
  • #25: As Ambari grows and organizations grow, so do security needs. Users have fine-grained roles over the cluster and individual views. Granular authorization checks distribute the responsibilities and privileges of authenticated users.
  • #31: Stack Advisor can now ship the recommendations for a service with the service itself, instead of a monolithic stack advisor for the entire stack. This makes it easier to integrate customer services.
  • #33-#35: Express Upgrade: fastest method to upgrade the stack, since it upgrades an entire component in batches of 100 hosts at a time. Rolling Upgrade: one component at a time per host, which can take up to 1 min. For a 100 node cluster with
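The difference in the number of sequential rounds between the two automated methods can be illustrated with a short sketch, assuming Python. The batch size of 100 for Express comes from the note above; the 250-host cluster and the `batches` helper are hypothetical, and the sketch counts rounds only, without estimating wall-clock time.

```python
def batches(hosts, size):
    """Yield consecutive slices of `hosts`, each with at most `size` entries."""
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


# Hypothetical cluster of 250 hosts running the same component.
hosts = [f"host{i:03d}" for i in range(250)]

# Express: hosts are processed in batches of 100 (the component is down meanwhile).
express_rounds = list(batches(hosts, 100))

# Rolling: one host at a time, so the component stays available throughout.
rolling_rounds = list(batches(hosts, 1))

print(len(express_rounds), [len(b) for b in express_rounds])
print(len(rolling_rounds))
```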
  • #41: This Grafana instance is specifically for AMS and is not meant to be general-purpose. If the customer is already using Grafana, this is not a replacement. Grafana will support read-only access for anonymous users, and HTTPS. Aggregates across the entire cluster; filter by host; top/bottom x; functions like avg/sum/min/max; filter by date range.
  • #45: This is not HDP Search and it is not something that the customer has to license separately; it is an embedded Solr instance.
  • #46: Agent/collection process running on each host. Written in Java. Tails all service log files. Parses logs using Grok/regex. Can merge multi-line logs, e.g. a stack trace. On restart, can resume from the last read line; uses checkpoint files to maintain state. Extendable design to send logs to multiple destination types; currently can send logs to Solr and Kafka.
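The parsing and multi-line merging described in this note can be sketched as follows. This is an illustration in Python rather than the Java Log Feeder itself; the log line layout, the regular expression, the sample lines, and the function name `parse_events` are all assumptions, and checkpointing is left out.

```python
import re

# Assumed layout: "<date> <time>,<millis> <LEVEL> <message>".
LINE_START = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<msg>.*)$"
)


def parse_events(lines):
    """Group raw lines into events, folding continuation lines
    (such as a stack trace) into the event that precedes them."""
    events = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_START.match(line)
        if match:
            events.append({
                "ts": match.group("ts"),
                "level": match.group("level"),
                "message": match.group("msg"),
            })
        elif events:
            # No timestamp: treat the line as a continuation of the last event.
            events[-1]["message"] += "\n" + line
    return events


# Made-up sample input.
sample = [
    "2016-10-26 10:00:00,123 INFO Starting DataNode\n",
    "2016-10-26 10:00:01,456 ERROR Failed to start\n",
    "java.io.IOException: disk full\n",
    "    at Example.run(Example.java:42)\n",
    "2016-10-26 10:00:02,000 WARN Retrying\n",
]

events = parse_events(sample)
for event in events:
    print(event["level"], repr(event["message"]))
```

Folding continuation lines into the previous event keeps a stack trace searchable as one record instead of several unrelated lines.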
  • #53: Upgrade individual components in the same stack version, e.g., just DN and RM in HDP 2.5.*.*, with zero downtime.