Taming the Elephant - Learn how
    Monsanto manages their Hadoop clusters
    to enable Genome/Sequence processing

          Erich Hochmuth          Bala Venkatrao
         Mark Seidenstricker      Aparna Ramani

•   Hadoop World 2012, New York, October 25th, 2012
Agenda
• Introductions
• Monsanto Hadoop Use Case
     • Operational Challenges
     • How Monsanto leverages Cloudera Manager & Product Demo
     • Key benefits of using Cloudera Manager
•   Cloudera Manager
     • Overview
     • Key Features
     • Roadmap
•   Q&A

2
Introductions
    • Monsanto
      • Erich Hochmuth – R&D IT Data & Analytics Lead
      • Mark Seidenstricker – Infrastructure R&D Architect


    • Cloudera
       • Bala Venkatrao – Director, Products
       • Aparna Ramani – Director, Engineering



3
Monsanto Serves Farmers Around the World
    Working With Growers Large and Small, Row Crops and Vegetables




4
Monsanto’s Approach to Driving Yield
    A System of Agriculture Working Together to Boost Productivity




                          BREEDING                   BIOTECHNOLOGY                AGRONOMICS




                   The art and science             The science of improving    The farm management
                   of combining genetic material   plants by inserting genes   practices involved in
                   to produce a new seed           into their DNA              growing plants

5
Increasing Yield through Big Data
    At the Cornerstone of Yield Increases is Information & Analytics
                                            Increased Yield




                      Variety                      Volume                          Velocity




         • Raw sequence data              • PBs of NGS data              • 10s of millions of yield dps/day
         • Unstructured sensor data       • 10s of TBs of genomic data   • 100s of millions of genotyping dps/day
         • Poly-structured genomic data   • TBs of yield data            • TBs of NGS data/week
         • Spatial data                   • Billions of genotyping dps
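
For a rough sense of scale, the Velocity figures can be converted into sustained per-second rates. The sketch below is only a back-of-envelope calculation: it assumes the lower bound of each range (10 million and 100 million per day), reads "dps" as data points, and assumes the load is spread evenly over 24 hours. None of these assumptions come from the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Assumed lower bounds for the "10s of millions" and "100s of millions"
# figures; the slide gives orders of magnitude, not exact counts.
daily_points = {
    "yield": 10_000_000,
    "genotyping": 100_000_000,
}


def per_second(points_per_day: int) -> float:
    """Average sustained rate, assuming an even spread over 24 hours."""
    return points_per_day / SECONDS_PER_DAY


rates = {name: per_second(count) for name, count in daily_points.items()}

for name, rate in rates.items():
    print(f"{name}: at least ~{rate:,.0f} data points per second")
```

Real ingest is unlikely to be evenly spread, so peak rates would be higher than these averages.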

6
What are the Challenges of managing a Hadoop Cluster?
    Software Provisioning & Configuration Management
        •   Automated & simplified installation/patch management
        •   Streamlined cluster configuration

    Enterprise-ready Tools
        •   Enterprise grade monitoring & management capabilities
        •   Integration with existing enterprise IT stack

    Reporting & Monitoring
        •   Proactive monitoring & alerting
        •   Capacity planning

    Support
        •   Midwest Location
        •   Lack of Hadoop expertise


7
What are the Solutions?
    With Cloudera Manager, you get…
    Intuitive Management Console
         •   Mission control style dashboard for entire cluster
         •   Centralized management of entire Hadoop ecosystem
         •   Treat the cluster as an appliance
         •   Configuration change audit & validation
    Integration with Enterprise IT Management Tools
         •   Connect to Corporate LDAP
         •   Cloudera Manager API integrates with existing BMC platform
    Comprehensive Monitoring & Alerting
         •   Proactive service level alerts
         •   Summarized cluster level graphs & charts
         •   Real-time series charts (MapReduce & HBase)
    Historical Cluster Metrics/Reports
         •   Capacity planning - Disk usage/ Slot Capacity
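
Capacity planning from historical disk-usage metrics often comes down to projecting a trend forward. The following is a minimal sketch, assuming a least-squares linear fit over daily usage samples. The sample values, the 500 TB capacity, and the function name are hypothetical, and this is not a description of how Cloudera Manager computes its reports.

```python
from typing import Optional, Sequence


def days_until_full(used_tb: Sequence[float], capacity_tb: float) -> Optional[float]:
    """Estimate days until capacity is reached from daily usage samples.

    Fits a least-squares line through (day index, used TB) and applies the
    fitted growth rate to the most recent sample. Returns None when usage
    is flat or shrinking.
    """
    n = len(used_tb)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(used_tb) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_tb))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var  # growth in TB per day
    if slope <= 0:
        return None
    return (capacity_tb - used_tb[-1]) / slope


# Hypothetical week of daily samples on a hypothetical 500 TB cluster.
samples = [400.0, 402.0, 404.0, 406.0, 408.0, 410.0, 412.0]
remaining = days_until_full(samples, capacity_tb=500.0)
print(f"Projected days until full: {remaining:.0f}")
```

A linear fit is the simplest possible model; a cluster with bursty or seasonal ingest would need something more careful.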


8
What are the Benefits of Cloudera Manager?
    Lowers the barrier for Hadoop administration
        •   No need to rely solely on experts

        •   Reduces the number of administrators needed

    Provides a “one-stop” holistic view
        •   Easy to understand how the overall cluster is performing

    Includes pre-tuned configuration with best practices
        •   Get straight to solving the business problem

    Integrates with Cloudera support
        •   Leverage the real experts…not just for bugs
9
Cloudera Enterprise – The Platform for Big Data




10
Why You Need Cloudera Manager?
     Complexity
     Hadoop is more than a dozen services running across many machines
        • Hundreds of hardware components
        • Thousands of settings
        • Limitless permutations

     Context
     Hadoop is a system, not just a collection of parts
        • Everything is interrelated
        • Raw data about individual pieces is not enough
        • Must extract what’s important


     Efficiency
     Managing Hadoop with multiple tools & manual processes takes longer
        • Complicated, error-prone workflows
        • Longer issue resolution
        • Lack of consistent & repeatable processes

11
Cloudera Manager
     End-to-End Administration for CDH




     1   Deploy
         Install, configure & start your cluster in 3
         simple steps



     2 Configure & Optimize
         Ensure optimal settings for all hosts & services




     3 Monitor, Diagnose & Report
         Find & fix problems quickly, view current &
         historical activity & resource usage



12
Managing Complexity
       One Tool For Everything
       [Figure: DO-IT-YOURSELF vs. CLOUDERA ENTERPRISE, compared across Deployment & Configuration, Monitoring, Workflows, Events & Alerts, Log Search, Diagnostics, Reporting, and Activity Monitoring]




      “In a recent Cloudera survey, >95% of respondents emphasized the importance of having a
                      single end-to-end tool to manage their Hadoop Operations”
 13
Raw Data vs. Hadoop Intelligence
     Providing Context




     1   Smart Configuration
         Auto-sets configurations & guards against user error

     2   Workflows
         Ensures that multi-step tasks are accomplished completely
         & in the correct sequence

     3   Dependencies
         Aware of how a particular action affects the rest of the
         cluster & manages the impact

     4   Events & Alerts
         Makes you aware of what’s important at a Hadoop system level

     5   History
         Compares current & past activities for context
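
The Workflows and Dependencies points can be illustrated with a dependency-ordered start sequence. This is a minimal sketch using a topological sort from the Python standard library. The dependency map is an assumption made for the example, not something taken from the slides or from Cloudera Manager itself.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Illustrative dependency map: each service lists the services that must be
# running before it starts. These edges are assumptions for the example.
depends_on = {
    "ZooKeeper": set(),
    "HDFS": {"ZooKeeper"},
    "MapReduce": {"HDFS"},
    "HBase": {"HDFS", "ZooKeeper"},
}

# Dependencies come first in the start order; stopping reverses it.
start_order = list(TopologicalSorter(depends_on).static_order())
stop_order = list(reversed(start_order))

print("start:", " -> ".join(start_order))
print("stop: ", " -> ".join(stop_order))
```

Ordering services this way is what keeps a multi-step restart from bringing up a service before the ones it relies on.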

14
Cloudera Manager Key Features
                  Installs the complete Hadoop stack in minutes via a wizard-based interface

                  Gives you complete, end-to-end visibility and control over your Hadoop cluster from a single
                  interface
                  Allows you to manage multiple clusters from a single instance of Cloudera Manager

                  Integrate Cloudera Manager with Active Directory

                  Establishes the time context globally for almost all views

                  Correlates jobs, activities, logs, system changes, configuration changes and service metrics along
                  a single timeline to simplify diagnosis
                  Set server roles, configure services and manage security across the cluster

                  Gracefully start, stop and restart services as needed
                  Supports Administrator and Read-Only users

                  Maintains a complete record of configuration changes with the ability to roll back to previous
                  states
                  Monitors dozens of service performance metrics and alerts you when you approach critical
                  thresholds
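
Threshold-based alerting of the kind described in the last feature can be sketched in a few lines. The metric names, limits, and function below are hypothetical and only illustrate the general idea of warning and critical levels. They are not Cloudera Manager's metric names or API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float


def evaluate(metrics: Dict[str, float], thresholds: Dict[str, Threshold]) -> List[str]:
    """Return one alert message per metric at or above its warning level."""
    alerts = []
    for name, value in sorted(metrics.items()):
        limit = thresholds.get(name)
        if limit is None:
            continue  # no threshold configured for this metric
        if value >= limit.critical:
            alerts.append(f"CRITICAL {name}={value}")
        elif value >= limit.warning:
            alerts.append(f"WARNING {name}={value}")
    return alerts


# Hypothetical metric names and limits.
thresholds = {
    "hdfs_disk_used_pct": Threshold(warning=75.0, critical=90.0),
    "map_slots_used_pct": Threshold(warning=80.0, critical=95.0),
}
metrics = {"hdfs_disk_used_pct": 82.5, "map_slots_used_pct": 97.0, "uptime_days": 12.0}

alerts = evaluate(metrics, thresholds)
print("\n".join(alerts))
```

The warning level is what makes the alert proactive: it fires while there is still headroom before the critical limit.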
15
Cloudera Manager Key Features (Cont'd)
                  Gather, view and search Hadoop logs collected from across the cluster

                  Scans Hadoop logs for irregularities and warns you before they impact the cluster
                  Creates and aggregates relevant Hadoop events pertaining to system health, log messages, user
                  services and activities and makes them available for alerting and searching


                  Generates email alerts when certain events occur

                  Consolidates all cluster activity into a single, real-time view

                  View information pertaining to hosts in your cluster including status, resident memory, virtual
                  memory and roles
                  Visualize health status and metrics across the cluster to quickly identify problem nodes and take
                  action
                  Visualize current and historical disk usage by user, group and directory
                  Track MapReduce activity on the cluster by job or user
                  Takes a snapshot of the cluster state and automatically sends it to Cloudera support to assist with
                  resolution
                  Easily integrate Cloudera Manager with your existing enterprise-wide management and monitoring
                  tools
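
The log gathering and scanning features listed above can be approximated, in spirit, by counting warning and error lines per component. This is a rough sketch that assumes a log4j-style line layout (date, time, level, logger, message). The sample lines and logger names are made up, and the approach is not a description of how Cloudera Manager scans logs.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed log4j-style layout: "<date> <time> <LEVEL> <logger>: <message>".
LINE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<logger>\S+): (?P<message>.*)$")


def count_problems(lines: Iterable[str], levels=("WARN", "ERROR", "FATAL")) -> Counter:
    """Count lines per (level, logger) for the given severity levels."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match and match["level"] in levels:
            counts[(match["level"], match["logger"])] += 1
    return counts


# Made-up sample lines for illustration.
sample = [
    "2012-10-25 10:00:01,120 INFO example.DataNode: heartbeat ok",
    "2012-10-25 10:00:02,431 ERROR example.DataNode: disk /data/3 failed",
    "2012-10-25 10:00:03,007 ERROR example.DataNode: disk /data/3 failed",
    "2012-10-25 10:00:04,388 WARN example.TaskTracker: slow task attempt",
]

problems = count_problems(sample)
for (level, logger), n in problems.most_common():
    print(f"{n:>3} {level:<5} {logger}")
```

Grouping by logger makes repeated failures from one component stand out, which is the kind of irregularity worth surfacing before it affects the cluster.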

16
Cloudera Manager Roadmap
     •   Cloudera Manager 4.1 – Released 10/24
           • Platform Support for CDH4.1
           • Cloudera Impala management & monitoring
           • New monitoring – Zookeeper, Flume NG
           • Maintenance Mode
           • Host Decommissioning
           • Several Usability Enhancements


     •   Cloudera Manager 4.5 – Early 2013
           •   Rolling Upgrades/ Restarts
           •   Enhanced Monitoring, Cluster Heatmaps etc.
           •   Role Groups Configuration
           •   Cloud Support
           •   Others – SNMP support, Error handling, ISV integration etc.


17
Why Cloudera Manager?
      Simple
      End-to-End Hadoop administration in a single tool

      Intelligent
      Manages Hadoop at a system level – Cloudera’s experience realized in software

      Efficient
      Simplifies complex workflows & makes administrators more productive

      Best-in-Class
      The only enterprise-grade Hadoop management application available

18
Next Steps
     • Try out the FREE edition of Cloudera Manager
        •   Download from:
            http://www.cloudera.com/products-services/tools/
        •   Support available via scm-users@cloudera.org


     • For Cloudera Enterprise subscriptions, please contact:
       sales@cloudera.com

19
Q&A




20
Strata + Hadoop World 2012: Taming the Elephant - Learn how Monsanto manages their Hadoop clusters to enable Genome/Sequence processing
Key Features
     Cloudera Manager




22
Install A Cluster In 3 Simple Steps
     Cloudera Manager Key Features


     1   Find Nodes
         Enter the names of the hosts which will be included in the Hadoop cluster. Click Continue.

     2   Install Components
         Cloudera Manager automatically installs the CDH components on the hosts you specified.

     3   Assign Roles
         Verify the roles of the nodes within your cluster. Make changes as necessary.


23
View Service Health & Performance
     Cloudera Manager Key Features




24
Get Host-Level Snapshots
     Cloudera Manager Key Features




25
Monitor & Diagnose Cluster Workloads
     Cloudera Manager Key Features




26
Gather, View & Search Hadoop Logs
     Cloudera Manager Key Features




27
Track Events From Across The Cluster
     Cloudera Manager Key Features




28
Report On System Performance & Usage
     Cloudera Manager Key Features




29
Visualize Health Status With Heatmaps
     Cloudera Manager Key Features




30
Manage Multiple CDH Clusters
     Cloudera Manager Key Features




31
Easily Configure High Availability
     Cloudera Manager Key Features




32
Set The Time Context Globally
     Cloudera Manager Key Features




33


Editor's Notes

  • #5: Monsanto is a St. Louis-based agricultural company with one goal in mind – produce more food, fiber and fuel using fewer inputs like water and land, while improving the lives of the people around the world who benefit from our technology.

    Monsanto utilizes a systems approach to improving upon today’s agricultural offerings: Breeding, Biotechnology, and Advanced Agronomic Practices. These three facets of our approach help farmers improve productivity, reduce the costs of farming, and grow better foods for consumers and better feed for animals.

    We’re proud to have customers of all kinds, from large-acre, technology-driven row-crop farmers in Central Illinois all the way to farmers with very small landholdings in Africa who are just beginning to realize the benefits of modern agriculture.
  • #6: Sustainably increasing yield, while more efficiently using inputs and resources, requires every tool at farmers’ disposal. At Monsanto, we’re focused on three pillars for driving yield: breeding, biotechnology and improved agronomic practices. All three are required to meet our goals.

    Basics of Breeding
    Breeding, a technique that has been practiced by farmers for thousands of years, involves bringing together two parent plants to produce a new offspring that contains a mixture of parent characteristics. Monsanto has assembled a pool of elite seed genetics (germplasm) from around the world, and we use cutting-edge technology to help us more quickly, efficiently and accurately find desired traits for breeding. Our primary method is using genetic analysis – mapping the DNA of plants – to identify seeds with traits we want, such as improved yield, disease resistance, suitability for a particular climate, and in the case of vegetables better taste and nutrition.

    Basics of Biotechnology
    Biotechnology is the process of inserting a gene from one species, like a plant or a bacterium, into another species. We use biotechnology to give plants desirable characteristics (or traits) that often cannot be developed through breeding practices. The traits we develop help farmers produce more of their crop, reduce costs and conserve resources. Examples of these traits would be herbicide tolerance, insect-resistance and drought-tolerance. We also are working to develop traits that will benefit consumers, such as soybeans that produce healthier oils.

    Basics of Agronomics
    Agronomic practices are steps farmers incorporate into their farm management systems to improve soil quality, enhance water use, manage crop residue and improve the environment through better fertilizer management. These steps not only improve a farmer’s bottom line by decreasing input costs, but also improve the environment by decreasing water use and over-fertilization.

    Improved agronomics cover a broad range of practices, suitable for any type of farm. For example, a high-tech, high productivity grower may use GPS and computer systems to automate planting for optimal row spacing and varying inputs acre by acre, to produce more and conserve more. A subsistence farmer can see significant benefits by learning about input management and optimal plant spacing to reduce costs and improve yield. Conservation tillage is a broadly applicable technique that preserves topsoil and locks in moisture.