Open Platform for Next-Gen Analytics

           Patrick Buddenbaum
               Director, Enterprise Segment
               Datacenter and Connected Systems Group
Legal Information
Today’s presentations contain forward-looking statements. All statements made that are not historical facts are subject to a number of
risks and uncertainties, and actual results may differ materially. Please refer to our most recent Earnings Release and our most recent
Form 10-Q or 10-K filing for more information on the risk factors that could cause actual results to differ.
If we use any non-GAAP financial measures during the presentations, you will find on our website, intc.com, the required reconciliation
to the most directly comparable GAAP financial measure.

INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR
OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY
WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION
INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR
INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate
performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may
affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components
they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference
www.intel.com/software/products.

Software and workloads used in performance tests may have been optimized for performance only on Intel
microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components,
software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other
information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that
product when combined with other products.

Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative
to obtain Intel's current plan of record product roadmaps.
Making Sense of One Petabyte




  • 50x: To read in Library of Congress
  • 13y: To view as HD Video
  • 11s: To generate in 2012

Sources: IDC 2012, The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East
http://blogs.loc.gov/digitalpreservation/2011/07/transferring-libraries-of-congress-of-data/
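The three figures on this slide can be turned around to see what each one implies about its reference quantity. The sketch below is a back-of-the-envelope check, not the method used by the cited sources. It assumes decimal units (1 PB = 10^15 bytes) and a 365.25-day year, and the function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the slide's three figures for one petabyte.
# Assumptions (not from the slide): 1 PB = 10**15 bytes, 365.25-day year.

PETABYTE = 10**15
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def implied_quantities(loc_multiple=50, hd_years=13, generate_seconds=11):
    """Return the reference values that the slide's figures imply for 1 PB."""
    # "50x" the Library of Congress -> implied size of one Library of Congress
    loc_terabytes = PETABYTE / loc_multiple / 10**12
    # "13y" of HD video -> implied average video bitrate
    hd_megabits_per_second = PETABYTE * 8 / (hd_years * SECONDS_PER_YEAR) / 10**6
    # "11s" to generate in 2012 -> implied data generated over a full year
    zettabytes_per_year = PETABYTE / generate_seconds * SECONDS_PER_YEAR / 10**21
    return loc_terabytes, hd_megabits_per_second, zettabytes_per_year


loc_tb, hd_mbps, zb_per_year = implied_quantities()
print(f"Implied Library of Congress size: {loc_tb:.0f} TB")
print(f"Implied HD video bitrate: {hd_mbps:.1f} Mbit/s")
print(f"Implied data generated in a year: {zb_per_year:.2f} ZB")
```

Under these assumptions the slide implies a Library of Congress of about 20 TB, an HD stream of roughly 19.5 Mbit/s, and a little under 2.9 ZB of data generated over the year.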
Analysis of Data can Transform Society

                 Enhance understanding, drive
                 innovation, and accelerate medical cures


                  Create new business models and
                  transform organizational processes


                  Improve public safety and increase
                  energy efficiency with smart grids
Virtuous Cycle of Data-Driven User Experience
  • CLIENTS: Richer user experiences
  • INTELLIGENT SYSTEMS: Richer data from devices
  • CLOUD: Richer data to analyze
Democratize Data Analysis from Edge to Cloud


                            Unlock value in silicon

                           Support open platforms


                             Intelligent Systems
                                 Framework
Intel at the Intersection of Big Data Forces

  • HPC: Enabling exascale computing on massive data sets (Intel® TrueScale Infiniband)
  • Cloud: Helping enterprises build open interoperable clouds
  • Open Source: Contributing code and fostering ecosystem



                              * Other names and brands may be claimed as the property of others.
History of Intel and Apache Hadoop*
Timeline, 2009 to 2013:
  • Phases: Research, Benchmarking, Tuning, Optimization, Product
  • Milestones: Open Cirrus*, HiBench, Release 1.0 (2011), Release 2.0 (2012)
  • Deployment areas: Healthcare, Telco, Web, Retail, Smart City
                           * Other names and brands may be claimed as the property of others.
Announcing Availability of
Intel® Distribution for Apache Hadoop* software


                      Hardware-enhanced performance & security

                      Enables partner innovation in analytics

                      Strengthens Apache Hadoop* ecosystem




             * Other names and brands may be claimed as the property of others.
Intel® Distribution for Apache Hadoop* software

          • Up to 20x faster decryption with AES-NI*
          • Granular access controls for HBase

          • Optimized with SSD and Cache Acceleration
          • Up to 8.5X faster queries in Hive
          • Hardware-enhanced compression with AVX & SSE4.2


          • Automated tuning with Intel® Active Tuner

                            *Based on internal testing
Intel Distribution for Apache Hadoop* software

Intel® Manager for Apache Hadoop software: Deployment, Configuration, Monitoring, Alerts, and Security

  • Sqoop: Data Exchange
  • Flume: Log Collector
  • Zookeeper: Coordination
  • Oozie: Workflow
  • Pig: Scripting
  • Mahout: Machine Learning
  • R connectors: Statistics
  • Hive: SQL Query
  • HBase: Columnar Store
  • YARN (MRv2): Distributed Processing Framework
  • HDFS: Hadoop Distributed File System

Legend:
  • Intel unique
  • Intel enhancements contributed back to open source
  • Open source components included without change

* Other names and brands may be claimed as the property of others.
Sold with World-Class Intel Support


      Annual Subscription with Technical Support

      Support Coverage Options: 24x7 or 8x5

      Via Solution Vendors and Service Providers
Continued Innovation


Pipeline of innovation from Intel Labs
   • Machine Learning, Graph Lab & Graph Builder
   • Data-Intensive Algorithms & Computer Architecture


Roadmap of open source from Intel Software
   • Project Rhino: Hardening Apache Hadoop
   • Project Panthera: Standard SQL on Apache Hadoop




                          * Other names and brands may be claimed as the property of others.
Backed by Broad Portfolio of Datacenter Products
  • Software
  • Cache Acceleration Software
  • Server
  • Storage & Memory
  • Network
Antoine Hue
                              Regional Sales Manager
                                    APJC Data Center




* Other names and brands may be claimed as the property of others.
>4 Hours to 7 Minutes
Intel Platform Benefits for Sorting 1TB Data

  • Baseline (Intel® Xeon 5690, 7200 HDD, 1GbE Adapters): >4 Hours
  • Intel® Xeon® E5-2690 processor: ~50% improved
  • Intel® SSD 520 Series: ~80% improved
  • Intel® 10GbE Adapters: ~50% improved
  • Deploy Intel Distribution for Apache Hadoop*: ~40% improved
  • Result: ~7 mins

Software and workloads used in performance tests may have been optimized for performance only on Intel
microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components,
software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other
information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that
product when combined with other products.

Source: Intel Internal testing
For more information go to: intel.com/performance
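The headline figure on this slide can be reproduced by compounding the four per-step improvements. The sketch below is one reading of the chart, not Intel's published methodology. It assumes that each "~X% improved" means the run time drops by X% relative to the previous step, and it treats the ">4 Hours" baseline as exactly 240 minutes. The names `STEPS` and `apply_steps` are illustrative.

```python
# Compound the per-step improvements from the slide onto the baseline.
# Assumptions (not stated on the slide): "~X% improved" means run time drops
# by X% relative to the previous step; ">4 Hours" is taken as 240 minutes.

BASELINE_MINUTES = 240.0

STEPS = [
    ("Intel Xeon E5-2690 processor", 0.50),
    ("Intel SSD 520 Series", 0.80),
    ("Intel 10GbE adapters", 0.50),
    ("Intel Distribution for Apache Hadoop", 0.40),
]


def apply_steps(baseline_minutes, steps):
    """Return (label, remaining minutes) after each successive upgrade."""
    remaining = baseline_minutes
    timeline = []
    for label, reduction in steps:
        remaining *= 1.0 - reduction
        timeline.append((label, remaining))
    return timeline


timeline = apply_steps(BASELINE_MINUTES, STEPS)
for label, minutes in timeline:
    print(f"{label}: {minutes:.1f} min")

final_minutes = timeline[-1][1]
print(f"Overall speedup: {BASELINE_MINUTES / final_minutes:.1f}x")
```

Under this reading the run time falls from 240 to 120, 24, 12, and finally about 7.2 minutes, which is consistent with the "~7 mins" shown on the slide.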
Proven in the Enterprise

Using the Intel® Distribution to gain tremendous results



                                                          IT




                 * Other names and brands may be claimed as the property of others.
Customer Video
With Broad Support from the Ecosystem




          * Other names and brands may be claimed as the property of others.
Chris Levanes
Director of Cloud Business Development
                            Savvis Asia
Big Data launch keynote Singapore Patrick Buddenbaum
The Promise of Big Data Requires Industrialized Services
Big Data Customers Need

• Trusted, mission critical, high-powered
  computing solutions

• Robust security options

• Enterprise-grade global storage capabilities

• Highly available compute power

• Cloud-based economic model

• Expert consulting services to aid in
  transformation of data assets
A Longstanding Successful Alliance
Enterprise-Grade, Industrialized Infrastructure Services for
Intel Distribution for Apache Hadoop Software
Summary

• Intel announced Intel® Distribution for Apache Hadoop* software

• Delivers performance, security and ease of deployment

• Backed by broad portfolio of Intel data center products

• Contributes to open source and supports Apache Hadoop

• Enabling ecosystem of partners to innovate on analytics solutions
Q&A
Legal Disclaimers
All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across
different processor families. Go to: http://www.intel.com/products/processor_number

Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from
published specifications. Current characterized errata are available on request.

Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, and virtual machine monitor (VMM). Functionality,
performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all
operating systems. Consult your PC manufacturer. For more information, visit http://www.intel.com/go/virtualization

No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a computer system
with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible
measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.2. For more information, visit
http://www.intel.com/technology/security

Intel, Intel Xeon, Intel Atom, Intel Xeon Phi, Intel Itanium, the Intel Itanium logo, the Intel Xeon Phi logo, the Intel Xeon logo and the Intel logo are
trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Other names and brands may be claimed as the property of others.
Copyright © 2013, Intel Corporation. All rights reserved.
Apache Hadoop Performance Test Configuration
4 hours to 7 minutes

Cluster Configuration
 1 Head Node (name node, job tracker)
 10 Workers (data nodes, task trackers)
 10-Gigabit Switch: Cisco Nexus 5020

Software Configuration
 Intel Distribution for Apache Hadoop 2.1.1
 Apache Hadoop 1.0.3
 RHEL 6.3
 Oracle Java 1.7.0_05

Head Node Hardware
 1 x Dell r710 1U servers
 Intel: 2 x 3.47GHz Intel® Xeon® processor X5690
 Memory: 48G RAM
 Storage: 10K SAS HDD
 Intel® Ethernet 10 Gigabit SFP+
 Intel® Ethernet 1 Gigabit

Worker Node Hardware
 10 x Dell r720 2U servers
 Intel: 2 x 2.90GHz Intel® Xeon® processor E5-2690
 Memory: 128G RAM
 Storage: 520 Series SSDs
 Intel® Ethernet 10 Gigabit SFP+
 Intel® Ethernet 1 Gigabit
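To relate this configuration to the headline result, the sketch below estimates the average sort throughput that the quoted run times imply. These are rough illustrations, not measured values. It assumes 1 TB = 10^12 bytes, a run of exactly 7 minutes, a baseline of exactly 4 hours, and work spread evenly across the 10 worker nodes. The function name is hypothetical.

```python
# Average sort throughput implied by the quoted run times.
# Assumptions (not from the slides): 1 TB = 10**12 bytes, exactly 7 minutes
# for the tuned run, exactly 4 hours for the baseline, and an even split of
# work across the 10 worker nodes listed above.

TERABYTE = 10**12
WORKER_NODES = 10


def throughput_mb_per_s(data_bytes, minutes):
    """Average throughput in MB/s for processing data_bytes in `minutes`."""
    return data_bytes / (minutes * 60) / 10**6


cluster_mb_s = throughput_mb_per_s(TERABYTE, 7)
per_worker_mb_s = cluster_mb_s / WORKER_NODES
baseline_mb_s = throughput_mb_per_s(TERABYTE, 4 * 60)

print(f"Tuned cluster: {cluster_mb_s:.0f} MB/s")
print(f"Per worker node: {per_worker_mb_s:.0f} MB/s")
print(f"Baseline cluster (at most): {baseline_mb_s:.0f} MB/s")
```

Because the baseline is stated as more than 4 hours, the baseline figure is an upper bound on the original cluster's average throughput.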




Big Data launch keynote Singapore Patrick Buddenbaum

  • 1. Open Platform for Next-Gen Analytics Patrick Buddenbaum Director, Enterprise Segment Datacenter and Connected System Group
  • 2. Legal Information Today’s presentations contain forward-looking statements. All statements made that are not historical facts are subject to a number of risks and uncertainties, and actual results may differ materially. Please refer to our most recent Earnings Release and our most recent Form 10-Q or 10-K filing for more information on the risk factors that could cause actual results to differ. If we use any non-GAAP financial measures during the presentations, you will find on our website, intc.com, the required reconciliation to the most directly comparable GAAP financial measure. INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference www.intel.com/software/products. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. 
You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel's current plan of record product roadmaps.
  • 3. Making Sense of One Petabyte 50x 13y 11s To read To view To generate in Library of Congress as HD Video in 2012 Sources: IDC 2012, The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East http://blogs.loc.gov/digitalpreservation/2011/07/transferring-libraries-of-congress-of-data/
  • 4. Analysis of Data can Transform Society Enhance understanding, drive innovation, and accelerate medical cures Create new business models and transform organizational processes Improve public safety and increase energy efficiency with smart grids
  • 5. Virtuous Cycle of Data-Driven User Experience Richer user experiences Richer data to analyze CLIENTS Richer data CLOUD from devices INTELLIGENT SYSTEMS
  • 6. Democratize Data Analysis from Edge to Cloud Unlock value in silicon Support open platforms Intelligent Systems Framework
  • 7. Intel at the Intersection of Big Data Forces HPC Cloud Open Source Intel® TrueScale Infiniband Enabling exascale computing Helping enterprises build Contributing code and on massive data sets open interoperable clouds fostering ecosystem * Other names and brands may be claimed as the property of others.
  • 8. History of Intel and Apache Hadoop* Product Optimization Tuning Benchmarking Release 2.0 Research Telco Smart City (2012) Release 1.0 HiBench Healthcare Retail (2011) Web Open Cirrus* 2009 2013 * Other names and brands may be claimed as the property of others.
  • 9. Announcing Availability of Intel® Distribution for Apache Hadoop* software Hardware-enhanced performance & security Enables partner innovation in analytics Strengthens Apache Hadoop* ecosystem * Other names and brands may be claimed as the property of others.
  • 10. Intel® Distribution for Apache Hadoop* software • Up to 20x faster decryption with AES-NI* • Granular access controls for HBase • Optimized with SSD and Cache Acceleration • Up to 8.5X faster queries in Hive • Hardware-enhanced compression with AVX & SSE4.2 • Automated tuning with Intel® Active Tuner *Based on internal testing
  • 11. Intel Distribution for Apache Hadoop* software • Intel® Manager for Apache Hadoop software: Deployment, Configuration, Monitoring, Alerts, and Security • Sqoop: Data Exchange • Oozie: Workflow • Pig: Scripting • Mahout: Machine Learning • R connectors: Statistics • Hive: SQL Query • HBase: Columnar Store • Zookeeper: Coordination • YARN (MRv2): Distributed Processing Framework • Flume: Log Collector • HDFS: Hadoop Distributed File System Legend: Intel unique; Intel enhancements contributed back to open source; open source components included without change
  • 12. Sold with World-Class Intel Support • Annual Subscription with Technical Support • Support Coverage Options: 24x7 or 8x5 • Via Solution Vendors and Service Providers
  • 13. Continued Innovation Pipeline of innovation from Intel Labs: • Machine Learning, GraphLab & Graph Builder • Data-Intensive Algorithms & Computer Architecture Roadmap of open source from Intel Software: • Project Rhino: Hardening Apache Hadoop • Project Panthera: Standard SQL on Apache Hadoop
  • 14. Backed by Broad Portfolio of Datacenter Products • Software (Cache Acceleration Software) • Server • Storage & Memory • Network
  • 15. Antoine Hue, Regional Sales Manager, APJC Data Center
  • 16. >4 Hours to 7 Minutes: Intel Platform Benefits for Sorting 1TB Data • Baseline (Intel® Xeon 5690, 7200 HDD, 1GbE Adapters): >4 Hours • Intel® Xeon® E5-2690 processor: ~50% improved • Intel® SSD 520 Series: ~80% improved • Intel® 10GbE Adapters: ~50% improved • Deploy Intel Distribution for Apache Hadoop*: ~40% improved • Result: ~7 mins Source: Intel internal testing. For more information go to: intel.com/performance
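The headline figure on slide 16 can be sanity-checked by compounding the four per-step improvements. The sketch below assumes that each "~X% improved" means an X% reduction in sort time relative to the previous step, that the steps compound multiplicatively, and that the baseline is exactly 4 hours (the slide says ">4 hours"). The pairing of each percentage with an upgrade is one reading of the slide layout; the final result does not depend on it because multiplication is commutative.

```python
BASELINE_MINUTES = 4 * 60  # assumption: slide says ">4 hours", so this is a lower bound

# (upgrade, fractional reduction in sort time); the pairing is an assumed reading of the slide
STEPS = [
    ("Xeon 5690 -> Xeon E5-2690", 0.50),
    ("7200 HDD -> SSD 520 Series", 0.80),
    ("1GbE -> 10GbE adapters", 0.50),
    ("Intel Distribution for Apache Hadoop", 0.40),
]


def compound(baseline_minutes, steps):
    """Return the running sort time after each step, applying reductions in order."""
    times = []
    remaining = baseline_minutes
    for name, reduction in steps:
        remaining *= (1.0 - reduction)
        times.append((name, remaining))
    return times


if __name__ == "__main__":
    for name, minutes in compound(BASELINE_MINUTES, STEPS):
        print(f"{name:40s} {minutes:6.1f} min")
```

Under these assumptions the sort time falls to about 120, 24, 12, and 7.2 minutes in turn, which is consistent with the "~7 mins" shown on the slide.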
  • 17. Proven in the Enterprise Using the Intel® Distribution to gain tremendous results IT
  • 19. With Broad Support from the Ecosystem
  • 20. Chris Levanes, Director of Cloud Business Development, Savvis Asia
  • 22. The Promise of Big Data Requires Industrialized Services
  • 23. Big Data Customers Need • Trusted, mission-critical, high-powered computing solutions • Robust security options • Enterprise-grade global storage capabilities • Highly available compute power • Cloud-based economic model • Expert consulting services to aid in transformation of data assets
  • 25. Enterprise-Grade, Industrialized Infrastructure Services for Intel Distribution for Apache Hadoop Software
  • 26. Summary • Intel announced Intel® Distribution for Apache Hadoop* software • Delivers performance, security and ease of deployment • Backed by broad portfolio of Intel data center products • Contributes to open source and supports Apache Hadoop • Enabling ecosystem of partners to innovate on analytics solutions
  • 27. Q&A
  • 28. Legal Disclaimers All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to: http://www.intel.com/products/processor_number Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request. Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, and virtual machine monitor (VMM). Functionality, performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all operating systems. Consult your PC manufacturer. For more information, visit http://www.intel.com/go/virtualization No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a computer system with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit http://www.intel.com/technology/security Intel, Intel Xeon, Intel Atom, Intel Xeon Phi, Intel Itanium, the Intel Itanium logo, the Intel Xeon Phi logo, the Intel Xeon logo and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Other names and brands may be claimed as the property of others.
Copyright © 2013, Intel Corporation. All rights reserved.
  • 29. Apache Hadoop Performance Test Configuration: 4 hours to 7 minutes Cluster Configuration: • 1 Head Node (name node, job tracker) • 10 Workers (data nodes, task trackers) • 10-Gigabit Switch: Cisco Nexus 5020 Software Configuration: • Intel Distribution for Apache Hadoop 2.1.1 • Apache Hadoop 1.0.3 • RHEL 6.3 • Oracle Java 1.7.0_05 Head Node Hardware: • 1 x Dell r710 1U server • Intel: 2 x 3.47GHz Intel® Xeon® processor X5690 • Memory: 48G RAM • Storage: 10K SAS HDD • Intel® Ethernet 10 Gigabit SFP+ • Intel® Ethernet 1 Gigabit Worker Node Hardware: • 10 x Dell r720 2U servers • Intel: 2 x 2.90GHz Intel® Xeon® processor E5-2690 • Memory: 128G RAM • Storage: 520 Series SSDs • Intel® Ethernet 10 Gigabit SFP+ • Intel® Ethernet 1 Gigabit
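For a rough sense of scale, the test configuration on slide 29 implies an average sort throughput that can be computed directly. This sketch assumes a decimal terabyte (10^12 bytes), the rounded 7-minute result, and an even split of work across the 10 workers. It counts only input bytes sorted, not the additional disk and network traffic a Hadoop sort generates, and the function name is illustrative.

```python
TERABYTE_BYTES = 10**12   # assumption: decimal terabyte
SORT_SECONDS = 7 * 60     # "~7 mins" from the slides, rounded
WORKERS = 10              # worker nodes listed in the test configuration


def throughput_mb_per_s(total_bytes: int, seconds: float, nodes: int = 1) -> float:
    """Average input bytes sorted per second per node, in MB/s (10**6 bytes)."""
    return total_bytes / seconds / nodes / 1e6


if __name__ == "__main__":
    cluster = throughput_mb_per_s(TERABYTE_BYTES, SORT_SECONDS)
    per_worker = throughput_mb_per_s(TERABYTE_BYTES, SORT_SECONDS, WORKERS)
    print(f"Cluster:    {cluster:8.1f} MB/s")
    print(f"Per worker: {per_worker:8.1f} MB/s")
```

Under these assumptions the cluster sorts roughly 2.4 GB of input per second, or about 238 MB/s per worker.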