What the Spark!? Intro and Use Cases

The document discusses the introductory concepts and use cases of Apache Spark, emphasizing its significance in the big data landscape. It briefly covers components such as Spark SQL, Spark Streaming, MLlib, and GraphX, addressing common myths and misconceptions. The content aims to provide insights for getting started with Spark.

© Copyright 2015 Glassbeam Inc.
What the Spark!
Intro and Use Cases
February 26, 2015

Big Data

Volume
Variety
Velocity

[Figure: eras labeled 1980s, 1990-2000s, and 2010 and beyond. Source: Cisco, IDC, Wikibon report 2013]

Quick Review

Spark Intro

Why is Spark hot?

Spark SQL Intro

Spark Streaming Intro

MLlib Intro

GraphX Intro

Myths and Misconceptions

Getting Started

Questions?
