Spark Usage in Enterprise Business Operations
Ken Tsai
VP, Data Management & Platform-as-Services
SAP
@kentsaiSAP
2.17.16: Spark Summit, NYC
©  2016 SAP SE or an SAP affiliate company. All rights reserved. Spark Summit New York, 2.17.16
© 2016 SAP SE or an SAP affiliate company. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an
SAP affiliate company.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE
(or an SAP affiliate company) in Germany and other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark
information and notices.
Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors.
National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its
affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or
SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and
services, if any. Nothing herein should be construed as constituting an additional warranty.
In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or
release any functionality mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future
developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for
any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. All
forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place
undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
SAP – Our Quick Snapshot in the Enterprise Computing World
74% of the world’s transaction revenue touches an SAP system.
SAP’s product focus: Enterprise Applications; Business Networks; Platforms – 15 yrs on in-memory computing (IMC)
SAP customers represent 87% of Forbes Global 2,000 companies.
SAP touches $16 trillion of world consumer purchases.
SAP HANA – An In-Memory Platform to Enable New Business
Scenarios Previously Not Feasible
[Diagram: accounting tables BKPF and BSEG with no indices, no aggregates, and no redundancies; the core data structure remains unchanged]
•  Soft financial close anytime
•  Real-time revenue and cost analysis
•  Real-time liquidity forecasts
•  Real-time alerts and blocks on suspicious transactions
Distributed Big Data Is Everywhere
How to better use it in core enterprise business applications?
~79% of data reservoirs/lakes are still disconnected from core business operations
How do I embed big data signals into my business applications and enterprise analytics?
53%: Difficulty integrating with CRM and/or other systems
49%: Unable to apply or integrate external data quickly enough to inform real-time decision making
59%: Only a few analysts with specialized training can analyze big data
Harvard Business Review Analytic Services, Global Survey of 251 Respondents, Sept. 2015
Introducing SAP HANA Vora
An in-memory query engine that extends the Apache Spark execution framework to enrich interactive analytics on massively distributed computing clusters
•  OLAP processing
•  In-memory computing for high performance
•  Connecting to enterprise systems
•  Unified system management
[Diagram labels: SAP HANA, ERP data, big data, parallelized queries, Vora]
Key Open Source Contribution to Apache Spark Ecosystem
Spark to HANA Push-downs & Data Hierarchies
// Count the parts and average their retail price at each level of the hierarchy
scala> val hierarchy = sqlContext.sql("""
  SELECT
    LVL, COUNT(*), ROUND(AVG(P_RETAILPRICE), 2)
  FROM (
    SELECT LEVEL(node) AS LVL, P_RETAILPRICE
    FROM
      HIERARCHY(
        USING PART_HIERARCHY AS c
        JOIN PARENT p ON c.P_PARENT = p.P_PARTKEY
        SEARCH BY
          P_PARTKEY ASC
        START WHERE
          P_PARTKEY = 1
        SET node ) AS H0
  ) T1 GROUP BY LVL
""")
scala> hierarchy.collect().foreach(println)
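[Figure: part hierarchy tree labeled with P_RETAILPRICE per node: 901 (level 0); 903 and 904 (level 1); 913, 912, and 911 (level 2)]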
+-----+-----+------------------+
|LEVEL|COUNT|AVG(P_RETAILPRICE)|
+-----+-----+------------------+
|  0  |  1  |       901        |
|  1  |  2  |      903.5       |
|  2  |  3  |       912        |
+-----+-----+------------------+
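The per-level aggregation above can be reproduced without a cluster. What follows is only a minimal sketch in plain Scala collections, not Vora's implementation: the `Part` case class, the part keys, and the parent links are assumptions made for illustration, and only the six retail prices come from the slide. It also assumes the parent links contain no cycles.

```scala
object HierarchyLevels {
  // Hypothetical stand-in for one row of PART_HIERARCHY.
  final case class Part(partKey: Int, parent: Option[Int], retailPrice: Double)

  // Six parts whose prices match the slide's result table.
  // Which level-2 part hangs under which level-1 part is an assumption.
  val parts: Seq[Part] = Seq(
    Part(1, None, 901.0),
    Part(2, Some(1), 903.0),
    Part(3, Some(1), 904.0),
    Part(4, Some(2), 913.0),
    Part(5, Some(2), 912.0),
    Part(6, Some(3), 911.0)
  )

  // Breadth-first walk from the root key, mirroring START WHERE P_PARTKEY = 1.
  def levels(parts: Seq[Part], rootKey: Int): Map[Int, Int] = {
    val children: Map[Int, Seq[Int]] =
      parts
        .collect { case Part(k, Some(p), _) => p -> k }
        .groupBy(_._1)
        .map { case (p, kids) => p -> kids.map(_._2) }

    @annotation.tailrec
    def walk(frontier: Seq[Int], depth: Int, acc: Map[Int, Int]): Map[Int, Int] =
      if (frontier.isEmpty) acc
      else {
        val next = frontier.flatMap(k => children.getOrElse(k, Seq.empty))
        walk(next, depth + 1, acc ++ frontier.map(_ -> depth))
      }

    walk(Seq(rootKey), 0, Map.empty)
  }

  // Equivalent of GROUP BY LVL with COUNT(*) and ROUND(AVG(P_RETAILPRICE), 2).
  def levelStats(parts: Seq[Part], rootKey: Int): Seq[(Int, Int, Double)] = {
    val lvl = levels(parts, rootKey)
    parts
      .filter(p => lvl.contains(p.partKey))
      .groupBy(p => lvl(p.partKey))
      .toSeq
      .map { case (level, ps) =>
        val avg = ps.map(_.retailPrice).sum / ps.size
        (level, ps.size, math.round(avg * 100) / 100.0)
      }
      .sortBy(_._1)
  }

  def main(args: Array[String]): Unit =
    levelStats(parts, rootKey = 1).foreach(println)
}
```

Running `main` prints one `(level, count, average)` tuple per level, which should line up with the three rows of the table.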
// Connection settings shared by both HANA views
val options = Map("dbschema" -> config.user, "host" -> config.host, "instance" -> config.instance)

// HANA Live CustomerBasicData Virtual Data Model (VDM)
val custConf = options + ("path" -> "sap.hba.ecc/CustomerBasicData")
val cust = sqlContext.read.format("com.sap.spark.hana").options(custConf).load()
cust.registerTempTable("customer")

// HANA Live SalesOrderHeader VDM
val sohConf = options + ("path" -> "sap.hba.ecc/SalesOrderHeader")
val soh = sqlContext.read.format("com.sap.spark.hana").options(sohConf).load()
soh.registerTempTable("salesOrder")  // the query below refers to this table name

// Top 5 Countries by Sales Order Volume
val salesOrder = sqlContext.sql("""
  SELECT Country, COUNT(*) AS Frequency
  FROM salesOrder AS s LEFT OUTER JOIN customer AS c
    ON s.soldToParty = c.Customer
  GROUP BY Country
  ORDER BY Frequency DESC
  LIMIT 5
""")
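To make the push-down idea on this slide concrete, here is a minimal sketch of how a connector could turn Spark-side filters into a SQL statement that the remote database evaluates, so that only matching rows travel back to Spark. Everything in it is hypothetical: the `Filter` types are defined locally (Spark's data source API has its own, richer filter classes), the `NetAmount` column in the usage note is invented, and this is not how the `com.sap.spark.hana` source is actually implemented.

```scala
object PushdownSketch {
  // Minimal, hypothetical filter model defined only for this sketch.
  sealed trait Filter
  final case class EqualTo(column: String, value: Any) extends Filter
  final case class GreaterThan(column: String, value: Any) extends Filter
  final case class In(column: String, values: Seq[Any]) extends Filter

  // Quote identifiers and string literals so the generated SQL stays well-formed.
  private def ident(name: String): String =
    "\"" + name.replace("\"", "\"\"") + "\""

  private def literal(v: Any): String = v match {
    case s: String => "'" + s.replace("'", "''") + "'"
    case other     => other.toString
  }

  def toSql(f: Filter): String = f match {
    case EqualTo(c, v)           => s"${ident(c)} = ${literal(v)}"
    case GreaterThan(c, v)       => s"${ident(c)} > ${literal(v)}"
    case In(_, vs) if vs.isEmpty => "1 = 0" // an empty IN list matches nothing
    case In(c, vs)               => s"${ident(c)} IN (${vs.map(literal).mkString(", ")})"
  }

  // Build the statement the remote database would run instead of Spark scanning every row.
  def pushedDownQuery(table: String, columns: Seq[String], filters: Seq[Filter]): String = {
    val select = columns.map(ident).mkString(", ")
    val where =
      if (filters.isEmpty) ""
      else filters.map(toSql).mkString(" WHERE ", " AND ", "")
    s"SELECT $select FROM ${ident(table)}$where"
  }
}
```

For example, `PushdownSketch.pushedDownQuery("SalesOrderHeader", Seq("soldToParty"), Seq(PushdownSketch.EqualTo("Country", "US"), PushdownSketch.GreaterThan("NetAmount", 1000)))` builds a single `SELECT ... WHERE ... AND ...` statement with both predicates.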
Airline Use Case – Optimize MRO Scheduling with Sensor Data
Challenges
•  $10,000 loss for every hour spent on maintenance, repair, and overhaul (MRO)
•  Predictive MRO generates TBs of sensor data per flight
Solution
•  SAP HANA Vora rapidly processes sensor data in HDFS and combines it with flight schedule and staffing data in SAP HANA to prioritize maintenance jobs and accelerate MRO
Why SAP HANA Vora
•  Optimize MRO operations with interactive, on-demand drill-down by airport, flight route, etc.
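The combination step in this use case is essentially a join between aggregated sensor data and schedule data, followed by a ranking. The sketch below shows that shape in plain Scala; the field names, the tail numbers, and the ranking rule (most anomalies first, shortest ground time as tie-breaker) are all assumptions for illustration and do not come from the talk.

```scala
object MroPrioritization {
  // Per-aircraft summary, e.g. aggregated from raw sensor files in HDFS (hypothetical fields).
  final case class SensorSummary(tailNumber: String, anomalyCount: Int)
  // Planned ground time, e.g. read from schedule data in SAP HANA (hypothetical fields).
  final case class ScheduleSlot(tailNumber: String, airport: String, groundMinutes: Int)

  final case class Job(tailNumber: String, airport: String, anomalyCount: Int, groundMinutes: Int)

  // Inner join on tail number, then rank aircraft with the most anomalies first;
  // ties go to the aircraft with the shortest time on the ground.
  def prioritize(sensors: Seq[SensorSummary], schedule: Seq[ScheduleSlot]): Seq[Job] = {
    val anomaliesByTail = sensors.map(s => s.tailNumber -> s.anomalyCount).toMap
    schedule
      .flatMap { slot =>
        anomaliesByTail
          .get(slot.tailNumber)
          .map(n => Job(slot.tailNumber, slot.airport, n, slot.groundMinutes))
      }
      .sortBy(j => (-j.anomalyCount, j.groundMinutes))
  }
}
```

Aircraft that appear in the schedule but have no sensor summary are dropped by the inner join; a real system would likely need to decide how to treat them.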
Utility Use Case – CenterPoint Energy
Challenge
•  Smart meters generate TBs of data/month
•  Regulatory requirement to retain data for 10 years
•  Current storage solution full by end-2016
•  Need to leverage HDFS as an additional tier for storage
Solution
•  SAP HANA for most recent sensor signal and operational data, Dynamic Tiering for 1-2 year old data, HDFS for historical sensor data
•  SAP HANA Vora accesses and queries data across all tiers
Why SAP HANA Vora
•  SAP HANA Vora provides enterprise analytics and an OLAP-like experience across data warehouse and HDFS.
Utility Use Case – How It Works
CenterPoint Energy
“Our benchmark tests proved that SAP HANA paired with SAP HANA Vora are the right solutions for us. We expect immediate cost benefits and to see competitive differentiation in the future.”
Gary Hayes, CIO & SVP at CenterPoint Energy
[Diagram: three storage tiers, with parallelized queries running within and across them]
SAP HANA: most recent sensor data
Dynamic Tiering: 1-2 year old data
HDFS: historical sensor data
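One way to picture "query data within and across tiers" is a router that decides, from the requested date range, which tiers a query has to touch. This is a sketch under stated assumptions, not the product's logic: the one-year and two-year cut-offs are invented from the slide's "most recent" and "1-2 yr old" labels, and the `Tier` names are hypothetical.

```scala
import java.time.LocalDate

object TierRouting {
  sealed trait Tier
  case object Hana extends Tier           // most recent data
  case object DynamicTiering extends Tier // roughly 1-2 year old data
  case object Hdfs extends Tier           // historical data

  // Cut-offs are assumptions; the slide only names the three age bands.
  def tierFor(day: LocalDate, today: LocalDate): Tier =
    if (day.isAfter(today.minusYears(1))) Hana
    else if (day.isAfter(today.minusYears(2))) DynamicTiering
    else Hdfs

  // Tiers that a query over [from, to] has to touch, ordered from hottest to coldest.
  def tiersFor(from: LocalDate, to: LocalDate, today: LocalDate): Seq[Tier] = {
    require(!from.isAfter(to), "from must not be after to")
    val order: Seq[Tier] = Seq(Hana, DynamicTiering, Hdfs)
    val hottest = order.indexOf(tierFor(to, today))
    val coldest = order.indexOf(tierFor(from, today))
    order.slice(hottest, coldest + 1)
  }
}
```

A query confined to one age band touches a single tier, while a query that spans several years fans out to all three, which is where parallelized execution would pay off.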
Financial Services Use Case – Extend Fraud Pattern Detection
Challenges
•  100+ million business transactions daily, 25% growth YoY
•  Limited access to archived data
•  Difficult to detect patterns in historical transactions
Solution
•  Current transactions in SAP HANA, historical transactions in HDFS clusters
•  Real-time detection of abnormalities
Why SAP HANA Vora
•  Real-time, aggregated insights from current and historical transactions
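As a rough illustration of detecting abnormalities by comparing current transactions with a historical baseline, the sketch below flags amounts that lie more than a chosen number of standard deviations from the historical mean. This simple z-score rule is a stand-in chosen for brevity; the talk does not say which detection technique is used, and the default threshold of 3.0 is an assumption.

```scala
object FraudBaseline {
  // Mean and population standard deviation of historical amounts (e.g. read from HDFS).
  def baseline(history: Seq[Double]): (Double, Double) = {
    require(history.nonEmpty, "history must not be empty")
    val mean = history.sum / history.size
    val variance = history.map(x => math.pow(x - mean, 2)).sum / history.size
    (mean, math.sqrt(variance))
  }

  // Flag current amounts more than `threshold` standard deviations from the historical mean.
  def abnormal(current: Seq[Double], history: Seq[Double], threshold: Double = 3.0): Seq[Double] = {
    val (mean, stdDev) = baseline(history)
    if (stdDev == 0.0) current.filter(_ != mean) // no spread in history: any deviation stands out
    else current.filter(x => math.abs(x - mean) / stdDev > threshold)
  }
}
```

In practice the baseline would be computed per account or per merchant rather than globally, but the split is the same: history supplies the baseline, current transactions are scored against it.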
2016 and the Road Ahead
TODAY
•  Customers in North America, APJ, and EMEA
•  Dev edition available on AWS
COMING SOON
•  General Availability
•  Vora Modeler to build and query OLAP-style cubes on data
FUTURE
•  Planning (HR, Financial)
•  Extend engine support for time series
•  Transaction management
•  Analytics on archived ERP data in Hadoop
Contribute to Spark Ecosystem, Embrace Best of Community Innovation
Contribution to open source: hierarchy capabilities
Connection to ERP: predicate pushdown to HANA
On-the-market solution: SAP HANA Vora
Thank you!
Ken Tsai: ken.tsai@sap.com
@kentsaiSAP
Enter to Win a
GoPro HERO4
Session at
SAP Booth 102
Learn More @ hana.sap.com/vora
Try Dev Edition: bit.ly/1K1qLyo
We’re Hiring: https://spark-summit.org/east-2016/jobs/
