Journey to SAS Analytics Grid with SAS, R, Python
Benjamin Zenick, Chief Operating Officer - Zencos
Sumit Sarkar, Chief Data Evangelist - Progress DataDirect
© 2016 Progress Software Corporation and/or its subsidiaries or affiliates. All rights reserved.
Audio Bridge Options & Question Submission
Agenda
• Differences between traditional and Grid deployments for SAS
• Best practices and lessons learned in deploying an Analytics Grid
• How to deliver an open analytics strategy for SAS, R, Python and others
• Popular data sources for advanced analytics
POLL
WHERE ARE YOU IN YOUR ANALYTICS JOURNEY?
• DESKTOP ANALYTICS
• CLIENT/SERVER ANALYTICS
• GRID ANALYTICS
• CLOUD ANALYTICS
• OTHER
Differences between traditional
and Grid deployments for SAS
The Evolution of Analytics
Businesses started with large and expensive central mainframes
– Mainframes were limited by early storage and processing technology
– Connectivity and user interfaces to data were limited by “dumb” terminals
– Expansion was limited by proprietary chassis design
– Connecting multiple mainframes was expensive, challenging, or impossible
Analytics Today
• Modernization moved away from Mainframes
• Moved toward server / client solutions, workstations, storage
appliances, and networking
• Shortcoming of centralized datacenters: Administrative and
Performance Bottlenecks
Example of Traditional Deployment
What benefits do grid deployments provide?
• Standardization supporting multiple ecosystems
• Streamline Administrative support
• Better tools for analytics and administration
• Centralizing and improving management
• Size & Scalability
Example of Grid Deployment
Signs your organization is ready to consider an HPC or Grid
solution…
• Decrease in cost benefits
• Current model doesn’t scale well
• Massively Parallelized Processing
• Administrative needs continue to grow and grow
• High(er) Availability is possible
• Faster (Disaster) Recovery
Top Considerations for “Modernization”
• Why?
• Who?
• What?
• Where?
• When?
Best practices and lessons
learned in deploying an
Analytics Grid
Best Practices
• Preparation
• Technologies
• Plan
• Time
• Expectations
• Team
• Transition
• Users
• Support
• Goal Alignment
Lessons Learned
• Invest in a meaningful assessment
• Plan to purchase and build Test and Disaster Recovery
environments
• Understand the applications and use cases
• Outline support model for legacy projects
• Consider your post-implementation needs
• Expect the unexpected
How to deliver an open analytics
strategy for SAS, R, Python and
others
POLL
WHICH LANGUAGE(S) ARE COMMONLY USED IN YOUR ORGANIZATION?
• SAS
• Python
• R
• SPSS
• OTHER
SAS and Open Analytics across …
SAS Viya
SAS Grid Manager
SAS (open data access and grid
management for native language support)
SAS Grid Manager
Image from SAS webinar: https://www.evensi.us/webinar-taking-r-and-python-from-good-to-great-with-sas-/204358443
SAS with Open Data Access (ODBC)
• Access external data through the supported access modes of the data source-specific SAS/Access interfaces.
• Leverage the generic SAS/Access interface to ODBC, with an open ODBC driver for direct access from Python and R.
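As a rough illustration of the second point, the sketch below assembles an ODBC connection string that a Python client could hand to an ODBC module. The helper name `build_connection_string` and the user and password values are hypothetical; `DSN`, `UID`, and `PWD` are standard ODBC connection-string keywords, and "Spark Next" is the DSN name used in the R example later in this deck.

```python
def build_connection_string(dsn, user=None, password=None):
    """Assemble an ODBC connection string from a DSN and optional credentials."""
    parts = ["DSN=" + dsn]
    if user is not None:
        parts.append("UID=" + user)
    if password is not None:
        parts.append("PWD=" + password)
    return ";".join(parts)


# Hypothetical credentials, for illustration only.
conn_str = build_connection_string("Spark Next", user="analyst", password="secret")
print(conn_str)
```

With pyodbc installed and the DSN configured, a string like this would typically be passed to `pyodbc.connect(conn_str)`. Values that contain semicolons or other special characters need extra quoting, which this sketch does not handle.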
SAS and Open Analytics | SAS Grid (Open Data Access via ODBC)
[Diagram: an Analytics Grid with an Open Grid Manager (controller and workers) and Open Data Access, using ODBC to access RDBMS, Big Data, NoSQL, and Cloud data sources over TCP or HTTPS]
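The controller and worker roles in the grid diagram can be illustrated with a small sketch. This is not how SAS Grid Manager schedules work; it is a toy model, assuming a thread pool stands in for the worker nodes and a partial sum stands in for an analytics job. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(values, n_partitions):
    """Split the input into at most n_partitions roughly equal slices."""
    values = list(values)
    size = max(1, -(-len(values) // n_partitions))  # ceiling division, at least 1
    return [values[i:i + size] for i in range(0, len(values), size)]


def worker_task(partition):
    """Work done on one worker: here, a partial sum of its slice."""
    return sum(partition)


def run_on_grid(values, n_workers=4):
    """Controller: hand partitions to workers and combine the partial results."""
    partitions = split_into_partitions(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(worker_task, partitions))
    return sum(partial_results)


print(run_on_grid(range(1, 101)))  # 5050
```

The point of the split is that each worker only sees its own partition, so adding workers shortens the wall-clock time of jobs that can be partitioned this way.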
R ODBC Example
library(RODBC)
# Make a connection using your DSN name
conn <- odbcConnect("Spark Next")
# List the tables visible through the connection
sqlTables(conn)
# List the columns of the table with our energy data
sqlColumns(conn, "energyconsumption")
# Bind the results of a SQL query to a data frame for plotting
data <- sqlQuery(conn, paste(
  "SELECT * FROM energyconsumption",
  "WHERE country IN ('China', 'United States', 'Canada', 'France', 'Germany', 'Italy', 'Japan')"))
# Attach the data frame so its columns can be referenced by name when plotting
attach(data)
# Close the connection once the data has been retrieved
odbcClose(conn)
Python ODBC Example
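import pyodbc
import getpass
import sys


def show_odbc():
    """Print a numbered list of the configured ODBC data sources and return their names."""
    sources = pyodbc.dataSources()
    dsns = list(sources.keys())
    sl = []
    for i, dsn in enumerate(dsns, start=1):
        sl.append('%d. %s' % (i, dsn))
    print('\n'.join(sl))
    return dsns


def listTables(cursor):
    """Print the name of every table visible through the cursor."""
    for row in cursor.tables():
        print(row.table_name)


def executeSelectQuery(cursor, cnxn):
    """Prompt for a SELECT statement and execute it on the cursor."""
    query = input('Enter the SELECT Query:')
    cursor.execute(query)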
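The functions above only need a cursor object. Because pyodbc follows the Python DB-API 2.0 specification, the same call pattern can be tried without an ODBC driver by using the standard-library `sqlite3` module as a stand-in. The sketch below does that. The in-memory `energyconsumption` table and its rows are made-up sample data, and `run_select` is a hypothetical helper, not part of the original example.

```python
import sqlite3


def run_select(cursor, query, params=()):
    """Execute a SELECT statement and return (column names, rows)."""
    cursor.execute(query, params)
    columns = [col[0] for col in cursor.description]
    return columns, cursor.fetchall()


# Stand-in for an ODBC connection: an in-memory SQLite database with sample rows.
cnxn = sqlite3.connect(":memory:")
cursor = cnxn.cursor()
cursor.execute("CREATE TABLE energyconsumption (country TEXT, year INTEGER)")
cursor.executemany(
    "INSERT INTO energyconsumption VALUES (?, ?)",
    [("Canada", 2015), ("France", 2015), ("Japan", 2015)],
)

columns, rows = run_select(
    cursor,
    "SELECT country, year FROM energyconsumption "
    "WHERE country IN (?, ?) ORDER BY country",
    ("Canada", "Japan"),
)
print(columns)
for row in rows:
    print(row)
cnxn.close()
```

Passing the country names as `?` parameters rather than pasting them into the SQL text avoids quoting problems; pyodbc uses the same `?` placeholder style.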
DataDirect ODBC is engineered for GRID and Cloud
• Deliver advanced functionality over OSS to become SAS OEM Partner
• Run 85+ million QA tests on our suite of connectors
• Performance labs measure throughput and resource utilization (CPU and memory)
• Focus on security features for customers to achieve regulatory compliance
Popular data sources for
advanced analytics
Popular Relational/Analytics
Data Sources
SQL Server 18.70%
Oracle 12.89%
MySQL 12.77%
Progress OpenEdge 7.93%
PostgreSQL 5.65%
Microsoft SQL Azure 5.27%
IBM DB2 4.76%
SQLite 3.68%
Teradata 2.61%
SAP HANA 2.30%
MariaDB 2.25%
Sybase ASE 1.92%
Amazon Redshift 1.79%
Informix 1.64%
Sybase IQ 1.30%
Netezza 1.25%
Other (please specify): 1.13%
Amazon Aurora 1.00%
Not sure 0.97%
Pivotal Greenplum 0.87%
Google BigQuery 0.77%
Vertica 0.61%
Popular Big Data Sources
Hadoop Hive 18.53%
Spark SQL 8.17%
Hortonworks 7.97%
Cloudera CDH 7.87%
Cloudera Impala 7.47%
Apache Solr 7.37%
Oracle BDA 6.67%
Amazon EMR 5.98%
Apache Sqoop 5.48%
MapR 5.38%
IBM BigInsights 4.68%
Apache Storm 4.08%
Apache Drill 2.39%
Apache Phoenix 2.39%
SAP Altiscale 2.19%
Pivotal HD 1.89%
Presto 0.80%
GemFireXD 0.70%
Popular NoSQL Sources
MongoDB 35.60%
Cassandra 14.57%
HBase 10.34%
Oracle NoSQL 9.01%
Redis 8.45%
Other (please specify): 6.01%
Couchbase 5.78%
DynamoDB 2.78%
DataStax Enterprise 2.22%
SimpleDB 2.22%
MarkLogic 1.67%
Aerospike 0.78%
Riak 0.56%
What about SaaS?
Data Source: API
Eloqua: Web Services API (REST/SOAP); Bulk and non-Bulk APIs; no query language
Oracle Service Cloud: Web Services APIs (REST/SOAP); ROQL
Google Analytics: Hypercube (query limits of 10 metrics grouped by max of 7 dimensions)
Veeva CRM: SOAP, BULK, Metadata APIs; SOQL
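The Google Analytics limits listed above imply that a report needing more metrics than one query allows has to be split into several queries. A minimal sketch of that batching follows, assuming the limits as stated on the slide (10 metrics, 7 dimensions). The metric and dimension names and the `plan_queries` helper are hypothetical, and no real Google Analytics API call is made.

```python
MAX_METRICS = 10     # per-query metric limit as stated on the slide
MAX_DIMENSIONS = 7   # per-query dimension limit as stated on the slide


def plan_queries(metrics, dimensions):
    """Split a metric list into per-query batches that respect the limits."""
    if len(dimensions) > MAX_DIMENSIONS:
        raise ValueError("at most %d dimensions per query" % MAX_DIMENSIONS)
    return [
        {"metrics": metrics[i:i + MAX_METRICS], "dimensions": list(dimensions)}
        for i in range(0, len(metrics), MAX_METRICS)
    ]


# Hypothetical names, for illustration only.
metrics = ["metric_%d" % n for n in range(1, 24)]  # 23 metrics
queries = plan_queries(metrics, ["date", "country"])
print(len(queries))                           # 3
print([len(q["metrics"]) for q in queries])   # [10, 10, 3]
```

Each batch repeats the same dimensions so that the partial results can be joined on them afterwards. A SQL-style driver in front of such an API has to do this kind of planning on the caller's behalf.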
Supported ODBC Data Sources for SAS/Access
Apache Hadoop Hive 0.8.0 and higher
Amazon EMR 2.1.4 and higher
Amazon Redshift
Apache Spark SQL 1.2, 1.3, 1.4, 1.5
Cloudera CDH update 4 and higher
Cloudera Impala 1.0, 1.1, 1.2, 1.3, 1.4
Cloudera Impala 2.0, 2.1, 2.2
Hortonworks 1.3 and higher
IBM BigInsights 3.0 and higher
MapR 1.2 and higher
Pivotal HD 2.0.1 and higher
DB2 V9.1, V9.5, V9.7, V9.8 for Linux, UNIX, Windows
DB2 V8.x for LUW
DB2 11 for z/OS*
DB2 V10 for z/OS
DB2 V9.1 for z/OS
DB2 UDB V8.1 for z/OS
DB2 i 7.1, 7.2* (DB2 UDB V7R1, V7R2 for iSeries)
DB2 i 6.1 (DB2 UDB V6R1 for iSeries)
DB2 for i5/OS (DB2 UDB V5R4 for iSeries)
Eloqua (Oracle Marketing Cloud)
Financial Force
Google Analytics
Greenplum 4, 4.1, 4.2, 4.3
Greenplum 3.3
Hubspot
Informix Dynamic Server 12.1*
Informix Dynamic Server 11.0, 11.5, 11.7
Informix Dynamic Server 10.0
Informix Dynamic Server 9.2, 9.3, 9.4
Marketo
Microsoft Dynamics CRM 2011 Rollup 16, 2013, 2015
Microsoft SQL Server 2014*
Microsoft SQL Server 2012
Microsoft SQL Server 2008 R1, R2
Microsoft SQL Server 2005
Microsoft SQL Server 2000 Desktop Engine (MSDE 2000)
Microsoft SQL Server 2000
Microsoft SQL Azure*
MongoDB 3.0
MongoDB 2.2, 2.4, 2.6
MySQL Enterprise Edition 5.0, 5.1, 5.5, 5.6*
Oracle 12c R1 (12.1)*
Oracle 11g R1, R2 (11.1, 11.2)
Oracle 10g R1, R2 (10.1, 10.2)
Oracle 9i R1, R2 (9.0.1, 9.2)
Oracle 8i R3 (8.1.7)
Oracle Service Cloud
Oracle Sales Cloud
Pivotal HAWQ 1.1*, 1.2*
PostgreSQL 9.0, 9.1, 9.2, 9.3, 9.4*
PostgreSQL 8.2, 8.3, 8.4
Progress OpenEdge 11.0, 11.1*, 11.2*, 11.3*, 11.4*
Progress OpenEdge 10.1.x, 10.2.x
Progress Rollbase 2.0 and higher*
REST API (via OpenAccess)
SAP Adaptive Server Enterprise 16.0*
ServiceMax
SugarCRM 7.1.6 and higher*
Sybase Adaptive Server Enterprise 15.0, 15.5, 15.7
Sybase Adaptive Server Enterprise 12.0, 12.5, 12.5.x
Sybase Adaptive Server Enterprise 11.9
Sybase IQ 16.0*
Sybase IQ 15.0, 15.1, 15.2, 15.3, 15.4
Veeva CRM
Blue text indicates cloud hosted
Blue text* indicates cloud hosted with on-premises option
NEW cross data center access for SAS/Access interface to ODBC (over https)
SAS/Access interface to
ODBC
Learn More about Data Access for SAS Analytics
What DataDirect Does for SAS Shops
“Taking R and Python from good to great with SAS” [Webinar hosted
by SAS in April 17]
Zencos Consulting Blog
Tech Articles on configuring SAS with ODBC:
• SAS/Access 9.4 interface to ODBC Tutorial across popular data
sources such as SQL Server, Salesforce and Amazon Redshift
• SAS/Access 9.4 interface to ODBC Tutorial across cloud data
sources such as Marketo and Eloqua
Wrap Up with Q&A
Slides and recording will be made available to each attendee
Visit www.datadirect.com to learn more about ODBC drivers engineered for analytics
Please enter your questions in the chat...

Editor's Notes

  • #2: Can Your Current Infrastructure Support High-Performance Analytics and Data Science?
    Big data, compliance and a highly skilled workforce are driving organizations to transform their current analytical infrastructure to deliver enterprise computing environments that can support the latest in data science and analytics practices. SAS remains a popular choice for statistical programming languages, but there is growing demand for R and Python. Data engineers are now being tasked to deliver scalable and highly available computing resources to support analytics for a growing number of users and increasing data volumes while maintaining security for their customers.
    Join this webinar to learn:
    • Differences between traditional and Grid deployments for SAS
    • Best practices and lessons learned in deploying an Analytics Grid
    • How to deliver an open analytics strategy for SAS, R, Python and others
    • Popular data sources for advanced analytics
  • #3: Join Audio: there are 2 ways to do so: 1) to use VoIP, click on “Mic & Speakers”, or 2) to use your telephone, click on “telephone” and dial in using the numbers and information provided. All lines are muted for today’s webinar. We do plan to have a live Q&A session at the end of the presentations. However, if you have a question at any time during this webinar, simply submit it via the “Question” section of the webinar interface located to the right of your screen; we will collect all questions through this “Question Window”. Final note: we are recording today’s webinar.