June 2017
Yanbo Liang
Apache Spark committer
Hortonworks
SparkR best practices for R data scientists
2 ©	Hortonworks	Inc.	2011	– 2016.	All	Rights	Reserved
Outline
→ Introduction to R and SparkR.
→ Typical data science workflow.
→ SparkR + R for typical data science problem.
  – Big data, small learning.
  – Partition aggregate.
  – Large scale machine learning.
→ Future directions.
R for data scientists
→ Pros
  – Open source.
  – Rich ecosystem of packages.
  – Powerful visualization infrastructure.
  – Data frames make data manipulation convenient.
  – Taught by many schools to statistics and computer science students.
→ Cons
  – Single-threaded.
  – Everything has to fit in the memory of a single machine.
SparkR = Spark + R
→ An R frontend for Apache Spark, a widely deployed cluster computing engine.
→ Wrappers over DataFrames and DataFrame-based APIs (MLlib).
  – A complete DataFrame API that behaves just like R's data.frame.
  – ML APIs mimic the methods implemented in R or R packages rather than the Scala/Python APIs.
→ The data frame concept is the cornerstone of both Spark and R.
→ Convenient interoperability between R and Spark DataFrames.
SparkR architecture
Data science workflow
R for Data Science (http://r4ds.had.co.nz/)
Why SparkR + R
→ There are thousands of community packages on CRAN.
  – It is impossible for SparkR to match all existing features.
→ Not every dataset is large.
  – Many people work with small/medium datasets.
Big data, small learning
[Figure: Table1 through Table5 are joined, then reduced with select/where/aggregate/sample in SparkR, and the result is collected into R for modeling/analytics.]
Data wrangle with SparkR
Operation/Transformation                                   Function
Join different data sources or tables                      join
Pick observations by their value                           filter/where
Reorder the rows                                           arrange
Pick variables by their names                              select
Create new variable with functions of existing variables   mutate/withColumn
Collapse many values down to a single summary              summary/describe
Aggregation                                                groupBy
Data wrangle
# Load the flight records and the plane metadata as SparkDataFrames.
airlines <- read.df(path = "/data/2008.csv", source = "csv",
                    header = "true", inferSchema = "true")
planes <- read.df(path = "/data/plane-data.csv", source = "csv",
                  header = "true", inferSchema = "true")
# Join on the tail number, keep the columns of interest, drop rows with NAs.
joined <- join(airlines, planes, airlines$TailNum == planes$tailnum)
df1 <- select(joined, "aircraft_type", "Distance", "ArrDelay", "DepDelay")
df2 <- dropna(df1)
SparkR performance
Sampling Algorithms
→ Bernoulli sampling (without replacement)
  – df3 <- sample(df2, FALSE, 0.1)
→ Poisson sampling (with replacement)
  – df3 <- sample(df2, TRUE, 0.1)
→ Stratified sampling
  – df3 <- sampleBy(df2, "aircraft_type", list("Fixed Wing Multi-Engine" = 0.1, "Fixed Wing Single-Engine" = 0.2, "Rotorcraft" = 0.3), 0)
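To make the difference between the two sampling modes concrete, here is a small local sketch in plain R. It is an illustration only, not SparkR code and not Spark's actual implementation: Bernoulli sampling keeps each row independently with probability `fraction`, so no row appears twice, while Poisson sampling draws a per-row count from a Poisson distribution with mean `fraction`, so a row can be selected more than once. The helper names `bernoulli_sample` and `poisson_sample` and the data are made up for this example.

```r
# Local illustration only; helper names are hypothetical, not SparkR functions.
bernoulli_sample <- function(df, fraction) {
  # Keep each row independently with probability `fraction` (no duplicates).
  keep <- runif(nrow(df)) < fraction
  df[keep, , drop = FALSE]
}

poisson_sample <- function(df, fraction) {
  # Draw how many times each row is selected; counts above 1 yield duplicates.
  counts <- rpois(nrow(df), lambda = fraction)
  df[rep(seq_len(nrow(df)), times = counts), , drop = FALSE]
}

set.seed(42)
local_df <- data.frame(id = 1:1000, DepDelay = rnorm(1000))
s1 <- bernoulli_sample(local_df, 0.1)  # about 10% of the rows, all distinct
s2 <- poisson_sample(local_df, 0.1)    # about 10% of the rows, repeats possible
nrow(s1)
nrow(s2)
```

In both cases the sample size is only approximately `fraction` times the number of rows, which is worth keeping in mind when sizing the data that will later be collected.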
Big data, small learning
[Figure: Table1 through Table5 are joined and reduced with select/where/aggregate/sample as a SparkDataFrame; collect turns the result into a local data.frame for modeling/analytics.]
Distributed dataset to local
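Once the sampled rows have been collected, the result is an ordinary data.frame and any local R tooling applies. A minimal sketch in plain R follows; the SparkR step appears only as a comment (it assumes an active session and the df3 built earlier), and a synthetic data.frame with made-up values stands in for the collected sample so that the example runs anywhere.

```r
# local_df <- collect(df3)   # SparkR step; assumes an active session and df3

# Synthetic stand-in for the collected sample (made-up values).
set.seed(1)
n <- 200
local_df <- data.frame(
  DepDelay = rnorm(n, mean = 10, sd = 5),
  Distance = runif(n, 100, 2000)
)
local_df$ArrDelay <- 2 + 0.9 * local_df$DepDelay + rnorm(n, sd = 1)

# Fit and inspect a model with base R, exactly as on any local data.frame.
fit <- lm(ArrDelay ~ DepDelay + Distance, data = local_df)
summary(fit)
```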
Partition aggregate
→ User Defined Functions (UDFs).
  – dapply
  – gapply
→ Parallel execution of function.
  – spark.lapply
dapply
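> # Output schema: the four input columns plus the new DepDelayS column.
> schema <- structType(structField("aircraft_type", "string"),
                       structField("Distance", "integer"),
                       structField("ArrDelay", "integer"),
                       structField("DepDelay", "integer"),
                       structField("DepDelayS", "integer"))
> # Apply the function to each partition: append DepDelay multiplied by 60.
> df4 <- dapply(df3, function(x) { cbind(x, DepDelayS = x$DepDelay * 60L) }, schema)
> head(df4)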
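Because the function passed to dapply receives each partition as an ordinary R data.frame, it can be checked locally before it is sent to the cluster. A minimal sketch in plain R; the function name `add_delay_seconds` and the sample rows are made up for illustration.

```r
# The per-partition function: takes a data.frame and returns a data.frame
# with the original columns plus DepDelayS, matching the declared schema.
add_delay_seconds <- function(x) {
  cbind(x, DepDelayS = x$DepDelay * 60L)
}

# A made-up stand-in for one partition of df3.
partition <- data.frame(
  aircraft_type = c("Rotorcraft", "Fixed Wing Multi-Engine"),
  Distance = c(810L, 515L),
  ArrDelay = c(-14L, 2L),
  DepDelay = c(8L, 19L),
  stringsAsFactors = FALSE
)

result <- add_delay_seconds(partition)
print(result)

# With an active SparkR session, the function would be passed to dapply:
# df4 <- dapply(df3, add_delay_seconds, schema)
```

Checking the column order and types of `result` against the schema locally tends to be quicker than debugging a mismatch from executor logs.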
gapply
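> # Output schema: the grouping key and the per-group maximum.
> schema <- structType(structField("Distance", "integer"),
                       structField("MaxActualDelay", "integer"))
> # For each Distance group, compute the maximum of ArrDelay - DepDelay.
> df5 <- gapply(df3, "Distance", function(key, x) {
    data.frame(key, max(x$ArrDelay - x$DepDelay))
  }, schema)
> head(df5)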
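The grouping behaviour of gapply can be imitated locally with split and lapply: rows are grouped by the key column, the function is called once per group with the key and that group's rows, and the per-group results are stacked. The sketch below is plain R on made-up data, and `max_actual_delay` is a hypothetical name; it is meant for reasoning about the per-group function, not as a replacement for gapply.

```r
# Per-group function with a (key, x) signature like the one gapply calls.
max_actual_delay <- function(key, x) {
  data.frame(Distance = key,
             MaxActualDelay = max(x$ArrDelay - x$DepDelay),
             stringsAsFactors = FALSE)
}

# Made-up stand-in for df3.
local_df <- data.frame(
  Distance = c(500L, 500L, 800L),
  ArrDelay = c(10L, 30L, 5L),
  DepDelay = c(5L, 10L, 15L)
)

# Local emulation of the grouping: split rows by key, apply, then stack.
groups <- split(local_df, local_df$Distance)
pieces <- lapply(groups, function(g) max_actual_delay(g$Distance[1], g))
result <- do.call(rbind, pieces)
rownames(result) <- NULL
print(result)
```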
spark.lapply
→ An ideal way to distribute existing R functionality and packages.
spark.lapply
# Sequential grid search; A, b, and auc() are assumed to be defined elsewhere.
for (lambda in c(0.5, 1.5)) {
  for (alpha in c(0.1, 0.5, 1.0)) {
    model <- glmnet(A, b, lambda = lambda, alpha = alpha)
    pred <- predict(model, A)
    c(coef(model), auc(pred, b))
  }
}
spark.lapply
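# One (lambda, alpha) pair per list element; nested c() would flatten them.
values <- list(c(0.5, 0.1), c(0.5, 0.5), c(0.5, 1.0),
               c(1.5, 0.1), c(1.5, 0.5), c(1.5, 1.0))
train <- function(value) {
  lambda <- value[1]
  alpha <- value[2]
  model <- glmnet(A, b, lambda = lambda, alpha = alpha)
  pred <- predict(model, A)  # predictions on the training data
  c(coef(model), auc(pred, b))
}
models <- spark.lapply(values, train)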
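spark.lapply hands one list element to each function call, so the parameter combinations need to be a list of (lambda, alpha) pairs; nesting c() calls flattens them into a single numeric vector. A small sketch in plain R that builds the pairs with expand.grid. The variable names and the `describe` function are illustrative, and lapply stands in for spark.lapply so that the example runs without a cluster.

```r
# Nested c() flattens: this is one numeric vector of length 4, not two pairs.
flat <- c(c(0.5, 0.1), c(0.5, 0.5))
length(flat)

# Build every (lambda, alpha) combination, then turn each row into a pair.
grid <- expand.grid(lambda = c(0.5, 1.5), alpha = c(0.1, 0.5, 1.0))
values <- lapply(seq_len(nrow(grid)),
                 function(i) c(grid$lambda[i], grid$alpha[i]))
length(values)

# Stand-in for the training function: report the pair that was received.
describe <- function(value) {
  sprintf("lambda=%.1f alpha=%.1f", value[1], value[2])
}

# lapply runs locally; with SparkR the call would be spark.lapply(values, ...).
labels <- lapply(values, describe)
```

Generating the grid this way keeps the parameter list in step with the lambda and alpha vectors when more values are added.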
spark.lapply
[Figure: the driver holds lambda = c(0.5, 1.5) and alpha = c(0.1, 0.5, 1.0); six executors are available.]
spark.lapply
[Figure: the driver sends one (lambda, alpha) pair to each of the six executors: (0.5, 0.1), (0.5, 0.5), (0.5, 1.0), (1.5, 0.1), (1.5, 0.5), (1.5, 1.0).]
Virtual environment
[Figure: the glmnet package is available on the driver and on each of the six executors.]
Virtual environment
# Download the glmnet source package on the driver and ship it to the executors.
# packagesDir is assumed to be an existing local directory.
download.packages("glmnet", packagesDir, repos = "https://cran.r-project.org",
                  type = "source")
filename <- list.files(packagesDir, "^glmnet")
packagesPath <- file.path(packagesDir, filename)
spark.addFile(packagesPath)
Virtual environment
values <- list(c(0.5, 0.1), c(0.5, 0.5), c(0.5, 1.0),
               c(1.5, 0.1), c(1.5, 0.5), c(1.5, 1.0))
train <- function(value) {
  # Install glmnet on the executor from the file shipped with spark.addFile.
  path <- spark.getSparkFiles(filename)
  install.packages(path, repos = NULL, type = "source")
  library(glmnet)
  lambda <- value[1]
  alpha <- value[2]
  model <- glmnet(A, b, lambda = lambda, alpha = alpha)
  pred <- predict(model, A)  # predictions on the training data
  c(coef(model), auc(pred, b))
}
models <- spark.lapply(values, train)
Large scale machine learning
> model <- glm(ArrDelay ~ DepDelay + Distance + aircraft_type,
family = "gaussian", data = df3)
> summary(model)
Future directions
→ Improve collect/createDataFrame performance in SparkR (SPARK-18924).
→ More scalable machine learning algorithms from MLlib.
→ Better R formula support.
→ Improve UDF performance.