Data Pipelines and Tools to Integrate
with Power BI and Spotfire
Steve Dolha & Ryan Mross
⬤ Introduction
⬤ Considerations in Software Selection
⬤ The Real Issue is Data Access
⬤ How Can a Data Pipeline Help?
⬤ Where Have We Seen Success?
⬤ Building a Data Pipeline
⬤ Where To Go From Here
⬤ Q&A
Agenda
Steve Dolha
⬤ Chief Data Officer
⬤ 40+ Years Oil & Gas Data Analysis
Experience
Introductions
Ryan Mross
⬤ Senior Data Analytics Specialist
⬤ 12+ Years Oil & Gas Data Analysis
Experience
⬤ Power BI vs Spotfire vs Tableau
⬤ Virtualized vs Physical Data Layers, and their
related software
⬤ Define your Key Criteria
⬤ Cost (initial and ongoing)
⬤ Performance/features
⬤ Extensibility
⬤ Ease of Use (learning curve)
⬤ Ease of Support
⬤ Existing Skillsets
Considerations on Software Choices
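One way to make the key criteria above concrete is a weighted scoring matrix. The sketch below is illustrative only: the weights, the candidate names (tool_a, tool_b), and the 1-5 scores are all hypothetical placeholders, not ratings of any real product.

```python
# Hypothetical weights for the criteria listed on the slide (they sum to 1.0).
WEIGHTS = {
    "cost": 0.25,
    "performance_features": 0.20,
    "extensibility": 0.15,
    "ease_of_use": 0.15,
    "ease_of_support": 0.10,
    "existing_skillsets": 0.15,
}


def weighted_score(scores: dict, weights: dict = WEIGHTS) -> float:
    """Return the weighted sum of 1-5 scores; every criterion must be scored."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Placeholder candidates with made-up scores (1 = poor, 5 = excellent).
candidates = {
    "tool_a": {"cost": 3, "performance_features": 5, "extensibility": 5,
               "ease_of_use": 3, "ease_of_support": 3, "existing_skillsets": 4},
    "tool_b": {"cost": 5, "performance_features": 3, "extensibility": 3,
               "ease_of_use": 4, "ease_of_support": 4, "existing_skillsets": 2},
}

# Rank candidates from highest to lowest weighted score.
ranking = sorted(candidates, key=lambda name: weighted_score(candidates[name]),
                 reverse=True)
```

The point of the exercise is less the final number than agreeing on the weights before anyone starts comparing products.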
Example: Business Intelligence Software Matrix
Spotfire
⬤ Mapping: Multiple layers, shape files, custom map layers, bubble charts
⬤ Extensibility: Full Python and R integration. Mods
⬤ Data Wrangling: Data canvas
⬤ Statistical Analysis: R, MATLAB, SAS, Python
⬤ Data Volumes: Only limited by hardware
⬤ Mobile Access: Static emails, app (with license), web dev possible
Tableau
⬤ Mapping: Bubble and area map layers, WMS layers
⬤ Extensibility: Dashboard extensions offered
⬤ Data Wrangling: Tableau Prep
⬤ Statistical Analysis: Needs separate Python and R servers
⬤ Data Volumes: "Chatty" when querying. A couple million rows ideal
⬤ Mobile Access: Tableau app for published content
Power BI
⬤ Mapping: Dynamic shading, multiple map sources. Better with paid add-ons
⬤ Extensibility: External tools available
⬤ Data Wrangling: Power Query
⬤ Statistical Analysis: R and Python not supported on server
⬤ Data Volumes: Performance drop-off over 1 million rows
⬤ Mobile Access: Online and offline reports
⬤ Regardless of what software/technology you
choose, the most important thing is the DATA
⬤ Pick a path and commit to making it meet your
business needs
⬤ We are vendor neutral
⬤ Focus more on the ideas and strategies you
need to be successful
Let's Shift the Narrative
⬤ Finding vetted data is a challenge
⬤ Location, quantity, and types of data are
complex.
⬤ Everyone has business rules, but they are often
not well defined (aggregation, hierarchies, joins,
unit conversions)
⬤ Publishing reports is time-consuming
A Common Issue – Trusted Data is Hard to Find
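As an illustration of what a well-defined business rule can look like, the sketch below writes one unit conversion and one aggregation down as code so every report applies them the same way. The record layout (field, volume, unit) is a hypothetical example, not a schema from the presentation.

```python
M3_TO_BBL = 6.28981  # oil barrels per cubic metre


def to_barrels(volume: float, unit: str) -> float:
    """Unit-conversion rule: normalise a volume to barrels."""
    unit = unit.lower()
    if unit == "bbl":
        return volume
    if unit == "m3":
        return volume * M3_TO_BBL
    raise ValueError(f"unsupported unit: {unit}")


def total_by_field(records: list) -> dict:
    """Aggregation rule: sum volumes per field, always in barrels."""
    totals = {}
    for rec in records:
        barrels = to_barrels(rec["volume"], rec["unit"])
        totals[rec["field"]] = totals.get(rec["field"], 0.0) + barrels
    return totals


# Hypothetical records that mix units, as often happens across sources.
sample = [
    {"field": "North", "volume": 10.0, "unit": "m3"},
    {"field": "North", "volume": 5.0, "unit": "bbl"},
]
totals = total_by_field(sample)
```

Raising an error on an unknown unit, rather than guessing, keeps an undefined rule from silently producing a number that looks trustworthy.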
⬤ Cubes enable access to vetted, business-focused
data, often through Excel.
⬤ However, BI tools prefer flat data, and flattening
cube data is often inefficient
⬤ The structures are rigid and require effort to
change. They are not flexible or agile.
⬤ It is also very challenging to combine cube data
with other sources
Where to Start – Cubes and Tabular Models
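To show what "flattening" means here, the sketch below turns a cross-tab style result (rows by columns, as a cube query typically returns) into the flat one-row-per-value shape BI tools prefer. The nested dictionary is a hypothetical stand-in for a cube result; no real MDX or cube API is used.

```python
def flatten_crosstab(crosstab: dict, row_name: str, col_name: str,
                     value_name: str) -> list:
    """Unpivot {row_member: {col_member: value}} into flat records."""
    rows = []
    for row_key, cells in crosstab.items():
        for col_key, value in cells.items():
            if value is None:  # skip empty cells instead of emitting blanks
                continue
            rows.append({row_name: row_key, col_name: col_key,
                         value_name: value})
    return rows


# Hypothetical cube slice: production volume by year and region.
cube_slice = {
    "2023": {"North": 120.0, "South": 80.0},
    "2024": {"North": 130.0, "South": None},
}
flat = flatten_crosstab(cube_slice, "year", "region", "volume")
```

Once the data is in this shape it can be joined to other sources on ordinary columns, which is the hard part with cube output.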
⬤ A data pipeline is a standardized process for
sourcing, processing/calculating, and delivering
data to end users
⬤ It is not necessarily a data warehouse. It is the
processing steps, not the physical location of
the final data sets.
⬤ Pipelines are meant to be agile, flexible, and
dynamic, adapting to ever-changing business
requirements
The Next Step – A Data Pipeline
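The definition above (source, process/calculate, deliver) can be sketched as a chain of small steps. This is a minimal illustration under assumed names: source_wells, drop_missing_volume, and add_daily_rate are hypothetical, and a real pipeline would read from databases rather than an in-memory list.

```python
from typing import Callable, Iterable


def run_pipeline(source: Callable[[], list],
                 steps: Iterable[Callable[[list], list]]) -> list:
    """Pull records from the source, then apply each step in order."""
    data = source()
    for step in steps:
        data = step(data)
    return data


def source_wells() -> list:
    """Sourcing: hypothetical raw records."""
    return [
        {"well": "W-1", "volume": 300.0, "days": 30},
        {"well": "W-2", "volume": None, "days": 30},
    ]


def drop_missing_volume(records: list) -> list:
    """Processing: remove records that cannot be calculated."""
    return [r for r in records if r["volume"] is not None]


def add_daily_rate(records: list) -> list:
    """Calculating: derive a daily rate without mutating the input."""
    return [{**r, "daily_rate": r["volume"] / r["days"]} for r in records]


# Delivering: the result is what end users would query.
delivered = run_pipeline(source_wells, [drop_missing_volume, add_daily_rate])
```

Because the pipeline is just the list of steps, changing a requirement means adding or swapping a step, not rebuilding a warehouse.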
⬤ In your current analytics process, you already
perform calculations and transform data.
⬤ A Data Pipeline moves this process further back
in the chain
⬤ Let's do this work once, vet it, and share it
among all end users!
You’re Already Halfway There
⬤ TIBCO Data Virtualization @ Hammerhead
⬤ Merged proprietary data from internal
databases with GDC public data
⬤ Wrote business rules into the views for reusable
calculations
⬤ Created dynamic well lists based on set criteria
for always up-to-date data sets
⬤ Published standardized and vetted views for
end-user self-service
Where Have We Seen Success?
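A dynamic well list is a list defined by criteria rather than by hand-picked names, so it updates itself as the data changes. The sketch below shows the idea in plain Python; the field names (well_id, rate, on_production) and the thresholds are hypothetical, and in the project described the equivalent logic lived in data virtualization views.

```python
from datetime import date


def build_well_list(wells: list, min_rate: float,
                    on_production_by: date) -> list:
    """Re-evaluate the criteria against the current records on every call."""
    return sorted(
        w["well_id"] for w in wells
        if w["rate"] >= min_rate and w["on_production"] <= on_production_by
    )


# Hypothetical well records.
wells = [
    {"well_id": "W-1", "rate": 120.0, "on_production": date(2020, 5, 1)},
    {"well_id": "W-2", "rate": 15.0, "on_production": date(2019, 3, 1)},
]
before = build_well_list(wells, min_rate=50.0,
                         on_production_by=date(2024, 1, 1))

# A new well arrives in the source; the list picks it up with no manual edit.
wells.append({"well_id": "W-3", "rate": 80.0,
              "on_production": date(2023, 7, 1)})
after = build_well_list(wells, min_rate=50.0,
                        on_production_by=date(2024, 1, 1))
```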
⬤ Databricks & TIBCO Data Virtualization @ Arc
⬤ Used Databricks to create a standard Data Access
Layer (DAL) as the single source for all reports
⬤ Databricks unifies many disparate data sources
into holistic data sets. Used TIBCO Data
Virtualization (TDV) to flatten MDX cubes for
Spotfire consumption
⬤ Both Power BI and Spotfire access this data, so
reports are unified regardless of BI tool.
Where Have We Seen Success?
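Unifying disparate sources usually comes down to joining them on a shared key and deciding which source wins when both supply the same field. The sketch below is a plain-Python illustration of that decision, not Databricks or TDV code; the record layouts and the "internal values win" rule are assumptions.

```python
def merge_on_key(internal: list, public: list, key: str) -> list:
    """Left-join public attributes onto internal records by key."""
    public_by_key = {rec[key]: rec for rec in public}
    merged = []
    for rec in internal:
        extra = public_by_key.get(rec[key], {})
        # Later entries override earlier ones, so internal values win.
        merged.append({**extra, **rec})
    return merged


# Hypothetical internal and public records for the same wells.
internal = [
    {"well_id": "W-1", "cost": 100.0},
    {"well_id": "W-9", "cost": 50.0},
]
public = [
    {"well_id": "W-1", "operator": "ExampleCo", "cost": 0.0},
]
unified = merge_on_key(internal, public, key="well_id")
```

Every BI tool that reads the unified output then sees the same answer to "which cost is correct", which is what makes the reports comparable.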
⬤ Efficient calculation over large data volumes
⬤ Search and understand the data (source, quality,
calculations)
⬤ Standardize business rules
⬤ Meld disparate data sources
⬤ The end goal is always Self Service
Data Pipeline Business Capabilities
⬤ Assess your tool and analytical requirements
based on what you're trying to do. Pick a tool.
⬤ Build your Data Pipeline from the ground up
based on business reporting and analysis
requirements
⬤ Model your data (standardize and consolidate)
⬤ Build the pipeline. Pick a technology and
implement your solution
⬤ Vet the data
⬤ Publish for end users
Building a Data Pipeline
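The "vet the data" step can be partly automated with checks that run before anything is published. The sketch below shows three common ones (required fields, non-negative values, duplicate keys); the function name vet, the field names, and the choice of checks are assumptions for illustration.

```python
def vet(records: list, required: list, non_negative: list, key: str) -> list:
    """Return a list of human-readable issues; an empty list means it passed."""
    issues = []
    seen = set()
    for i, rec in enumerate(records):
        # Required fields must be present and not null.
        for field in required:
            if rec.get(field) is None:
                issues.append(f"row {i}: missing {field}")
        # Selected numeric fields must not be negative.
        for field in non_negative:
            value = rec.get(field)
            if value is not None and value < 0:
                issues.append(f"row {i}: negative {field}")
        # The key must be unique across the data set.
        key_value = rec.get(key)
        if key_value is not None:
            if key_value in seen:
                issues.append(f"row {i}: duplicate {key} {key_value}")
            seen.add(key_value)
    return issues


# Hypothetical records with one problem of each kind.
records = [
    {"well_id": "W-1", "volume": 10.0},
    {"well_id": "W-1", "volume": -5.0},
    {"well_id": "W-2", "volume": None},
]
issues = vet(records, required=["well_id", "volume"],
             non_negative=["volume"], key="well_id")
```

Publishing only when the issue list is empty gives end users a simple reason to trust the data set.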
⬤ Find your capabilities and requirements. Build a
plan.
⬤ Ask for help! There are tons of resources
available to help you along this journey
⬤ Be Data and Tool Agnostic. Pick the tool that fits
best for you now, with room to grow in the
future.
⬤ Be prepared to iterate; you won't get it right the
first time.
Where Do We Go From Here?
Questions
Thank you!
Contact us:
General / Sales Inquiries
Phone: 403.475.2494
Email: info@cadeon.com
Head Office Address
Suite 520
800 – 5th Avenue SW
Calgary, AB T2P 3T6
