Open Source DataViz
with Apache Superset
Who Am I?
• MSc. Computer Science and B.A.Sc. Engineering Physics.
• Python programmer with experience in machine learning and
big data.
• Currently working as a Data Scientist at the Mexican startup
Konfío and living in Mexico City.
Visit: http://konfio.mx
Why DataViz?
• Drawing useful insights from stored data
• Enabling and supporting effective decisions
• Using data to get new information and find patterns
• Monitoring daily numbers and trends
• Presenting an argument or telling a story
• It’s all about getting the right information to the right decision
makers at the right time
Tools out there
(all paid-for)
Tools out there
(Open source)
Open Source DataViz with Apache Superset
• A modern data exploration and visualization web application.
• Superset’s main goal is to make it easy to slice, dice and
visualize data.
• Developed by engineers at Airbnb and now released under the
Apache License 2.0.
• This project was originally named Panoramix, was renamed to
Caravel in March 2016, and is currently named Superset as of
November 2016.
• Built on the Flask framework in Python.
• Works as a web app in all the most widely used browsers, so it
does not require any additional desktop installation.
• Easy to deploy on a server, and able to handle multiple users
with roles and authentication.
• Repo: https://github.com/apache/incubator-superset
Advantages
• Interactivity: You can create visualizations even without
knowledge of SQL or Python!
• No coding needed to set it up, but the code is available to modify
• Completely free: no user license or one-time download fee
• Supports multiple data sources (most SQL dialects and
Druid), with more to come!
• Growing in popularity, with new releases each month
Limitations and Disadvantages
• Still very young and lacks some of the basics, like uploading data,
tooltip customization and visual filters.
• The tool is developing rapidly, so be ready to find bugs.
• Customization is hard unless you are willing to dive into
the source code, but you can!
• Difficult to plot higher-level aggregations without some data
manipulation or creating views.
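The last limitation can be illustrated with a small sketch. A "higher-level aggregation" here means an aggregate of an aggregate, such as the average of daily totals. One way around it is to store the first aggregation as a database view, so that a charting tool can treat the view like a plain table and apply the second aggregation on top. The example below uses Python's built-in sqlite3 module; the `sales` table, the `daily_totals` view and all values are made up for illustration and do not come from the talk.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up `sales` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (day TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [
            ("2017-01-01", 100.0),
            ("2017-01-01", 300.0),
            ("2017-01-02", 200.0),
            ("2017-01-03", 50.0),
            ("2017-01-03", 250.0),
        ],
    )
    # First-level aggregation, stored as a view so that a visualization
    # tool can query it as if it were an ordinary table.
    conn.execute(
        "CREATE VIEW daily_totals AS "
        "SELECT day, SUM(amount) AS total FROM sales GROUP BY day"
    )
    return conn


def average_daily_total(conn):
    """Second-level aggregation: the average of the per-day sums."""
    (value,) = conn.execute("SELECT AVG(total) FROM daily_totals").fetchone()
    return value


conn = build_demo_db()
print(conn.execute("SELECT day, total FROM daily_totals ORDER BY day").fetchall())
print(average_daily_total(conn))
```

In a real deployment the view would live in the database that Superset connects to, and it would then be registered as a table in the tool.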
Mounting Superset
• Clone from:
git clone https://github.com/amancevice/superset
• Or pull the Docker image:
docker pull amancevice/superset
Getting it up
• Create the container:
docker run --detach --name superset -p 8088:8088 amancevice/superset
• Or, easier:
docker-compose up
• Initialize and load demo data:
docker exec -it [container name] superset-demo
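Because the container above is started in detached mode, the web app may need a moment before it answers on port 8088. A minimal sketch of a readiness check, using only the Python standard library, could look like the following. The function name `wait_until_up` and its default timings are hypothetical choices for illustration, not part of Superset or of the Docker image.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url, timeout=60.0, interval=2.0):
    """Poll `url` until the server answers or `timeout` seconds pass.

    Returns True as soon as any HTTP response arrives (even an error
    status means the web server is listening), False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server replied with a 4xx/5xx status, so it is running.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or reset: the container is still starting.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_until_up("http://localhost:8088") after the docker run step would return True once the app responds; the port matches the -p 8088:8088 mapping used above.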
DEMO
Contributing
• Contributions are welcome and are greatly appreciated!
• You can help make Superset better by:
• Reporting and fixing bugs
• Implementing new features
• Helping with the documentation
• Submitting feedback and new feature ideas
https://github.com/apache/incubator-superset/blob/master/CONTRIBUTING.md
Closing
• Superset is certainly not a replacement for more robust BI tools, but
it is growing to become their main open source competitor.
• To become a full BI tool, it still needs an analytics module.
• Can scale better than per-user license solutions.
• With Superset, everybody in the organization can be a data scientist,
at least a bit.
BONUS
https://datavizcatalogue.com/index.html
Thank You!
Let’s have a beer!
cwallaceh@gmail.com
