Open Source Business Intelligence Overview
From Data Source to Analytics and Beyond
Agenda
● Open Source and BI
● Data sources
● Data Integration
● Reporting/Frontend
● Analytics
● Data Quality
● Data Governance
Open Source BI Overview
Source: https://www.informs.org/ORMS-Today/Public-Articles/October-Volume-37-Number-5/Back-in-Business
Data Sources
Traditional
○ PostgreSQL - http://www.postgresql.org/
■ Pivotal Greenplum - http://gopivotal.com/
○ MySQL - http://www.mysql.com/
■ Percona - http://www.percona.com/
■ MariaDB - https://mariadb.org/
Columnar
○ MySQL Derivatives
■ InfiniDB - http://infinidb.org/
■ Infobright - https://www.infobright.com/
○ MonetDB - http://www.monetdb.org/Home
Relational vs Columnar
Source: http://www.calpont.com/images/column-oriented-database.jpg
Data Sources
NoSQL
○ Cassandra - http://cassandra.apache.org/
○ MongoDB - http://www.mongodb.org/
○ CouchDB - http://couchdb.apache.org/
○ Infinispan - http://www.jboss.org/infinispan/
○ Hadoop - http://hadoop.apache.org/
■ HBase - http://hbase.apache.org/
■ Hive - http://hive.apache.org/
OLAP
○ Mondrian - http://mondrian.pentaho.com/
Source: http://gerardnico.com/wiki/database/oracle/oracle_olap
The Next Wave of Data Sources
Virtualization
○ Teiid - http://www.jboss.org/teiid/
Semantic Web/Graph
○ Sesame - http://www.openrdf.org/
○ Neo4j - http://www.neo4j.org/
○ OrientDB - http://www.orientdb.org/
○ Infogrid - http://infogrid.org/trac/
Source: http://www.ebizq.net/blogs/guest_session/2009/12/putting-data-to-work-for-cloud-bpm-mdm-and-soa-projects.php
Graph Database
Source: http://en.wikipedia.org/wiki/Graph_database
Data Integration
Kettle - http://kettle.pentaho.com/
Talend - http://www.talend.com/
CloverETL - http://www.cloveretl.com/
Reporting
BIRT (Actuate) - http://www.eclipse.org/birt/phoenix/
Pentaho - http://reporting.pentaho.com/
Jaspersoft - http://community.jaspersoft.com/
Saiku - http://meteorite.bi/saiku
Full Stacks
SpagoBI - http://www.spagoworld.org/xwiki/bin/view/SpagoBI/#
Pentaho - http://www.pentaho.com/
Jaspersoft - http://www.jaspersoft.com/
Analytics
R - http://www.r-project.org/
Weka - http://www.cs.waikato.ac.nz/ml/weka/
RapidMiner - http://rapid-i.com/content/view/181/
Data Quality
Profiling
○ DataCleaner - http://datacleaner.org/
○ DQGuru - http://www.sqlpower.ca/page/dqguru
Suites
○ Talend - http://www.talend.com/products/data-quality
Testing
○ SQLUnit - http://sqlunit.sourceforge.net/
○ dbFit - http://benilovj.github.io/dbfit/
○ etlUnit - https://github.com/dbaAlex/etlUnit (shameless plug :p)
Data Governance
MDM
○ Talend - http://www.talend.com/resource/data-governance.html
Business Rules Engine
○ JBoss Drools - http://www.jboss.org/drools/
○ Open Rules - http://openrules.com/


