Fast, Powerful and
Scalable Analytics
Dipti Joshi
Director Product Management
MariaDB AX, MariaDB MaxScale
Why Analytics?
• Get the most value from your data assets
• Faster, better decision making
• Cost reduction
• New products and services
Types of Analytics
• Descriptive: What happened?
• Diagnostic: Why did it happen?
• Predictive: What is likely to happen?
• Prescriptive: What should I do about it?
Descriptive: What happened?
● Reports
○ Sales Report
○ Expense summary
● Ad-hoc requests to analyst
Diagnostic: Why did it happen?
● Aggregates: aggregate a measure over one or more dimensions
○ Find total sales
○ Top five products ranked by sales
● Roll-ups: aggregate at different levels of a dimension hierarchy
○ Given total sales by city, roll up to get sales by state
● Drill-down: the inverse of a roll-up
○ Given total sales by state, drill down to get totals by city
● Slicing and dicing:
○ Equality and range selections on one or more dimensions
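The four diagnostic operations above can be sketched in a few lines of plain Python. This is only an illustration: the `SALES` rows, the dimension names and the `aggregate` helper are hypothetical and do not come from the deck.

```python
from collections import defaultdict

# Hypothetical fact rows: (state, city, product, sales)
SALES = [
    ("CA", "San Jose", "widget", 120),
    ("CA", "San Jose", "gadget", 80),
    ("CA", "Fresno", "widget", 50),
    ("TX", "Austin", "widget", 70),
    ("TX", "Austin", "gadget", 30),
    ("TX", "Dallas", "gadget", 90),
]


def aggregate(rows, key):
    """Sum the sales measure grouped by the dimensions returned by key(row)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)


# Aggregates: total sales, and products ranked by sales.
total_sales = sum(row[3] for row in SALES)
by_product = aggregate(SALES, lambda r: r[2])
top_products = sorted(by_product.items(), key=lambda kv: kv[1], reverse=True)

# Drill-down level: sales by (state, city).
by_city = aggregate(SALES, lambda r: (r[0], r[1]))

# Roll-up: derive sales by state from the city-level totals.
by_state = defaultdict(int)
for (state, _city), amount in by_city.items():
    by_state[state] += amount
by_state = dict(by_state)

# Slice (equality on one dimension) and dice (equality plus a range).
ca_only = [r for r in SALES if r[0] == "CA"]
ca_mid_sized = [r for r in ca_only if 60 <= r[3] <= 150]
```

The roll-up reuses the city-level totals instead of rescanning the fact rows, which is the relationship the slide describes between the two levels of the hierarchy.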
Predictive: What is likely to happen?
● Sales prediction
○ Analyze data to identify trends, spot weaknesses or determine conditions among broader data sets for making decisions about the future
● Targeted marketing
○ What is the likelihood of a customer buying a particular product, based on past buying behavior?
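As a minimal sketch of trend-based sales prediction, the snippet below fits a straight line to past monthly sales with ordinary least squares and extrapolates one month ahead. The figures are made up for illustration, and a linear trend is an assumption; the deck does not prescribe any particular model.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly sales for months 1..6.
months = [1, 2, 3, 4, 5, 6]
sales = [100, 110, 120, 130, 140, 150]

slope, intercept = fit_line(months, sales)

# Extrapolate the fitted trend to month 7.
forecast_month_7 = slope * 7 + intercept
```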
Prescriptive: What is the best course of action?
Paradox of choice
With too many choices, which one is the best?
Data Analytics Use Cases
By industry
Finance
Identify trade patterns
Detect fraud and anomalies
Predict trading outcomes
Manufacturing
Simulations to improve design/yield
Detect production anomalies
Predict machine failures (sensor data)
Telecom
Behavioral analysis of customer calls
Network analysis (perf and reliability)
Healthcare
Find genetic profiles/matches
Analyze health vs spending
Predict viral outbreaks
Data Analytics Solution Considerations
• Technical Considerations
• Real-time analytics
– High speed data ingestion
– High speed read queries
• Analytics
– Built-in analytics
– Choice of BI tools
• Business Considerations
• Cost of deployment and use
– Hardware and price/performance ratio
– Large talent pool
Existing Approaches
Data Warehouses
• Limited real-time analytics
• Slow releases of product innovation
• Expensive hardware and software
• Hard to use
• Purpose-built rather than predictive analytics
Hadoop / NoSQL
• Limited SQL support
• Difficult to install/manage
• Limited talent pool
• Data lake with no data management
MariaDB Big Data Solution
MariaDB AX
and
MariaDB ColumnStore
MariaDB AX
Analytics -
simple, fast, scalable…
and open source
MariaDB AX
Components: MariaDB Server, MariaDB MaxScale, MariaDB ColumnStore
• Parallel queries
• Distributed storage
• No indexes
• Automatic partitioning
• Read optimized
• High compression
• Low disk I/O
[Architecture diagram: MariaDB MaxScale in front of MariaDB Server + ColumnStore nodes (UM, User Module), which sit on top of ColumnStore Storage nodes (PM, Performance Module)]
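To make "read optimized", "high compression" and "low disk I/O" concrete, here is a toy sketch of a column-oriented layout. The rows and helper names are hypothetical, and run-length encoding is used only as a simple example of why columns compress well; it is not a description of ColumnStore's actual on-disk format.

```python
# Hypothetical rows: (order_id, state, amount)
ROWS = [
    (1, "CA", 120),
    (2, "CA", 80),
    (3, "CA", 50),
    (4, "TX", 70),
    (5, "TX", 30),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


columns = to_columns(ROWS, ["order_id", "state", "amount"])

# An aggregate over one column reads that column only,
# not every field of every row.
total_amount = sum(columns["amount"])

# Low-cardinality columns compress well because equal values sit together.
state_runs = run_length_encode(columns["state"])
```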
MariaDB ColumnStore
High performance columnar storage engine that supports a wide variety
of analytical use cases in highly scalable distributed environments
Parallel query
processing for distributed
environments
Faster, More
Efficient Queries
Single Interface for
OLTP and analytics
Easy to Manage and
Scale
Easier Enterprise
Analytics
Power of SQL and
Freedom of Open
Source to Big Data
Analytics
Better Price
Performance
Better Price
Performance
Flexible deployment options
• Cloud and on-premises
• Runs on commodity hardware
• Open source, subscription-based pricing
[Chart: MariaDB ColumnStore costs 90.3% less per TB per year than a commercial data warehouse]
No need to maintain a third platform
• Run analytics from the same SQL front end
• No need to update application code
• Leverage MariaDB Extensible architecture
High data compression
• More efficient at storing big data
• Less hardware
Customers have saved by choosing MariaDB AX over Oracle (healthcare), MemSQL (auto parts) and Vertica (finance, SEO marketing). Come see them at M18!
Easier Enterprise
Analytics
ANSI SQL
Single SQL Front-end
• Use a single SQL interface for analytics and OLTP
• Leverage MariaDB Security features - Encryption for
data in motion, role based access and auditing
Full ANSI SQL
• No more SQL-“like” queries
• Supports complex joins, aggregation and window functions
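As an illustration of the kind of query this refers to, the sketch below combines an aggregation with a window function. It runs against SQLite through Python's built-in `sqlite3` module purely as a stand-in so the example is self-contained; it is not executed on MariaDB ColumnStore, the `sales` table and its data are hypothetical, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (state TEXT, city TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("CA", "San Jose", 120),
        ("CA", "San Jose", 80),
        ("CA", "Fresno", 50),
        ("TX", "Austin", 100),
        ("TX", "Dallas", 90),
    ],
)

# Aggregate sales per city, then rank cities within each state.
QUERY = """
SELECT state, city, total,
       RANK() OVER (PARTITION BY state ORDER BY total DESC) AS rank_in_state
FROM (
    SELECT state, city, SUM(amount) AS total
    FROM sales
    GROUP BY state, city
) AS city_totals
ORDER BY state, rank_in_state
"""

ranked = conn.execute(QUERY).fetchall()
conn.close()
```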
Easy to manage and scale
• Eliminates the need for indexes and views
• Automated horizontal/vertical partitioning
• Scales linearly by adding new nodes as data grows
• Out-of-the-box connectivity with BI tools
MariaDB AX customers across industries: Auto Parts, Finance, Ad
analytics, Asset management, Telecommunication, Healthcare,
Digital Media, Carpooling App
Faster, More
Efficient Queries
Optimized for Columnar storage
• Columnar storage reduces disk I/O
• Blazing fast read-intensive workload
• Ultra fast data import
Parallel
Query Processing
Parallel distributed query execution
• Distributes queries into a series of parallel operations
• Fully parallel high-speed data ingestion
– TPC-H lineitem table: 750K to 1 million rows per min
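The idea of splitting one query into parallel operations can be sketched as partial aggregation followed by a merge. The example below is a toy model under stated assumptions: the partitions, the worker function and the thread pool are hypothetical stand-ins, loosely analogous to storage-side nodes computing partial results that a front-end node then combines. It does not reflect ColumnStore's internal execution engine.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data, already split into three partitions of (state, amount).
PARTITIONS = [
    [("CA", 120), ("TX", 70)],
    [("CA", 80), ("TX", 30)],
    [("CA", 50), ("TX", 90)],
]


def partial_sum(partition):
    """Compute SUM(amount) GROUP BY state for a single partition."""
    totals = Counter()
    for state, amount in partition:
        totals[state] += amount
    return totals


def parallel_sum(partitions):
    """Run the partial aggregations concurrently, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(partial_sum, partitions))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts per key
    return dict(merged)


result = parallel_sum(PARTITIONS)
```

Sums merge cleanly because addition is associative; an average would need each partition to return both a sum and a count.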
Highly available analytic environment
• Built-in Redundancy
• Automatic fail-over
Simple & Streamlined Data Ingestion
[Diagram labels]
Ingestion: Bulk Data Adapters; Streaming Data Adapters (Apache Kafka)
Analytics: Data Services; Spark / Python / ML via Bulk Data Adapters
Operations: Web/Mobile Services
Transaction (OLTP): MariaDB MaxScale, MariaDB Server with InnoDB
Analytics (OLAP): MariaDB MaxScale, MariaDB Server with ColumnStore
Streaming data adapters – Apache Kafka
Stream all messages published to Apache Kafka topics to MariaDB AX automatically and continuously. This enables data from many sources to be streamed and collected for analysis without complex code.
[Diagram: Apache Kafka topics → Streaming Data Adapter (Kafka client) → Write API → ColumnStore Storage nodes, queried through MariaDB Server + ColumnStore]
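The flow in the diagram (consume messages, turn them into rows, write them in batches) can be sketched without any Kafka dependency. Everything here is hypothetical: the `topic` list stands in for a Kafka consumer, `InMemorySink` stands in for a bulk write API, and the JSON field names are invented. It shows the micro-batching pattern only, not the actual adapter.

```python
import json
from itertools import islice


def batches(messages, size):
    """Yield lists of up to `size` messages from any iterable."""
    iterator = iter(messages)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


class InMemorySink:
    """Hypothetical stand-in for a bulk write API; it only records rows."""

    def __init__(self):
        self.rows = []
        self.commits = 0

    def write_batch(self, rows):
        self.rows.extend(rows)
        self.commits += 1


def stream_to_sink(messages, sink, batch_size=2):
    """Decode JSON messages and write them to the sink one batch at a time."""
    for batch in batches(messages, batch_size):
        records = [json.loads(message) for message in batch]
        sink.write_batch([(rec["id"], rec["value"]) for rec in records])


# Stand-in for messages read from a topic.
topic = [
    '{"id": 1, "value": 10}',
    '{"id": 2, "value": 20}',
    '{"id": 3, "value": 30}',
]

sink = InMemorySink()
stream_to_sink(topic, sink)
```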
OLTP to OLAP: Streaming data adapters – MaxScale CDC
Stream all writes from MariaDB TX to MariaDB AX automatically and continuously. This keeps analytical data up to date, with no need for batch jobs, manual processes or human intervention.
[Diagram: MariaDB Server + InnoDB → MariaDB MaxScale (CDC server) → Streaming Data Adapter (CDC client) → Write API → ColumnStore Storage nodes, queried through MariaDB Server + ColumnStore]
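Change data capture boils down to replaying an ordered stream of row changes on the analytical side. The sketch below shows that idea on an in-memory table. The event shape (`event_type`, `row`, `id`) is an assumption made for illustration and is not the MaxScale CDC wire format.

```python
def apply_change(table, event):
    """Apply one change event to a dict that maps primary key -> row."""
    kind = event["event_type"]
    row = event["row"]
    if kind in ("insert", "update"):
        table[row["id"]] = row
    elif kind == "delete":
        table.pop(row["id"], None)
    else:
        raise ValueError(f"unknown event type: {kind}")


# Hypothetical change stream captured from the transactional side.
events = [
    {"event_type": "insert", "row": {"id": 1, "status": "new"}},
    {"event_type": "insert", "row": {"id": 2, "status": "new"}},
    {"event_type": "update", "row": {"id": 1, "status": "paid"}},
    {"event_type": "delete", "row": {"id": 2, "status": "new"}},
]

# Replaying the events in order keeps the analytical copy current.
analytics_table = {}
for event in events:
    apply_change(analytics_table, event)
```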
MariaDB AX Use Cases
IHME - Institute for Health Metrics and Evaluation
IHME visualizations library: http://www.healthdata.org/results/data-visualizations
Started with 4.2 TB, with a goal of growing to 30 TB of data
Customer Use Case - 1
Industry: healthcare (Medicaid)
Data: surveys
Use case: decision support system
Details:
• Identify trends and patterns
• Determine population cohorts
• Predict health outcomes
• Anticipate funding / capacity
• Recommend intervention
Can’t do complex queries on current hardware with Oracle and snowflake schemas
Limited to optimizing for simple, known queries (2-3 columns)
Replaced with ColumnStore
> a single table
> 2.5 million rows, 248 columns
> complex, ad-hoc queries
> query 20+ columns in seconds
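A rough back-of-the-envelope calculation suggests why a wide single table suits columnar storage. It uses the row and column counts from the slide, and assumes (hypothetically) a full scan with no indexes and equally sized columns, so the numbers are indicative only.

```python
ROWS = 2_500_000        # from the slide
TOTAL_COLUMNS = 248     # from the slide
QUERIED_COLUMNS = 20    # "query 20+ columns"; 20 taken as the example value

# A row-oriented full scan touches every column of every row.
values_read_row_store = ROWS * TOTAL_COLUMNS

# A column-oriented scan touches only the referenced columns.
values_read_column_store = ROWS * QUERIED_COLUMNS

# Share of the table's values the columnar scan has to read (about 8%).
fraction_read = values_read_column_store / values_read_row_store
```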
Customer Use Case - 2
Industry: biotechnology (genetics)
Data: genotypes
Use case: genetic profiling
Details:
• Find genetic mates (beef and dairy)
• Predict meat production (pork)
• Gene/DNA analysis
Had to convert to CSV files and schedule
import jobs (cron)
Always receiving new genetic data
Migrated to data adapter (Python)
> streamline import process
> remove steps / possible errors
> remove delays
> import data on demand
> immediate customer access
Customer Use Case - 3
Industry: Mobile text/call app
Data: call and text logs
Use case: Mobile app use analytics
Details:
• 30 million texts and 3 million phone calls per day
• 1.5 billion rows of logs per day
• Text and call volume will continue to grow
The InnoDB backend hit its scale limit at 6 TB and required a lot of performance tuning and index management
Migrated to MariaDB AX
> Able to process 24 months of data (24 TB) vs the 6-month limit with InnoDB
> The same BI tools and client applications worked with MariaDB AX seamlessly
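The figures on this slide are consistent with each other, as a quick calculation shows. The inputs come from the slide; the 30-day month, the decimal terabyte and the assumption that the 24 TB consists entirely of these log rows are simplifications added here.

```python
ROWS_PER_DAY = 1_500_000_000   # from the slide
RETAINED_TB = 24               # from the slide
RETAINED_MONTHS = 24           # from the slide
INNODB_LIMIT_TB = 6            # from the slide

DAYS_PER_MONTH = 30            # assumption
BYTES_PER_TB = 10**12          # assumption: decimal terabytes

# Storage growth implied by 24 TB over 24 months.
tb_per_month = RETAINED_TB / RETAINED_MONTHS

# Months of history that fit under the 6 TB limit at that growth rate.
months_within_innodb_limit = INNODB_LIMIT_TB / tb_per_month

# Implied average stored size of one log row.
rows_per_month = ROWS_PER_DAY * DAYS_PER_MONTH
bytes_per_row = tb_per_month * BYTES_PER_TB / rows_per_month
```

At about 1 TB per month, the 6 TB ceiling corresponds to the 6 months of history mentioned on the slide.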
MariaDB AX
Analytics made easy –
simple, fast, scalable…
Thank you
