Why Abstract Away the Underlying Database Infrastructure
MariaDB MaxScale: Database Proxy
Markus Mäkelä
Overview
• What is database cluster abstraction?
• Why is it important?
• How does MariaDB MaxScale do it?
What is Abstraction for Database Clusters?
The Idea of a Perfect Database
● Behaves like a single database
○ Simple to use
○ Easy to manage
● Performs like a cluster
○ Robust and failure tolerant
○ Near-linear scalability
The Database
Why is it Important?
● Isolates Complexity
○ One logical database → Build simpler applications
● Highly Available Database
○ Fault tolerant → Robust services
○ Dynamic clusters → Easier maintenance
● Load balancing
○ Read/Write splitting → Better performance
Database Abstraction Layer
Why is it Important?
Complexity isolation
● Simpler application development/configuration
○ No need to know where to send queries
● No user-visible infrastructure
○ Don’t need to detect servers that are in maintenance
○ No need to know the cluster topology
The Database
Why is it Important?
Highly Available Database
● Prevents downtime
○ Node failure is not cluster failure
● Easier Maintenance
○ Functionality not tied to physical nodes
○ Reduced capacity, not functionality
○ Easy node replacement
Database Abstraction Layer
Why is it Important?
Load Balancing
● Runtime Load Balancing
○ Maximized node utilization
● Horizontal Scalability
○ Cheaper
○ Easier to change
○ On-demand capacity
[Diagram: database nodes labeled 1, 2 and N]
Database Abstraction Layer
How the Abstraction is Implemented
MariaDB MaxScale
MaxScale Overview
● Modular Database Proxy
○ Only use what is needed
○ Extendable
● Content-aware
○ Understands routed traffic
● Cluster-aware
○ Active cluster monitoring
○ Understands different cluster types
Configuration:
Defining Services instead of Servers
● “Database as a Service”
● Decouple clients from databases
● Describe what you want instead of what you have
○ This is a service that provides automated, highly available read-write splitting
Database Abstraction Layer
Monitors
Abstracting the Cluster Concept
Overview: Monitors
● Classify servers
○ Up or Down?
○ Master or Slave?
○ In sync or not?
● Information used by routers
○ Masters used for writes
○ Slaves used for reads
● Detects events
○ Server went down
○ Slave is disconnected
MariaDB Monitor:
Master-Slave Monitor
● Detects topology
○ Builds the replication tree
● Assigns correct labels
○ Root node for writes
○ Other nodes for reads
● Detects replication lag
○ Write timestamp on master
○ Read from slave
[Diagram: one master replicating to two slaves, with the callouts "This is a master" and "This is a slave"]
MariaDB Monitor:
Monitored Variables
● Output of SHOW ALL SLAVES STATUS
○ Slave_IO_Running: Yes
○ Slave_SQL_Running: Yes
○ Master_Server_Id: 1234
● Number of configured slaves
● @@read_only
[Diagram: one master replicating to two slaves, with the callout "This is used to build the replication tree"]
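The way these monitored values turn into server labels can be sketched in a few lines. This is only an illustration under simplifying assumptions (a single-level topology, one replication source per server), not MaxScale's actual monitor code; `ServerStatus` and `label_servers` are hypothetical names whose fields mirror the variables listed on the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ServerStatus:
    """Hypothetical snapshot of the values a monitor reads from one server."""
    server_id: int
    master_server_id: Optional[int] = None  # None: not configured as a slave
    slave_io_running: bool = False
    slave_sql_running: bool = False
    read_only: bool = False


def label_servers(servers: List[ServerStatus]) -> Dict[int, str]:
    """Label every server as 'master', 'slave' or 'unusable'."""
    known_ids = {s.server_id for s in servers}
    labels: Dict[int, str] = {}
    for s in servers:
        replicating = (
            s.master_server_id in known_ids
            and s.slave_io_running
            and s.slave_sql_running
        )
        if replicating:
            labels[s.server_id] = "slave"      # usable for reads
        elif s.master_server_id is None and not s.read_only:
            labels[s.server_id] = "master"     # root of the tree, takes writes
        else:
            labels[s.server_id] = "unusable"   # broken or partial replication
    return labels
```

A slave whose SQL thread has stopped is labeled unusable here, which matches the idea that routers should only see servers that are safe to use.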
Galera Cluster Monitor
● Galera Clusters
○ Synchronous Cluster
○ Globally conflict free
○ Conflicting transaction → Error on commit
● Abstracted in MaxScale
○ One “master” node
■ Prevents conflicts
○ Rest labeled as “slaves”
■ Good for scaleout
[Diagram: three Galera master nodes; one carries the callout "Use this for all writes..." and the other two "...and these two for reads"]
Galera Cluster Monitor:
Node Election
● @@wsrep_local_state
○ 4 (Synced) → OK
○ Anything else → Not OK
● @@wsrep_local_index
○ Zero-indexed
○ Cluster-wide “rank”
● Optional:
○ Manual node ranking (priority)
○ Split-brain sanity checks
○ MariaBackup/XtraBackup SST detection
[Diagram: three Galera master nodes]
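A minimal sketch of the election rule follows, assuming that a manual priority, when present, takes precedence over `wsrep_local_index`. `GaleraNode` and `elect_master` are hypothetical names, and the real monitor performs additional checks that are left out here.

```python
from dataclasses import dataclass
from typing import List, Optional

SYNCED = 4  # wsrep_local_state value reported by a fully synced node


@dataclass
class GaleraNode:
    name: str
    wsrep_local_state: int
    wsrep_local_index: int
    priority: Optional[int] = None  # hypothetical manual rank, lower wins


def elect_master(nodes: List[GaleraNode]) -> Optional[GaleraNode]:
    """Pick the node that receives all writes, or None if no node is usable."""
    candidates = [n for n in nodes if n.wsrep_local_state == SYNCED]
    if not candidates:
        return None
    ranked = [n for n in candidates if n.priority is not None]
    if ranked:
        # Manual ranking overrides the cluster-assigned index.
        return min(ranked, key=lambda n: n.priority)
    return min(candidates, key=lambda n: n.wsrep_local_index)
```

Every remaining synced node would then be labeled a "slave" and used for reads.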
Routing & Query Classification
How the Load Balancing is Done
Query Classifier:
The Brains of MaxScale
● Provides both abstract and detailed information
○ Read or write
■ Does the query modify the database?
○ Query components
■ Is the table `t1` used in this query?
■ What are the values for the functions in the query?
○ Query characteristics
■ Does the query have a WHERE clause?
○ State changes
■ Was the default character set changed?
■ Is there an open transaction?
[Diagram: a query broken into tokens (SELECT, WHERE, id, =, 1;) and classified as "Read-only query"]
Query Classifier:
Details
● Based on a modified lightweight version of SQLite
○ Extended for MariaDB 10.3 syntax
○ Removed data storage and memory allocation
● Smart classification
○ First pass
■ Lightweight parsing
■ Resolves operation and query type
○ Second pass
■ Only for full syntactic classification
■ Column ↔ Function relationships
[Diagram: a query broken into tokens (SELECT, WHERE, id, =, 1;) and classified as "Read-only query"]
ReadWriteSplit:
The Routing Muscle
● Read/write splitting
○ Write to master, read from slaves
○ Performance improvement for read-heavy loads
○ Prevents conflicts (Galera)
● Session state tracking & propagation
○ Consistent session state
● Failure tolerant
○ Hides slave failures
● Multiple backend connections
○ Must-have for read/write splitting
○ Speeds up node failover
ReadWriteSplit:
Load Balancing
Based on server score
● Multiple algorithms
○ Active operation count → Default
■ MIN(operations)
○ Connection count
■ MIN(connections)
○ Replication delay
■ MIN(delay)
● Manually adjustable
○ Weight each server differently
■ MIN(score * weight)
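The scoring rule on this slide is small enough to write out. The sketch below assumes each backend exposes the three metrics as plain numbers; `Backend` and `pick_backend` are hypothetical names, not part of MaxScale. It follows the slide's formula literally, so under MIN(score * weight) a smaller weight makes a server more attractive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    name: str
    operations: int     # queries currently executing on the server
    connections: int    # open connections to the server
    delay: float        # replication delay in seconds
    weight: float = 1.0


def pick_backend(backends: List[Backend], criterion: str = "operations") -> Backend:
    """Return the backend with the lowest weighted score for the criterion."""
    if criterion not in ("operations", "connections", "delay"):
        raise ValueError(f"unknown criterion: {criterion}")
    return min(backends, key=lambda b: getattr(b, criterion) * b.weight)
```

For example, with three backends whose operation counts are 4, 2 and 3, the default criterion selects the one running 2 operations.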
ReadWriteSplit:
Session State
● Consistent state for all connections
○ State modifications propagated
○ Truly abstracted reads
● State modification history
○ Node replacement
SET SQL_MODE='ANSI';
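The link between the state modification history and node replacement can be illustrated as follows. This is a sketch under the assumption that replaying the recorded statements in order is enough to recreate the session state on a new node; `Session` and its methods are hypothetical, and each backend is modeled simply as the list of statements it has executed.

```python
from typing import Dict, List


class Session:
    """Tracks session-state statements and the backends they were sent to."""

    def __init__(self, backends: List[str]) -> None:
        self.history: List[str] = []
        # Backend name -> statements executed on that backend so far.
        self.executed: Dict[str, List[str]] = {name: [] for name in backends}

    def modify_state(self, sql: str) -> None:
        """Record a state change and propagate it to every backend."""
        self.history.append(sql)
        for statements in self.executed.values():
            statements.append(sql)

    def replace_backend(self, old: str, new: str) -> None:
        """Swap a failed backend for a new one and replay the history on it."""
        del self.executed[old]
        self.executed[new] = list(self.history)
```

After `replace_backend`, the new node has seen the same state changes as the surviving ones, so reads routed to it behave the same way.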
ReadWriteSplit:
Transactions
Read-write transaction
START TRANSACTION;
SELECT name FROM accounts WHERE id = 1;
INSERT INTO logins VALUES ('john doe');
COMMIT;
Transactional behavior must be kept intact
● Executed only on one node
● Statements cannot be retried on other servers
● Cannot be load balanced
ReadWriteSplit:
Transactions
Read-only transaction
START TRANSACTION READ ONLY;
SELECT name FROM accounts WHERE id = 1;
COMMIT;
Same as read-write except:
● Can be load balanced
● Safe even with writes
○ Server returns an error
ReadWriteSplit: Query classification
SELECT name FROM accounts WHERE id = 1;   -- Read
INSERT INTO logins VALUES ('john doe');   -- Write
SELECT LAST_INSERT_ID();                  -- Dependent Query
SET @@character_set_client=cp850;         -- Session State
Different queries require different behavior
● Writes to master
● Reads to slaves
● Dependent queries to previous server
● Session state modifications to all
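The four routing rules can be sketched as a small dispatcher. The `classify` function below is a deliberately naive stand-in based on string matching (MaxScale uses a real SQL parser for this step), and `Router` is a hypothetical name; only the routing decisions themselves come from the slide.

```python
from typing import List


def classify(sql: str) -> str:
    """Toy classifier: returns 'read', 'write', 'dependent' or 'session_state'."""
    text = sql.strip().upper()
    if "LAST_INSERT_ID" in text:
        return "dependent"
    if text.startswith("SET"):
        return "session_state"
    if text.startswith("SELECT"):
        return "read"
    return "write"


class Router:
    def __init__(self, master: str, slaves: List[str]) -> None:
        self.master = master
        self.slaves = slaves
        self.previous = master  # target of the last read or write

    def route(self, sql: str) -> List[str]:
        """Return the list of servers the statement should be sent to."""
        kind = classify(sql)
        if kind == "session_state":
            return [self.master, *self.slaves]  # keep all connections in sync
        if kind == "dependent":
            return [self.previous]              # result depends on the last query
        target = self.master if kind == "write" else self.slaves[0]
        self.previous = target
        return [target]
```

Always picking the first slave keeps the sketch short; a real router would combine this step with a load balancing decision.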
ReadWriteSplit: Query classification
Prepared statements
SELECT name FROM accounts WHERE id = ?;
INSERT INTO logins VALUES (?);
Observable behavior:
● None
Behind the scenes:
● Text protocol
○ Resolve query type
○ Map text identifier to query type
● Binary protocol
○ Resolve query type
○ Route preparation
○ Map returned identifier to query type
Handling Failures
How MaxScale Hides Node Failures
Monitors:
Node Failure
Monitors detect failures:
● Node no longer responsive
○ Response takes too long
○ Connection broken → Cannot reestablish
● Invalid state
○ Broken replication
○ Replication is lagging
○ Out-of-sync Galera node
ReadWriteSplit:
Hiding Node Failures
Read retry
● Hides “trivial” failures
○ SELECT statement
○ autocommit=1
○ No open transaction
● Guaranteed reply
○ Try slaves first
○ Use master as last resort
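Read retry amounts to a loop over candidate servers. The sketch below assumes a caller-supplied `run(server, sql)` function that raises `ConnectionError` when a server fails; `can_retry` and `execute_read` are hypothetical names, and the SELECT check is a simple prefix test rather than real query classification.

```python
from typing import Callable, List


def can_retry(sql: str, autocommit: bool, in_transaction: bool) -> bool:
    """A read is 'trivial' if it is a SELECT outside of any transaction."""
    is_select = sql.lstrip().upper().startswith("SELECT")
    return is_select and autocommit and not in_transaction


def execute_read(
    sql: str,
    slaves: List[str],
    master: str,
    run: Callable[[str, str], str],
    autocommit: bool = True,
    in_transaction: bool = False,
) -> str:
    """Try the slaves first and fall back to the master as a last resort."""
    candidates = [*slaves, master]
    if not can_retry(sql, autocommit, in_transaction):
        candidates = candidates[:1]  # no retry: only the first target is used
    last_error = None
    for server in candidates:
        try:
            return run(server, sql)
        except ConnectionError as error:
            last_error = error  # hide the failure and try the next server
    raise ConnectionError("no server could answer the read") from last_error
```

The client only sees an error when every candidate, including the master, has failed.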
ReadWriteSplit:
Read-only Mode
● Triggered on master failure
○ Master server down
○ Lost connection to master
● Read-only queries and transactions allowed
○ For read-heavy traffic
● Configurable behavior
○ Close connection on master failure
○ Close connection on first write
○ Send error on all writes
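The three configurable behaviors can be expressed as one decision function. The mode names below are modeled on the values of readwritesplit's `master_failure_mode` parameter as best recalled and should be treated as an assumption; `decide` and its return values are hypothetical.

```python
def decide(mode: str, master_up: bool, is_write: bool) -> str:
    """Return 'route', 'close' or 'error' for one incoming statement."""
    if mode not in ("fail_instantly", "fail_on_write", "error_on_write"):
        raise ValueError(f"unknown mode: {mode}")
    if master_up:
        return "route"
    if mode == "fail_instantly":
        return "close"   # close connection on master failure
    if not is_write:
        return "route"   # reads continue on the slaves
    if mode == "fail_on_write":
        return "close"   # close connection on first write
    return "error"       # send an error, keep the session open
```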
ReadWriteSplit:
Slave Replacement
● Triggered on slave failure
○ Discard current slave
○ Pick a replacement
● Supplements read retry
○ Lower total connection count
● Configurable behavior
○ Close connection on master failure
○ Close connection on first write
○ Send error on all writes
Filters
Extending MaxScale Functionality
Filter Overview
● Between client and router module
○ Pre-processing
○ Analytics
○ Target hinting
● Chainable
○ Output pipes to input
● Easy to write
○ First community contribution
■ tpmfilter
Cache:
TTL-based resultset caching
● Up to 3x read performance
● Configurable caching and storage
○ Specific users or applications
○ Matching SQL statements
○ Specific tables or databases
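TTL-based result set caching can be illustrated with a dictionary keyed by the SQL text. This is a minimal sketch, not the cache filter's implementation: `TTLCache` is a hypothetical class, the selection rules (users, statements, tables) are omitted, and the clock is injectable only to make the behavior easy to test.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Caches result sets per SQL statement for a fixed time to live."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock
        self.entries: Dict[str, Tuple[float, object]] = {}

    def get(self, sql: str, fetch: Callable[[str], object]) -> object:
        """Return a cached result if it is still fresh, else fetch and store it."""
        now = self.clock()
        entry = self.entries.get(sql)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        result = fetch(sql)  # cache miss or stale entry: go to the database
        self.entries[sql] = (now, result)
        return result
```

Because entries expire purely on age, a cached result can be stale for up to `ttl` seconds after the underlying data changes.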
Critical Reads
INSERT INTO accounts VALUES ('john doe');
SELECT name FROM accounts WHERE name = 'john doe';
● Non-transactional
○ Work on a single node
○ Fail when load balanced
● Depend on previous queries
○ Read inserted value
● Not compatible with load balancing
○ Can return a result without the inserted value
● Not the “correct way” to do it
○ Legacy application → hard to modify
○ Framework → impossible to modify
CCRFilter:
Consistent Critical Reads
● Detects data modification
○ Writes “pin” the session to master
● Tags the query with a hint
○ Route to master
● Configurable
○ Number of queries
○ Time interval
INSERT INTO accounts VALUES ('john doe');
SELECT name FROM accounts WHERE name = 'john doe';   -- Route this to the master!
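The pinning logic can be sketched with a counter and a deadline. `ConsistentCriticalReads` is a hypothetical class, and the way the two limits combine here (a read goes to the master if either limit still applies) is an assumption made for the example.

```python
import time
from typing import Callable


class ConsistentCriticalReads:
    """After a write, keep reads on the master for N queries or T seconds."""

    def __init__(
        self,
        count: int = 0,
        time_window: float = 0.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.count = count
        self.time_window = time_window
        self.clock = clock
        self.remaining = 0
        self.pinned_until = float("-inf")

    def target(self, is_write: bool) -> str:
        """Return 'master' or 'slave' as the routing hint for one statement."""
        now = self.clock()
        if is_write:
            # A data modification pins the session to the master.
            self.remaining = self.count
            self.pinned_until = now + self.time_window
            return "master"
        if self.remaining > 0:
            self.remaining -= 1
            return "master"
        if now < self.pinned_until:
            return "master"
        return "slave"
```

With `count=1`, the SELECT that follows the INSERT on the slide would be hinted to the master, and later reads would be load balanced again.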
Regexfilter:
sed for SQL
● Match-replace functionality
○ PCRE2 regular expressions
● Fix broken SQL
○ “Patching” after release
● Allows neat tricks
○ Append a LIMIT clause
○ Add optimizer hints
○ Change storage engine
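Two of these tricks can be shown with ordinary match-replace rules. The sketch uses Python's `re` module instead of PCRE2, and both the rules and the `rewrite` helper are made up for illustration; they are not regexfilter configuration.

```python
import re
from typing import List, Tuple

# Hypothetical rewrite rules: (pattern, replacement).
RULES: List[Tuple[str, str]] = [
    # Change the storage engine in CREATE TABLE statements.
    (r"ENGINE\s*=\s*MyISAM", "ENGINE=InnoDB"),
    # Append a LIMIT clause to SELECT statements that do not have one.
    (r"^(SELECT (?!.*\bLIMIT\b).*?);?$", r"\1 LIMIT 1000"),
]


def rewrite(sql: str, rules: List[Tuple[str, str]] = RULES) -> str:
    """Apply every rule in order, like a chain of sed expressions."""
    for pattern, replacement in rules:
        sql = re.sub(pattern, replacement, sql, flags=re.IGNORECASE)
    return sql
```

A query that already contains a LIMIT clause passes through the second rule unchanged.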
Wrapping Up
Problem:
Database clusters are essential for performance and HA but are also hard to use properly.
Solution:
Use the right tool. Work smart, not hard.
Wrapping Up
MaxScale:
A Toolbox for the Database.
● Abstracts database clusters into services
● Truly understands traffic and environment
● Makes database clusters easy to use efficiently
Thank you

M|18 Why Abstract Away the Underlying Database Infrastructure
