Scaling MySQL – Sharding Made Easy
Agenda
• Scalability Issues
• MySQL 5.6
• Why Do-It-Yourself (DIY) Sharding Sucks
• ScaleBase Data Distribution:
– Successful sharding on Amazon and private clouds
– Single vs. multiple shards per server
– Eliminating data silos
– Creating a redundant, fault-tolerant architecture
– Re-balancing and splitting shards
• Q & A
Doron Levari, Founder & CTO
A technologist and long-time veteran of the database industry. Prior to founding ScaleBase, Doron was CEO of Aluna.
What We Do
Simply and cost-effectively scale
MySQL to support an infinite
number of users, transactions and data
with NO disruption to the existing infrastructure
Scalability Issues and MySQL 5.6
MySQL Scalability Challenges
• Too many transactions
• Too many users
• Too much data
• Too many writes
• Capacity
• Throughput
• Performance inconsistencies
Improvements in MySQL 5.6 – Single Box
Partitioning Improvements
– Explicit Partition Selection:
SELECT * FROM employees PARTITION (p0, p2);
– Import / Export for Partitioned Tables:
Bring a new data set into a partitioned table, or export a partition to manage it as a regular table:
ALTER TABLE e EXCHANGE PARTITION p0 WITH TABLE e2;
http://dev.mysql.com/tech-resources/articles/whats-new-in-mysql-5.6.html
Replication Improvements
– Optimizations to Row-Based
Replication
– Multi-Threaded Slaves
– Improvements to Data Integrity
– Crash-Safe Slaves
– Replication Checksums
SCALABILITY issues remain due to the limitations of a single box:
To ensure ACID, you still face limitations with:
- Memory management
- Thread management
- Semaphores
- Locking
- Recovery tasks
No new functionality for sharing workloads across multiple boxes
What Are My Options?
1. More/Bigger Hardware?
– Temporary fix…you will need new hardware again
– More memory…helps mostly with “reads,” but not with “writes”
– Every write operation becomes at least 4 write operations in the database, plus multiple activities in the database engine's memory
2. Application re-architecture?
– Steer workload away from the database
– Example: introduce a caching layer
– Forces application rewrites; new test & QA cycles
3. Do-It-Yourself Sharding?
4. Migrate to a new database architecture?
– Other RDBMS / NewSQL / NoSQL?
– Forces application rewrites; new test & QA cycles
– ACID/Durability issues
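To make the caching-layer option concrete, here is a minimal cache-aside sketch. It is an illustration only: the dict-backed "database", the class name CachedReader, and its methods are hypothetical stand-ins, not part of MySQL or ScaleBase.

```python
# Cache-aside sketch. The dict "database" and all names here are
# hypothetical stand-ins used only for illustration.
class CachedReader:
    def __init__(self, database: dict):
        self.database = database  # stand-in for the real database
        self.cache = {}           # stand-in for a caching layer
        self.db_reads = 0         # counts reads that reached the database

    def get(self, key):
        """Serve from the cache when possible, otherwise read the database."""
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.database[key]
        self.cache[key] = value
        return value

    def put(self, key, value):
        """Writes still go to the database; the cached copy is invalidated."""
        self.database[key] = value
        self.cache.pop(key, None)


reader = CachedReader({"user:1": "Ann"})
reader.get("user:1")  # first read goes to the database
reader.get("user:1")  # second read is served from the cache
```

Note that put() still reaches the database on every call, which matches the point above that this kind of fix helps reads far more than writes, and that the application itself has to change to use it.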
Scale Out your Existing MySQL
• Keep your MySQL - keep your InnoDB
• Ecosystem compatibility, preserve skills
• 100% application compatibility
• Smoother migration, no down-time, no forklift
• Your data is safe
• No “in-memory” magic
• No “in-memory” size limit
Don’t throw out the baby with the bath water!
Why Do It Yourself Sharding “Sucks”
What is Sharding?
Wikipedia - Shard (database architecture) http://en.wikipedia.org/wiki/Shard_(database_architecture)
A database shard is a horizontal partition in a
database or search engine. Each individual partition
is referred to as a shard.
Horizontal partitioning is a database design
principle whereby rows of a database table are held
separately, rather than being split into columns.
Each partition forms part of a shard, which may in
turn be located on a separate database server or
physical location.
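The definition above can be sketched in a few lines. This is a minimal illustration, assuming a modulo rule on a hypothetical customer_id shard key and four shards; none of these names or choices come from the slides.

```python
# Horizontal partitioning sketch: whole rows are assigned to shards.
# NUM_SHARDS, customer_id and the modulo rule are assumptions.
NUM_SHARDS = 4


def shard_for(customer_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Return the index of the shard that owns this customer_id."""
    return customer_id % num_shards


def partition_rows(rows, num_shards: int = NUM_SHARDS):
    """Group rows by shard; every row keeps all of its columns."""
    shards = {i: [] for i in range(num_shards)}
    for row in rows:
        shards[shard_for(row["customer_id"], num_shards)].append(row)
    return shards


rows = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bob"},
    {"customer_id": 5, "name": "Cy"},
    {"customer_id": 8, "name": "Dee"},
]
shards = partition_rows(rows)
```

Each shard holds complete rows, which is what separates horizontal partitioning from splitting a table by columns.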
DIY Sharding Challenges
Applications must be modified to support multiple shards
• Maintaining DB ops and IPs in the app
• Non-optimized sharding strategies
– No good way to maintain global tables replicated across all databases
• Sacrifices development agility and adds administrative complexity
• Results in database silos
• Database ecosystem breaks because the application “conceals” sharding strategies internally
• Risks of data inconsistency
• Adding and removing databases is not supported, leading to overprovisioning
• Jeopardizes high availability, backups & disaster recovery
• Demands custom application code that can fail ACID compliance
Challenges exist because application code changes are required to support multiple database instances.
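A small sketch of what DIY sharding tends to look like inside an application, and why adding a database is painful. The host names and the modulo routing rule are assumptions made up for this example; real DIY schemes vary.

```python
# DIY sharding sketch: routing logic and host names live in the app.
# Host names and the modulo rule are hypothetical.
SHARD_HOSTS = ["db-a.internal", "db-b.internal", "db-c.internal", "db-d.internal"]


def host_for(user_id: int, hosts=SHARD_HOSTS) -> str:
    """Pick a database host with simple modulo routing."""
    return hosts[user_id % len(hosts)]


def moved_fraction(num_keys: int, old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in range(num_keys) if k % old_count != k % new_count)
    return moved / num_keys


# Going from 4 to 5 databases under modulo routing.
fraction = moved_fraction(1000, 4, 5)
```

Under these assumptions, fraction is 0.8: four out of five keys would have to move to a different database when a fifth one is added, which is one reason teams overprovision up front instead.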
ScaleBase Data Distribution Overview
Data Distribution: Application Experience
Without ScaleBase: App must be customized to support shards
With ScaleBase: App sees ONE database…
…and doesn’t require any customization
ScaleBase acts as a proxy between the app and the
database, virtualizing the database environment
Manual Sharding versus ScaleBase
Sharding Limitations:
• Major app rewrite, maintaining code
• Maintaining DB ops & IPs in the app
• Administration/3rd party tools are broken
• DB silos/Database ecosystem is blind
– Application “hides” sharding strategies
• Non-optimized data distribution policy
– No good way to maintain global tables replicated across all databases
• Sacrifices development agility
• Adding/removing DBs is not supported
• Risks of data inconsistency
• Demands custom application code that can fail ACID compliance
• Jeopardizes high availability, backups, and disaster recovery
ScaleBase Benefits:
• No hard-coding application re-writes
• Unlimited scalability
• Improve performance
• Real time elasticity
• ACID compliance
• Verified data consistency
• Real time monitoring, traffic analysis
• Carefully analyze distribution policy
• Enable system upgrades and updates
• Simplified, centralized admin
– Adding users
– Changing schemas
– Maintenance scripts
– Management queries
Typical ScaleBase Data Traffic Manager Deployment
[Diagram: Application Servers, BI, and Management tools; ScaleBase; Databases A through D, each with a replica (Replica A through D). Callouts: Unlimited Scale; ScaleBase Architecture is Fault Tolerant.]
ScaleBase Data Distribution – In Detail
ScaleBase Enables MySQL Scale Out without Re-writing Apps
• Data distribution and scale-out is part of the database
architecture, not the application
• One IP to connect to, and “see a unified database”
– The application
– Entire ecosystem (ETL, mysqldump, PHPMyAdmin)
– No special sharding wizard developer
– No app re-design, re-dev, re-QA, re-test, re-deploy
– No hard-coded variables lost in the code
– No special documentation
ScaleBase Enables Scale Out on AWS and Private Clouds
• A virtualized DB environment makes it easy to change real infrastructure, because it’s decoupled from the application
• No cloud, on its own, makes your database elastic
• ScaleBase enables elasticity of MySQL in the cloud (EC2, RDS, etc.)
Scale-up quickly hits AWS’s tiered configuration limits
Scale-out is unlimited and gives cloud flexibility
ScaleBase Supports Scale Out on Single & Multiple Machines
Advantages of several shards on one machine:
– Several smaller MySQL instances better utilize cores and memory
– When data grows, each instance can later migrate to a bigger machine of its own
Advantages of several shards on multiple machines:
– Leverage commodity hardware
– When a machine reaches its limits, ScaleBase enables online data redistribution (resharding) and shard splitting
ScaleBase Enables Splitting Shards
• ScaleBase also redistributes data across the array to eliminate hot
spots, splitting the hot spot into two databases
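The idea of splitting a hot spot into two databases can be sketched as follows. This is not ScaleBase's implementation: it assumes range-based shards and a midpoint split, and RangeShard and split_shard are hypothetical names.

```python
# Shard-split sketch, assuming range-based shards split at the midpoint.
# RangeShard and split_shard are hypothetical names for illustration.
from dataclasses import dataclass, field


@dataclass
class RangeShard:
    low: int    # inclusive lower bound of the key range
    high: int   # exclusive upper bound of the key range
    rows: dict = field(default_factory=dict)


def split_shard(shard: RangeShard):
    """Split one shard into two halves at the midpoint of its key range."""
    mid = (shard.low + shard.high) // 2
    left = RangeShard(shard.low, mid)
    right = RangeShard(mid, shard.high)
    for key, value in shard.rows.items():
        target = left if key < mid else right
        target.rows[key] = value
    return left, right


hot = RangeShard(0, 100, {k: f"row-{k}" for k in (3, 40, 55, 99)})
left, right = split_shard(hot)
```

An online split would also have to deal with writes that arrive while rows are being copied; the sketch leaves that out.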
ScaleBase Re-balances Shards
• Special analysis and alerts about approaching limits
• ScaleBase dynamically redistributes data (resharding), moving data across the array from over-utilized to under-utilized databases
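One way to picture rebalancing is a greedy loop that moves buckets of data from the busiest shard to the least busy one. This is a sketch under assumptions: the bucket model, the load numbers, and the rebalance function are all hypothetical, and the slides do not describe the actual algorithm.

```python
# Greedy rebalancing sketch. Buckets, loads and the function name are
# hypothetical; the real resharding algorithm is not described in the deck.
def rebalance(shards: dict, max_moves: int = 100):
    """Move buckets from the busiest shard to the least busy one.

    `shards` maps shard name -> {bucket_id: load}. Moves are applied in
    place and returned as a list of (bucket_id, source, target) tuples.
    """
    moves = []
    for _ in range(max_moves):
        load = {name: sum(b.values()) for name, b in shards.items()}
        src = max(load, key=load.get)
        dst = min(load, key=load.get)
        gap = load[src] - load[dst]
        # A move only helps if the bucket is smaller than the gap.
        candidates = [b for b, l in shards[src].items() if 0 < l < gap]
        if not candidates:
            break
        bucket = max(candidates, key=lambda b: shards[src][b])
        shards[dst][bucket] = shards[src].pop(bucket)
        moves.append((bucket, src, dst))
    return moves


cluster = {"A": {1: 50, 2: 30, 3: 20}, "B": {4: 10}}
moves = rebalance(cluster)
```

In this example a single move of bucket 1 brings the loads from 100 vs. 10 to 50 vs. 60, after which no further move would narrow the gap.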
ScaleBase Provides Optimal Data Distribution Policies
A good data distribution policy ensures that a specific
transaction is directed to a specific database
[Diagram: 1,000 transactions distributed across four databases, 250 transactions each.]
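The 1,000-to-250 split can be reproduced with a toy routing policy. This is an idealized sketch: the modulo rule and sequential keys are assumptions, and the database names are borrowed from the deployment diagram labels.

```python
# Distribution-policy sketch: the same key always goes to the same database.
# The modulo rule and sequential keys are assumptions for illustration.
from collections import Counter

DATABASES = ["A", "B", "C", "D"]


def route(transaction_key: int) -> str:
    """Direct every transaction with a given key to one specific database."""
    return DATABASES[transaction_key % len(DATABASES)]


counts = Counter(route(key) for key in range(1000))
```

With 1,000 sequential keys, counts holds 250 for each of the four databases. Real workloads are rarely this even, which is why the choice of distribution key matters.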
ScaleBase Eliminates Data Silos
When a query needs data from several databases, ScaleBase:
– Runs the query in parallel on all databases
– Aggregates results into one meaningful result-set to be returned to the client – the same result-set that would have been returned from a single DB!
– Including cross-db GROUP BY, ORDER BY, aggregate functions
– Including cross-db JOIN operations
– Enables 2-phase commit for transactions spanning multiple databases
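The parallel-query-then-aggregate step can be sketched as a scatter-gather. This is an illustration, not ScaleBase's engine: the in-memory lists stand in for per-database tables, and every function name is hypothetical.

```python
# Scatter-gather sketch. In-memory lists stand in for each database's
# `orders` table; all names here are hypothetical.
import heapq
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

SHARDS = [
    [{"country": "US", "amount": 30}, {"country": "DE", "amount": 10}],
    [{"country": "US", "amount": 5}, {"country": "FR", "amount": 20}],
]


def local_group_by(rows):
    """Partial result for: SELECT country, SUM(amount) ... GROUP BY country."""
    totals = Counter()
    for row in rows:
        totals[row["country"]] += row["amount"]
    return totals


def local_order_by(rows):
    """Partial result for: SELECT amount ... ORDER BY amount."""
    return sorted(row["amount"] for row in rows)


def scatter_gather():
    """Run both queries on every shard in parallel, then merge the results."""
    with ThreadPoolExecutor() as pool:
        partial_totals = list(pool.map(local_group_by, SHARDS))
        partial_sorted = list(pool.map(local_order_by, SHARDS))
    totals = sum(partial_totals, Counter())       # merge partial SUMs
    ordered = list(heapq.merge(*partial_sorted))  # merge sorted streams
    return dict(totals), ordered


totals, ordered = scatter_gather()
```

SUM merges by adding the partial sums, and ORDER BY merges already-sorted streams. An aggregate such as AVG cannot be merged by averaging the per-shard averages; each shard would need to return its sum and its count.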
ScaleBase Provides a Fault Tolerant Architecture
[Diagram: Application Servers, BI, and Management tools; Databases A through D, each with a replica (Replica A through D). Callouts: Fully Redundant; Resilience to failures; Scheduled maintenance without downtime.]
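The database-plus-replica pairing suggests a simple failover rule: use the primary while it is healthy, otherwise use its replica. The sketch below assumes exactly that rule; the class, the node names, and the health tracking are hypothetical and say nothing about how ScaleBase detects failures.

```python
# Failover sketch: one logical database backed by a primary and a replica.
# Class, node names and health tracking are hypothetical.
class ReplicatedDatabase:
    def __init__(self, name: str):
        self.nodes = [f"{name}-primary", f"{name}-replica"]
        self.down = set()

    def mark_down(self, node: str):
        """Record a failure or the start of scheduled maintenance."""
        self.down.add(node)

    def mark_up(self, node: str):
        """Record that the node is healthy again."""
        self.down.discard(node)

    def active_node(self) -> str:
        """Return the first healthy node, preferring the primary."""
        for node in self.nodes:
            if node not in self.down:
                return node
        raise RuntimeError("no healthy node available")


db = ReplicatedDatabase("A")
db.mark_down("A-primary")   # e.g. scheduled maintenance on the primary
serving = db.active_node()  # traffic now goes to the replica
```

Because the application only talks to the layer in front of the databases, a switch like this does not have to be visible to it.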
Summary
ScaleBase Delivers Scalability
• Scale to Unlimited Throughput
• No Specialized Hardware
• No Re-architecture
• No Application Rewrites
ScaleBase Webinar: Scaling MySQL - Sharding Made Easy!
Detailed Scale Out Case Studies
Large Chip Co
• Scalability
• Multiple Apps
• Multiple growing users
• Availability
• MySQL DB
Solar Edge
• Next Gen Monitoring App
• Massive Scale
• Monitors real time data from thousands of distributed systems
Mozilla
• New Product / Next Gen App / AppStore
• Scalability
• Geo-clustering
AppDynamics
• Next gen APM company
• Scalability for the Netflix implementation
ScaleBase Deployment
Environments
– Public cloud (AWS, Rackspace, any)
– Private cloud
– Hosted / on-premise
Databases Supported
– MySQL 5.1, 5.5, 5.6 (under certification)
– AWS RDS MySQL 5.1, 5.5
– MariaDB 10.0 (under certification)
Path to Scale-Out:
1. Data Distribution Policy Analysis
2. Functional Test
3. Load Test
4. Production Migration (safe, online)
Summary
ScaleBase provides cost-effective Scale-Out solutions
• Scale to an infinite number of users, data and transactions
• Improve performance
• No application rewrites
• Real-time elasticity
• ACID Compliant
• Expert analysis and simple deployment
• Leverage existing MySQL ecosystem/skills
• Improve database visibility with real-time monitoring
• Simplified, centralized administration
Questions (please enter directly into the GTW side panel)
paul.campaniello@scalebase.com
doron.levari@scalebase.com
www.ScaleBase.com
617.630.2800
Additional Resources
http://www.scalebase.com/blog/
http://www.scalebase.com/resources/
@scalebase
Thank You
