Galera Replicator IRL
Art van Scheppingen
Head of Database Engineering
Overview
1. Who are we?
2. What is Galera?
3. What is Spil Games using Galera for?
4. What have we learned?
5. Future technologies
6. Conclusion
Who are we?
Who is Spil Games?
Facts
• Company founded in 2001
• 350+ employees worldwide
• 180M+ unique visitors per month
• Over 60M registered users
• 45 portals in 19 languages
• Casual games
• Social games
• Real time multiplayer games
• Mobile games
• 35+ MySQL clusters
• 60k queries per second (3.5 billion queries per day)
Geographic Reach
180 Million Monthly Active Users(*)

Source: (*) Google Analytics, August 2012

What is Galera?
How to get Highly Available and beyond
What is Galera?
1. Replication plugin for MySQL by Codership
• Synchronous (parallel) replication
• Supports InnoDB
• MyISAM “works”
• Committing transactions actually replicates data
2. Allows clustering of nodes
• Minimum of 3 nodes for HA
• Galera Arbitrator allows 2 data nodes
• The nodes that hold quorum form the Primary Component
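The three-node minimum follows from majority quorum. The sketch below is only an illustration of that arithmetic, assuming every member has an equal vote and that failures happen at once; the function names are made up for this example and are not part of Galera.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True if a partition sees a strict majority of the cluster.

    Assumption for this sketch: every member carries one vote.
    """
    if cluster_size < 1 or not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("expected cluster_size >= 1 and 0 <= reachable_nodes <= cluster_size")
    return 2 * reachable_nodes > cluster_size


def tolerated_failures(cluster_size: int) -> int:
    """Largest number of simultaneous member losses that still leaves a majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return (cluster_size - 1) // 2
```

With two members, losing one leaves exactly half of the votes, which is not a majority. Two data nodes plus an arbitrator give three voters, so one loss can be survived.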
How does Galera work?
[Diagram: Server-1, Server-2, ..., Server-n and three MySQL nodes joined by Galera. Label: Connect/read/write to any node.]
Synchronous replication
[Diagram: Server-n commits on one MySQL node; Galera replication carries the transaction to the other MySQL nodes. Labels: commit, Client receives OK, Transaction applied to slaves.]
High Availability (1)
[Diagram: Server-1, Server-2, ..., Server-n and three MySQL nodes joined by Galera.]
High Availability (2)
[Diagram: Server-1, Server-2, ..., Server-n, one shared Load balancer / Query router, and three MySQL nodes joined by Galera.]
High Availability (3)
[Diagram: Server-1, Server-2, ..., Server-n, each with its own Load balancer, and three MySQL nodes joined by Galera.]
High Availability (4)
[Diagram: Server-1, Server-2, ..., Server-n behind a Load balancer (port 3306 read, 3307 write). Of the three MySQL nodes joined by Galera, one is read+write and two are read only.]
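A port-based read/write split like the one on this slide can be sketched as follows. This is a toy model, not the load balancer configuration used at Spil Games: the class name, the node names and the round-robin policy are assumptions for illustration, and the port numbers are the defaults taken from this slide.

```python
from itertools import cycle
from typing import Iterator, List


class PortRouter:
    """Toy model: one port for writes (single node), one port for reads (all nodes)."""

    def __init__(self, nodes: List[str], read_port: int = 3306, write_port: int = 3307):
        if not nodes:
            raise ValueError("at least one node is required")
        self.read_port = read_port
        self.write_port = write_port
        # Assumption: the first node is the read+write node.
        self._writer = nodes[0]
        # Reads rotate over every node, including the read+write one.
        self._readers: Iterator[str] = cycle(nodes)

    def backend_for(self, port: int) -> str:
        """Pick the backend node for a connection arriving on the given port."""
        if port == self.write_port:
            return self._writer
        if port == self.read_port:
            return next(self._readers)
        raise ValueError(f"no backend pool is bound to port {port}")
```

Sending all writes to a single node keeps the cluster multi-master capable while avoiding write conflicts between nodes.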
Node joining SST (State Snapshot Transfer)
[Diagram: Server-1, Server-2, ..., Server-n behind a Load balancer; three MySQL nodes joined by Galera with synchronous replication. A new MySQL node requests to join the cluster and receives an SST. Labels: Cluster drains node; read/write to two nodes.]
Node joining IST (Incremental State Transfer)
[Diagram: Server-1, Server-2, ..., Server-n behind a Load balancer; three MySQL nodes joined by Galera with synchronous replication. An existing MySQL node requests to join the cluster and receives an IST.]
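The difference between the two join paths can be sketched as a decision rule: an incremental transfer is only possible when the joiner belongs to the same cluster history and the donor still has every write-set the joiner is missing. The code below is a simplified model under that assumption; the class and field names are invented for this example and do not mirror Galera internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinerState:
    state_uuid: str  # identifier of the cluster history the joiner last saw
    last_seqno: int  # last applied write-set; -1 when unknown (e.g. a new node)


@dataclass(frozen=True)
class DonorState:
    state_uuid: str
    cache_first_seqno: int  # oldest write-set still held in the donor's cache
    last_seqno: int  # newest write-set applied on the donor


def choose_state_transfer(joiner: JoinerState, donor: DonorState) -> str:
    """Return "IST" when the gap can be replayed from the donor's cache, else "SST"."""
    if joiner.state_uuid != donor.state_uuid:
        return "SST"  # different history: a full snapshot is needed
    if joiner.last_seqno < 0:
        return "SST"  # position unknown: nothing to resume from
    if joiner.last_seqno > donor.last_seqno:
        raise ValueError("joiner is ahead of the donor; this model does not cover that case")
    # Every write-set after the joiner's position must still be cached.
    if joiner.last_seqno + 1 >= donor.cache_first_seqno:
        return "IST"
    return "SST"
```

This is why a node that was only briefly away usually rejoins quickly, while a new or long-absent node needs the full snapshot.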
Galera replication over WAN
[Diagram: MySQL nodes in DC1 and DC2 connected by Galera replication; Server-n commits on one node. Label: Commit is delayed by RTT.]
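What an RTT-delayed commit means for a single client can be estimated with simple arithmetic. The sketch below assumes each commit waits for one full round trip between the data centers and that one connection commits strictly one transaction after another; the function names and the example figures are illustrative only.

```python
def commit_latency_ms(local_commit_ms: float, rtt_ms: float) -> float:
    """Estimated commit time when one WAN round trip is added to the local commit."""
    if local_commit_ms < 0 or rtt_ms < 0:
        raise ValueError("durations must not be negative")
    return local_commit_ms + rtt_ms


def max_sequential_commits_per_second(rtt_ms: float) -> float:
    """Upper bound for one connection that commits serially, ignoring local work."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return 1000.0 / rtt_ms
```

For example, at a 20 ms round trip a single connection cannot exceed 50 serial commits per second. Many connections committing in parallel are not bound by this per-connection limit.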
WAN replication Galera 2.x
[Diagram: Node 1 to Node 6 spread over DC1 and DC2.]
WAN replication Galera 3.x
[Diagram: Node 1 to Node 6 spread over DC1 and DC2.]
What are we using Galera for?
Synchronous replication for the masses
Our systems
1. Legacy services databases
• MySQL Master-Master
2. SSP (Spil Storage Platform)
• MySQL Master-Master (to be phased out)
• Galera
3. ROAR (Read Often, Alter Rarely)
• Galera
Master-Master setup used at Spil Games
[Diagram: Server-1, Server-2, ..., Server-n connect to a MySQL active master (read+write on db-something (192.168.1.1), read only on db-something-r1 (192.168.1.2)) and a MySQL inactive master (db-something-r2 (192.168.1.3)). The masters use asynchronous replication and are managed by MMM.]
Master-Master setup used at Spil Games
[Diagram: the same setup managed by MMM, extended with two MySQL slaves. Addresses shown: read+write on db-something (192.168.1.1); read only on db-something-r1 (192.168.1.2), db-something-r2 (192.168.1.3), db-something-r3 (192.168.1.4) and db-something-r4 (192.168.1.5).]
Migrating legacy dbs to Galera (lab)
[Diagram: the inactive masters legacy1, legacy2 and legacy3 feed a Galera cluster of MySQL nodes. Steps shown: Clone database (innobackupex), Feed database dump (mysqldump), Start slaving.]
Scaling Galera (1)
[Diagram: Server-1, Server-2, ..., Server-n behind a Load balancer (port 3306 write+read), in front of MySQL nodes grouped by Galera.]
Scaling Galera (2)
[Diagram: Server-1, Server-2, ..., Server-n behind a Load balancer (port 3306 write, 3307 read). MySQL nodes joined by Galera feed read-only MySQL nodes through asynchronous replication.]
Why consolidate legacy systems?
1. Around 20 legacy database clusters
• 50 servers in total
2. Maintenance
• Master-Master requires a lot of (manual) maintenance
3. Replacement is needed
• 35 of them will be older than 3 years in 2014
4. Current state: tested in lab
SSP (Spil Storage Platform)
• Storage API between application and databases
• All data is sharded
• User
• Function
• Location
• Every cluster (two masters) will contain two shards
• Data written interleaved
• HA for both shards
• Both masters active and “warmed up”
[Diagram: SSP in front of Shard 1 and Shard 2.]
SSP Master-Master setup
[Diagram: Server-1, Server-2, ..., Server-n read+write to two MySQL active masters, db-ssp001 (192.168.2.1) and db-ssp002 (192.168.2.2). The masters use asynchronous replication and are managed by MMM.]
SSP Master-Master setup
[Diagram: the same setup with one MySQL master broken. All read+write traffic from Server-1, Server-2, ..., Server-n goes to the remaining active master, where both db-ssp002 (192.168.2.2) and db-ssp001 (192.168.2.1) are shown. Managed by MMM.]
SSP Galera setup
[Diagram: Server-1, Server-2, ..., Server-n behind a Load balancer; read/write to any of three MySQL nodes joined by Galera with synchronous replication.]
Current state of the SSP
1. Total of 4 old style SSP shard nodes (2 clusters)
2. Total of 6 Galera SSP shard nodes (2 clusters)
3. Add Galera nodes/clusters when necessary
What have we learned so far?
Pitfalls, hurdles, etc.
Creating backups
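1. Two ways to make backups:
• Issue an SST
• Either mysqldump or innobackupex as the SST method
• Run a regular innobackupex backup
• Use --galera-info to record the node's replication state
• SET GLOBAL wsrep_desync=ON to desync the node for the duration of the backup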
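The desync-then-backup sequence can be wrapped so that the node is always resynced, even when the backup fails. This is a minimal sketch rather than the tooling used at Spil Games: `execute` is a hypothetical callable that runs one SQL statement on the node being backed up, and `backup_command` only builds an argument list for innobackupex with its `--galera-info` option.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def desynced(execute: Callable[[str], None]) -> Iterator[None]:
    """Desync the node while the body runs, then switch desync off again."""
    execute("SET GLOBAL wsrep_desync = ON")
    try:
        yield
    finally:
        # Runs even if the backup raised, so the node is not left desynced.
        execute("SET GLOBAL wsrep_desync = OFF")


def backup_command(target_dir: str) -> List[str]:
    """Argument list for a regular innobackupex run that records the Galera state."""
    return ["innobackupex", "--galera-info", target_dir]
```

A caller would run the backup command inside `with desynced(execute):`, for example through `subprocess.run(backup_command("/backups/today"), check=True)`, where the target directory is a placeholder.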
Backup SST
[Diagram: Server-1, Server-2, ..., Server-n behind a Load balancer; three MySQL nodes joined by Galera with synchronous replication. A backup receiver requests an SST from the cluster. Labels: Cluster drains node; read/write to two nodes; Backup receiver done.]
Backup Innobackupex
[Diagram: Server-1, Server-2, ..., Server-n behind a Load balancer; three MySQL nodes joined by Galera with synchronous replication. One node is set to wsrep_desync=ON, streams its backup to BackupPC, and is set back to wsrep_desync=OFF. Label: read/write to two nodes.]
Restoring backups
1. A restored backup can be used to spare new joiners a full SST
2. Automated backup verification
• Restores a (randomly) chosen backup
• Installs the necessary MySQL version (5.1/5.5)
• Performs basic checks
• Enables replication
• Will not work fully as it needs a working cluster to join
Monitoring
1. Cluster
• Nodes in the cluster
• Warning at 2, critical at 1
• Availability of the address
2. Load balancer
• Node checks
3. Performance monitoring
• Adding metrics to mysql_statsd is easy
• wsrep_flow_control
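The node-count thresholds above map directly onto a monitoring check. The sketch below assumes the node count has already been read from the `wsrep_cluster_size` status variable; the function name and the status strings are choices made for this example.

```python
def cluster_size_status(cluster_size: int, warning_at: int = 2, critical_at: int = 1) -> str:
    """Classify the number of nodes in the cluster: warning at 2, critical at 1."""
    if cluster_size < 0:
        raise ValueError("cluster_size must not be negative")
    if cluster_size <= critical_at:
        return "CRITICAL"
    if cluster_size <= warning_at:
        return "WARNING"
    return "OK"
```

For a three-node cluster this warns as soon as one node is gone and goes critical when only a single node is left.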
Flow control
1. Usage of replication threads
• Scale from 0.0 to 1.0
2. Recommended to stay below 0.1 (10% blocked)
3. Adding more nodes will not solve your problem
4. Increase replication threads
• Recommended 2*CPU cores
• What if 64 is not enough?
• How do you close flood gates?
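Both rules of thumb on this slide are easy to encode. The sketch assumes the 0.0 to 1.0 value comes from the `wsrep_flow_control_paused` status variable and that the thread count is applied through the `wsrep_slave_threads` setting; the function names are invented for illustration.

```python
def flow_control_ok(paused_fraction: float, threshold: float = 0.1) -> bool:
    """True while the paused fraction stays below the recommended 0.1 (10% blocked)."""
    if not 0.0 <= paused_fraction <= 1.0:
        raise ValueError("paused_fraction must be between 0.0 and 1.0")
    return paused_fraction < threshold


def recommended_replication_threads(cpu_cores: int) -> int:
    """Rule of thumb from the slide: two replication threads per CPU core."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return 2 * cpu_cores
```

On a 32-core machine the rule gives 64 threads, the figure questioned on the slide.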
Other things we bumped into…
1. MySQL version updates
• Update one by one
• PXC SST changes
2. Availability after restart
• Joins cluster after IST/SST
• LRU still loading
3. Nondescriptive errors during SST
• Local user authentication (after starting mysqld with sudo!)
4. Schema changes
Future for Galera at Spil Games
What will we do in the near future?
OpenStack
1. Offer database as a service (DAAS) to our (internal) customers
2. Spawning (automated) database nodes and clusters when necessary
3. Mix and match Galera and regular MySQL replication
WAN Replication
1. No immediate use case (yet)
• No need for WAN in sharded environment
• Game catalogue might need it in the future
2. Wait for Galera 3.0
• Datacenter awareness
MaxScale
1. Beta testing MaxScale for SkySQL
• Works flawlessly in the lab (so far)
• Not yet tested with mixed Galera/MySQL replication
2. MaxScale itself is not HA (yet)
• Keepalived?
Conclusion
What is our verdict?
Conclusion(s)
1. Galera definitely lives up to expectations
2. Decreased cluster-wide performance
3. Increased replication performance
4. High investment in time for initial setup/tools
5. Maintenance is easier
6. Well worth the investment for us
Thank you!
• Presentation can be found at:
http://spil.com/fosdem2014
• mysql_statsd can be found at:
http://spil.com/mysqlstatsd
http://github.com/spilgames/mysql-statsd
• If you wish to contact me:
Email: art@spilgames.com
Twitter: @banpei
• Engineering @ Spil Games
Blog: http://engineering.spilgames.com
Twitter: @spilengineering
Photo sources
Our current HA environment:
http://thinkaurelius.com/2013/03/30/titan-server-from-a-single-server-to-a-highly-available-cluster/
What we have learned so far:
http://renaissanceronin.wordpress.com/2009/10/05/playing-with-plasma-cutters/
Near future:
http://www.example-infographics.com/envisioning-the-near-future-of-technology/
Conclusion:
http://www.flickr.com/photos/louisephotography/5796499806/in/photostream/

Spil Games @ FOSDEM: Galera Replicator IRL

Editor's Notes

  • #5 (Geographic Reach): so that may be the reason our name is not widely known.