Lessons Learned: Troubleshooting Replication
April 11, 2017
Sveta Smirnova
∙ MySQL Support engineer
∙ Author of
  ∙ MySQL Troubleshooting
  ∙ JSON UDF functions
  ∙ FILTER clause for MySQL
∙ Speaker
  ∙ Percona Live, OOW, FOSDEM, DevConf, HighLoad...
Sveta Smirnova
∙ Typical Replication Errors
∙ MySQL and MariaDB Replication: Must Know
∙ Master
∙ Slave IO thread
∙ Slave SQL thread
∙ Multithreaded slave
∙ Multi-master
Table of Contents
Typical Replication Errors
Replication Stopped
Slave Lags Behind the Master
Master Increases Resource Usage
Not a Full List
MySQL and MariaDB Replication: Must Know
Master                      Slave
                            <- Initiates
                            <- Requests a packet
Sends the packet ->
                            ... ?
Asynchronous
Master                      Slave
                            <- Initiates
                            <- Requests a packet
Sends the packet ->
Waits for "Ack"
                            <- Sends "Ack"
Semisynchronous Plugin
Master                      Storage Engine
Receives a change
Sends to SE ->
                            Writes into table
                            <- Returns control
Writes into binary log
Synchronizes ->             <- Synchronizes
Logical
IO thread                   SQL thread
Reads from the master
Stores in the relay log
                            <- Reads from the relay log
                            Executes
Two Kinds of Slave Threads
∙ Multiple SQL threads in MariaDB 10.0.5+ / MySQL 5.6+
∙ From the troubleshooting point of view
  ∙ Single IO thread
  ∙ Single relay log
  ∙ Slave lag still possible
  ∙ Error in one thread stops all
Several SQL Threads
∙ Multiple masters in MariaDB 10.0.1+ / MySQL 5.7+
∙ From the troubleshooting point of view
  ∙ Multiple sets of relay logs
  ∙ Multiple IO threads
  ∙ Multiple SQL threads
    ∙ MySQL: slave_parallel_workers for each channel
  ∙ Channels/sources are independent
    ∙ Error in one stops only one
  ∙ No automatic conflict resolution
Multi-Source (Multi-Channel)
∙ You must specify
  ∙ Name of the master’s binary log file
  ∙ Position
∙ From the troubleshooting point of view
  ∙ Event executes if it is at the current position
  ∙ Easy to skip
  ∙ Easy to move position backward
  ∙ No conflict resolution
Position-Based
∙ Each transaction has a unique number: GTID
∙ MySQL: AUTO_POSITION=1
∙ MariaDB: master_use_gtid = { slave_pos | current_pos }
∙ No need to specify binary log and position
Global Transaction Identifiers (GTID)
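To make the idea of a unique number per transaction concrete, here is a minimal Python sketch that expands MySQL-style GTID sets (`source_uuid:first-last`) and lists the transactions a slave has not executed yet. The helper names `parse_gtid_set` and `missing_on_slave` are hypothetical, and the UUID is only a sample value. On a real MySQL server the built-in function GTID_SUBTRACT() answers the same question; MariaDB GTIDs use a different `domain-server-sequence` format that this sketch does not handle.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:3' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid, set())
        for interval in intervals:
            # An interval is either a single number or 'first-last'.
            first, _, last = interval.partition("-")
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def missing_on_slave(master_set, slave_set):
    """Return {uuid: sorted list} of transactions executed on the master only."""
    master = parse_gtid_set(master_set)
    slave = parse_gtid_set(slave_set)
    missing = {}
    for uuid, numbers in master.items():
        diff = sorted(numbers - slave.get(uuid, set()))
        if diff:
            missing[uuid] = diff
    return missing


if __name__ == "__main__":
    uuid = "d318bc17-66dc-11e6-a471-30b5c2208a0f"  # sample value only
    print(missing_on_slave(f"{uuid}:1-10", f"{uuid}:1-6:9"))
```

Expanding every interval into a set keeps the sketch short; for very large GTID sets an interval-based comparison would be the better choice.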
Client                      Binary log
INSERT INTO ... ->
                            SET TIMESTAMP...
                            SET sql_mode...
                            INSERT INTO ...
Statement-Based Binary Log Format
Client                      Binary log
UPDATE ... ->
                            SET TIMESTAMP...
                            SET sql_mode...
                            Row before changes
                            Row with changes
Row-Based Binary Log Format
∙ Error log file
∙ At the slave
  ∙ SHOW SLAVE STATUS
  ∙ MySQL: Tables in Performance Schema
  ∙ System database mysql
∙ At the master
  ∙ SHOW MASTER STATUS
  ∙ SHOW BINLOG EVENTS
  ∙ mysqlbinlog
∙ Percona Toolkit
∙ MySQL Utilities
Main Instruments
∙ Always available, requires setup
∙ Asynchronous
∙ Master
  ∙ Keeps all changes in the binary log
  ∙ Two formats: ROW and STATEMENT
∙ Slave
  ∙ IO thread reads from the master into relay log
  ∙ SQL thread executes updates
    Multiple SQL threads in MariaDB 10.0.5+ / MySQL 5.6+
    Multiple channels/sources (masters) in MariaDB 10.0.1+ / MySQL 5.7+
∙ GTID in MariaDB 10.0.2+ / MySQL 5.6+
Replication Must Know: Summary
Master
∙ More writes
  ∙ binlog_row_image = FULL | MINIMAL | NOBLOB
  ∙ binlog_cache_size
    Watch Binlog_cache_disk_use
  ∙ binlog_stmt_cache_size
    Watch Binlog_stmt_cache_disk_use
∙ Synchronization
  ∙ sync_binlog
  ∙ Do not disable!
  ∙ You may set it greater than 1
Performance
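Watching Binlog_cache_disk_use is easier as a ratio against Binlog_cache_use. The Python sketch below assumes both counters were already read from SHOW GLOBAL STATUS into a dict; the function names and the 5% threshold are arbitrary illustrations, not recommendations from this talk.

```python
def binlog_cache_disk_ratio(status, prefix="Binlog_cache"):
    """Share of transactions whose binlog cache spilled to a temporary file."""
    used = int(status.get(f"{prefix}_use", 0))
    on_disk = int(status.get(f"{prefix}_disk_use", 0))
    return on_disk / used if used else 0.0


def needs_bigger_cache(status, prefix="Binlog_cache", threshold=0.05):
    """True when the spill ratio exceeds the (arbitrary) threshold."""
    return binlog_cache_disk_ratio(status, prefix) > threshold


if __name__ == "__main__":
    # Counter values are invented for the example.
    sample = {
        "Binlog_cache_use": "2000",
        "Binlog_cache_disk_use": "300",
        "Binlog_stmt_cache_use": "50",
        "Binlog_stmt_cache_disk_use": "0",
    }
    print(needs_bigger_cache(sample))                       # transactional cache
    print(needs_bigger_cache(sample, "Binlog_stmt_cache"))  # statement cache
```

Passing `"Binlog_stmt_cache"` as the prefix applies the same check to binlog_stmt_cache_size.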
∙ Binary log lifetime
  ∙ expire_logs_days
∙ Synchronization
  ∙ SBR is not safe with READ COMMITTED and READ UNCOMMITTED
∙ Order of records in the binary log
  ∙ Non-deterministic events and SBR
Behavior
Slave IO thread
∙ SHOW SLAVE STATUS
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
...
Last_IO_Errno: 1045
Last_IO_Error: error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 1
Last_SQL_Errno: 0
Last_SQL_Error:
...
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp: 160824 03:18:36
Last_SQL_Error_Timestamp:
∙ P_S.replication_connection_status
mysql> select * from performance_schema.replication_connection_status\G
*************************** 1. row ***************************
CHANNEL_NAME:
GROUP_NAME:
SOURCE_UUID:
THREAD_ID: NULL
SERVICE_STATE: CONNECTING
COUNT_RECEIVED_HEARTBEATS: 0
LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00
RECEIVED_TRANSACTION_SET:
LAST_ERROR_NUMBER: 1045
LAST_ERROR_MESSAGE: error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 4
LAST_ERROR_TIMESTAMP: 2016-08-24 03:21:36
1 row in set (0,01 sec)
∙ Error log
2016-08-24T00:18:36.077384Z 3 [ERROR] Slave I/O for channel '': error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 1, Error_code: 1045
2016-08-24T00:19:36.299011Z 3 [ERROR] Slave I/O for channel '': error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 2, Error_code: 1045
2016-08-24T00:20:36.485315Z 3 [ERROR] Slave I/O for channel '': error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 3, Error_code: 1045
2016-08-24T00:21:36.677915Z 3 [ERROR] Slave I/O for channel '': error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 4, Error_code: 1045
2016-08-24T00:22:36.872066Z 3 [ERROR] Slave I/O for channel '': error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 5, Error_code: 1045
∙ Access
$ perror 1045
MySQL error code 1045 (ER_ACCESS_DENIED_ERROR): Access denied for user '%-.48s'@'%-.64s' (using password: %s)
  ∙ MySQL client with the slave’s login and password
$ mysql -h127.0.0.1 -P13000 -uroot -pbar
Warning: Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
  ∙ SHOW GRANTS
mysql> SHOW GRANTS;
+----------------------------------+
| Grants for foo@%                 |
+----------------------------------+
| GRANT SELECT ON *.* TO 'foo'@'%' |
+----------------------------------+
  ∙ Fix privileges on the master
  ∙ Restart slave
Network
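The first check on this slide can be scripted. The Python sketch below parses the vertical output of SHOW SLAVE STATUS into a dict and maps the IO thread fields to a hint. The function names and the wording of the hints are hypothetical; only the field names (Slave_IO_Running, Last_IO_Errno) and error code 1045 come from the output shown above.

```python
def parse_vertical(output):
    """Turn the 'Key: value' lines of vertical SHOW output into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        # Lines without a 'Key:' prefix (row separators, '...') are ignored.
        if sep and key and " " not in key:
            fields[key] = value.strip()
    return fields


def diagnose_io_thread(fields):
    """Map IO thread fields to a short, human-readable hint."""
    state = fields.get("Slave_IO_Running", "")
    errno = int(fields.get("Last_IO_Errno") or 0)
    if state == "Yes" and errno == 0:
        return "IO thread is running"
    if errno == 1045:
        return "access denied: check replication user, password and grants"
    if state == "Connecting":
        return "cannot connect: check network, host and port"
    return f"IO thread state {state!r}, error {errno}"


if __name__ == "__main__":
    sample = """
    Slave_IO_Running: Connecting
    Slave_SQL_Running: Yes
    Last_IO_Errno: 1045
    Last_IO_Error: error connecting to master 'root@127.0.0.1:13000' - retry-time: 60 retries: 1
    """
    print(diagnose_io_thread(parse_vertical(sample)))
```

The parser splits on the first colon only, so values that contain colons themselves, such as the host and port in Last_IO_Error, stay intact.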
∙ Regular performance troubleshooting
∙ Check with command line client
∙ Troubleshooting hardware resource usage webinar
Performance
Slave SQL thread
∙ One master - one slave
  ∙ Different data
    Slave cannot execute event from the relay log
    Different errors on master and slave
  ∙ Slave lags behind the master
∙ Circular replication and other writes in addition to SQL thread
  ∙ Different data
SQL thread: typical issues
∙ Did the table change outside of replication?
  ∙ How?
  ∙ Can it cause a conflict with changes on the master?
∙ Are table structures identical?
  ∙ Percona Toolkit
    pt-table-checksum, pt-table-sync
  ∙ MySQL Utilities
    MySQL: mysqlrplsync
    mysqldbcompare, mysqldiff
∙ Are changes in the correct order?
  ∙ mysqlbinlog
  ∙ Application logic on the master
Different Data
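The question "is the data different?" can be illustrated in a few lines. The Python sketch below compares two in-memory row lists chunk by chunk with CRC32, which is the general idea behind checksum-based comparison. It is not how pt-table-checksum is implemented; the function names, the chunk size and the sample rows are all invented for the example.

```python
import zlib


def chunk_checksums(rows, chunk_size=2):
    """CRC32 per chunk of rows; rows are expected in primary key order."""
    checksums = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        checksums.append(zlib.crc32(repr(chunk).encode("utf-8")))
    return checksums


def differing_chunks(master_rows, slave_rows, chunk_size=2):
    """Indexes of chunks whose checksums differ between master and slave."""
    master = chunk_checksums(master_rows, chunk_size)
    slave = chunk_checksums(slave_rows, chunk_size)
    length = max(len(master), len(slave))
    # Pad the shorter list so that missing chunks count as differences.
    master += [None] * (length - len(master))
    slave += [None] * (length - len(slave))
    return [i for i in range(length) if master[i] != slave[i]]


if __name__ == "__main__":
    master = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
    slave = [(1, "a"), (2, "b"), (3, "x"), (4, "d")]
    print(differing_chunks(master, slave))
```

Comparing per chunk instead of per table narrows down where the rows diverged, so only a small range needs to be inspected or synchronized.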
∙ Only with SBR
∙ Row-level locks
∙ Triggers
  ∙ SET GLOBAL sql_slave_skip_counter – only without GTIDs!
  ∙ Skip transaction – with GTIDs
  ∙ Synchronize tables!
∙ Different options: for old versions
  ∙ Start slave with master’s options
  ∙ Restart SQL thread
∙ Most issues are fixed in recent versions
Updates in the wrong order
∙ Threads
  ∙ Master executes changes in multiple threads
  ∙ Slave uses one
∙ Seconds_behind_master increases – you cannot rely 100% on this number!
∙ Tune slave performance
  ∙ Multi-threaded slave
    One thread for one database in MySQL 5.6
    There may be conflicts between multiple slave SQL threads
  ∙ Indexes on the slave
    Makes sense for SBR only
Slave lags behind the master
Multithreaded slave
∙ Single relay log
∙ Speed in a highly concurrent environment may be less than on master
  ∙ MySQL: slave_parallel_workers
  ∙ MySQL: slave_parallel_type=DATABASE | LOGICAL_CLOCK
  ∙ MariaDB: slave_parallel_threads
  ∙ MariaDB: slave_parallel_max_queued
  ∙ MariaDB: slave_domain_parallel_threads
  ∙ MariaDB: slave_parallel_mode=optimistic | conservative | aggressive | minimal | none
Performance
∙ Same methods as for single-threaded
∙ Error of one thread stops all
mysql> select WORKER_ID, SERVICE_STATE, LAST_SEEN_TRANSACTION, LAST_ERROR_NUMBER,
    -> LAST_ERROR_MESSAGE from performance_schema.replication_applier_status_by_worker\G
*************************** 1. row ***************************
WORKER_ID: 1
SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION: d318bc17-66dc-11e6-a471-30b5c2208a0f:4988
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
*************************** 2. row ***************************
WORKER_ID: 3
SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION: d318bc17-66dc-11e6-a471-30b5c2208a0f:4986
LAST_ERROR_NUMBER: 1032
LAST_ERROR_MESSAGE: Worker 2 failed executing transaction...
MariaDB [test]> select id, command, time, state, progress from information_schema.processlist
    -> where user='system user';
+----+---------+------+------------------------------------------------------------------+
| id | command | time | state |
+----+---------+------+------------------------------------------------------------------+
| 25 | Connect | 4738 | Waiting for master to send event |
| 24 | Connect | 5096 | Slave has read all relay log; waiting for the slave I/O thread t |
| 23 | Connect | 0 | Waiting for work from SQL thread |
| 22 | Connect | 0 | Unlocking tables |
| 21 | Connect | 0 | Update_rows_log_event::ha_update_row(-1) |
| 20 | Connect | 0 | Waiting for prior transaction to start commit before starting ne |
| 19 | Connect | 0 | Update_rows_log_event::ha_update_row(-1) |
| 18 | Connect | 0 | Update_rows_log_event::ha_update_row(-1) |
| 17 | Connect | 0 | Update_rows_log_event::find_row(-1)
...
Wrong Behavior
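Because an error in one worker stops all of them, the useful step is to separate the worker that actually failed from those that were merely stopped. A minimal Python sketch, assuming the rows of performance_schema.replication_applier_status_by_worker were already fetched as dicts; the function name `summarize_workers` is hypothetical and the sample rows mirror the MySQL output above.

```python
def summarize_workers(rows):
    """rows: dicts with the columns of replication_applier_status_by_worker."""
    stopped = [row["WORKER_ID"] for row in rows if row["SERVICE_STATE"] == "OFF"]
    failed = [
        (row["WORKER_ID"], row["LAST_ERROR_NUMBER"], row["LAST_SEEN_TRANSACTION"])
        for row in rows
        if row["LAST_ERROR_NUMBER"] != 0
    ]
    return {"stopped": stopped, "failed": failed}


if __name__ == "__main__":
    rows = [
        {"WORKER_ID": 1, "SERVICE_STATE": "OFF", "LAST_ERROR_NUMBER": 0,
         "LAST_SEEN_TRANSACTION": "d318bc17-66dc-11e6-a471-30b5c2208a0f:4988"},
        {"WORKER_ID": 3, "SERVICE_STATE": "OFF", "LAST_ERROR_NUMBER": 1032,
         "LAST_SEEN_TRANSACTION": "d318bc17-66dc-11e6-a471-30b5c2208a0f:4986"},
    ]
    summary = summarize_workers(rows)
    print("stopped workers:", summary["stopped"])
    print("failed (worker, errno, transaction):", summary["failed"])
```

The transaction reported for the failed worker is the one to inspect with mysqlbinlog.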
Multi-master
∙ Replication must be set up for each channel/source
∙ You may use masters with and without GTID at the same time
∙ Same issues as with regular replication
∙ MySQL: Filters work for all channels
∙ MariaDB: You may set up filters for each source
Specifics
∙ Issues on the master
  ∙ Same as for standalone server
  ∙ More writes and consistency checks
∙ Slave IO thread
  ∙ Common network issues
  ∙ mysql command line client for tests
∙ Slave SQL thread
  ∙ Regular query-related issues
  ∙ Regular storage engine issues
  ∙ Fewer execution threads than on master
Summary
∙ Basic Techniques – troubleshooting webinar
∙ Troubleshooting hardware webinar
∙ Introduction into SE troubleshooting webinar
∙ Percona Monitoring and Management
∙ Percona Toolkit
∙ MySQL Utilities
∙ Book MySQL High Availability
∙ MySQL Replication Team blog
∙ Replication in MariaDB
More Information
???
Time for questions
http://www.slideshare.net/SvetaSmirnova
https://twitter.com/svetsmirnova
https://github.com/svetasmirnova
Thank You!
Appendix
Replication in Details
∙ Is data up to date?
∙ Are these the same?
  ∙ Table structures
  ∙ Storage Engine
  ∙ Data
∙ Any write can break replication
Asynchronous: Slave Q&A
∙ Writes on master are slower than with asynchronous replication
∙ How many "Ack"s does the master wait for?
  ∙ Before 5.7: from a single slave
  ∙ Now in MySQL: rpl_semi_sync_master_wait_for_slave_count
  ∙ It would not wait for the others
∙ What does "Ack" mean?
  ∙ Event is written into the relay log
  ∙ No guarantee it is executed
∙ What happens in case of timeout?
  ∙ Replication becomes asynchronous
Semi-synchronous Replication
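The waiting and timeout rules on this slide can be modelled in a few lines of Python. This is a toy model only, not the plugin's implementation: `commit_with_semisync` is a hypothetical name, acknowledgement delays are passed in as plain numbers, and the `timeout` parameter (in seconds) merely stands in for the server variable rpl_semi_sync_master_timeout.

```python
def commit_with_semisync(ack_delays, wait_for_slave_count=1, timeout=10.0):
    """Model one commit on the master.

    ack_delays: seconds until each slave acknowledges, or None if it never does.
    Returns (mode, seconds_waited).
    """
    # An "Ack" only means the event reached the relay log, not that it ran.
    arrived = sorted(d for d in ack_delays if d is not None and d <= timeout)
    if len(arrived) >= wait_for_slave_count:
        # The master proceeds as soon as enough slaves have answered
        # and does not wait for the others.
        return "semi-synchronous", arrived[wait_for_slave_count - 1]
    # Not enough acknowledgements in time: fall back to asynchronous mode.
    return "asynchronous", timeout


if __name__ == "__main__":
    delays = [0.2, 0.5, None]  # invented values: the third slave never answers
    for count in (1, 2, 3):
        print(count, commit_with_semisync(delays, wait_for_slave_count=count))
```

With the invented delays above, requiring three acknowledgements can never succeed, so that commit waits for the whole timeout before the model switches to asynchronous mode.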
∙ Every change is written twice:
  SE files: logs, data, ...
  Binary log
∙ You can write on slave
Logical Replication
∙ Does not exist in MySQL/MariaDB!
  ∙ There are two closed-source solutions
∙ Master writes only into SE files
∙ Which are replicated to slave
∙ From the troubleshooting point of view
  ∙ IO: changes are written only once
  ∙ You cannot write on slave in parallel
  ∙ Any data inconsistency leads to replication break
Just for Comparison: Physical Replication
∙ Data transfer
∙ Execution
∙ Different
  ∙ Diagnostics
  ∙ Fixes
Two Kinds of Threads – Two Kinds of Issues
∙ Guaranteed that every transaction will be executed only once
∙ Simple failover
∙ It is not easy to skip a transaction
  ∙ MySQL: use mysqlslavetrx
  ∙ MariaDB: set global gtid_slave_pos='X-Y-Z';
∙ Be careful with expire_logs_days!
GTID
∙ Statement-based (SBR)
  ∙ Queries are written as received
  ∙ There is a risk of data inconsistency (non-safe)
    INSERT IGNORE
    LIMIT without ORDER BY
    Non-deterministic functions
    ...
∙ Row-based (RBR)
  ∙ Usually more data are written
    IO
    Transfer speed
    binlog_row_image
  ∙ Performance may be worse if the table does not have a primary (unique) key
    MariaDB may use any index
∙ Mixed
  ∙ Advantages of both formats
Binary Log Formats
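Why "LIMIT without ORDER BY" is unsafe under SBR can be shown with a toy model. In the Python sketch below a table is just a list whose order stands for the order in which the storage engine happens to return rows; all function names are hypothetical and nothing here touches a real server.

```python
def delete_limit_1(table):
    """Emulates DELETE ... LIMIT 1 without ORDER BY: removes whichever row
    comes first and returns it (the row image the master would log)."""
    return table.pop(0)


def replicate_statement(slave_table):
    """SBR: the slave re-executes the statement itself."""
    delete_limit_1(slave_table)


def replicate_row(slave_table, row_image):
    """RBR: the slave removes exactly the row recorded by the master."""
    slave_table.remove(row_image)


if __name__ == "__main__":
    master = [1, 2, 3]
    slave_sbr = [3, 1, 2]  # same rows as the master, different physical order
    slave_rbr = [3, 1, 2]

    deleted = delete_limit_1(master)
    replicate_statement(slave_sbr)
    replicate_row(slave_rbr, deleted)

    print("master:", sorted(master))
    print("SBR slave:", sorted(slave_sbr))
    print("RBR slave:", sorted(slave_rbr))
```

The statement-based slave deletes a different row than the master did, while the row-based slave stays identical. The price of RBR is visible too: it has to locate the exact row, which is where a missing primary key hurts.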
Appendix
Main Tools by Example
∙ Slave start
∙ Errors
2016-08-23T12:11:21.867440Z 4 [ERROR] Slave SQL for channel 'master-1': Could not execute
Update_rows event on table m2.t1; Can't find record in 't1', Error_code: 1032; handler error
HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 1213, Error_code: 1032
2016-08-23T12:11:21.867471Z 4 [Warning] Slave: Can't find record in 't1' Error_code: 1032
2016-08-23T12:11:21.867484Z 4 [ERROR] Error running query, slave SQL thread aborted.
Fix the problem, and restart the slave SQL thread with "SLAVE START".
We stopped at log 'master-bin.000001' position 989
∙ Slave stop
Error log
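Messages like the ones above are regular enough to be extracted by a script. The Python sketch below pulls the channel, the error code and the stop position out of error log lines. The regular expressions are written against the MySQL 5.7-style messages shown on this slide and are assumptions; other versions may word the messages differently. The function name `find_sql_thread_stop` is hypothetical.

```python
import re

# Patterns follow the sample messages on the slide (an assumption).
ERROR_RE = re.compile(
    r"\[ERROR\] Slave SQL for channel '(?P<channel>[^']*)': "
    r"(?P<message>.*?), Error_code: (?P<code>\d+)"
)
STOP_RE = re.compile(r"We stopped at log '(?P<file>[^']+)' position (?P<pos>\d+)")


def find_sql_thread_stop(log_lines):
    """Return (channel, error_code, binlog_file, position) or None."""
    channel = code = None
    for line in log_lines:
        error = ERROR_RE.search(line)
        if error:
            channel, code = error.group("channel"), int(error.group("code"))
        stop = STOP_RE.search(line)
        if stop:
            return channel, code, stop.group("file"), int(stop.group("pos"))
    return None


if __name__ == "__main__":
    lines = [
        "2016-08-23T12:11:21.867440Z 4 [ERROR] Slave SQL for channel 'master-1': "
        "Could not execute Update_rows event on table m2.t1; Can't find record "
        "in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE",
        "2016-08-23T12:11:21.867484Z 4 [ERROR] Error running query, slave SQL "
        "thread aborted. We stopped at log 'master-bin.000001' position 989",
    ]
    print(find_sql_thread_stop(lines))
```

The returned file name and position are the values to pass to SHOW BINLOG EVENTS or mysqlbinlog when looking for the failing event.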
All information about slave
∙ IO thread configuration
∙ SQL thread configuration
∙ IO thread status
∙ SQL thread status
∙ Errors
  Only last one
  All are in the error log
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
...
Master_Log_File: master-bin.000002
Read_Master_Log_Pos: 63810611
Relay_Log_File: slave-relay-bin-master@002d1.000004
Relay_Log_Pos: 1156
Relay_Master_Log_File: master-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: No
...
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Update_rows event on table m2.t1; Can't find
record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE;
the event's master log master-bin.000001, end_log_pos 1213
Skip_Counter: 0
...
SHOW SLAVE STATUS
∙ No need to parse SHOW
∙ Configuration
  ∙ replication_connection_configuration
  ∙ replication_applier_configuration
mysql> select * from replication_connection_configuration
    -> join replication_applier_configuration using(channel_name);
∙ IO thread status
  ∙ replication_connection_status
∙ SQL thread status
  ∙ replication_applier_status
  ∙ replication_applier_status_by_coordinator - MTS!
mysql> select * from replication_applier_status join
    -> replication_applier_status_by_coordinator
    -> using(channel_name);
  ∙ replication_applier_status_by_worker
mysql> select * from replication_applier_status join
    -> replication_applier_status_by_worker
    -> using(channel_name);
Tables in Performance Schema
∙ Master Info
mysql> select * from slave_master_info\G
*************************** 1. row ***************************
Number_of_lines: 25
Master_log_name: mysqld-bin.000001
Master_log_pos: 154
Host: 127.0.0.1
User_name: root
User_password: secret
Port: 13000
Connect_retry: 60
Enabled_ssl: 0
...
Uuid: 31ed7c8f-74ea-11e6-8de8-30b5c2208a0f
Retry_count: 86400
...
Enabled_auto_position: 1
...
∙ Relay log info
mysql> select * from slave_relay_log_info\G
*************************** 1. row ***************************
Number_of_lines: 7
Relay_log_name: ./slave-relay-bin-master@002d1.000004
Relay_log_pos: 1156
Master_log_name: master-bin.000001
Master_log_pos: 989
Sql_delay: 0
Number_of_workers: 0
Id: 1
Channel_name: master-1
∙ Worker info: multi-threaded slave
mysql> select * from slave_worker_info\G
*************************** 1. row ***************************
Id: 1
...
*************************** 8. row ***************************
Id: 8
Relay_log_name: ./Thinkie-relay-bin.000004
Relay_log_pos: 1216
Master_log_name: mysqld-bin.000001
Master_log_pos: 1342
Checkpoint_relay_log_name: ./Thinkie-relay-bin.000004
Checkpoint_relay_log_pos: 963
Checkpoint_master_log_name: mysqld-bin.000001
system database mysql: only on the slave
mysql> show master status\G
*************************** 1. row ***************************
File: master-bin.000005
Position: 154
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set:
1 row in set (0,00 sec)
SHOW MASTER STATUS
mysql> show binlog events in 'master-bin.000001' from 989;
+-------------------+------+----------------+-----------+-------------+--------------------------------
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info
+-------------------+------+----------------+-----------+-------------+--------------------------------
| master-bin.000001 | 989 | Anonymous_Gtid | 1 | 1054 | SET @@SESSION.GTID_NEXT= ...
| master-bin.000001 | 1054 | Query | 1 | 1124 | BEGIN
| master-bin.000001 | 1124 | Table_map | 1 | 1167 | table_id: 109 (m2.t1)
| master-bin.000001 | 1167 | Update_rows | 1 | 1213 | table_id: 109 flags: STMT_END_F
| master-bin.000001 | 1213 | Xid | 1 | 1244 | COMMIT /* xid=64 */
+-------------------+------+----------------+-----------+-------------+--------------------------------
5 rows in set (0,00 sec)
SHOW BINLOG EVENTS
$ mysqlbinlog var/mysqld.1/data/master-bin.000001 --start-position=989 --stop-position=1213
...
# at 1167
#160822 14:15:11 server id 1 end_log_pos 1213 CRC32 0x1f346c6b
Update_rows: table id 109 flags: STMT_END_F
BINLOG '
v966VxMBAAAAKwAAAI8EAAAAAG0AAAAAAAEAAm0yAAJ0MQABAwABY2HOoQ==
v966Vx8BAAAALgAAAL0EAAAAAG0AAAAAAAEAAgAB///+BQAAAP4GAAAAa2w0Hw==
'/*!*/;
ROLLBACK /* added by mysqlbinlog */ /*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
...
mysqlbinlog
$ mysqlbinlog -v var/mysqld.1/data/master-bin.000001 --start-position=989 --stop-position=1213
...
# at 1167
#160822 14:15:11 server id 1 end_log_pos 1213 CRC32 0x1f346c6b
Update_rows: table id 109 flags: STMT_END_F
BINLOG '
v966VxMBAAAAKwAAAI8EAAAAAG0AAAAAAAEAAm0yAAJ0MQABAwABY2HOoQ==
v966Vx8BAAAALgAAAL0EAAAAAG0AAAAAAAEAAgAB///+BQAAAP4GAAAAa2w0Hw==
'/*!*/;
### UPDATE `m2`.`t1`
### WHERE
### @1=5
### SET
### @1=6
ROLLBACK /* added by mysqlbinlog */ /*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
...
mysqlbinlog
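The `###` lines printed by `mysqlbinlog -v` are pseudo-SQL, and they are simple enough to collect into a structure for comparison with the slave's data. A minimal Python sketch, assuming a single UPDATE event in the format shown above; the function name `decode_update` and the returned dict layout are invented for the example.

```python
def decode_update(lines):
    """Collect the '###' pseudo-SQL lines of one UPDATE event into a dict."""
    event = {"table": None, "where": {}, "set": {}}
    section = None
    for line in lines:
        if not line.startswith("###"):
            continue  # skip BINLOG payload, comments and other output
        text = line[3:].strip()
        if text.startswith("UPDATE "):
            event["table"] = text[len("UPDATE "):].replace("`", "")
        elif text in ("WHERE", "SET"):
            section = text.lower()
        elif section and "=" in text:
            # Columns are shown by position: @1, @2, ...
            column, _, value = text.partition("=")
            event[section][column] = value
    return event


if __name__ == "__main__":
    output = [
        "BINLOG '",
        "### UPDATE `m2`.`t1`",
        "### WHERE",
        "###   @1=5",
        "### SET",
        "###   @1=6",
        "ROLLBACK /* added by mysqlbinlog */ /*!*/;",
    ]
    print(decode_update(output))
```

The "where" part is the row image before the change. If no such row exists on the slave, the event fails with error 1032, as in the error log example earlier.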
∙ Percona Toolkit
  ∙ pt-table-checksum
    Checks data consistency
  ∙ pt-table-sync
    Fixes data inconsistencies
  ∙ pt-slave-find
    Shows topology
∙ MySQL Utilities
  ∙ mysqlrplcheck
    Checks if MySQL servers are ready to replicate
  ∙ mysqlrplshow
    Shows topology
  ∙ mysqlrplsync
    Checks data consistency
  ∙ mysqlslavetrx
    Skips 1-N transactions
  ∙ mysqldbcompare
    Compares two databases
    MariaDB-friendly
  ∙ mysqldiff
    Checks object definitions
    MariaDB-friendly
  ∙ mysqlserverinfo
    Shows main options, such as port and datadir
    Replication-oriented
    MariaDB-friendly
Toolkits
∙ Error log
∙ Slave
  ∙ SHOW SLAVE STATUS
  ∙ Tables in Performance Schema
  ∙ Tables in mysql database
∙ Master
  ∙ SHOW MASTER STATUS
  ∙ SHOW BINLOG EVENTS
  ∙ mysqlbinlog
∙ mysql command line client
Main Instruments: Summary
62

Lessons Learned: Troubleshooting Replication

  • 1. Lessons Learned: Troubleshooting Replication April, 11, 2017 Sveta Smirnova
  • 2. ∙ MySQL Support engineer ∙ Author of ∙ MySQL Troubleshooting ∙ JSON UDF functions ∙ FILTER clause for MySQL ∙ Speaker ∙ Percona Live, OOW, Fosdem, DevConf, HighLoad... Sveta Smirnova 2
  • 3. ∙Typical Replication Errors ∙MySQL and MariaDB Replication: Must Know ∙Master ∙Slave IO thread ∙Slave SQL thread ∙Multithreaded slave ∙Multi-master Table of Contents 3
  • 6. Slave Lags from the Master 6
  • 8. Not a Full List 8
  • 9. MySQL and MariaDB Replication: Must Know
  • 11. Master Slave <- Initiates <- Requests a packet Asynchronous 10
  • 12. Master Sends the packet -> Slave <- Initiates <- Requests a packet Asynchronous 10
  • 13. Master Sends the packet -> Slave <- Initiates <- Requests a packet ... ? Asynchronous 10
  • 15. Master Slave <- Initiates <- Requests a packet Semisynchrous Plugin 11
  • 16. Master Sends the packet -> Slave <- Initiates <- Requests a packet Semisynchrous Plugin 11
  • 17. Master Sends the packet -> Waits "Ack" Slave <- Initiates <- Requests a packet Semisynchrous Plugin 11
  • 18. Master Sends the packet -> Waits "Ack" Slave <- Initiates <- Requests a packet <- Sends "Ack" Semisynchrous Plugin 11
  • 19. Master Recieves a change Storage Engine Logical 12
  • 20. Master Recieves a change Sends to SE -> Storage Engine Logical 12
  • 21. Master Recieves a change Sends to SE -> Storage Engine Writes into table Logical 12
  • 22. Master Recieves a change Sends to SE -> Storage Engine Writes into table <- Returns control Logical 12
  • 23. Master Recieves a change Sends to SE -> Writes into binary log Storage Engine Writes into table <- Returns control Logical 12
  • 24. Master Recieves a change Sends to SE -> Writes into binary log Synchronizes -> Storage Engine Writes into table <- Returns control <- Synchronizes Logical 12
  • 25. IO thread Reads from the master SQL thread Two Kinds of Slave Threads 13
  • 26. IO thread Reads from the master Stores in the relay log SQL thread Two Kinds of Slave Threads 13
  • 27. IO thread Reads from the master Stores in the relay log SQL thread <- Reads from the relay log Two Kinds of Slave Threads 13
  • 28. IO thread Reads from the master Stores in the relay log SQL thread <- Reads from the relay log Executes Two Kinds of Slave Threads 13
  • 29. ∙ Multiple SQL threads in 10.0.5+/5.6+ Several SQL Threads 14
  • 30. ∙ Multiple SQL threads in 10.0.5+/5.6+ ∙ From the troubleshooting point of view ∙ Single IO thread Several SQL Threads 14
  • 31. ∙ Multiple SQL threads in 10.0.5+/5.6+ ∙ From the troubleshooting point of view ∙ Single IO thread ∙ Single Relay log Several SQL Threads 14
  • 32. ∙ Multiple SQL threads in 10.0.5+/5.6+ ∙ From the troubleshooting point of view ∙ Single IO thread ∙ Single Relay log ∙ Slave lag still possible Several SQL Threads 14
  • 33. ∙ Multiple SQL threads in 10.0.5+/5.6+ ∙ From the troubleshooting point of view ∙ Single IO thread ∙ Single Relay log ∙ Slave lag still possible ∙ Error in one thread stops all Several SQL Threads 14
  • 34. ∙ Multiple masters in 10.0.1+/5.7+ Multi-Source (Multi-Channel) 15
  • 35. ∙ Multiple masters in 10.0.1+/5.7+ ∙ From the troubleshooting point of view ∙ Multiple sets of relay logs Multi-Source (Multi-Channel) 15
  • 36. ∙ Multiple masters in 10.0.1+/5.7+ ∙ From the troubleshooting point of view ∙ Multiple sets of relay logs ∙ Multiple IO threads Multi-Source (Multi-Channel) 15
  • 37. ∙ Multiple masters in 10.0.1+/5.7+ ∙ From the troubleshooting point of view ∙ Multiple sets of relay logs ∙ Multiple IO threads ∙ Multiple SQL threads Multi-Source (Multi-Channel) 15
  • 38. ∙ Multiple masters in 10.0.1+/5.7+ ∙ From the troubleshooting point of view ∙ Multiple sets of relay logs ∙ Multiple IO threads ∙ Multiple SQL threads ∙ MySQL: slave_parallel_workers for each channel Multi-Source (Multi-Channel) 15
  • 39. ∙ Multiple masters in 10.0.1+/5.7+ ∙ From the troubleshooting point of view ∙ Multiple sets of relay logs ∙ Multiple IO threads ∙ Multiple SQL threads ∙ MySQL: slave_parallel_workers for each channel ∙ Channels/sources are independent Multi-Source (Multi-Channel) 15
  • 40. ∙ Multiple masters in 10.0.1+/5.7+ ∙ From the troubleshooting point of view ∙ Multiple sets of relay logs ∙ Multiple IO threads ∙ Multiple SQL threads ∙ MySQL: slave_parallel_workers for each channel ∙ Channels/sources are independent ∙ Error in one stops only one Multi-Source (Multi-Channel) 15
  • 41. ∙ Multiple masters in 10.0.1+/5.7+ ∙ From the troubleshooting point of view ∙ Multiple sets of relay logs ∙ Multiple IO threads ∙ Multiple SQL threads ∙ MySQL: slave_parallel_workers for each channel ∙ Channels/sources are independent ∙ Error in one stops only one ∙ No automatic conflict resolution Multi-Source (Multi-Channel) 15
  • 42. ∙ You must specify ∙ Name of the master’s binary log file ∙ Position Position-Based 16
  • 43. ∙ You must specify ∙ Name of the master’s binary log file ∙ Position ∙ From the troubleshooting point ov view ∙ Event executes if on the current position Position-Based 16
  • 44. ∙ You must specify ∙ Name of the master’s binary log file ∙ Position ∙ From the troubleshooting point ov view ∙ Event executes if on the current position ∙ Easy to skip Position-Based 16
  • 45. ∙ You must specify ∙ Name of the master’s binary log file ∙ Position ∙ From the troubleshooting point ov view ∙ Event executes if on the current position ∙ Easy to skip ∙ Easy to move position backward Position-Based 16
  • 46. ∙ You must specify ∙ Name of the master’s binary log file ∙ Position ∙ From the troubleshooting point ov view ∙ Event executes if on the current position ∙ Easy to skip ∙ Easy to move position backward ∙ No conflict resolution Position-Based 16
  • 47. ∙ Each transaction has unique number: GTID Global Transaction Identifiers (GTID) 17
  • 48. ∙ Each transaction has unique number: GTID ∙ MySQL: AUTO_POSITION=1 Global Transaction Identifiers (GTID) 17
  • 49. ∙ Each transaction has unique number: GTID ∙ MySQL: AUTO_POSITION=1 ∙ MariaDB: master_use_gtid = { slave_pos | current_pos } Global Transaction Identifiers (GTID) 17
  • 50. ∙ Each transaction has unique number: GTID ∙ MySQL: AUTO_POSITION=1 ∙ MariaDB: master_use_gtid = { slave_pos | current_pos } ∙ No need to specify binary log and position Global Transaction Identifiers (GTID) 17
  • 51. Client Binary log Statement-Based Binary Log Format 18
  • 52. Client INSERT INTO ... -> Binary log Statement-Based Binary Log Format 18
  • 53. Client INSERT INTO ... -> Binary log SET TIMESTAMP... Statement-Based Binary Log Format 18
  • 54. Client INSERT INTO ... -> Binary log SET TIMESTAMP... SET sql_mode... Statement-Based Binary Log Format 18
  • 55. Client INSERT INTO ... -> Binary log SET TIMESTAMP... SET sql_mode... INSERT INTO ... Statement-Based Binary Log Format 18
  • 56. Client Binary log Row-Based Binary Log Format 19
  • 57. Client UPDATE ... -> Binary log Row-Based Binary Log Format 19
  • 58. Client UPDATE ... -> Binary log SET TIMESTAMP... Row-Based Binary Log Format 19
  • 59. Client UPDATE ... -> Binary log SET TIMESTAMP... SET sql_mode... Row-Based Binary Log Format 19
  • 60. Client UPDATE ... -> Binary log SET TIMESTAMP... SET sql_mode... Row before changes Row-Based Binary Log Format 19
  • 61. Client UPDATE ... -> Binary log SET TIMESTAMP... SET sql_mode... Row before changes Row with changes Row-Based Binary Log Format 19
  • 62. ∙ Error log file Main Instruments 20
  • 63. ∙ Error log file ∙ At the slave ∙ SHOW SLAVE STATUS ∙ MySQL: Tables in Performance Schema ∙ System database mysql Main Instruments 20
  • 64. ∙ Error log file ∙ At the slave ∙ At the master ∙ SHOW MASTER STATUS ∙ SHOW BINLOG EVENTS ∙ mysqlbinlog Main Instruments 20
  • 65. ∙ Error log file ∙ At the slave ∙ At the master ∙ Percona Toolkit Main Instruments 20
  • 66. ∙ Error log file ∙ At the slave ∙ At the master ∙ Percona Toolkit ∙ MySQL Utilities Main Instruments 20
  • 67. ∙ Always available, requires setup ∙ Asynchronous ∙ Master∙ Keeps all changes in the binary log Two formats: ROW и STATEMENT ∙ Slave ∙ IO thread reads from the master into relay log ∙ SQL thread executes updates Multiple SQL threads in 10.0.5+/5.6+ Multiple channels/sources (masters) in 10.0.1+/5.7+ ∙ GTID in 10.0.2+/5.6+ Replication Must Know: Summary 21
  • 69. ∙ More writes ∙ binlog_row_image = FULL | MINIMAL | NOBLOB Performance 23
  • 70. ∙ More writes ∙ binlog_row_image = FULL | MINIMAL | NOBLOB ∙ binlog_cache_size Watch Binlog_cache_disk_use Performance 23
  • 71. ∙ More writes ∙ binlog_row_image = FULL | MINIMAL | NOBLOB ∙ binlog_cache_size Watch Binlog_cache_disk_use ∙ binlog_stmt_cache_size Watch Binlog_stmt_cache_disk_use Performance 23
  • 72. ∙ More writes ∙ Synchronization ∙ sync_binlog ∙ Do not disable! ∙ You may set it greater than 1 Performance 23
  • 73. ∙ Binary log lifetime ∙ expire_log_days Behavior 24
  • 74. ∙ Binary log lifetime ∙ Synchronization ∙ SBR is not safe with READ COMMITTED and READ UNCOMMITTED Behavior 24
  • 75. ∙ Binary log lifetime ∙ Synchronization ∙ Order of records in the binary log ∙ Non-deterministic events and SBR Behavior 24
  • 77. ∙ SHOW SLAVE STATUS Slave_IO_Running: Connecting Slave_SQL_Running: Yes ... Last_IO_Errno: 1045 Last_IO_Error: error connecting to master ’[email protected]:13000’ - retry-time: 60 retries: 1 Last_SQL_Errno: 0 Last_SQL_Error: ... Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: 160824 03:18:36 Last_SQL_Error_Timestamp: Network 26
  • 78. ∙ SHOW SLAVE STATUS ∙ P_S.replication_connection_status mysql> select * from performance_schema.replication_connection_statusG *************************** 1. row *************************** CHANNEL_NAME: GROUP_NAME: SOURCE_UUID: THREAD_ID: NULL SERVICE_STATE: CONNECTING COUNT_RECEIVED_HEARTBEATS: 0 LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00 RECEIVED_TRANSACTION_SET: LAST_ERROR_NUMBER: 1045 LAST_ERROR_MESSAGE: error connecting to master ’[email protected]:13000’ - retry-time: 60 retries: 4 LAST_ERROR_TIMESTAMP: 2016-08-24 03:21:36 1 row in set (0,01 sec) Network 26
  • 79. ∙ SHOW SLAVE STATUS ∙ P_S.replication_connection_status ∙ Error log 2016-08-24T00:18:36.077384Z 3 [ERROR] Slave I/O for channel ”: error connecting to master ’[email protected]:13000’ - retry-time: 60 retries: 1, Error_code: 1045 2016-08-24T00:19:36.299011Z 3 [ERROR] Slave I/O for channel ”: error connecting to master ’[email protected]:13000’ - retry-time: 60 retries: 2, Error_code: 1045 2016-08-24T00:20:36.485315Z 3 [ERROR] Slave I/O for channel ”: error connecting to master ’[email protected]:13000’ - retry-time: 60 retries: 3, Error_code: 1045 2016-08-24T00:21:36.677915Z 3 [ERROR] Slave I/O for channel ”: error connecting to master ’[email protected]:13000’ - retry-time: 60 retries: 4, Error_code: 1045 2016-08-24T00:22:36.872066Z 3 [ERROR] Slave I/O for channel ”: error connecting to master ’[email protected]:13000’ - retry-time: 60 retries: 5, Error_code: 1045 Network 26
  • 80. ∙ SHOW SLAVE STATUS ∙ P_S.replication_connection_status ∙ Error log ∙ Access $ perror 1045 MySQL error code 1045 (ER_ACCESS_DENIED_ERROR): Access denied for user ’%-.48s’@’%-.64s’ (using password: %s) Network 26
  • 81. ∙ SHOW SLAVE STATUS ∙ P_S.replication_connection_status ∙ Error log ∙ Access ∙ MySQL client slave’s login-password $ mysql -h127.0.0.1 -P13000 -uroot -pbar Warning: Using a password on the command line interface can be insecure. ERROR 1045 (28000): Access denied for user ’root’@’localhost’ (using password: YES) Network 26
  • 82. ∙ SHOW SLAVE STATUS ∙ P_S.replication_connection_status ∙ Error log ∙ Access ∙ MySQL client slave’s login-password SHOW GRANTS mysql> SHOW GRANTS; +----------------------------------+ | Grants for foo@% | +----------------------------------+ | GRANT SELECT ON *.* TO ’foo’@’%’ | +----------------------------------+ Network 26
  • 83. ∙ SHOW SLAVE STATUS ∙ P_S.replication_connection_status ∙ Error log ∙ Access ∙ MySQL client slave’s login-password SHOW GRANTS ∙ Fix privileges on the master ∙ Restart slave Network 26
  • 84. ∙ Regular performance troubleshooting ∙ Check with command line client ∙ Troubleshooting hardware resource usage webinar Performance 27
  • 86. ∙ One master - one slave ∙ Different data Slave cannot execute event from the relay log ∙ Different errors on master and slave ∙ Slave lags behind the master SQL thread: typical issues 29
  • 87. ∙ One master - one slave ∙ Different data Slave cannot execute event from the relay log ∙ Different errors on master and slave ∙ Slave lags behind the master ∙ Circle replication and other writes in addition to SQL thread ∙ Different data SQL thread: typical issues 29
  • 88. ∙ Did table change outside of the replication? ∙ How? ∙ Can it cause conflict with changes on the master? Different Data 30
  • 89. ∙ Did table change outside of the replication? ∙ Are table structures identical? ∙ Percona Toolkit pt-table-checksum, pt-table-sync ∙ MySQL Utilities MySQL: mysqlrplsync mysqldbcompare, mysqldiff Different Data 30
  • 90. ∙ Did table change outside of the replication? ∙ Are table structures identical? ∙ Are changes in the correct order? ∙ mysqlbinlog ∙ Application logic on the master Different Data 30
  • 91. ∙ Only with SBR Updates in the wrong order 31
  • 92. ∙ Only with SBR ∙ Row-level locks Updates in the wrong order 31
  • 93. ∙ Only with SBR ∙ Row-level locks ∙ Triggers ∙ SET GLOBAL sql_slave_skip_counter – No GTIDs! ∙ Skip transaction – GTIDs ∙ Synchronize tables! Updates in the wrong order 31
  • 94. ∙ Only with SBR ∙ Row-level locks ∙ Triggers ∙ Different options: for old versions ∙ Start slave with master’s options ∙ Restart SQL thread ∙ Most issues are fixed in recent versions Updates in the wrong order 31
  • 95. ∙ Threads ∙ Master executes changes in multiple threads ∙ Slave uses one Slave lags from the master 32
  • 96. ∙ Threads ∙ Seconds_behind_master increases – You cannot 100% rely on this number! Slave lags from the master 32
  • 97. ∙ Threads ∙ Seconds_behind_master increases – You cannot 100% rely on this number! ∙ Tune slave performance ∙ Multi-threaded slave One thread for one database in MySQL 5.6 There may be conflicts between multiple slave SQL threads Slave lags from the master 32
  • 98. ∙ Threads ∙ Seconds_behind_master increases – You cannot 100% rely on this number! ∙ Tune slave performance ∙ Multi-threaded slave One thread for one database in MySQL 5.6 There may be conflicts between multiple slave SQL threads ∙ Indexes on the slave Makes sense for SBR only Slave lags from the master 32
  • 100. ∙ Single relay log ∙ Speed in high concurrent environment may be less than on master Performance 34
  • 101. ∙ Single relay log ∙ Speed in high concurrent environment may be less than on master ∙ MySQL: slave_parallel_workers Performance 34
  • 102. ∙ Single relay log ∙ Speed in high concurrent environment may be less than on master ∙ MySQL: slave_parallel_workers ∙ MySQL: slave_parallel_type=DATABASE | LOGICAL_CLOCK Performance 34
  • 103. ∙ Single relay log ∙ Speed in high concurrent environment may be less than on master ∙ MariaDB: slave_parallel_threads Performance 34
  • 104. ∙ Single relay log ∙ Speed in high concurrent environment may be less than on master ∙ MariaDB: slave_parallel_threads ∙ MariaDB: slave_parallel_max_queued Performance 34
  • 105. ∙ Single relay log ∙ Speed in high concurrent environment may be less than on master ∙ MariaDB: slave_parallel_threads ∙ MariaDB: slave_parallel_max_queued ∙ MariaDB: slave_domain_parallel_threads Performance 34
  • 106. ∙ Single relay log ∙ Speed in high concurrent environment may be less than on master ∙ MariaDB: slave_parallel_threads ∙ MariaDB: slave_parallel_max_queued ∙ MariaDB: slave_domain_parallel_threads ∙ MariaDB: slave_parallel_mode=optimistic | conservative | aggressive | minimal | none Performance 34
  • 107. ∙ Same methods as for single-threaded Wrong Behavior 35
  • 108. ∙ Same methods as for single-threaded ∙ Error of one thread stops all mysql> select WORKER_ID, SERVICE_STATE, LAST_SEEN_TRANSACTION, LAST_ERROR_NUMBER, -> LAST_ERROR_MESSAGE from performance_schema.replication_applier_status_by_workerG *************************** 1. row *************************** WORKER_ID: 1 SERVICE_STATE: OFF LAST_SEEN_TRANSACTION: d318bc17-66dc-11e6-a471-30b5c2208a0f:4988 LAST_ERROR_NUMBER: 0 LAST_ERROR_MESSAGE: *************************** 2. row *************************** WORKER_ID: 3 SERVICE_STATE: OFF LAST_SEEN_TRANSACTION: d318bc17-66dc-11e6-a471-30b5c2208a0f:4986 LAST_ERROR_NUMBER: 1032 LAST_ERROR_MESSAGE: Worker 2 failed executing transaction... Wrong Behavior 35
  • 109. ∙ Same methods as for single-threaded ∙ Error of one thread stops all MariaDB [test]> select id, command, time, state, progress from information_schema.processlist -> where user=’system user’; +----+---------+------+------------------------------------------------------------------+ | id | command | time | state | +----+---------+------+------------------------------------------------------------------+ | 25 | Connect | 4738 | Waiting for master to send event | | 24 | Connect | 5096 | Slave has read all relay log; waiting for the slave I/O thread t | | 23 | Connect | 0 | Waiting for work from SQL thread | | 22 | Connect | 0 | Unlocking tables | | 21 | Connect | 0 | Update_rows_log_event::ha_update_row(-1) | | 20 | Connect | 0 | Waiting for prior transaction to start commit before starting ne | | 19 | Connect | 0 | Update_rows_log_event::ha_update_row(-1) | | 18 | Connect | 0 | Update_rows_log_event::ha_update_row(-1) | | 17 | Connect | 0 | Update_rows_log_event::find_row(-1) ... Wrong Behavior 35
  • 111. ∙ Replication must be set for each channel/source Specifics 37
  • 112. ∙ Replication must be set for each channel/source ∙ You may use master with GTID and without same time Specifics 37
  • 113. ∙ Replication must be set for each channel/source ∙ You may use master with GTID and without same time ∙ Same issues as with regular replication Specifics 37
  • 114. ∙ Replication must be set for each channel/source ∙ You may use master with GTID and without same time ∙ Same issues as with regular replication ∙ MySQL: Filters work for all channels Specifics 37
  • 115. ∙ Replication must be set up for each channel/source ∙ You may use masters with and without GTID at the same time ∙ Same issues as with regular replication ∙ MySQL: Filters work for all channels ∙ MariaDB: You may set up filters for each source Specifics 37
  • 117. ∙ Issues on the master ∙ Same as for standalone server ∙ More writes and consistency checks Summary 39
  • 118. ∙ Issues on the master ∙ Slave IO thread ∙ Common network issues ∙ mysql command line client for tests Summary 39
  • 119. ∙ Issues on the master ∙ Slave IO thread ∙ Slave SQL thread ∙ Regular query-related issues ∙ Regular storage engine issues ∙ Fewer execution threads than on the master Summary 39
  • 122. ∙ Basic Techniques – troubleshooting webinar ∙ Troubleshooting hardware webinar ∙ Introduction into SE troubleshooting webinar ∙ Percona Monitoring and Management ∙ Percona Toolkit ∙ MySQL Utilities ∙ Book MySQL High Availability ∙ MySQL Replication Team blog ∙ Replication in MariaDB More Information 40
  • 127. ∙ Is the data up to date? ∙ Are these the same? ∙ Table structures ∙ Storage Engine ∙ Data ∙ Any write can break replication Asynchronous: Slave Q&A 45
  • 132. ∙ Writes on the master are slower than with asynchronous replication ∙ How many "Ack"s does the master wait for? ∙ Before 5.7: from a single slave ∙ Now in MySQL: rpl_semi_sync_master_wait_for_slave_count ∙ It will not wait for the others Semi-synchronous Replication 46
  • 135. ∙ Writes on the master are slower than with asynchronous replication ∙ How many "Ack"s does the master wait for? ∙ What does "Ack" mean? ∙ The event is written into the relay log ∙ No guarantee that it is executed Semi-synchronous Replication 46
  • 137. ∙ Writes on the master are slower than with asynchronous replication ∙ How many "Ack"s does the master wait for? ∙ What does "Ack" mean? ∙ What happens in case of a timeout? ∙ Replication becomes asynchronous Semi-synchronous Replication 46
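The three answers on the semi-synchronous slides (how many acks, what an ack means, what happens on timeout) can be put together in a toy model. This is a minimal sketch, not server code: the function `commit_mode`, its arguments and the millisecond units are hypothetical names chosen for illustration; only the idea of waiting for a configured number of acks, as set by rpl_semi_sync_master_wait_for_slave_count, comes from the slides.

```python
def commit_mode(ack_delays_ms, wait_for_slave_count=1, timeout_ms=10000):
    """Toy model of one semi-synchronous commit on the master.

    ack_delays_ms: for each slave, the delay (ms) after which it reports that
    the event is in its relay log; None means the slave never answers.
    Returns (mode, waited_ms).
    """
    # Only acks that arrive before the timeout count.
    in_time = sorted(d for d in ack_delays_ms if d is not None and d <= timeout_ms)
    if len(in_time) >= wait_for_slave_count:
        # The master waits for the N-th fastest ack and ignores the others.
        return "semi-sync", in_time[wait_for_slave_count - 1]
    # Not enough acks before the timeout: replication becomes asynchronous.
    return "async", timeout_ms
```

For example, `commit_mode([30, 5, 200])` returns after the fastest slave answers, while `commit_mode([None, 20000])` falls back to asynchronous mode. Note that an ack in this model, as on the slide, says nothing about whether the slave has executed the event.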
  • 139. ∙ Every change is written twice: into the SE files (logs, data, ...) and into the binary log ∙ You can write on the slave Logical Replication 47
  • 141. ∙ Does not exist in MySQL/MariaDB! ∙ There are two closed-source solutions Just for Comparison: Physical Replication 48
  • 144. ∙ Does not exist in MySQL/MariaDB! ∙ Master writes only into SE files ∙ Which are replicated to the slave ∙ From the troubleshooting point of view ∙ IO: changes are written only once ∙ You cannot write on the slave in parallel ∙ Any data inconsistency breaks replication Just for Comparison: Physical Replication 48
  • 145. ∙ Data transfer ∙ Execution ∙ Different ∙ Diagnostics ∙ Fixes Two Kinds of Threads – Two Kinds of Issues 49
  • 151. ∙ Guaranteed that every transaction will be executed only once ∙ Simple failover ∙ It is not easy to skip a transaction ∙ MySQL: use mysqlslavetrx ∙ MariaDB: set global gtid_slave_pos='X-Y-Z'; ∙ Be careful with expire_logs_days! GTID 50
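Skipping a transaction with GTID in MySQL comes down to committing an empty transaction under the GTID you want to skip, which is also the idea that mysqlslavetrx automates. The sketch below only builds the statements; the helper name `skip_gtid_statements` and the validation regex are assumptions for illustration, and the statements should be reviewed before running them on a real slave.

```python
import re

# MySQL GTID format: source_uuid:transaction_id
_GTID_RE = re.compile(
    r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:[1-9][0-9]*$"
)

def skip_gtid_statements(gtid):
    """Return the SQL statements that inject an empty transaction for one GTID."""
    if not _GTID_RE.match(gtid):
        raise ValueError("expected 'uuid:number', got %r" % gtid)
    return [
        "STOP SLAVE",
        "SET GTID_NEXT='%s'" % gtid,   # claim the GTID that must be skipped
        "BEGIN",
        "COMMIT",                      # empty transaction: nothing is applied
        "SET GTID_NEXT='AUTOMATIC'",   # return to normal GTID assignment
        "START SLAVE",
    ]
```

Calling `skip_gtid_statements("d318bc17-66dc-11e6-a471-30b5c2208a0f:4986")` yields six statements that can be pasted into the mysql command line client on the slave.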
  • 153. ∙ Statement-based (SBR) ∙ Queries are written as received ∙ There is a risk of data inconsistency (unsafe statements): INSERT IGNORE, LIMIT without ORDER BY, non-deterministic functions, ... Binary Log Formats 51
  • 155. ∙ Statement-based (SBR) ∙ Row-based (RBR) ∙ Usually more data is written: this affects IO and transfer speed and is controlled by binlog_row_image ∙ Performance may be worse if the table does not have a primary (unique) key; MariaDB may use any index Binary Log Formats 51
  • 156. ∙ Statement-based (SBR) ∙ Row-based (RBR) ∙ Mixed ∙ Advantages of both formats Binary Log Formats 51
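Why LIMIT without ORDER BY is unsafe for statement-based replication can be shown with a toy model. This is only a sketch under stated assumptions: Python lists stand in for tables in their physical storage order, and both helper functions are hypothetical, not anything the server exposes.

```python
def apply_statement_delete_limit_1(rows):
    """Statement-based: 'DELETE FROM t LIMIT 1' is re-executed on each server,
    so each server removes whichever row it happens to read first."""
    return rows[1:]

def apply_row_delete(rows, deleted_row):
    """Row-based: the binary log names the row the master actually deleted."""
    return [r for r in rows if r != deleted_row]

master = [1, 2, 3]   # same data on both servers ...
slave = [3, 1, 2]    # ... but stored in a different physical order

# SBR: both servers run the same query and pick different rows.
sbr_master = apply_statement_delete_limit_1(master)
sbr_slave = apply_statement_delete_limit_1(slave)
sbr_consistent = sorted(sbr_master) == sorted(sbr_slave)

# RBR: the master runs the query, the slave applies the logged row.
rbr_master = apply_statement_delete_limit_1(master)
rbr_slave = apply_row_delete(slave, deleted_row=master[0])
rbr_consistent = sorted(rbr_master) == sorted(rbr_slave)
```

In this model the statement-based copy diverges while the row-based copy stays consistent. The price, as the slide says, is that the row-based slave has to find the row first, which is slow without a primary or unique key.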
  • 159. ∙ Slave start ∙ Errors 2016-08-23T12:11:21.867440Z 4 [ERROR] Slave SQL for channel 'master-1': Could not execute Update_rows event on table m2.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 1213, Error_code: 1032 2016-08-23T12:11:21.867471Z 4 [Warning] Slave: Can't find record in 't1' Error_code: 1032 2016-08-23T12:11:21.867484Z 4 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'master-bin.000001' position 989 Error log 53
  • 160. ∙ Slave start ∙ Errors ∙ Slave stop Error log 53
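The SQL thread error in the error log already contains everything needed to locate the failing event on the master: the table, the error code, the binary log file and end_log_pos. A minimal sketch of pulling these out with a regular expression follows; the pattern is written against the single message shown on the slide and is an assumption, not a complete grammar of the error log.

```python
import re

ERROR_RE = re.compile(
    r"Could not execute (?P<event>\w+) event on table (?P<table>[\w.]+);"
    r".*?Error_code: (?P<code>\d+);"
    r".*?master log (?P<log>[\w.-]+), end_log_pos (?P<end_pos>\d+)"
)

def parse_sql_thread_error(line):
    """Return the event coordinates from a slave SQL error line, or None."""
    m = ERROR_RE.search(line)
    if m is None:
        return None
    return {
        "event": m.group("event"),
        "table": m.group("table"),
        "error_code": int(m.group("code")),
        "master_log": m.group("log"),
        "end_log_pos": int(m.group("end_pos")),
    }
```

The returned master_log and end_log_pos are the values to pass to SHOW BINLOG EVENTS or to mysqlbinlog on the master.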
  • 161. All information about the slave ∙ IO thread configuration ∙ SQL thread configuration ∙ IO thread status ∙ SQL thread status ∙ Errors: only the last one is shown; all of them are in the error log SHOW SLAVE STATUS 54
  • 162. mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 ... Master_Log_File: master-bin.000002 Read_Master_Log_Pos: 63810611 Relay_Log_File: [email protected] Relay_Log_Pos: 1156 Relay_Master_Log_File: master-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: No ... Replicate_Wild_Ignore_Table: Last_Errno: 1032 Last_Error: Could not execute Update_rows event on table m2.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 1213 Skip_Counter: 0 ... SHOW SLAVE STATUS 54
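When the output of SHOW SLAVE STATUS has to be checked from a script, the vertical output can be turned into a dictionary and the two `*_Running` fields tell which thread to look at. The sketch below assumes the output was saved as text with one field per line; `parse_vertical_output` and `diagnose` are hypothetical helper names.

```python
import re

FIELD_RE = re.compile(r"^\s*(\w+):\s*(.*)$")

def parse_vertical_output(text):
    r"""Parse 'Name: value' lines of vertical (\G) output into a dict."""
    status = {}
    last = None
    for line in text.splitlines():
        m = FIELD_RE.match(line)
        if m:
            last = m.group(1)
            status[last] = m.group(2).strip()
        elif last is not None and line.strip() and not line.lstrip().startswith("*"):
            # Wrapped value (e.g. a long Last_Error): glue it to the previous field.
            status[last] += " " + line.strip()
    return status

def diagnose(status):
    """List the slave threads that are not running."""
    problems = []
    if status.get("Slave_IO_Running") != "Yes":
        problems.append("IO thread")
    if status.get("Slave_SQL_Running") != "Yes":
        problems.append("SQL thread")
    return problems
```

For the status shown on the slide, `diagnose` reports only the SQL thread, which matches the two kinds of issues: data transfer is fine, execution is broken.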
  • 164. ∙ No need to parse SHOW ∙ Configuration ∙ replication_connection_configuration ∙ replication_applier_configuration mysql> select * from replication_connection_configuration -> join replication_applier_configuration using(channel_name); Tables in Performance Schema 55
  • 165. ∙ No need to parse SHOW ∙ Configuration ∙ IO thread status ∙ replication_connection_status Tables in Performance Schema 55
  • 166. ∙ No need to parse SHOW ∙ Configuration ∙ IO thread status ∙ SQL thread status ∙ replication_applier_status ∙ replication_applier_status_by_coordinator - MTS! mysql> select * from replication_applier_status join -> replication_applier_status_by_coordinator -> using(channel_name); Tables in Performance Schema 55
  • 167. ∙ No need to parse SHOW ∙ Configuration ∙ IO thread status ∙ SQL thread status ∙ replication_applier_status ∙ replication_applier_status_by_worker mysql> select * from replication_applier_status join -> replication_applier_status_by_worker -> using(channel_name); Tables in Performance Schema 55
  • 168. ∙ Master Info mysql> select * from slave_master_info\G *************************** 1. row *************************** Number_of_lines: 25 Master_log_name: mysqld-bin.000001 Master_log_pos: 154 Host: 127.0.0.1 User_name: root User_password: secret Port: 13000 Connect_retry: 60 Enabled_ssl: 0 ... Uuid: 31ed7c8f-74ea-11e6-8de8-30b5c2208a0f Retry_count: 86400 ... Enabled_auto_position: 1 ... system database mysql: only on the slave 56
  • 169. ∙ Master Info ∙ Relay log info mysql> select * from slave_relay_log_info\G *************************** 1. row *************************** Number_of_lines: 7 Relay_log_name: ./[email protected] Relay_log_pos: 1156 Master_log_name: master-bin.000001 Master_log_pos: 989 Sql_delay: 0 Number_of_workers: 0 Id: 1 Channel_name: master-1 system database mysql: only on the slave 56
  • 170. ∙ Master Info ∙ Relay log info ∙ Worker info: multi-threaded slave mysql> select * from slave_worker_info\G *************************** 1. row *************************** Id: 1 ... *************************** 8. row *************************** Id: 8 Relay_log_name: ./Thinkie-relay-bin.000004 Relay_log_pos: 1216 Master_log_name: mysqld-bin.000001 Master_log_pos: 1342 Checkpoint_relay_log_name: ./Thinkie-relay-bin.000004 Checkpoint_relay_log_pos: 963 Checkpoint_master_log_name: mysqld-bin.000001 system database mysql: only on the slave 56
  • 171. mysql> show master status\G *************************** 1. row *************************** File: master-bin.000005 Position: 154 Binlog_Do_DB: Binlog_Ignore_DB: Executed_Gtid_Set: 1 row in set (0,00 sec) SHOW MASTER STATUS 57
  • 172. mysql> show binlog events in 'master-bin.000001' from 989; +-------------------+------+----------------+-----------+-------------+-------------------------------- | Log_name | Pos | Event_type | Server_id | End_log_pos | Info +-------------------+------+----------------+-----------+-------------+-------------------------------- | master-bin.000001 | 989 | Anonymous_Gtid | 1 | 1054 | SET @@SESSION.GTID_NEXT= ... | master-bin.000001 | 1054 | Query | 1 | 1124 | BEGIN | master-bin.000001 | 1124 | Table_map | 1 | 1167 | table_id: 109 (m2.t1) | master-bin.000001 | 1167 | Update_rows | 1 | 1213 | table_id: 109 flags: STMT_END_F | master-bin.000001 | 1213 | Xid | 1 | 1244 | COMMIT /* xid=64 */ +-------------------+------+----------------+-----------+-------------+-------------------------------- 5 rows in set (0,00 sec) SHOW BINLOG EVENTS 58
  • 173. $ mysqlbinlog var/mysqld.1/data/master-bin.000001 --start-position=989 --stop-position=1213 ... # at 1167 #160822 14:15:11 server id 1 end_log_pos 1213 CRC32 0x1f346c6b Update_rows: table id 109 flags: STMT_END_F BINLOG ' v966VxMBAAAAKwAAAI8EAAAAAG0AAAAAAAEAAm0yAAJ0MQABAwABY2HOoQ== v966Vx8BAAAALgAAAL0EAAAAAG0AAAAAAAEAAgAB///+BQAAAP4GAAAAa2w0Hw== '/*!*/; ROLLBACK /* added by mysqlbinlog */ /*!*/; SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/; ... mysqlbinlog 59
  • 174. $ mysqlbinlog -v var/mysqld.1/data/master-bin.000001 --start-position=989 --stop-position=1213 ... # at 1167 #160822 14:15:11 server id 1 end_log_pos 1213 CRC32 0x1f346c6b Update_rows: table id 109 flags: STMT_END_F BINLOG ' v966VxMBAAAAKwAAAI8EAAAAAG0AAAAAAAEAAm0yAAJ0MQABAwABY2HOoQ== v966Vx8BAAAALgAAAL0EAAAAAG0AAAAAAAEAAgAB///+BQAAAP4GAAAAa2w0Hw== '/*!*/; ### UPDATE `m2`.`t1` ### WHERE ### @1=5 ### SET ### @1=6 ROLLBACK /* added by mysqlbinlog */ /*!*/; SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/; ... mysqlbinlog 60
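With -v, mysqlbinlog adds the decoded row change as comment lines that start with `###`. When the dump is long, it helps to keep only those lines. The sketch below assumes the output was saved as text with one line per row of the dump; `decoded_row_events` is a hypothetical helper, not part of any MySQL tool.

```python
def decoded_row_events(mysqlbinlog_output):
    """Collect the '###' pseudo-SQL lines that mysqlbinlog -v prints for row events."""
    statements = []
    current = []
    for line in mysqlbinlog_output.splitlines():
        if not line.startswith("###"):
            continue  # base64 BINLOG payload, headers and other comments
        text = line[3:].strip()
        if not text:
            continue
        # A new pseudo-statement starts with the DML verb.
        if text.split(" ", 1)[0] in ("INSERT", "UPDATE", "DELETE") and current:
            statements.append(" ".join(current))
            current = []
        current.append(text)
    if current:
        statements.append(" ".join(current))
    return statements
```

For the event shown on the slide this gives one entry: an UPDATE on `m2`.`t1` that looks for the row with @1=5, which is the row the slave could not find.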
  • 177. ∙ Percona Toolkit ∙ pt-table-checksum Checks data consistency ∙ pt-table-sync Fixes data inconsistencies ∙ pt-slave-find Shows topology Toolkits 61
  • 181. ∙ MySQL Utilities ∙ mysqlrplcheck Checks if MySQL servers are ready to replicate ∙ mysqlrplshow Shows topology ∙ mysqlrplsync Checks data consistency ∙ mysqlslavetrx Skips 1-N transactions Toolkits 61
  • 184. ∙ MySQL Utilities ∙ mysqldbcompare Compares two databases MariaDB-friendly ∙ mysqldiff Checks object definitions MariaDB-friendly ∙ mysqlserverinfo Shows main options, such as port and datadir Replication-oriented MariaDB-friendly Toolkits 61
  • 185. ∙ Error log ∙ Slave ∙ SHOW SLAVE STATUS ∙ Tables in Performance Schema ∙ Tables in mysql database ∙ Master ∙ SHOW MASTER STATUS ∙ SHOW BINLOG EVENTS ∙ mysqlbinlog ∙ mysql command line client Main Instruments: Summary 62