COMMON CLUSTER CONFIGURATION PITFALLS
Andrew Young, Technical Services Engineer
OVERVIEW
• Replication Review
• Replication Pitfalls
• Sharding Review
• Sharding Pitfalls
• Takeaways
• Questions
REPLICATION REVIEW
REPLICATION IMPROVES AVAILABILITY BY DUPLICATING DATA BETWEEN NODES
[Diagram: Primary, Secondary, and Secondary nodes, each holding a copy of {x:1, y:2}]
NODES VOTE TO SELECT A PRIMARY
VOTING REQUIRES A STRICT MAJORITY
[Diagram: Server 1 votes for Server 1, Server 2 votes for Server 1, Server 3 votes for Server 3. Winner: Server 1]
ARBITERS VOTE BUT DON’T STORE DATA
[Diagram: the Primary and Secondary each hold {x:1, y:2}; the Arbiter holds no data]
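For reference, an arbiter is added with the rs.addArb() shell helper; the hostname below is a placeholder:

// Run against the current primary. The arbiter gets a vote but holds no data.
rs.addArb("arbiter.example.net:27017")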
WRITE CONCERN
[Diagram: Primary, Secondary, Secondary; a write is acknowledged by one node at w: 1, two nodes at w: 2, all three at w: 3, or a majority of nodes at w: majority]
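As a minimal sketch (the collection name is hypothetical), a client can override the default write concern of w: 1 on a single operation:

// Wait until a majority of data-bearing members acknowledge the write;
// fail with an error if that takes longer than five seconds.
db.orders.insertOne(
  { x: 1, y: 2 },
  { writeConcern: { w: "majority", wtimeout: 5000 } }
)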
REPLICATION PITFALLS
BUY N LARGE
• Online retailer, USA only
• Requires high availability
• Report generation should not impact production performance
REPLICA SET DEPLOYMENT
[Diagram: DC1 hosts the Primary and an Arbiter; DC2 hosts a Secondary]
BUY N LARGE’S EXPECTED BEHAVIOR
[Diagram: DC1 (Primary, Arbiter) is down; the node in DC2 thinks: “I’m the only one left, so I should be primary!”]
ACTUAL BEHAVIOR
[Diagram: DC1 (Primary, Arbiter) is down; the node in DC2 remains a secondary: “I only received one vote out of three, so I should be a secondary!”]
BUY N LARGE’S APPLICATION ARCHITECTURE
[Diagram: a Load Balancer sends traffic to App Server 1 in DC1 and App Server 2 in DC2; DC1 hosts the Primary and Arbiter, DC2 hosts the Secondary]
NETWORK SPLIT
[Diagram: the link between DC1 and DC2 is severed; each app server can reach only the MongoDB node in its own data center, while the load balancer still reaches both app servers]
EXPECTED BEHAVIOR LEADS TO POSSIBLE DATA CORRUPTION
[Diagram: if both data centers had a primary, App Server 1’s write X = 1 and App Server 2’s conflicting write X = 2 would both be acknowledged]
ACTUAL BEHAVIOR PREVENTS POSSIBLE DATA CORRUPTION
[Diagram: the DC2 node stays a secondary, so App Server 1’s write X = 1 succeeds while App Server 2’s write X = 2 is rejected]
REPLICA SET DEPLOYMENT V2
[Diagram: Primary in DC1, Secondary in DC2, and the Arbiter moved to a small cloud instance that can see both data centers]
REPLICA SET DEPLOYMENT V2
[Diagram: the DC1 node is down for maintenance; with the cloud Arbiter’s vote, the node in DC2 has been elected Primary]
REPLICA SET DEPLOYMENT V3
[Diagram: Primary in DC1, Secondary in DC2, Secondary in DC3; the arbiter has been replaced with a data-bearing node]
REPLICA SET DEPLOYMENT V3
[Diagram: Primary in DC1; the Secondaries in DC2 and DC3 are under-provisioned and lag the primary by one hour]
REPLICA SET DEPLOYMENT V3
[Diagram: the DC1 node is down for maintenance; the under-provisioned Secondary in DC3, still an hour behind, has become Primary]
REPLICA SET DEPLOYMENT V3
[Diagram: the DC1 node rejoins as a Secondary and rolls back one hour of writes; DC2 is a Secondary and DC3 remains Primary]
REPLICA SET DEPLOYMENT V4
[Diagram: identically provisioned nodes: Primary in DC1, Secondary in DC2, Secondary in DC3]
REPLICA SET DEPLOYMENT V5
[Diagram: Primary in DC1 (operational traffic), Secondary in DC2 (high availability), Secondary in DC3 (HA), plus a hidden Secondary for reporting]
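A minimal sketch of what the V5 configuration might look like, assuming hypothetical hostnames. The reporting member is hidden and has priority 0 so it can never be elected primary; votes: 0 keeps the number of voting members odd:

// Sketch only; run against a fresh deployment (or adapt for rs.reconfig()).
rs.initiate({
  _id: "bnl",
  members: [
    { _id: 0, host: "dc1.example.net:27017" },   // operational primary
    { _id: 1, host: "dc2.example.net:27017" },   // HA secondary
    { _id: 2, host: "dc3.example.net:27017" },   // HA secondary
    { _id: 3, host: "dc3-reporting.example.net:27017",
      hidden: true, priority: 0, votes: 0 }      // reporting only, never primary
  ]
})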
LESSONS LEARNED - REPLICATION
• Be sure that the application’s write concern setting can still be fulfilled if one of the nodes crashes.
• Use arbiters very sparingly, if at all.
• The standard “hot spare” disaster recovery model requires manual intervention. For true HA, a network split or the loss of a data center should not leave the replica set in read-only mode.
• Design replica sets for high availability first, then add specialized nodes as needed.
SHARDING REVIEW
BUY N LARGE V2
• BnL has gone global.
• Traffic has increased, and the current replica set is starting to experience performance issues from the increased load.
REPLICATION DUPLICATES DATA
[Diagram: Primary, Secondary, and Secondary each hold the full key range A-Z]
SHARDING DIVIDES UP THE DATA
[Diagram: shard S0 holds ranges A-E and N-Q; S1 holds F-J and R-T; S2 holds K-M and U-Z]
Collections are sharded, not databases.
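As a quick sketch (database, collection, and key are hypothetical), sharding is enabled per database and then applied per collection:

// Enable sharding for the database, then shard a single collection by a key.
sh.enableSharding("bnl")
sh.shardCollection("bnl.products", { name: 1 })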
HIGH AVAILABILITY VS SCALABILITY
[Diagram: a replica set (Primary, Secondary, Secondary, each holding A-Z) next to a sharded cluster (S0: A-E, N-Q; S1: F-J, R-T; S2: K-M, U-Z)]
BUT WHAT ABOUT SECONDARY READS?
• Can improve performance in certain cases.
• May return stale or duplicate documents.
• Use with caution!
https://xkcd.com/386/
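For illustration only (collection and field names are hypothetical), a read preference routes a query to a secondary; remember the result may be stale:

// "secondaryPreferred" reads from a secondary when one is available.
db.products.find({ category: "toys" }).readPref("secondaryPreferred")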
SHARDING PITFALLS
SHARDED CLUSTER CONFIGURATION V1
• Shard key: { _id: "hashed" }
[Diagram: a config server replica set (CFG: primary plus two secondaries) and three shards (S0, S1, S2), each a replica set with a primary, two secondaries, and a hidden member]
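A sketch of the V1 setup, assuming a hypothetical namespace; a hashed key spreads writes evenly, but range queries on _id can no longer be targeted at a single shard:

sh.shardCollection("bnl.products", { _id: "hashed" })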
SHARDED QUERIES
[Diagram: a mongos routes a range query containing the shard key to a single shard; a query without the shard key is scattered to all shards (A-F, G-M, N-Z) and the results are gathered]
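To make the difference concrete (names are hypothetical, assuming a { category: 1 } shard key), explain() on each query shows which shards the mongos consults:

// Targeted: the filter includes the shard key, so one shard answers.
db.products.find({ category: "toys" }).explain()

// Scatter/gather: no shard key in the filter, so every shard is queried
// and the mongos merges the results.
db.products.find({ price: { $lt: 10 } }).explain()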
SHARDED CLUSTER CONFIGURATION V2
• Shard key: { category: 1 }
[Diagram: config server replica set (CFG) and three shards (S0, S1, S2), each with a primary, two secondaries, and a hidden member]
HOW DOES CHUNKING ACTUALLY WORK?
• Chunks are metadata.
• Chunks represent a key range.
[Diagram: S0 holds chunks A-E and N-Q; S1 holds F-J and R-T; S2 holds K-M and U-Z]
HOW DOES CHUNKING ACTUALLY WORK?
• The mongos instance controls chunk creation and splitting.
• After writing 1/5th of the maximum chunk size to a chunk, the mongos requests a chunk split.
[Diagram: on S1, the R-T chunk has been split into R-S and T]
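Not shown in the talk, but related: the maximum chunk size (64 MB by default) is stored in the config database and can be changed through a mongos; a sketch:

// Lower the maximum chunk size to 32 MB cluster-wide.
db.getSiblingDB("config").settings.updateOne(
  { _id: "chunksize" },
  { $set: { value: 32 } },
  { upsert: true }
)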
SHARD KEY SELECTION AND CARDINALITY
• If a key range can’t be split, it lacks cardinality.
[Diagram: on S1, the chunk holding the single key value S can’t be split any further]
SHARDED CLUSTER CONFIGURATION V3
• Shard key: { category: 1, sku: 1, _id: 1 }
[Diagram: config server replica set (CFG) and three shards (S0, S1, S2), each with a primary, two secondaries, and a hidden member]
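A sketch of V3 (namespace hypothetical); appending sku and _id restores cardinality, so even a single busy category can still be split into smaller chunks:

sh.shardCollection("bnl.products", { category: 1, sku: 1, _id: 1 })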
HOW DOES BALANCING ACTUALLY WORK?
• When one shard has too many chunks, the balancer executes a chunk move operation.
• Balancing is based on the number of chunks, not the size of the data.
[Diagram: S1 holds F-J, R, S, and T, more chunks than its peers, so a chunk will be moved off of it]
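For reference, the shell exposes helpers to inspect and control the balancer:

sh.getBalancerState()        // true if the balancer is enabled
sh.isBalancerRunning()       // true if a balancing round is in progress
sh.setBalancerState(false)   // disable balancing, e.g. for a maintenance window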
CHUNK MOVE PROCESS
1. Documents in the shard key range of the chunk are copied to the destination shard.
2. The chunk metadata is updated.
3. A delete command is queued on the source shard.
[Diagram: one of S1’s chunks is being moved to another shard]
WHAT ARE ORPHANED DOCUMENTS?
• Sometimes the same document exists in two places.
• Only primaries and mongos instances know about chunks.
• Reading from secondary nodes in a sharded system will return both copies of the document.
[Diagram: after a chunk move, the source shard’s queued delete hasn’t run yet, so both shards hold copies of the moved documents]
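Not on the slide, but the standard cleanup in this era is the cleanupOrphaned command, run directly against each shard's primary (namespace hypothetical); each call cleans one contiguous orphaned range, so it is typically repeated until no range remains:

// Run on a shard's primary, not through a mongos.
db.adminCommand({ cleanupOrphaned: "bnl.products" })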
EMPTY CHUNKS
• If documents are deleted from a chunk and not replaced, chunks can become empty.
• Empty chunks can cause data size imbalances between shards.
[Diagram: year-based chunks spread across shards: S0 holds 2012 and 2015, S1 holds 2013 and 2016, S2 holds 2014 and 2017]
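The speaker notes point at the remedy: adjacent chunks on the same shard can be combined with mergeChunks. A sketch, assuming a hypothetical year-based shard key and bounds:

// bounds span the full range of the chunks being merged;
// the chunks must be contiguous and live on the same shard.
db.adminCommand({
  mergeChunks: "bnl.sales",
  bounds: [ { year: 2012 }, { year: 2014 } ]
})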
SHARDED CLUSTER CONFIGURATION V4
• Shard key: { category: 1, sku: 1, _id: 1 }
[Diagram: config server replica set (CFG) and three shards (S0, S1, S2), each with a primary and two secondaries; the hidden members have been removed]
TAKEAWAYS
LESSONS LEARNED - REPLICATION
• Be sure that the application’s write concern setting can still be fulfilled if one of the nodes crashes.
• Use arbiters very sparingly, if at all.
• The standard “hot spare” disaster recovery model requires manual intervention. For true HA, a network split or the loss of a data center should not leave the replica set in read-only mode.
• Design replica sets for high availability first, then add specialized nodes as needed.
LESSONS LEARNED - SHARDING
• Replication is for HA; sharding is for scaling.
• Shard key selection is extremely important.
‒ Affects query performance
‒ Affects chunk balancing
‒ Cannot easily be changed later on
• Secondary reads on sharded clusters are highly discouraged.
‒ Orphaned documents will cause multiple versions of the same document to be returned.
• MongoDB 3.4 greatly improved both replication and sharding.
YOUR MILEAGE MAY VARY
https://xkcd.com/722/
QUESTIONS
Editor's Notes
  • #5: If the primary becomes unavailable, whether that is due to an outage or just regular maintenance, one of the secondaries can take over as primary.
  • #6: Here we see an example election. Since two of the three servers have voted for server 1, server 1 will become primary.
  • #8: Default write concern is w:1. The client can override the write concern, either by decreasing it to w:0 or by increasing it to another value.
  • #11: Buy N Large is accustomed to standard enterprise software deployments. As such, they are using industry-standard disaster recovery methodologies. They have two data centers: their primary data center (DC1) and a “hot spare” disaster recovery data center (DC2). They also set their write concern to “majority” based on our suggested practices.
  • #12: Their primary data center has now failed, leaving only the “hot spare” disaster recovery data center operational.
  • #13: Buy N Large expects that because there is only one node left, that one node will constitute a majority and will vote to make itself primary.
  • #14: However, the actual behavior is that the secondary remains a secondary, because it knows that there should be three nodes and thus one vote is not enough to become primary.
  • #15: To better understand why this is, let’s look at Buy N Large’s complete architecture.
  • #16: In this case, both application servers talk to the primary in DC1.
  • #17: Now let’s imagine that a network error prevents DC1 from seeing DC2. Each data center thinks the other is down, but in reality all of the servers are still operating normally.
  • #18: The application server in each data center can only see the MongoDB instance in its own data center. However, the load balancer still sees both data centers and still sends traffic to both application servers.
  • #19: If the secondary in DC2 were allowed to become primary because it can only see itself, then there would be an opportunity for the same data to be changed differently in each data center. When the network issue was resolved, the system would need to determine how to reconcile the conflicting updates.
  • #20: Because the voting algorithm that MongoDB uses does not allow the secondary in DC2 to become primary in this situation, this possible data corruption is prevented.
  • #21: To alleviate this problem, Buy N Large moved the arbiter to a small cloud instance where it could see both data centers.
  • #22: Later on, Buy N Large decided to perform routine maintenance on the primary node and shut it down.
  • #23: This time, the node in the second data center was successfully elected as primary thanks to the arbiter in the cloud. However, because write concern is set to “majority”, writes still fail to complete when one of the data bearing nodes is down.
  • #24: To prevent this from happening again in the future, Buy N Large switched out the arbiter for a data bearing node.
  • #25: However, after the system ran for a while under regular load, the secondaries – which were under-provisioned, as they were not considered to be as important as the primary – began to lag behind the primary.
  • #26: The next time the primary was rotated out for maintenance the secondary in DC3 became primary. Because it is an hour behind, that hour of data is no longer available to the application.
  • #27: When the node in DC1 comes back online as a secondary, it sees that it is an hour ahead of the primary and initiates a rollback. That hour of data is written out to disk so that it is not lost, but the documents in the database are replaced with those that are on the primary. Recovering the data requires a manual process.
  • #28: Buy N Large has re-provisioned their hardware to be the same in all data centers, but because any secondary might become primary, they don’t want to use one of the secondary nodes for reporting.
  • #29: The final replica set deployment includes a hidden secondary node for reporting purposes.
  • #36: Secondary reads also affect the ability of a replica set to continue functioning properly when the primary becomes unavailable. If each of the secondaries is already handling some percentage of the reads for the replica set, then the loss of the primary means that the reads it was serving will be spread across the remaining secondaries.
  • #38: Then they noticed that queries were taking a long time.
  • #40: Then they started seeing jumbo chunks that couldn't be migrated.
  • #43: Because the mongos only knows about the data it is writing, the 1/5th number is used as a heuristic to make an educated guess about when the chunk should be split. The heuristic assumes five mongos instances that are receiving equal traffic. If your system has a significantly larger number of mongos instances and they are all being used equally often, it is possible for this heuristic to break down and cause problems.
  • #44: BI reporting tools begin to see duplicate documents.
  • #46: The chunk move process has been improved in 3.4 with the addition of parallel chunk migrations, which make it more efficient to add more than one shard at a time when increasing capacity. 3.4 also adds support for intra-cluster compression, although it is turned off by default.
  • #47: There is currently a plan to make secondaries aware of chunk metadata. The difficulty with this is that the view that secondaries have of the chunk metadata must be accurate as of their most recent oplog entry. If a secondary is lagging behind a primary, it is important that the secondary’s chunk metadata matches this lag.
  • #48: In a system where documents are deleted in such a way that chunks become empty, a cluster may become unbalanced from a data size perspective even though it appears balanced from a chunk count perspective. This can happen, for instance, when the shard key contains a timestamp element and old documents are deleted after a certain number of days. If a shard contains empty chunks, the mergeChunks command can be used to merge those empty chunks with chunks that contain data.
  • #49: Here is the final configuration for Buy N Large’s sharded cluster. In this configuration, their BI tools must be taken into account when sizing the cluster because they use the same servers as their production systems. Another option might be to create a second BI cluster and copy data from production to that cluster.