Walking the Walk: Developing the MongoDB Backup Service with MongoDB

Steve Briskin
Engineer, Cloud Team, 10gen

Agenda
• Intro: The Project
• How the backup service was built
– Keeping State
– Storage of Oplog Documents
– De-duped Snapshot Storage
• Q&A

The Project
• Started in December 2011 – 1 person
• 3 Engineers + PM & Manager by June 2012
• Private Beta – September 2012
• Limited Release – April 2013
• 6 Engineers (and hiring) + PM & Manager – Now
• Agile Principles

Data Flow
[Diagram. Customer side: Sharded Cluster (Replica Sets 1-4) and Backup Agent. 10gen side: Backup Ingestion, Main DB, Block Store, Backup Daemon(s), BRS Daemon, and Reconstructed Replica Sets (RS 1-4).]
1. Configuration
2. Initial Sync
3. OpLog Data
4. Save Sync/Oplog Data
5. Reconstruct Replica Set
6. Persist Snapshot
7. Retrieve Snapshot
8. SCP Data Files

Keeping State – First Version
• One document per replica set being backed up
{
  _id : ObjectId("5194ecde036446e958b9df9b"),
  groupId : "Customer Group",
  replicaSet : "ReplSet Name",
  broken : false,
  workingOn : "Initial Sync",
  numOplogs : NumberInt(100),
  head : Timestamp(1370982242,1),
  lastOplog : Timestamp(1370982243,1),
  lastSnapshot : Timestamp(1370981940,1),
  machine : "backup1.10gen.com"
}

Keeping State – Current Version
• More fields, Nested Documents. Still No Joins.
{
  _id : ObjectId("5194ecde036446e958b9df9b"),
  groupId : "CustomerGroup",
  replicaSet : "ReplSetName",
  broken : false,
  workingOn : {…},
  head : {
    ts : Timestamp(1370982242,1),
    hash : 49238479326510
  },
  lastOplog : {
    ts : Timestamp(1370982243,1),
    hash : 93408342387492
  },
  numOplogs : NumberLong(9400),
  oplogNamespace : "CustomerGroup.oplogs_ReplSetName",
  lastSnapshot : Timestamp(1370981940,1),
  nextSnapshot : Timestamp(1371003540,1),
  schedule : {
    reference : 13709812343,
    rules : [{…},{…}]
  },
  machine : "backup1.10gen.com"
}
• Changes from the first version
– Simple Value -> Nested Document (workingOn, head, lastOplog)
– Integer -> Long (numOplogs)
– Complex, Nested Document (schedule)

Imitating a Secondary:
Capturing and storing the oplog

Capture Oplog
• Use replication oplog to capture activity
• Oplog is a Capped Collection – local.oplog.rs
– We can tail Capped Collections
• Strategy
– Tail the Oplog
– Read 10 MB of Data
– Compress and Send to 10gen
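
The batching step in the strategy above can be sketched roughly as follows. This is an illustration only, not the agent's actual code: the function name compressed_batches, the use of zlib, and the idea that entries arrive as already-serialized bytes are all assumptions. Tailing the oplog and sending the result are left out.

```python
import zlib
from typing import Iterable, Iterator

BATCH_BYTES = 10 * 1024 * 1024  # the 10 MB batch size from the slide


def compressed_batches(entries: Iterable[bytes],
                       batch_bytes: int = BATCH_BYTES) -> Iterator[bytes]:
    """Group serialized oplog entries into batches of about batch_bytes
    and yield each batch compressed. (Hypothetical helper; zlib is an
    assumed codec, the slides do not name one.)"""
    buffer = []
    buffered = 0
    for entry in entries:
        buffer.append(entry)
        buffered += len(entry)
        if buffered >= batch_bytes:
            yield zlib.compress(b"".join(buffer))
            buffer, buffered = [], 0
    if buffer:
        # Flush the final, partially filled batch.
        yield zlib.compress(b"".join(buffer))
```

In a real agent the entries iterable would be fed by a tailable cursor on local.oplog.rs, and each yielded batch would be sent to the ingestion endpoint.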

Store Oplog – First Version
• Single Capped Collection
• Pros
– Easy
• Cons
– Doesn’t scale!
– Customers will have an impact on each other

Store Oplog – Good Version
• DB per customer and Collection per replica set
• TTL Index for cleanup
• Pros
– Logical and Physical separation of customer data
– Can scale quickly and easily
– Configurable by end user
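
A minimal sketch of the layout above, assuming the namespace pattern seen in the state document (oplogNamespace : "CustomerGroup.oplogs_ReplSetName"). The helper names, the date field name "created", and the retention parameter are hypothetical; the slides do not say which field the TTL index covers. The collection argument is expected to behave like a pymongo collection.

```python
from typing import Tuple


def oplog_namespace(group: str, replica_set: str) -> Tuple[str, str]:
    """Return (database, collection): one DB per customer group,
    one collection per replica set."""
    return group, "oplogs_" + replica_set


def ensure_ttl_index(collection, retention_seconds: int,
                     field: str = "created"):
    """Ask MongoDB to expire documents retention_seconds after the date
    stored in `field`. `field` is an assumed name, not from the slides."""
    return collection.create_index([(field, 1)],
                                   expireAfterSeconds=retention_seconds)
```

Passing the retention period as a parameter is one way the cleanup could be made configurable per customer.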

Storage – First Version
• Archive and Compress MongoDB data files
• Scatter archives across machines
– Pros
• Fast and Easy
– Cons
• No Redundancy, Hard to Scale, Wastes Space
Machine 1: Snapshot_1.tar.gz, Snapshot_4.tar.gz
Machine 2: Snapshot_2.tar.gz, Snapshot_5.tar.gz
Machine 3: Snapshot_3.tar.gz, Snapshot_6.tar.gz

Goal 1: De-Duplicated Storage
• Observation
– Data change is low and localized
– Data is compressible
• Huge benefits in de-duplicating
[Figure: two consecutive 100GB snapshots and the space each takes once stored]
– Worst Case (0% de-dupe, no compression): 100GB + 100GB
– Best Case (100% de-dupe, 10x compression): 10GB + 0GB
– Typical Case (90% de-dupe, 3x compression): 33GB + 3GB
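
The figures above follow from simple arithmetic, sketched here under the assumption that the first snapshot has nothing to de-duplicate against and that de-dupe applies before compression. The function name stored_gb and its parameters are illustrative, not taken from the service.

```python
def stored_gb(raw_gb: float, dedupe_fraction: float,
              compression_ratio: float, first_snapshot: bool) -> float:
    """Estimate the stored size of one snapshot.

    dedupe_fraction is the share of blocks already present from earlier
    snapshots; it is ignored for the first snapshot, which has no
    predecessor to share blocks with.
    """
    new_data = raw_gb if first_snapshot else raw_gb * (1.0 - dedupe_fraction)
    return new_data / compression_ratio


# Typical case from the slide: 90% de-dupe, 3x compression.
first = stored_gb(100, 0.9, 3, first_snapshot=True)    # about 33 GB
second = stored_gb(100, 0.9, 3, first_snapshot=False)  # about 3 GB
```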

Goal 2: Redundancy and Scalability
• Require High Availability & Redundancy
– MongoDB Replication!
• Require Ability to Scale
– MongoDB Sharding!

Block Store
db_file.0
SHA-256 Hash = "de23425.."
Data = BinData[……]
SHA-256 Hash = "3af37.."
SHA-256 Hash = "e721ac.."

Block Store
• File reference

Block Store Internals
Files Collection
{
  _id : ObjectId("5194ece0036446e958b9dfa1"),
  filename : "db_file.0",
  size : NumberLong(786432),
  blocks : [
    {
      hash : "de2f256064….",
      size : 96
    },
    {
      hash : "47a9834f23….",
      size : 32121
    },
    ….
  ]
}
Blocks Collection
{
  _id : "de2f256064a0af797747c2b9755dcb9f3df0de4f489eac731c23ae9ca9cc31",
  bytes : BinData(0,"H4sIAAAAAAAAAO3BAQEAAACAkP6v7ggKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAauuOl9cAAAEA"),
  zippedSize : 96,
  size : 65536
}
– The hash of each entry in blocks (Files Collection) and the _id of each block (Blocks Collection) are SHA-256 hashes
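
A minimal in-memory sketch of how a file might be split into the two document shapes above. It assumes fixed-size 64 KB blocks (matching size : 65536), a hash taken over the uncompressed block, and zlib for compression; none of these choices is confirmed by the slides. store_file and the blocks dict are hypothetical stand-ins for the real collections. The size in each file reference mirrors the sample, where it equals the block's zippedSize.

```python
import hashlib
import zlib

BLOCK_SIZE = 65536  # assumed from `size : 65536` in the sample block


def store_file(filename, data, blocks, block_size=BLOCK_SIZE):
    """Split data into blocks, add unseen blocks to the `blocks` dict
    (keyed by SHA-256 hex digest), and return the file document."""
    refs = []
    for offset in range(0, len(data), block_size):
        chunk = data[offset:offset + block_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in blocks:
            # De-duplication: an identical block is stored only once.
            zipped = zlib.compress(chunk)
            blocks[digest] = {
                "_id": digest,
                "bytes": zipped,
                "zippedSize": len(zipped),
                "size": len(chunk),
            }
        refs.append({"hash": digest, "size": blocks[digest]["zippedSize"]})
    return {"filename": filename, "size": len(data), "blocks": refs}
```

Because the block's _id is its hash, storing the same block twice is a no-op, which is where the de-duplication comes from.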

Putting the file back together
• For each file
– For each block
• Retrieve block
• Uncompress

Block Store Garbage Collection
• 1st Attempt
– Reference counting
– Slow and non-parallelizable
• 2nd Attempt
– Mark and Sweep
– Parallelizable
– Requires more space
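
A minimal in-memory sketch of the mark-and-sweep approach, assuming file documents that list block hashes and a dict of blocks keyed by hash. The function name and data shapes are illustrative; the real collector runs against MongoDB collections.

```python
def mark_and_sweep(file_docs, blocks):
    """Delete blocks that no file references; return the removed hashes."""
    # Mark: collect every hash still referenced by some file. This set is
    # the extra space the approach needs. The marking could be split
    # across workers by file and the resulting sets merged.
    live = set()
    for doc in file_docs:
        for ref in doc["blocks"]:
            live.add(ref["hash"])
    # Sweep: drop every block whose hash was never marked.
    dead = [h for h in blocks if h not in live]
    for h in dead:
        del blocks[h]
    return dead
```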
