Cassandra Virtual Node talk

V is for vnodes
Patrick McFadin, Sr Solution Architect
DataStax

©2012 DataStax
Friday, February 15, 2013

Agenda for today
• What is a node?
• How vnodes work
• Converting your cluster
• Benefits

Since the beginning...
Cassandra has had...

Clusters, which have...

Keyspaces, which have...

Column Families, which have...

Row Keys

Unique in a column family
Can be up to 64 KB in size
Can be sorted in the cluster
Byte Ordered Partitioner

OR...

Can be randomly placed in cluster
Random Partitioner

Row Keys
How do you...

• Create a random number?
• Make sure the number is big enough?
• Make it reproducible?

MD5 does the job

Input a Row Key -> MD5 -> Get a 128-bit number

Row Keys
Input: @PatrickMcFadin -> MD5 -> Get: 0xcfc2d0610aaa712a8c36711d08a2550a

Input: 8675309 -> MD5 -> Get: 0x6cc0d36686e6a433aa76f96773852d35

The number produced is a range between:

0 and 2^128-1... but Cassandra uses 2^127-1

2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456

...otherwise known as a HUGE number.
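
As a rough illustration of the idea on these slides, the sketch below hashes a row key with MD5 and reads the digest as an unsigned 128-bit integer. The function name `md5_token` and the UTF-8 encoding of the key are assumptions made for the example. Cassandra's RandomPartitioner does its own conversion to fit the 0 to 2^127-1 range, and that step is not reproduced here.

```python
import hashlib

def md5_token(row_key: str) -> int:
    """Hash a row key with MD5 and return the digest as an unsigned 128-bit integer.

    Assumption: the key is encoded as UTF-8 before hashing.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()  # always 16 bytes
    return int.from_bytes(digest, byteorder="big")

token = md5_token("8675309")
print(hex(token))          # same key in, same number out, every time
print(token < 2**128)      # the number always fits in 128 bits
```

The same input always yields the same number, which is what makes the "random" placement reproducible.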

Token Assignment
• Each Cassandra node is assigned a token
• Each token is a number inside the huge range
• Tokens mark the ownership range of Row Keys

From: Token = 0

To: Token = 56713727820156410577229101238628035242

From:

To: Token = 113427455640312821154458202477256070484

Row Key to Token
Input: @PatrickMcFadin -> MD5 -> Get: 276161727147663567581939045564154008842

Token = 0

Token = 56713727820156410577229101238628035242 ("I'll take it!")

Token = 113427455640312821154458202477256070484
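
A minimal sketch of the lookup, assuming each node owns the tokens greater than the previous node's token up to and including its own, with anything past the largest token wrapping around to the smallest. That ownership convention and the function name `owner_token` are assumptions for illustration; the key token 42 is a made-up value, not one from the slides.

```python
from bisect import bisect_left

def owner_token(ring_tokens, key_token):
    """Return the token of the node that owns key_token.

    Assumption: a node owns the range (previous token, its own token],
    and keys beyond the largest token wrap around to the smallest one.
    """
    tokens = sorted(ring_tokens)
    index = bisect_left(tokens, key_token)  # first node token >= key_token
    return tokens[index % len(tokens)]      # wrap around past the end

# The three node tokens from the slide
ring = [
    0,
    56713727820156410577229101238628035242,
    113427455640312821154458202477256070484,
]
print(owner_token(ring, 42))
```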

Cassandra 1.1 Node
• Responsible for a single range of keys
• Range determined by single token
• One server = One token = One node

Cassandra 1.1 Node

Commodity node? What you really want.

Time for a new plan

• Hardware is only getting bigger
• One node is responsible for more data
• Token assignments are a pain

Token assignment (sucks)
• Tokens need to be evenly spread
• Growing a ring... not good options
• Shrinking a ring... not good options
• Tokens have to be added to each server config
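
To make the pain concrete, here is a sketch of the manual arithmetic, assuming tokens are spaced evenly across the 0 to 2^127-1 range used by the Random Partitioner. The function name `evenly_spaced_tokens` is hypothetical. For three nodes it yields the same three tokens shown on the Token Assignment slide.

```python
RING_SIZE = 2**127  # Random Partitioner token space, per the slides

def evenly_spaced_tokens(node_count):
    """Return one token per node, spaced evenly around the ring."""
    step = RING_SIZE // node_count
    return [i * step for i in range(node_count)]

three = evenly_spaced_tokens(3)
four = evenly_spaced_tokens(4)

# Growing from 3 to 4 nodes: which existing tokens no longer fit the even layout?
moved = [t for t in three if t not in four]

for t in three:
    print(t)
print(len(moved), "of", len(three), "existing tokens would have to move")
```

Adding a single server changes the ideal position of nearly every existing token, and each new value then has to be written into a server's config by hand.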

Enter Virtual Nodes
• One server should have many nodes
• Each node should be small
• Tokens should be automatic

Version 1.1, Server 1: one node owning range 1-4
Version 1.2, Server 1: four small nodes owning ranges 1, 2, 3 and 4

Virtual Node Features
• Default 256 Nodes per server
• Auto assign tokens
• Faster rebuilds of servers
• Faster server add to cluster
• New partitioner (More later)
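
A toy simulation of why many small nodes per server help, assuming each server simply picks `num_tokens` random tokens in the 0 to 2^127-1 range and owns the slice of the ring that ends at each of its tokens. The function name `simulate_ownership`, the fixed seed, and the ownership convention are assumptions for the sketch, not Cassandra's actual code.

```python
import random

RING_SIZE = 2**127

def simulate_ownership(server_count, num_tokens=256, seed=1):
    """Return the fraction of the ring owned by each server.

    Assumption: tokens are chosen uniformly at random, and a token owns
    the slice of the ring between the previous token and itself.
    """
    rng = random.Random(seed)
    ring = sorted(
        (rng.randrange(RING_SIZE), server)
        for server in range(server_count)
        for _ in range(num_tokens)
    )
    owned = [0] * server_count
    previous = ring[-1][0] - RING_SIZE  # wrap around from the last token
    for token, server in ring:
        owned[server] += token - previous
        previous = token
    return [size / RING_SIZE for size in owned]

print("1 token per server:   ", simulate_ownership(4, num_tokens=1))
print("256 tokens per server:", simulate_ownership(4, num_tokens=256))
```

With one random token per server the shares can be badly lopsided, which is why 1.1 tokens had to be calculated by hand. With 256 tokens per server the shares settle close to an even split without any manual assignment.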

Transitioning to vnodes
Super easy!

Find these lines in your cassandra.yaml file:

#num_tokens:

initial_token: <some big number>

Change to:
num_tokens: 256

initial_token:

and restart.
Repeat on all nodes in cluster
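
Since the same edit has to be repeated on every node, it could be scripted. The sketch below only transforms the text of a cassandra.yaml file as described on this slide; the function name `enable_vnodes` is hypothetical, and it assumes `num_tokens` and `initial_token` each appear at the start of their own line. Reading and writing the file, and restarting Cassandra, are left out.

```python
import re

def enable_vnodes(yaml_text, num_tokens=256):
    """Return yaml_text with num_tokens set and initial_token cleared.

    Assumption: each setting starts its own line, and num_tokens may
    be commented out with a leading '#'.
    """
    out = []
    for line in yaml_text.splitlines():
        if re.match(r"^\s*#?\s*num_tokens\s*:", line):
            out.append(f"num_tokens: {num_tokens}")  # uncomment and set
        elif re.match(r"^\s*initial_token\s*:", line):
            out.append("initial_token:")             # clear the old token
        else:
            out.append(line)                         # leave everything else alone
    return "\n".join(out) + "\n"

before = "#num_tokens:\ninitial_token: 56713727820156410577229101238628035242\n"
print(enable_vnodes(before))
```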

Transitioning to vnodes
After all Cassandra instances have been restarted

Initialize a shuffle operation

[patrick@cassandra0 ~]$ cassandra-shuffle create

Enable shuffling

[patrick@cassandra0 ~]$ cassandra-shuffle enable

List pending relocations*

[patrick@cassandra0 ~]$ cassandra-shuffle ls

Let’s walk through it...
*This is a slow op. Be patient.

Existing 1.1 cluster
Server 1: 1-4
Server 2: 5-8
Server 3: 9-12
Server 4: 13-16

Set num_tokens and restart
Server 1: 1-4, 1-4, 1-4, 1-4
Server 2: 5-8, 5-8, 5-8, 5-8
Server 3: 9-12, 9-12, 9-12, 9-12
Server 4: 13-16, 13-16, 13-16, 13-16

Set num_tokens and restart
Server 1: 1, 2, 3, 4
Server 2: 5, 6, 7, 8
Server 3: 9, 10, 11, 12
Server 4: 13, 14, 15, 16

Initialize and Enable shuffling...
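
The walk-through can be modelled with a toy example: each server starts with one contiguous range, the range is split into small vnode ranges that stay on the same server, and shuffling then scatters those small ranges across the cluster. The ranges 1 to 16 across four servers follow the numbering in the diagrams. The function names, the fixed seed, and the choice to keep exactly four ranges per server are assumptions of the sketch; this is not how cassandra-shuffle is implemented, only a picture of the end state.

```python
import random

def split_ranges(cluster):
    """Split each server's contiguous range into one-unit vnode ranges."""
    return {server: list(range(first, last + 1))
            for server, (first, last) in cluster.items()}

def shuffle_ranges(vnodes, seed=1):
    """Randomly redistribute vnode ranges, keeping the count per server."""
    rng = random.Random(seed)
    all_ranges = [r for ranges in vnodes.values() for r in ranges]
    rng.shuffle(all_ranges)
    shuffled = {}
    start = 0
    for server, ranges in vnodes.items():
        shuffled[server] = sorted(all_ranges[start:start + len(ranges)])
        start += len(ranges)
    return shuffled

cluster = {"Server 1": (1, 4), "Server 2": (5, 8),
           "Server 3": (9, 12), "Server 4": (13, 16)}
vnodes = split_ranges(cluster)       # after setting num_tokens and restarting
shuffled = shuffle_ranges(vnodes)    # after the shuffle completes
print(vnodes)
print(shuffled)
```

Before the shuffle each server still holds neighbouring ranges, so the data has not moved. Only after shuffling are the ranges spread around the cluster.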

Ops life with vnodes
• Add any number of nodes
• No token assignments!
• Bigger server? Larger num_tokens
• Decommission any number of nodes
• New nodetool command: status
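
A quick worked example for "Bigger server? Larger num_tokens", assuming tokens are placed at random so that a server's expected share of the data is proportional to its token count. The server names and the function name `expected_shares` are hypothetical.

```python
def expected_shares(num_tokens_by_server):
    """Return each server's expected fraction of the ring.

    Assumption: with randomly placed tokens, expected ownership is
    proportional to the number of tokens a server holds.
    """
    total = sum(num_tokens_by_server.values())
    return {server: count / total
            for server, count in num_tokens_by_server.items()}

shares = expected_shares({"small-1": 256, "small-2": 256, "big-1": 512})
for server, share in shares.items():
    print(f"{server}: {share:.0%}")
```

Doubling `num_tokens` on the larger machine gives it roughly twice the data of each smaller one.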

One more time now!

Bonus new thing
• New Partitioner: Murmur3Partitioner
• Murmur3 replaces MD5
• Slightly faster than MD5 in certain cases
• Go-forward partitioner for NEW clusters
• No need to convert existing clusters

More details here:
https://issues.apache.org/jira/browse/CASSANDRA-3772

In conclusion...

Go out and try some vnode love today!

Download Cassandra 1.2 now

http://www.datastax.com/download/community

http://cassandra.apache.org/download/

Some handy references

http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2

http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes

Follow me on Twitter for more: @PatrickMcFadin
