Cloning Twitter With
Redis
Dr. Fabio Fumarola
Motivation
• The programming community often assumes that key-value
stores cannot be used as a replacement for a relational
database.
• Here we show how a key-value layer is an effective
data model to implement many kinds of applications.
A Twitter Clone
• One of the most successful new Internet services of
recent times is Twitter.
• Since its launch it has exploded from niche usage to
usage by the general populace, with celebrities such
as Oprah Winfrey, Britney Spears, and Shaquille
O'Neal, and politicians such as Barack Obama and Al
Gore jumping into it.
Why Twitter?
• Simple: it does not care what you share, as long as it fits in
140 characters
• A means to have public conversation: Twitter allows a user
to tweet and have users respond using '@' reply, comment,
or re-tweet
• Fan versus friend
• Understanding user behavior
• Easy to share through text messaging
• Easy to access through multiple devices and applications
Main Features
• Allow users to post status updates (known as
'tweets' in Twitter) to the public.
• Allow users to follow and unfollow other users. Users
can follow any other user but it is not reciprocal.
• Allow users to send public messages directed to
particular users using the @ replies convention (in
Twitter this is known as mentions)
Main Features
• Allow users to send direct messages to other users.
A direct message is private to the sender and its single
recipient.
• Allow users to re-tweet or forward another user's
status in their own status update.
• Provide a public timeline where all statuses are
publicly available for viewing.
• Provide APIs to allow external applications access.
Redis CLI
• Set a key to a value
– SET foo bar
• Get the value of a key
– GET foo => bar
• Delete a key
– DEL foo
Redis CLI
• Increment a value for a key
– SET foo 10
– INCR foo => 11
– INCR foo => 12
– INCR foo => 13
• INCR is an atomic operation. Compare it with the
non-atomic sequence a client would otherwise have to run:
– x = GET foo
– x = x + 1
– SET foo x
Redis CLI
• The problem with this GET/SET sequence appears when
multiple clients update the same key at the same time:
– x = GET foo (yields 10)
– y = GET foo (yields 10)
– x = x + 1 (x is now 11)
– y = y + 1 (y is now 11)
– SET foo x (foo is now 11)
– SET foo y (foo is still 11: one of the two increments is lost)
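The lost update can be replayed with a minimal Python sketch. TinyStore is a hypothetical in-memory stand-in for a Redis server, not a real client API; its incr method holds one lock for the whole read-modify-write, which is an assumption used here to mimic the atomicity of INCR.

```python
import threading


class TinyStore:
    """Hypothetical in-memory stand-in for a Redis server (not a real client)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = str(value)

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def incr(self, key):
        # The whole read-modify-write runs under one lock, like INCR.
        with self._lock:
            value = int(self._data.get(key, "0")) + 1
            self._data[key] = str(value)
            return value


def interleaved_get_set(store):
    """Replay the interleaving: both clients read before either one writes."""
    store.set("foo", 10)
    x = int(store.get("foo"))  # client A reads 10
    y = int(store.get("foo"))  # client B reads 10
    store.set("foo", x + 1)    # client A writes 11
    store.set("foo", y + 1)    # client B writes 11, A's increment is lost
    return int(store.get("foo"))


def two_incrs(store):
    """Two atomic increments cannot overwrite each other."""
    store.set("foo", 10)
    store.incr("foo")
    store.incr("foo")
    return int(store.get("foo"))


print(interleaved_get_set(TinyStore()))  # 11
print(two_incrs(TinyStore()))            # 12
```

The interleaving is written out by hand so that the outcome is deterministic; with real concurrent clients the lost update only happens under unlucky timing, which is what makes it hard to spot.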
Redis CLI: LIST
• Beyond key-value stores: lists
– LPUSH mylist a (now mylist holds 'a')
– LPUSH mylist b (now mylist holds 'b','a')
– LPUSH mylist c (now mylist holds 'c','b','a')
• LPUSH means Left Push
• There is also the operation RPUSH
• This is very useful for our Twitter clone. User updates
can be added to a list stored in username:updates,
for instance.
Redis CLI: LIST
• LRANGE returns a range from the list
– LRANGE mylist 0 1 => c,b
– LRANGE mylist 0 -1 => c,b,a
• The last-index argument can be negative, with a
special meaning: -1 is the last element of the list, -2
the penultimate, and so on
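The index rules of LRANGE are easy to get wrong, because both ends are inclusive and negative values count from the tail. The following Python sketch mimics LPUSH and LRANGE on a plain list; the function names are illustrative and this is not how Redis is implemented.

```python
def lpush(items, value):
    """LPUSH: insert at the head (left) of the list, return the new length."""
    items.insert(0, value)
    return len(items)


def lrange(items, start, stop):
    """LRANGE: both indexes are inclusive; negative ones count from the tail."""
    n = len(items)
    if start < 0:
        start += n
    if stop < 0:
        stop += n
    start = max(start, 0)
    if start > stop or start >= n:
        return []
    return items[start:min(stop, n - 1) + 1]


mylist = []
for value in ("a", "b", "c"):
    lpush(mylist, value)

print(lrange(mylist, 0, 1))    # ['c', 'b']
print(lrange(mylist, 0, -1))   # ['c', 'b', 'a']
print(lrange(mylist, -2, -1))  # ['b', 'a']
```

Note the difference from Python slicing, where the stop index is exclusive: LRANGE mylist 0 1 returns two elements.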
Redis CLI: SET
• SADD adds a member to a set
• SREM removes a member from a set
• SINTER computes the intersection of sets
• SCARD returns the cardinality of a set
• SMEMBERS returns all the members of a set
Redis CLI: SET
– SADD myset a
– SADD myset b
– SADD myset foo
– SADD myset bar
– SCARD myset => 4
– SMEMBERS myset => bar,a,foo,b (sets are unordered, so the order may vary)
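For readers who know Python, the set commands map closely onto the built-in set type. The sketch below is only an analogy; the second set, called other, is a hypothetical example that does not appear in the slides.

```python
myset = set()
for member in ("a", "b", "foo", "bar"):
    myset.add(member)        # SADD myset <member>

myset.add("a")               # adding an existing member changes nothing
cardinality = len(myset)     # SCARD myset
members = sorted(myset)      # SMEMBERS myset (sorted here only for stable display)

other = {"foo", "baz"}       # hypothetical second set
common = myset & other       # SINTER myset other

myset.discard("bar")         # SREM myset bar

print(cardinality)  # 4
print(members)      # ['a', 'b', 'bar', 'foo']
print(common)       # {'foo'}
```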
Redis CLI: Sorted SET
• Sorted Set commands are prefixed with Z. The
following is an example of Sorted Sets usage:
– ZADD zset 10 a
– ZADD zset 5 b
– ZADD zset 12.55 c
– ZRANGE zset 0 -1 => b,a,c
• In the above example we added a few elements with
ZADD, and later retrieved the elements with ZRANGE
Redis CLI: Sorted SET
• The elements are returned in order according to
their score.
• In order to check if a given element exists, and also
to retrieve its score if it exists, we use the ZSCORE
command:
– ZSCORE zset a => 10
– ZSCORE zset non_existing_element => NULL
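A Sorted Set can be pictured as a mapping from member to score that is read back in score order. The class below is a minimal Python sketch under that assumption; TinySortedSet is a hypothetical name, and Redis itself uses a more efficient structure than re-sorting on every read.

```python
class TinySortedSet:
    """Hypothetical in-memory model of a Redis Sorted Set."""

    def __init__(self):
        self._scores = {}  # member -> score

    def zadd(self, score, member):
        # Adding an existing member simply updates its score.
        self._scores[member] = float(score)

    def zscore(self, member):
        # None plays the role of the NULL reply for a missing member.
        return self._scores.get(member)

    def zrange(self, start, stop):
        # Order by score, breaking ties by member name; indexes are inclusive.
        ordered = sorted(self._scores, key=lambda m: (self._scores[m], m))
        n = len(ordered)
        if start < 0:
            start += n
        if stop < 0:
            stop += n
        start = max(start, 0)
        if start > stop or start >= n:
            return []
        return ordered[start:min(stop, n - 1) + 1]


zset = TinySortedSet()
zset.zadd(10, "a")
zset.zadd(5, "b")
zset.zadd(12.55, "c")

print(zset.zrange(0, -1))                    # ['b', 'a', 'c']
print(zset.zscore("a"))                      # 10.0
print(zset.zscore("non_existing_element"))   # None
```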
Redis CLI: HASH
• Redis Hashes are much like Ruby hashes or Python
dictionaries: a collection of fields associated with values:
– HMSET myuser name Salvatore surname Sanfilippo country Italy
– HGET myuser surname => Sanfilippo
Data Layout
• When working with a relational database, a database
schema must be designed so that we'd know the
tables, indexes, and so on that the database will
contain.
• We don't have tables in Redis, so what do we need
to design?
• We need to identify which keys are needed to
represent our objects and what kind of values these
keys need to hold.
Users
• We need to represent users, of course, with their
username, user ID, password, the set of users following a
given user, the set of users a given user follows, and so on.
• The first question is: how should we identify a user?
• A solution is to associate a unique ID with every user.
• Every other reference to this user will be made by ID.
– INCR next_user_id => 1000
– HMSET user:1000 username antirez password p1pp0
Users
• Besides the fields already defined, we need a little
more in order to fully define a user.
• For example, it is sometimes useful to be able
to get the user ID from the username, so every time
we add a user, we also populate the users key,
which is a Hash, with the username as the field and the
ID as the value.
– HSET users antirez 1000
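The steps for creating a user (allocate an ID, store the user's fields, update the username index) can be sketched in Python as follows. UserStore is a hypothetical in-memory model, not a Redis client; the counter is seeded at 999 only so that the first ID matches the 1000 shown in the slides, and the password is stored as given purely to mirror the example.

```python
class UserStore:
    """Hypothetical in-memory model of the user keys described in the slides."""

    def __init__(self, first_id=1000):
        self.next_user_id = first_id - 1  # counter value before the first INCR
        self.user_hashes = {}             # "user:<id>" -> dict of fields
        self.users = {}                   # the "users" Hash: username -> id

    def create_user(self, username, password):
        self.next_user_id += 1            # INCR next_user_id
        user_id = self.next_user_id
        # HMSET user:<id> username <username> password <password>
        self.user_hashes[f"user:{user_id}"] = {
            "username": username,
            "password": password,
        }
        # HSET users <username> <id>: the hand-maintained reverse index.
        self.users[username] = user_id
        return user_id

    def id_for(self, username):
        return self.users.get(username)   # HGET users <username>


store = UserStore()
uid = store.create_user("antirez", "p1pp0")
print(uid)                                      # 1000
print(store.id_for("antirez"))                  # 1000
print(store.user_hashes["user:1000"]["username"])  # antirez
```

The users index has to be written by the application on every insert, because the store will not look up a key from a value on its own.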
Users
– HSET users antirez 1000
• We are only able to access data in a direct way,
without secondary indexes.
• It's not possible to tell Redis to return the key that
holds a specific value.
• This new paradigm is forcing us to organize data so
that everything is accessible by primary key, speaking
in relational DB terms.
Followers, following and updates
• A user might have users who follow them, whom
we'll call their followers.
• A user might follow other users, whom we'll call their
following.
• We have a perfect data structure for this: the
Sorted Set.
Followers, following and updates
• So let's define our keys:
– followers:1000 => Sorted Set of the uids of all users who follow user 1000
– following:1000 => Sorted Set of the uids of all users whom user 1000 follows
• We can add new followers with:
– ZADD followers:1000 1401267618 1234 => adds user 1234 with the time 1401267618 as its score
Followers, following and updates
• Another important thing we need is a place where we
can add the updates to display on the user's home
page.
• We'll need to access this data in chronological order
later
• Basically every new update will be LPUSHed in the
user updates key, and thanks to LRANGE, we can
implement pagination and so on.
Followers, following and updates
• Note, we use the words updates and posts
interchangeably, since updates are actually "little
posts" in some way.
– posts:1000 => a List of post ids - every new post is
LPUSHed here.
• This list is basically the User timeline.
• We'll push the IDs of the user's own posts, and the IDs
of all the posts created by the users she or he follows.
Basically we implement a write fanout.
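Write fanout means that the work is done at posting time: each new post ID is pushed onto the author's list and onto the list of every follower, so reading a timeline is a single range read. The Python sketch below illustrates this under simplifying assumptions: Timelines is a hypothetical in-memory model, followers are kept in plain sets rather than Sorted Sets, and the post_bodies mapping and next_post_id counter are illustrative storage that the slides do not specify.

```python
from collections import defaultdict


class Timelines:
    """Hypothetical in-memory model of the posts:<uid> lists."""

    def __init__(self):
        self.next_post_id = 0
        self.post_bodies = {}               # post id -> text (assumed storage)
        self.followers = defaultdict(set)   # uid -> uids of that user's followers
        self.posts = defaultdict(list)      # uid -> post ids, newest first

    def post(self, author_id, text):
        self.next_post_id += 1
        post_id = self.next_post_id
        self.post_bodies[post_id] = text
        # Fan out on write: one LPUSH per timeline that should show the post.
        for uid in [author_id, *sorted(self.followers[author_id])]:
            self.posts[uid].insert(0, post_id)   # LPUSH posts:<uid> <post_id>
        return post_id

    def page(self, uid, page, per_page=10):
        # Equivalent to LRANGE posts:<uid> start (start + per_page - 1).
        start = page * per_page
        return self.posts[uid][start:start + per_page]


timelines = Timelines()
timelines.followers[5000].add(1000)      # user 1000 follows user 5000
first = timelines.post(5000, "hello")
second = timelines.post(5000, "again")
own = timelines.post(1000, "my own update")

print(timelines.page(1000, 0, per_page=2))  # [3, 2]
print(timelines.page(1000, 1, per_page=2))  # [1]
print(timelines.page(5000, 0))              # [2, 1]
```

The trade-off is that a post by a user with many followers costs one write per follower, in exchange for cheap reads.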
Following Users
• We need to create following / follower relationships.
– If user ID 1000 (antirez) wants to follow user ID 5000
(pippo), we need to create both a following and a follower
relationship.
• We just need two ZADD calls, each with a score (here the time of the follow):
– ZADD following:1000 1401267618 5000
– ZADD followers:5000 1401267618 1000
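Since ZADD takes a key, a score and a member, each side of the relationship is one call that records the other user's ID with the follow time as its score. A minimal Python sketch, assuming a hypothetical FollowGraph class that models each Sorted Set as a dictionary from member to score:

```python
import time


class FollowGraph:
    """Hypothetical in-memory model of the following:<uid> and followers:<uid> keys."""

    def __init__(self):
        self.keys = {}  # key name -> {member uid: score}

    def _zadd(self, key, score, member):
        self.keys.setdefault(key, {})[member] = score

    def follow(self, follower_id, followee_id, now=None):
        score = time.time() if now is None else now
        # Both directions must be written by the application.
        self._zadd(f"following:{follower_id}", score, followee_id)
        self._zadd(f"followers:{followee_id}", score, follower_id)

    def members(self, key):
        # Members ordered by score, i.e. by the time of the follow.
        scores = self.keys.get(key, {})
        return sorted(scores, key=lambda m: (scores[m], m))


graph = FollowGraph()
graph.follow(1000, 5000, now=1401267618)   # antirez follows pippo
graph.follow(1234, 5000, now=1401267700)   # a later follower (made-up time)

print(graph.members("following:1000"))  # [5000]
print(graph.members("followers:5000"))  # [1000, 1234]
```

Using the follow time as the score is what lets the application list followers in the order in which they arrived.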
Following Users
• Note the same pattern again and again.
• In theory with a relational database the list of
following and followers would be contained in a
single table with fields like following_id and
follower_id.
• With a key-value DB things are a bit different, since
we need to set both the "1000 is following 5000" and the
"5000 is followed by 1000" relations.
Making it horizontally scalable
• Our clone is extremely fast, without any kind of
cache.
• On a very slow and loaded server, an Apache
benchmark with 100 parallel clients issuing 100000
requests measured an average pageview time of 5
milliseconds.
• This means we can serve millions of users every day
with just a single Linux box.
Making it horizontally scalable
• However, you can't stay on a single server forever.
How do you scale a key-value store?
• Our clone does not perform any multi-key operations, so
making it scalable is simple:
1. you may use client-side sharding,
2. or a sharding proxy such as Twemproxy,
3. or Redis Cluster.
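Client-side sharding works here because every operation touches a single key, so the client only has to decide which server owns that key. A minimal Python sketch, assuming CRC32 modulo the number of servers as the mapping and three hypothetical server addresses (neither comes from the slides):

```python
import zlib

# Hypothetical server addresses, for illustration only.
SHARDS = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def shard_for(key, shard_count):
    """Map a key to a shard index with a hash that is stable across processes."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def server_for(key):
    """Return the address of the server that owns this key."""
    return SHARDS[shard_for(key, len(SHARDS))]


for key in ("user:1000", "followers:1000", "posts:1000"):
    print(key, "->", server_for(key))
```

CRC32 is used instead of Python's built-in hash because the latter is randomized per process for strings. A plain modulo mapping also moves most keys when a server is added, which is why real deployments often prefer consistent hashing or fixed hash slots.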
Editor's Notes

  • #21: We use the next_user_id key in order to always get a unique ID for every new user. Then we use this unique ID to name the key holding a Hash with the user's data. This is a common design pattern with key-value stores! Keep it in mind.