Bitsy Graph Database

Sridhar Ramachandran
Founder, LambdaZen LLC
What is Bitsy?
● A small, fast, embeddable,
durable, in-memory graph
database.
● Maintains an on-disk copy of
the graph database.
● Designed for multi-threaded
OLTP applications.
● Provides ACID guarantees
and optimistic concurrency
control for transactions.
● Compatible with
Tinkerpop/Blueprints -- the
graph database standard.

[Figure: Tinkerpop software stack, from https://github.com/tinkerpop/blueprints/wiki]
In-memory and durable?
● Bitsy maintains a copy of the entire graph in in-memory
data structures.
● Bitsy saves all changes made to the database to disk
during a commit operation.
● Commits from different threads are forced to the disk at
once, thereby improving the write performance in a
multithreaded OLTP environment.
● The database is loaded from files during startup.
● All database files are append-only text files with
JSON-encoded vertices and edges.
● The database files are periodically compacted by a
background thread.
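The load-at-startup and compaction points above can be illustrated with a minimal sketch of an append-only, JSON-per-line log. This is not Bitsy's file format or code: the record layout (`id`, `deleted`), the file name, and the helper names `append_record`, `load_graph` and `compact` are all assumptions made for illustration.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one JSON-encoded record as a single line; old lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # force the change to disk before returning


def load_graph(path):
    """Replay the log from the start; the last record seen for an id wins."""
    vertices = {}
    if not os.path.exists(path):
        return vertices
    with open(path, encoding="utf-8") as f:
        for line in f:
            rec = json.loads(line)
            if rec.get("deleted"):
                vertices.pop(rec["id"], None)
            else:
                vertices[rec["id"]] = rec
    return vertices


def compact(path):
    """Rewrite the log so it holds only the latest record of each live id."""
    live = load_graph(path)
    tmp = path + ".compact"
    with open(tmp, "w", encoding="utf-8") as f:
        for rec in live.values():
            f.write(json.dumps(rec) + "\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # swap the compacted file in place of the old log


# Hypothetical data: two updates to v1, then v2 is created and deleted.
log_path = os.path.join(tempfile.mkdtemp(), "vertices.txt")
append_record(log_path, {"id": "v1", "name": "alice"})
append_record(log_path, {"id": "v1", "name": "alice2"})
append_record(log_path, {"id": "v2", "name": "bob"})
append_record(log_path, {"id": "v2", "deleted": True})
graph_before = load_graph(log_path)
compact(log_path)
graph_after = load_graph(log_path)
```

In this sketch the in-memory view is identical before and after compaction; only the file shrinks, because superseded and deleted records are dropped.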
Design Principle #1: No Seek
● Bitsy appends all changes to an
unordered transaction log, unlike
most databases which persist data in
B-Trees and other ordered
structures.
● Ordered data structures perform
multiple seeks per updated element.
● Seek operations on the hard-disk are
expensive (5-15 ms).
● Bitsy avoids seeks per element, and
addresses rotational latency by
combining commits from concurrent
transactions.

Hard disk head: Seek
operations require a
mechanical movement of
the hard disk head, which
takes 5-15 ms.
Rotational latency is the
time taken for the
requested sector on the
rotating platter to reach
the head, which takes 2-4 ms.
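A back-of-the-envelope model shows why combining commits matters. It assumes that every forced write to disk costs about one rotational latency (4 ms, the upper figure quoted above) and involves no seek; the batch size of 200 is a hypothetical number chosen only to make the arithmetic concrete, not a measured Bitsy value.

```python
def max_commits_per_second(flush_latency_ms, commits_per_flush):
    """Upper bound on the commit rate when every flush waits flush_latency_ms."""
    flushes_per_second = 1000.0 / flush_latency_ms
    return flushes_per_second * commits_per_flush


# Assumption: one forced write costs 4 ms of rotational latency and no seek.
one_at_a_time = max_commits_per_second(4.0, 1)  # each commit flushed alone
batched = max_commits_per_second(4.0, 200)      # 200 commits share one flush
```

Under these assumptions a disk that flushes each commit separately tops out at 250 commits per second, while sharing each flush among 200 concurrent commits raises the bound to 50,000.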
Design Principle #2: No Socket
● Typical databases run in a separate
process exposing a socket-based
protocol to applications.
● The cost of serializing and
deserializing the requests and
responses, and calling OS-level
functions, reduces the overall
throughput of the database.
● By avoiding a socket-based protocol
between the application and the
database, Bitsy can achieve
sub-microsecond query latencies.

The OSI model requires
serialization and
deserialization as the
packet crosses from one
layer to another
Design Principle #3: No SQL
● Tuning a SQL database is
a non-trivial task.
● The biggest factor in a
SQL query's efficiency is
its execution plan.
● By avoiding SQL and the
execution plans that come
with it, Bitsy ensures that
all queries and updates
are efficient*.

An example execution plan from Oracle's
documentation

* The "allow full-graph scan" option must be disabled to guarantee quick responses.
Concurrency Model
● Bitsy is designed to work in multi-threaded OLTP
environments.
● It implements optimistic concurrency control where
edges and vertices are tied to version numbers that are
incremented on updates.
● A BitsyRetryException is raised during a transaction
commit if an updated vertex/edge has a different
version at commit time than it had when it was queried.
● The application should retry the entire transaction in
case of conflict.
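The version check and the retry loop can be sketched generically as follows. This is an illustration of optimistic concurrency control, not Bitsy's implementation: `VersionedStore`, `RetryException` and `run_with_retry` are hypothetical names, and the conflict is simulated inside a single thread.

```python
import threading


class RetryException(Exception):
    """Hypothetical stand-in for a commit-time version conflict."""


class VersionedStore:
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (version, value)

    def read(self, key):
        with self._lock:
            return self._data.get(key, (0, None))

    def commit(self, key, read_version, new_value):
        # Check and update atomically: fail if someone committed since our read.
        with self._lock:
            current_version, _ = self._data.get(key, (0, None))
            if current_version != read_version:
                raise RetryException(key)
            self._data[key] = (current_version + 1, new_value)


def run_with_retry(transaction, max_attempts=5):
    """Re-run the whole transaction on conflict, up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction(attempt)
        except RetryException:
            if attempt == max_attempts:
                raise


store = VersionedStore()
store.commit("v1", 0, 10)  # first write: version becomes 1


def add_five(attempt):
    version, value = store.read("v1")
    if attempt == 1:
        # Simulate another thread committing between our read and our commit.
        store.commit("v1", version, value + 100)
    store.commit("v1", version, value + 5)
    return attempt


attempts_used = run_with_retry(add_five)
final_version, final_value = store.read("v1")
```

The first attempt fails because the version moved from 1 to 2 after the read; the second attempt re-reads the current state and succeeds, which is why the whole transaction, including its reads, has to be repeated.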
Write Algorithms
● The write algorithms operate on three levels of
"double buffers".
● The transaction buffers capture transactions to be
committed simultaneously.
● The commit waits for the buffer to flush to a
transaction file (A/B).
● Transaction files are moved to vertex and edge files on
exceeding a threshold size (default is 4MB).
● Vertex and edge files are reorganized after a period of
growth (default is +1x initial size).
● Online backups trigger a transaction flush, and then
copy the vertex and edge files representing the DB
snapshot.
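The slide does not spell out how a "double buffer" works, so the following is a generic sketch of one level of the idea, assuming a single flusher thread: writers fill one buffer while the other is being written out, and the two are swapped at each flush. The class name `DoubleBuffer` and the `flush_fn` callback are hypothetical.

```python
import threading


class DoubleBuffer:
    """Two buffers: writers fill one while the other is being flushed."""

    def __init__(self, flush_fn):
        self._lock = threading.Lock()
        self._active = []    # receives new items
        self._flushing = []  # drained by flush(); empty between flushes
        self._flush_fn = flush_fn

    def add(self, item):
        with self._lock:
            self._active.append(item)

    def flush(self):
        """Swap the buffers, then write out the full one.

        Assumes a single flusher thread. Writers are blocked only for the
        swap, not for the (slow) call to flush_fn.
        """
        with self._lock:
            self._active, self._flushing = self._flushing, self._active
        batch = self._flushing
        if batch:
            self._flush_fn(list(batch))  # one write covers the whole batch
            batch.clear()


# Usage: record each flushed batch in a list instead of writing to a file.
flushed_batches = []
buf = DoubleBuffer(flushed_batches.append)
buf.add("tx1")
buf.add("tx2")
buf.flush()  # tx1 and tx2 go out together
buf.add("tx3")
buf.flush()
buf.flush()  # nothing pending, so nothing is written
```

Items added while a flush is in progress land in the other buffer and go out with the next flush, which is how several transactions end up sharing one write.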
Write throughput in an OLTP setting
● The plot below shows the throughput of a test application* that repeatedly
commits a small transaction (1 vertex + 1 edge) from multiple threads.
● The throughput exceeds 50K ops/second at 750 concurrent threads.
● The comparison with Neo4J 1.9.2 illustrates the benefit of "No Seek".

* Tests performed on a $600 HP p7-1287c desktop PC with a single 7200 rpm hard disk.
Read throughput in an OLTP setting
● The plot below shows the read throughput of threads repeatedly traversing
separate portions of the graph on a desktop PC*.
● Bitsy implements mostly lock-free read algorithms that can perform close
to 20M ops/second at 1000 threads -- on par with Neo4J's warm caches.

* Tests performed on a $600 HP p7-1287c desktop PC with 4 cores.
Monitoring and Management
● Offline backup and
restore operations are
simple file copy
operations on the
database directory.
● Bitsy exposes a JMX
interface to make online
backups, and adjust
database parameters.
● Bitsy logs messages
using the SLF4J API with
logger names starting
with "com.lambdazen".

Online backup through jconsole
Dependencies
● Blueprints Core
● Jackson JSON Processor
● SLF4J API
● Ness Computing Core Component: For fast UUID
serialization/deserialization
License
● Bitsy is a dual-licensed product.
● The AGPL v3 license can be used for open-source
projects and internally-used closed-source projects.
● The commercial license is an extremely liberal license
that provides rights to modify and use Bitsy in an
unlimited number of instances, products* and services.

Pricing details with a 15% promotional discount (till Feb 2014):
● Startups and small businesses (1-10 employees):
$425 annual, $1699 perpetual
● Medium-sized enterprises (10-500 employees):
$849 annual, $3399 perpetual
● Large-sized enterprises (500+ employees):
$1275 annual, $5099 perpetual

* The products must not encourage the direct use of Bitsy APIs.
Wrap-up
● Bitsy is a small, fast, embeddable, durable, in-memory
graph database, with the following features:
○ ACID guarantees and clean recovery from crashes
○ Query latency in sub-microseconds
○ High transaction throughput in an OLTP setting with multiple
clients/threads accessing the database
○ Well-defined optimistic concurrency model
○ Support for online backups
○ Human-readable database files
○ Small code footprint (~1.5MB with dependencies)
● Bitsy is dual-licensed under AGPL and a liberal
commercial license for unlimited enterprise-wide use.
Questions and Feedback
● The project is hosted at https://bitbucket.org/lambdazen/bitsy
with publicly accessible
○ Documentation and install instructions (in Wiki)
○ Links to downloads
○ Issue management

● Please email your questions and feedback to
bisty@lambdazen.com
