Relational (RDBMS) to NoSQL Migration
Ankit Patel | DataStax | Principal Strategy Architect
2 © 2020 Datastax, Inc. All rights reserved.
“We cannot solve our problems
with the same thinking we
used when we created them.”
- Albert Einstein
The Digital Era - The Need to Modernize
Digital Data-Driven AI Enabled
The Modern Era
SAD (Silos Affect Delivery): Speed of Data Matters!
• Data access
• Legacy processes
• Lack of data analytical skills
• Resistance to change
Source: https://www.pinterest.com/pin/573716440029920090/
NoSQL - The Future
What is a NoSQL (Not-only-SQL) Database?
• Non-relational database: supports accessing data in forms other than Structured Query Language (SQL)
• Designed for cloud applications that need to handle massive amounts of data in real time
• Provides the ability to overcome scale, performance, data storage, data model, and data distribution limitations
NoSQL vs RDBMS
 | When to use NoSQL? | When to use RDBMS?
Applications | Decentralized (scalable) microservice applications | Centralized monolithic applications
Availability | 100% availability, zero-downtime | Moderate to high
Data | Low latency structured/semi/unstructured data @ high velocity | Structured data @ moderate velocity & latency
Transactions | Simple transactions & queries | Complex nested transactions & joins
Scalability (Reads/Writes) | Horizontal (Linear) scaling | Vertical scaling
Cassandra: The Best NoSQL Database of Choice
Active-everywhere, masterless, scales linearly
Best NoSQL database for cloud-native and microservices
#1 choice of world’s largest consumer internet applications
Zero Lock-in | Global Scale | Zero Downtime
If you use a website or a smartphone today, you’re touching a Cassandra backend system.
Source: https://sdtimes.com/data/apache-cassandra-4-0-beta-now-available/
Cassandra: Cloud Native NoSQL Database
Why?
With Cassandra's masterless architecture, the ability to achieve 100% uptime across on-prem, single-cloud, hybrid, and/or multi-cloud deployments is built into the technology.
Experiences, Microservices
& Insights
ON PREM
Cassandra: What is CQL?
● CQL – Cassandra Query Language
● Syntax is similar to SQL
● Standard way to communicate with a DSE C* cluster for reading/writing data
● Feature-rich language that allows you to manage the cluster (managing schema/permissions, managing roles, JSON support, UDF/UDA support…)
● Example read: SELECT * FROM keyspace.table WHERE partition_key = <value>;
● Example write: INSERT INTO keyspace.table (partition_key, clustering_key, value1) VALUES ('A', 'B', 'C');
Cassandra: What is a Keyspace?
● Similar to a schema in an RDBMS
● Container for multiple tables
● Replication strategy is set at the keyspace level (examples: SimpleStrategy, NetworkTopologyStrategy)
● Replication factor is defined at the keyspace level
● DURABLE_WRITES is set at the keyspace level. Setting it to false bypasses the commit log.
● Example to create a keyspace:
CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '1'} AND durable_writes = true;
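The keyspace options above can be sketched as a small helper that formats a CREATE KEYSPACE statement from a per-datacenter replication map. This is a minimal illustration only: `create_keyspace_cql` is a hypothetical name, it does not talk to a cluster, and the datacenter names and replication factors are assumptions.

```python
def create_keyspace_cql(name, replication_per_dc, durable_writes=True):
    """Format a CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    replication_per_dc maps a datacenter name to its replication factor.
    Hypothetical helper: it only builds text and never contacts a cluster.
    """
    if not replication_per_dc:
        raise ValueError("at least one datacenter is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in replication_per_dc.items():
        if rf < 1:
            raise ValueError(f"replication factor for {dc} must be >= 1")
        options.append(f"'{dc}': '{rf}'")
    return (
        f"CREATE KEYSPACE {name} WITH replication = {{{', '.join(options)}}} "
        f"AND durable_writes = {str(durable_writes).lower()};"
    )


# Assumed example: two datacenters, three copies of the data in each.
statement = create_keyspace_cql("test", {"DC1": 3, "DC2": 3})
print(statement)
```

Keeping the replication map as data makes it easy to see that the replication factor is chosen per datacenter and applies to every table in the keyspace.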
Cassandra: What is a Table?
● Similar to an RDBMS table
● Contains a primary key
● Always has a partition key as part of the primary key
● Optionally can define a clustering key (ordering can be defined)
● Both the partition key and the clustering key can be composed of multiple columns
● A number of parameters can be adjusted at the table level (compaction, compression, gc_grace_seconds, time to live, etc.)
Cassandra: Example Create Table
CREATE TABLE test.sample_table (
    par_key1 uuid,
    par_key2 uuid,
    clust_key1 timestamp,
    clust_key2 int,
    value1 text,
    value2 double,
    -- (par_key1, par_key2) is the composite partition key;
    -- clust_key1 and clust_key2 are the clustering columns
    PRIMARY KEY ((par_key1, par_key2), clust_key1, clust_key2)
) WITH CLUSTERING ORDER BY (clust_key1 DESC, clust_key2 ASC);
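To make the roles of the two key parts concrete, here is a rough in-memory model of the example table. It is a sketch under stated assumptions, not how Cassandra stores data: `SampleTable` is a hypothetical class, timestamps are plain integers, and only `value1` is kept. Rows that share a partition key live together, and within a partition they come back ordered by `clust_key1` descending, then `clust_key2` ascending.

```python
from collections import defaultdict


class SampleTable:
    """Toy model of test.sample_table: partitions keyed by the composite
    partition key, rows inside a partition keyed by the clustering columns."""

    def __init__(self):
        self.partitions = defaultdict(dict)

    def insert(self, par_key1, par_key2, clust_key1, clust_key2, value1):
        # Writing the same full primary key again overwrites the row (upsert).
        self.partitions[(par_key1, par_key2)][(clust_key1, clust_key2)] = value1

    def select(self, par_key1, par_key2):
        # Reads address exactly one partition; rows are returned in the
        # clustering order clust_key1 DESC, clust_key2 ASC.
        rows = self.partitions.get((par_key1, par_key2), {})
        ordered = sorted(rows.items(), key=lambda item: (-item[0][0], item[0][1]))
        return [(ck1, ck2, value) for (ck1, ck2), value in ordered]


table = SampleTable()
table.insert("a", "b", 100, 2, "x")
table.insert("a", "b", 200, 1, "y")
table.insert("a", "b", 100, 1, "z")
table.insert("c", "d", 50, 1, "w")   # different partition
table.insert("a", "b", 100, 2, "x2")  # same primary key: overwrites "x"
print(table.select("a", "b"))
```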
Cassandra: What is Replication Factor?
● Replication factor determines how many copies of your data are stored in the Cassandra cluster.
● Each copy is stored on a different node.
● Replication factor can be defined per datacenter that you’ve set up.
● This parameter is set at the keyspace level within the cluster.
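The idea that each copy lands on a different node can be sketched with a simplified token ring. The assumptions are loud ones: one token per node (no virtual nodes), a single datacenter, and MD5 as a stand-in hash, whereas Cassandra's default partitioner is Murmur3. `replicas_for` and the node names are hypothetical.

```python
import bisect
import hashlib


def token_of(value):
    """Map a string to an integer token (MD5 is only a stand-in hash)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


def replicas_for(partition_key, nodes, replication_factor):
    """Pick the node owning the key's token, then the next distinct nodes
    clockwise on the ring, until replication_factor copies are placed."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    ring = sorted((token_of(node), node) for node in nodes)
    tokens = [token for token, _ in ring]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect.bisect_left(tokens, token_of(partition_key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]


nodes = ["node1", "node2", "node3", "node4", "node5"]
replicas = replicas_for("patient-42", nodes, 3)
print(replicas)
```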
Cassandra: What is Consistency Level?
● This parameter is set by the client on individual queries.
● Combined with the replication factor, it helps you achieve the consistency requirement of the specific use case.
● Some of the possible values are:
○ ONE
○ LOCAL_ONE
○ QUORUM
○ EACH_QUORUM
○ LOCAL_QUORUM
○ ALL
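How consistency level and replication factor combine can be shown with a little arithmetic. The sketch below covers only ONE, QUORUM, and ALL in a single datacenter; the function names are hypothetical. A quorum is a majority of replicas, and a read is guaranteed to overlap the latest acknowledged write when the write and read acknowledgement counts together exceed the replication factor.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def acks_required(level, replication_factor):
    """Replica acknowledgements needed for a single-datacenter request."""
    required = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def read_overlaps_write(write_level, read_level, replication_factor):
    """True when every read replica set must include at least one replica
    that acknowledged the write (W + R > RF)."""
    writes = acks_required(write_level, replication_factor)
    reads = acks_required(read_level, replication_factor)
    return writes + reads > replication_factor


print(quorum(3))
print(read_overlaps_write("QUORUM", "QUORUM", 3))
print(read_overlaps_write("ONE", "ONE", 3))
```

With a replication factor of 3, QUORUM writes plus QUORUM reads overlap (2 + 2 > 3), while ONE plus ONE does not, which trades consistency for lower latency.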
Cassandra - Read/Write in Action
Replication - 3 per DC
Consistency - Per Read/Write Request from Client
Application - Active/Active Deployment across DC for Read/Write
[Diagram: one APP instance in each of three datacenters: ON-PREM, AWS, AZURE]
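For the three-datacenter layout on this slide, the per-request consistency choice determines how many replicas must answer and where. The following is a sketch that assumes a replication factor of 3 in each of the ON-PREM, AWS, and AZURE datacenters and models only three of the levels; `required_acks` is a hypothetical helper.

```python
def quorum(replicas):
    """Majority of a replica count."""
    return replicas // 2 + 1


def required_acks(level, rf_per_dc, local_dc):
    """Return (total acknowledgements, per-datacenter minimums).

    An empty per-datacenter dict means the acknowledgements may come from
    replicas in any datacenter.
    """
    if level == "LOCAL_QUORUM":
        need = quorum(rf_per_dc[local_dc])
        return need, {local_dc: need}
    if level == "EACH_QUORUM":
        per_dc = {dc: quorum(rf) for dc, rf in rf_per_dc.items()}
        return sum(per_dc.values()), per_dc
    if level == "QUORUM":
        return quorum(sum(rf_per_dc.values())), {}
    raise ValueError(f"level not modeled in this sketch: {level}")


rf_per_dc = {"ON-PREM": 3, "AWS": 3, "AZURE": 3}
for level in ("LOCAL_QUORUM", "QUORUM", "EACH_QUORUM"):
    print(level, required_acks(level, rf_per_dc, local_dc="AWS"))
```

LOCAL_QUORUM waits only on the application's own datacenter, which is one reason it is a common fit for active/active deployments where cross-datacenter latency matters.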
How can My Enterprise get from an RDBMS Based Design to Cassandra Based Architecture?
● Structured data is the norm for both
● Re-evaluate the need for ACID transactions with lightweight transactions (LWT) in Cassandra
● Take advantage of Cassandra performance
○ Move joins to the application stack
○ Denormalization and data duplication are efficient
○ Choose the type of index wisely based on latency/TPS requirements
● Thoroughly plan the data model in Cassandra
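The two performance points about joins and denormalization can be contrasted in a few lines. This is an in-memory illustration with made-up patient data, not driver code: the first function joins two normalized "tables" in the application, the second duplicates the patient name into each prescription row at write time so a single partition read answers the query.

```python
# Normalized, RDBMS-style data: two "tables" linked by patient_id.
patients = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
prescriptions = [
    {"patient_id": 1, "date": "2020-03-01", "drug": "amoxicillin"},
    {"patient_id": 2, "date": "2020-03-02", "drug": "ibuprofen"},
    {"patient_id": 1, "date": "2020-04-15", "drug": "ibuprofen"},
]


def join_in_application(patient_id):
    """Application-side join: read both tables, merge the rows in code."""
    patient = patients[patient_id]
    return [
        {"patient_name": patient["name"], "date": rx["date"], "drug": rx["drug"]}
        for rx in prescriptions
        if rx["patient_id"] == patient_id
    ]


def build_denormalized_table():
    """Duplicate the patient name into every prescription row, grouped by
    patient_id, so the read needs no join at all."""
    table = {}
    for rx in prescriptions:
        row = {
            "patient_name": patients[rx["patient_id"]]["name"],
            "date": rx["date"],
            "drug": rx["drug"],
        }
        table.setdefault(rx["patient_id"], []).append(row)
    return table


prescriptions_by_patient = build_denormalized_table()
print(join_in_application(1) == prescriptions_by_patient[1])
```

The denormalized form costs extra storage and extra work on write, in exchange for a read that touches one partition.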
ERD to Query Based
ERD Based Design Query Based Design
5 Steps to Query Based Design
● Application: Design a mental model of access patterns
○ Example: Medical History: Read Surgeries, Read Allergies, Read Health Conditions
○ Example: Doctor Visit: Read Notes, Read Prescriptions, Read Vitals
● Conceptual Model: Decide the application access patterns to various entities to deliver business functionality
○ Examples: Medical History Queries, Doctor Visit Queries
● Logical Model: Define the structure of the data elements based on query based design
○ Example: Read Prescriptions (patient, date, drug, dosage, etc.)
● Optimizations: Make optimizations to access the data
○ Example: Create index to Read Prescription by drug type or prescribing Doctor
● Physical Model: Build Cassandra table schema based on logical model & optimizations
○ Example: Table prescriptions with primary key patient, date and index on doctor & drug type
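The step from logical model to physical model can be sketched as a function that turns one access pattern into one table definition. Everything named here is an assumption for illustration: the `AccessPattern` class, the `clinic` keyspace, and the column names and types. The helper only formats CQL text. The second pattern uses a separate table per query, which is one common alternative to the index mentioned on the slide rather than the slide's own choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPattern:
    table: str           # table that will serve this query
    columns: dict        # column name -> CQL type
    partition_by: tuple  # columns the query filters on with equality
    order_by: tuple = () # (column, "ASC" | "DESC") pairs for result ordering


def table_ddl(keyspace, pattern):
    """Format a CREATE TABLE statement whose primary key mirrors the query."""
    key_columns = list(pattern.partition_by) + [col for col, _ in pattern.order_by]
    for col in key_columns:
        if col not in pattern.columns:
            raise ValueError(f"key column {col} is not declared")
    lines = [f"CREATE TABLE {keyspace}.{pattern.table} ("]
    lines += [f"    {name} {cql_type}," for name, cql_type in pattern.columns.items()]
    partition = ", ".join(pattern.partition_by)
    clustering = "".join(f", {col}" for col, _ in pattern.order_by)
    lines.append(f"    PRIMARY KEY (({partition}){clustering})")
    if pattern.order_by:
        ordering = ", ".join(f"{col} {direction}" for col, direction in pattern.order_by)
        lines.append(f") WITH CLUSTERING ORDER BY ({ordering});")
    else:
        lines.append(");")
    return "\n".join(lines)


columns = {"patient_id": "uuid", "visit_date": "date", "doctor_id": "uuid",
           "drug": "text", "dosage": "text"}
by_patient = AccessPattern("prescriptions_by_patient", columns,
                           ("patient_id",), (("visit_date", "DESC"),))
by_doctor = AccessPattern("prescriptions_by_doctor", columns,
                          ("doctor_id",), (("visit_date", "DESC"),))
ddl = table_ddl("clinic", by_patient)
print(ddl)
print(table_ddl("clinic", by_doctor))
```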
DataStax Enterprise: Cassandra Data Platform
Kubernetes Operator (Cloud-Native Automation + Elasticity)
Developer and DevOps APIs (K8S, CQL, REST, GraphQL, gRPC)
Operational Reliability (Advanced Performance, Enterprise Security, Monitoring)
AI-Scale Experiences, Microservices and Insights
Apache Cassandra NoSQL Database (100% Uptime, Zero-Lock-In, Global Scale)
TRUSTED
ACCELERATED
STRATEGIC
OUTCOMES
FOUNDATIONAL
Operational Analytics (Spark, Pipelines, Streaming)
Enhanced Search (Enhance Any Query)
Extensible Integration (Kafka, Elastic, Bulk Loading)
Graph Engine (Relate Data Across Partitions)
Multi-Model Data (All Data Styles)
Tools
Thought Leadership
Enterprise Support
Partnerships
OSS Commitment
DataStax Astra: Cassandra Made Easy in the Cloud
● Cassandra-as-a-Service: Cloud-native Database-as-a-Service built on Apache Cassandra
● No Operations: Eliminate the overhead to install, operate, and scale Cassandra
● Powerful APIs: Out-of-the-box REST and GraphQL endpoints and browser CQL shell
● Cloud Native: Powered by our open-source Kubernetes Operator for Cassandra
● Zero Lock-in: Deploy on AWS or GCP and keep compatibility with open-source Cassandra
● 10 Gig Free Tier: Launch a database in the cloud with a few clicks, no credit card required
Use Case #1 - C&S Wholesale Grocers - Supply Chain
● Delivers over 140,000 food and non-food items from over 50 warehouse locations
● Operates over 18 million square feet of storage
● Some of C&S’s customers are Safeway, Target, Stop & Shop
● Traditional solutions were slowing down distribution efficiency and impeding innovation
● Business growth leading to technology innovation
Use Case #1 - C&S - The Challenge
● Supply chain process ran in an RDBMS local to each warehouse
● Business needed to consolidate warehouse data for ease of management via a mobile app
● Transaction volumes were in the thousands every few seconds
● Needed a real-time view of all the working parts of the manufacturing operations: warehouse → locations → pallet
● Needed a data platform capable of operational analytics
Use Case #1 - C&S - Why Cassandra?
● Scalable
● High Transaction Volume
● Low Latency
● High Availability - Warehouse operations 24/7
● Ease of Development for Microservices & Mobile App
● Multi-DC Deployment Capability
● Ease of Operational Analytics
Use Case #1 - C&S - Business Benefits
● 5-year ROI projection shows multi-million savings
● Able to optimize management of consolidated warehouse operations
● Achieved remarkable efficiency in the data pipeline
● Transactions: thousands of reads/writes within seconds
● Supports 300+ users processing ~300k records in 5 minutes
Use Case #1 - C&S - The Architecture
C&S - Case Study
We needed an application that was entirely reliable and not vulnerable to unplanned outages because our warehouses are pretty much 24/7...
https://www.datastax.com/resources/case-study/cs-wholesale-achieving-seamless-supply-chain-mastery-datastax-enterprise
Use Case #2 - Financial Services - Mobile Banking
● Very competitive retail banking market
● Need to keep up with demand growth in digital banking
● Have high customer satisfaction rates
● Achieve efficient DR & Business Continuity Plans
Use Case #2 - Financial Services - The Challenge
● Transaction volume in the RDBMS was not easily scalable
● DR was not easy
● Achieving latency metrics became harder as volumes increased
● Downtime or poor experience would translate to customer churn
Use Case #2 - Financial Services - Why Cassandra?
● Deploy 3 DC Cluster
● Microservices Architecture
● Scale Application Stack w/ Database
● Achieve low latency SLA (<20ms on avg)
● DR Strategy was solid w/ High Availability
● Capable of processing billions of transactions per month
Some Other Common Use Cases
• Customer 360/SVOC
• Omnichannel & Global Payments
• IoT/Time Series/eCommerce Data (sensors, tick data, user interactions, shopping cart)
• Fraud Detection
• Online/Mobile Banking
• Inventory Management
• Recommendations (products & services)
• Regulatory Compliance
• Alerts & Monitoring (Credit card transactions)
• Global Payments
• Portfolio Management
• Loan Authorization
• Authentication (Mobile Logins)
Thank You!
Ankit Patel
Principal Strategy Architect @ DataStax
https://www.linkedin.com/in/ankit-p-patel
