Version 1.0
Getting Started with DataStax
Enterprise (DSE) on Docker
In Cassandra Lunch #75, we are going to look at getting
started with DataStax Enterprise on Docker.
Isaac Omolayo
Jr. Software Engineer
@Anant
Getting Started with DataStax Enterprise
● What is DataStax Enterprise?
● Packages and capabilities of DataStax Enterprise
● Using DSE Search to solve data problems
● Using DSE Analytics (Spark) to handle data workloads
● Using DataStax Enterprise Graph
● Running DataStax Enterprise packages on Docker
● Working with DSE Studio, DSE Search, DSE Analytics, DSE Graph
What is DataStax Enterprise?
● DataStax Enterprise helps enterprises build transformational data architectures for applications, microservices, and other use cases. The goals are data sovereignty, availability, scalability, agility, and accessibility by any user
● DataStax Enterprise (DSE) is built on Apache Cassandra
● DSE is billed as the world's most scalable database, known for continuous availability and low latency
● DSE can handle and manage massive data sets at planetary scale
● DataStax Enterprise is a cohesive data management platform
● You can handle different workloads for different use cases using the integrated DSE Graph, DSE Analytics, and DSE Search
Packages and capabilities of DataStax Enterprise
● There are different packages that come together to form the DataStax Enterprise ecosystem
○ DataStax OpsCenter
○ DataStax Studio
○ DataStax Enterprise
○ DataStax Enterprise Search
○ DataStax Enterprise Analytics with Spark
○ DataStax Enterprise Graph, etc.
DataStax Enterprise with Search
● DSE Search lets you find data quickly and provides a more flexible search experience for your users
● With DSE Search you can easily build features such as product catalogs, document repositories, and ad-hoc reporting engines
● Data is written to the database first, and the indexes are updated afterwards. You must create a search index on your data to enable search capabilities
● The benefits of running enterprise search functions through DataStax Enterprise and DSE Search include:
○ DSE Search is backed by a scalable database, and the connections and packages are fully integrated
○ A persistent store for search indexes
○ You can easily examine and aggregate data in real-time using CQL
○ Supports indexing and querying of advanced data types, including tuples and user-defined types (UDT)
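As a rough illustration of the write-then-index flow described above, the Python sketch below only builds the CQL strings one might run in cqlsh against a DSE Search node; it does not connect to a cluster. The keyspace `demo`, table `products`, and field `name` are hypothetical. `CREATE SEARCH INDEX` and the `solr_query` column are DSE Search CQL features, but the exact options should be checked against the documentation for your DSE version.

```python
def create_search_index_cql(keyspace: str, table: str) -> str:
    """Build the CQL statement that enables DSE Search on an existing table."""
    return f"CREATE SEARCH INDEX IF NOT EXISTS ON {keyspace}.{table};"


def search_query_cql(keyspace: str, table: str, solr_query: str) -> str:
    """Build a CQL SELECT that filters rows through the search index."""
    # Double any single quotes so the Solr expression stays a valid CQL string literal.
    escaped = solr_query.replace("'", "''")
    return f"SELECT * FROM {keyspace}.{table} WHERE solr_query = '{escaped}';"


if __name__ == "__main__":
    # Hypothetical keyspace, table, and field names, for illustration only.
    print(create_search_index_cql("demo", "products"))
    print(search_query_cql("demo", "products", "name:laptop"))
```

The index statement is issued once per table; after that, ordinary CQL writes are picked up by the index, and reads can go through `solr_query`.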
DataStax Enterprise with Analytics (Spark)
● DSE provides real-time and batch operational analytics through its integration with Apache Spark
● With DSE Analytics you can easily generate reports, target customers, and process real-time streams of data
● Take care when enabling both Search and Analytics on the same DSE node
● Provision sufficient memory and compute resources for the indexing, query, and processing workloads of the use case
● Spark is the default mode when you start an analytics node in a packaged installation. Spark runs locally on each node
DataStax Enterprise with Analytics (Spark)
● DSE Analytics includes integration with Apache Spark, the framework that supports the analytics applications. Use DSE Analytics to analyze huge databases
● Spark is a distributed computation engine designed for big data and in-memory processing
● Features of DSE Analytics
○ Spark Master management
○ Analytics without ETL
○ DataStax Enterprise file system (DSEFS)
○ DSE Analytics Solo
○ Integrated security
○ AlwaysOn SQL
DataStax Enterprise with Graph
● DSE Graph is built on top of Apache TinkerPop, Apache Cassandra, Apache Solr, and Apache Spark
● DSE Graph uses Apache TinkerPop standards for data and traversal while also using Apache Cassandra for scalable storage
and retrieval
● DSE Graph supports both transactional and analytic workloads, using two different engines
○ OLAP: Online analytical processing (OLAP) is typically used to perform multidimensional analysis of data
■ Complex calculations on aggregated historical data
○ OLTP: Online transactional processing (OLTP) is characterized by a large number of short, online transactions for
very fast query processing
■ OLTP is typically used for data entry and retrieval with transaction-oriented applications
■ OLTP queries are best for questions that require access to a limited subset of the entire graph
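To make the OLTP/OLAP distinction above concrete, here is a minimal plain-Python sketch on a hypothetical in-memory graph. It does not use DSE Graph or Gremlin; the vertex names and the adjacency dictionary are made up for illustration. The OLTP-style function touches a single vertex, while the OLAP-style function has to visit every vertex.

```python
# Hypothetical toy graph: each key is a person, each value lists who they know.
ADJACENCY = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": [],
    "dave": ["alice"],
}


def oltp_neighbors(adjacency, start):
    """OLTP-style question: one keyed lookup that touches a small part of the graph."""
    return list(adjacency[start])


def olap_out_degree(adjacency):
    """OLAP-style question: an aggregate computed over every vertex in the graph."""
    return {vertex: len(neighbors) for vertex, neighbors in adjacency.items()}


if __name__ == "__main__":
    print(oltp_neighbors(ADJACENCY, "alice"))
    print(olap_out_degree(ADJACENCY))
```

In a real deployment the first kind of question would typically be a short traversal from a known vertex, and the second would be handed to the analytics engine.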
DataStax Enterprise with Graph
● All the DataStax Enterprise components are integrated into DSE Graph to form a real-time graph database management system
● It includes built-in DSE Analytics and DSE Search functionality, visual management and monitoring, and development tools such as DataStax Studio
Running DataStax Enterprise packages on Docker
● Install Docker on your machine
● Pull the required DataStax Enterprise Docker images
● Set up DSE Search, DSE Analytics, and DSE Graph in Docker containers
● Open a shell in the Docker containers
● Create a table in Cassandra using CQL
● Create and access a search index on the table
● Transform the Cassandra table with Spark (Scala) using DSE Analytics
● Access the table in DataStax Studio
● Use the DSE Graph to query the data
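The first few steps above can be sketched as command lines. The Python snippet below only assembles the argument lists and prints them; it does not call Docker. It assumes the `datastax/dse-server` image, the `DS_LICENSE=accept` environment variable, and the `-s`/`-k`/`-g` workload flags as described in the datastax/docker-images repository listed under Resources; verify them for the image version you pull. The container name `my-dse` and the `<tag>` placeholder are hypothetical.

```python
# Flags passed after the image name to enable DSE workloads (assumed from the image docs).
WORKLOAD_FLAGS = {"search": "-s", "analytics": "-k", "graph": "-g"}


def dse_run_command(name, tag, workloads=()):
    """Return the `docker run` argument list for a single DSE node."""
    cmd = [
        "docker", "run", "-d",
        "--name", name,
        "-e", "DS_LICENSE=accept",  # the image expects the license to be accepted
        f"datastax/dse-server:{tag}",
    ]
    # Workload flags go after the image name so they reach the DSE start script.
    cmd += [WORKLOAD_FLAGS[w] for w in workloads]
    return cmd


def cqlsh_command(name):
    """Return the `docker exec` argument list that opens cqlsh inside the container."""
    return ["docker", "exec", "-it", name, "cqlsh"]


if __name__ == "__main__":
    # "<tag>" is a placeholder; substitute the image tag you actually pulled.
    print(" ".join(dse_run_command("my-dse", "<tag>", ["search", "analytics", "graph"])))
    print(" ".join(cqlsh_command("my-dse")))
```

Keeping the commands as lists makes them easy to hand to `subprocess.run` later without shell quoting issues.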
Demo
● https://github.com/yTek01/Getting-Started-with-DSE-on-Docker
Resources
● https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/newFeatures.html
● https://docs.datastax.com/en/dse/6.0/dse-dev/datastax_enterprise/dseGettingStarted.html
● https://docs.datastax.com/en/dse/6.0/dse-arch/datastax_enterprise/dbArch/archGraphSimilarDiff.html
● https://github.com/datastax/docker-images
● https://github.com/roberd13/Getting-Started-With-DSE-and-Docker
● https://docs.docker.com/engine/install/
Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM, CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037
