MongoDBEurope2016
Old Billingsgate, London
15th November
Use my code JD20 for 20% off tickets
mongodb.com/europe
Back to Basics 2016 : Webinar 4
Advanced Indexing –
Text and Geospatial Indexes
Joe Drumgoole
Director of Developer Advocacy, EMEA
@jdrumgoole
V1.1
3
Recap
• Webinar 1 – Introduction to NoSQL
– The different types of NoSQL databases
– What kind of database is MongoDB? A document database.
• Webinar 2 – My First Application
– Creating databases and collections
– CRUD operations
– Indexes and Explain
• Webinar 3 – Schema Design
– Dynamic schema
– Embedding approaches
– Examples
4
Indexing
• An efficient way to look up data by its value
• Avoids table scans
(Diagram: the keys 1 to 7 laid out in sorted order)
5
Traditional Databases Use Btrees
• … and so does MongoDB
6
Queries, Inserts, Deletes: O(log n) Time
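To make the O(log n) claim concrete, here is a minimal sketch in plain JavaScript. It uses a binary search over a sorted array rather than a real B-tree (an assumption made to keep the example short); the function name binarySearchSteps is hypothetical. A B-tree node holds many keys, so a real index is shallower still, but the growth is logarithmic in the same way.

```javascript
// Count how many probes a binary search needs to find a key in sorted data.
function binarySearchSteps(sortedKeys, target) {
  let low = 0;
  let high = sortedKeys.length - 1;
  let steps = 0;
  while (low <= high) {
    steps += 1;
    const mid = Math.floor((low + high) / 2);
    if (sortedKeys[mid] === target) return { found: true, steps };
    if (sortedKeys[mid] < target) low = mid + 1;
    else high = mid - 1;
  }
  return { found: false, steps };
}

// Seven keys need at most 3 probes; one million keys need at most 20.
const small = binarySearchSteps([1, 2, 3, 4, 5, 6, 7], 7);
const keys = Array.from({ length: 1000000 }, (_, i) => i + 1);
const large = binarySearchSteps(keys, 1000000);
console.log(small.steps, large.steps);
```

A full scan of the same million keys could need up to a million comparisons, which is the table scan an index avoids.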
7
Creating a Simple Index
db.coll.createIndex( { fieldName : <Direction> } )
• db : Database Name
• coll : Collection Name
• createIndex : Command
• fieldName : Field Name to be indexed
• <Direction> : 1 for Ascending, -1 for Descending
8
Two Other Kinds of Indexes
• Full Text Index
– Allows searching inside the text of a field (similar to Lucene, Solr and Elasticsearch)
• Geospatial Index
– Allows searching by location (e.g. people near me)
• These indexes do not use Btrees
9
Full Text Indexes
• An “inverted index” on all the words inside a single field (only one text index per collection)
{ "comment" : "I think your blog post is very interesting
and informative. I hope you will post more
info like this in the future" }
>> db.posts.createIndex( { "comment" : "text" } )
MongoDB Enterprise > db.posts.find( { $text: { $search : "info" }} )
{ "_id" : ObjectId("…"), "comment" : "I think your blog post is very
interesting and informative. I hope you will post more info like this
in the future" }
MongoDB Enterprise >
10
Results
MongoDB Enterprise > db.posts.getIndexes()
...
{
"v" : 1,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "comment_text",
"ns" : "test.posts",
"weights" : {
"comment" : 1
},
"default_language" : "english",
"language_override" : "language",
"textIndexVersion" : 3
}
11
Dropping Text Indexes
• Text indexes are dropped by name rather than by shape (key pattern)
db.posts.getIndexes()
{
"v" : 1,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "comment_text_tags_text",
"ns" : "test.posts",
"weights" : {
"comment" : 5,
"tags" : 10
},
"default_language" : "english",
"language_override" : "language",
"textIndexVersion" : 3
}
12
Hence
MongoDB Enterprise > db.posts.dropIndex( "comment_text_tags_text" )
{ "nIndexesWas" : 2, "ok" : 1 }
MongoDB Enterprise >
• You can give an index an explicit name to make this easier
MongoDB Enterprise > db.posts.createIndex( { "comment" : "text", "tags" :
"text" }, { "name" : "text_index" } )
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
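The default name dropped above, "comment_text_tags_text", follows MongoDB's convention of joining each indexed field with its direction or type using underscores. The sketch below reproduces that convention in plain JavaScript; defaultIndexName is a hypothetical helper written for illustration, not a shell function.

```javascript
// Derive the default index name from a key pattern:
// each field name and its direction/type, joined by underscores.
function defaultIndexName(keyPattern) {
  return Object.entries(keyPattern)
    .map(([field, kind]) => `${field}_${kind}`)
    .join("_");
}

console.log(defaultIndexName({ comment: "text" }));               // comment_text
console.log(defaultIndexName({ comment: "text", tags: "text" })); // comment_text_tags_text
console.log(defaultIndexName({ loc: "2dsphere" }));               // loc_2dsphere
```

Long key patterns produce long names, which is one reason to pass an explicit "name" option when creating the index.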
13
On The Server
I INDEX [conn275] build index on: test.posts properties: { v: 1, key:
{ _fts: "text", _ftsx: 1 }, name: "comment_text", ns: "test.posts",
weights: { comment: 1 }, default_language: "english",
language_override: "language", textIndexVersion: 3 }
I INDEX [conn275] building index using bulk method
I INDEX [conn275] build index done. scanned 3 total records. 0 secs
14
More Detailed Example
>> db.posts.insert( { "comment" : "Red yellow orange green" } )
>> db.posts.insert( { "comment" : "Pink purple blue" } )
>> db.posts.insert( { "comment" : "Red Pink" } )
>> db.posts.find( { "$text" : { "$search" : "Red" }} )
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>> db.posts.find( { "$text" : { "$search" : "Red Green" }} )
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
>> db.posts.find( { "$text" : { "$search" : "red" }} ) // <- Case insensitive
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>>
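The behaviour in this example (a multi-word search matches documents containing any of the words, and matching ignores case) can be sketched with a toy inverted index in plain JavaScript. This is an illustration under simplifying assumptions, not MongoDB's implementation: there is no stemming or stop-word handling, and the names tokenize, buildInvertedIndex and search are hypothetical.

```javascript
// Split text into lowercase word tokens.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Map each token to the set of document ids whose comment contains it.
function buildInvertedIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const token of tokenize(doc.comment)) {
      if (!index.has(token)) index.set(token, new Set());
      index.get(token).add(doc._id);
    }
  }
  return index;
}

// A multi-word query matches documents containing ANY of its terms.
function search(index, query) {
  const ids = new Set();
  for (const token of tokenize(query)) {
    for (const id of index.get(token) || []) ids.add(id);
  }
  return [...ids].sort((a, b) => a - b);
}

const posts = [
  { _id: 1, comment: "Red yellow orange green" },
  { _id: 2, comment: "Pink purple blue" },
  { _id: 3, comment: "Red Pink" },
];
const index = buildInvertedIndex(posts);
console.log(search(index, "Red"));       // [ 1, 3 ]
console.log(search(index, "Red Green")); // [ 1, 3 ]
console.log(search(index, "red"));       // [ 1, 3 ]
```

Looking up a word is a single map access followed by a union of id sets, which is why the search does not need to scan the text of every document.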
15
Using Weights
• We can assign different weights to different fields in the text index
• E.g. I want to favour tags over comments in searching
• So I increase the weight for the tags field
>> db.posts.createIndex( { comment: "text",
tags : "text" },
{ weights: { comment: 5,
tags : 10 }} )
• Now searches will favour tags
16
textScore
• Weights impact the textScore:
>> db.posts.find( { "$text" : { "$search" : "Red" }}, { score: {
$meta: "textScore" }} ).sort( { score: { $meta: "textScore" } } )
{ "_id" : …, "comment" : "hello", "tags" : "Red green orange", "score"
: 6.666666666666666 }
{ "_id" : …, "comment" : "Red Pink", "score" : 3.75 }
{ "_id" : …, "comment" : "Red yellow orange green", "score" : 3.125 }
>>
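The three scores above are consistent with a simple pattern: for a single occurrence of the search term, the score is the field weight multiplied by (0.5 + 0.5 / number of words in the field). The sketch below encodes that pattern in plain JavaScript. The formula is an assumption inferred from the numbers on this slide, and singleMatchScore is a hypothetical helper; the server's real scoring also deals with stemming, stop words and repeated terms.

```javascript
// Assumed model: one occurrence of the term in a field of `wordCount` words.
function singleMatchScore(weight, wordCount) {
  return weight * (0.5 + 0.5 / wordCount);
}

const scored = [
  // tags "Red green orange": weight 10, 3 words
  { text: "Red green orange", score: singleMatchScore(10, 3) },
  // comment "Red Pink": weight 5, 2 words
  { text: "Red Pink", score: singleMatchScore(5, 2) },
  // comment "Red yellow orange green": weight 5, 4 words
  { text: "Red yellow orange green", score: singleMatchScore(5, 4) },
];

// Sort highest score first, as the sort on textScore does.
scored.sort((a, b) => b.score - a.score);
for (const row of scored) console.log(row.text, row.score);
```

Under this model a match in the tags field outranks a match in a comment of similar length, which is the effect the weights were chosen to produce.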
17
Other Parameters
• Language : Pick the language you want to search in, e.g.
– $language : "spanish"
• Support case-sensitive searching
– $caseSensitive : true (default false)
• Support diacritic-sensitive searching (e.g. café is distinguished
from cafe)
– $diacriticSensitive : true (default false)
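These parameters sit inside the $text document, next to $search. The sketch below builds such a query document in plain JavaScript so its shape is easy to see; buildTextQuery is a hypothetical helper, and the resulting object is what would be passed to find().

```javascript
// Build a $text query document, adding only the options that were supplied.
function buildTextQuery(search, options = {}) {
  const text = { $search: search };
  if (options.language !== undefined) text.$language = options.language;
  if (options.caseSensitive !== undefined) text.$caseSensitive = options.caseSensitive;
  if (options.diacriticSensitive !== undefined) {
    text.$diacriticSensitive = options.diacriticSensitive;
  }
  return { $text: text };
}

const query = buildTextQuery("café", {
  language: "spanish",
  diacriticSensitive: true,
});
console.log(JSON.stringify(query));
// {"$text":{"$search":"café","$language":"spanish","$diacriticSensitive":true}}
```

Leaving an option out keeps the server default, which is false for both sensitivity flags.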
Geospatial Indexes
19
Geospatial Indexes
• MongoDB supports 2dsphere indexes
• Allows a user to represent locations on the earth (which is treated as a sphere)
• Coordinates are stored in GeoJSON format
• The geospatial index supports a subset of the GeoJSON operations
• The index is based on a QuadTree representation
• The index is based on the WGS 84 standard
20
Coordinates
• Coordinates are represented as longitude, latitude
• Longitude
– Measured from the Greenwich meridian in London (0 degrees); locations east
are positive (up to 180 degrees)
– Locations west are specified as negative values (down to -180 degrees)
• Latitude
– Measured from the equator north and south (0 to 90 north, 0 to -90 south)
• Coordinates in MongoDB are stored in Longitude/Latitude order
• Coordinates in Google are stored in Latitude/Longitude order
21
2dsphere Versions
• Three versions of the 2dsphere index in MongoDB
• Version 1 : Up to MongoDB 2.4
• Version 2 : From MongoDB 2.6 onwards
• Version 3 : From MongoDB 3.2 onwards
• We will only be talking about Version 3 in this webinar
22
Creating a 2dsphere Index
db.collection.createIndex( { <location field> : "2dsphere" } )
• The location field must contain coordinate pairs or GeoJSON data
23
Example
>> db.test.createIndex( { loc : "2dsphere" } )
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
24
Output
>> db.test.getIndexes()
[
{
"v" : 1,
"key" : {
"loc" : "2dsphere"
},
"name" : "loc_2dsphere",
"ns" : "geo.test",
"2dsphereIndexVersion" : 3
}
]
>>
25
Use a Simple Dataset to investigate Geo Queries
• Let's search for restaurants in Manhattan
• Using two candidate collections
– https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/neighborhoods.json
– https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/restaurants.json
• Import them into MongoDB
– mongoimport -c neighborhoods -d geo neighborhoods.json
– mongoimport -c restaurants -d geo restaurants.json
26
Neighborhood Document
MongoDB Enterprise > db.neighborhoods.findOne()
{
"_id" : ObjectId("55cb9c666c522cafdb053a1a"),
"geometry" : {
"coordinates" : [
[
[
-73.94193078816193,
40.70072523469547
],
...
[
-73.94409591260093,
40.69897295461309
]
]
],
"type" : "Polygon"
},
"name" : "Bedford"
}
27
Restaurant Document
MongoDB Enterprise > db.restaurants.findOne()
{
"_id" : ObjectId("55cba2476c522cafdb053adf"),
"location" : {
"coordinates" : [
-73.98241999999999,
40.579505
],
"type" : "Point"
},
"name" : "Riviera Caterer"
}
MongoDB Enterprise >
• You can type this into Google Maps, but remember to reverse the coordinate order
28
Add Indexes
MongoDB Enterprise > db.restaurants.createIndex({ location: "2dsphere" })
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
MongoDB Enterprise > db.neighborhoods.createIndex({ geometry: "2dsphere" })
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
MongoDB Enterprise >
29
Use $geoIntersects to find our Neighborhood
• Assume we are at -73.93414657, 40.82302903
• What neighborhood are we in? Use $geoIntersects
db.neighborhoods.findOne({ geometry:
{ $geoIntersects:
{ $geometry:
{ type: "Point",
coordinates:
[ -73.93414657,
40.82302903 ]}}}})
30
Results
{
"geometry" : {
"coordinates" : [
[
[
-73.9338307684026,
40.81959665747723
],
...
[
-73.93383000695911,
40.81949109558767
]
]
],
"type" : "Polygon"
},
"name" : "Central Harlem North-Polo Grounds"
}
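The idea behind a point-in-polygon lookup can be sketched with the classic ray-casting test in plain JavaScript. This is a flat-plane approximation written for illustration, not the spherical geometry the 2dsphere index uses; pointInRing is a hypothetical helper, and the square below is a made-up bounding box, not the real neighborhood boundary.

```javascript
// Ray casting: count how many polygon edges a horizontal ray from the point
// crosses. An odd count means the point is inside the ring.
// ring is an array of [longitude, latitude] positions.
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Hypothetical square around upper Manhattan (closed ring, first = last).
const box = [
  [-74.0, 40.0],
  [-73.0, 40.0],
  [-73.0, 41.0],
  [-74.0, 41.0],
  [-74.0, 40.0],
];
console.log(pointInRing([-73.93414657, 40.82302903], box)); // true
console.log(pointInRing([-72.5, 40.5], box));               // false
```

Without an index this test would have to run against every neighborhood polygon; the geospatial index narrows the candidates first.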
31
Find All Restaurants within 0.35 km
db.restaurants.find({ location:
{ $geoWithin: { $centerSphere:
[ [ -73.93414657, 40.82302903 ], 0.35 / 6378.1 ] }
} })
• 0.35 : Distance in km
• 6378.1 : Divide by the radius of the earth (in km) to convert to radians
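The radians conversion and the "within a radius" test can be checked by hand with the haversine formula. The sketch below is plain JavaScript under the assumption of a perfectly spherical earth of radius 6378.1 km; kmToRadians, centralAngle and withinCenterSphere are hypothetical helper names, and the two test points are the Domino's Pizza and Riviera Caterer coordinates shown in this deck.

```javascript
const EARTH_RADIUS_KM = 6378.1;

// Convert a distance in km to an angle in radians on the earth's surface.
function kmToRadians(km) {
  return km / EARTH_RADIUS_KM;
}

function toRad(degrees) {
  return (degrees * Math.PI) / 180;
}

// Haversine formula: angle in radians between two [longitude, latitude] points.
function centralAngle([lng1, lat1], [lng2, lat2]) {
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * Math.asin(Math.sqrt(a));
}

// True when the point lies inside the spherical cap around the center.
function withinCenterSphere(point, center, radiusRadians) {
  return centralAngle(point, center) <= radiusRadians;
}

const here = [-73.93414657, 40.82302903];
const dominos = [-73.93795159999999, 40.823376];
const riviera = [-73.98241999999999, 40.579505];
const radius = kmToRadians(0.35);

console.log(withinCenterSphere(dominos, here, radius)); // true
console.log(withinCenterSphere(riviera, here, radius)); // false
```

The Domino's Pizza location works out to roughly 0.32 km from the query point, which is why it falls inside a 0.35 km radius.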
32
Results – (Projected)
{ "name" : "Gotham Stadium Tennis Center Cafe" }
{ "name" : "Chuck E. Cheese'S" }
{ "name" : "Red Star Chinese Restaurant" }
{ "name" : "Tia Melli'S Latin Kitchen" }
{ "name" : "Domino'S Pizza" }
• Without projection
{ "_id" : ObjectId("55cba2476c522cafdb0550aa"),
"location" : { "coordinates" : [ -73.93795159999999, 40.823376 ],
"type" : "Point" },
"name" : "Domino'S Pizza" }
33
Summary of Operators
• $geoIntersects: Find areas or points that overlap or are
adjacent
• $geoWithin: Find areas or points that lie within a specific area
• $geoNear: Returns locations in order from nearest to furthest
away
34
Summary
• Text Indexes : Full text searching of all the text items in a
collection
• Geospatial Indexes : Search by location, by intersection or by
distance from a point
35
Q & A

Editor's Notes

  • #3: Who I am, how long have I been at MongoDB.
  • #6: Each item in a Btree node points to a sub-tree containing elements below its key value. Insertions require a read before a write. Writes that split nodes are expensive.
  • #7: Effectively the depth of the tree.
  • #22: Production release numbering.
  • #27: Visit Map to show location.
  • #28: Show Riviera on Google Maps.