Back to Basics. Webinar 4: Advanced Indexing, Text and Geospatial Indexes
MongoDBEurope2016
Old Billingsgate, London
15th November
Use my code rubenterceno20 for 20% off tickets
mongodb.com/europe
Back to Basics 2016
Advanced Indexing:
Text and Geospatial Indexes
Rubén Terceño
Senior Solutions Architect, EMEA
ruben@mongodb.com
@rubenTerceno
Course Agenda
Date           Time         Webinar
25-May-2016    16:00 CEST   Introduction to NoSQL
7-June-2016    16:00 CEST   Your first MongoDB application
21-June-2016   16:00 CEST   Document-oriented schema design
07-July-2016   16:00 CEST   Advanced indexing, text and geospatial indexes
19-July-2016   16:00 CEST   Introduction to the Aggregation Framework
28-July-2016   16:00 CEST   Production deployment
Summary of what we have covered so far
• Why does NoSQL exist?
• Types of NoSQL databases
• Key features of MongoDB
• Installing MongoDB and creating databases and collections
• CRUD operations
• Indexes and explain()
• Dynamic schema design
• Hierarchy and embedded documents
• Polymorphism
Indexing
• An efficient way to look up data by its value
• Avoids table scans
Traditional Databases Use B-trees
• … and so does MongoDB
O(log n) time
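To make the O(log n) claim concrete, here is a minimal sketch that counts comparisons for a full scan and for a binary search. It assumes a sorted in-memory array as a stand-in for the ordered keys of an index; `linearScan` and `binarySearch` are illustrative names, not MongoDB APIs, and a real B-tree is more elaborate than this.

```javascript
// Sketch only: a sorted array stands in for the ordered keys of an index.

// Collection scan: inspect every value until a match is found.
function linearScan(values, target) {
  let steps = 0;
  for (const v of values) {
    steps++;
    if (v === target) return { found: true, steps };
  }
  return { found: false, steps };
}

// Index-style lookup: halve the search range on every comparison.
function binarySearch(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  let steps = 0;
  while (lo <= hi) {
    steps++;
    const mid = Math.floor((lo + hi) / 2);
    if (sorted[mid] === target) return { found: true, steps };
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: false, steps };
}

// One million sorted keys: 1, 2, ..., 1000000.
const keys = Array.from({ length: 1000000 }, (_, i) => i + 1);
console.log("scan steps:", linearScan(keys, 1000000).steps);
console.log("search steps:", binarySearch(keys, 1000000).steps);
```

The scan needs as many steps as there are keys, while the search needs at most about log2(n) of them, which is why an index pays off as collections grow.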
Creating a Simple Index
db.coll.createIndex( { fieldName : <Direction> } )
• db : database name
• coll : collection name
• createIndex : command
• fieldName : name of the field to be indexed
• <Direction> : 1 for ascending, -1 for descending
Two Other Kinds of Indexes
• Full Text Index
• Allows searching inside the text of a field or several fields, ordering the
results by relevance.
• Geospatial Index
• Allows geospatial queries
• People around me.
• Countries I’m traversing during my trip.
• Restaurants in a given neighborhood.
• These are specialized index types, not plain B-tree indexes on the field's value
Full Text Indexes
• An “inverted index” on all the words inside text fields (only one text index per collection)
{ "comment" : "I think your blog post is very interesting
and informative. I hope you will post more
info like this in the future" }
>> db.posts.createIndex( { "comment" : "text" } )
MongoDB Enterprise > db.posts.find( { $text: { $search : "info" }} )
{ "_id" : ObjectId("…"), "comment" : "I think your blog post is very
interesting and informative. I hope you will post more info like this in
the future" }
MongoDB Enterprise >
On The Server
2016-07-07T09:48:48.605+0200 I INDEX [conn4] build index on:
indexes.products properties: { v: 1,
    key: { _fts: "text", _ftsx: 1 },
    name: "longDescription_text_shortDescription_text_name_text",
    ns: "indexes.products",
    weights: { longDescription: 1,
               name: 10,
               shortDescription: 3 },
    default_language: "english",
    language_override: "language",
    textIndexVersion: 3 }
More Detailed Example
>> db.posts.insert( { "comment" : "Red yellow orange green" } )
>> db.posts.insert( { "comment" : "Pink purple blue" } )
>> db.posts.insert( { "comment" : "Red Pink" } )
>> db.posts.find( { "$text" : { "$search" : "Red" }} )
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>> db.posts.find( { "$text" : { "$search" : "Pink Green" }} )
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
>> db.posts.find( { "$text" : { "$search" : "red" }} ) // <- case insensitive
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>>
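The behaviour in this example can be sketched with a toy inverted index in plain JavaScript. This is an illustration under simplifying assumptions, not how the server implements text indexes: `tokenize`, `buildInvertedIndex` and `search` are hypothetical helpers, array positions stand in for `_id` values, and stemming and stop-word removal are left out.

```javascript
// Split text into lowercase words so that matching is case insensitive.
function tokenize(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Map every term to the set of document ids that contain it.
function buildInvertedIndex(docs) {
  const index = new Map();
  docs.forEach((doc, id) => {
    for (const term of tokenize(doc.comment)) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(id);
    }
  });
  return index;
}

// A multi-word search matches documents containing ANY of the terms (OR).
function search(index, query) {
  const hits = new Set();
  for (const term of tokenize(query)) {
    for (const id of index.get(term) || []) hits.add(id);
  }
  return [...hits].sort((a, b) => a - b);
}

const posts = [
  { comment: "Red yellow orange green" },
  { comment: "Pink purple blue" },
  { comment: "Red Pink" },
];
const index = buildInvertedIndex(posts);
console.log(search(index, "red"));        // ids of posts containing "red"
console.log(search(index, "Pink Green")); // ids containing "pink" OR "green"
```

Because the lookup goes from term to documents, the cost of a search depends on the number of query terms and matches rather than on the total amount of text stored.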
Using Weights
• We can assign different weights to different fields in the text index
• E.g. I want to favour name over shortDescription in searching
• So I increase the weight for the name field
>> db.products.createIndex( { shortDescription: "text",
                              longDescription: "text",
                              name: "text" },
                            { weights: { shortDescription: 3,
                                         longDescription: 1,
                                         name: 10 }} )
• Now searches will favour name over shortDescription over longDescription
Sorting by textScore
• To return the most relevant matches first, project the text score and sort by it:
>> db.products.find( { $text : { $search: "humongous" } },
                     { score: { $meta : "textScore" }, name: 1,
                       longDescription: 1, shortDescription: 1 }
                   ).sort( { score: { $meta: "textScore" } } )
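The effect of weights on ranking can be illustrated with a deliberately simplified score: the number of matches in each field multiplied by that field's weight. This is an assumption made for the sketch and not MongoDB's actual textScore formula; `simpleScore` and the two sample products are hypothetical.

```javascript
// Weights taken from the slide's index definition.
const weights = { shortDescription: 3, longDescription: 1, name: 10 };

// Simplified score: (matches in field) x (field weight), summed over fields.
function simpleScore(doc, term, fieldWeights) {
  const needle = term.toLowerCase();
  let score = 0;
  for (const [field, weight] of Object.entries(fieldWeights)) {
    const words = String(doc[field] || "")
      .toLowerCase()
      .split(/\W+/)
      .filter(Boolean);
    score += weight * words.filter((w) => w === needle).length;
  }
  return score;
}

// Hypothetical sample data, invented for this sketch.
const products = [
  { name: "Plain mug", shortDescription: "A mug",
    longDescription: "A humongous amount of coffee fits in it" },
  { name: "Humongous mug", shortDescription: "A big mug",
    longDescription: "Holds a lot" },
];

// Highest score first, like sorting on { $meta: "textScore" }.
const ranked = products
  .map((p) => ({ name: p.name, score: simpleScore(p, "humongous", weights) }))
  .sort((a, b) => b.score - a.score);
console.log(ranked);
```

A match in name (weight 10) outranks a match in longDescription (weight 1), which is the ordering the weighted index is meant to produce.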
Other Parameters
• Language: pick the language to search in, e.g.
• $language : "spanish"
• Case-sensitive search
• $caseSensitive : true (default false)
• Diacritic-sensitive search (e.g. café is distinguished from cafe)
• $diacriticSensitive : true (default false)
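What the two sensitivity flags mean can be sketched with Unicode normalization in plain JavaScript. The helpers `fold` and `sameTerm` are hypothetical and only mimic the idea; the server's own folding rules depend on the text index version and are not reproduced here.

```javascript
// Reduce a term to the form used for comparison.
function fold(text, { caseSensitive = false, diacriticSensitive = false } = {}) {
  let out = text;
  if (!diacriticSensitive) {
    // NFD splits "é" into "e" plus a combining accent, which is then removed.
    out = out.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
  }
  if (!caseSensitive) out = out.toLowerCase();
  return out;
}

// Two terms match when their folded forms are identical.
function sameTerm(a, b, options) {
  return fold(a, options) === fold(b, options);
}

console.log(sameTerm("Café", "cafe"));                               // defaults
console.log(sameTerm("café", "cafe", { diacriticSensitive: true })); // accents kept
console.log(sameTerm("Cafe", "cafe", { caseSensitive: true }));      // case kept
```

With the defaults both differences are ignored; switching either flag on makes the corresponding difference significant again.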
Geospatial Indexes
• 2d
• Represents a flat surface. A good fit if:
• You have legacy coordinate pairs (MongoDB 2.2 or earlier).
• You do not plan to use GeoJSON objects.
• You don’t need to account for the Earth's curvature. (Yup, the Earth is not flat)
• 2dsphere
• Represents geometry on the surface of an Earth-like sphere.
• It should be the default choice for geodata
• Coordinates are (usually) stored in GeoJSON format
• The index is based on a QuadTree representation
• Coordinates use the WGS 84 reference system
Coordinates
• Coordinates are represented as longitude, latitude
• Longitude
• Measured from Greenwich meridian (0 degrees)
• For locations east up to +180 degrees
• For locations west we specify as negative up to -180
• Latitude
• Measured from equator north and south (0 to 90 north, 0 to -90 south)
• Coordinates in MongoDB are stored in Longitude/Latitude order
• Coordinates in Google Maps are stored in Latitude/Longitude order
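Since the two orders are easy to mix up, a small conversion helper can make the swap explicit. The sketch below is plain JavaScript; `toGeoJSONPoint` is a hypothetical name, and the range checks simply apply the limits listed above.

```javascript
// Convert a "latitude, longitude" pair (the order used by Google Maps)
// into a GeoJSON Point, which stores [longitude, latitude].
function toGeoJSONPoint(lat, lng) {
  if (!Number.isFinite(lat) || !Number.isFinite(lng)) {
    throw new TypeError("lat and lng must be finite numbers");
  }
  if (lat < -90 || lat > 90) {
    throw new RangeError("latitude must be between -90 and 90");
  }
  if (lng < -180 || lng > 180) {
    throw new RangeError("longitude must be between -180 and 180");
  }
  return { type: "Point", coordinates: [lng, lat] };
}

// The position used later in the deck: lat 43.47, lon -3.81.
console.log(JSON.stringify(toGeoJSONPoint(43.47, -3.81)));
```

The latitude check catches swapped arguments only when the longitude lies outside the range -90 to 90, so it reduces the risk of a mix-up without removing it.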
2dSphere Versions
• Three versions of 2dSphere index in MongoDB
• Version 1 : Up to MongoDB 2.4
• Version 2 : From MongoDB 2.6 onwards
• Version 3 : From MongoDB 3.2 onwards
• We will only be talking about Version 3 in this webinar
Creating a 2dSphere Index
db.collection.createIndex( { <location field> : "2dsphere" } )
• The location field must hold legacy coordinate pairs or GeoJSON objects
Example
>> db.wines.createIndex( { geometry: "2dsphere" } )
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
Testing Geo Queries
• Let's search for wine regions in the world
• Using two collections from my GitHub repo
• https://github.com/terce13/geoData
• Import them into MongoDB
• mongoimport -c wines -d geo wine_regions.json
• mongoimport -c countries -d geo countries.json
Country Document (Vatican)
{
    "_id" : ObjectId("577e2ebd1007503076ac8c86"),
    "type" : "Feature",
    "properties" : {
        "featurecla" : "Admin-0 country",
        "sovereignt" : "Vatican",
        "type" : "Sovereign country",
        "admin" : "Vatican",
        "adm0_a3" : "VAT",
        "name" : "Vatican",
        "name_long" : "Vatican",
        "abbrev" : "Vat.",
        "postal" : "V",
        "formal_en" : "State of the Vatican City",
        "name_sort" : "Vatican (Holy Sea)",
        "name_alt" : "Holy Sea",
        "pop_est" : 832,
        "economy" : "2. Developed region: nonG7",
        "income_grp" : "2. High income: nonOECD",
        "continent" : "Europe",
        "region_un" : "Europe",
        "subregion" : "Southern Europe",
        "region_wb" : "Europe & Central Asia"
    },
    "geometry" : {
        "type" : "Polygon",
        "coordinates" : [ [
            [12.439160156250011, 41.898388671875],
            [12.430566406250023, 41.89755859375],
            [12.427539062500017, 41.900732421875],
            [12.430566406250023, 41.90546875],
            [12.438378906250023, 41.906201171875],
            [12.439160156250011, 41.898388671875]]]
    }
}
Wine region document
MongoDB Enterprise > db.wines.findOne()
{
"_id" : ObjectId("577e2e7e1007503076ac8769"),
"properties" : {
"name" : "AOC Anjou-Villages",
"description" : null,
"id" : "a629ojjxl15z"
},
"type" : "Feature",
"geometry" : {
"type" : "Point",
"coordinates" : [ -0.618980171610645, 47.2211343496821]
}
}
• You can type these coordinates into Google Maps, but remember to reverse the coordinate order
Add Indexes
MongoDB Enterprise > db.wines.createIndex({ geometry: "2dsphere" })
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
MongoDB Enterprise > db.countries.createIndex({ geometry: "2dsphere" })
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
$geoIntersects to find our country
• Assume we are at lat: 43.47, lon: -3.81
• What country are we in? Use $geoIntersects
db.countries.findOne({ geometry:
{ $geoIntersects:
{ $geometry:
{ type: "Point",
coordinates:
[ -3.81, 43.47 ]}}}},
{"properties.name": 1})
Results
{
"_id" :
ObjectId("577e2ebd1007503076ac8be5"),
"properties" : {
"name" : "Spain"
}
}
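The test behind "which polygon contains this point" can be sketched with the classic ray-casting algorithm in plain JavaScript. This is a planar approximation for illustration only: MongoDB evaluates $geoIntersects with spherical geometry on the 2dsphere index, and `pointInRing` is a hypothetical helper. The ring is the Vatican polygon from the country document shown earlier.

```javascript
// Ray casting: count how many polygon edges a horizontal ray from the point
// crosses. An odd number of crossings means the point is inside.
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // The edge straddles the ray's latitude and lies to the east of the point.
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Outer ring of the Vatican polygon, in [longitude, latitude] order.
const vaticanRing = [
  [12.439160156250011, 41.898388671875],
  [12.430566406250023, 41.89755859375],
  [12.427539062500017, 41.900732421875],
  [12.430566406250023, 41.90546875],
  [12.438378906250023, 41.906201171875],
  [12.439160156250011, 41.898388671875],
];

console.log(pointInRing([12.433, 41.902], vaticanRing)); // a point in the middle
console.log(pointInRing([-3.81, 43.47], vaticanRing));   // our position in Spain
```

Running such a test against every country would be a full scan; the 2dsphere index lets the server narrow the search to a few candidate polygons first.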
Wine regions around me
• Use $near (results are ordered by distance; $maxDistance is in meters for GeoJSON points)
db.wines.find({ geometry:
    { $near:
        { $geometry: { type: "Point", coordinates: [ -3.81, 43.47 ] },
          $maxDistance: 250000 } } })
Results (Projected)
{ "properties" : { "name" : "DO Arabako-Txakolina" } }
{ "properties" : { "name" : "DO Chacoli de Vizcaya" } }
{ "properties" : { "name" : "DO Chacoli de Guetaria" } }
{ "properties" : { "name" : "DO Rioja" } }
{ "properties" : { "name" : "DO Navarra" } }
{ "properties" : { "name" : "DO Cigales" } }
{ "properties" : { "name" : "AOC Irouléguy" } }
{ "properties" : { "name" : "DO Ribera de Duero" } }
{ "properties" : { "name" : "DO Rueda" } }
{ "properties" : { "name" : "AOC Béarn-Bellocq" } }
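To see what a 250000 meter limit means in practice, the great-circle distance between two points can be estimated with the haversine formula. The sketch assumes a perfectly spherical Earth with a mean radius of 6371 km, so its results will differ slightly from the server's; `haversineMeters` and `withinDistance` are hypothetical helpers.

```javascript
// Assumption of this sketch: a spherical Earth with the mean radius below.
const EARTH_RADIUS_M = 6371000;

// Great-circle distance in meters between two [longitude, latitude] pairs.
function haversineMeters([lng1, lat1], [lng2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Keep the points within maxDistance of the origin, nearest first.
function withinDistance(origin, points, maxDistance) {
  return points
    .map((p) => ({ point: p, meters: haversineMeters(origin, p) }))
    .filter((entry) => entry.meters <= maxDistance)
    .sort((a, b) => a.meters - b.meters);
}

const here = [-3.81, 43.47];
// Coordinates of the "AOC Anjou-Villages" document shown earlier.
const anjou = [-0.618980171610645, 47.2211343496821];
console.log(Math.round(haversineMeters(here, anjou) / 1000), "km");
console.log(withinDistance(here, [anjou], 250000).length, "region(s) within 250 km");
```

Sorting every point by distance like this is only workable for small data sets; with a 2dsphere index, $near returns the ordered result without comparing against the whole collection.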
But screens are not circular
db.wines.find({ geometry:
{ $geoWithin: { $geometry:{type : "Polygon",
coordinates : [[[-51,-29],
[-71,-29],
[-71,-33],
[-51,-33],
[-51,-29]]]}
}
}
})
Results – (Projected)
{ "properties" : { "name" : "Pinheiro Machado" } }
{ "properties" : { "name" : "Rio Negro" } }
{ "properties" : { "name" : "Tacuarembó" } }
{ "properties" : { "name" : "Rivera" } }
{ "properties" : { "name" : "Artigas" } }
{ "properties" : { "name" : "Salto" } }
{ "properties" : { "name" : "Paysandú" } }
{ "properties" : { "name" : "Mendoza" } }
{ "properties" : { "name" : "Luján de Cuyo" } }
{ "properties" : { "name" : "Aconcagua" } }
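A map view is normally described by its west, south, east and north limits, so a small helper can turn those into the polygon used above. This is a plain JavaScript sketch; `screenToPolygon` and `geoWithinQuery` are hypothetical names, and the `geometry` field name follows the wines collection. The helper assumes a view that does not cross the antimeridian.

```javascript
// Build a closed GeoJSON ring from the limits of a map view.
// The first and last positions must be identical to close the ring.
function screenToPolygon(west, south, east, north) {
  return {
    type: "Polygon",
    coordinates: [[
      [east, north],
      [west, north],
      [west, south],
      [east, south],
      [east, north],
    ]],
  };
}

// Filter document that could be passed to db.wines.find().
function geoWithinQuery(polygon) {
  return { geometry: { $geoWithin: { $geometry: polygon } } };
}

// The view used on the slide: longitude -71 to -51, latitude -33 to -29.
const view = screenToPolygon(-71, -33, -51, -29);
console.log(JSON.stringify(geoWithinQuery(view)));
```

On a 2dsphere index the polygon edges are interpreted on the sphere rather than as straight lines on a flat map, so for very wide views the matched area can differ slightly from the rectangle drawn on screen.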
Use geo objects smartly
• Use polygons and/or multipolygons from a collection to query a
second one.
var mex = db.countries.findOne({"properties.name" : "Mexico"})
db.wines.find({geometry: {
$geoWithin: {
$geometry: mex.geometry}}})
{ "_id" : ObjectId("577e2e7e1007503076ac8ab9"), "properties" : { "name" : "Los Romos",
"description" : null, "id" : "a629ojjkguyw" }, "type" : "Feature", "geometry" : { "type" :
"Point", "coordinates" : [ -102.304048304437, 22.0992980768825 ] } }
{ "_id" : ObjectId("577e2e7e1007503076ac8a8d"), "properties" : { "name" : "Hermosillo",
"description" : null, "id" : "a629ojiw0i7f" }, "type" : "Feature", "geometry" : { "type" :
"Point", "coordinates" : [ -111.03600413129, 29.074715739466 ] } }
Let’s do crazy things
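// Tag every wine region with the name of the country that contains it.
var wines = db.wines.find();
while (wines.hasNext()) {
    var wine = wines.next();
    // Find the country whose polygon intersects this region's point.
    var country = db.countries.findOne(
        { geometry: { $geoIntersects: { $geometry: wine.geometry } } });
    // Regions that fall outside every country polygon are left untouched.
    if (country != null) {
        db.wines.update({ "_id": wine._id },
            { $set: { "properties.country": country.properties.name } });
    }
}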
Summary of Operators
• $geoIntersects: Find areas or points that overlap or are adjacent
• Points or polygons, doesn’t matter.
• $geoWithin: Find areas or points that lie within a specific area
• Use screen limits smartly
• $near: Returns locations in order from nearest to furthest away
• Find closest objects.
Summary
• Text indexes enable searches in the style of Google, SOLR or ElasticSearch
• They can take the weights of different fields into account
• They can be combined with other query conditions
• They can return results ordered by relevance
• They can be multi-language and case/accent insensitive
• Geospatial indexes let you work with GeoJSON objects
• They support proximity, inclusion and intersection queries
• They use the most common reference system, WGS84
• Careful! Longitude and latitude are in the reverse order from Google Maps.
• They can be combined with other query conditions
• There is a special index (2d) for flat surfaces (a football pitch, a virtual world, etc.)
Next Webinar
Introduction to the Aggregation Framework
• 19 July 2016 – 16:00 CEST, 11:00 ART, 9:00
• Register if you have not done so yet!
• The MongoDB Aggregation Framework gives developers the ability to run advanced analytical processing inside the database.
• It processes data in a Unix-style pipeline and lets developers:
• Reshape, transform and extract data.
• Apply standard analytical functions, from sums and averages to standard deviation.
• Register at: https://www.mongodb.com/webinars
• Please send us your feedback: back-to-basics@mongodb.com
Questions?
  • 2. MongoDBEurope2016 Old Billingsgate, London 15th November Use my code rubenterceno20 for 20% off tickets mongodb.com/europe
  • 3. Conceptos Básicos 2016 Indexación Avanzada: Índices de texto y Geoespaciales Rubén Terceño Senior Solutions Architect, EMEA [email protected] @rubenTerceno
  • 4. Agenda del Curso Date Time Webinar 25-Mayo-2016 16:00 CEST Introducción a NoSQL 7-Junio-2016 16:00 CEST Su primera aplicación MongoDB 21-Junio-2016 16:00 CEST Diseño de esquema orientado a documentos 07-Julio-2016 16:00 CEST Indexación avanzada, índices de texto y geoespaciales 19-Julio-2016 16:00 CEST Introducción al Aggregation Framework 28-Julio-2016 16:00 CEST Despliegue en producción
  • 5. Resumen de lo visto hasta ahora • ¿Porqué existe NoSQL? • Tipos de bases de datos NoSQL • Características clave de MongoDB • Instalación y creación de bases de datos y colecciones • Operaciones CRUD • Índices y explain() • Diseño de esquema dinámico • Jerarquía y documentos embebidos • Polimorfismo
  • 6. Indexing • An efficient way to look up data by its value • Avoids table scans 1 2 3 4 5 6 7
  • 7. Traditional Databases Use B-trees • … and so does MongoDB
  • 9. Creating a Simple Index db.coll.createIndex( { fieldName : <Direction> } ) Database Name Collection Name Command Field Name to be indexed Ascending : 1 Descending : -1
  • 10. Two Other Kinds of Indexes • Full Text Index • Allows searching inside the text of a field or several fields, ordering the results by relevance. • Geospatial Index • Allows geospatial queries • People around me. • Countries I’m traversing during my trip. • Restaurants in a given neighborhood. • These indexes do not use B-trees
  • 11. Full Text Indexes • An “inverted index” on all the words inside text fields (only one text index per collection) { “comment” : “I think your blog post is very interesting and informative. I hope you will post more info like this in the future” } >> db.posts.createIndex( { “comments” : “text” } ) MongoDB Enterprise > db.posts.find( { $text: { $search : "info" }} ) { "_id" : ObjectId(“…"), "comment" : "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" } MongoDB Enterprise >
  • 12. On The Server 2016-07-07T09:48:48.605+0200 I INDEX [conn4] build index on: indexes.products properties: { v: 1, key: { _fts: "text", _ftsx: 1 }, name: "longDescription_text_shortDescription_text_name_text”, ns: "indexes.products", weights: { longDescription: 1, name: 10, shortDescription: 3 }, default_language: "english”, language_override: "language”, textIndexVersion: 3 }
  • 13. More Detailed Example >> db.posts.insert( { "comment" : "Red yellow orange green" } ) >> db.posts.insert( { "comment" : "Pink purple blue" } ) >> db.posts.insert( { "comment" : "Red Pink" } ) >> db.posts.find( { "$text" : { "$search" : "Red" }} ) { "_id" : ObjectId("…"), "comment" : "Red yellow orange green" } { "_id" : ObjectId("…"), "comment" : "Red Pink" } >> db.posts.find( { "$text" : { "$search" : "Pink Green" }} ) { "_id" : ObjectId("…"), "comment" : "Red Pink" } { "_id" : ObjectId("…"), "comment" : "Red yellow orange green" } >> db.posts.find( { "$text" : { "$search" : "red" }} ) # <- Case Insensitve { "_id" : ObjectId("…"), "comment" : "Red yellow orange green" } { "_id" : ObjectId("…"), "comment" : "Red Pink" } >>
  • 14. Using Weights • We can assign different weights to different fields in the text index • E.g. I want to favour name over shortDescription in searching • So I increase the weight for the the name field >> db.blog.createIndex( { shortDescription: "text", longDescription: "text”, name: "text” }, { weights: { shortDescription: 3, longDescription: 1, name: 10 }} ) • Now searches will favour name over shortDesciption over longDescription
  • 15. $textscore • We may want to favor results with higher weights, thus: >> db.products.find({$text : {$search: "humongous"}}, {score: {$meta : "textScore"}, name: 1, longDescription: 1, shortDescription: 1}).sort( { score: { $meta: "textScore" } } )
  • 16. Other Parameters • Language : Pick the language you want to search in e.g. • $language : Spanish • Support case sensitive searching • $caseSensitive : True (default false) • Support accented characters (diacritic sensitive search e.g. café is distinguished from cafe ) • $diacriticSensitive : True (default false)
  • 17. Geospatial Indexes • 2d • Represents a flat surface. A good fit if: • You have legacy coordinate pairs (MongoDB 2.2 or earlier). • You do not plan to use geoJSON objects. • You don’t worry about the Earth's curvature. (Yup, earth is not flat) • 2dsphere • Represents a flat surface on top of an spheroid. • It should be the default choice for geoData • Coordinates are (usually) stored in GeoJSON format • The index is based on a QuadTree representation • The index is based on WGS 84 standard
  • 18. Coordinates • Coordinates are represented as longitude, latitude • Longitude • Measured from Greenwich meridian (0 degrees) • For locations east up to +180 degrees • For locations west we specify as negative up to -180 • Latitude • Measured from equator north and south (0 to 90 north, 0 to -90 south) • Coordinates in MongoDB are stored on Longitude/Latitude order • Coordinates in Google Maps are stored in Latitude/Longitude order
  • 19. 2dSphere Versions • Two versions of 2dSphere index in MongoDB • Version 1 : Up to MongoDB 2.4 • Version 2 : From MongoDB 2.6 onwards • Version 3 : From MongoDB 3.2 onwards • We will only be talking about Version 3 in this webinar
  • 20. Creating a 2dSphere Index db.collection.createIndex ( { <location field> : "2dsphere" } ) • Location field must be coordinate or GeoJSON data
  • 21. Example >> db.wines.createIndex( { geometry: "2dsphere" } ) { "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
  • 22. Testing Geo Queries • Lets search for wine regions in the world • Using two collections from my gitHub repo • https://github.com/terce13/geoData • Import them into MongoDB • mongoimport -c wines -d geo wine_regions.json • mongoimport -c countries -d geo countries.json
  • 23. Country Document (Vatican) { "_id" : ObjectId("577e2ebd1007503076ac8c86"), "type" : "Feature", "properties" : { "featurecla" : "Admin-0 country", "sovereignt" : "Vatican", "type" : "Sovereign country", "admin" : "Vatican", "adm0_a3" : "VAT", "name" : "Vatican", "name_long" : "Vatican", "abbrev" : "Vat.", "postal" : "V", "formal_en" : "State of the Vatican City", "name_sort" : "Vatican (Holy Sea)", "name_alt" : "Holy Sea”, "pop_est" : 832, "economy" : "2. Developed region: nonG7", "income_grp" : "2. High income: nonOECD", "continent" : "Europe", "region_un" : "Europe", "subregion" : "Southern Europe", "region_wb" : "Europe & Central Asia", }, "geometry" : { "type" : "Polygon", "coordinates" : [ [ [12.439160156250011, 41.898388671875], [12.430566406250023, 41.89755859375], [12.427539062500017, 41.900732421875], [12.430566406250023, 41.90546875], [12.438378906250023, 41.906201171875], [12.439160156250011, 41.898388671875]]] } }
  • 24. Wine region document MongoDB Enterprise > db.wines.findOne() { "_id" : ObjectId("577e2e7e1007503076ac8769"), "properties" : { "name" : "AOC Anjou-Villages", "description" : null, "id" : "a629ojjxl15z" }, "type" : "Feature", "geometry" : { "type" : "Point", "coordinates" : [ -0.618980171610645, 47.2211343496821] } } You can type these coordinates into Google Maps, but remember to reverse their order
  • 25. Add Indexes MongoDB Enterprise > db.wines.createIndex({ geometry: "2dsphere" }) { "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 } MongoDB Enterprise > db.countries.createIndex({ geometry: "2dsphere" }) { "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
  • 26. $geoIntersects to find our country • Assume we are at lat: 43.47, lon: -3.81 • What country are we in? Use $geoIntersects db.countries.findOne({ geometry: { $geoIntersects: { $geometry: { type: "Point", coordinates: [ -3.81, 43.47 ]}}}}, {"properties.name": 1})
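To give an intuition for what "a point intersects a polygon" means, here is a rough sketch in plain JavaScript of a planar ray-casting test. This is an assumption made for illustration: a 2dsphere index works with spherical geometry on the server, so this flat approximation is not MongoDB's algorithm and is only reasonable for small areas. The function name `pointInRing` is hypothetical, and the ring is the Vatican polygon shown in the country document above.

```javascript
// Illustrative planar ray casting; NOT how MongoDB evaluates $geoIntersects.
// point: [lng, lat]; ring: array of [lng, lat] pairs, closed (first == last).
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Does a horizontal ray going east from the point cross this edge?
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Outer ring of the Vatican polygon from the country document.
const vaticanRing = [
  [12.439160156250011, 41.898388671875],
  [12.430566406250023, 41.89755859375],
  [12.427539062500017, 41.900732421875],
  [12.430566406250023, 41.90546875],
  [12.438378906250023, 41.906201171875],
  [12.439160156250011, 41.898388671875],
];

console.log(pointInRing([12.4334, 41.9022], vaticanRing)); // prints true
console.log(pointInRing([-3.81, 43.47], vaticanRing));     // prints false
```

The first test point is an arbitrary location chosen inside the ring. The second is the position from the slide, which lies in a different country.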
  • 28. Wine regions around me • Use $near (results ordered by distance) db.wines.find({geometry: {$near: {$geometry:{type : "Point", coordinates : [-3.81,43.47]}, $maxDistance: 250000 } } } ) • With a GeoJSON point, $maxDistance is given in meters (250000 = 250 km)
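To make the `$maxDistance: 250000` value in the query concrete, the sketch below estimates the great-circle distance in meters between two GeoJSON points using the haversine formula. This is an illustration under stated assumptions: the Earth radius constant and the function name `haversineMeters` are my own choices, and the server's exact distance computation may differ slightly.

```javascript
// Illustrative only; the radius is an assumed mean value, not a MongoDB constant.
const EARTH_RADIUS_M = 6371000;

function toRadians(deg) {
  return (deg * Math.PI) / 180;
}

// a and b are GeoJSON Points: coordinates are [longitude, latitude].
function haversineMeters(a, b) {
  const [lng1, lat1] = a.coordinates;
  const [lng2, lat2] = b.coordinates;
  const dLat = toRadians(lat2 - lat1);
  const dLng = toRadians(lng2 - lng1);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
}

// Query point from the slide and the AOC Anjou-Villages point from the deck.
const me = { type: "Point", coordinates: [-3.81, 43.47] };
const anjou = { type: "Point", coordinates: [-0.618980171610645, 47.2211343496821] };

// Would this region fall within the 250 km radius used in the query?
console.log(haversineMeters(me, anjou) <= 250000); // prints false
```

By this estimate AOC Anjou-Villages is several hundred kilometers away, so it falls outside the 250 km radius.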
  • 29. Results (Projected) { "properties" : { "name" : "DO Arabako-Txakolina" } } { "properties" : { "name" : "DO Chacoli de Vizcaya" } } { "properties" : { "name" : "DO Chacoli de Guetaria" } } { "properties" : { "name" : "DO Rioja" } } { "properties" : { "name" : "DO Navarra" } } { "properties" : { "name" : "DO Cigales" } } { "properties" : { "name" : "AOC Irouléguy" } } { "properties" : { "name" : "DO Ribera de Duero" } } { "properties" : { "name" : "DO Rueda" } } { "properties" : { "name" : "AOC Béarn-Bellocq" } }
  • 30. But screens are not circular db.wines.find({ geometry: { $geoWithin: { $geometry:{type : "Polygon", coordinates : [[[-51,-29], [-71,-29], [-71,-33], [-51,-33], [-51,-29]]]} } } })
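The polygon in the query above is a screen-shaped box written out by hand. A small helper can build it from the four map bounds and guarantee that the ring is closed, which GeoJSON requires. This is a sketch with a hypothetical function name (`bboxToPolygon`), and it assumes the box does not cross the antimeridian (±180 degrees longitude).

```javascript
// Hypothetical helper for illustration; not part of MongoDB.
// Builds a GeoJSON Polygon from map bounds given in degrees.
// Assumes west < east and south < north (no antimeridian crossing).
function bboxToPolygon(west, south, east, north) {
  if (!(west < east) || !(south < north)) {
    throw new RangeError("expected west < east and south < north");
  }
  return {
    type: "Polygon",
    coordinates: [[
      [west, south],
      [east, south],
      [east, north],
      [west, north],
      [west, south], // repeat the first vertex to close the ring
    ]],
  };
}

// Same bounds as the slide: longitudes -71 to -51, latitudes -33 to -29.
const screenArea = bboxToPolygon(-71, -33, -51, -29);
console.log(JSON.stringify(screenArea.coordinates[0]));
// prints [[-71,-33],[-51,-33],[-51,-29],[-71,-29],[-71,-33]]
```

The resulting object can be passed as the `$geometry` value of a `$geoWithin` query such as the one on the slide. The vertex order differs from the hand-written polygon, but it describes the same area.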
  • 31. Results – (Projected) { "properties" : { "name" : "Pinheiro Machado" } } { "properties" : { "name" : "Rio Negro" } } { "properties" : { "name" : "Tacuarembó" } } { "properties" : { "name" : "Rivera" } } { "properties" : { "name" : "Artigas" } } { "properties" : { "name" : "Salto" } } { "properties" : { "name" : "Paysandú" } } { "properties" : { "name" : "Mendoza" } } { "properties" : { "name" : "Luján de Cuyo" } } { "properties" : { "name" : "Aconcagua" } }
  • 32. Use geo objects smartly • Use polygons and/or multipolygons from a collection to query a second one. var mex = db.countries.findOne({"properties.name" : "Mexico"}) db.wines.find({geometry: { $geoWithin: { $geometry: mex.geometry}}}) { "_id" : ObjectId("577e2e7e1007503076ac8ab9"), "properties" : { "name" : "Los Romos", "description" : null, "id" : "a629ojjkguyw" }, "type" : "Feature", "geometry" : { "type" : "Point", "coordinates" : [ -102.304048304437, 22.0992980768825 ] } } { "_id" : ObjectId("577e2e7e1007503076ac8a8d"), "properties" : { "name" : "Hermosillo", "description" : null, "id" : "a629ojiw0i7f" }, "type" : "Feature", "geometry" : { "type" : "Point", "coordinates" : [ -111.03600413129, 29.074715739466 ] } }
  • 33. Let’s do crazy things var wines = db.wines.find() while (wines.hasNext()){ var wine = wines.next(); var country = db.countries.findOne({geometry : {$geoIntersects : {$geometry : wine.geometry}}}); if (country!=null){ db.wines.update({"_id" : wine._id}, {$set : {"properties.country" : country.properties.name}}); } }
  • 34. Summary of Operators • $geoIntersects: Find areas or points that overlap or are adjacent • Points or polygons, doesn’t matter. • $geoWithin: Find areas or points that lie within a specific area • Use screen limits smartly • $near: Returns locations in order from nearest to furthest away • Find closest objects.
  • 35. Summary • Text indexes enable Google-, Solr-, or Elasticsearch-style searches • They can take the weights of different fields into account • They can be combined with other query conditions • They can return results ordered by relevance • They can be multilingual and case/accent insensitive • Geospatial indexes let you work with GeoJSON objects • They support proximity, inclusion, and intersection queries • They use the most common reference system, WGS84 • Careful! Longitude and latitude are in the reverse order from Google Maps. • They can be combined with other query conditions • There is a special index type (2d) for flat surfaces (a football pitch, a virtual world, etc.)
  • 36. Next Webinar: Introduction to the Aggregation Framework • 19 July 2016 – 16:00 CEST, 11:00 ART, 9:00 • Register if you haven't already! • The MongoDB Aggregation Framework gives developers the ability to run advanced analytical processing inside the database. • It processes data in a Unix-style pipeline and lets developers: • Reshape, transform, and extract data. • Apply standard analytical functions, from sums and averages to standard deviation. • Register at: https://www.mongodb.com/webinars • Please give us your feedback: [email protected]

Editor's Notes

  • #4: Who I am, how long have I been at MongoDB.
  • #8: Each item in a B-tree node points to a sub-tree containing elements below its key value. Insertions require a read before a write. Writes that split nodes are expensive.
  • #9: Effectively the depth of the tree.
  • #20: Production release numbering.