Introduction to Elasticsearch
27th May 2014 - BigData Meetup
Eric Rodriguez
@wavyx
About Me
Eric Rodriguez
Founder of data.be
• Web entrepreneur
• Data addict
• Multi-Language: PHP, Java/Groovy/Grails, .Net, …
be.linkedin.com/in/erodriguez
github.com/wavyx
@wavyx
Elasticsearch - Company
• Founded in 2012 => http://www.elasticsearch.com
• Professional services
• Training
• Consultancy / Development support
• Production support subscription (3 levels of SLAs)
Enterprises using Elasticsearch
(M)ELK Stack
• Elasticsearch - Search server based on Lucene
• Logstash - Tool for managing events and logs
• Kibana - Visualize logs and time-stamped data
• Marvel - Monitor your cluster’s heartbeat
You Know, for Search…
Logstash
• Collect, parse, index, and search logs
Kibana
• A versatile dashboard to see and interact with your data
Marvel
• Monitor the health of your cluster: cluster-wide metrics, an overview of all nodes and indices, and events (master election, new nodes)
Elasticsearch in keywords
• Real-time search and analytics engine
• Open-source
• Lucene
• JSON
• Schema-free
• Document store
• RESTful API
• Documentation
• Scalability
• High availability
• Distributed
• Multi-tenancy
• Per-operation persistence
Use Cases
• Full-Text Search	

• Data Store	

• Analytics	

• Alerts	

• Ads	

• …
Copyright 2014 Elasticsearch Inc / Elasticsearch BV. All rights reserved. Content used with permission from Elasticsearch.
Elasticsearch core
• Apache Lucene is a high-performance, full-featured text search engine library
written entirely in Java	

• Elasticsearch added value: “Simple is best”
• Simple API (with documentation)	

• JSON & RESTful	

• Sharding & Replication	

• Extensibility: plugins and scripts	

• Interoperability: clients and integrations
Terms for DBAs
Elasticsearch => RDBMS
• Index => Database
• Type => Table
• Document => Row
• Field => Column
• Mapping => Schema
Plug & Play
• Zero configuration	

• 4 LoC to get started ;)
Alive!
=> http://localhost:9200/?pretty
REST
• Check your cluster, node, and index health, status, and statistics	

• Administer your cluster, node, and index data and metadata	

• Perform CRUD (Create, Read, Update, and Delete) and
search operations against your indexes	

• Execute advanced search operations such as paging, sorting,
filtering, scripting, faceting, aggregations, and many others
Basic Operations 1/3
• Add a document
• Create index
Basic Operations 2/3
• Modify/Replace a document
• Delete a document
• Delete index
Basic Operations 3/3
• Update a document
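The request examples shown on these slides did not survive in this text. As a rough sketch, the basic operations map to HTTP requests like the ones below, assuming the Elasticsearch 1.x REST conventions (`/index/type/id`) and a node listening on localhost:9200. The index name `meetup`, the type `talk` and the document fields are made up for illustration, and the requests are only built, not sent.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: default local node


def es_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


# Hypothetical index/type/id and fields, chosen only for this example.
doc = {"title": "Introduction to Elasticsearch", "speaker": "wavyx"}

operations = [
    es_request("PUT", "/meetup"),                 # create index
    es_request("PUT", "/meetup/talk/1", doc),     # add (or replace) a document
    es_request("POST", "/meetup/talk/1/_update",  # partial update of a document
               {"doc": {"speaker": "Eric Rodriguez"}}),
    es_request("DELETE", "/meetup/talk/1"),       # delete a document
    es_request("DELETE", "/meetup"),              # delete index
]

for req in operations:
    print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would send it to a running node.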
Mapping 1/2
• Define how a document should be mapped (similar to a schema): searchable fields, tokenization, storage, …

• Explicit mapping is defined on an index/type level	

• A default mapping is automatically created
Mapping 2/2
• Core types: string, integer/long, float/double, boolean, and null
• Other types: Array, Object, Nested, IP, GeoPoint, GeoShape, Attachment

• Example
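The example from the slide is missing from this text. Below is a minimal sketch of what an explicit mapping could look like, assuming Elasticsearch 1.x syntax (a `mappings` section keyed by type, with `string` fields). The type name `talk` and all field names are hypothetical.

```python
import json

# Hypothetical explicit mapping for a type named "talk" (1.x-style syntax).
mapping = {
    "mappings": {
        "talk": {
            "properties": {
                "title": {"type": "string"},  # analyzed full text
                "speaker": {"type": "string", "index": "not_analyzed"},  # exact value
                "date": {"type": "date"},
                "attendees": {"type": "integer"},
                "location": {"type": "geo_point"},
            }
        }
    }
}

# This JSON would be sent as the body of the index-creation request.
print(json.dumps(mapping, indent=2))
```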
Search API 1/2
• Multi-index, Multi-type
• URI search - Google-like: operators (AND/OR), fields, sort, paging, wildcards, …
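To make the URI search concrete, here is a small sketch that assembles such a URL. It assumes the `q`, `sort`, `from` and `size` query-string parameters of the `_search` endpoint. The helper `uri_search` and the index, type and field names are hypothetical.

```python
from urllib.parse import urlencode


def uri_search(indices, types, query, sort=None, start=0, size=10):
    """Build a URI search path; several indices or types are comma-separated."""
    path = "/" + ",".join(indices)
    if types:
        path += "/" + ",".join(types)
    params = {"q": query, "from": start, "size": size}
    if sort:
        params["sort"] = sort
    return path + "/_search?" + urlencode(params)


# Hypothetical index, type and field names.
url = uri_search(["meetup", "conference"], ["talk"],
                 "title:elasticsearch AND speaker:wavyx",
                 sort="date:desc", start=0, size=5)
print(url)
```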
Search API 2/2
• Paging & Sort	

• Fields: selection, scripts	

• Post filter	

• Highlighting	

• Rescoring	

• Explain	

• …
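Several of these options can be combined in a single search request body. The sketch below assumes the 1.x top-level keys (`from`, `size`, `sort`, `fields`, `post_filter`, `highlight`, `explain`). The field names and values are made up.

```python
import json

# Hypothetical field names; the top-level keys follow the 1.x search request body.
search_body = {
    "query": {"match": {"title": "elasticsearch"}},
    "from": 0,                               # paging: offset of the first hit
    "size": 10,                              # paging: number of hits to return
    "sort": [{"date": {"order": "desc"}}],
    "fields": ["title", "speaker"],          # field selection
    "post_filter": {"term": {"speaker": "wavyx"}},
    "highlight": {"fields": {"title": {}}},
    "explain": True,                         # include a score explanation per hit
}

print(json.dumps(search_body, indent=2))
```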
Query DSL
• “SQL” for Elasticsearch

• Queries should be used	

• for full text search	

• where the result depends on a relevance score	

• Filters should be used	

• for binary yes/no searches	

• for queries on exact values
Basic Queries
Basic Filters
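The query and filter examples from these slides are not part of this text. As an illustration of the split described above, the sketch below combines a scored full-text query with an unscored exact-value filter, assuming the 1.x `filtered` query (later versions express this differently). The helper and field names are hypothetical.

```python
import json


def filtered_search(text_field, text, exact_field, exact_value):
    """Combine a scored full-text query with an unscored exact-value filter."""
    return {
        "query": {
            "filtered": {
                "query": {"match": {text_field: text}},          # relevance-scored
                "filter": {"term": {exact_field: exact_value}},  # yes/no match
            }
        }
    }


# Hypothetical fields: full text in "title", exact value in "language".
body = filtered_search("title", "introduction search", "language", "en")
print(json.dumps(body, indent=2))
```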
Analysis 1/2
• Analysis is extracting “terms” from a given text
• Processing natural language to make it computer searchable
• Configurable registry of Analyzers that can be used
  • to break analyzed fields into terms when a document is indexed
  • to process query strings
Analysis 2/2
• Analyzers are composed of
  • a single Tokenizer (may be preceded by one or more CharFilters)
  • zero or more TokenFilters
• Default Analyzers: standard, pattern, whitespace, language, snowball
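The pipeline described above (char filters, then one tokenizer, then token filters) can be imitated in a few lines of plain Python. This is a toy model to show the order of the stages, not how Lucene implements analysis. The stop-word list and the function names are invented for the example.

```python
import re

STOPWORDS = {"a", "an", "the", "and", "or", "of", "to"}  # tiny illustrative list


def strip_html(text):
    """Char filter: remove anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    """Tokenizer: split on runs of non-alphanumeric characters."""
    return [token for token in re.split(r"[^0-9A-Za-z]+", text) if token]


def lowercase(tokens):
    """Token filter: normalize case."""
    return [token.lower() for token in tokens]


def remove_stopwords(tokens):
    """Token filter: drop very common words."""
    return [token for token in tokens if token not in STOPWORDS]


def analyze(text):
    """Char filter -> one tokenizer -> token filters, in that order."""
    tokens = tokenize(strip_html(text))
    for token_filter in (lowercase, remove_stopwords):
        tokens = token_filter(tokens)
    return tokens


print(analyze("<p>The Quick brown Fox and the lazy dog</p>"))
# ['quick', 'brown', 'fox', 'lazy', 'dog']
```

The same `analyze` function would be applied both to field values at index time and to query strings at search time, so that both sides produce comparable terms.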
Analytics
• Aggregation of information: similar to “group by”
• Facets
  • Aggregated data based on a search query
  • One-dimensional results
  • Ex: “terms facets” return facet counts for the various values of a specific field (think color, tag, category, …)
• Aggregations (ES 1.0+)
  • Nested Facets
  • Basic Stats: mean, min, max, std dev, term counts
  • Significant Terms, Percentiles, Cardinality estimations
Facets
• not yet deprecated, but use aggregations!
• Various Facets: terms, range, histogram, date, statistical, geo distance, …
Aggregations
• A generic powerful framework that can be divided into 2 main families:
  • Bucketing: each bucket is associated with a key and a document criterion. The aggregation process provides a list of buckets, each one with a set of documents that "belong" to it.
  • Metric: aggregations that keep track of and compute metrics over a set of documents.
• Aggregations can be nested!
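The two families and their nesting can be mimicked on an in-memory list of documents. The sketch below is a plain-Python analogy, not Elasticsearch code: a "terms" bucketing step groups documents by a field value, and an "avg" metric is then computed inside each bucket. The documents and field names are made up.

```python
from collections import defaultdict

# Made-up documents for illustration.
docs = [
    {"color": "red", "price": 10},
    {"color": "blue", "price": 20},
    {"color": "red", "price": 30},
    {"color": "green", "price": 5},
    {"color": "blue", "price": 40},
]


def terms_buckets(documents, field):
    """Bucketing: one bucket per distinct value, holding the matching documents."""
    buckets = defaultdict(list)
    for doc in documents:
        buckets[doc[field]].append(doc)
    return dict(buckets)


def avg_metric(documents, field):
    """Metric: a single number computed over a set of documents."""
    return sum(doc[field] for doc in documents) / len(documents)


# Nesting: compute the metric inside every bucket.
result = {
    key: {"doc_count": len(bucket), "avg_price": avg_metric(bucket, "price")}
    for key, bucket in terms_buckets(docs, "color").items()
}
print(result)
```

In a real request the same idea would be written as a nested `aggs` body, for instance `{"aggs": {"by_color": {"terms": {"field": "color"}, "aggs": {"avg_price": {"avg": {"field": "price"}}}}}}`, where `by_color` and `avg_price` are arbitrary names.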
Bucket Aggregators
• global	

• filter	

• missing	

• terms	

• range	

• date range	

• ip range
• histogram	

• date histogram	

• geo distance	

• geohash grid	

• nested	

• reverse nested	

• top hits (version 1.3)
Metrics Aggregators
• count	

• stats	

• extended stats	

• cardinality	

• percentiles
• min	

• max	

• sum	

• avg
Search for end users
• Suggesters - “Did you mean”: Terms, Phrases, Completion, Context
• “More like this”: find documents that are "like" the provided text by running it against one or more fields
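The idea behind a "did you mean" suggestion can be shown with the Python standard library. This sketch uses `difflib`'s similarity ratio as a stand-in. The real term suggester works on edit distance against the terms in the index, so this is only an analogy. The vocabulary and the helper name are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical vocabulary standing in for the terms stored in an index.
indexed_terms = ["elasticsearch", "logstash", "kibana", "marvel", "lucene"]


def did_you_mean(word, vocabulary, max_suggestions=3):
    """Suggest indexed terms that are textually close to a misspelled word."""
    return get_close_matches(word, vocabulary, n=max_suggestions, cutoff=0.7)


print(did_you_mean("elasticsaerch", indexed_terms))  # ['elasticsearch']
print(did_you_mean("kibanna", indexed_terms))        # ['kibana']
```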
Percolator
• Classic ES 	

1. Add & Index documents	

2. Search with queries
3. Retrieve matching documents	

• Percolator
1. Add & Index queries	

2. Percolate documents
3. Retrieve matching queries
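The reversed workflow can be sketched with a toy in-memory percolator: queries are stored first, then each incoming document is checked against them. This is a simplified model under the assumption that a "query" is just a set of terms that must all occur in one field. It does not use the Elasticsearch API, and all names are invented.

```python
class ToyPercolator:
    """Stores queries up front, then matches incoming documents against them."""

    def __init__(self):
        self.queries = {}

    def register(self, query_id, field, required_terms):
        """Step 1: index a query (here: all terms must occur in the field)."""
        self.queries[query_id] = (field, {term.lower() for term in required_terms})

    def percolate(self, document):
        """Steps 2 and 3: run a document against the queries, return matching ids."""
        matches = []
        for query_id, (field, terms) in self.queries.items():
            words = set(document.get(field, "").lower().split())
            if terms <= words:
                matches.append(query_id)
        return sorted(matches)


percolator = ToyPercolator()
percolator.register("storm-alert", "body", ["storm", "brussels"])
percolator.register("es-mentions", "body", ["elasticsearch"])

doc = {"body": "Storm warning issued for Brussels tonight"}
print(percolator.percolate(doc))  # ['storm-alert']
```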
Why Percolate ?!
• Alerts: social media mentions, weather forecast, news alerts	

• Automatic Monitoring: price monitoring, stock alerts, logs	

• Ads: display targeted ads based on user’s search queries	

• Enrich: percolate new documents, then add query matches
as document tags
High Availability 1/2
• Sharding - Write Scalability
• Split logical data over multiple machines & Control data flows	

• Each index has a fixed number of shards	

• Improve indexing performance	

• Replication - Read Scalability
• Each shard can have 0-many replicas (dynamic setup)	

• Removing SPOF (Single Point Of Failure)	

• Improve search performance
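Why the number of shards is fixed becomes clearer when looking at how a document is assigned to a shard: a hash of the routing value (the document id by default) is taken modulo the number of primary shards. The sketch below illustrates that formula. `zlib.crc32` is a stand-in for Elasticsearch's own hash function, and the shard count of 5 is just an example value.

```python
import zlib


def shard_for(routing_value, number_of_primary_shards):
    """Pick a shard: hash the routing value, then take it modulo the shard count."""
    # zlib.crc32 is a stand-in; Elasticsearch uses its own hash function.
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards


NUM_SHARDS = 5  # example value, fixed when the index is created

placement = {doc_id: shard_for(doc_id, NUM_SHARDS) for doc_id in ("1", "2", "3", "42")}
print(placement)

# With a different shard count the same ids may land on other shards, which is
# why the number of primary shards cannot simply be changed afterwards.
print({doc_id: shard_for(doc_id, NUM_SHARDS + 1) for doc_id in placement})
```

Replicas are not part of this formula, which is consistent with the replica count being adjustable at any time.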
High Availability 2/2
• Zen Discovery
  • Automatic discovery of nodes within a cluster and electing a master node
  • Useful for failover and replication
  • Specific modules: Amazon EC2, Microsoft Azure, Google Compute Engine
• Snapshot & Restore module
Cluster Management
• Marvel - http://www.elasticsearch.org/overview/marvel/
• BigDesk - http://bigdesk.org/
• Paramedic - https://github.com/karmi/elasticsearch-paramedic
• KOPF - https://github.com/lmenezes/elasticsearch-kopf/
• Elastic HQ - http://www.elastichq.org/
Clients & Integration
• Ecosystem: Kibana, Logstash, Marvel, Hadoop integration
• API Clients: Java, JavaScript, Groovy, PHP, Perl, Python, .Net, Ruby, Scala, Clojure, Go, Erlang, …
• Integrations: Grails, Django, Play!, Symfony2, Carrot2, Spring, Drupal, WordPress, …
• Rivers: CouchDB, JDBC, MongoDB, Neo4j, Redis, RabbitMQ, ActiveMQ, Amazon SQS, File System, Twitter, Wikipedia, RSS, …
Fast & Furious Evolution
Version 1.0 - Feb 12, 2014
• Aggregations
• Snapshot & Restore
• Distributed Percolator
• Cat API
• Federated search
• Doc values
• Circuit breaker
Version 1.1 - March 25, 2014
• Cardinality Agg
• Percentiles Agg
• Significant Terms Agg
• Search Templates
• Cross fields search
• Alias for indices & templates
Version 1.2 - May 22, 2014
• Java 7
• Indexing & Merging performance
• Aggregations performance
• Context suggester
• Deep scrolling
• Field value factor
Benchmark API coming in 1.3
Resources
• http://www.elasticsearch.org/guide/
• http://www.elasticsearch.org/videos/
• http://www.elasticsearchtutorial.com/
• http://exploringelasticsearch.com/
• http://joelabrahamsson.com/elasticsearch-101/
• http://belczyk.com/2014/01/elasticsearch-recomended-learning-materials/
• http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/modules-plugins.html
Books
• Elasticsearch Server
http://www.packtpub.com/elasticsearch-server-2e/book
• Elasticsearch in Action
http://www.manning.com/hinman/
• Elasticsearch Cookbook
http://www.packtpub.com/elasticsearch-cookbook/book
• Mastering Elasticsearch
http://www.packtpub.com/mastering-elasticsearch-querying-and-data-handling/book
• Elasticsearch - The Definitive Guide
http://www.elasticsearch.org/blog/elasticsearch-definitive-guide/
Thank you!
eric@data.be - @wavyx
be.linkedin.com/in/erodriguez - github.com/wavyx
http://www.meetup.com/ElasticSearch-User-Group-Belux-Belgium-Luxembourg/
