Mathematical Modeling
with Apache Solr
Joel Bernstein
Senior Data Engineer, Lucidworks
Lucene/Solr Committer & PMC
@jbernste2
#Activate18 #ActivateSearch
Agenda
• Math Expressions Introduction
• Model Types
• Training Regression Models
• Assessing Regression Models
• Prediction and Anomaly Detection
• Mathematics language that runs inside of Solr Cloud.

• Integration of Apache Commons Math with Solr.

• Designed for fast quantitative analysis of result sets. 

• Any Streaming Expression can be used to create the data set.
Math Expressions
• Vector/Matrix Math
• Text Analysis
• Statistics
• Probability
• Monte Carlo Simulations
• Linear Regression
• Curve Fitting
• Time Series
• Interpolation, Derivatives, Integrals
• Digital Signal Processing
• Machine Learning
• Computational Geometry
Statistical Modeling
Two of the most commonly used statistical models are: 

• Regression Models: Predicting Numeric Values

• Probability Distributions: Models of probability
Regression Support in Math Expressions
Linear
• Simple Linear Regression
• Multivariate Linear Regression
Non-linear
• Loess Regression: Bivariate, robust, often used for time series modeling. 

• Polynomial Curve Fitting: Bivariate, general purpose modeling of curves.

• Harmonic Curve Fitting: Bivariate, sine wave modeling.

• Gaussian Curve Fitting: Bivariate, modeling a Gaussian peak. 

• KNN Regression: Multivariate, robust, distance based, very flexible.
Probability Distributions
• Statistical models of probability

• Used to model risk

• Perform simulations

• Natural outlier detectors

• Math Expressions supports many of the commonly used probability distributions 

• There is an important relationship between the Normal Distribution and Regression Models
Training the Model
Use Case
• Detect unusual slowness in the network.

• Use Simple Linear Regression to model the linear relationship of file sizes and response times.

• Use the regression model to detect higher than expected response times.
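
The use case above can be sketched outside of Solr in a few lines of Python. This is a minimal illustration of ordinary least squares for one predictor, not the Math Expressions shown on the slides (those are not preserved in this text). The function names and the sample file sizes and response times are hypothetical.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x_value):
    """Predicted response time for a given file size."""
    slope, intercept = model
    return intercept + slope * x_value


# Hypothetical sample: file sizes and the response times observed for them.
file_sizes = [100.0, 200.0, 300.0, 400.0, 500.0]
response_times = [12.0, 21.0, 33.0, 39.0, 52.0]

model = fit_simple_linear_regression(file_sizes, response_times)
expected_at_250 = predict(model, 250.0)
```

The fitted slope and intercept give the expected response time for any file size, which is the baseline that later observations are compared against.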
Random Sample
Response
Set the Samples to a Variable
Response
Vectorize the File Sizes
Response
Vectorize the Responses
Response
Plotting the Responses (Sunplot by: Michael Suzuki)
Response Times
Simple Linear Regression
Response
Cache the Model
Assessing the Model
Statistical Analysis of the Residuals
• Residuals
– Calculate the Residuals

– Describe

– Normality Test

– Residual Plot (Homoscedasticity)

– Model the Residuals
Residuals
• The difference between the actual value and the predicted value is called the residual.

• Residuals represent the error of a regression model.

• Residuals can be analyzed and modeled as a probability distribution.

• In an ideal scenario the residuals will be normally distributed and homoscedastic.
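
As a rough illustration of the definition above, the following Python sketch computes residuals as actual minus predicted and summarizes them. The helper names and the sample values are hypothetical; the summary is only loosely analogous to the "Describe" step in the deck.

```python
from statistics import mean, stdev


def residuals(actual, predicted):
    """Residual = actual value minus predicted value, element by element."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [a - p for a, p in zip(actual, predicted)]


def describe(values):
    """Basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": mean(values),
        "stdev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


# Hypothetical observed response times and the model's predictions for them.
actual = [12.0, 21.0, 33.0, 39.0, 52.0]
predicted = [11.8, 21.6, 31.4, 41.2, 51.0]

res = residuals(actual, predicted)
summary = describe(res)
```

For a least squares fit that includes an intercept, the residuals average to zero on the training data, so the standard deviation and the shape of the distribution carry most of the information about model error.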
Predictions
Response
Calculate the Residuals
Response
Describe
Response
Testing for Normality
Response
Cache the Residuals Distribution
Residual Plot
Scatter Plot of Residuals
Prediction
Getting the Cached Model
Response
Request/Response Prediction
Response
Streaming Prediction
Response
Anomaly Detection
Streaming Anomaly Detection
Response
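
One way to picture the anomaly detection step is sketched below in Python. It assumes the residuals are modeled as a normal distribution and flags a record when its residual falls in the upper tail, which corresponds to a higher than expected response time. The model coefficients, the residual standard deviation, the 0.99 threshold, and the function name are all assumptions made for illustration; the deck's own expressions are not preserved in this text.

```python
from statistics import NormalDist


def is_anomaly(slope, intercept, residual_dist, file_size, response_time,
               threshold=0.99):
    """Flag a response time that is unusually high for its file size.

    The residual (actual minus predicted) is scored with the cumulative
    probability of the residual distribution; only the upper tail is
    flagged, so faster-than-expected responses are never anomalies.
    """
    predicted = intercept + slope * file_size
    residual = response_time - predicted
    return residual_dist.cdf(residual) > threshold


# Hypothetical fitted model and residual distribution.
slope, intercept = 0.098, 2.0
residual_dist = NormalDist(mu=0.0, sigma=1.5)

# Hypothetical incoming records: (file size, response time).
records = [(250.0, 27.0), (250.0, 40.0), (400.0, 41.5)]
flags = [is_anomaly(slope, intercept, residual_dist, fs, rt)
         for fs, rt in records]
```

Only the second record is flagged: its response time is far above the prediction for that file size, while the other two sit close to the regression line.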
Thank you!


Applied Mathematical Modeling with Apache Solr - Joel Bernstein, Lucidworks