GoDataDriven
PROUDLY PART OF THE XEBIA GROUP
Real time data driven applications (and SQL vs NoSQL databases)
Giovanni Lanzani
Data Whisperer
Who am I?
2008-2012: PhD Theoretical Physics
2012-2013: KPMG
2013-Now: GoDataDriven
Feedback
@gglanzani
Real-time, data driven app?
•No store and retrieve;
•Store, {transform, enrich, analyse} and retrieve;
•Real-time: retrieve is not a batch process;
•App: something your mother could use:
SELECT attendees
FROM NoSQLMatters
WHERE password = '1234';
Get insight about event impact
Challenges
1. Big Data;
2. Privacy;
3. Some real-time analysis;
4. Real-time retrieval.
Is it Big Data?
Everybody talks about it
Nobody knows how to do it
Everyone thinks everyone else is doing it, so everyone claims they’re doing it…
Dan Ariely
Is it Big Data?
•Raw logs are in the order of 40TB;
•We use Hadoop for storing, enriching and pre-processing.
2. Privacy
3. (Some) real-time analysis
4. Real-Time Retrieval
•Harder than it looks;
•Large data;
•Retrieval is by giving a date, a center location and a radius.
Architecture
Front-end (AngularJS) -- REST / JSON -- Back-end (python app)
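A minimal sketch of the back-end half of this picture, assuming a plain WSGI app from the standard library. The slides only say "python app", so the framework, the `/statistics` endpoint name and the payload values are all hypothetical.

```python
import json
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Tiny REST-style endpoint that answers with JSON (hypothetical routes)."""
    if environ["PATH_INFO"] == "/statistics":
        status, payload = "200 OK", {"sbi": 1, "total": 80}  # made-up values
    else:
        status, payload = "404 Not Found", {"error": "unknown endpoint"}
    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def call(path):
    """Call the app in-process, the way a front-end request would hit it."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the mandatory WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


status, payload = call("/statistics")
```

To serve it for real, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` would do for a demo; the front-end then only ever sees JSON.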
JS-1
JS-2
Data Example
date        hour  id_activity  postcode  hits  delta  sbi
2013-01-01  12    1234         1234AB    35    22     1
2013-01-08  12    1234         1234AB    45    35     1
2013-01-01  11    2345         5555ZB    2     1      2
2013-01-08  11    2345         5555ZB    55    2      2
helper.py example
def get_statistics(data, sbi):
    # select * from data where sbi = sbi
    sbi_df = data[data.sbi == sbi]
    hits = sbi_df.hits.sum()  # select sum(hits) from …
    delta_hits = sbi_df.delta.sum()  # select sum(delta) from …
    if delta_hits:  # avoid dividing by zero
        percentage = (hits - delta_hits) / delta_hits
    else:
        percentage = 0
    return {"sbi": sbi, "total": hits, "percentage": percentage}
helper.py example
def get_timeline(data, sbi):
    # select sum(hits), sum(delta) from data group by date, hour, sbi
    # (the sbi argument is not used: every sbi value stays as a group key)
    df_sbi = data.groupby(["date", "hour", "sbi"])[["hits", "delta"]].sum()
    return df_sbi
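To try the two helpers, here is a sketch that loads the four rows from the Data Example slide and runs condensed copies of both functions. The casts to `int` and the JSON step are additions, not from the slides: they assume the results go back to the front-end as JSON, and the standard `json` module does not accept numpy integers.

```python
import json

import pandas as pd

# The four rows from the "Data Example" slide.
data = pd.DataFrame({
    "date": ["2013-01-01", "2013-01-08", "2013-01-01", "2013-01-08"],
    "hour": [12, 12, 11, 11],
    "id_activity": [1234, 1234, 2345, 2345],
    "postcode": ["1234AB", "1234AB", "5555ZB", "5555ZB"],
    "hits": [35, 45, 2, 55],
    "delta": [22, 35, 1, 2],
    "sbi": [1, 1, 2, 2],
})


def get_statistics(data, sbi):
    # Condensed copy of the slide's helper; int() makes the sums JSON friendly.
    sbi_df = data[data["sbi"] == sbi]
    hits = int(sbi_df["hits"].sum())
    delta_hits = int(sbi_df["delta"].sum())
    percentage = (hits - delta_hits) / delta_hits if delta_hits else 0
    return {"sbi": sbi, "total": hits, "percentage": percentage}


def get_timeline(data, sbi):
    # Same grouping as on the slide; sbi is unused there as well.
    return data.groupby(["date", "hour", "sbi"])[["hits", "delta"]].sum()


stats = get_statistics(data, 1)
payload = json.dumps(stats)  # what the REST layer would send

timeline = get_timeline(data, 1)
# pandas serialises its own dtypes, so no manual casting is needed here.
timeline_json = timeline.reset_index().to_json(orient="records")
```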
Who has my data?
•First iteration was a (pre)-POC, with less data (3GB vs 500GB);
•Time constraints;
•Oops: everything is a pandas df!
Advantage of “everything is a df”
Pro:
•Fast!!
•Use what you know
•NO DBAs!
•We all love CSVs!
Contra:
•Doesn’t scale;
•Huge startup time;
•NO DBAs!
•We all hate CSVs!
If you want to go down this path
•Set the dataframe index wisely;
•Align the data to the index:
    source_data.sort_index(inplace=True)
•Beware of modifications of the original dataframe!
The reason pandas is faster is because I came up with a better algorithm
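A small sketch of the three tips together. The choice of `date` and `postcode` as index columns is an assumption (they are what the app filters on); the slides do not say which index was used.

```python
import pandas as pd

source_data = pd.DataFrame({
    "date": pd.to_datetime(
        ["2013-01-08", "2013-01-01", "2013-01-08", "2013-01-01"]
    ),
    "postcode": ["5555ZB", "1234AB", "1234AB", "5555ZB"],
    "hits": [55, 35, 45, 2],
})

# 1. Set the index on the columns used for lookups (assumed: date, postcode).
# 2. Sort once at startup so the data is aligned to the index.
indexed = source_data.set_index(["date", "postcode"]).sort_index()

# Lookups on the first index level are now plain .loc calls.
day = indexed.loc[pd.Timestamp("2013-01-01")]

# 3. Copy before changing anything, so the shared dataframe stays intact.
subset = day.copy()
subset["hits"] = subset["hits"] * 2
```

Since the dataframe lives in memory for the lifetime of the app, every request handler sees the same object; working on a copy keeps one request from changing what the next one reads.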
If you don’t
Front-end (AngularJS) -- REST / JSON -- Back-end (python app) -- ? -- Database
A word about (traditional) databases…
Db: programming language dict
Postgres for data driven apps?
Issues?!
•With a radius of 10km, in Amsterdam, you get 10k postcodes. You need to do this in your SQL:
    SELECT * FROM datapoints
    WHERE date IN date_array
      AND postcode IN postcode_array;
•Even with an index on date and postcode, single queries ran for more than 20 minutes.
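The slides do not show how `postcode_array` is built from the center location and radius. One plausible way, sketched here, is a great-circle (haversine) distance check against postcode centroids. The `CENTROIDS` table and its coordinates are illustrative values, not real postcode data.

```python
import math

# Hypothetical postcode centroids as (latitude, longitude) in degrees.
CENTROIDS = {
    "1012": (52.3731, 4.8926),
    "1071": (52.3579, 4.8812),
    "3511": (52.0907, 5.1214),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def postcodes_within(lat, lon, radius_km, centroids=CENTROIDS):
    """Postcodes whose centroid lies within radius_km of (lat, lon)."""
    return sorted(
        pc for pc, (plat, plon) in centroids.items()
        if haversine_km(lat, lon, plat, plon) <= radius_km
    )


# Center roughly in Amsterdam, radius 10 km as on the slide.
postcode_array = postcodes_within(52.3702, 4.8952, 10)
```

With real data this list grows to the 10k postcodes mentioned above, which is what makes the `IN` clause so expensive.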
Postgres + Postgis (2.x)
PostGIS is a spatial database extender for PostgreSQL. It supports geographic objects, allowing location queries:
    SELECT *
    FROM datapoints
    WHERE ST_DWithin(geog, ST_SetSRID(ST_MakePoint(lon, lat), 4326)::geography, 1500)
      AND dates IN ('2013-02-30', '2013-02-31');
    -- every point within 1.5km
    -- from (lat, lon) on imaginary dates
    -- (geog is an assumed geography column on datapoints)
Other db’s?
How we solved it
1. Align data on disk by date;
2. Use the temporary table trick:
    -- postcode_array and dates_array stand for the values supplied by the app
    CREATE TEMPORARY TABLE tmp (postcodes TEXT NOT NULL PRIMARY KEY);
    INSERT INTO tmp (postcodes) VALUES postcode_array;
    SELECT * FROM tmp
    JOIN datapoints d
    ON d.postcode = tmp.postcodes
    WHERE d.dt IN dates_array;
3. Lose precision: 1234AB→1234
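The temporary table trick can be tried end to end with the standard library. This sketch uses an in-memory SQLite database as a stand-in for Postgres, so it shows the shape of the queries only, not the performance; the table contents are the example rows with postcodes already truncated to four digits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for Postgres here
conn.execute("CREATE TABLE datapoints (dt TEXT, postcode TEXT, hits INTEGER)")
conn.executemany(
    "INSERT INTO datapoints VALUES (?, ?, ?)",
    [
        ("2013-01-01", "1234", 35),
        ("2013-01-08", "1234", 45),
        ("2013-01-01", "5555", 2),
        ("2013-01-08", "5555", 55),
    ],
)

# Values the app would supply for one request.
postcode_array = ["1234"]
dates_array = ["2013-01-01", "2013-01-08"]

# Load the postcodes into a temporary table instead of a huge IN (...) list.
conn.execute(
    "CREATE TEMPORARY TABLE tmp (postcodes TEXT NOT NULL PRIMARY KEY)"
)
conn.executemany(
    "INSERT INTO tmp (postcodes) VALUES (?)",
    [(pc,) for pc in postcode_array],
)

# The short date list can stay an IN clause, passed as bound parameters.
placeholders = ", ".join("?" for _ in dates_array)
rows = conn.execute(
    "SELECT d.dt, d.postcode, d.hits FROM tmp "
    "JOIN datapoints d ON d.postcode = tmp.postcodes "
    f"WHERE d.dt IN ({placeholders}) ORDER BY d.dt",
    dates_array,
).fetchall()
```

The join lets the database match postcodes through the primary key of `tmp` rather than comparing every row against thousands of literals.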
Take home messages
1. Geospatial problems are “hard” and can kill your queries;
2. Not everybody has infinite resources: be smart and KISS!
3. SQL or NoSQL? (Size, schema)
GoDataDriven
We’re hiring / Questions? / Thank you!
@gglanzani
giovannilanzani@godatadriven.com
Giovanni Lanzani
Data Whisperer
