Doing data science
with Clojure
@sbelak
simon@goopti.com
The analytics chasm
• < 2 min: ideal. Almost real-time, can be done during brainstorming without disrupting flow
• < 20 min: squeeze in somewhere in the day
• Project: fail / roadmap ahoy!
Easy things should be easy,
and hard things should be
possible.
— L. Wall
Data frames considered
harmful
• Data frame (=table) conflates representation and
abstraction
• Clojure excels at structure manipulation/encoding
github.com/sbelak/huri
• No data structures, just functions over collections
• Composable (even DSLs — no macros!)
• Reasonably fast (transducers <3)
• Do-what-I-mean (auto-sort, liberal with inputs, …)
• Minimal buy-in
• Support reaching into nested structures everywhere
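As a rough illustration of "just functions over collections", the sketch below uses only clojure.core, not Huri's own API. The `trips` data and all names in it are made up for the example. It shows a transducer pipeline and grouping on a nested key, with a plain vector of maps standing in for a table.

```clojure
;; Hypothetical sample data: a plain vector of nested maps, no table type.
(def trips
  [{:id 1 :route {:from "LJ" :to "VCE"} :price 19.0 :seats 3}
   {:id 2 :route {:from "LJ" :to "TRS"} :price 12.0 :seats 1}
   {:id 3 :route {:from "ZG" :to "VCE"} :price 25.0 :seats 2}
   {:id 4 :route {:from "LJ" :to "VCE"} :price 21.0 :seats 4}])

;; A transducer: filter on a nested key and compute revenue per row
;; in a single pass, without intermediate sequences.
(def to-venice-revenue
  (comp (filter #(= "VCE" (get-in % [:route :to])))
        (map #(* (:price %) (:seats %)))))

(def total-revenue
  (transduce to-venice-revenue + 0.0 trips))

;; Grouping on a nested key with ordinary core functions;
;; the sorted map keeps the result in a predictable order.
(def trips-per-origin
  (into (sorted-map)
        (map (fn [[origin rows]] [origin (count rows)]))
        (group-by #(get-in % [:route :from]) trips)))
```

Because the rows are ordinary maps, `get-in` reaches nested values at any step, and the same transducer can be reused with `into`, `sequence`, or `transduce`.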
Composability is key to
quick iterating
• Curried versions where possible
• ->> and partial friendly
• Side benefit: consistent API
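A minimal sketch of what "curried versions" and "->> and partial friendly" can look like in practice. The helpers `keep-where` and `pluck` are hypothetical and written here only for illustration; they are not Huri functions. Each takes its configuration first and the collection last, and returns a function when the collection is omitted.

```clojure
;; Hypothetical helpers: collection goes last so they thread with ->>,
;; and the one-argument arity returns a partially applied function.
(defn keep-where
  ([pred] (partial keep-where pred))
  ([pred rows] (filter pred rows)))

(defn pluck
  ([k] (partial pluck k))
  ([k rows] (map k rows)))

(defn mean [xs]
  (/ (reduce + xs) (count xs)))

;; Made-up sample data.
(def orders
  [{:customer "a" :total 10}
   {:customer "b" :total 30}
   {:customer "a" :total 20}])

;; Threaded form: reads top to bottom.
(def mean-a
  (->> orders
       (keep-where #(= "a" (:customer %)))
       (pluck :total)
       mean))

;; The same steps as a reusable function, built from the curried arities.
;; comp applies right to left: filter, then pluck, then mean.
(def mean-total-of-a
  (comp mean (pluck :total) (keep-where #(= "a" (:customer %)))))
```

Both forms give the same answer, so an exploratory `->>` pipeline can be turned into a named, reusable step without rewriting its parts.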
“This is possibly Clojure’s most important
property: the syntax expresses the code’s
semantic layers. An experienced reader of
Clojure can skip over most of the code and
have a lossless understanding of its high-
level intent.”
— Z. Tellman, Elements of Clojure
Live programming
Catching errors early → more context → easier debugging → faster iterating
clojure.spec
Queryable data
descriptions
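One way to read "queryable data descriptions" is that a spec can both validate data and be inspected as data. The sketch below assumes Clojure 1.9 or later (where `clojure.spec.alpha` ships with the language); the `::trip` spec and sample maps are invented for the example.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical description of one observation.
(s/def ::price (s/and number? pos?))
(s/def ::seats pos-int?)
(s/def ::trip (s/keys :req-un [::price ::seats]))

(def good {:price 19.0 :seats 3})
(def bad  {:price -1 :seats 3})

;; Validation returns a plain boolean.
(def good-ok? (s/valid? ::trip good))
(def bad-ok?  (s/valid? ::trip bad))

;; explain-data reports failures as ordinary maps that can be
;; filtered, grouped, or shown in a notebook.
(def problems
  (:clojure.spec.alpha/problems (s/explain-data ::trip bad)))

;; The spec itself can be read back as a data structure.
(def trip-form (s/form ::trip))
```

Each entry in `problems` carries keys such as `:val` and `:pred`, so catching an error early also yields the context needed to debug it.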
<3 Bret Victor
Think in distributions,
not numbers
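A small sketch of why a single number can mislead. The `latencies` values are made up, and `quantile` is a simple nearest-rank implementation written for this example, not a library function.

```clojure
;; Hypothetical measurements; a single outlier dominates the mean.
(def latencies [12 15 11 14 13 16 12 400 15 13])

(defn mean [xs]
  (/ (reduce + xs) (double (count xs))))

(defn quantile
  "Nearest-rank quantile q (0 < q <= 1) of a non-empty collection."
  [q xs]
  (let [sorted (vec (sort xs))
        rank   (long (Math/ceil (* q (count sorted))))]
    (nth sorted (dec (max rank 1)))))

;; A few points of the distribution instead of one summary number.
(def summary
  {:mean (mean latencies)
   :p50  (quantile 0.5 latencies)
   :p90  (quantile 0.9 latencies)
   :max  (apply max latencies)})
```

Here the mean lands far above the median and the 90th percentile, which is only visible when several points of the distribution are reported together.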
The power of
sharing runtime
Notebooks as
dashboards
The ecosystem
What about machine learning?
farm it out to
sklearn
Mini compilers targeting
a specific library in
another language
huri.plot
• DSL that compiles to ggplot2
• Targets Gorilla REPL
• Follows the rest of Huri’s design philosophy
• bar chart, scatter plot, line chart, box & violin plot,
heatmap, histogram
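The "mini compiler" idea can be sketched as a function from a plain data description to source code in the target language. The example below is not huri.plot's implementation; the `plot-spec` shape, `geoms` table, and `compile-plot` function are assumptions made for illustration. Only the emitted R calls (`ggplot`, `aes`, `geom_point`, `geom_line`, `geom_histogram`, `labs`) are real ggplot2 functions.

```clojure
(require '[clojure.string :as str])

;; Hypothetical plot description as plain data.
(def plot-spec
  {:data  "df"
   :aes   {:x "price" :y "seats"}
   :geom  :point
   :title "Seats vs. price"})

;; Lookup from DSL keyword to the ggplot2 layer it compiles to.
(def geoms
  {:point     "geom_point()"
   :line      "geom_line()"
   :histogram "geom_histogram()"})

(defn aes->r
  "Render an aesthetics map as an R aes(...) call, keys in sorted order."
  [aes]
  (str "aes("
       (str/join ", " (for [[k v] (sort-by key aes)]
                        (str (name k) " = " v)))
       ")"))

(defn compile-plot
  "Compile a plot description into a ggplot2 expression string."
  [{:keys [data aes geom title]}]
  (str/join " + "
            (cond-> [(str "ggplot(" data ", " (aes->r aes) ")")
                     (geoms geom)]
              title (conj (str "labs(title = " (pr-str title) ")")))))
```

Because the description is ordinary data and the compiler is an ordinary function, plots compose with the rest of a pipeline without macros; the resulting string would then be handed to an R process for rendering.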
Takeaways
• Speed-of-answer matters
• Data science is about communication
• We don’t have to reinvent every wheel in Clojure
• Clojure is fantastic at structure manipulation, play
to its strengths
• Blurring the line between environment and work is
a powerful idea
Questions
@sbelak
github.com/sbelak/huri
