©2013 DataStax Confidential. Do not distribute without consent.
@rustyrazorblade
Jon Haddad

Technical Evangelist, DataStax
Python Performance Profiling
What are our goals?
• Understand potential bottlenecks in dev
• Testing
• Call graphs
• Understand code once it's in production
• Micro benchmarks
• Automatic logging of slow DB queries / API calls
• Gather evidence
• No guessing
• We need insight into both environments
Why do we need it in dev & prod?
• Dev != production
• No network latency on our desktops
• Round trips are cheap in dev
• Rarely hitting disk (DB fully in memory)
• Zero CPU contention
• Failure / failover rarely tested
Before Production
Approaches in Dev
• Unit / functional tests
• Code coverage is important
• If you’re not testing it, it’s probably broken
• Must be reliable, repeatable
• Always keep production in mind
• Know your hardware
• Load test regularly
• Jenkins performance plugin
Finding slow tests is easy
Sometimes it's unavoidable…
• Make sure you mark tests that are expected to be slow
• These are frequently testing offline tasks in functional tests
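One way to mark tests that are expected to be slow is a skip decorator, so the default run stays fast and the slow tests run only on request. The sketch below uses only `unittest` from the standard library. The `slow` decorator, the `RUN_SLOW_TESTS` environment variable, and the test names are hypothetical, chosen for illustration rather than taken from the talk.

```python
import os
import unittest

# Hypothetical switch: set RUN_SLOW_TESTS=1 to include slow tests in a run.
RUN_SLOW = os.environ.get("RUN_SLOW_TESTS") == "1"

# Decorator that marks a test as expected to be slow.
slow = unittest.skipUnless(RUN_SLOW, "slow test; set RUN_SLOW_TESTS=1 to run")


class ReportTests(unittest.TestCase):
    def test_fast_path(self):
        self.assertEqual(sum(range(10)), 45)

    @slow
    def test_offline_rebuild(self):
        # Stand-in for a long-running offline task exercised by a functional test.
        self.assertEqual(len([i for i in range(1000000)]), 1000000)
```

Running the module with `python -m unittest` reports the marked test as skipped unless the variable is set, which keeps the expected-slow tests visible without hiding genuinely slow regressions among them.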
Profiler - Hotshot
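The `hotshot` module shipped only with Python 2 and was removed in Python 3. As a rough equivalent, the standard library's `cProfile` collects the same kind of per-function timings. The sketch below is an assumption about how one might wrap it, not the code shown in the talk; `slow_sum` and `profile_call` are hypothetical names.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Deliberately naive loop so it shows up in the profile.
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    value, report = profile_call(slow_sum, 100000)
    print(report)
```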
pycallgraph
• Understand code structure and flow
• Summarize times
• Darker colors represent more time spent
Blocking I/O
• Usually the problem with web servers
• Apps can be CPU bound but it's less frequent
Moving past blocking I/O
• Event libraries!
• libev most stable
• gevent is a beautiful wrapper
• Pool.map() is your friend
• async can hide issues & make code harder to profile
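The appeal of `Pool.map()` is that blocking calls overlap instead of running back to back. The sketch below is not gevent: to stay runnable with only the standard library, it swaps in `multiprocessing.pool.ThreadPool`, which exposes a similarly shaped `map()` method. `fetch` and `fetch_all` are hypothetical names, and the blocking call is simulated with `time.sleep`.

```python
import time
from multiprocessing.pool import ThreadPool


def fetch(item_id):
    # Hypothetical blocking call (an HTTP or DB round trip), simulated with sleep.
    time.sleep(0.05)
    return item_id * 2


def fetch_all(item_ids, concurrency=8):
    # map() blocks until every call finishes and preserves input order.
    with ThreadPool(concurrency) as pool:
        return pool.map(fetch, item_ids)


if __name__ == "__main__":
    start = time.perf_counter()
    results = fetch_all(range(8))
    elapsed = time.perf_counter() - start
    print(results, "in %.2fs" % elapsed)
```

With gevent installed, `gevent.pool.Pool(8).map(fetch, item_ids)` has the same shape, provided the blocking calls are gevent-aware (for example after monkey patching).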
Profiler - GreenletProfiler
• Takes into account greenlets
• Generates callgrind files
• Mac Users: qcachegrind
In Production
Profile with minimal overhead
• We need something really lightweight!
• Our applications can time EVERYTHING
• API requests
• database queries
• individual functions
• small blocks of code
• statsd is our friend
• microtimers, counters
• Integrates with Librato, Graphite
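Timing a small block of code and shipping the result to statsd can be sketched with nothing but a UDP socket, since statsd accepts plain-text lines such as `name:value|ms` for timers and `name:value|c` for counters. This is a minimal illustration rather than a real client library: the address assumes a statsd daemon on localhost at its default port 8125, and the helper and metric names are hypothetical.

```python
import socket
import time
from contextlib import contextmanager

# Assumed: a statsd daemon listening locally on its default UDP port.
STATSD_ADDR = ("127.0.0.1", 8125)


def format_timing(name, millis):
    # statsd timer line: <bucket>:<value>|ms (value truncated to whole ms)
    return "%s:%d|ms" % (name, millis)


def format_counter(name, count=1):
    # statsd counter line: <bucket>:<value>|c
    return "%s:%d|c" % (name, count)


def send_metric(line, addr=STATSD_ADDR):
    # Fire-and-forget UDP: a metrics failure should never break the request.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), addr)
    except OSError:
        pass
    finally:
        sock.close()


@contextmanager
def timer(name, send=send_metric):
    # Times the enclosed block and reports it even if the block raises.
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        send(format_timing(name, elapsed_ms))


if __name__ == "__main__":
    with timer("api.get_user"):
        time.sleep(0.01)
    send_metric(format_counter("api.get_user.calls"))
```

UDP keeps the overhead low because the application never waits for the metrics server, which is what makes it reasonable to time everything in production.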
statsd + graphite / grafana
Logging
• Log slow database queries / API calls automatically
• Log & aggregate errors
• What table was hit?
• Read or write?
• What was the query?
• Can we reproduce it?
• Logstash / splunk / etc
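Automatic logging of slow calls can be sketched as a decorator that times each call and emits a log record only when a threshold is exceeded. The example below uses the standard `logging` module. The `log_if_slow` decorator, the `slow_calls` logger name, the threshold, and `get_user` are all hypothetical; a real setup would record whatever answers the questions above (table, read or write, the query itself).

```python
import functools
import logging
import time

logger = logging.getLogger("slow_calls")


def log_if_slow(threshold_seconds, operation):
    """Log a warning when the wrapped call takes longer than threshold_seconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                if elapsed > threshold_seconds:
                    # Include enough context to reproduce the call later.
                    logger.warning(
                        "slow call op=%s func=%s elapsed=%.3fs args=%r",
                        operation, func.__name__, elapsed, args,
                    )
        return wrapper
    return decorator


@log_if_slow(threshold_seconds=0.01, operation="read")
def get_user(user_id):
    # Hypothetical database read, simulated with sleep.
    time.sleep(0.02)
    return {"id": user_id}


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    get_user(42)
```

Because the output goes through `logging`, the records can be forwarded to an aggregator by swapping the handler, without touching the decorated functions.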
