Sylvain Zimmer / @sylvinus
PyParis 2017
DEVELOPER-FRIENDLY TASKQUEUES
WHAT WE LEARNED BUILDING MRQ
& WHAT YOU SHOULD ASK YOURSELF BEFORE CHOOSING ONE
/usr/bin/whoami
▸ (SpaceX nerd)
▸ Founder, dotConferences
▸ CTO Pricing Assistant
▸ Co-organizer Paris.py meetup
▸ User of Python taskqueues for 10+ years
▸ Main contributor of MRQ
4 years ago...
OMG !1!!!
TASKQUEUE FUNDAMENTALS
Credit: Adrien Di Pasquale
A typical job/task
def send_an_email(email_type, user):
    html = template(email_type, user)
    status = email.send(html, user["email"])
    metrics.send("email_%s" % status, 1)
    return status
KERNEL PANIC
Task properties
Re-entrant < Idempotent < Nullipotent
▸ Re-entrant: safe to interrupt and then retry
▸ Idempotent: safe to call multiple times, result will be the same
▸ Nullipotent: free of side-effects
def reentrant(a):
    value = a + random()
    db.insert(value)
def idempotent(key, value):
    db.update(key, value)
def nullipotent(a):
    return a ** 2
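The difference between the three properties shows up when a task is delivered more than once. A minimal sketch, assuming a hypothetical in-memory FakeDB in place of the db object on the slide (FakeDB and its rows/by_key fields are made up for illustration):

```python
import random


class FakeDB:
    """Hypothetical in-memory stand-in for a real database."""

    def __init__(self):
        self.rows = []      # append-only table
        self.by_key = {}    # key/value table

    def insert(self, value):
        self.rows.append(value)

    def update(self, key, value):
        self.by_key[key] = value


db = FakeDB()


def reentrant(a):
    # Safe to retry after an interruption, but every completed call adds a row.
    db.insert(a + random.random())


def idempotent(key, value):
    # Running it twice leaves the same state as running it once.
    db.update(key, value)


def nullipotent(a):
    # No side effects at all.
    return a ** 2


# Simulate a queue that delivers every task twice.
for _ in range(2):
    reentrant(1)
    idempotent("user:1", "active")
    nullipotent(3)
```

After the double delivery, the re-entrant task has written two rows while the idempotent one has left a single value behind.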
Other task properties & best practices
▸ Serializable args, serializable result
▸ Args validation / documentation
▸ Least args possible
▸ Canonical path vs. registration
▸ Concurrent safety
▸ Statuses
Coroutines vs. Threads vs. Processes
▸ IO-bound tasks vs. CPU-bound tasks
▸ Threads offer few benefits for a Python worker (GIL)
▸ Coroutines/Greenlets are ideal for IO-bound tasks
▸ Processes are required for CPU-bound tasks
▸ If you have heterogeneous tasks, your TQ should support both!
$ mrq-worker --greenlets 25 --processes 4
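Why coroutines suit IO-bound tasks can be shown with the standard library alone. This sketch swaps in asyncio coroutines for the greenlets named on the slide (a deliberate substitution, so it runs without gevent); io_task and run_all are hypothetical names, and the sleep stands in for a network call:

```python
import asyncio
import time


async def io_task(i):
    # Stand-in for an IO-bound call (HTTP request, database query, ...).
    await asyncio.sleep(0.1)
    return i


async def run_all(n):
    # Run n IO-bound tasks concurrently on a single thread.
    return await asyncio.gather(*(io_task(i) for i in range(n)))


start = time.perf_counter()
results = asyncio.run(run_all(25))
elapsed = time.perf_counter() - start
```

The 25 waits overlap, so the total time stays far below 25 x 0.1 s. A CPU-bound task would not yield at all and would block every other coroutine, which is the case for separate processes.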
Performance: latency & throughput
APP
BROKER WORKER
RESULT STORE
OPS & TOOLING
Developer-friendly task queues: what we learned building MRQ, Sylvain Zimmer
Errors
▸ Exception handlers
▸ Timeouts
▸ Retry rules
▸ Sentry & friends
▸ gevent: test your tracebacks!
▸ Priorities
▸ Human process to manage failed tasks!
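Retry rules and exception handling from the list above could be sketched as follows. This is not MRQ's API: RetryLater, run_with_retries and the status strings are hypothetical names chosen for illustration.

```python
import time


class RetryLater(Exception):
    """Hypothetical exception a task raises to ask for another attempt."""


def run_with_retries(task, max_retries=3, delay=0.0):
    """Run task(); return (status, result_or_error, attempts)."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return "success", task(), attempts
        except RetryLater:
            if attempts > max_retries:
                # Out of retries: leave the task for a human to look at.
                return "maxretries", None, attempts
            time.sleep(delay)
        except Exception as exc:
            # Unexpected error: do not retry blindly, record it instead.
            return "failed", exc, attempts


calls = {"n": 0}


def flaky():
    # Fails twice with a retryable error, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RetryLater()
    return "sent"


status, result, attempts = run_with_retries(flaky)
broken_status = run_with_retries(lambda: 1 / 0)[0]
```

Keeping "retryable" and "unexpected" errors on separate paths is one way to make the failed-task list small enough for a human process to handle.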
Task visibility
▸ Tasks by status, path, worker, ...
▸ Tracebacks & current stack
▸ Logs
▸ Timing info
▸ Cancel / Kill / Move tasks
▸ Progress
Memory leaks
▸ Workers = long-running processes
▸ gevent makes debugging harder
▸ Watch out for global variables or mutable class attributes!
▸ Python's ecosystem is surprisingly poor in this area
▸ guppy, objgraph can usually help
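The mutable class attribute pitfall is easy to reproduce. A small sketch with two hypothetical task classes, LeakyTask and CleanTask, run 1000 times each as a long-lived worker would:

```python
class LeakyTask:
    seen = []  # class attribute: one list shared by every instance

    def run(self, payload):
        self.seen.append(payload)  # grows for the lifetime of the worker
        return len(payload)


class CleanTask:
    def __init__(self):
        self.seen = []  # instance attribute: freed with the instance

    def run(self, payload):
        self.seen.append(payload)
        return len(payload)


for _ in range(1000):
    LeakyTask().run("x" * 10)
    CleanTask().run("x" * 10)

leaked = len(LeakyTask.seen)
```

Every LeakyTask instance is discarded after its run, yet the shared list keeps all 1000 payloads alive.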
Misc tools
▸ Scheduler
▸ Command-line runner, e.g. mrq-run tasks.myTask '{"a": 1}'
▸ Autoscaling
▸ Profiler
CONSISTENCY
Consistency guarantees
▸ At least once vs. At most once vs. Exactly once
▸ Ordering
▸ Critical operations:
▸ Queueing
▸ Marking tasks as started
▸ Timeouts & retries
Types of brokers
▸ Specialized message queues (RabbitMQ, SNS, Kafka, ...)
▸ Performance, complexity, poor visibility
▸ In-memory data stores (Redis, ...)
▸ Performance, simplicity, harder to scale
▸ Regular databases (MongoDB, PostgreSQL, ...)
▸ Often enough for the job!
At the heart of the broker
▸ Atomic update from "queued" to "started"
▸ MRQ with MongoDB broker: find_one_and_update()
▸ MRQ with Redis broker: Pushback in a ZSET
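The point of the atomic "queued" to "started" update is that two workers can never claim the same task. A minimal in-memory sketch, where a lock plays the role that an atomic operation such as find_one_and_update() plays in a real broker (InMemoryBroker and claim_one are hypothetical names, not MRQ code):

```python
import threading


class InMemoryBroker:
    """Hypothetical broker: the lock makes find-and-update one atomic step."""

    def __init__(self, job_ids):
        self._lock = threading.Lock()
        self.jobs = {job_id: "queued" for job_id in job_ids}

    def claim_one(self):
        """Flip one job from "queued" to "started" and return its id."""
        with self._lock:
            for job_id, status in self.jobs.items():
                if status == "queued":
                    self.jobs[job_id] = "started"
                    return job_id
        return None  # nothing left to claim


broker = InMemoryBroker(range(100))
claimed = []  # list.append is thread-safe in CPython


def worker():
    while True:
        job_id = broker.claim_one()
        if job_id is None:
            return
        claimed.append(job_id)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With eight competing workers, each of the 100 jobs ends up claimed exactly once. Splitting the find and the update into two separate steps would reintroduce the race.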
ZSETs in Redis
▸ Sorted sets with O(log(N)) scalability
▸ set/get by key, order by key, lookups by key or value
▸ Very interesting properties for task queues: Uniqueness, Ordering, Atomicity of updates, Performance
▸ MRQ's "Pushback" model:
▸ Queue with key=timestamp
▸ Unqueue by fetching key range & setting new keys in
the future
▸ After completion the task adjusts or removes the key
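The three Pushback steps above can be imitated in memory. This is a sketch of the idea only, not MRQ's implementation: PushbackQueue, visibility_timeout and the integer timestamps are assumptions, and a plain dict stands in for the Redis sorted set.

```python
class PushbackQueue:
    """In-memory imitation of a sorted set: job id -> timestamp."""

    def __init__(self):
        self.scores = {}

    def enqueue(self, job_id, now):
        # Job ids are unique: re-adding one only updates its timestamp.
        self.scores[job_id] = now

    def dequeue(self, now, visibility_timeout, limit=1):
        # Fetch the jobs that are due, oldest first...
        due = sorted((s, j) for j, s in self.scores.items() if s <= now)[:limit]
        # ...and push them into the future instead of deleting them.
        for _, job_id in due:
            self.scores[job_id] = now + visibility_timeout
        return [job_id for _, job_id in due]

    def complete(self, job_id):
        # A finished job removes its own entry.
        self.scores.pop(job_id, None)


q = PushbackQueue()
q.enqueue("a", now=0)
q.enqueue("b", now=1)

first = q.dequeue(now=2, visibility_timeout=10)    # "a" is oldest
second = q.dequeue(now=3, visibility_timeout=10)   # "a" is hidden until 12
q.complete("b")                                    # "b" finished normally
third = q.dequeue(now=5, visibility_timeout=10)    # nothing is due yet
fourth = q.dequeue(now=20, visibility_timeout=10)  # "a" never completed
```

Job "a" is never marked complete, as if its worker had crashed, so it becomes visible again once its pushed-back timestamp has passed. In a real broker the fetch and the push-back would have to happen as one atomic operation.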
Consistency guarantees
▸ Must be thought of for the whole system, not just the
broker!
▸ Brokers can be misused or misconfigured
▸ The workers can drop tasks if they want to ;-)
▸ Consistency starts at queueing time!
TIME TO CHOOSE!
Think hard about what you need
▸ Will your taskqueue be the foundation of your
architecture, or is it just a side project?
▸ What performance do you need? (IO vs. CPU, latency, ...)
▸ What level of visibility and control do you need on queued
& running tasks?
▸ Can workers terminate abruptly? Lots of design
consequences!
▸ What language interop do you need?
And then all the usual questions...
▸ Is it supported by a lively community?
▸ License
▸ Documentation
▸ Future plans
Which one to pick?
▸ Celery: High performance, large community, very complex,
major upgrades painful
▸ RQ: Extremely simple to understand, low performance
▸ MRQ: Adjust task visibility vs. performance, simple to
understand, 1.0 soon
▸ Lots of other valid options! Just be sure to ask yourself the
right questions ;-)
BE GRATEFUL FOR
THE OSS YOU USE!
REMINDER
Hiring Pythonistas!
QUESTIONS?
THANKS!
Photo credits: https://www.flickr.com/photos/spacex/
