IGUAZÚ 
A Job Scheduler Using Mesos and Docker 
Colleen Lee 
Software Engineer
"COURSERA" IN 2011
• Work to be done: gradebook exports, regrading 
quizzes, sending batch emails, encoding videos, etc.
CASCADE
Cascade: Lifecycle of a Job
HOW DOES CASCADE WORK?
• Client sends job data to SQS and job information to the database
• Cascade workers poll SQS; a worker picks up the new job
• While the job is "Running...", the worker writes status information to the database
• "No new jobs!": the worker sleeps ("Zzz...") and polls again
• Client asks "Status?" and reads the job status from the database
• "Job is done!"
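The poll-and-run loop on these slides can be sketched roughly as follows. This is a minimal in-memory illustration, not Cascade's actual code: `queue.Queue` stands in for SQS, a plain dict stands in for the database, and the names `submit`, `run_worker`, and `handlers` are hypothetical.

```python
import queue


def submit(jobs, statuses, job_id, kind, payload):
    """Client side: record job information, then enqueue the job data."""
    statuses[job_id] = "queued"          # job information -> "database"
    jobs.put((job_id, kind, payload))    # job data -> "SQS"


def run_worker(jobs, statuses, handlers, max_idle_polls=1):
    """Worker side: poll for jobs until the queue stays empty.

    A real worker would sleep between empty polls and loop forever;
    this sketch stops after `max_idle_polls` empty polls so it terminates.
    """
    idle = 0
    while idle < max_idle_polls:
        try:
            job_id, kind, payload = jobs.get_nowait()   # "Poll"
        except queue.Empty:
            idle += 1                                   # "No new jobs!"
            continue
        idle = 0
        statuses[job_id] = "running"                    # status information
        try:
            handlers[kind](payload)                     # run the job in-process
            statuses[job_id] = "done"
        except Exception:
            statuses[job_id] = "failed"
```

Note that the handler runs inside the worker process itself, which is the arrangement the isolation and deployment slides go on to criticize.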
LACK OF ISOLATION
[Diagram: Worker 1 and Worker 2 under Cascade, sharing the same CPUs and memory]
FRAGILE DEPLOYMENT FLOW
• System code vs. job code
• System code: rarely updated. Job code: updated all the time
• Unique system: no deploy tooling
• Worker: poll for job, run job, poll for job, run job ... update job code???
• Update job code! Worker (new code): poll for job, run job, poll for job, run job ...
[Diagram: three Cascade boxes, each with three workers]
POOR DEVELOPMENT STORY
• Production: client, SQS, Cascade workers, database
• Dev 1 and Dev 2 each have their own client, Cascade worker, and DB, but there is only one SQS queue
• One developer's client sends job data; the other developer's worker polls and receives the new job
• "Where'd my job go?" / "What is this job?"
CASCADE HAS ... SOME ISSUES
• Lack of isolation
• Fragile deployment flow
• Poor development story
• Tied exclusively to one language
  Cascade-Scala? Cascade-Python?
  Duplicating work: BAD
  Static partitioning: BAD
2014: CASCADE V2??
DATABASE MIGRATIONS!!
SCHEDULED JOBS!!
PROGRAMMING ASSIGNMENTS!
IGUAZÚ!
MESOS
• Resource isolation: cgroups
• Master(s) has/have "soft state"
• Coordination AND robustness
• Implementation:
  • Scheduler: accepts and manages resources
  • Executor: process launched on slaves to run tasks
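To make "Scheduler: accepts and manages resources" concrete, here is a small sketch of matching pending tasks against resource offers. It is a simplified model and does not use the real Mesos API: the `Offer` and `Task` classes, the `match_offers` function, and the first-fit policy are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources one slave is offering (hypothetical, simplified)."""
    slave: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    """Resources one job asks for (hypothetical, simplified)."""
    name: str
    cpus: float
    mem_mb: int


def match_offers(offers, pending):
    """First-fit: put each task on the first offer that still has room.

    Returns (placed, waiting): a task-name -> slave mapping, and the names
    of tasks that fit nowhere and must wait for a later round of offers.
    """
    remaining = {o.slave: [o.cpus, o.mem_mb] for o in offers}
    placed, waiting = {}, []
    for task in pending:
        for slave, free in remaining.items():
            if task.cpus <= free[0] and task.mem_mb <= free[1]:
                free[0] -= task.cpus      # reserve the resources we accepted
                free[1] -= task.mem_mb
                placed[task.name] = slave
                break
        else:
            waiting.append(task.name)
    return placed, waiting
```

Because each task declares its CPU and memory needs up front, the resources it is granted can then be enforced per task (the slides credit cgroups for this), rather than letting workers contend for a shared machine.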
DOCKER
• Lightweight, but provides abstraction of a VM
• Dockerfiles: self-documenting!
• Private Docker registry: convenience, versioning
• Usage: specify the image and specify a command
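"Specify the image and specify a command" could look something like the sketch below, which turns a job description into a `docker run` argument list. The `JobSpec` class, the `docker_argv` helper, and the registry and image names are hypothetical; `--rm` and `--memory` are standard `docker run` flags. How Iguazú actually invokes Docker is not shown in the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobSpec:
    """A job is just an image plus a command (hypothetical structure)."""
    image: str
    command: List[str]
    memory_mb: int = 512


def docker_argv(spec: JobSpec) -> List[str]:
    """Build the argument list for running the job in a throwaway container."""
    return [
        "docker", "run",
        "--rm",                               # remove the container when the job exits
        "--memory", f"{spec.memory_mb}m",     # cap the container's memory
        spec.image,
        *spec.command,
    ]


# Example job; the registry host and image tag are made up for illustration.
spec = JobSpec(
    image="registry.example.com/jobs/gradebook-export:1.4",
    command=["python", "export.py", "--course", "42"],
)
argv = docker_argv(spec)
```

The resulting list can be handed to `subprocess.run(argv)` on a machine with Docker installed. Since the image tag carries the job code version, updating job code amounts to pushing a new image to the registry.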
Iguazú: Lifecycle of a Job
HOW DOES IGUAZÚ WORK?
• Client sends job data to Iguazú and gets back a job id; job information goes to the database
• Production mode: jobs are queued in SQS. Development mode: jobs are queued in memory
• Manager polls the queue and passes job data to the scheduler
• Scheduler works with the Mesos master(s), which sit in front of the slaves
• An executor is launched on a slave and checks the registry for a new image
• Executor sends status updates, ending with TASK_FINISHED
• "All done!" ... "OK!"
• Client asks "Status?" and gets the job status back: "Job is done!"
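The lifecycle above, including the production/development split, can be sketched with a small queue abstraction. Everything here is an illustrative assumption rather than Iguazú's real interface: `InMemoryQueue` plays the development-mode queue, a production version would expose the same `send`/`receive` pair backed by SQS, and the `launch` callback stands in for the scheduler and executor.

```python
import itertools
from collections import deque


class InMemoryQueue:
    """Development-mode queue: same two operations as the SQS-backed one."""

    def __init__(self):
        self._items = deque()

    def send(self, message):
        self._items.append(message)

    def receive(self):
        return self._items.popleft() if self._items else None


class JobService:
    """Hypothetical front end: accepts jobs, tracks status, feeds the scheduler."""

    def __init__(self, job_queue):
        self.queue = job_queue
        self.db = {}                       # stand-in for the database
        self._ids = itertools.count(1)

    def submit(self, image, command):
        """Client -> service: store job information, enqueue job data, return job id."""
        job_id = next(self._ids)
        self.db[job_id] = "PENDING"
        self.queue.send({"id": job_id, "image": image, "command": command})
        return job_id

    def manager_poll(self, launch):
        """Manager: poll the queue and hand one job to `launch` (the scheduler)."""
        message = self.queue.receive()
        if message is None:
            return False
        self.db[message["id"]] = "RUNNING"
        launch(message, self.on_status)
        return True

    def on_status(self, job_id, state):
        """Status update reported back from the executor."""
        self.db[job_id] = state

    def status(self, job_id):
        return self.db[job_id]
```

Because the client only ever sees `submit` and `status`, a developer can run the whole flow locally with `JobService(InMemoryQueue())` without touching a shared queue.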
REMEMBER CASCADE'S PROBLEMS?
• Lack of isolation → Isolation! (Mesos: cgroups!)
• Fragile deployment flow → Easy deployment flow (private Docker repo)
• Poor development story → Consistent development story (Iguazú: proper abstractions)
• Tied exclusively to one language → Any language (no restrictions)
OTHER BENEFITS
• Ease of transition. Mesos: job management. Docker: job packaging
• Performance. Mesos: long-running! PHP: blaaargh
• Flexibility. Use Docker, run Scala code, etc.
• Fine-grained control over scheduling. Autoscaling!
• Designed to work on a heterogeneous pool of resources. Security :)
THANKS! QUESTIONS?
We are hiring! See http://coursera.org/jobs
@firejade0
clee@coursera.org
