Optimizing Costs with AWS
Source: http://ir.netflix.com
•   Rationale and High-level Methodology
•   AWS resource-specific optimizations
•   Performance Testing
•   Results
•   Q&A
AWS Re:Invent -  Optimizing Costs with AWS
Rationale
• Applications operate at massive scale
  • Across three regions and multiple zones per region


• Service-oriented architecture
  • Many moving parts (teams)


• Unconstrained deployment capabilities
  • “Freedom and Responsibility” culture
Rationale, cont.
• Improve availability
  • Avoid saturation of key resources
    • Dynamically adjust capacity to meet workload demands


• Plan for increasing workloads
  • Less focus on reducing current demand


• Maximize efficiency
  • Balance OLTP and batch demands

• “That which is measured improves”

• Asgard framework enables turnkey deployment (open-sourced by Netflix)
  • All engineers have full access
  • Real-time reservation capacity
  • Unconstrained ASG size limits
•   Bird's-eye view of usage
•   Near real-time data
•   Plans to open-source the tool
•   Decomposes usage by application
[Chart: Unused Reservation Instance Hours*, by day (Mon through Sat); y-axis 0 to 2,000; annotation: "Need to use this capacity"]
* fictitious volumes
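
The slide does not say how unused reservation hours are computed. A minimal sketch, assuming the metric is the hourly gap between the number of reserved instances held and the number actually running, summed per day. The function name, the input shape, and the sample numbers are all hypothetical.

```python
from collections import defaultdict

def unused_reserved_hours(reserved, hourly_usage):
    """Sum, per day, the reserved instance-hours that went unused.

    reserved: number of reserved instances held (assumed constant here).
    hourly_usage: iterable of (day, running_instances) pairs, one per hour.
    """
    unused = defaultdict(int)
    for day, running in hourly_usage:
        # Usage above the reservation is on-demand; it never makes the gap negative.
        unused[day] += max(0, reserved - running)
    return dict(unused)

# Fictitious sample: 100 reserved instances, three hourly samples per day.
sample = [("Mon", 80), ("Mon", 100), ("Mon", 120),
          ("Tue", 40), ("Tue", 60), ("Tue", 100)]
result = unused_reserved_hours(100, sample)
print(result)
```

Days with a large total are the ones where batch or other deferrable work could be scheduled to use capacity that is already paid for.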
[Charts: three ASG scaling patterns labeled "Healthy", "Thrashing", and "Double-Jump"; y-axis = number of instances in ASG]
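
The slide names the patterns but does not define them. One way to flag the thrashing pattern, sketched here under the assumption that thrashing means the instance count reverses direction many times within a window. The reversal threshold is a hypothetical tuning value, and the sample series are invented.

```python
def count_reversals(instance_counts):
    """Count how often the series switches between scaling up and scaling down."""
    reversals = 0
    last_direction = 0  # +1 scaling up, -1 scaling down, 0 not yet known
    for prev, curr in zip(instance_counts, instance_counts[1:]):
        if curr == prev:
            continue  # flat stretches do not change direction
        direction = 1 if curr > prev else -1
        if last_direction and direction != last_direction:
            reversals += 1
        last_direction = direction
    return reversals

def looks_like_thrashing(instance_counts, max_reversals=4):
    """Hypothetical rule: more reversals than the threshold suggests thrashing."""
    return count_reversals(instance_counts) > max_reversals

# Invented samples of ASG size over time.
healthy = [10, 10, 12, 14, 16, 16, 14, 12, 10, 10]
thrashing = [10, 12, 10, 12, 10, 12, 10, 12, 10, 12]
print(count_reversals(healthy), count_reversals(thrashing))
```

A healthy daily curve rises once and falls once, so it shows a single reversal, while a group that repeatedly adds and removes instances shows many.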
[Chart: Requests/day over time; annotations: "Adopted batch delete", "Started batch send adoption", "Batch capabilities adoption complete"]
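
The arithmetic behind the drop in requests per day can be sketched as follows, assuming a service billed per request whose batch calls accept a fixed maximum number of entries (Amazon SQS batch calls, for example, accept up to 10 messages per request). The helper names and the message volume are hypothetical.

```python
import math

def requests_needed(messages, batch_size=1):
    """Number of API requests required to send or delete `messages` items."""
    return math.ceil(messages / batch_size)

def chunk(items, batch_size):
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Invented volume: one million messages per day.
unbatched = requests_needed(1_000_000)
batched = requests_needed(1_000_000, batch_size=10)
batch_sizes = [len(batch) for batch in chunk(list(range(25)), 10)]
print(unbatched, batched, batch_sizes)
```

With full batches the request count falls by the batch size, which is why adoption of batch send and batch delete shows up as step changes in the chart.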
Legend: Github / Techblog; Apache Contributions; Techblog Post Only; Coming Soon
• Priam: Cassandra as a Service
• Astyanax: Cassandra client for Java
• CassJMeter: Cassandra test suite
• Cassandra Multi-region EC2 datastore support
• Aegisthus: Hadoop ETL for Cassandra
• Explorers
• Governator - Library lifecycle and dependency injection
• Odin: Workflow orchestration
• Blitz4j - Async logging
• Exhibitor: Zookeeper as a Service
• Curator: Zookeeper Patterns
• EVCache: Memcached as a Service
• Eureka / Discovery: Service Directory
• Archaius: Dynamic Properties Service
• Edda: Queryable config history
• Server-side latency/error injection
• REST Client + mid-tier LB
• Configuration REST endpoints
• Servo and Autoscaling Scripts
• Honu: Log4j streaming to Hadoop
• Circuit Breaker - Hystrix: Robust service pattern
• Asgard - AutoScaleGroup based AWS console
• Chaos Monkey: Robustness verification
• Latency Monkey
• Janitor Monkey
• Bakeries and AMI
• Build dynaslaves
Netflix at 2012 re:Invent
Date/Time         Presenter             Topic
Wed 8:30-10:00    Reed Hastings         Keynote with Andy Jassy
Wed 1:00-1:45     Coburn Watson         Optimizing Costs with AWS
Wed 2:05-2:55     Kevin McEntee         Netflix’s Transcoding Transformation
Wed 3:25-4:15     Neil Hunt / Yury I.   Netflix: Embracing the Cloud
Wed 4:30-5:20     Adrian Cockcroft      High Availability Architecture at Netflix
Thu 10:30-11:20   Jeremy Edberg         Rainmakers – Operating Clouds
Thu 11:35-12:25   Kurt Brown            Data Science with Elastic Map Reduce (EMR)
Thu 11:35-12:25   Jason Chan            Security Panel: Learn from CISOs working with AWS
Thu 3:00-3:50     Adrian Cockcroft      Compute & Networking Masters Customer Panel
Thu 3:00-3:50     Ruslan M./Gregg U.    Optimizing Your Cassandra Database on AWS
Thu 4:05-4:55     Ariel Tseitlin        Intro to Chaos Monkey and the Simian Army
We are sincerely eager to hear your FEEDBACK on this presentation and on re:Invent.
Please fill out an evaluation form when you have a chance.
Contact: cwatson@netflix.com
