The Evolution of Testing Methodology
at AWS: From Status Quo To Formal
Methods With TLA+
Tim Rath
Principal Engineer
AWS Database Services
Amazon.com
Watch the video with slide synchronization on InfoQ.com:
http://www.infoq.com/presentations/aws-testing-tla
Presented at QCon San Francisco
www.qconsf.com
AWS Landscape
• Services comprise large fleets of servers and are decomposed into smaller services
• Many of these services experience sustained exponential growth
• S3 grew exponentially for 6 years to reach 1 trillion stored objects; less than a year later it reached 2 trillion objects [1]
• DynamoDB processes millions of transactions per second in a single AWS region around the clock [2]
• Systems and data are managed through subtle concurrent and distributed algorithms
“Must Haves” of Every Service
• Security
• Durability
• Scalability
• Availability
General Test Strategy
• Developers
– Unit Tests
– Integration Tests
• QA (release testing)
– Functional Tests
– Performance Tests
– Stress Tests
– Failure Tests
Test Adequacy Criteria
• The literature expresses test adequacy as a form of measurable code coverage [3]
• Status quo criterion:
– Statement coverage
• The most common adequacy criterion in use today
• Tools to measure and report it are readily available
• An extremely weak criterion
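To illustrate why statement coverage is a weak criterion, here is a minimal Python sketch. The function safe_ratio and both helper functions are hypothetical and do not come from the talk. A single test executes every statement, yet an untested path through the same statements still fails.

```python
def safe_ratio(numerator, denominator, clamp):
    """Divide, optionally clamping the denominator to at least 1 (hypothetical example)."""
    if clamp:
        denominator = max(denominator, 1)
    return numerator / denominator


def statement_coverage_suite():
    # This single call executes every statement in safe_ratio,
    # so a coverage tool would report 100% statement coverage.
    return safe_ratio(6, 0, clamp=True) == 6.0


def uncovered_path_fails():
    # Same statements, different path: skipping the clamp divides by zero.
    try:
        safe_ratio(6, 0, clamp=False)
    except ZeroDivisionError:
        return True
    return False
```

Both helpers return True: the suite passes with full statement coverage while the defect on the other path goes unnoticed.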
• Perfect criterion:
– Cover every execution path across every possible state of the system
• The number of states may be infinite
• The test space grows exponentially with path length
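A rough sense of that growth can be had from a small Python calculation. This is only an illustrative sketch: the two counting models (a fixed number of choices per step, and interleavings of independent sequential processes) are assumptions chosen for the example, not figures from the talk.

```python
from math import factorial


def count_paths(branching, length):
    """Number of distinct paths when each of `length` steps offers `branching` choices."""
    return branching ** length


def count_interleavings(processes, steps):
    """Number of ways to interleave `processes` sequential processes of `steps` steps each."""
    total = processes * steps
    return factorial(total) // (factorial(steps) ** processes)
```

For instance, count_paths(2, 10) is 1024, and count_interleavings(2, 10) is 184756, so even two small concurrent processes already produce far more schedules than a brainstormed test list can cover.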
• Real-world practice further defines test adequacy criteria through an ad hoc process:
– Brainstorm test scenarios
– Brainstorm stress test workloads
– Brainstorm failure scenarios
Better testing of distributed algorithms
• We look for strategies that:
– Help to understand and protect algorithm invariants
– Help expand test coverage
– Allow thorough testing as early as possible in the
development process
Development starts with specification
Testing Starts With Development
• Write tests and test support structure while working on the implementation
– Follow the spirit of “Test Driven Development” without subscribing to the specific formula
• It is unclear how useful strict adherence to TDD concepts really is [4]
• The value of TDD lies in the emphasis it places on testing and test thoroughness [5]
Assert system invariants
• Asserts enforce the specification in code
• Strong assert statements come from clear
understanding of the specification
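As a minimal sketch of asserting invariants in code, consider the hypothetical class below. ReplicaLog, its fields, and its two invariants are assumptions made up for illustration; they are not taken from any AWS system. Each assert restates a rule that a specification might impose.

```python
class ReplicaLog:
    """Hypothetical replicated log used only to illustrate invariant assertions."""

    def __init__(self):
        self.entries = []
        self.commit_index = 0  # number of entries known to be committed

    def _check_invariants(self):
        # Assumed invariant: the commit index always points inside the log.
        assert 0 <= self.commit_index <= len(self.entries), "commit index out of range"

    def append(self, entry):
        self.entries.append(entry)
        self._check_invariants()

    def commit(self, index):
        # Assumed invariant: committed entries are never un-committed,
        # and entries that do not exist cannot be committed.
        assert self.commit_index <= index <= len(self.entries), "illegal commit index"
        self.commit_index = index
        self._check_invariants()
```

Note that Python skips assert statements when run with the -O flag, so a sketch like this only enforces the invariants in test and debug runs.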
Generative Testing
Formalization around randomized testing
with invariant or “property” checking
[Figure: generative testing loop. A test case generator produces input for test execution; the result is validated against the expected properties of the result. If no property is violated, the loop repeats; if one is violated, the failure case is reported.]
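The loop can be sketched in a few lines of Python using only the standard library. This is a hand-rolled illustration, not QuickCheck or any particular framework; the generator, the two properties (output is ordered, output is a permutation of the input), and buggy_sort are all assumptions chosen for the example.

```python
import random
from collections import Counter


def generate_case(rng, max_len=20):
    """Test case generator: a random list of small integers (duplicates are likely)."""
    return [rng.randint(-5, 5) for _ in range(rng.randint(0, max_len))]


def properties_hold(case, result):
    """Expected properties of a sort result: ordered, and same items as the input."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    same_items = Counter(case) == Counter(result)
    return ordered and same_items


def run_generative_test(func, trials=200, seed=0):
    """Generate, execute, validate; return the first failing case, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        case = generate_case(rng)
        result = func(list(case))
        if not properties_hold(case, result):
            return case  # report the failure case
    return None


def buggy_sort(items):
    """Deliberately wrong: silently drops duplicate values."""
    return sorted(set(items))
```

Calling run_generative_test(sorted) returns None, while run_generative_test(buggy_sort) returns an input list containing duplicates. The narrow value range in generate_case is a deliberate choice about the input domain: it makes the duplicate-handling path much more likely to be exercised.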
Anecdotes From QuickCheck Paper [6]
• Writing properties made users think harder about them and helped document the specification
• Users need to think about the input domain in order to exercise less probable paths
In Process Clusters
1) Run multiple nodes in the same process as the unit test
2) Insert arbitrary code into the communication channel
3) Pipeline the inserted pieces of code
• Helps write better integration tests
– Allows easy construction of intricate test scenarios
– Integration testing of distributed components in a
unit test environment
• Helps write better stress tests
– Direct control at the communication layer
– Much faster, lower-overhead test cycles
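One way to picture an in-process cluster is the Python sketch below. Every name in it (Channel, Node, drop_from, duplicate) is hypothetical and invented for illustration; the talk does not show the actual AWS test harness. Nodes live in one process, messages pass through an in-memory channel, and the test inserts a pipeline of interceptors into that channel.

```python
class Channel:
    """In-memory communication channel with a pipeline of interceptors."""

    def __init__(self, interceptors=()):
        self.interceptors = list(interceptors)
        self.nodes = {}

    def register(self, node):
        self.nodes[node.name] = node

    def send(self, sender, receiver, message):
        messages = [message]
        for intercept in self.interceptors:
            # Each interceptor maps one message to zero or more messages,
            # so interceptors can drop, duplicate, or rewrite traffic.
            messages = [out for m in messages for out in intercept(sender, receiver, m)]
        for m in messages:
            self.nodes[receiver].receive(sender, m)


class Node:
    """A cluster node that simply records what it receives."""

    def __init__(self, name, channel):
        self.name = name
        self.channel = channel
        self.received = []
        channel.register(self)

    def send(self, receiver, message):
        self.channel.send(self.name, receiver, message)

    def receive(self, sender, message):
        self.received.append((sender, message))


def drop_from(blocked_sender):
    """Interceptor factory: simulate a partition by dropping one sender's messages."""
    def intercept(sender, receiver, message):
        return [] if sender == blocked_sender else [message]
    return intercept


def duplicate(sender, receiver, message):
    """Interceptor: deliver every message twice."""
    return [message, message]
```

A test might build Channel([drop_from("b"), duplicate]), create nodes "a", "b", and "c", and then check that "c" sees each message from "a" twice and nothing from "b". Because everything runs in one process, such failure scenarios are deterministic and fast to set up.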
Informal Proofs
• Requires deep thinking which promotes even
better understanding of the algorithms
• Hard to get right – can still lead to a false
sense of security
Formal Methods
• Precise specification of algorithms
• Tools to validate correctness
• We surveyed some of the systems and languages
for writing formal specifications, and ended up
finding what we were looking for in TLA+ [7]
• TLA+ tools let us explore all possible executions over all possible system states of an algorithm design
TLA+
[Figure: repeated read and write operations]
• How the model works:
– Init == <set of initial system states>
– Next == <set of possible next actions>
• That’s it!
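The Init/Next idea can be imitated in plain Python to show what exhaustive exploration means. The sketch below is not TLA+ and not how the TLA+ tools are implemented; it is a small breadth-first search over a hypothetical model in which two processes each read a shared counter and then write back the value plus one. The model, the invariant, and all names are assumptions made for illustration.

```python
from collections import deque

N = 2  # number of hypothetical processes


def init_states():
    """Init: (shared counter, per-process program counter, per-process local copy)."""
    return {(0, ("read",) * N, (0,) * N)}


def next_states(state):
    """Next: every state reachable by letting one process take one step."""
    counter, pcs, local = state
    for i in range(N):
        if pcs[i] == "read":
            yield (counter,
                   pcs[:i] + ("write",) + pcs[i + 1:],
                   local[:i] + (counter,) + local[i + 1:])
        elif pcs[i] == "write":
            yield (local[i] + 1,
                   pcs[:i] + ("done",) + pcs[i + 1:],
                   local)


def invariant(state):
    """Once every process is done, the counter must equal the number of processes."""
    counter, pcs, _ = state
    return not all(pc == "done" for pc in pcs) or counter == N


def check(init, next_fn, inv):
    """Breadth-first search of all reachable states; return a shortest violating trace or None."""
    parent = {s: None for s in init}
    queue = deque(init)
    while queue:
        state = queue.popleft()
        if not inv(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for succ in next_fn(state):
            if succ not in parent:
                parent[succ] = state
                queue.append(succ)
    return None
```

Calling check(init_states(), next_states, invariant) returns a five-state trace ending with the counter at 1: both processes read 0 before either writes, so one update is lost. A brainstormed test could easily miss this interleaving, whereas exhaustive exploration finds it mechanically.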
Real World Examples
• DynamoDB
– Replication protocols
– Membership handling
– Quorum Configuration Changes
• Other AWS projects [8]
– Low level distributed network protocol
– Internal distributed lock manager
– S3, EC2, EBS system management algorithms
Thank You!
we’re hiring
rath@amazon.com
[1] Barr, J. Amazon S3 - The First Trillion Objects. Amazon Web Services Blog. June 2012;
https://aws.amazon.com/blogs/aws/amazon-s3-two-trillion-objects-11-million-requests-second/
[2] Hamilton, J. Challenges in Designing at Scale: Formal Methods in Building Robust
Distributed Systems. Perspectives Blog. July 2014;
http://perspectives.mvdirona.com/2014/07/03/ChallengesInDesigningAtScaleFormalMethodsInBuildingRobustDistributedSystems.aspx
[3] Zhu, H., et al. Software Unit Test Coverage and Adequacy. ACM Computing Surveys,
Vol. 29, No. 4, December 1997;
http://www.cs.toronto.edu/~chechik/courses07/csc410/p366-zhu.pdf
[4] Bulajic, A., Sambasivam, S., Stojic, R. Overview of the Test Driven Development
Research Projects and Experiments. Proceedings of Informing Science & IT Education
Conference (InSITE), 2012;
http://proceedings.informingscience.org/InSITE2012/InSITE12p165-187Bulajic0052.pdf
[5] Dalke, A. Problems with TDD. Dalke Scientific (dalkescientific.com), December 2009;
http://www.dalkescientific.com/writings/diary/archive/2009/12/29/problems_with_tdd.html
[6] Claessen, K., Hughes, J. QuickCheck: A Lightweight Tool for Random Testing of
Haskell Programs. ACM SIGPLAN Notices, Vol. 35, No. 9, September 2000;
http://www.eecs.northwestern.edu/~robby/courses/395-495-2009-fall/quick.pdf
[7] Newcombe, C., et al. Use of Formal Methods at Amazon Web Services. (pending ACM
publication), November 2013;
http://research.microsoft.com/en-us/um/people/lamport/tla/amazon.html
[8] Newcombe, C. Why Amazon Chose TLA+. Lecture Notes in Computer Science, Vol. 8477,
June 2014; http://link.springer.com/chapter/10.1007%2F978-3-662-43652-3_3