A single dramatic software failure can cost a company millions of dollars - but can be avoided with simple changes to design and architecture. This new edition of the best-selling industry standard shows you how to create systems that run longer, with fewer failures, and recover better when bad things happen. New coverage includes DevOps, microservices, and cloud-native architecture. Stability antipatterns have grown to include systemic problems in large-scale systems. This is a must-have pragmatic guide to engineering for production systems.
If you're a software developer, and you don't want to get alerts every night for the rest of your life, help is here. With a combination of case studies about huge losses - lost revenue, lost reputation, lost time, lost opportunity - and practical, down-to-earth advice that was all gained through painful experience, this book helps you avoid the pitfalls that cost companies millions of dollars in downtime and reputation. Eighty percent of project life-cycle cost is in production, yet few books address this topic.
This updated edition deals with today's production systems - larger, more complex, and heavily virtualized - and includes information on chaos engineering, the discipline of applying randomness and deliberate stress to reveal systemic problems. Build systems that survive the real world, avoid downtime, implement zero-downtime upgrades and continuous delivery, and make cloud-native applications resilient. Examine ways to architect, design, and build software - particularly distributed systems - that stands up to the typhoon winds of a flash mob, a Slashdotting, or a link on Reddit. Take a hard look at software that failed the test and find ways to make sure your software survives.
To skip the pain and get the experience...get this book.
Table of Contents
Chapter 1. Living In Production
Part I—Create Stability
Chapter 2. Case Study: The Exception That Grounded An Airline
Chapter 3. Stabilize Your System
Chapter 4. Stability Antipatterns
Chapter 5. Stability Patterns
Part II—Design for Production
Chapter 6. Case Study: Phenomenal Cosmic Powers, Itty-Bitty Living Space
Chapter 7. Foundations
Chapter 8. Processes On Machines
Chapter 9. Interconnect
Chapter 10. Control Plane
Chapter 11. Security
Part III—Deliver Your System
Chapter 12. Case Study: Waiting For Godot
Chapter 13. Design For Deployment
Chapter 14. Handling Versions
Part IV—Solve Systemic Problems
Chapter 15. Case Study: Trampled By Your Own Customers
Chapter 16. Adaptation
Chapter 17. Chaos Engineering