A single dramatic software failure can cost a company millions of dollars, but can be avoided with simple changes to design and architecture. This new edition of the best-selling industry standard shows you how to create systems that run longer, with fewer failures, and recover better when bad things happen. New coverage includes DevOps, microservices, and cloud-native architecture. Stability antipatterns have grown to include systemic problems in large-scale systems. This is a must-have pragmatic guide to engineering for production systems.

If you're a software developer, and you don't want to get alerts every night for the rest of your life, help is here. With a combination of case studies about huge losses (lost revenue, lost reputation, lost time, lost opportunity) and practical, down-to-earth advice that was all gained through painful experience, this book helps you avoid the pitfalls that cost companies millions of dollars in downtime and reputation. Eighty percent of project life-cycle cost is in production, yet few books address this topic.

This updated edition deals with the production of today's systems (larger, more complex, and heavily virtualized) and is the first book to cover chaos engineering, the discipline of applying randomness and deliberate stress to reveal systematic problems. Build systems that survive the real world, avoid downtime, implement zero-downtime upgrades and continuous delivery, and make cloud-native applications resilient. Examine ways to architect, design, and build software (particularly distributed systems) that stands up to the typhoon winds of a flash mob, a Slashdotting, or a link on Reddit. Take a hard look at software that failed the test and find ways to make sure your software survives.