CppCon 2019 has ended
Wednesday, September 18 • 08:00 - 08:45
Site Reliability Engineering: Balancing Risk and Velocity

As a venerable, powerful language, C++ is used in a variety of mission-critical, performance-sensitive, complex applications and services. Whether in embedded systems, video games, network programming, or a host of other areas, there is always an inherent tug-of-war between the velocity of feature development and the risk to the reliability of a production service or application. Over the years, many strategies have arisen to address this conflict, from complex change management processes to new models of work (e.g. the “DevOps” movement).

Site Reliability Engineering (SRE) seeks to implement some of the principles of the “DevOps” mindset through concrete practices and cultural norms. Born out of decades of running massively scaled systems at companies like Google, SRE applies hard-won lessons from the trenches of applications with more than a billion users. This talk will introduce attendees to some of the basic concepts of SRE and frame how it influences the development process for a service. We’ll talk about service level objectives, error budgets, and risk analysis, and how teams can use these tools to communicate better and drive innovation while maintaining a minimum acceptable level of reliability for their users.
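
As a rough illustration of the error budget idea mentioned above (not taken from the talk itself), the sketch below assumes a request-based service level objective expressed in parts per million, so a 99.9% target is 999,000 ppm. The function names and the ppm representation are hypothetical choices made for this example. The error budget is simply the share of requests that may fail while still meeting the objective.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; names are not from the talk.
// SLO targets are given in parts per million (ppm) to keep the math in integers.
constexpr std::int64_t kPpm = 1'000'000;

// Failed requests permitted in a window of total_requests under the SLO.
constexpr std::int64_t allowed_failures(std::int64_t slo_ppm,
                                        std::int64_t total_requests) {
    return total_requests * (kPpm - slo_ppm) / kPpm;
}

// Error budget still unspent after failed_requests failures (may be negative).
constexpr std::int64_t remaining_budget(std::int64_t slo_ppm,
                                        std::int64_t total_requests,
                                        std::int64_t failed_requests) {
    return allowed_failures(slo_ppm, total_requests) - failed_requests;
}

// True once the window's budget is fully spent.
constexpr bool budget_exhausted(std::int64_t slo_ppm,
                                std::int64_t total_requests,
                                std::int64_t failed_requests) {
    return remaining_budget(slo_ppm, total_requests, failed_requests) <= 0;
}

// A 99.9% SLO over one million requests leaves room for 1,000 failures.
static_assert(allowed_failures(999'000, 1'000'000) == 1'000,
              "99.9% of 1,000,000 requests allows 1,000 failures");
```

Under these assumptions, a team that has seen 400 failures in the window still has 600 left to spend on risky launches. A common SRE practice is to slow or pause feature releases once the budget is exhausted.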

SRE concepts are useful not only to C++ developers, but also to other developers, operations teams, product organizations, and anyone who influences how an application or service runs in production. After attending this talk, conference-goers will have a basic grasp of SRE fundamentals and be ready to take additional steps such as reading SRE books, taking SRE training, or even establishing aspirational service level objectives in their own organizations.

Speakers
Derek Remund

Strategic Cloud Engineer, Google

Wednesday September 18, 2019 08:00 - 08:45 MDT
Summit 8/9