Agentic Reliability Engineering

As modern systems grow in scale, speed, and complexity, traditional reliability practices are reaching their limits. Site Reliability Engineering (SRE) excels at automating repeatable and predictable operational tasks, enabling engineers to respond faster and operate at scale. But as systems become more dynamic and interconnected, reliability increasingly depends on decisions made in real time, under uncertainty, and across competing priorities.

Agentic Reliability Engineering represents the next evolution of reliability. Instead of encoding every operational decision into runbooks and automation, engineers define intent, constraints, and principles, allowing systems to observe context, reason about trade-offs, and act autonomously within clear guardrails. Reliability shifts from human-driven reaction to system-driven decision-making, while remaining governable and accountable.

Written for experienced SREs, platform engineers, and engineering leaders, this book presents a practical framework for designing systems that can learn, adapt, and operate safely at machine speed.

By the end of this book, you'll be able to:

  • Understand how reliability evolves from automation to autonomy
  • Design intent-driven agentic reliability boundaries
  • Implement agent-driven incident response and learning loops
  • Build observability and decision feedback that enable trust-based autonomy
  • Lead technical and cultural change toward scalable, trust-based autonomy

Subtitle: Building Agentic Systems That Think, Adapt, and Recover
Author: David Jambor
ISBN: 9798341673137
Weight: 310 grams
Publication date: 30 November 2026
Number of pages: 400