DevOps Classroom Series – 20/02/2020

Alerts

  • Avoid Noisy Alerts
  • Alert on SLIs, SLOs and SLAs
  • To make alerts meaningful, make your monitoring system observable
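
As a rough illustration of alerting on an SLO instead of on raw noise, the sketch below computes an availability SLI and pages only when the error budget is being burned quickly. The function names, the 99.9% target and the burn threshold of 10 are assumptions made for this example, not values from the class.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: fraction of requests that succeeded in the window."""
    if total_requests == 0:
        return 1.0  # assumption: no traffic counts as meeting the objective
    return good_requests / total_requests


def burn_rate(sli: float, slo_target: float) -> float:
    """Error budget consumption speed; 1.0 means exactly on budget."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    error_budget = 1.0 - slo_target
    return (1.0 - sli) / error_budget


def should_alert(good_requests: int, total_requests: int,
                 slo_target: float = 0.999,
                 burn_threshold: float = 10.0) -> bool:
    """Alert only when the budget burns much faster than the SLO allows."""
    sli = availability_sli(good_requests, total_requests)
    return burn_rate(sli, slo_target) >= burn_threshold


if __name__ == "__main__":
    # Hypothetical window: 10,000 requests, 200 of them failed.
    print(should_alert(good_requests=9800, total_requests=10000))
```

Tying the alert to budget burn rather than to a single failed request is one way to keep alerts from becoming noisy: a short blip that stays within budget does not page anyone.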

Observability

  • Three Pillars of Observability
    • Structured Logs
    • Metrics
    • Traces
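
To show what a structured log can look like in practice, here is a minimal sketch using only Python's standard logging and json modules. The JsonFormatter class, the build_logger helper and the "context" key are hypothetical names chosen for this example.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={"context": {...}} become top-level keys.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)


def build_logger(name: str, stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


if __name__ == "__main__":
    log = build_logger("checkout")
    # Hypothetical event; trace_id lets the log line be joined with a trace.
    log.info("payment failed",
             extra={"context": {"trace_id": "abc123", "latency_ms": 412}})
```

Because every line is a JSON object with named fields, the logs can be filtered and aggregated by field, and a shared identifier such as trace_id connects them to the other two pillars.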

Incidents

  • Create a role called Incident Commander

  • The Incident Commander appoints an Operations Lead

  • The Incident Commander appoints a Communications Lead

  • After the incident is resolved, the Incident Commander is responsible for conducting a retrospective and sharing a post-mortem report.

Some Important Metrics

  • MTTF (Mean Time To Failure)

  • MTBF (Mean Time Between Failures)

  • MTTR (Mean Time To Resolve)

  • Refer Here for Slideshow of SRE
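
The three metrics listed above can be computed from incident timestamps. The sketch below assumes one common set of definitions (exact definitions vary between teams): MTTR is the average time from failure to resolution, MTTF is the average uptime between a resolution and the next failure, and MTBF is the average time between the starts of consecutive failures. The incident data is made up for the example.

```python
from datetime import datetime, timedelta


def mean(durations):
    """Average of a non-empty list of timedelta values."""
    return sum(durations, timedelta()) / len(durations)


def reliability_metrics(incidents):
    """incidents: list of (failed_at, resolved_at) tuples sorted by failed_at."""
    if len(incidents) < 2:
        raise ValueError("need at least two incidents to measure time between failures")
    pairs = list(zip(incidents, incidents[1:]))
    # Time spent resolving each incident.
    repair = [resolved - failed for failed, resolved in incidents]
    # Healthy time from one resolution to the next failure.
    uptime = [nxt_failed - resolved for (_, resolved), (nxt_failed, _) in pairs]
    # Time from the start of one failure to the start of the next.
    between = [nxt_failed - failed for (failed, _), (nxt_failed, _) in pairs]
    return {"MTTR": mean(repair), "MTTF": mean(uptime), "MTBF": mean(between)}


# Hypothetical incident history.
incidents = [
    (datetime(2020, 2, 1, 10, 0), datetime(2020, 2, 1, 11, 0)),
    (datetime(2020, 2, 3, 10, 0), datetime(2020, 2, 3, 10, 30)),
    (datetime(2020, 2, 6, 10, 0), datetime(2020, 2, 6, 12, 0)),
]

if __name__ == "__main__":
    for name, value in reliability_metrics(incidents).items():
        print(name, value)
```

For this sample data the result is an MTTR of 1 hour 10 minutes, an MTTF of 59 hours 15 minutes and an MTBF of 60 hours.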

