DevOps Classroom Series – SRE – 24/Oct/2019

Monitoring and Alerting

  • Alert on SLIs or SLOs
  • Turn off all the other alerts
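
As a rough illustration of alerting on an SLI against an SLO, the sketch below computes an availability SLI from request counts and fires only when it drops below the target. The `RequestWindow` shape, the function names, and the 99.9% target are assumptions made for this example; they are not from the class notes.

```python
from dataclasses import dataclass


@dataclass
class RequestWindow:
    """Request counts for one evaluation window (hypothetical shape)."""
    total: int
    successful: int


def availability_sli(window: RequestWindow) -> float:
    """Fraction of successful requests; an empty window counts as fully available."""
    if window.total == 0:
        return 1.0
    return window.successful / window.total


def should_alert(window: RequestWindow, slo_target: float) -> bool:
    """Alert only when the measured SLI falls below the SLO target."""
    return availability_sli(window) < slo_target


# Example values are invented for illustration.
healthy = RequestWindow(total=10_000, successful=9_995)
degraded = RequestWindow(total=10_000, successful=9_900)

print(should_alert(healthy, slo_target=0.999))   # False
print(should_alert(degraded, slo_target=0.999))  # True
```

Because the only alert condition is the user-facing SLI, other signals such as CPU or disk usage stay available for diagnosis without paging anyone.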

Observability

  • Three different things to observe
    • Logs
    • Metrics
    • Traces
  • The monitoring system surfaces high-level failures and lets you navigate from them to
    • Logs
    • Metrics
    • Traces
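
One way to picture navigating from a high-level failure to the three signals is to join them on a shared identifier. The sketch below is a minimal in-memory model, assuming every log line and span carries a `trace_id` and that a metric point keeps one sample trace ID (`exemplar_trace_id`). All of these names are hypothetical and not tied to any particular monitoring product.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LogLine:
    trace_id: str
    message: str


@dataclass
class Span:
    trace_id: str
    operation: str
    duration_ms: float


@dataclass
class MetricPoint:
    name: str
    value: float
    exemplar_trace_id: str  # assumption: one sample trace attached to the metric


def drill_down(metric: MetricPoint, logs: List[LogLine], spans: List[Span]) -> Dict[str, list]:
    """From a failing metric, collect the logs and spans that share its trace ID."""
    tid = metric.exemplar_trace_id
    return {
        "logs": [line for line in logs if line.trace_id == tid],
        "traces": [span for span in spans if span.trace_id == tid],
    }


# Invented sample data for illustration.
logs = [LogLine("t1", "payment timeout"), LogLine("t2", "ok")]
spans = [Span("t1", "charge_card", 5200.0), Span("t2", "charge_card", 80.0)]
failing = MetricPoint(name="checkout_error_rate", value=0.07, exemplar_trace_id="t1")

related = drill_down(failing, logs, spans)
print([line.message for line in related["logs"]])    # ['payment timeout']
print([span.operation for span in related["traces"]])  # ['charge_card']
```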

Incident Management

  • SRE defines clear-cut ways of working for incidents.
  • The following roles are available to deal with an incident.
  • The Incident Commander role is allocated when the incident is recorded.
  • The Incident Commander has the following activities
    • Plan the work to resolve the incident, or delegate it to a Planning Lead (a new role created for the incident)
    • Carry out the operations to resolve the incident, or delegate them to an Operations Lead (a new role created for the incident)
    • Make the necessary communications, or delegate them to a Communications Lead (a new role created for the incident)
    • Once the incident is resolved, create a postmortem document
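
The role structure above can be sketched as a small data model: the Incident Commander owns planning, operations, and communications by default, and each one moves to a lead only when delegated. The class, method names, and postmortem title format are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict

# The three responsibilities the Incident Commander can delegate.
ROLES = ("planning", "operations", "communications")


@dataclass
class Incident:
    title: str
    commander: str
    leads: Dict[str, str] = field(default_factory=dict)
    resolved: bool = False

    def delegate(self, role: str, person: str) -> None:
        """Create a lead role for this incident and hand it to `person`."""
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        self.leads[role] = person

    def owner_of(self, role: str) -> str:
        """Return the lead if one was delegated, otherwise the commander."""
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        return self.leads.get(role, self.commander)

    def resolve(self) -> str:
        """Mark the incident resolved and return a postmortem document title."""
        self.resolved = True
        return f"Postmortem: {self.title}"


# Invented names for illustration.
incident = Incident(title="Checkout outage", commander="asha")
incident.delegate("communications", "ravi")

print(incident.owner_of("planning"))        # asha
print(incident.owner_of("communications"))  # ravi
print(incident.resolve())                   # Postmortem: Checkout outage
```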

