DevOps Classroom Series – 27/Oct/2019


System Monitoring

  • Monitor Basic System Details
    • Is the server up?
    • Is the HTTP page responding?
    • Is the database query responding?
    • What is the free disk space at that moment?
    • What is the CPU utilization at that moment?
    • What is the network load at that moment?
    • What is the disk I/O activity at that moment?
  • This can be entirely our responsibility (DevOps/SRE)
  • Tools:
    • Nagios
    • Zabbix
    • Icinga
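
The basic checks above can be sketched in a few lines of Python. This is only an illustration of what tools like Nagios, Zabbix, or Icinga automate, not how any of them is implemented. The function names `disk_free_percent` and `is_port_open` are made up for this example, and a TCP connect is used as a rough stand-in for "is the server up".

```python
import shutil
import socket


def disk_free_percent(path="/"):
    """Return the free disk space at `path` as a percentage of the total."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total * 100


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    Hypothetical helper: a successful connect only shows that something is
    listening, not that the service behind it is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_port_open("localhost", 80)` gives a crude "is the HTTP server reachable" signal, while `disk_free_percent("/")` answers the free disk space question at the moment it is called.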

Application Monitoring

  • Detailed monitoring of your application
    • How much memory is my application consuming?
    • What is the current number of concurrent users on my application?
    • What are my application's logs saying?
    • What are my application's traces telling me?
    • What are the failure patterns?
  • Needs collaboration with the Dev team to accomplish this monitoring
  • Tools:
    • Elastic Stack
    • Splunk
    • AppDynamics
    • Application Insights
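
As a rough illustration of "failure patterns", the sketch below groups similar error messages from application logs and counts them. The log format (`TIMESTAMP LEVEL message`) and the function name `failure_patterns` are assumptions made for this example. Real tools such as Elastic Stack or Splunk do this at scale with their own query languages.

```python
import re
from collections import Counter

# Assumed log line format: "<timestamp> <LEVEL> <message>"
LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<msg>.*)$")


def failure_patterns(lines):
    """Count ERROR messages, grouping those that differ only in numbers."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            # Replace digits with a placeholder so similar messages group together.
            pattern = re.sub(r"\d+", "<n>", match.group("msg"))
            counts[pattern] += 1
    # Most frequent patterns first.
    return counts.most_common()
```

Agreeing on a log format like the one assumed here is one reason this kind of monitoring needs collaboration with the Dev team.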


  • Metric: A measurement, in some unit, of the system or application, e.g. CPU utilization, network in/out
  • Charts: Metrics aggregated and plotted over time
  • Logs:
    • System Logs
    • Application Logs
  • Dashboard: A unified view of everything that matters
  • Alert: A concern about a system, raised to the responsible person(s)
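
To make the relationship between metrics, charts, and alerts concrete, here is a minimal sketch. It assumes a metric arrives as `(unix_timestamp, value)` samples, averages them into fixed time buckets (the points a chart would plot), and flags buckets above a threshold (what an alert would fire on). The names `bucket_averages` and `alerts` are hypothetical.

```python
from collections import defaultdict


def bucket_averages(samples, bucket_seconds=60):
    """Average (timestamp, value) samples into fixed-size time buckets.

    Returns a dict mapping each bucket's start timestamp to its mean value,
    ordered by time.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % bucket_seconds].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


def alerts(averages, threshold):
    """Return the start times of buckets whose average exceeds `threshold`."""
    return [start for start, avg in averages.items() if avg > threshold]
```

Alerting on the bucket average rather than on every raw sample is one simple way to avoid paging someone for a single short spike.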


  • Health of the System
  • Identify Failure Patterns
  • Do analytics to suggest better customer experience.

SRE Expectations

  • Reduce noise and alert on SLIs (service level indicators) or SLOs (service level objectives)
  • Make your Monitoring Observable
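
One common way to alert on an SLO instead of on raw noise is to track how much of the error budget is left. The sketch below assumes an availability SLI defined as good requests divided by total requests. The function names and the example numbers are illustrative only.

```python
def availability_sli(good_requests, total_requests):
    """Availability SLI: fraction of requests that were served successfully."""
    if total_requests == 0:
        return 1.0  # No traffic means nothing failed.
    return good_requests / total_requests


def error_budget_remaining(sli, slo):
    """Fraction of the error budget still unspent (negative if the SLO is breached).

    `slo` is the target, e.g. 0.995 for 99.5% availability.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    budget = 1.0 - slo   # Failure rate the SLO tolerates.
    spent = 1.0 - sli    # Failure rate actually observed.
    return (budget - spent) / budget
```

With an SLO of 99.5% and a measured SLI of 99.9%, about 80% of the budget remains, so a single failed request need not wake anyone up. An alert would fire only when the remaining budget drops quickly or approaches zero.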


  • Nagios has two versions
    • Core
    • Enterprise
  • Nagios Core Installation:
    • Download the Nagios source code
    • Build the Nagios source code
    • Configure Nagios
    • Nagios uses make for the steps above
  • Nagios
    • Plugins
    • Commands
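
A Nagios plugin is any executable that prints a status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). A command definition in the Nagios configuration then tells Nagios how to run it. The sketch below shows a hypothetical disk plugin in Python. The thresholds, the message wording, and the function names are assumptions for this example, not part of any plugin shipped with Nagios.

```python
import shutil

# Exit codes that Nagios expects from a plugin.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_free_percent(free_pct, warn=20.0, crit=10.0):
    """Map a free-space percentage to a (exit_code, status_line) pair.

    The text after '|' is performance data: label=value, then the warning
    and critical thresholds.
    """
    perfdata = f"free={free_pct:.1f}%;{warn};{crit}"
    if free_pct < crit:
        return CRITICAL, f"DISK CRITICAL - {free_pct:.1f}% free | {perfdata}"
    if free_pct < warn:
        return WARNING, f"DISK WARNING - {free_pct:.1f}% free | {perfdata}"
    return OK, f"DISK OK - {free_pct:.1f}% free | {perfdata}"


def main(path="/"):
    """Run the check, print the status line, and return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - cannot read {path}: {exc}")
        return UNKNOWN
    code, message = evaluate_free_percent(usage.free / usage.total * 100)
    print(message)
    return code

# When saved as a standalone plugin script, finish with:
#     import sys; sys.exit(main())
```

Keeping the threshold logic in `evaluate_free_percent` separate from the code that reads the disk makes the plugin easy to test without touching a real filesystem.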
