AWS Classroom Series – 17/Sept/2020

Metric Retention

  • Metrics exist only in the region in which they are created.
  • Metrics cannot be deleted, but they expire after 15 months if no new data is published.
  • CloudWatch retains metric data as follows:
    1. Data points with a period less than 60 seconds are available for 3 hours.
    2. Data points with a period of 60 seconds are available for 15 days.
    3. Data points with a period of 300 seconds are available for 63 days.
    4. Data points with a period of 3600 seconds are available for 455 days (15 months).
  • Metrics that have not had any new data points in the past two weeks are not shown in the console. You can still find them by searching in the All metrics tab or by using the AWS CLI.
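
The retention tiers above can be read as a lookup: given the age of a data point, what is the finest period still available? The sketch below is only an illustration of that reading. The function name `finest_period_for_age` is hypothetical, "less than 60 seconds" is represented as 1 second, and treating each boundary as inclusive is an assumption.

```python
from datetime import timedelta
from typing import Optional

# (maximum age, finest period in seconds still retained), taken from the list above.
RETENTION_TIERS = [
    (timedelta(hours=3), 1),      # sub-minute data points (assumed here to be 1 second)
    (timedelta(days=15), 60),
    (timedelta(days=63), 300),
    (timedelta(days=455), 3600),
]


def finest_period_for_age(age: timedelta) -> Optional[int]:
    """Return the smallest period (seconds) still retained for data of this age.

    Returns None once the data is older than the longest tier (expired).
    """
    for max_age, period in RETENTION_TIERS:
        if age <= max_age:  # boundary treated as inclusive (assumption)
            return period
    return None


print(finest_period_for_age(timedelta(days=30)))   # 300
print(finest_period_for_age(timedelta(days=500)))  # None
```

For example, 30-day-old data is past the 15-day tier, so the finest period it can still be queried at is 300 seconds.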

Dimension

  • A dimension is a name/value pair that is part of a metric's identity. We can assign up to 10 dimensions to a metric.
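
As a rough illustration, dimensions are usually passed to the CloudWatch API as a list of `Name`/`Value` entries. The sketch below builds such a list from a plain dictionary. The helper `to_dimensions` and the instance ID are hypothetical, and the limit of 10 is simply the figure quoted above, so check it against current CloudWatch quotas.

```python
from typing import Dict, List

MAX_DIMENSIONS = 10  # limit quoted in this post; verify against current CloudWatch quotas


def to_dimensions(pairs: Dict[str, str]) -> List[Dict[str, str]]:
    """Convert {"name": "value"} pairs into the Name/Value list shape used by CloudWatch."""
    if len(pairs) > MAX_DIMENSIONS:
        raise ValueError(f"at most {MAX_DIMENSIONS} dimensions per metric, got {len(pairs)}")
    return [{"Name": name, "Value": value} for name, value in pairs.items()]


# "i-0123456789abcdef0" is a placeholder instance ID, not a real resource.
dims = to_dimensions({"InstanceId": "i-0123456789abcdef0"})
print(dims)  # [{'Name': 'InstanceId', 'Value': 'i-0123456789abcdef0'}]
```

Because dimensions are part of the metric's identity, the same metric name with a different dimension set is a different metric.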

Metrics & Alarms

  • Let's create an EC2 instance running Ubuntu.
  • Let's plot its metrics.
  • Now let's select a simple metric, CPU Utilization.
  • Now let's log in to the EC2 instance and create an artificial CPU load:
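sudo apt-get update
# watch CPU usage live (leave this running)
htop
# log in from another console, then install and run stress
sudo apt-get install stress -y
# 8 CPU workers, 4 I/O workers, 2 memory workers of 128 MB each, for 10 minutes
stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 10m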

  • Wait for at least 10 minutes.
  • Let's enable detailed CloudWatch monitoring for the EC2 instance.
  • When we observe anomalies, we need to take action. Each AWS service we monitor has its own actions, and there is also a common set of actions.
  • Let's go with the concept of an alarm (a CloudWatch alarm).
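
The same CPU Utilization graph can also be requested programmatically. The sketch below only builds the request parameters for CloudWatch's GetMetricStatistics call, so it runs without AWS credentials. The helper name `cpu_query_params` and the instance ID are hypothetical. It assumes basic monitoring publishes a data point every 5 minutes and detailed monitoring every 1 minute.

```python
from datetime import datetime, timedelta, timezone


def cpu_query_params(instance_id: str, detailed_monitoring: bool, minutes: int = 60) -> dict:
    """Build keyword arguments for GetMetricStatistics on one instance's CPUUtilization."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        # Basic monitoring: 5-minute data points; detailed monitoring: 1-minute data points.
        "Period": 60 if detailed_monitoring else 300,
        "Statistics": ["Average", "Maximum"],
    }


# "i-0123456789abcdef0" is a placeholder instance ID.
params = cpu_query_params("i-0123456789abcdef0", detailed_monitoring=True)
print(params["Period"])  # 60

# With boto3 installed and credentials configured, the dict could be passed as:
#   boto3.client("cloudwatch").get_metric_statistics(**params)
```

Requesting a 60-second period only makes sense after detailed monitoring is enabled. With basic monitoring the instance publishes one data point every 5 minutes.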

AWS CloudWatch Alarms

  • In an alarm we configure a condition (for example, CPU Utilization > 90% or CPU Utilization < 15%).

  • Let's create an alarm. The AWS Console walks through the available options for creating alarms.

  • Let's create a reboot action that fires when CPU utilization is greater than 90% for 10 minutes.

  • An alarm has 3 states:

    • INSUFFICIENT_DATA: there is not enough data to evaluate the alarm
    • OK: the alarm condition is not met
    • ALARM: the alarm condition is met
  • Actions: what has to happen when the alarm condition is met.

  • Figure: CloudWatch alarm states

  • Next Steps:

    1. When the alarm is in the ALARM state:
      • How do I send email, pager alerts, or SMS?
      • How do I integrate with external systems (ITIL)?
    2. We will be creating multi-machine EC2 metrics and logs.
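
The reboot alarm from this section (CPU above 90% for 10 minutes) could be described with parameters like the ones below, followed by a deliberately simplified model of the three alarm states. This is a sketch under stated assumptions, not the console's exact behaviour. The function names and the alarm naming scheme are hypothetical, the 10 minutes are expressed as two 5-minute periods, and the state model ignores CloudWatch's missing-data handling and "M out of N" settings.

```python
from typing import List


def reboot_alarm_params(instance_id: str, region: str) -> dict:
    """Keyword arguments for PutMetricAlarm: reboot when average CPU > 90% for 10 minutes."""
    return {
        "AlarmName": f"high-cpu-reboot-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # one 5-minute data point
        "EvaluationPeriods": 2,   # 2 x 5 minutes = 10 minutes
        "Threshold": 90.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:reboot"],
    }


def alarm_state(datapoints: List[float], threshold: float, evaluation_periods: int) -> str:
    """Simplified state model: every one of the most recent periods must breach the threshold."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


# Placeholder instance ID and region.
params = reboot_alarm_params("i-0123456789abcdef0", "us-east-1")
threshold, periods = params["Threshold"], params["EvaluationPeriods"]

print(alarm_state([40.0, 95.0, 97.0], threshold, periods))  # ALARM
print(alarm_state([95.0, 40.0], threshold, periods))        # OK
print(alarm_state([95.0], threshold, periods))              # INSUFFICIENT_DATA

# With boto3 installed and credentials configured, the alarm could be created with:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

A single spike above 90% does not trigger the reboot. Both of the last two 5-minute averages have to breach the threshold, which matches the "for 10 minutes" condition.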

