CloudWatch Terms
-
Metric: Any measurable quantity of the infrastructure, application, or platform, e.g. CPU Utilization, Disk Read Bytes, Network Packets Out
-
Unit: The unit in which a metric is measured, e.g. Bytes/Second or Percent
-
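TimePeriod: The length of time over which a metric's data points are aggregated into a single value (CloudWatch calls this the period), e.g. 60 or 300 seconds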
-
Aggregations: How the data points within a time period are combined: Min, Max, Sum, Count, Average, Percentile
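As a rough illustration of what these aggregations mean, the sketch below computes each of them over the data points of a single time period. The function name `aggregate`, the sample values, and the nearest-rank percentile method are assumptions chosen for illustration; CloudWatch's own percentile calculation may differ.

```python
import math
from statistics import mean


def aggregate(values, percentile=90):
    """Summarise the data points collected during one time period."""
    if not values:
        raise ValueError("no data points in this period")
    ordered = sorted(values)
    # Nearest-rank percentile: the smallest value that has at least
    # `percentile` percent of the data points at or below it.
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    return {
        "Min": ordered[0],
        "Max": ordered[-1],
        "Sum": sum(ordered),
        "Count": len(ordered),
        "Average": mean(ordered),
        f"p{percentile}": ordered[rank - 1],
    }


# Hypothetical CPU Utilization samples (percent) from one period.
cpu_samples = [40.0, 55.0, 90.0, 70.0, 65.0]
stats = aggregate(cpu_samples)
```

For these samples the average is 64.0 while the maximum is 90.0, which shows why the choice of aggregation matters when a threshold is later applied to the metric.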
-
Alarm: A condition on a metric that indicates something measurable is wrong, for example:
- When CPU Utilization is > 85% for 10 mins
- When free disk space is < 1 GB
- When Network Packets In == 0 for 10 mins
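The first condition above could be expressed as parameters for CloudWatch's PutMetricAlarm API. This is a minimal sketch: the helper name `cpu_alarm_params`, the alarm name, and the instance ID are hypothetical, and the 5-minute period is an assumption. It only builds the parameter dictionary and makes no AWS call.

```python
def cpu_alarm_params(instance_id, threshold=85.0, minutes=10, period_seconds=300):
    """Build PutMetricAlarm parameters for 'CPU > threshold for N minutes'."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Unit": "Percent",
        "Period": period_seconds,
        # 10 minutes at a 300-second period means 2 consecutive periods.
        "EvaluationPeriods": (minutes * 60) // period_seconds,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Placeholder instance ID, for illustration only.
params = cpu_alarm_params("i-0123456789abcdef0")
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. CloudWatch has no equality comparison operator, so a condition such as Network Packets In == 0 would typically use `LessThanOrEqualToThreshold` with a threshold of 0.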
-
Alarm States:
- Alarm: The alarm condition is met, e.g. CPU Utilization is > 85% for over 10 mins
- OK: The alarm condition is not met, e.g. CPU Utilization stays at 74% for over 10 mins
- Insufficient (INSUFFICIENT_DATA): CloudWatch does not have enough data for the time period, e.g. the instance was created 2 mins ago, so there is not yet enough data to decide between the Alarm and OK states
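The three states can be sketched as a small decision function. This is a simplified model, not CloudWatch's actual evaluation logic: the function name `alarm_state` is hypothetical, it assumes a "greater than threshold" condition, and it treats any missing data point as insufficient data, whereas CloudWatch lets you configure how missing data is handled.

```python
def alarm_state(datapoints, threshold, periods_required):
    """Return the alarm state for a 'value > threshold' condition.

    datapoints: one aggregated value per period, oldest first;
                None marks a period with no data.
    """
    recent = datapoints[-periods_required:]
    # Not enough periods yet, or gaps in the data: no decision possible.
    if len(recent) < periods_required or any(v is None for v in recent):
        return "INSUFFICIENT_DATA"
    # Every evaluated period breaches the threshold.
    if all(v > threshold for v in recent):
        return "ALARM"
    return "OK"


# Two 5-minute periods cover the 10-minute window from the examples.
print(alarm_state([90.0, 92.0], threshold=85.0, periods_required=2))
print(alarm_state([74.0, 74.0], threshold=85.0, periods_required=2))
print(alarm_state([90.0], threshold=85.0, periods_required=2))
```

The three calls correspond to the Alarm, OK, and Insufficient examples in the list: two breaching periods, two healthy periods, and an instance that has only reported one period so far.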
Basic CloudWatch Alarming
-
Workflow
-
Alarms can trigger the following actions when the state is Alarm:
- Send notifications using SNS
- EC2 Auto Scaling actions
- EC2 actions (stop, terminate, reboot, or recover an instance)
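In the PutMetricAlarm API, each of these actions is identified by an ARN listed in the `AlarmActions` parameter. The sketch below assembles such a list. The helper name `alarm_actions`, the account ID, and the topic name are placeholders for illustration, and the region is an assumption.

```python
EC2_ACTIONS = {"stop", "terminate", "reboot", "recover"}


def alarm_actions(region, sns_topic_arn=None, scaling_policy_arn=None, ec2_action=None):
    """Collect the ARNs an alarm should invoke when it enters the Alarm state."""
    actions = []
    if sns_topic_arn:
        actions.append(sns_topic_arn)        # send a notification via SNS
    if scaling_policy_arn:
        actions.append(scaling_policy_arn)   # run an EC2 Auto Scaling policy
    if ec2_action:
        if ec2_action not in EC2_ACTIONS:
            raise ValueError(f"unsupported EC2 action: {ec2_action}")
        # EC2 actions use a fixed 'automate' ARN per region and action.
        actions.append(f"arn:aws:automate:{region}:ec2:{ec2_action}")
    return actions


# Placeholder account ID and topic name.
actions = alarm_actions(
    "us-east-1",
    sns_topic_arn="arn:aws:sns:us-east-1:123456789012:ops-alerts",
    ec2_action="stop",
)
```

Here the alarm would both notify the hypothetical `ops-alerts` topic and stop the instance it is watching.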
AWS CloudWatch Overview
note: From official AWS docs
Next Steps
- CloudWatch and EC2
- CloudWatch and RDS
- CloudWatch and ELB
- Other Service Integrations