MultiCloud Classroom notes 16/May/2026

Azure Monitoring Dashboards

Overview

Azure Monitoring Dashboards are customizable, unified views within the Azure portal that consolidate metrics, logs, alerts, and health data from across your Azure resources. They give operations teams, developers, and stakeholders a single pane of glass to observe the health, performance, and availability of cloud workloads in real time.

Types of Dashboards

| Type | Description | Best For |
|---|---|---|
| Azure Portal Dashboard | Drag-and-drop tiles pinned from resources | Quick operational overviews |
| Azure Workbooks | Rich, interactive reports with KQL queries | Deep-dive analysis and sharing |
| Grafana (Azure Managed) | Open-source dashboards backed by Azure data sources | Advanced visualization, multi-cloud |
| Power BI | Business-oriented dashboards connected via Azure Monitor | Executive reporting |

Key Metrics to Monitor

Virtual Machines

| Metric | Threshold to Watch |
|---|---|
| Percentage CPU | > 80% sustained |
| Available Memory Bytes | < 10% of total RAM |
| Disk Read/Write Bytes/sec | Spikes indicating I/O bottlenecks |
| Network In/Out | Unexpected traffic surges |
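
The thresholds above leave "sustained" undefined. A minimal sketch of one possible reading follows, assuming a breach counts as sustained once a fixed number of consecutive samples exceed the limit. The function names, the default of three consecutive samples, and the sample values are illustrative assumptions, not Azure Monitor settings.

```python
from typing import Sequence


def is_sustained_breach(
    samples: Sequence[float],
    threshold: float = 80.0,   # "> 80%" from the table above
    min_consecutive: int = 3,  # assumed meaning of "sustained"
) -> bool:
    """Return True if at least `min_consecutive` back-to-back samples exceed `threshold`."""
    run = 0
    for value in samples:
        # Extend the current run on a breach, otherwise reset it.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


def memory_is_low(available_bytes: int, total_bytes: int, min_fraction: float = 0.10) -> bool:
    """Return True if available memory is below `min_fraction` of total RAM."""
    return available_bytes < total_bytes * min_fraction


if __name__ == "__main__":
    cpu_samples = [50, 85, 90, 95, 40]  # hypothetical Percentage CPU readings
    print(is_sustained_breach(cpu_samples))
    print(memory_is_low(available_bytes=1 * 1024**3, total_bytes=16 * 1024**3))
```

Requiring consecutive breaches keeps a single short spike from being flagged; a longer window trades faster detection for fewer false alarms.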

Azure Kubernetes Service (AKS)

| Metric | Notes |
|---|---|
| Node CPU utilization | Per node and cluster-wide |
| Pod restart count | Indicates crashes or OOM kills |
| Pending pod count | Scheduling failures |
| kube-apiserver latency | Control plane health |

Azure SQL / Cosmos DB

| Metric | Notes |
|---|---|
| DTU/vCore utilization | Database compute saturation |
| Deadlocks | Query contention |
| Request Units consumed | Cosmos DB throughput |
| Connection failures | Connectivity issues |

App Services / Functions

| Metric | Notes |
|---|---|
| HTTP 5xx errors | Server-side failures |
| Average response time | Performance degradation |
| Requests per second | Load trends |
| Function execution count | Invocation volume |

Writing KQL Queries for Dashboards

KQL (Kusto Query Language) queries data stored in Log Analytics workspaces. Query results, including rendered charts, can be pinned to a dashboard as tiles.

Example — Top 10 Errors in the Last 24 Hours

AppExceptions
| where TimeGenerated > ago(24h)
| summarize ErrorCount = count() by OuterType, AppRoleName
| top 10 by ErrorCount desc
| render barchart

Example — VM CPU Average by Hour

Perf
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| summarize AvgCPU = avg(CounterValue) by bin(TimeGenerated, 1h), Computer
| render timechart
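
To make the aggregation in this query concrete, here is a rough Python equivalent of `summarize avg(CounterValue) by bin(TimeGenerated, 1h), Computer`. It is only a sketch of the grouping logic: the function name `hourly_avg_cpu` and the sample rows are hypothetical, and it does not call any Azure API.

```python
from collections import defaultdict
from datetime import datetime


def hourly_avg_cpu(rows):
    """Average CPU per (hour, computer).

    `rows` is an iterable of (timestamp, computer, value) tuples.
    """
    buckets = defaultdict(list)
    for ts, computer, value in rows:
        # Truncate to the start of the hour, like bin(TimeGenerated, 1h).
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(hour, computer)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


if __name__ == "__main__":
    sample = [  # hypothetical Perf rows
        (datetime(2026, 5, 16, 10, 5), "vm1", 40.0),
        (datetime(2026, 5, 16, 10, 35), "vm1", 60.0),
        (datetime(2026, 5, 16, 11, 10), "vm1", 90.0),
        (datetime(2026, 5, 16, 10, 20), "vm2", 20.0),
    ]
    for (hour, computer), avg in sorted(hourly_avg_cpu(sample).items()):
        print(hour.isoformat(), computer, avg)
```

Each (hour, computer) pair becomes one point on the time chart, so a one-hour bin smooths out short spikes that a smaller bin would show.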

Example — Alert Trend Over 7 Days

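// AlertsManagementResources is an Azure Resource Graph table rather than a Log Analytics table.
AlertsManagementResources
| where type == "microsoft.alertsmanagement/alerts"
// properties is dynamic, so cast the start time to datetime before comparing or binning.
| extend StartTime = todatetime(properties.essentials.startDateTime)
| where StartTime > ago(7d)
| summarize AlertCount = count() by Severity = tostring(properties.essentials.severity), bin(StartTime, 1d)
| render columnchart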

Best Practices

  • Organize by audience — Create separate dashboards for Ops, Dev, and Executive views
  • Use resource tags — Filter dashboards dynamically using subscription-level tag scopes
  • Version control dashboards — Export JSON and store in Git for auditability
  • Set alert tiles prominently — Always include an alert summary tile at the top
  • Limit tile count — Keep dashboards focused; use Workbooks for detailed drill-downs
  • Apply RBAC — Grant least-privilege access; use Reader role for view-only stakeholders
  • Schedule review cadence — Revisit dashboard relevance monthly as architectures evolve
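
For the version-control practice above, exported dashboard JSON tends to diff more cleanly in Git when key order and indentation are stable. The sketch below normalizes a JSON string before it is committed. The function name and the sample payload are hypothetical, and the payload is not the real Azure dashboard schema.

```python
import json


def normalize_dashboard_json(raw: str) -> str:
    """Return `raw` re-serialized with sorted keys and fixed indentation."""
    data = json.loads(raw)
    # Sorted keys and a trailing newline keep diffs stable between exports.
    return json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


if __name__ == "__main__":
    # Two hypothetical exports of the same dashboard with different key order.
    export_a = '{"name": "ops-overview", "tags": {"team": "ops", "env": "prod"}}'
    export_b = '{"tags": {"env": "prod", "team": "ops"}, "name": "ops-overview"}'
    print(normalize_dashboard_json(export_a) == normalize_dashboard_json(export_b))
```

Running the normalizer before each commit means a diff only shows real changes to tiles or queries, not reordered keys.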
