DevOps Classroom notes 06/Sep/2025

Site Reliability Engineering

  • These are the practices Google follows to run its production systems. Refer Here for the books.
  • Refer Here for SRE concepts.

Observability of Kubernetes using Prometheus and Grafana Stack

  • We will set up observability using Prometheus, Grafana, and Loki.
  • We will set up a Kubernetes cluster on AKS; most cloud providers also offer Prometheus and Grafana as managed add-ons.
  • Set up AKS
RG=rg-aks-obsv
AKS=aks-obsv
LOCATION=eastus
NAMESPACE=monitoring
az group create -n $RG -l $LOCATION
az aks create -g $RG -n $AKS --node-count 3 --node-vm-size Standard_B2ms \
    --enable-managed-identity --generate-ssh-keys
az aks get-credentials -g $RG -n $AKS --overwrite-existing
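Before installing anything, it can help to confirm that the cluster is reachable and that all nodes are Ready. The following is a minimal sketch, assuming the three-node cluster created above; `check_cluster` is a hypothetical helper name, not part of kubectl or the Azure CLI.

```shell
# check_cluster [expected_nodes]: hypothetical helper, not part of kubectl.
# Waits for every node to become Ready, then compares the Ready count
# with the expected node count (3 matches --node-count above).
check_cluster() {
  expected=${1:-3}
  kubectl wait --for=condition=Ready nodes --all --timeout=300s >/dev/null || return 1
  # Column 2 of `kubectl get nodes --no-headers` is the node STATUS.
  ready=$(kubectl get nodes --no-headers | awk '$2 == "Ready" { n++ } END { print n + 0 }')
  if [ "$ready" -eq "$expected" ]; then
    echo "cluster ok: $ready/$expected nodes Ready"
  else
    echo "cluster not ready: $ready/$expected nodes Ready" >&2
    return 1
  fi
}
```

Run `check_cluster 3` once `az aks get-credentials` has written the kubeconfig.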
  • Set up Prometheus using Helm
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

kubectl create namespace $NAMESPACE

cat > values.kps.yaml <<'EOF'
grafana:
  adminUser: admin
  adminPassword: admin123
  service: { type: LoadBalancer }
  persistence:
    enabled: true
    type: pvc
    size: 10Gi
    storageClassName: managed-csi

prometheus:
  prometheusSpec:
    retention: 15d
    retentionSize: "2GiB"
    walCompression: true
    enableFeatures:
      - exemplar-storage
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: managed-csi
          accessModes: ["ReadWriteOnce"]
          resources: { requests: { storage: 5Gi } }
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false

alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: managed-csi
          accessModes: ["ReadWriteOnce"]
          resources: { requests: { storage: 1Gi } }
EOF
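The values file above sets a fixed Grafana admin password, and Grafana is exposed through a public LoadBalancer. One option is to generate a random password and override the value at install time. This is a sketch only; the variable name `GRAFANA_PASSWORD` is an assumption made for illustration.

```shell
# Read 12 random bytes and print them as 24 lowercase hex characters.
# GRAFANA_PASSWORD is an illustrative variable name, not something Helm reads.
GRAFANA_PASSWORD=$(head -c 12 /dev/urandom | od -An -tx1 | tr -d ' \n')
echo "Generated a ${#GRAFANA_PASSWORD}-character Grafana admin password"
```

Passing `--set grafana.adminPassword="$GRAFANA_PASSWORD"` to `helm install` overrides the value from values.kps.yaml, because `--set` takes precedence over values files.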
  • Now install the kube-prometheus-stack chart, which deploys Prometheus, Alertmanager, and Grafana
helm install kps prometheus-community/kube-prometheus-stack -n $NAMESPACE -f values.kps.yaml


kubectl get pods -n $NAMESPACE
kubectl get svc -n $NAMESPACE
  • Get the Grafana URL
GRAFANA_LB=$(kubectl -n $NAMESPACE get svc kps-grafana -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "Grafana: http://$GRAFANA_LB  (admin/admin123)"
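Azure can take a minute or two to assign the external IP, so the jsonpath query above may return an empty string right after the install. A minimal sketch of a polling loop follows; `wait_for_lb_ip` is a hypothetical helper name, and the 5-second poll interval and 300-second default timeout are arbitrary choices.

```shell
# wait_for_lb_ip <namespace> <service> [timeout_seconds]
# Hypothetical helper: polls the Service until an external IP appears,
# prints the IP on success, and returns 1 if the timeout expires first.
wait_for_lb_ip() {
  ns=$1
  svc=$2
  timeout=${3:-300}
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    ip=$(kubectl -n "$ns" get svc "$svc" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null) || ip=""
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    sleep 5
    waited=$((waited + 5))
  done
  echo "timed out waiting for an external IP on $svc" >&2
  return 1
}
```

For example, `GRAFANA_LB=$(wait_for_lb_ip "$NAMESPACE" kps-grafana)` blocks until the address is available.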

  • Now let's set up log collection using Loki and Promtail
helm install loki grafana/loki-simple-scalable \
  --namespace $NAMESPACE



helm install promtail grafana/promtail \
  --namespace $NAMESPACE \
  --set "loki.serviceName=loki-write" \
  --set "loki.servicePort=3100"
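Grafana does not know about Loki until a data source is added. The Grafana bundled with kube-prometheus-stack runs a sidecar that loads data sources from ConfigMaps labelled `grafana_datasource`, so one way to register Loki is sketched below. The ConfigMap name, the file name, and the `loki-gateway` service URL are assumptions; check `kubectl get svc -n $NAMESPACE` for the actual Loki service names in your install.

```shell
# Write a ConfigMap that the Grafana sidecar can pick up as a data source.
# Assumptions: the Loki gateway Service is named loki-gateway, and the
# names loki-datasource / loki-datasource.yaml are illustrative.
NAMESPACE=${NAMESPACE:-monitoring}

cat > loki-datasource.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-datasource
  namespace: $NAMESPACE
  labels:
    grafana_datasource: "1"
data:
  loki-datasource.yaml: |
    apiVersion: 1
    datasources:
      - name: Loki
        type: loki
        access: proxy
        url: http://loki-gateway.$NAMESPACE.svc.cluster.local
EOF

echo "wrote loki-datasource.yaml for namespace $NAMESPACE"
```

Apply it with `kubectl apply -f loki-datasource.yaml`. Grafana may need a restart to load the new data source, and if Loki runs with multi-tenancy enabled the data source also needs an `X-Scope-OrgID` header.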

  • Let's deploy the app built for monitoring
kubectl create ns demo
IMAGE=shaikkhajaibrahim/observapp:latest

# Deployment & Service (uses your pushed image)
cat > fastapi-k8s.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastapi-obs
  namespace: demo
spec:
  replicas: 2
  selector:
    matchLabels: { app: fastapi-obs }
  template:
    metadata:
      labels: { app: fastapi-obs }
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8000"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - name: app
        image: $IMAGE
        env:
        - name: OTEL_SERVICE_NAME
          value: fastapi-obs
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: http://otel-collector.otel:4318
        - name: OTEL_EXPORTER_OTLP_PROTOCOL
          value: http/protobuf
        - name: OTEL_TRACES_SAMPLER
          value: parentbased_traceidratio
        - name: OTEL_TRACES_SAMPLER_ARG
          value: "1.0"          # sample everything in lab; reduce in prod
        - name: OTEL_PYTHON_LOG_CORRELATION
          value: "true"
        ports:
        - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: fastapi-obs
  namespace: demo
spec:
  selector: { app: fastapi-obs }
  ports:
  - port: 80
    targetPort: 8000
EOF

kubectl apply -f fastapi-k8s.yaml
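The `prometheus.io/*` annotations in the Deployment are only honoured when Prometheus has a scrape job written for them. The Prometheus Operator used by kube-prometheus-stack discovers targets through ServiceMonitor and PodMonitor objects instead. Because the values file sets `serviceMonitorSelectorNilUsesHelmValues: false`, a ServiceMonitor in any namespace should be picked up. The sketch below assumes the app serves metrics on container port 8000 at `/metrics`, as the annotations suggest; the 30s interval and the file name are arbitrary.

```shell
# Write a ServiceMonitor for the demo app.
# selector: {} matches every Service in the demo namespace, since the
# Service above carries no labels of its own to select on.
cat > fastapi-servicemonitor.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: fastapi-obs
  namespace: demo
spec:
  selector: {}
  endpoints:
  - targetPort: 8000
    path: /metrics
    interval: 30s
EOF

echo "wrote fastapi-servicemonitor.yaml"
```

Apply it with `kubectl apply -f fastapi-servicemonitor.yaml`. Giving the Service a label and a named port, then selecting on those, is the tidier long-term setup.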

  • Set up a steady traffic generator
kubectl -n demo run looper --image=curlimages/curl -i --rm -- \
  sh -lc 'while true; do curl -s fastapi-obs.demo/hello >/dev/null; sleep 0.2; done'
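With traffic flowing, you can check whether Prometheus is scraping the app by calling its HTTP query API. This is a minimal sketch: `prom_query` is a hypothetical helper name, and it assumes Prometheus is reachable on localhost:9090, for example through a port-forward to the `prometheus-operated` Service that the operator creates.

```shell
# prom_query <promql> [base_url]
# Hypothetical helper: sends an instant query to the Prometheus HTTP API
# and prints the raw JSON response. The base URL defaults to a local
# port-forward on 9090.
prom_query() {
  curl -sG --data-urlencode "query=$1" "${2:-http://localhost:9090}/api/v1/query"
}
```

In a second terminal, start `kubectl -n $NAMESPACE port-forward svc/prometheus-operated 9090:9090`, then run `prom_query 'up{namespace="demo"}'`. An empty result list means no scrape target in the demo namespace has been discovered yet.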

We have built a sample application for observability


By continuous learner
