Complete k8s Classroom notes 02/Nov/2023

Why Kubernetes is Challenging in Production

  • Running k8s in production brings challenges around
    • scaling
    • uptime
    • security
    • observability
    • resource utilization
    • cost management
  • K8s lacks complete support for some essential services such as IAM, storage and image registries
  • The steep learning curve and the many moving parts make k8s harder to manage. Let's have a look at the k8s infra layers

Production Readiness checklist

Cluster Infrastructure

  • The following checklist items cover the production readiness requirements on the cluster level
    • Run a highly available control plane
    • Run a highly available workers group
    • Use a shared storage management system
    • Deploy an infrastructure observability stack
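
The "highly available control plane" item can be made concrete with a little arithmetic. The sketch below assumes the control plane keeps its state in etcd, which needs a majority of its members to be reachable before it accepts writes. The function names are illustrative and are not part of any k8s API.

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """Number of members that can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} members: quorum={quorum(n)}, "
              f"tolerates {failures_tolerated(n)} failure(s)")
```

Under this assumption an even member count tolerates no more failures than the odd count just below it, which is why three or five control plane nodes are a common choice.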

Cluster services

  • The following checklist items cover the production readiness requirements on the cluster services level
    • Control cluster access
    • Harden the default pod security admission
    • Enforce custom policies and rules
    • Deploy and restrict network policies
    • Enforce security checks and conformance testing
    • Deploy a backup and restore solution
    • Deploy an observability stack for cluster components
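
As one possible starting point for the "deploy and restrict network policies" item, the sketch below builds a default-deny NetworkPolicy as a plain Python dict and prints it as JSON. The policy name `default-deny-all` and the namespace `payments` are hypothetical; the notes do not prescribe a specific policy.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that selects every pod in `namespace` and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # "default-deny-all" is an illustrative name, not a required one
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty selector matches all pods in the namespace
            "podSelector": {},
            # Both directions are listed but no allow rules are given,
            # so all ingress and egress traffic is denied
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    # "payments" is a hypothetical namespace used only for this example
    print(json.dumps(default_deny_policy("payments"), indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`. It only takes effect when the cluster's CNI plugin enforces network policies, and workloads then need explicit allow policies layered on top.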

Apps and Deployments

  • The following checklist items cover the production readiness requirements on the apps and deployments level
    • Automate image quality and vulnerability scanning
    • Deploy an ingress controller
    • Manage certificates and secrets
    • Deploy an app observability stack
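
One simple gate that automated image quality checks could include is rejecting image references that are not pinned. This is an assumption for illustration: the notes do not name specific checks, and a real pipeline would run a vulnerability scanner as well. The helper `is_pinned` and the sample image names are hypothetical.

```python
def is_pinned(image: str) -> bool:
    """Return True if the reference has a digest or a tag other than 'latest'."""
    if "@sha256:" in image:
        return True
    # Only look for a tag in the last path segment, so a registry port
    # such as "localhost:5000/app" is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag means the runtime falls back to "latest"
    tag = last_segment.split(":", 1)[1]
    return tag not in ("", "latest")


if __name__ == "__main__":
    # Hypothetical image references used only for this example
    for ref in ("nginx", "nginx:latest", "nginx:1.25.3", "localhost:5000/app"):
        status = "pinned" if is_pinned(ref) else "NOT pinned"
        print(f"{ref}: {status}")
```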

Kubernetes Infrastructure Best Practices

The 12 principles of infrastructure design and management

  • The following list summarizes the core principles that guide decision making throughout the k8s infrastructure design process.
    • Go Managed
    • Simplify
    • Everything as Code (XaC)
    • Immutable infrastructure
    • Automation (GitOps and Operators)
    • Standardization
    • Single Source of truth (Git)
    • Design for availability
    • Cloud agnostic
    • Business Continuity
    • Plan for failures
    • Operational efficiency

Cloud Native landscape & ecosystem

  • This landscape has four layers

    • Provisioning
    • Runtime
    • Orchestration and Management
    • App definition and development
  • See also the CNCF Cloud Native Trail Map

Best-Practices for production

  • List of important considerations and best practices to run k8s in production
    • Cluster Configuration:
      • Use infrastructure as code (IaC) to automate the creation and management of k8s clusters
      • Separate your clusters for development, testing and production
    • Security
      • Follow the principle of least privilege for access to the k8s API
      • Use RBAC to manage access to resources
      • Secure your cluster with network policies
      • Use namespaces to isolate workloads
      • Keep container images free of vulnerabilities and regularly scan them
      • Use trusted base images for containers
      • Enable audit logging to keep track of activities
    • Networking
      • Use a CNI plugin that supports and enforces network policies
      • Expose services through ingress controllers and LoadBalancer services over secure connections
    • Storage:
      • Use persistent volumes for stateful applications
      • Regularly back up your persisted data
      • Implement robust storage solutions that match your IOPS and throughput requirements
    • Monitoring & logging:
      • Implement a comprehensive monitoring solution like Prometheus to track cluster state and performance
      • Aggregate and analyze logs using tools like Elasticsearch, Fluentd and Kibana
    • High Availability:
      • Run k8s control plane components in HA mode
      • Deploy critical applications with multiple replicas
      • Distribute workloads across multiple nodes and zones
    • Disaster Recovery:
      • Create regular backups of your cluster state (etcd)
      • Have a Disaster Recovery Plan in place
    • Automation:
      • Automate your deployments with CI/CD pipelines
      • Use GitOps for declarative infrastructure and application management
    • Resource Management:
      • Implement resource requests and limits to ensure fair scheduling and avoid resource contention
      • Use Horizontal Pod Autoscaling to adjust the number of pod replicas based on load
    • Updates and Upgrades:
      • Regularly apply updates to k8s and containerized applications
      • Perform rolling updates to minimize downtime
    • Performance Tuning:
      • Profile the performance of your applications and optimize them as needed
      • Tune the kernel and network settings for better performance where necessary
    • State Management:
      • Use StatefulSets for workloads that require stable and persistent storage
      • Ensure the state is backed up to avoid data inconsistency
    • Cost Management:
      • Monitor resource usage to optimize costs
      • Use cost-allocation tags for billing and cost optimization
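
Two of the Resource Management items above can be sketched in a few lines. The first function flags containers in a Deployment manifest (given as a plain dict) that lack CPU or memory requests or limits. The second applies the replica formula described in the Kubernetes Horizontal Pod Autoscaler documentation, ignoring the tolerance band and the min/max replica bounds that the real controller also applies. Both function names and the sample manifest are hypothetical.

```python
import math


def containers_missing_resources(deployment: dict) -> list:
    """List 'container: field' entries for missing cpu/memory requests or limits."""
    containers = deployment["spec"]["template"]["spec"]["containers"]
    missing = []
    for container in containers:
        resources = container.get("resources", {})
        for kind in ("requests", "limits"):
            for resource in ("cpu", "memory"):
                if resource not in resources.get(kind, {}):
                    missing.append(f"{container['name']}: {kind}.{resource}")
    return missing


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
    """ceil(currentReplicas * currentMetricValue / desiredMetricValue)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


if __name__ == "__main__":
    # Hypothetical manifest fragment: "web" sets requests but no limits
    deployment = {
        "spec": {"template": {"spec": {"containers": [
            {"name": "web",
             "resources": {"requests": {"cpu": "250m", "memory": "128Mi"}}},
        ]}}}
    }
    print(containers_missing_resources(deployment))
    # 4 replicas at 90% average CPU against a 60% target
    print(hpa_desired_replicas(4, 90, 60))
```

A check like the first one could run in a CI/CD pipeline before manifests are applied, so that missing requests or limits are caught before they reach the cluster.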
