All DevOps & Cloud Guides
AKS Backup Patterns (Batch 8)
Back up etcd state and persistent volumes via Velero or Azure Backup. Quick reference guide with examples and best practices. Updated November 2025.
Alert Flood Mitigation
Silence noisy alerts, throttle them, and adjust routing.
Alert Flood Mitigation (Batch 8)
Use silences, grouping, and rate limiting to tame alert floods.
Ansible
IT automation and configuration management. Use ansible-vault to secure sensitive variables.
Ansible Playbooks
Define playbooks with hosts, tasks, handlers, and roles.
Argo CD Sync Waves
Use sync waves to control deployment order and gates during GitOps syncs.
ArgoCD
GitOps continuous delivery for Kubernetes. ArgoCD continuously monitors Git for changes.
Automation Ops Checklist
Document automation workflows, approval steps, and fallbacks.
AWS Cost Operations
Monitor budgets, schedule reports, and optimize spend.
AWS Cost Optimization
Analyze spend, rightsizing opportunities, and savings plans monthly.
AWS EKS Bottlerocket
Use Bottlerocket AMIs for immutable node pools with tuned kubelets.
AWS EKS Encryption 1
Encrypt secrets and etcd.
AWS EKS Encryption 11
Encrypt secrets and etcd.
AWS EKS Encryption 21
Encrypt secrets and etcd.
AWS EKS Encryption 31
Encrypt secrets and etcd.
AWS EKS Encryption 41
Encrypt secrets and etcd.
AWS GuardDuty Response
Classify GuardDuty findings, correlate context, and trigger remediation.
AWS IAM Access Advisor
Use Access Advisor data to remove stale IAM actions from roles.
AWS IAM Policies
Design IAM policies and test them with the IAM policy simulator.
AWS Lambda Health Probes
Send regular synthetic requests to Lambda entry points to detect slowdowns.
AWS Lambda Powertools
Leverage Powertools decorators for structured logs, telemetry, and idempotency.
AWS Monitoring
CloudWatch alarms, X-Ray tracing, and dashboards.
AWS Resilience Patterns
Design resilient AWS workloads with retries, multi-AZ deployments, and fallbacks.
AWS S3 Policy Review (Batch 8)
Check public access block settings and fine-tune bucket policies.
AWS Security Hub Audit
Quick diagnostics for AWS Security Hub, IAM, CloudTrail, and GuardDuty.
AWS Services
Amazon Web Services essentials. Navigate AWS cloud services efficiently.
AWS SNS+SQS Anti-Patterns
Avoid infinite retries, missing DLQs, and unbounded queue growth.
Azure Container Instances
Run container workloads without Kubernetes by relying on ACI.
Azure Cost Alerts (Batch 8)
Send budget alerts when spend approaches limits.
Azure DevOps Pipeline Cheats
Design multi-stage pipelines with reusable templates and approvals.
Azure Function Proxies
Add proxies for path rewrites, auth, and caching in front of Functions.
Azure Functions
HTTP, timer, and Cosmos DB triggers, plus Durable Functions.
Azure Logging
Collect diagnostics and logs in Azure.
Azure Monitor Log Analytics (Batch 8)
Run Kusto queries and alerts within Log Analytics.
Azure Monitor-Based Alerts
Build metric alerts with dynamic thresholds and action groups.
Azure Policy Baselines
Define policy sets that cover identity, network, storage, and cost guardrails.
Azure Private Endpoints
Expose Azure services via private endpoints tied to VNets.
Azure Spot VM Operations
Use Azure Spot VMs while handling eviction notices and fallback hosts.
Azure Workload Identity 10
Managed identity best practices.
Azure Workload Identity 20
Managed identity best practices.
Azure Workload Identity 30
Managed identity best practices.
Azure Workload Identity 40
Managed identity best practices.
Azure Workload Identity 50
Managed identity best practices.
Blue/Green CI/CD (Batch 8)
Use stacks for blue/green deployments and swap load balancers once ready.
Chaos Engineering Controls
Introduce controlled faults to prove resilience while minimizing blast radius.
CI/CD Best Practices
Continuous Integration and Deployment.
CI/CD Branch Gating
Require passing builds, approvals, and gating policies before merges.
CI/CD Canary Workflows
Orchestrate canary deployments using traffic splits and health gates.
CI/CD Canary Workflows (Batch 8)
Route small percentages of traffic to new builds using feature flags and health gates.
CI/CD Security
Gate pipelines with scans, secrets hygiene, and approvals.
CI/CD Signing & Notarization
Sign releases and optionally notarize containers/binaries.
CircleCI
Cloud-based CI/CD platform. Use CircleCI orbs to simplify configuration.
Cloud Basics
AWS, GCP, and Azure fundamentals.
Cloudflare Workers
Deploy edge functions, persist data, and configure routes.
Consul Service Mesh
Use Consul intentions, ACLs, and proxies to secure service-to-service traffic.
Data Version Control
Use DVC or Git LFS to version data and experiments.
Datadog
Monitoring and observability platform. Datadog provides a unified view across infrastructure and apps.
DigitalOcean
DigitalOcean Droplets, Spaces, App Platform, and management.
Docker
Container management and Docker commands.
Docker BuildKit Optimizations
Use BuildKit features like cache imports, secret mounts, and parallel build stages to shrink build time.
Docker Compose Health
Add health probes and restart policies to Compose services.
Docker Compose Secrets (Batch 8)
Reference secrets objects rather than inline env vars.
Docker Healthcheck Definitions
Define `HEALTHCHECK` commands and intervals for each service.
Docker Image Layer Caching
Structure Dockerfiles so stable layers are reused across builds.
Docker Nexus Proxy
Proxy Docker Hub via Nexus to cache layers and limit outbound costs.
Docker Non-Root Best Practices
Switch to non-root users and drop capabilities during image build.
Docker Practices
Slim images, multi-stage builds, and secure runtime habits.
Dockerfile Security (Batch 8)
Use multi-stage builds, drop root, and scan images.
EKS Fargate Spot
Run bursty or fault-tolerant pods on Fargate Spot to reduce spend.
EKS IAM Roles for Service Accounts (Batch 8)
Bind IAM roles to service accounts using IRSA.
GCP DNS Peering (Batch 8)
Peer Cloud DNS zones into VPCs for custom domain resolution.
GCP IAM Conditions
Limit access by enforcing conditions on service accounts or users.
GCP Private Service Connect
Expose services privately via Private Service Connect endpoints.
GCP Secret Manager Lifecycle
Version secrets, rotate them automatically, and track IAM bindings.
GCP Service Mesh Primer
Configure Anthos Service Mesh traffic policies and telemetry.
Git
Git version control cheat sheet with essential commands for branching, merging, committing, and collaboration.
Git Branching Strategies
Choose models (GitHub flow, GitLab flow, trunk-based) and keep merges predictable.
Git CLI Tips
Speed up CLI workflows with aliases, rerere, and stash.
Git Commit Message Standards
Use conventional prefixes, scopes, and tidy bodies for automation.
Git LFS Workflows
Keep large binaries in Git LFS and track quota usage.
Git Secret Lint
Scan staged files with `git-secrets` or `detect-secrets` before commits.
Git Workflows (Trunk, Feature, Shape)
Compare trunk-based, feature-branch, and Shape Up workflows.
GitHub Actions
CI/CD automation for GitHub. Use actions/cache to speed up workflow runs.
GitHub Actions Advanced
Advanced GitHub Actions: workflows, matrix builds, and reusable actions.
GitHub Actions Matrix Builds
Define matrices to test across environments efficiently.
GitHub Actions Security
Secure your GitHub Actions pipelines with scans, secrets, and policies.
GitLab CI/CD
Define YAML jobs, caching, and rules.
GitOps
GitOps principles and tools: ArgoCD and FluxCD.
GitOps Observability
Monitor GitOps controllers for drift, errors, and delivery latency.
GitOps Workflows
Declarative delivery, automation, and drift detection.
GKE Resource Quota Monitoring
Monitor `ResourceQuota` usage per namespace so teams avoid hot nodes.
Google Cloud Platform
GCP services overview. Use BigQuery for large-scale data analytics.
Grafana
Data visualization and monitoring. Use template variables to create reusable dashboards.
Grafana Dashboard Shorthand
Quickly assemble dashboards with templates, panel links, and alert rules.
Grafana Panel Library
Share JSON panels as a library and version them in code.
Grafana Snapshot Share
Generate snapshots to share dashboards without login credentials.
Grafana Template Variables
Use template variables for environment, region, and service filters.
GuardDuty Threat Hunter (Batch 8)
Correlate findings with AWS Config and CloudTrail for quick hunting.
HashiCorp Consul
Service mesh and discovery. Consul provides both service discovery and a service mesh.
HashiCorp Vault
Secrets and encryption management. Use dynamic secrets instead of static credentials.
Helm
Kubernetes package manager. Use `helm template` to preview manifests before install.
Helm Chart Structure
Organize templates, values, and hooks for reusable Helm packages.
Helm Chart Templating
Structure values, templates, and helpers for maintainable charts.
Helm Linting
Run `helm lint` and schema checks to catch template issues.
Istio
Service mesh platform. Istio adds observability, security, and control to microservices.
Istio Ingress Gateway
Expose services through Istio Gateways with TLS policies and routing rules.
Jenkins
Open-source automation server. Use declarative pipelines for simpler syntax.
Kubernetes
K8s commands and concepts.
Kubernetes Cost Optimization
Node pools, schedule tuning, and spot workloads.
Kubernetes GitOps Checklist
Capture repo layout, bootstrap repo credentials, and validate sync targets.
Kubernetes Liveness Patterns (Batch 8)
Use probe endpoints to restart stuck containers.
Kubernetes Network Policies
Define ingress and egress rules to limit pod communication.
Kubernetes Operator Patterns
Structure Kubernetes operators with clear CRDs, leader election, and retries.
Kubernetes Operators
Design CRDs, controllers, and reconciliation loops for reliable automation.
Kubernetes Pod Troubleshooting
Triage pod restarts, CrashLoopBackOff states, and silent failures.
Kubernetes RuntimeClass
Define RuntimeClass to select alternate container runtimes (gVisor, Kata Containers).
Kubernetes Security
Network policies, RBAC, and supply-chain protections.
Kubernetes StorageClasses
Tune provisioners, reclaim policies, and binding modes per workload.
Kubernetes Telemetry
Collect metrics, traces, and logs centrally.
Lambda Async Invocation Tips
Handle async events with DLQs and idempotent functions.
Lambda@Edge
Associate functions with CloudFront events and monitor latency.
Linux Audit Rules
Use auditctl to watch sensitive files and commands.
Linux Commands
Essential Linux terminal commands.
Linux iptables Tables (Batch 8)
Understand the filter, nat, and raw tables.
Linux OOM Score Tuning
Lower OOM scores for critical daemons so the OOM killer targets other processes first.
Linux Systemd Timers
Use timers instead of cron for predictable unit execution.
Microsoft Azure
Azure cloud services essentials. Use Azure Cost Management to monitor spending.
Monitoring Drilldowns
Connect high-level panels to deeper dashboards or APIs for troubleshooting.
Multi-Cloud IAM Mapping
Document equivalent roles, service accounts, and policies for each cloud.
Nginx
Nginx web server configuration.
Observability as Code
Store observability configs under version control.
Observability Dogfooding
Teams should consume their own metrics, traces, and dashboards before release.
OpenTelemetry Instrumentation
Capture consistent telemetry for services and propagate context across boundaries.
Prometheus
Open-source monitoring and alerting. Use Grafana for visualizing Prometheus metrics.
Prometheus Alert Rates
Document alert noise sources and mute them via rate limiting or silences.
Prometheus Alerting Best Practices
Add runbooks, use routes, and silence noise to prevent fatigue.
Prometheus Blackbox Prober
Probe HTTP, TCP, and ICMP endpoints from your observability stack.
Prometheus Query Best Practices
Avoid costly queries while keeping dashboards accurate.
Pulumi
Infrastructure as Code with programming languages. Pulumi uses real programming languages, not DSLs.
Serverless Cost Controls
Prevent runaway bills via concurrency, schedules, and budgets.
SRE Incident Response Playbook
Runbooks for status pages, incident channels, and after-action reviews.
SRE On-call
Define rotations, alert behavior, and runbook links.
Systemd Services
Craft service and timer units, enable/disable, and inspect logs.
Terraform
Infrastructure as Code tool. Use Terraform workspaces to manage multiple environments.
Terraform Cloud
Remote workspaces, policy sets, and secure runs in Terraform Cloud.
Terraform Drift Detection
Catch resource drift before it surprises production.
Terraform Module Registry (Batch 8)
Host modules on Terraform Registry or private registries with versioning.
Terraform Modules
Module structure, inputs/outputs, and versioning.
Terraform Provider Security 18
Limit provider privileges.
Terraform Provider Security 28
Limit provider privileges.
Terraform Provider Security 38
Limit provider privileges.
Terraform Provider Security 48
Limit provider privileges.
Terraform Provider Security 8
Limit provider privileges.
Terraform Secure Backends
Configure remote backends with encryption and locking.
Terraform Security Scanner
Run scanners, enforce policies, and catch drift.
Terraform Testing
Run fmt/validate/plan, policy tests, and module unit tests.
Terraform Workspace Strategy
Use named workspaces for staging, prod, and experiments.
Travel Digital Nomad Kit
Pack chargers, backups, and routines for nomadic productivity.
Vault Dynamic Secrets
Issue temporary credentials for databases, cloud APIs, or SSH.
Vault Dynamic SSH
Issue short-lived SSH certs via Vault's SSH secrets engine.
Vault KV Versioning
Use KV v2 to store versioned secrets and recover past values.
Vault Secrets Engines
Enable engines, generate dynamic credentials, and renew leases.
Vault Secrets Rotation
Rotate secrets and revoke leases via Vault cron jobs.