Advanced Deployment Strategies¶

Duration: 40 minutes (20 minutes theory + 20 minutes lab)

Introduction¶

Basic deployments with rolling updates work for many scenarios, but production systems often require more sophisticated deployment strategies to minimize risk and downtime.

Deployment Strategies Overview¶

Strategy	Description	Downtime	Risk	Rollback Speed
Recreate	Stop all, deploy new	Yes	High	Slow
Rolling Update	Gradual replacement	No	Medium	Medium
Blue/Green	Two complete environments	No	Low	Instant
Canary	Gradual traffic shift	No	Very Low	Instant
A/B Testing	Feature-based routing	No	Low	Instant

Rolling Updates (Advanced)¶

Configuration¶

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2        # Max pods above desired count
      maxUnavailable: 1  # Max pods unavailable during update
  minReadySeconds: 30    # Wait before considering pod ready
  progressDeadlineSeconds: 600  # Timeout for deployment progress
  template:
    spec:
      containers:
      - name: app
        image: myapp:v2
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5

Best Practices:

Set maxUnavailable to 0 for critical services
Use maxSurge to speed up deployments
Always configure readiness probes
Set appropriate minReadySeconds

Blue/Green Deployments¶

Deploy new version alongside old, then switch traffic instantly.

Implementation with Services¶

# Blue deployment (v1 - current production)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: blue
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
      - name: app
        image: myapp:v1
---
# Green deployment (v2 - new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
      - name: app
        image: myapp:v2
---
# Service points to blue initially
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue  # Switch to 'green' when ready
  ports:
  - port: 80
    targetPort: 8080

Switching Process:

Deploy green environment
Test green environment internally
Update Service selector to point to green
Monitor for issues
Keep blue running for quick rollback
Delete blue after success confirmation

Advantages:¶

Instant switch and rollback
Full testing before switch
Zero downtime

Disadvantages:¶

Double resources needed
Database migrations can be tricky
Session handling between versions

Canary Deployments¶

Gradually shift traffic to new version while monitoring.

Using Labels and Multiple Services¶

# Stable deployment (v1 - 90% traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-stable
spec:
  replicas: 9
  selector:
    matchLabels:
      app: myapp
      track: stable
  template:
    metadata:
      labels:
        app: myapp
        track: stable
    spec:
      containers:
      - name: app
        image: myapp:v1
---
# Canary deployment (v2 - 10% traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
      track: canary
  template:
    metadata:
      labels:
        app: myapp
        track: canary
    spec:
      containers:
      - name: app
        image: myapp:v2
---
# Service routes to both (weighted by replicas)
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp  # Matches both stable and canary
  ports:
  - port: 80
    targetPort: 8080

Traffic Distribution:

9 stable pods + 1 canary pod = ~10% to canary
Gradually increase canary replicas
Monitor metrics and errors
Rollback by deleting canary deployment
Promote by switching stable to v2

Progressive Canary¶

1. Deploy canary (1 replica) - 10% traffic
2. Monitor metrics for 10 minutes
3. If healthy, scale to 3 replicas - 30% traffic
4. Monitor for 20 minutes
5. If healthy, scale to 5 replicas - 50% traffic
6. Monitor for 30 minutes
7. Promote canary to stable
8. Scale down old stable

Flagger for Progressive Delivery¶

Flagger automates canary deployments:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: myapp
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      thresholdRange:
        max: 500
      interval: 1m
    webhooks:
    - name: load-test
      url: http://flagger-loadtester/

Flagger Process:

Detects new image in deployment
Creates canary deployment
Routes initial traffic (10%)
Monitors metrics every minute
Increases traffic by 10% if healthy
Promotes to stable if all checks pass
Rolls back automatically if metrics fail

A/B Testing¶

Route traffic based on user attributes, not just percentage.

Using Ingress Annotations¶

# Baseline deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-v1
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: app
        image: myapp:v1
---
# New feature deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-v2
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: app
        image: myapp:v2
---
# Main Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-main
spec:
  rules:
  - host: myapp.com
    http:
      paths:
      - path: /
        backend:
          service:
            name: myapp-v1
            port:
              number: 80
---
# A/B Test Ingress (canary based on cookie)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ab
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-cookie: "beta_user"
spec:
  rules:
  - host: myapp.com
    http:
      paths:
      - path: /
        backend:
          service:
            name: myapp-v2
            port:
              number: 80

Routing Rules:

Users with beta_user=always cookie → v2
All other users → v1
Can also route by header, query param, etc.

Feature Flags¶

Control features without deployment:

// In application code
if featureFlags.IsEnabled("new-ui", user) {
    return renderNewUI()
} else {
    return renderOldUI()
}

Tools:

LaunchDarkly
Split.io
Unleash (open-source)
ConfigCat

Kubernetes ConfigMap Integration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: feature-flags
data:
  flags.json: |
    {
      "new-ui": {
        "enabled": true,
        "rollout": 25,
        "users": ["beta@example.com"]
      }
    }

Shadow Deployments¶

Run new version in parallel, compare responses (don't return to users).

# Traffic Mirroring with Istio
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp
  http:
  - route:
    - destination:
        host: myapp-v1
      weight: 100
    mirror:
      host: myapp-v2
    mirrorPercentage:
      value: 100.0

Use Cases:

Performance testing with production load
Comparing response differences
Testing new algorithms

Deployment Health Checks¶

Readiness vs Liveness vs Startup¶

spec:
  containers:
  - name: app
    # Startup probe (for slow-starting apps)
    startupProbe:
      httpGet:
        path: /health
        port: 8080
      failureThreshold: 30
      periodSeconds: 10

    # Liveness probe (restart if failing)
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
      failureThreshold: 3

    # Readiness probe (remove from service if failing)
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
      failureThreshold: 3

Deployment Best Practices¶

Always use readiness probes - Prevent traffic to unhealthy pods
Set appropriate timeouts - Don't rush deployments
Monitor during rollout - Watch error rates, latency
Automate rollback - Use metrics-based triggers
Test thoroughly - Use staging environment
Database migrations - Handle schema changes carefully
Session management - Use external session store
Gradual rollout - Start small, increase slowly
Document process - Clear runbooks for rollback
Practice rollbacks - Test rollback procedures regularly

Deployment Metrics to Monitor¶

Error rate - 5xx responses
Latency - p50, p95, p99 response times
Throughput - Requests per second
Saturation - CPU, memory usage
Pod restarts - CrashLoopBackOff events

Rollback Strategies¶

kubectl rollout¶

# Check rollout status
kubectl rollout status deployment/myapp

# View history
kubectl rollout history deployment/myapp

# Rollback to previous
kubectl rollout undo deployment/myapp

# Rollback to specific revision
kubectl rollout undo deployment/myapp --to-revision=2

# Pause rollout
kubectl rollout pause deployment/myapp

# Resume rollout
kubectl rollout resume deployment/myapp

Key takeaways¶

Rolling updates gradually replace old Pods with new ones, providing zero-downtime deployments by default
Blue/green deployments run two identical environments and switch traffic instantly, making rollbacks instant
Canary deployments route a small percentage of traffic to the new version before full rollout, reducing blast radius
Flagger automates progressive delivery by analysing metrics and promoting or rolling back automatically
Proper health checks (readiness and liveness probes) are essential for any advanced deployment strategy to work correctly

Check your understanding¶

What is the main difference between a blue/green deployment and a canary deployment?
How does Kubernetes determine whether a rolling update can proceed to the next Pod?
What kubectl command pauses a rolling update mid-deployment?
What metric is Flagger typically configured to check before promoting a canary release?
When would you choose a shadow deployment over a canary deployment?

Solution

Blue/green switches all traffic at once between two full environments; canary gradually shifts a small percentage of traffic to the new version
The new Pod must pass its readiness probe before the old Pod is terminated
kubectl rollout pause deployment/<name>
Error rate (and/or request latency) — Flagger promotes the canary only if the error rate stays below the configured threshold
When you want to test the new version with real production traffic without any risk to users — shadow deployments mirror traffic but discard responses from the new version

Hands-on¶

Apply the concepts from this section in the lab exercises.

Next section¶

Once you've reviewed the content and completed the lab, proceed to the next section.