Storage
Duration: 50 minutes (20 minutes theory + 30 minutes lab)
Learning Objectives
- Understand the difference between ephemeral and persistent storage
- Learn about Volumes, PersistentVolumes (PV), and PersistentVolumeClaims (PVC)
- Understand StorageClasses and dynamic provisioning
- Mount persistent storage in Pods
- Understand volume access modes and reclaim policies
Storage in Docker Compose vs Kubernetes
Docker Compose Volumes
services:
  database:
    image: postgres:16-alpine
    volumes:
      - db-data:/var/lib/postgresql/data  # Named volume
      - ./config:/etc/config:ro           # Bind mount
volumes:
  db-data:  # Managed by Docker
Kubernetes Volumes
In Kubernetes, storage is more explicit and flexible:
- Volumes: Defined at Pod level, tied to Pod lifecycle
- PersistentVolumes (PV): Cluster-level storage resources
- PersistentVolumeClaims (PVC): Requests for storage
- StorageClasses: Templates for dynamic provisioning
Storage Concepts
1. Volume Types
Kubernetes supports many volume types. Common ones:
Ephemeral Volumes (Pod lifecycle):
- emptyDir: Temporary directory, deleted when the Pod is removed (it survives container restarts)
- configMap: Mount ConfigMap data
- secret: Mount Secret data
- downwardAPI: Expose Pod metadata
Persistent Volumes (Independent lifecycle):
- persistentVolumeClaim: Reference to a PVC
- hostPath: Mount file/directory from node (for testing only)
- Cloud-specific: awsElasticBlockStore, azureDisk, gcePersistentDisk (legacy in-tree types, deprecated or removed in recent Kubernetes releases in favor of the providers' CSI drivers)
- Network storage: nfs, cephfs, glusterfs
- local: Local storage on specific nodes
2. emptyDir Volumes
Temporary storage that exists as long as the Pod exists.
apiVersion: v1
kind: Pod
metadata:
  name: cache-pod
spec:
  containers:
    - name: app
      image: nginx:1.25-alpine
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir: {}  # Stored on node's disk
      # emptyDir:
      #   medium: Memory  # Use tmpfs (RAM)
Use cases:
- Temporary cache
- Scratch space for computations
- Sharing data between containers in same Pod
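The last use case, sharing data between containers in the same Pod, can be sketched as two containers mounting one emptyDir. This is a minimal illustration only; the Pod name, container names, and file path are hypothetical and not taken from the lab.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch-pod  # hypothetical name
spec:
  containers:
    - name: writer
      image: busybox:1.36
      # Appends a timestamp to a file in the shared volume every 5 seconds
      command: ["/bin/sh", "-c", "while true; do date >> /scratch/out.log; sleep 5; done"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
    - name: reader
      image: busybox:1.36
      # Waits for the file to appear, then follows it through a read-only mount
      command: ["/bin/sh", "-c", "until [ -f /scratch/out.log ]; do sleep 1; done; tail -f /scratch/out.log"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
          readOnly: true
  volumes:
    - name: scratch
      emptyDir: {}  # one volume, visible to both containers
```

Both containers see the same directory because the volume is defined once at Pod level and mounted twice.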
3. PersistentVolumes and PersistentVolumeClaims
PersistentVolume (PV): Cluster-level storage resource provisioned by admin or dynamically
PersistentVolumeClaim (PVC): User request for storage
flowchart TB
Storage["Storage Layer<br/>(Network Storage, Cloud Disks, etc.)"]
PV["PersistentVolume (PV)<br/>• Storage implementation<br/>• Admin-provisioned or dynamic<br/>• Cluster-wide resource"]
PVC["PersistentVolumeClaim (PVC)<br/>• Storage request by user<br/>• Specifies size and access mode<br/>• Namespace-scoped"]
Pod["Pod<br/>• Mounts PVC as volume"]
Storage --> PV
PV -- Bound --> PVC
PVC -- Referenced --> Pod
4. Access Modes
- ReadWriteOnce (RWO): Volume can be mounted read-write by a single node; multiple Pods on that node can still share it
- ReadOnlyMany (ROX): Volume can be mounted read-only by many nodes
- ReadWriteMany (RWX): Volume can be mounted read-write by many nodes
- ReadWriteOncePod (RWOP): Volume can be mounted read-write by a single Pod in the whole cluster
Not all volume types support all access modes.
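Access modes are requested on the claim. As a sketch, a PVC that must be used by only one Pod at a time could ask for ReadWriteOncePod. The claim name and size are hypothetical, and this mode is only supported for CSI-backed volumes on reasonably recent Kubernetes versions, so treat its availability in your cluster as an assumption.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer-claim  # hypothetical name
spec:
  accessModes:
    - ReadWriteOncePod  # at most one Pod in the cluster may use the volume
  resources:
    requests:
      storage: 1Gi
```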
5. Reclaim Policies
What happens to the PV when its PVC is deleted:
- Retain: Manual reclamation - PV remains, data preserved
- Delete: PV and underlying storage are deleted
- Recycle: (Deprecated) Basic scrub (rm -rf /volume/*)
6. StorageClasses
Define classes of storage with different properties.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com  # AWS EBS CSI driver (replaces the deprecated in-tree kubernetes.io/aws-ebs)
parameters:
  type: gp3
  iops: "3000"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
Properties:
- provisioner: Creates volumes (cloud provider, CSI driver, etc.)
- parameters: Provider-specific settings
- reclaimPolicy: What happens when PVC is deleted
- volumeBindingMode:
  - Immediate: Provision volume when PVC is created
  - WaitForFirstConsumer: Provision when Pod using PVC is scheduled
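To see how these properties combine, here is a sketch of a second StorageClass that keeps data after the claim is deleted. It assumes the Rancher local-path provisioner is installed (provisioner name rancher.io/local-path); the class name is hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path-retain  # hypothetical name
provisioner: rancher.io/local-path  # assumes the Rancher local-path provisioner is running
reclaimPolicy: Retain               # PVs created from this class are kept when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer  # provision only once a Pod using the PVC is scheduled
```

A PVC selects this class by setting storageClassName to the class name.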
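In kind (Local Development)
kind ships with a default StorageClass named standard, backed by a local-path provisioner, so PVCs that omit storageClassName are provisioned dynamically. Other options for local testing:
- hostPath volumes: Testing only, data stored on node
- local-path provisioner: Optional addon that installs a StorageClass named local-path, which the examples in this section reference
- emptyDir: Ephemeral storage
To get the local-path StorageClass, install the Rancher local-path provisioner. Marking it as the default is optional; if you do, make sure only one StorageClass carries the default annotation:
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'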
Example: Static Provisioning
Step 1: Create PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data  # Path on the node (testing only)
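Step 2: Create PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-static
spec:
  storageClassName: ""  # Empty string: skip the default StorageClass so the claim binds to a pre-created PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi  # Must be <= PV capacity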
Kubernetes automatically binds the PVC to an available PV whose capacity, access modes, and storage class satisfy the claim.
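Binding normally picks any PV that satisfies the claim. If a claim has to land on one particular PV, it can be pinned by name with the volumeName field, as in this sketch. The claim name is hypothetical, and since a PV binds to only one claim, this is an alternative to pvc-static rather than an addition to it.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-pinned  # hypothetical name
spec:
  storageClassName: ""   # empty string: no dynamic provisioning
  volumeName: pv-static  # bind to this specific PV instead of any matching one
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
```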
Step 3: Use PVC in Pod
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-pvc
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["/bin/sh", "-c"]
      args:
        - |
          echo "Writing to persistent storage"
          echo "Data: $(date)" > /data/timestamp.txt
          cat /data/timestamp.txt
          sleep 3600
      volumeMounts:
        - name: storage
          mountPath: /data
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: pvc-static
Example: Dynamic Provisioning
With a StorageClass configured (like local-path in kind):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-dynamic
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path  # References StorageClass
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-with-storage
spec:
  replicas: 1  # RWO: the volume can be mounted on only one node at a time
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: app
          image: busybox:1.36
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Create/append to file
              echo "Pod started at: $(date)" >> /data/log.txt
              echo "Current log:"
              cat /data/log.txt
              sleep 3600
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: pvc-dynamic
The StorageClass provisioner creates a matching PersistentVolume automatically; with volumeBindingMode WaitForFirstConsumer, this happens once the Pod is scheduled.
StatefulSets and Storage
StatefulSets provide:
- Stable, unique network identifiers
- Stable, persistent storage
- Ordered, graceful deployment and scaling
Each Pod in a StatefulSet gets its own PVC:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
spec:
  serviceName: database
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:16-alpine
          env:
            - name: POSTGRES_PASSWORD
              value: password
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:  # Creates PVC for each Pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: local-path
        resources:
          requests:
            storage: 5Gi
This creates:
- database-0 with PVC data-database-0
- database-1 with PVC data-database-1
- database-2 with PVC data-database-2
Each Pod gets its own persistent storage that survives Pod restarts.
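The StatefulSet references serviceName: database, but the Service itself is not shown. A headless Service along the following lines is what typically backs that field. Port 5432 is assumed here as the PostgreSQL default; adjust it if your image listens elsewhere.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: database  # must match spec.serviceName in the StatefulSet
spec:
  clusterIP: None  # headless: DNS resolves to the individual Pod IPs
  selector:
    app: db        # matches the Pod labels in the StatefulSet template
  ports:
    - name: postgres
      port: 5432   # assumed PostgreSQL default port
```

With this in place, each Pod gets a stable DNS name of the form database-0.database within the namespace.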
Volume Snapshots
Some storage systems support snapshots for backup/restore:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: pvc-snapshot
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: pvc-dynamic
Requires CSI driver support (not available in basic kind setup).
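The restore half of backup/restore is a new PVC that names the snapshot as its data source. The following is a sketch under the same CSI assumption; the claim name and the csi-storage class name are hypothetical placeholders for whatever snapshot-capable class your cluster provides.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-restored  # hypothetical name
spec:
  storageClassName: csi-storage  # hypothetical; must be a CSI-backed class that supports snapshots
  dataSource:
    name: pvc-snapshot           # the VolumeSnapshot to restore from
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi  # at least the size of the snapshotted volume
```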
Best Practices
- Use PVCs, not direct PVs: Let Kubernetes handle binding
- Choose appropriate access modes: Most apps need RWO
- Set resource requests: Specify required storage size
- Use StorageClasses: Enable dynamic provisioning
- Plan for backups: Volume snapshots or external backup tools
- Consider StatefulSets: For stateful applications needing persistent identities
- Test disaster recovery: Verify your data survives Pod/Node failures
- Monitor storage usage: Track PVC usage and available capacity
- Use appropriate reclaim policies: Retain for production data
- Avoid hostPath in production: Use proper storage solutions
Storage for Different Workloads
| Workload | Storage Type | Access Mode | Notes |
|---|---|---|---|
| Database (single) | PVC | RWO | Use StatefulSet |
| Database (clustered) | PVC per replica | RWO | StatefulSet with volumeClaimTemplates |
| Shared files | PVC | RWX | Requires RWX-capable storage (NFS, CephFS) |
| Logs | emptyDir + log collector | - | Use DaemonSet for log collection |
| Cache | emptyDir (Memory) | - | Temporary, fast access |
| Config files | ConfigMap | - | Small configuration files |
| Build workspace | emptyDir | - | Temporary build artifacts |
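The Cache row can be sketched as a memory-backed emptyDir with a size cap. The Pod name and the 256Mi limit are hypothetical values chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-cache-pod  # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:1.25-alpine
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir:
        medium: Memory    # tmpfs; contents count against the container's memory usage
        sizeLimit: 256Mi  # hypothetical cap on how much the cache may hold
```

Because the data lives in RAM, it is lost when the Pod is removed, which is acceptable for a cache but not for anything that must persist.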
Troubleshooting Storage
# List PVs
kubectl get pv
# List PVCs
kubectl get pvc
# Describe PVC for details and events
kubectl describe pvc <pvc-name>
# Check StorageClasses
kubectl get storageclass
# Check Pod events for mount issues
kubectl describe pod <pod-name>
# View logs from provisioner (the local-path provisioner runs in the local-path-storage namespace)
kubectl logs -n local-path-storage -l app=local-path-provisioner
Common issues:
- PVC pending: No available PV or StorageClass
- Pod pending: PVC not bound or access mode incompatible
- Mount failed: Permissions, path doesn't exist, or driver issue
Docker Compose to Kubernetes: Storage
Docker Compose
services:
  app:
    image: myapp:latest
    volumes:
      - app-data:/var/lib/app
      - ./config:/etc/config:ro
volumes:
  app-data:
Kubernetes
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  config.yaml: |
    # configuration here
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: app
          image: myapp:latest
          volumeMounts:
            - name: data
              mountPath: /var/lib/app
            - name: config
              mountPath: /etc/config
              readOnly: true
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-data
        - name: config
          configMap:
            name: app-config
Key differences:
- Compose: Volumes managed automatically
- K8s: Explicit PVC creation and binding
- Compose: Bind mounts for local files
- K8s: ConfigMaps/Secrets for configuration
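For sensitive files that Compose would bind-mount from the host, the Kubernetes counterpart is a Secret mounted as a volume. The sketch below uses hypothetical names and a placeholder value; it only illustrates the mount mechanics.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets  # hypothetical name
type: Opaque
stringData:
  db-password: change-me  # placeholder value, not a real credential
---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-secret  # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36
      # Lists the mounted secret files, then idles
      command: ["/bin/sh", "-c", "ls /etc/secrets && sleep 3600"]
      volumeMounts:
        - name: secrets
          mountPath: /etc/secrets
          readOnly: true
  volumes:
    - name: secrets
      secret:
        secretName: app-secrets  # each key becomes a file under /etc/secrets
```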
Examples
Check examples/ for sample manifests.
Key takeaways
- Kubernetes separates storage provisioning (PV) from consumption (PVC)
- emptyDir is ephemeral, PVC provides persistence
- StorageClasses enable dynamic provisioning
- Access modes determine how volumes can be mounted
- StatefulSets provide per-Pod persistent storage
- Choose storage solutions based on workload requirements
Check your understanding
- What is the difference between a Volume and a PersistentVolume?
- What is the difference between emptyDir and a PersistentVolumeClaim?
- What does ReadWriteOnce (RWO) mean?
- How does a StorageClass enable dynamic provisioning?
- What happens to a PVC when the Pod using it is deleted?
- Which Kubernetes workload type is most suitable for stateful applications like databases?
Solution
- A Volume is tied to a Pod's lifecycle; a PersistentVolume is a cluster-level resource that exists independently of any Pod
- emptyDir is deleted when the Pod is terminated; PVC persists independently of Pod lifecycle
- The volume can be mounted read-write by a single node at a time
- StorageClasses define a provisioner and its parameters; when a PVC references a StorageClass, the provisioner automatically creates a matching PersistentVolume
- The PVC remains intact (and still Bound), so the data is kept; only deleting the PVC releases the PV, after which the reclaim policy decides whether the data is retained or removed
- StatefulSet: provides stable network identity and per-Pod PVCs
Hands-on
Apply the concepts from this section in the lab exercises.
Next section
Once you've reviewed the content and completed the lab, proceed to the next section.