Pod Security Admission (PSA) Implementation Guide - AI-Optimized Technical Reference
Overview
What PSA Does:
- Enforces security standards at namespace level using labels
- Replaces Pod Security Policies (PSPs) removed in Kubernetes 1.25
- Built-in admission controller (no webhooks required)
- Validates pods during admission process before scheduling
Critical Context:
- PSA enabled by default since Kubernetes 1.23
- Mandatory migration for clusters upgrading past 1.24
- Namespace-level enforcement only (no per-pod granularity like PSPs)
- Three modes (enforce, audit, warn) can run simultaneously per namespace, each at its own level
Security Levels and Real-World Impact
Privileged Level
Configuration: No restrictions
Use Cases:
- System workloads (kube-system, monitoring)
- Legacy applications requiring root access
- CI/CD pipelines with Docker-in-Docker
Production Reality: Most production workloads end up here due to legacy constraints
Baseline Level
Configuration: Blocks obvious security disasters
Restrictions:
- No privileged containers
- No host networking/PID/IPC access
- No host path volumes
- Allows root user (UID 0)
Compatibility: Most semi-modern applications can run under baseline without major modifications
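To make the baseline restrictions concrete, here is a sketch of a pod that a baseline-enforced namespace would reject. The pod name, image, and mount path are hypothetical placeholders; the comments mark which fields trip which level.

```yaml
# Illustrative only: every name and the image reference are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: baseline-violations-demo
spec:
  hostNetwork: true              # rejected by baseline: host namespaces
  hostPID: true                  # rejected by baseline: host namespaces
  containers:
    - name: app
      image: registry.example.com/app:1.0
      securityContext:
        privileged: true         # rejected by baseline: privileged container
        runAsUser: 0             # allowed by baseline; only restricted rejects root
      volumeMounts:
        - name: host-root
          mountPath: /host
  volumes:
    - name: host-root
      hostPath:                  # rejected by baseline: hostPath volume
        path: /
```

Removing the three flagged settings is enough for baseline, even though the container still runs as UID 0.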
Restricted Level
Configuration: Full security enforcement
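Critical Requirements:
- Must run as non-root user (runAsNonRoot: true)
- No privilege escalation allowed (allowPrivilegeEscalation: false)
- All capabilities dropped; only NET_BIND_SERVICE may be added back
- Seccomp profile must be RuntimeDefault or Localhost
- Volume types limited to configMap, secret, emptyDir, persistentVolumeClaim, projected, downwardAPI, csi, and ephemeral
- Read-only root filesystem is not checked by PSA, but is commonly enabled alongside restricted as extra hardening
Implementation Reality:
- Breaks 95% of legacy applications immediately
- Requires extensive security context modifications
- Many Java applications fail once a read-only root filesystem is added (cannot write to temp directories)
- Database containers typically incompatible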
Implementation Configuration
Namespace Labels (Required)
apiVersion: v1
kind: Namespace
metadata:
  name: production-workloads
  labels:
    # Enforcement - actually blocks non-compliant pods
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: v1.29
    # Audit - logs violations without blocking
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: v1.29
    # Warn - shows warnings to users
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: v1.29
Critical Warning: Always pin versions or cluster upgrades will change enforcement rules unexpectedly
Cluster-Wide Configuration
Location: /etc/kubernetes/ on control plane nodes
Risk Level: HIGH - Malformed YAML will prevent cluster startup
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: PodSecurity
    configuration:
      # v1 requires Kubernetes 1.25+; use v1beta1 only on 1.23-1.24
      apiVersion: pod-security.admission.config.k8s.io/v1
      kind: PodSecurityConfiguration
      defaults:
        enforce: baseline  # Never start with restricted
        enforce-version: v1.29
      exemptions:
        namespaces: [kube-system, kube-public, kube-node-lease]
Backup Requirement: Always backup control plane configuration before changes
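The cluster-wide defaults can also carry audit and warn levels, so namespaces without labels still report restricted violations without blocking anything. A minimal sketch, assuming Kubernetes 1.25+ (for the v1 config API) and the same v1.29 pin used elsewhere in this guide; the empty usernames and runtimeClasses lists are shown only to make the available exemption dimensions visible.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: PodSecurity
    configuration:
      apiVersion: pod-security.admission.config.k8s.io/v1
      kind: PodSecurityConfiguration
      defaults:
        enforce: baseline          # blocks only the obvious problems
        enforce-version: v1.29
        audit: restricted          # recorded in the API server audit log
        audit-version: v1.29
        warn: restricted           # returned to kubectl users as warnings
        warn-version: v1.29
      exemptions:
        usernames: []              # authenticated users to skip, if any
        runtimeClasses: []         # runtime classes to skip, if any
        namespaces: [kube-system, kube-public, kube-node-lease]
```

Namespace labels still override these defaults, so a namespace labeled enforce: privileged is not affected by the baseline default.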
Exemption Requirements
System Namespaces (Always Exempt)
- kube-system: Core Kubernetes components
- kube-public: Public cluster information
- kube-node-lease: Node heartbeat mechanism
Common Production Exemptions
- istio-system: Service mesh requires privileges
- cert-manager: DNS challenge requirements
- monitoring: Host access for metrics collection
- gitlab-runner: CI/CD privilege requirements
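When the cluster-wide default is baseline, an infrastructure namespace that is not listed under exemptions needs an explicit privileged label. The sketch below assumes the monitoring namespace from the list above and keeps audit and warn at baseline, so violations stay visible even though nothing is blocked.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
  labels:
    # Nothing is blocked in this namespace
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/enforce-version: v1.29
    # Still record and surface what baseline would have rejected
    pod-security.kubernetes.io/audit: baseline
    pod-security.kubernetes.io/audit-version: v1.29
    pod-security.kubernetes.io/warn: baseline
    pod-security.kubernetes.io/warn-version: v1.29
```

Labeling keeps the exemption in version control next to the workload, whereas the exemptions list in the admission configuration requires control plane access to change.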
Implementation Timeline and Resource Requirements
Phase 1: Discovery (1-2 weeks)
Actions:
- Enable audit mode on all namespaces
- Collect violation data
- Identify exemption candidates
Expected Results: 500-4000+ audit violations depending on infrastructure age
Phase 2: Quick Wins (2-4 weeks)
Actions:
- Fix applications with existing security contexts
- Update recent microservices
- Implement exemptions for infrastructure components
Success Rate: 20-30% of applications if built after 2020
Phase 3: Legacy Application Remediation (2-6 months)
Challenges:
- Applications requiring root access
- Init containers needing privileges
- Third-party Helm charts without security contexts
- Database containers with filesystem requirements
Real-World Success Rate: 40-60% of applications achievable with significant effort
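Init containers are checked against the same level as regular containers, so each one needs its own container-level security context. The sketch below is one possible shape for a restricted-compliant Deployment; the names, images, and command are hypothetical, and fsGroup is used in place of a root chown init step on the assumption that the volume type supports ownership management (emptyDir does).

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: legacy-app                     # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: legacy-app
  template:
    metadata:
      labels:
        app: legacy-app
    spec:
      securityContext:                 # pod-level settings apply to init and app containers
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000                  # volume is made writable for this group
        seccompProfile:
          type: RuntimeDefault
      initContainers:
        - name: prepare-data
          image: registry.example.com/tools:1.0    # placeholder image
          command: ["sh", "-c", "mkdir -p /data/cache"]
          securityContext:             # must be repeated per container
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: data
              mountPath: /data
      containers:
        - name: app
          image: registry.example.com/legacy-app:1.0   # placeholder image
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          emptyDir: {}
```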
Phase 4: Acceptance (Ongoing)
Reality: 60-80% of production workloads remain in privileged namespaces permanently
Common Failure Scenarios
Monitoring Stack Failures
Root Cause: Monitoring agents require host access
Symptoms: Node exporters, log collectors cannot start
Solution: Exempt monitoring namespace to privileged mode
Example Error: violates PodSecurity "restricted:v1.29": host namespaces (hostNetwork=true)
CI/CD Pipeline Breakage
Root Cause: Build processes require privileged operations
Impact: Complete deployment pipeline failure
Resolution Timeline: Immediate exemption required for business continuity
Database Container Issues
Root Cause: Containers expect to run as root, modify filesystem permissions
Symptoms: PostgreSQL, MySQL containers fail to initialize
Required Fix: Custom container images with proper security contexts
Java Application Failures
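Root Cause: Cannot write to /tmp once readOnlyRootFilesystem: true is set (a hardening step often applied together with restricted, although PSA itself does not check it)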
Frequency: Nearly universal for pre-2020 Java applications
Solution: Add writable volume mounts for temporary directories
Security Context Remediation Patterns
Basic Non-Root Configuration
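spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault             # required by restricted
  containers:
    - name: app
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true   # extra hardening; not checked by PSA
        capabilities:
          drop:
            - ALL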
Writable Filesystem Requirements
spec:
  containers:
    - name: app
      volumeMounts:
        - name: tmp-volume
          mountPath: /tmp
        - name: var-log
          mountPath: /var/log
  volumes:
    - name: tmp-volume
      emptyDir: {}
    - name: var-log
      emptyDir: {}
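Putting the two patterns together, a complete pod for a Java workload might look like the sketch below. The pod name, image, and the 256Mi limit are illustrative assumptions. JAVA_TOOL_OPTIONS and java.io.tmpdir are standard JVM mechanisms, set here only to make the temp directory explicit.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: java-app                       # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/java-app:1.0   # placeholder image
      env:
        - name: JAVA_TOOL_OPTIONS
          value: "-Djava.io.tmpdir=/tmp"         # point the JVM at the writable mount
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true             # hardening on top of restricted
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: tmp-volume
          mountPath: /tmp
  volumes:
    - name: tmp-volume
      emptyDir:
        sizeLimit: 256Mi                         # assumed cap; tune per application
```

Setting sizeLimit keeps a runaway temp directory from filling the node's ephemeral storage; the pod is evicted when the limit is exceeded.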
Cloud Provider Implementations
Amazon EKS
Status: Enabled by default in 1.23+
Reliability: High - works as documented
Special Considerations: None
Google GKE
Standard Mode: Works correctly
Autopilot Mode: Additional restrictions applied by Google
Complexity: Medium - conflicts with GKE-specific policies
Azure AKS
Status: Enabled by default
Conflict: Fights with Azure Policy addon
Resolution: Disable one security system or accept conflicts
Troubleshooting Guide
Validation Commands
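# Check PSA enablement (self-managed control planes only; no output is normal,
# because PodSecurity is on by default and only appears when listed explicitly in the admission plugin flags)
kubectl get pods -n kube-system -l component=kube-apiserver -o yaml | grep PodSecurity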
# Verify namespace configuration
kubectl get namespace NAMESPACE -o yaml | grep pod-security
# Test with throwaway namespace
kubectl create namespace psa-test
kubectl label namespace psa-test pod-security.kubernetes.io/enforce=restricted
kubectl run test --image=nginx --namespace=psa-test
Common Error Patterns
Error Message | Root Cause | Solution |
---|---|---|
spec.securityContext.runAsUser: 0 | Running as root | Set runAsUser: 1000 and runAsNonRoot: true |
spec.containers[0].securityContext.privileged: true | Privileged container | Remove privileged flag or exempt namespace |
spec.containers[0].securityContext.capabilities.add[0]: SYS_ADMIN | Excessive capabilities | Remove the capability or exempt namespace (baseline also rejects SYS_ADMIN) |
violates PodSecurity restricted: hostNetwork | Host networking | Remove hostNetwork or exempt namespace |
Emergency Procedures
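# Immediate deployment fix - disable enforcement (--overwrite is needed when the label already exists)
kubectl label namespace NAMESPACE pod-security.kubernetes.io/enforce=privileged --overwrite
# Remove all PSA labels, including the version pins (a trailing "-" deletes a label);
# the namespace then falls back to the cluster-wide defaults
kubectl label namespace NAMESPACE pod-security.kubernetes.io/enforce- pod-security.kubernetes.io/enforce-version-
kubectl label namespace NAMESPACE pod-security.kubernetes.io/audit- pod-security.kubernetes.io/audit-version-
kubectl label namespace NAMESPACE pod-security.kubernetes.io/warn- pod-security.kubernetes.io/warn-version-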
Performance and Operational Impact
Engineering Velocity Impact
- First Month: -50% deployment speed due to debugging
- Learning Phase: 2-3 weeks per engineer for security context proficiency
- Ongoing Overhead: +30% time for new application deployment
Incident Response Considerations
- PSA violations appear as deployment failures
- Error messages often lack specific remediation guidance
- Debugging requires security context expertise
- Emergency exemption procedures must be documented
Migration from Pod Security Policies
Pre-Migration Assessment
# Identify active PSPs
kubectl get psp
kubectl get pods --all-namespaces -o custom-columns="NAMESPACE:.metadata.namespace,NAME:.metadata.name,PSP:.metadata.annotations.kubernetes\.io/psp"
PSP to PSA Mapping
- Most restrictive PSP → Baseline PSA (not Restricted)
- Standard PSP → Privileged PSA
- Permissive PSP → Privileged PSA
Migration Timeline Reality
- Documentation Claims: 6-8 weeks
- Actual Experience: 3-6 months for significant infrastructure
- Success Metrics: 40-60% of workloads achieve baseline enforcement
Post-Migration Cleanup
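# Remove PSP resources (requires cluster admin; the PSP API only exists on clusters older than 1.25)
kubectl delete psp --all
# kubectl does not expand wildcards in resource names, so select the psp:* objects explicitly
kubectl get clusterrole -o name | grep '/psp:' | xargs -r kubectl delete
kubectl get clusterrolebinding -o name | grep '/psp:' | xargs -r kubectl delete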
Tool Recommendations
Essential Tools
- kubectl dry-run: kubectl apply --dry-run=server - Validate before deployment
- Polaris: Pre-deployment PSA violation detection
- kube-score: Alternative validation tool
Documentation Resources
- Pod Security Standards: Primary reference for security levels
- Security Context Configuration: Essential for violation remediation
- Stack Overflow #kubernetes: Real-world solutions and troubleshooting
Monitoring and Alerting
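# Monitor PSA violations
# Enforce-mode rejections of controller-created pods surface as FailedCreate events
kubectl get events --all-namespaces --field-selector reason=FailedCreate
# Audit-mode violations are not events; they are written to the API server audit log
# under the pod-security.kubernetes.io/audit-violations annotation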
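PSA records audit-mode violations as annotations on API server audit events, so they are only visible if audit logging captures pod and workload writes. The policy below is a minimal sketch for self-managed control planes; managed services usually expose audit logs through their own settings instead. The resource list is an assumption about which workload types matter in a given cluster.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log metadata (which includes audit annotations) for pod and workload writes
  - level: Metadata
    verbs: ["create", "update"]
    resources:
      - group: ""
        resources: ["pods"]
      - group: "apps"
        resources: ["deployments", "replicasets", "statefulsets", "daemonsets"]
      - group: "batch"
        resources: ["jobs", "cronjobs"]
  # Drop everything else to keep the log small
  - level: None
```

With this in place, searching the audit log for pod-security.kubernetes.io/audit-violations yields one entry per offending request, which can feed the violation counts used in Phase 1.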
Decision Framework
When to Use Each Security Level
Choose Privileged When:
- Legacy applications requiring root access
- Infrastructure components (monitoring, service mesh)
- CI/CD pipelines with privileged operations
- Time constraints prevent security context remediation
Choose Baseline When:
- Modern applications with some security awareness
- Accepting that some container images will need modification
- Balancing security with operational complexity
Choose Restricted When:
- New applications designed for security
- Compliance requirements mandate strictest controls
- Engineering resources available for extensive testing
Cost-Benefit Analysis
Security Improvement: Moderate - prevents obvious misconfigurations
Implementation Cost: High - months of engineering effort
Maintenance Overhead: Medium - ongoing security context management
Compatibility Impact: High - significant application modifications required
Critical Success Factors
- Executive Support: Security context remediation requires significant engineering time
- Gradual Rollout: Never enable restricted mode cluster-wide immediately
- Exemption Strategy: Plan permanent exemptions for infrastructure components
- Training Investment: Engineers need security context expertise
- Realistic Timelines: Plan 3-6 months for mature infrastructure migration
- Emergency Procedures: Document rapid exemption process for production issues
Breaking Points and Limitations
PSA Cannot Prevent:
- Runtime privilege escalation by sophisticated attackers
- Container breakouts through kernel vulnerabilities
- Network-based attacks between pods
PSA Will Break:
- Most pre-2020 container images
- Helm charts without security context configuration
- Applications expecting filesystem write access
- System monitoring and debugging tools
Resource Requirements:
- Engineering Time: 2-6 months for comprehensive implementation
- Expertise: Security context configuration knowledge required
- Testing Infrastructure: Separate environments for PSA validation
- Documentation: Extensive runbooks for troubleshooting violations
Useful Links for Further Investigation
Resources That Actually Help (And Which Ones Suck)
Link | Description |
---|---|
Pod Security Standards | Actually fucking useful. This is the one doc that clearly explains what each security level blocks and why your pods are failing. Bookmark this shit, you'll need it constantly. |
PSA Configuration Reference | Good for reference. Dry as hell but actually accurate. Use this when you need to understand the YAML structure and don't want to guess. |
PSP to PSA Migration Guide | Overly optimistic garbage. Written by someone who clearly never had to migrate a real production cluster with 500+ deployments and 47 different monitoring agents. Makes migration sound like a fun weekend project when it's actually 3-6 months of debugging why your CI pipeline breaks every goddamn Tuesday. Read for context, but don't follow it blindly unless you enjoy explaining to management why every deployment is broken. |
Admission Controllers Overview | Skip this. Generic overview that doesn't help with PSA-specific issues. |
Security Context Configuration | Essential reading. You'll reference this constantly when fixing PSA violations. Shows you how to write proper security contexts. |
Security Context Constraints (OpenShift) | Useful if you're on OpenShift. Red Hat's approach is different but the concepts apply. |
EKS Pod Security Standards | Surprisingly decent. AWS actually managed to keep it simple for once. Works as advertised, which is shocking. |
GKE Pod Security Standards | Confusing as fuck. Google's docs assume you already understand their byzantine security model (spoiler: you don't). Autopilot mode makes it even worse by adding mystery meat restrictions. |
AKS Pod Security Standards | Fights with Azure Policy constantly. Microsoft's implementation technically works but immediately starts beefing with their other security features. Classic Microsoft. |
kubectl dry-run | Use this constantly. `kubectl apply --dry-run=server` saves you from deploying broken configs. |
Polaris | Actually useful. Catches PSA violations before deployment, which is way better than finding out in production. |
kube-score | Does the same thing as Polaris. Pick one, they're both fine. |
Falco | Overkill for PSA. Detects violations after they happen, but PSA should prevent them anyway. |
Stack Overflow #kubernetes | Where you'll actually find solutions that work. Search for "pod security admission violates" and you'll find your exact error message with someone who's already suffered through fixing it. |
Kubernetes Community Forums | Real war stories from the trenches. Engineers sharing how they actually implemented PSA, including all the spectacular failures nobody talks about in conference presentations. |
Kubernetes Slack #sig-auth | Good for really obscure edge cases. The actual maintainers hang out here, but their responses tend to be academic and assume you have unlimited time to read RFCs. |
CNCF #kubernetes-security | Hit or miss bullshit. Lots of theoretical discussion about security principles, very little practical "here's how to fix your broken deployment" advice. |
Kubernetes PSA Issues | Track known bugs. If PSA is behaving weirdly, check if it's a known issue. |
kubernetes/enhancements PSA KEPs | For the masochists. Read the design docs if you want to understand why PSA works the way it does. |
CIS Kubernetes Benchmark | Theoretical compliance. Good for checkbox security, not practical implementation guidance. |
NSA/CISA Kubernetes Hardening Guide | Actually practical. NSA/CISA guide is surprisingly useful for real-world hardening. |
PSA Troubleshooting Checklist | Start here when things break. Basic debugging steps. |
kubectl debug | For debugging pod failures. Essential when PSA blocks your deployments. |