Why are my pods being rejected by PSA?

PSA enforces security constraints that many applications weren't designed to meet. Common violations include:

- Running containers as root user (UID 0)
- Missing or incomplete security contexts
- Using privileged containers for system access
- Mounting host directories or using restricted volume types
- Allowing privilege escalation (`allowPrivilegeEscalation` is true unless you explicitly set it to false)

Fix this by adding proper security contexts to your pod specs, or just say "fuck it" and set the namespace to privileged mode if your app legitimately needs root access and you don't have time to rewrite the entire application stack.
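For reference, here is a rough sketch of what a "proper security context" looks like for the restricted level. The function name, the pod name, and the image are placeholders invented for this example, and UID 1000 is an assumption that only works if the image can actually run as that user.

```bash
# Print a pod manifest that sets the fields the "restricted" level checks.
# Arguments are a pod name and an image; both are placeholders.
# UID 1000 is an assumption and must match a user the image supports.
restricted_pod_manifest() {
  local name="$1" image="$2"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: ${name}
    image: ${image}
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
EOF
}
```

Pipe the output into a server-side dry run to see whether the namespace would accept it, for example `restricted_pod_manifest web registry.example.com/web:1.0 | kubectl apply --dry-run=server -n your-namespace -f -`.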

How do I fix CI/CD pipeline issues with PSA?

CI/CD systems are fundamentally incompatible with PSA because they need to do sketchy shit like mounting Docker sockets and running privileged builds. PSA takes one look at this and goes "absolutely fucking not."

Quick fix (what everyone actually does because we have deadlines):

```bash
# --overwrite is needed if the namespace already carries an enforce label
kubectl label namespace ci-system --overwrite pod-security.kubernetes.io/enforce=privileged
```

"Proper" longer-term approaches (that nobody actually implements because life is short):

- Using rootless build tools like kaniko or buildah (good luck getting them to work correctly on your first try... or your tenth)
- Running builds in dedicated privileged namespaces (defeats the entire fucking purpose of PSA but whatever)
- Remote build services that don't run in the cluster (expensive, complicated, and your networking team will hate you)

How do I temporarily disable PSA enforcement?

When you need to quickly restore service during an outage:

```bash
# Set namespace to privileged mode (--overwrite is required if the label already exists)
kubectl label namespace your-namespace --overwrite pod-security.kubernetes.io/enforce=privileged

# Or remove enforcement labels entirely
kubectl label namespace your-namespace pod-security.kubernetes.io/enforce-
```

Removing the label drops the namespace back to the cluster-wide default, which is privileged unless someone changed it in the admission configuration.

This fixes your immediate problem but you'll probably forget to re-enable security later. Most people document "temporary" privileged namespaces and then they stay that way forever.
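One way to make "temporary" exemptions harder to forget is to stamp the namespace with a reason and a timestamp at the moment you flip it. This is only a sketch: the function names and the `example.com/...` annotation keys are invented for illustration, and PSA itself ignores them.

```bash
# Flip a namespace to privileged and record why and when.
# The annotation keys below are hypothetical; pick your own domain.
psa_break_glass() {
  local ns="$1" reason="$2"
  kubectl label namespace "$ns" --overwrite \
    pod-security.kubernetes.io/enforce=privileged
  kubectl annotate namespace "$ns" --overwrite \
    "example.com/psa-exempt-reason=${reason}" \
    "example.com/psa-exempt-since=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

# List namespaces with their recorded exemption reason.
# Namespaces without the annotation show <none>.
psa_list_exemptions() {
  kubectl get namespaces \
    -o custom-columns='NAME:.metadata.name,REASON:.metadata.annotations.example\.com/psa-exempt-reason'
}
```

During an incident you would run something like `psa_break_glass your-namespace "checkout outage"`, and `psa_list_exemptions` gives you the list to clean up afterwards.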

Why do monitoring tools fail with PSA enabled?

Monitoring agents like Prometheus node exporter, Datadog agents, and log collectors need to read host metrics, mount `/proc`, and generally do all the things that PSA considers evil. PSA takes one look at your monitoring stack and says "fuck no" to basically everything.

How people actually fix this in the real world:

1. **Dedicated monitoring namespace**: Exempt the monitoring namespace and call it a day (what 95% of people do)
2. **Baseline security level**: Try baseline and hope it doesn't break your monitoring (spoiler: it probably will anyway)
3. **eBPF-based tools**: Fancy new monitoring that doesn't need root access (if you can afford the licensing and have 6 months to migrate everything)
4. **Remote monitoring**: Use external SaaS services (expensive but works, until your CFO sees the bill)

Most people just exempt the monitoring namespace because debugging why node-exporter can't read `/proc/stat` at 2 AM on a Saturday while your on-call alerts are going off is not how anyone wants to spend their weekend. I spent 4 hours on this exact issue once - turns out node-exporter needs `hostNetwork: true` and `hostPID: true` to function, which PSA restricted mode blocks faster than you can say "incident response."

```bash
kubectl label namespace monitoring --overwrite pod-security.kubernetes.io/enforce=privileged
```

Can I just ignore the warnings and deploy anyway?

Yes, warnings don't block deployments. But ignoring them is like ignoring the check engine light in your car - everything works fine until it suddenly doesn't. The warn mode is basically PSA's way of saying "this is stupid but I'll allow it." Eventually you'll need to fix it or accept that your security posture is garbage.

How do I debug "violates PodSecurity" errors?

The error messages are intentionally vague because Kubernetes hates you. Here's what actually helps:

```bash
# Get the full error message (still won't help much)
kubectl describe pod your-broken-pod

# Check what security level is enforced
kubectl get namespace your-namespace -o yaml | grep pod-security

# Compare against what your pod is trying to do
kubectl get pod your-broken-pod -o yaml | grep -A 20 securityContext
```

Most common violations:

- `runAsUser: 0` (running as root) - **Error: "spec.securityContext.runAsUser: 0"**
- Missing `runAsNonRoot: true` - **Error: "spec.securityContext.runAsNonRoot: false"**
- `privileged: true` anywhere - **Error: "spec.containers[0].securityContext.privileged: true"**
- Capabilities that PSA doesn't like - **Error: "spec.containers[0].securityContext.capabilities.add[0]: SYS_ADMIN"**

Pro tip: The pod probably runs as root. It's always running as root.

Does this actually make my cluster more secure?

PSA prevents some obvious security mistakes, but it's mostly security theater. If your main concern is "developers accidentally running privileged containers," then yes, it helps. If your threat model includes "sophisticated attackers who have already gained access to your cluster," then PSA is about as useful as a screen door on a submarine. Real security comes from proper RBAC, network policies, and not running random containers from the internet.

What's the difference between this and the old Pod Security Policies?

PSPs were a nightmare to configure and debug. PSA is much simpler but also less flexible.

**PSPs**: Required a PhD in Kubernetes RBAC to understand which policy applied to which pod. Debugging took hours.

**PSA**: Uses simple namespace labels. When it breaks, you know immediately what's wrong.

Trade-off: PSPs could do complex per-pod rules. PSA is one-size-fits-all per namespace.

How long does it take to fix all the violations?

Plan for 2-6 months if you have any legacy applications, and that's being optimistic. Here's what actually happens in the real world:

- Week 1: Discover that literally everything violates restricted policies because your entire infrastructure was built in 2018
- Week 2: Fix the easy stuff (those 3 new microservices that actually have proper security contexts)
- Week 3-8: Fight tooth and nail with legacy applications and third-party Helm charts that were written by people who thought security was optional
- Week 9: Give up, exempt 80% of your namespaces, and start drinking heavily
- Week 10: Declare victory with baseline enforcement on 3 namespaces and hope management doesn't ask too many questions

The audit phase will show you 500+ violations on day one. I tried to fix them all once - biggest mistake of my career. Spent 3 months debugging security contexts and ended up with more gray hair. Just fix the critical apps and exempt the rest. Life's too short to debug why a 5-year-old Java app can't write to `/tmp`, and your sanity is worth more than perfect security compliance.

Can I use PSA with service meshes like Istio?

Istio does whatever it wants and ignores most security policies. Put Istio in its own exempt namespace and don't ask questions:

```bash
kubectl label namespace istio-system --overwrite pod-security.kubernetes.io/enforce=privileged
```

Same goes for most CNI plugins, ingress controllers, and anything else that considers itself "infrastructure."

Pod Security Admission (PSA) Implementation Guide - AI-Optimized Technical Reference

Overview

What PSA Does:

Enforces security standards at namespace level using labels
Replaces Pod Security Policies (PSPs) removed in Kubernetes 1.25
Built-in admission controller (no webhooks required)
Validates pods during admission process before scheduling

Critical Context:

PSA enabled by default since Kubernetes 1.23
Mandatory migration for clusters upgrading past 1.24
Namespace-level enforcement only (no per-pod granularity like PSPs)
Three enforcement modes can run simultaneously per namespace

Security Levels and Real-World Impact

Privileged Level

Configuration: No restrictions
Use Cases:

System workloads (kube-system, monitoring)
Legacy applications requiring root access
CI/CD pipelines with Docker-in-Docker

Production Reality: Most production workloads end up here due to legacy constraints

Baseline Level

Configuration: Blocks obvious security disasters
Restrictions:

No privileged containers
No host networking/PID/IPC access
No host path volumes
Allows root user (UID 0)

Compatibility: Most semi-modern applications can run under baseline without major modifications

Restricted Level

Configuration: Full security enforcement
Critical Requirements:

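Must run as non-root user (`runAsNonRoot: true`)
No privilege escalation allowed (`allowPrivilegeEscalation: false`)
All capabilities dropped; only `NET_BIND_SERVICE` may be added back
Seccomp profile must be `RuntimeDefault` or `Localhost`
Volume types limited to configMap, secret, emptyDir, projected, downwardAPI, persistentVolumeClaim, csi, and ephemeral
Read-only root filesystem is not checked by PSA, though it is commonly set in the same hardening pass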

Implementation Reality:

Breaks 95% of legacy applications immediately
Requires extensive security context modifications
Most Java applications fail once `readOnlyRootFilesystem: true` is added alongside (cannot write to temp directories)
Database containers typically incompatible

Implementation Configuration

Namespace Labels (Required)

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production-workloads
  labels:
    # Enforcement - actually blocks non-compliant pods
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: v1.29

    # Audit - logs violations without blocking
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: v1.29

    # Warn - shows warnings to users
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: v1.29
```

Critical Warning: Always pin versions or cluster upgrades will change enforcement rules unexpectedly
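A minimal sketch of applying all six labels in one pinned step, for namespaces that already exist. The function name and argument order are assumptions made for this example; the label keys are the ones shown above.

```bash
# Apply enforce/audit/warn with a pinned version in a single kubectl call.
# Audit and warn share one level here, mirroring the manifest above.
psa_set_levels() {
  local ns="$1" enforce="$2" audit="$3" version="$4" level
  for level in "$enforce" "$audit"; do
    case "$level" in
      privileged|baseline|restricted) ;;
      *) echo "unknown level: $level" >&2; return 1 ;;
    esac
  done
  kubectl label namespace "$ns" --overwrite \
    "pod-security.kubernetes.io/enforce=${enforce}" \
    "pod-security.kubernetes.io/enforce-version=${version}" \
    "pod-security.kubernetes.io/audit=${audit}" \
    "pod-security.kubernetes.io/audit-version=${version}" \
    "pod-security.kubernetes.io/warn=${audit}" \
    "pod-security.kubernetes.io/warn-version=${version}"
}
```

Example call: `psa_set_levels production-workloads baseline restricted v1.29`. Rejecting unknown levels locally avoids a round trip to the API server for a typo.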

Cluster-Wide Configuration

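Location: Typically under /etc/kubernetes/ on control plane nodes, referenced by the API server's `--admission-control-config-file` flag
Risk Level: HIGH - Malformed YAML will prevent the API server from starting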

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    # v1 requires Kubernetes 1.25+; use v1beta1 on 1.23 and 1.24
    apiVersion: pod-security.admission.config.k8s.io/v1
    kind: PodSecurityConfiguration
    defaults:
      enforce: baseline  # Never start with restricted
      enforce-version: v1.29
    exemptions:
      namespaces: [kube-system, kube-public, kube-node-lease]
```

Backup Requirement: Always backup control plane configuration before changes
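A small sketch of the backup step, assuming the configuration lives in a single file. The function name is invented, and the path in the usage note is hypothetical; use whatever your API server actually points at.

```bash
# Copy a config file to a timestamped sibling and print the backup path.
# -p preserves mode and timestamps so the backup can be restored as-is.
backup_config() {
  local src="$1" dest
  dest="${src}.bak.$(date -u +%Y%m%dT%H%M%SZ)"
  cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}
```

For example, `backup_config /etc/kubernetes/psa-config.yaml` before editing. Restoring is a plain `cp` of the printed path back over the original.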

Exemption Requirements

System Namespaces (Always Exempt)

kube-system - Core Kubernetes components
kube-public - Public cluster information
kube-node-lease - Node heartbeat mechanism

Common Production Exemptions

istio-system - Service mesh requires privileges
cert-manager - DNS challenge requirements
monitoring - Host access for metrics collection
gitlab-runner - CI/CD privilege requirements

Implementation Timeline and Resource Requirements

Phase 1: Discovery (1-2 weeks)

Actions:

Enable audit mode on all namespaces
Collect violation data
Identify exemption candidates

Expected Results: 500-4000+ audit violations depending on infrastructure age
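Before labeling anything for real, a server-side dry run can preview which namespaces would produce warnings at a given level. The sketch below wraps that in two helper functions; the function names are invented for this example.

```bash
# Server-side dry run: nothing is persisted, but the API server prints a
# warning for each namespace whose existing pods would violate the level.
psa_preview() {
  local level="$1"
  kubectl label --dry-run=server --overwrite namespaces --all \
    "pod-security.kubernetes.io/enforce=${level}"
}

# Show the levels each namespace currently carries as extra columns.
psa_show_levels() {
  kubectl get namespaces \
    -L pod-security.kubernetes.io/enforce \
    -L pod-security.kubernetes.io/audit \
    -L pod-security.kubernetes.io/warn
}
```

Running `psa_preview baseline` first and `psa_preview restricted` second gives a rough sense of how far each namespace is from each level.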

Phase 2: Quick Wins (2-4 weeks)

Actions:

Fix applications with existing security contexts
Update recent microservices
Implement exemptions for infrastructure components

Success Rate: 20-30% of applications if built after 2020

Phase 3: Legacy Application Remediation (2-6 months)

Challenges:

Applications requiring root access
Init containers needing privileges
Third-party Helm charts without security contexts
Database containers with filesystem requirements

Real-World Success Rate: 40-60% of applications achievable with significant effort

Phase 4: Acceptance (Ongoing)

Reality: 60-80% of production workloads remain in privileged namespaces permanently

Common Failure Scenarios

Monitoring Stack Failures

Root Cause: Monitoring agents require host access
Symptoms: Node exporters, log collectors cannot start
Solution: Exempt monitoring namespace to privileged mode

Example Error: violates PodSecurity "restricted:v1.29": host namespaces (hostNetwork=true)

CI/CD Pipeline Breakage

Root Cause: Build processes require privileged operations
Impact: Complete deployment pipeline failure
Resolution Timeline: Immediate exemption required for business continuity

Database Container Issues

Root Cause: Containers expect to run as root, modify filesystem permissions
Symptoms: PostgreSQL, MySQL containers fail to initialize
Required Fix: Custom container images with proper security contexts

Java Application Failures

Root Cause: Cannot write to /tmp once `readOnlyRootFilesystem: true` is set (a common hardening step, not itself a PSA check)
Frequency: Nearly universal for pre-2020 Java applications
Solution: Add writable volume mounts for temporary directories

Security Context Remediation Patterns

Basic Non-Root Configuration

```yaml
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
    # Restricted also requires a seccomp profile
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
```

Writable Filesystem Requirements

```yaml
spec:
  containers:
  - name: app
    volumeMounts:
    - name: tmp-volume
      mountPath: /tmp
    - name: var-log
      mountPath: /var/log
  volumes:
  - name: tmp-volume
    emptyDir: {}
  - name: var-log
    emptyDir: {}
```

Cloud Provider Implementations

Amazon EKS

Status: Enabled by default in 1.23+
Reliability: High - works as documented
Special Considerations: None

Google GKE

Standard Mode: Works correctly
Autopilot Mode: Additional restrictions applied by Google
Complexity: Medium - conflicts with GKE-specific policies

Azure AKS

Status: Enabled by default
Conflict: Fights with Azure Policy addon
Resolution: Disable one security system or accept conflicts

Troubleshooting Guide

Validation Commands

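```bash
# Check PSA enablement. PodSecurity is on by default since 1.23, so empty
# output only means it is not named explicitly in the API server flags.
# Managed control planes usually do not expose these pods at all.
kubectl get pods -n kube-system -l component=kube-apiserver -o yaml | grep PodSecurity

# Verify namespace configuration
kubectl get namespace NAMESPACE -o yaml | grep pod-security

# Test with throwaway namespace; the nginx pod should be rejected
kubectl create namespace psa-test
kubectl label namespace psa-test pod-security.kubernetes.io/enforce=restricted
kubectl run test --image=nginx --namespace=psa-test

# Clean up
kubectl delete namespace psa-test
```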

Common Error Patterns

| Error Message | Root Cause | Solution |
| --- | --- | --- |
| `spec.securityContext.runAsUser: 0` | Running as root | Set `runAsUser: 1000` and `runAsNonRoot: true` |
| `spec.containers[0].securityContext.privileged: true` | Privileged container | Remove privileged flag or exempt namespace |
| `spec.containers[0].securityContext.capabilities.add[0]: SYS_ADMIN` | Excessive capabilities | Remove capabilities or use baseline mode |
| `violates PodSecurity restricted: hostNetwork` | Host networking | Remove hostNetwork or exempt namespace |

Emergency Procedures

```bash
# Immediate deployment fix - disable enforcement
# (--overwrite is required when the label already exists)
kubectl label namespace NAMESPACE --overwrite pod-security.kubernetes.io/enforce=privileged

# Remove all PSA labels (namespace falls back to the cluster-wide defaults)
kubectl label namespace NAMESPACE pod-security.kubernetes.io/enforce-
kubectl label namespace NAMESPACE pod-security.kubernetes.io/audit-
kubectl label namespace NAMESPACE pod-security.kubernetes.io/warn-
```

Performance and Operational Impact

Engineering Velocity Impact

First Month: -50% deployment speed due to debugging
Learning Phase: 2-3 weeks per engineer for security context proficiency
Ongoing Overhead: +30% time for new application deployment

Incident Response Considerations

PSA violations appear as deployment failures
Error messages often lack specific remediation guidance
Debugging requires security context expertise
Emergency exemption procedures must be documented

Migration from Pod Security Policies

Pre-Migration Assessment

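```bash
# Identify active PSPs (the PSP API only exists on clusters older than 1.25)
kubectl get psp
# Show which PSP admitted each pod, taken from the kubernetes.io/psp annotation
kubectl get pods --all-namespaces -o custom-columns="NAMESPACE:.metadata.namespace,NAME:.metadata.name,PSP:.metadata.annotations.kubernetes\.io/psp"
```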

PSP to PSA Mapping

Most restrictive PSP → Baseline PSA (not Restricted)
Standard PSP → Privileged PSA
Permissive PSP → Privileged PSA

Migration Timeline Reality

Documentation Claims: 6-8 weeks
Actual Experience: 3-6 months for significant infrastructure
Success Metrics: 40-60% of workloads achieve baseline enforcement

Post-Migration Cleanup

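```bash
# Remove PSP resources (requires cluster admin; PSP API exists only before 1.25)
kubectl delete psp --all
# kubectl delete does not expand wildcards, so select the psp: objects by name.
# Review the grep output before piping it into delete.
kubectl get clusterrole -o name | grep 'psp:' | xargs -r kubectl delete
kubectl get clusterrolebinding -o name | grep 'psp:' | xargs -r kubectl delete
```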

Tool Recommendations

Essential Tools

kubectl dry-run: kubectl apply --dry-run=server - Validate before deployment
Polaris: Pre-deployment PSA violation detection
kube-score: Alternative validation tool
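The server-side dry run can be wrapped as a pre-deployment gate. This is a sketch under assumptions: the function name is invented, and it relies on PSA warnings and rejections containing the string `PodSecurity`, which matches the error examples elsewhere in this guide.

```bash
# Dry-run a manifest against a namespace and fail if PSA complains.
# Catches both warn-mode warnings and enforce-mode rejections.
psa_check_manifest() {
  local manifest="$1" namespace="$2" output
  output="$(kubectl apply --dry-run=server -n "$namespace" -f "$manifest" 2>&1)" || true
  if printf '%s\n' "$output" | grep -q 'PodSecurity'; then
    printf '%s\n' "$output" | grep 'PodSecurity' >&2
    return 1
  fi
  return 0
}
```

In a pipeline this might run as `psa_check_manifest deploy.yaml production-workloads || exit 1`. Because warnings count as failures here, the gate is stricter than the cluster itself when warn is set above enforce.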

Documentation Resources

Pod Security Standards: Primary reference for security levels
Security Context Configuration: Essential for violation remediation
Stack Overflow #kubernetes: Real-world solutions and troubleshooting

Monitoring and Alerting

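```bash
# Monitor PSA violations
# Pods rejected on behalf of a controller surface as FailedCreate events
kubectl get events --all-namespaces --field-selector reason=FailedCreate
# PSA emits no events of its own; evaluation counts are API server metrics,
# and audit-mode hits land in the audit log under the
# pod-security.kubernetes.io/audit-violations annotation
kubectl get --raw /metrics | grep pod_security_evaluations_total
```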

Decision Framework

When to Use Each Security Level

Choose Privileged When:

Legacy applications requiring root access
Infrastructure components (monitoring, service mesh)
CI/CD pipelines with privileged operations
Time constraints prevent security context remediation

Choose Baseline When:

Modern applications with some security awareness
Accepting most container images will need modification
Balancing security with operational complexity

Choose Restricted When:

New applications designed for security
Compliance requirements mandate strictest controls
Engineering resources available for extensive testing

Cost-Benefit Analysis

Security Improvement: Moderate - prevents obvious misconfigurations
Implementation Cost: High - months of engineering effort
Maintenance Overhead: Medium - ongoing security context management
Compatibility Impact: High - significant application modifications required

Critical Success Factors

Executive Support: Security context remediation requires significant engineering time
Gradual Rollout: Never enable restricted mode cluster-wide immediately
Exemption Strategy: Plan permanent exemptions for infrastructure components
Training Investment: Engineers need security context expertise
Realistic Timelines: Plan 3-6 months for mature infrastructure migration
Emergency Procedures: Document rapid exemption process for production issues

Breaking Points and Limitations

PSA Cannot Prevent:

Runtime privilege escalation by sophisticated attackers
Container breakouts through kernel vulnerabilities
Network-based attacks between pods

PSA Will Break:

Most pre-2020 container images
Helm charts without security context configuration
Applications expecting filesystem write access
System monitoring and debugging tools

Resource Requirements:

Engineering Time: 2-6 months for comprehensive implementation
Expertise: Security context configuration knowledge required
Testing Infrastructure: Separate environments for PSA validation
Documentation: Extensive runbooks for troubleshooting violations

Useful Links for Further Investigation

Resources That Actually Help (And Which Ones Suck)

| Link | Description |
| --- | --- |
| Pod Security Standards | Actually fucking useful. This is the one doc that clearly explains what each security level blocks and why your pods are failing. Bookmark this shit, you'll need it constantly. |
| PSA Configuration Reference | Good for reference. Dry as hell but actually accurate. Use this when you need to understand the YAML structure and don't want to guess. |
| PSP to PSA Migration Guide | Overly optimistic garbage. Written by someone who clearly never had to migrate a real production cluster with 500+ deployments and 47 different monitoring agents. Makes migration sound like a fun weekend project when it's actually 3-6 months of debugging why your CI pipeline breaks every goddamn Tuesday. Read for context, but don't follow it blindly unless you enjoy explaining to management why every deployment is broken. |
| Admission Controllers Overview | Skip this. Generic overview that doesn't help with PSA-specific issues. |
| Security Context Configuration | Essential reading. You'll reference this constantly when fixing PSA violations. Shows you how to write proper security contexts. |
| Security Context Constraints (OpenShift) | Useful if you're on OpenShift. Red Hat's approach is different but the concepts apply. |
| EKS Pod Security Standards | Surprisingly decent. AWS actually managed to keep it simple for once. Works as advertised, which is shocking. |
| GKE Pod Security Standards | Confusing as fuck. Google's docs assume you already understand their byzantine security model (spoiler: you don't). Autopilot mode makes it even worse by adding mystery meat restrictions. |
| AKS Pod Security Standards | Fights with Azure Policy constantly. Microsoft's implementation technically works but immediately starts beefing with their other security features. Classic Microsoft. |
| kubectl dry-run | Use this constantly. `kubectl apply --dry-run=server` saves you from deploying broken configs. |
| Polaris | Actually useful. Catches PSA violations before deployment, which is way better than finding out in production. |
| kube-score | Does the same thing as Polaris. Pick one, they're both fine. |
| Falco | Overkill for PSA. Detects violations after they happen, but PSA should prevent them anyway. |
| Stack Overflow #kubernetes | Where you'll actually find solutions that work. Search for "pod security admission violates" and you'll find your exact error message with someone who's already suffered through fixing it. |
| Kubernetes Community Forums | Real war stories from the trenches. Engineers sharing how they actually implemented PSA, including all the spectacular failures nobody talks about in conference presentations. |
| Kubernetes Slack #sig-auth | Good for really obscure edge cases. The actual maintainers hang out here, but their responses tend to be academic and assume you have unlimited time to read RFCs. |
| CNCF #kubernetes-security | Hit or miss bullshit. Lots of theoretical discussion about security principles, very little practical "here's how to fix your broken deployment" advice. |
| Kubernetes PSA Issues | Track known bugs. If PSA is behaving weirdly, check if it's a known issue. |
| kubernetes/enhancements PSA KEPs | For the masochists. Read the design docs if you want to understand why PSA works the way it does. |
| CIS Kubernetes Benchmark | Theoretical compliance. Good for checkbox security, not practical implementation guidance. |
| NSA/CISA Kubernetes Hardening Guide | Actually practical. NSA/CISA guide is surprisingly useful for real-world hardening. |
| PSA Troubleshooting Checklist | Start here when things break. Basic debugging steps. |
| kubectl debug | For debugging pod failures. Essential when PSA blocks your deployments. |