Look, authorization is usually a clusterfuck. You've got IF statements scattered across 47 microservices, and when the security team wants to add a new rule, you're updating code in 12 repos. OPA centralizes this mess so you write the policy once and ask "should this user do this thing?" instead of hardcoding logic everywhere.
How OPA Actually Works (Not the Marketing Version)
You send JSON to OPA asking "can user X do action Y on resource Z?" OPA checks your Rego policies and returns a decision. That's it. No magic, no AI, just rules evaluation.
The basic flow:
- Your service hits an endpoint
- Instead of checking if user.role == "admin", you ask OPA
- OPA runs your policies against the request data
- You get back allow/deny (or structured data)
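    package authz

    # Needed for the `if` keyword before OPA 1.0; harmless on later versions.
    import rego.v1

    # Deny unless one of the rules below matches.
    default allow := false

    # Admins can do anything.
    allow if {
        input.user.role == "admin"
    }

    # Regular users can act only on resources they own.
    allow if {
        input.user.role == "user"
        input.resource.owner == input.user.id
    }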
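Here's roughly what "you ask OPA" looks like from the service side. This is a minimal sketch, not a drop-in client: it assumes OPA is running locally on its default port (8181) with the authz policy above loaded, and the helper names (build_input, parse_decision, is_allowed) are made up for illustration. The request goes to OPA's Data API, which wraps the request data in an "input" key and returns the decision under "result".

```python
import json
import urllib.request

# Assumption: a local OPA server on the default port with the authz package loaded.
OPA_URL = "http://localhost:8181/v1/data/authz/allow"


def build_input(user_id, role, resource_owner):
    """Shape the request into the fields the policy reads from `input`."""
    return {
        "input": {
            "user": {"id": user_id, "role": role},
            "resource": {"owner": resource_owner},
        }
    }


def parse_decision(response_body):
    """Treat a missing or non-true "result" as a deny."""
    return json.loads(response_body).get("result") is True


def is_allowed(user_id, role, resource_owner, url=OPA_URL, timeout=1.0):
    """POST the input document to OPA and return the allow/deny decision."""
    payload = json.dumps(build_input(user_id, role, resource_owner)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_decision(response.read())
```

Your handler then calls something like is_allowed("alice", "user", "alice") instead of carrying its own role checks. Treating a missing "result" as a deny is a deliberate choice here: OPA omits the key when a rule is undefined, and you don't want that read as a yes.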
Reality check: Works great in demos with 10 rules. Try debugging a 500-line policy file when auth breaks at 2am. You'll question every life choice that led you to Rego.
Where People Actually Use OPA
Kubernetes Admission Controllers: Validate/mutate resources before they hit etcd. Gatekeeper makes this somewhat bearable, but expect pain setting it up.
API Gateways: Envoy integration lets you centralize auth decisions. Works well until you need low latency - every auth call is now a network hop.
Infrastructure Validation: Conftest checks your Terraform/Dockerfiles before deployment. Actually useful and works as advertised.
Application Authorization: Replace your spaghetti auth code with centralized policies. Great in theory, harder in practice when you hit performance limits.
Production Reality Check
Companies like Netflix do use OPA in production, but they have teams of people maintaining it. Here's what they actually deal with:
- Memory usage that scales linearly with the policies and data you load (plan for roughly 20x the raw JSON size)
- Performance degrades significantly with large policy sets
- Real production deployments see 1-5ms response times, not "microseconds"
- Debugging Rego makes you question your career choices
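If you want a back-of-the-envelope number for the memory point above, the arithmetic is simple enough to script. This is a rough sizing sketch that takes the 20x figure from the list at face value; the function name is hypothetical, and real usage depends on your data shape and OPA version, so measure before you commit to instance sizes.

```python
import json

# Assumption: 20x is the rule of thumb quoted above, not a measured value.
OVERHEAD_FACTOR = 20


def estimated_opa_memory_bytes(data, overhead_factor=OVERHEAD_FACTOR):
    """Estimate memory needed for `data` from its serialized JSON size."""
    raw_size = len(json.dumps(data).encode("utf-8"))
    return raw_size * overhead_factor
```

By this estimate, a 50 MB JSON data file means budgeting about 1 GB of memory for that OPA instance, before you count the policies themselves.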
The sidecar pattern sounds great until OPA crashes and takes your auth system with it. Always implement fallback policies unless you enjoy 3am outages.
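A fallback doesn't have to be fancy. The sketch below wraps whatever function talks to OPA and returns a fixed default when the call fails. It assumes a fail-closed default (deny) and that the caller passes in its own query function; decide_with_fallback and ask_opa are illustrative names, not part of any OPA client library.

```python
def decide_with_fallback(ask_opa, fallback=False):
    """Return OPA's decision, or `fallback` when OPA cannot be reached.

    `ask_opa` is any zero-argument callable that returns a bool, for example
    a function that POSTs to OPA's REST API with a short timeout.
    """
    try:
        return bool(ask_opa())
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts; ValueError covers
        # a response body that is not valid JSON. Fail closed by default.
        return fallback
```

Whether to fail closed or fail open is a per-endpoint call. Denying everything while OPA is down is the safe default, but it is still an outage, so pair it with alerting and a tight timeout on the OPA call.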
Bottom line: OPA works great for <10k policies and simple authorization. Beyond that, you're in for operational complexity that most teams underestimate.