Why ArgoCD Exists (And Why You Might Need It)

ArgoCD solves a specific problem: your production Kubernetes clusters slowly drifting away from what you think they should look like. Someone kubectl applies a hotfix, a deployment fails halfway through, or a ConfigMap gets manually edited - suddenly your "identical" staging and prod environments aren't identical anymore.

GitOps changes this by making Git your single source of truth. ArgoCD is a Kubernetes controller that watches your Git repos and automatically syncs changes to your clusters. Originally built at Intuit (now a CNCF graduated project), it's basically a robot that never gets tired of checking if your cluster matches Git.

The big win is visibility. ArgoCD's web UI shows you exactly what's deployed, what's out of sync, and what broke during the last deployment. Instead of running kubectl get pods and hoping for the best, you get a visual dashboard that actually tells you what's happening across all your applications.

The ArgoCD dashboard shows each application as a tree of Kubernetes resources - you can see pods, services, deployments, and their relationships at a glance. When something's broken, the UI highlights it in red and shows you exactly what failed.

How ArgoCD Actually Works

Traditional CI/CD tools push changes to your cluster - Jenkins runs kubectl apply, CircleCI hits the Kubernetes API, whatever. ArgoCD flips this around using a "pull" model. It runs inside your cluster and continuously polls Git repositories (every 3 minutes by default, configurable via the timeout.reconciliation setting in the argocd-cm ConfigMap).

When ArgoCD finds changes in Git, it renders your manifests (Helm charts, Kustomize, plain YAML) and compares them with what's actually running in the cluster. If there's a difference, it syncs automatically (if you enable auto-sync) or waits for you to click the sync button.

This means your cluster can't drift without ArgoCD noticing. Someone manually edits a Deployment? ArgoCD sees the drift and can either fix it automatically (self-heal) or alert you. Your Git history becomes your deployment audit trail - no more "who changed the replica count" mysteries.
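To make this concrete, here's roughly what a minimal Application manifest looks like - the resource that tells ArgoCD which Git repo and path to watch and which cluster and namespace to deploy into. The repo URL, path, and names below are placeholders, not a real setup:

```yaml
# Sketch of a minimal ArgoCD Application. Repo URL, path, and names
# are placeholders - adapt to your own repo layout.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/my-app-config.git
    targetRevision: main
    path: k8s/overlays/prod
  destination:
    server: https://kubernetes.default.svc  # the cluster ArgoCD runs in
    namespace: my-app
```

Commit a change to that path in Git, and ArgoCD notices the diff on its next poll - that's the whole model.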

Traditional CI/CD pushes changes to clusters from external systems (push model), while GitOps pulls changes from Git repositories running inside the cluster (pull model). This architectural difference makes GitOps more secure and auditable.

ArgoCD Architecture Diagram

The Reality of Using ArgoCD in Production

ArgoCD v3.1 was released in June 2025 with OCI registry support (still beta - I wouldn't use it in prod yet) and multi-source applications. The multi-source feature actually solves a real problem if you're splitting app code from environment configs across different Git repos.

According to CNCF survey data, 97% of ArgoCD users run it in production, which makes sense since it's stable and battle-tested. It manages clusters at companies like Intuit and Red Hat, plus thousands of others.

But here's what the adoption stats don't tell you: ArgoCD's learning curve is steeper than it looks. The basic concepts are simple - Git as source of truth, pull-based deployments - but the devil's in the details. Application health checks fail silently, sync policies are confusing until you get burned by them, and the UI gets sluggish with hundreds of applications.

I've been running ArgoCD for 2 years across multiple clusters and I still occasionally discover gotchas. The resource hooks are powerful but they fail without clear error messages. RBAC configuration is a pain if you're not already a Kubernetes RBAC expert. Multi-cluster setup works great until you hit networking edge cases.

The big benefit though? When something breaks in production, ArgoCD's web UI immediately shows you what's different from Git. Instead of debugging with kubectl, you get a visual diff of what went wrong. That alone makes it worth the learning curve for most teams managing more than a handful of applications.

ArgoCD consists of several key components: the API Server (handles UI and CLI), Application Controller (watches Git and manages sync), Repository Server (clones repos and renders manifests), and Redis (caches Git state for performance).

ArgoCD vs Leading GitOps and CI/CD Platforms

| Feature | ArgoCD | FluxCD | Jenkins | GitLab CI/CD | Spinnaker |
|---|---|---|---|---|---|
| GitOps Native | ✅ Full GitOps | ✅ Full GitOps | ❌ Push-based | ⚠️ Partial | ⚠️ Hybrid |
| Web UI | ✅ Rich dashboard | ❌ Limited | ✅ Comprehensive | ✅ Integrated | ✅ Complex UI |
| Kubernetes Focus | ✅ Native K8s | ✅ Native K8s | ⚠️ Plugin-based | ⚠️ Multi-platform | ✅ Multi-cloud |
| Multi-Cluster | ✅ Built-in | ✅ Built-in | ⚠️ Complex setup | ⚠️ Enterprise only | ✅ Native |
| RBAC & Security | ✅ Granular RBAC | ✅ K8s RBAC | ✅ Extensive | ✅ Enterprise grade | ✅ Advanced |
| Application Sync | ✅ Real-time | ✅ Real-time | ❌ Manual | ⚠️ Pipeline-based | ✅ Automated |
| Rollback Support | ✅ Git-based | ✅ Git-based | ⚠️ Manual | ⚠️ Pipeline-based | ✅ Advanced |
| Learning Curve | ⚠️ Moderate | ⚠️ Steep | ⚠️ Steep | ⚠️ Moderate | ❌ Very steep |
| CNCF Status | ✅ Graduated | ✅ Graduated | ❌ None | ❌ None | ❌ None |
| License | ✅ Apache 2.0 | ✅ Apache 2.0 | ✅ MIT/Commercial | ⚠️ Commercial | ✅ Apache 2.0 |

What ArgoCD Actually Does Under the Hood

ArgoCD isn't magic - it's a set of Kubernetes controllers that do specific jobs. If you're evaluating ArgoCD, understanding these components helps you figure out what might break and how to fix it when things go sideways.

The Core Controllers (And Their Pain Points)

The Application Controller is the heart of ArgoCD. It polls your Git repos every 3 minutes (configurable) and compares what's in Git with what's running in your cluster. This sounds simple until you have 200+ applications and the controller starts hitting GitHub's API rate limits. You'll need to configure repository credentials carefully and potentially set up webhook-based sync to reduce polling load.
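If you do move to webhooks, it's worth raising the poll interval so you're not hammering your Git provider anyway. The knob lives in the argocd-cm ConfigMap - something like this (300s is just an example value; the default is 180s):

```yaml
# Fragment of the argocd-cm ConfigMap that controls the reconciliation
# (Git polling) interval. 300s here is an illustrative value.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  timeout.reconciliation: 300s
```

Note that changing this requires restarting the application controller to take effect.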

The Repository Server is where the complexity lives. It clones Git repos, renders Helm charts, processes Kustomize overlays, and handles authentication with GitHub, GitLab, Bitbucket, whatever. This is also where things break most often - memory usage scales poorly with repo size, and complex Helm charts can cause timeouts during rendering. I've seen the repo server consume 4GB+ RAM just from large monorepos with hundreds of microservices.

The Web UI (ArgoCD's Killer Feature)

ArgoCD's web interface is honestly why most people choose it over FluxCD. The UI shows you a visual topology of your applications - pods, services, ingresses, whatever - with real-time health status and sync state. When something breaks, you can see immediately what's degraded without running a dozen kubectl commands.

The UI lets you manually trigger syncs, view diffs between Git and cluster state, and roll back deployments with a button click. It's especially useful for debugging - you can drill down into individual resources, see their events, and check logs without leaving the interface. The API server powers both the UI and CLI, and it's decent for building custom integrations.

But the UI has limits. With 500+ applications, page load times get painful. The tree view becomes unwieldy with complex applications that have dozens of resources. And don't expect fancy filtering or saved views - you'll be clicking through applications one by one.

In a hub-and-spoke architecture, one central ArgoCD instance manages applications across multiple remote clusters, providing a single pane of glass for deployments while maintaining cluster isolation.

Features That Actually Matter in Production

Multi-Source Applications (New in v3.1)

The multi-source feature lets one application pull from multiple Git repos or sources. This solves the common pattern where app config lives in one repo and environment-specific values live in another. Before this, you needed ApplicationSets or complex templating workarounds.
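A sketch of what that looks like in practice - a Helm chart in one repo pulling values from a second repo via a named source reference. The repo URLs and paths are placeholders:

```yaml
# Multi-source Application fragment: one source holds the chart, the
# other (referenced as "values") holds environment-specific values.
spec:
  sources:
    - repoURL: https://github.com/example-org/app-charts.git
      targetRevision: main
      path: charts/my-app
      helm:
        valueFiles:
          - $values/envs/prod/values.yaml  # resolved via the ref below
    - repoURL: https://github.com/example-org/env-values.git
      targetRevision: main
      ref: values
```

Before this, the common workaround was an umbrella Helm chart or an ApplicationSet per environment - both messier than a two-entry sources list.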

Progressive Delivery with Argo Rollouts

ArgoCD integrates with Argo Rollouts for canary and blue-green deployments. Rollouts extends Kubernetes Deployments with advanced deployment strategies. The integration is clean - ArgoCD manages the Rollout resource, Rollouts handles the deployment strategy. Works well if you need more than basic rolling updates.

Sync Policies (Where You'll Get Burned)

ArgoCD has sync policies that control how deployments work: manual sync (you click the button), auto-sync (changes deploy immediately), self-heal (fixes manual drift), and prune (removes deleted resources). The gotcha is that these interact in non-obvious ways. Enable auto-sync + self-heal + prune without understanding the implications, and you might accidentally delete resources you care about.
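For reference, here's roughly what that "all three enabled" combination looks like in an Application spec - worth reading twice before turning it on for anything important:

```yaml
# Sync policy fragment: fully automated, with drift correction and pruning.
spec:
  syncPolicy:
    automated:
      prune: true      # delete cluster resources that were removed from Git
      selfHeal: true   # revert manual kubectl edits back to Git state
    syncOptions:
      - CreateNamespace=true
```

A reasonable progression: start with manual sync, enable automated once you trust your manifests, and only add prune after you've verified in a dev environment exactly which resources ArgoCD considers "managed."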

The application details view shows real-time sync status, health state, and resource relationships. Out-of-sync resources are highlighted, making it easy to identify what changed between Git and the cluster.

Resource Hooks and Health Checks (The Devil's in the Details)

Sync Hooks and Waves

Resource hooks let you run Jobs before/after syncs - database migrations, cache warming, whatever. They work great when they work, but debugging failed hooks is painful because error messages are often buried in Job logs that aren't obvious from the ArgoCD UI.

Sync waves handle dependencies by applying resources in phases. Wave 0 resources deploy first, then wave 1, etc. This solves the "ConfigMap needs to exist before Deployment" problem, but managing complex dependency chains gets messy quickly.
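The annotations themselves are simple - it's the chains that get messy. A sketch using a wave-ordered ConfigMap and a PreSync migration Job (names, image, and command are placeholders; the annotation keys are the real ArgoCD ones):

```yaml
# Wave 0: applied before anything in wave 1 (Deployments default to wave 0,
# so give dependents an explicit higher wave).
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  annotations:
    argocd.argoproj.io/sync-wave: "0"
data:
  LOG_LEVEL: info
---
# PreSync hook: runs to completion before the sync proper starts.
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: example.com/migrate:latest   # placeholder image
          command: ["./migrate", "up"]
```

The hook-delete-policy matters more than it looks: BeforeHookCreation deletes the previous Job before each sync, which is what lets the same hook run again next time.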

Health Checks That Sometimes Lie

ArgoCD has built-in health checks for standard Kubernetes resources, plus custom health checks for CRDs. The health system works well for simple cases - pods are healthy when they're running, services when their endpoints exist.

But custom resources are hit-or-miss. An Istio VirtualService might show as healthy even when it has invalid syntax that's breaking traffic routing. A cert-manager Certificate might be "healthy" while actually failing to renew. Always verify that health checks actually match your application's reality - don't just trust the green checkmarks.
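When the built-in logic isn't enough, you can teach ArgoCD a custom health check in the argocd-cm ConfigMap using Lua. This is a sketch based on the documented resource.customizations mechanism - the cert-manager Certificate case from above, checking the Ready condition instead of trusting mere existence:

```yaml
# argocd-cm fragment: custom Lua health check for cert-manager Certificates.
# Key format is resource.customizations.health.<group>_<kind>.
data:
  resource.customizations.health.cert-manager.io_Certificate: |
    hs = {}
    hs.status = "Progressing"
    hs.message = "Waiting for certificate"
    if obj.status ~= nil and obj.status.conditions ~= nil then
      for _, c in ipairs(obj.status.conditions) do
        if c.type == "Ready" and c.status == "True" then
          hs.status = "Healthy"
          hs.message = "Certificate is ready"
        end
      end
    end
    return hs
```

The same pattern works for any CRD that exposes status conditions - which is most well-behaved operators.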

The drift detection is solid though. ArgoCD shows exactly which resources differ from Git, which specific fields have changed, and gives you a visual diff. When debugging production issues, this immediate visibility is invaluable compared to running kubectl diffs manually.

ArgoCD's component architecture ensures separation of concerns: Git repository management, manifest rendering, and cluster synchronization each have dedicated services that can be scaled independently for performance.

Frequently Asked Questions (Real Problems You'll Hit)

Q

Why is my ArgoCD sync stuck at "Progressing" forever?

A

This usually means a resource hook failed silently or you have a dependency issue with sync waves. Check the Events tab in the ArgoCD UI for the stuck resources - look for "FailedCreate" or "FailedUpdate" errors. The most common culprits are pre-sync hooks that failed, insufficient RBAC permissions, or resource conflicts. Run kubectl describe on the stuck resources to see what's actually happening. If it's a hook failure, you might need to manually delete the failed Job and re-sync.
Q

ArgoCD says the app is "Healthy" but my service is completely broken - what gives?

A

ArgoCD's health checks only verify that Kubernetes resources are created successfully, not that your application actually works. A Deployment shows as "Healthy" when pods are running, even if they're crash-looping or serving 500 errors. For external services like databases or APIs, you need custom health checks or integrate with tools like Argo Rollouts for real application health monitoring.

Q

How do I stop ArgoCD from constantly reverting my manual changes?

A

You've probably enabled the "Self Heal" option. ArgoCD will automatically revert any manual kubectl changes to match what's in Git. This is usually what you want, but if you need to temporarily disable it, either turn off self-heal in the application's sync policy or use spec.ignoreDifferences to have ArgoCD ignore the specific fields you're debugging. Just remember to commit your changes to Git afterward.
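If you only need ArgoCD to tolerate drift on particular fields (rather than disabling self-heal for the whole app), spec.ignoreDifferences is the usual escape hatch. A sketch - the classic example is letting an HPA own the replica count:

```yaml
# Application fragment: ignore drift on Deployment replica counts so a
# Horizontal Pod Autoscaler (or a human) can scale without a fight.
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
```

Ignored fields still show in the live manifest - ArgoCD just stops counting them as drift.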

Q

The ArgoCD UI is super slow with 200+ applications - any fixes?

A

This is a known issue. The web UI becomes sluggish because it's rendering all application trees in the browser. Some workarounds: enable application list paging in server config, use the CLI for bulk operations, or consider splitting applications across multiple ArgoCD instances. There are also performance improvements in the HA setup that help with large deployments.

Q

My Git repository is private - how do I authenticate ArgoCD?

A

You need to configure repository credentials in ArgoCD. For GitHub, the easiest approach is creating a personal access token with repo access, then adding it via the UI under Settings > Repositories. For SSH, generate a deploy key in your Git provider and add the private key to ArgoCD. If you're hitting API rate limits, consider using Git webhooks instead of polling - though webhook setup can be finicky depending on your network setup.
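If you'd rather keep credentials in Git-managed manifests than click through the UI, ArgoCD also accepts repositories declaratively as labeled Secrets. A sketch - the URL and token are placeholders, and in real life the token should come from a secrets manager, not a committed file:

```yaml
# Declarative repository credential: ArgoCD picks up any Secret in its
# namespace labeled argocd.argoproj.io/secret-type: repository.
apiVersion: v1
kind: Secret
metadata:
  name: private-repo
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository
stringData:
  type: git
  url: https://github.com/example-org/private-config.git
  username: git
  password: ghp_placeholder-token   # placeholder - inject from a secret store
```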
Q

ArgoCD deleted resources that I thought it shouldn't touch - why?

A

You probably enabled the "Prune" option, which removes Kubernetes resources that exist in the cluster but aren't defined in Git. This is dangerous if you're not careful about what ArgoCD manages. Check the application's "Prune" setting and consider using sync options like Prune=false for individual resources. Always test prune behavior in dev environments first - I've seen teams accidentally delete databases because they weren't properly excluded from ArgoCD management.
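The per-resource exemption is just an annotation on the resource itself - a fragment:

```yaml
# This resource survives pruning even if it disappears from Git
# while the application has prune enabled.
metadata:
  annotations:
    argocd.argoproj.io/sync-options: Prune=false
```

Put this on anything stateful (databases, PVCs) that ArgoCD happens to manage but should never delete on its own.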
Q

ArgoCD won't sync my secrets from Vault - what's wrong?

A

Secret management is tricky with GitOps. ArgoCD itself doesn't handle secrets - it just applies whatever manifests you give it. If you're using Vault integration, check that the ArgoCD service account has proper Vault policies and the argocd-vault-plugin is correctly configured. Common issues: expired Vault tokens, incorrect policy paths, or the plugin not being installed in the right container. External Secrets Operator is often easier to debug than the Vault plugin.

Q

If ArgoCD goes down, do my apps keep running or everything breaks?

A

Your applications keep running fine. ArgoCD only manages deployments, not runtime operations. If ArgoCD crashes, your pods, services, etc. continue working normally - you just can't deploy new changes until it comes back up. For production, you should run ArgoCD in HA mode with multiple replicas and Redis clustering. But honestly, even single-instance ArgoCD is pretty stable - it's just a Kubernetes controller.
Q

How many clusters can one ArgoCD instance handle before it becomes a nightmare?

A

The official guidance is around 50 clusters per instance, but it really depends on your app count and sync frequency. I've seen instances managing 100+ clusters with ApplicationSets, but the UI becomes painful and troubleshooting gets harder. Most companies end up with multiple ArgoCD instances - per environment, per team, or per region. The hub-and-spoke pattern works until you hit the UI performance wall or need stronger isolation between teams.

ApplicationSets use generators to template applications across multiple clusters automatically. The cluster generator creates one application per target cluster, while the git directory generator creates applications based on repository structure.
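A sketch of the cluster generator pattern - one Application stamped out per cluster registered in ArgoCD. The app name, repo, and path are placeholders:

```yaml
# ApplicationSet fragment: the clusters generator emits one Application
# per cluster ArgoCD knows about; {{name}} and {{server}} come from
# each cluster's registration.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: guestbook
  namespace: argocd
spec:
  generators:
    - clusters: {}   # match every registered cluster
  template:
    metadata:
      name: "{{name}}-guestbook"
    spec:
      project: default
      source:
        repoURL: https://github.com/example-org/apps.git
        targetRevision: main
        path: guestbook
      destination:
        server: "{{server}}"
        namespace: guestbook
```

Add a label selector inside the clusters generator once you need "prod clusters only" instead of "everything."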
Q

How long does it take to actually become productive with ArgoCD?

A

If you're already comfortable with Kubernetes, expect about 2 weeks to get basic ArgoCD deployments working. But mastering the gotchas (sync policies, resource hooks, health checks) takes months. The hardest part isn't ArgoCD itself - it's restructuring your deployment workflow around GitOps principles. Teams often underestimate the Git repository organization work needed to make ArgoCD successful.
Q

Does ArgoCD work with operators and CRDs or do I need special setup?

A

ArgoCD handles CRDs fine since it's just applying YAML to Kubernetes. It has built-in health checks for popular operators like Istio, cert-manager, and Prometheus Operator. For custom operators, you might need to define custom health checks - the default health logic just checks if the resource was created, not if it's actually working. Resource hooks help with operator initialization sequences.
Q

Is ArgoCD free or do I need to pay for support?

A

ArgoCD itself is Apache 2.0 licensed - completely free. Community support through GitHub issues and Slack is excellent. If you need commercial support, Akuity (founded by ArgoCD's creators) offers managed ArgoCD, and Codefresh provides enterprise GitOps platforms built on ArgoCD. But honestly, the open-source version with community support works fine for most teams.

Q

Can I use ArgoCD with a monorepo or do I need to split everything?

A

ArgoCD works great with monorepos. You can deploy multiple applications from different paths within the same repository using path-based configurations. The ApplicationSets controller makes it easy to template applications across different environments or microservices. The key is organizing your monorepo with clear directory structures that match your application boundaries - /apps/frontend/, /apps/api/, /apps/worker/, etc.
