
Docker Registry Architecture

The Reality of GitLab Container Registry

GitLab's container registry solves one of the most annoying problems in DevOps: managing separate credentials for your code repo and your Docker images. Before this, you'd have to juggle Docker Hub credentials, set up separate authentication in CI/CD, and pray that the tokens didn't expire during a critical deployment.

The registry runs alongside your GitLab instance and uses the same permissions. If you can push code to the repo, you can push images to the registry. No more "docker login" commands scattered across your CI files, no more service accounts with mysterious permissions, no more authentication failures that break your builds at 2am.

It's built on Docker Distribution, which means it talks the same Registry HTTP API V2 as everything else. Your existing docker commands work: docker push, docker pull, all the shit you're already used to. The difference is authentication just works because it's integrated with GitLab's JWT system.
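Inside a GitLab CI job, that integration looks roughly like this - a minimal sketch using GitLab's predefined CI variables, assuming a Dockerfile sits in the repo root:

```bash
# Log in with the job's short-lived JWT, build, and push.
# CI_REGISTRY, CI_REGISTRY_IMAGE, CI_JOB_TOKEN, and CI_COMMIT_SHORT_SHA are
# predefined GitLab CI variables available in every job.
echo "$CI_JOB_TOKEN" | docker login -u gitlab-ci-token --password-stdin "$CI_REGISTRY"
docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
```

The token is scoped to the project and expires when the job ends, which is exactly why you stop caring about rotating registry credentials.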

How It Actually Works (And Where It Breaks)

The registry runs as a separate service, usually on port 5000, talking to GitLab through JWT tokens. When you docker push, GitLab generates a token with your repo permissions and hands it to the registry. Simple enough, until clock skew between servers makes tokens invalid and you get hit with HTTP 401 Unauthorized: authentication required errors at 3am. Clock drift of more than 5 minutes breaks JWT validation completely.
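If you're chasing those 401s, a quick sanity check is to compare clocks on the GitLab host and the registry host and confirm NTP is actually running - a rough sketch, assuming systemd hosts with chrony installed:

```bash
# Compare UTC time on each host; more than a few minutes of drift
# and JWT validation starts failing.
date -u

# Confirm the host is actually syncing to NTP (systemd systems).
timedatectl status | grep -i "synchronized"

# If chrony is installed, show the measured offset from the NTP source.
chronyc tracking
```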

Storage is where things get expensive fast. You can run it on local filesystem for dev, but production means S3 or equivalent. Enable lifecycle policies immediately or your storage bill will make you cry. I've seen orgs hit $50k/month in S3 costs because no one set up cleanup policies and developers kept pushing 2GB images with every commit. Use multi-stage builds and proper storage optimization to avoid this nightmare.
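Project-level cleanup policies can be set through the API as well as the UI. A hedged sketch against a hypothetical self-hosted instance (host, token, and project ID are placeholders):

```bash
# Set a weekly cleanup policy: keep the 10 newest tags per image,
# delete anything older than 30 days.
curl --request PUT --header "PRIVATE-TOKEN: <your_token>" \
  --data "container_expiration_policy_attributes[enabled]=true" \
  --data "container_expiration_policy_attributes[cadence]=7d" \
  --data "container_expiration_policy_attributes[keep_n]=10" \
  --data "container_expiration_policy_attributes[older_than]=30d" \
  --data "container_expiration_policy_attributes[name_regex_delete]=.*" \
  "https://gitlab.example.com/api/v4/projects/<project_id>"
```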

The registry authenticates through GitLab's main auth system, which sounds great until LDAP is down and no one can deploy. At least with Docker Hub, when their auth breaks, it's their problem.

The Metadata Database Finally Fixes the Garbage Collection Nightmare

GitLab 17.3 introduced a metadata database that moves registry metadata from object storage into PostgreSQL. This finally fixes the garbage collection problem that's been making ops teams miserable for years.

Before this, cleaning up old images meant taking the entire registry offline - coordinating with every team, setting maintenance windows, and inevitably someone's deployment would break because they didn't get the memo. Online garbage collection runs in the background now, cleaning up orphaned layers without downtime.

The migration to the metadata database is scary as hell for production systems, though. You're basically moving the registry's brain from file-based storage to PostgreSQL. The migration process can take hours or days depending on how much shit you've accumulated, and there's no rollback if something goes wrong.

The metadata database also gives you storage usage metrics that actually work. Before this, figuring out which projects were eating your storage budget meant parsing S3 logs like a caveman. You finally get real-time storage reports, project-level usage breakdowns, and automated cleanup alerts instead of surprise billing.
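A rough way to see what each project is holding is the registry repositories API - a sketch with placeholder host, token, and project ID:

```bash
# List a project's container registry repositories with per-repository
# tag counts via the GitLab REST API.
curl --header "PRIVATE-TOKEN: <your_token>" \
  "https://gitlab.example.com/api/v4/projects/<project_id>/registry/repositories?tags_count=true"
```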

Security Scanning (And the False Positive Hell)

Security scanning runs automatically with Trivy built in. It'll find vulnerabilities in your images, then you'll spend 3 hours figuring out which ones actually matter and which are false positives. Container scanning happens during CI builds and dumps results into GitLab's security dashboard.
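If you want to see what the CI scan will complain about before it blocks a merge request, you can run Trivy against the pushed image yourself - a sketch with a placeholder image path:

```bash
# Log in first so Trivy can pull from the private registry, then scan
# for high/critical findings only.
docker login registry.gitlab.com
trivy image --severity HIGH,CRITICAL \
  registry.gitlab.com/mygroup/myproject/myimage:latest
```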

The good news is access control just works with GitLab's existing permissions. If you can push to the repo, you can push images. If you can read the project, you can pull images. No separate ACL bullshit to maintain. Project-level permissions control registry access automatically.

The bad news is vulnerability scanning can slow down your builds significantly, especially for large images. You can disable it, but then security teams get grumpy. You can configure it to only scan certain branches, but then you miss vulnerabilities in development. There's no perfect solution - just different levels of pain. The scanner finds CVE-2023-44487 (HTTP/2 Rapid Reset) in every image using Alpine 3.17, but you can't fix it because updating breaks your application dependencies. Configure security policies and vulnerability management for the full security theater experience.

GitLab Container Registry vs Major Alternatives

| Feature | GitLab Container Registry | Harbor | JFrog Artifactory | Docker Hub | AWS ECR | Azure ACR |
|---|---|---|---|---|---|---|
| OCI Compliance | Full OCI v1.1 | Full OCI v2.0 | Full OCI v1.1 | Full OCI | Full OCI | Full OCI v1.1 |
| Deployment Model | SaaS + Self-hosted | Self-hosted only | Self-hosted + Cloud | SaaS only | SaaS only | SaaS only |
| Vulnerability Scanning | Trivy (built-in) | Trivy, Clair | Xray (commercial) | Snyk (paid) | Inspector | Qualys VMDR |
| CI/CD Integration | Native GitLab CI/CD | Webhook-based | Multiple platforms | Limited | AWS native | Azure native |
| Storage Backend | File/S3/GCS/Azure | File/S3/GCS/Azure | Multiple backends | Proprietary | S3-based | Azure Blob |
| Access Control | GitLab RBAC | Project-based RBAC | Complex ACL system | Org/Team based | AWS IAM | Azure AD/RBAC |
| Pricing Model | Usage-based/Free tier | Open source | Per user/feature | Free/Pro tiers | Pay per usage | Pay per usage |
| Multi-format Support | Container images only | OCI artifacts | 30+ package types | Container images | Container images | Multi-format |
| Image Signing | Built-in (metadata DB) | Built-in | Built-in | Third-party | Third-party | Third-party |
| Garbage Collection | Online (zero-downtime) | Manual/scheduled | Automatic | N/A | Lifecycle rules | Lifecycle rules |
| Geographic Distribution | Limited | Manual replication | Global CDN | Global CDN | Regional | Global replication |
| API Compatibility | Docker Registry v2 | Docker Registry v2 | Multiple APIs | Docker Registry v2 | Docker Registry v2 | Docker Registry v2 |

Production Reality: Storage Bills and Performance Hell

Running GitLab's registry in production means dealing with two main problems: storage costs spiraling out of control and performance degrading as you scale. You can run it as SaaS on GitLab.com (their problem), or self-hosted (your problem). Most enterprises end up self-hosted because of compliance requirements, which means you get to deal with all the operational complexity.

[Diagram: Docker system components]

Storage Backend Hell (And How to Not Go Broke)

Local filesystem storage works for dev environments. Production needs cloud object storage - S3, GCS, or Azure Blob - which is where your storage bill starts growing like a cancer.

Here's what happens: developers push 2GB images for every commit, CI builds create temporary images that never get cleaned up, and multi-stage builds leave intermediate layers scattered everywhere. Without cleanup policies, you're looking at exponential storage growth. I've seen companies hit $200k/year in S3 costs before anyone noticed.

The v2 storage drivers are supposed to fix performance issues, but they're still in beta and have been breaking production deployments. The S3 v2 driver broke in GitLab 17.2 when using IAM roles, throwing NoCredentialProviders: no valid providers in chain errors that took three days to debug.

Enable lifecycle policies on your S3 bucket from day one. Set up storage quota limits per project. Configure automated cleanup policies, monitor storage usage analytics, implement retention policies, and check out cost optimization strategies. Your future self will thank you when the storage bill arrives.
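One lifecycle rule that's safe to add on day one is expiring abandoned multipart uploads, which pile up invisibly whenever pushes get interrupted. A sketch with a placeholder bucket name (don't expire live blobs with lifecycle rules; leave that to registry garbage collection):

```bash
# Abort incomplete multipart uploads after 7 days - a common silent
# cost leak on registry buckets.
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-registry-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "abort-stale-multipart-uploads",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}
    }]
  }'
```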

Performance Degrades as You Scale (Shocking!)

If your registry is slow, it's probably because you're using filesystem storage and didn't configure the metadata database. This will bite you when you hit 100+ repositories and docker pull starts taking 30 seconds because it's enumerating layers through object storage API calls.

The metadata database actually works well once you get through the migration process. Tag listing becomes fast, cleanup policies run without timing out, and you get real storage metrics instead of "contact your S3 admin" nonsense.

For scaling beyond a single instance, you can run multiple registry instances behind a load balancer. This works until you hit database contention, and then your registry is slow because PostgreSQL is the bottleneck. Redis caching helps with reads, but doesn't solve the fundamental issue that everyone's hitting the same database.

Network performance is where things get weird. The registry can redirect downloads to S3 directly, which reduces your bandwidth costs but breaks in air-gapped environments. You can configure CDN integration and proxy caching, but you're choosing between bandwidth costs and operational complexity.

Enterprise Features (Good Luck Getting IT Approval for the Database)

Protected container repositories in GitLab 17.8 let you lock down who can push to production registries. This is actually useful when you want to prevent developers from directly pushing to prod images and bypassing your entire CI/CD process.

The compliance features integrate with GitLab's audit logging, which generates massive amounts of logs that no one ever reads until the security audit. SBOM generation happens automatically, creating JSON files full of software bill of materials data that satisfies checkbox compliance but doesn't actually improve security.

Cleanup policies can be set at project, group, or instance level, which sounds great until you realize different teams need different retention policies and you're stuck maintaining a complex hierarchy of rules. The policies work with online garbage collection, assuming you've migrated to the metadata database and haven't hit any of the migration edge cases that leave your registry in an inconsistent state. You'll end up dealing with compliance frameworks and enterprise authentication headaches.

Questions from Engineers Who Actually Use This Shit

Q

Why does my docker push randomly fail with "unauthorized" errors?

A

Your GitLab token expired, or there's clock skew between your GitLab server and the registry. The exact error is Error response from daemon: Head https://registry.gitlab.com/v2/myproject/myimage/manifests/latest: unauthorized: HTTP Basic: Access denied. Try docker login registry.gitlab.com again, or if you're using CI, check that your CI runner time is synchronized. Clock drift over 5 minutes breaks JWT validation completely.

Q

Why is my storage bill $50k this month?

A

Because you didn't set up cleanup policies and your developers have been pushing 2GB images for every commit. Self-hosted instances have no storage limits by default, which is great until your S3 bill arrives. Set up lifecycle policies on your S3 bucket immediately, configure project-level retention policies, and educate developers about multi-stage builds to reduce image size.

Q

Why is vulnerability scanning slowing down my builds by 10 minutes?

A

Container scanning with Trivy runs during your CI builds and scans every layer of your image for vulnerabilities. It's thorough but slow, especially for large images. You can speed it up by scanning only on main branch, using smaller base images, or disabling it for development branches. The security team won't be happy, but your developers will stop complaining about slow builds.

Q

Can I use this with Jenkins/GitHub Actions/other CI systems?

A

Yes, it speaks the same Docker Registry HTTP API V2 as everything else. You'll need to create deploy tokens or personal access tokens for authentication. External CI systems work fine, but you lose the tight integration that makes GitLab's registry actually useful.
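From an external CI system the flow is just a deploy token and a normal docker login - a sketch with placeholder token variables and image path:

```bash
# DEPLOY_TOKEN_USERNAME / DEPLOY_TOKEN are placeholders for a deploy token
# created with read_registry and write_registry scopes.
echo "$DEPLOY_TOKEN" | docker login registry.gitlab.com \
  -u "$DEPLOY_TOKEN_USERNAME" --password-stdin
docker push registry.gitlab.com/mygroup/myproject/myimage:1.2.3
```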

Q

Should I migrate to the metadata database? (Spoiler: Yes, but it's scary)

A

The metadata database migration is scary as hell but necessary. You're moving the registry's entire brain from object storage to PostgreSQL. The migration can take hours or days, there's no rollback, and if it fails you might be fucked. But garbage collection finally works without downtime, and performance actually improves. Plan for a maintenance window anyway, despite what the docs say about online migration.

Q

Why can't I delete this damn image?

A

Probably because other images reference the same layers, or you have dangling manifests that aren't cleaned up yet. Online garbage collection runs every 24 hours, not immediately. You can manually trigger cleanup, but it might take multiple runs to actually free up space. This is why storage bills grow faster than you can delete images.
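For reference, deleting a single tag through the API looks like this (host, token, and IDs are placeholders). It removes the tag reference immediately, but the underlying layers hang around until garbage collection gets to them:

```bash
# Delete one tag from a project's registry repository.
curl --request DELETE --header "PRIVATE-TOKEN: <your_token>" \
  "https://gitlab.example.com/api/v4/projects/<project_id>/registry/repositories/<repository_id>/tags/<tag_name>"
```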

Q

How do I migrate 500 images from Docker Hub without losing my sanity?

A

Use Skopeo for bulk migrations: skopeo copy docker://docker.io/myimage docker://registry.gitlab.com/myproject/myimage. Don't use docker pull/tag/push for large migrations unless you enjoy waiting hours and hitting rate limits. Plan for downtime anyway because something always breaks during migration.
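For a few hundred images, a loop over skopeo copy is the least painful approach - a sketch assuming a plain-text image list and placeholder credentials and paths:

```bash
# images.txt holds one "name:tag" per line. --all copies every
# architecture in a manifest list, not just the host's platform.
while read -r image; do
  skopeo copy --all \
    --src-creds "$HUB_USER:$HUB_TOKEN" \
    --dest-creds "$GITLAB_USER:$GITLAB_TOKEN" \
    "docker://docker.io/${image}" \
    "docker://registry.gitlab.com/mygroup/myproject/${image}"
done < images.txt
```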

Q

Which storage backend should I use?

A

Local filesystem for dev/testing. S3 for production, unless you want your registry to die when the disk fills up. The v2 storage drivers are supposed to be better but they're still beta and have authentication quirks. Stick with the legacy drivers unless you're feeling adventurous.

Q

Why are my cleanup policies not freeing up space?

A

Because cleanup policies only mark images for deletion; garbage collection actually removes the data. If you're not on the metadata database, garbage collection requires downtime. If you are, it runs daily but can take multiple cycles to clean up shared layers. Storage bills lag behind actual cleanup by weeks.

