
GitHub Actions + Docker + AWS ECS CI/CD: AI-Optimized Technical Reference

Configuration Requirements

Docker Configuration That Actually Works

FROM node:18-alpine
WORKDIR /app

# Copy package files first for better caching
COPY package*.json ./
# --omit=dev replaces the deprecated --only=production flag
RUN npm ci --omit=dev && npm cache clean --force

# Copy app source
COPY . .

# Don't run as root
USER node
EXPOSE 3000
CMD ["npm", "start"]

Critical Docker Requirements:

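  • Multi-stage builds cut image size (for example from 1.5GB to 200MB) but break at runtime when a dependency the app needs is left behind in the build stage
  • npm ci --omit=dev (formerly --only=production) skips devDependencies, so it breaks builds that need TypeScript or other build tools
  • .dockerignore is mandatory - without it, COPY . . pulls in node_modules and .git, and images can reach 2GB
  • Build for the target platform explicitly: docker build --platform linux/amd64 (matters when building on ARM machines)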
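
To make the .dockerignore point concrete, here is a minimal sketch that measures how much of a project directory would be sent to the Docker build without and with an ignore list. The ignore list, the demo file names and the sizes are all hypothetical, and the matching is a simplification: real .dockerignore rules also support negation and nested globs.

```python
import fnmatch
import os
import tempfile

# Hypothetical ignore list. Only top-level names are matched here, which is
# simpler than real .dockerignore semantics.
IGNORED = ["node_modules", ".git", "*.log"]


def context_size(root, ignored=()):
    """Sum file sizes under root, skipping top-level entries matching `ignored`."""
    total = 0
    for entry in os.listdir(root):
        if any(fnmatch.fnmatch(entry, pattern) for pattern in ignored):
            continue
        path = os.path.join(root, entry)
        if os.path.isfile(path):
            total += os.path.getsize(path)
            continue
        for dirpath, _dirs, files in os.walk(path):
            total += sum(os.path.getsize(os.path.join(dirpath, f)) for f in files)
    return total


def make_demo_project():
    """Create a throwaway project tree so the sketch runs anywhere."""
    root = tempfile.mkdtemp()
    layout = [
        ("index.js", 100),
        ("node_modules/dep/big.js", 5000),
        (".git/objects/pack", 3000),
    ]
    for rel, size in layout:
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as fh:
            fh.write(b"x" * size)
    return root


demo = make_demo_project()
print("bytes without .dockerignore:", context_size(demo))
print("bytes with .dockerignore:   ", context_size(demo, IGNORED))
```

Pointing context_size at a real project root gives a quick estimate of what COPY . . would drag into the image.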

ECS Task Definition Requirements

{
  "family": "my-app",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::account:role/ecsTaskExecutionRole",
  "containerDefinitions": [{
    "name": "my-app",
    "image": "account.dkr.ecr.region.amazonaws.com/my-app:latest",
    "portMappings": [{"containerPort": 3000}],
    "environment": [
      {"name": "PORT", "value": "3000"}
    ],
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "/ecs/my-app",
        "awslogs-region": "us-east-1",
        "awslogs-stream-prefix": "ecs"
      }
    }
  }]
}

Critical ECS Specifications:

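  • CPU units: 256, 512, 1024, 2048 or 4096 (1024 units = 1 vCPU); values in between are rejected
  • Memory: Must be one of the values allowed for the chosen CPU size (256 CPU pairs with 512, 1024 or 2048 MiB) or task registration fails
  • Environment variables must be strings, not numbers
  • Health check endpoint: treat it as mandatory and expose /health; the load balancer needs a path that answers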
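
The CPU/memory pairing and the strings-only rule can be checked before a task definition is ever registered. The sketch below is one possible linter, not an AWS tool: the function names are made up, and the table covers only the 256-4096 CPU sizes discussed here (Fargate offers larger sizes that this sketch ignores).

```python
def valid_fargate_memory(cpu):
    """Return the memory values (MiB) Fargate accepts for a CPU value."""
    table = {
        256: [512, 1024, 2048],
        512: list(range(1024, 4096 + 1, 1024)),
        1024: list(range(2048, 8192 + 1, 1024)),
        2048: list(range(4096, 16384 + 1, 1024)),
        4096: list(range(8192, 30720 + 1, 1024)),
    }
    return table.get(cpu, [])


def check_task_definition(task_def):
    """Collect problems that would make ECS reject or misread the definition."""
    problems = []
    cpu, memory = int(task_def["cpu"]), int(task_def["memory"])
    if memory not in valid_fargate_memory(cpu):
        problems.append(f"cpu={cpu} does not pair with memory={memory}")
    for container in task_def["containerDefinitions"]:
        for var in container.get("environment", []):
            if not isinstance(var["value"], str):
                problems.append(f"env {var['name']} must be a string")
    return problems


good = {
    "cpu": "256",
    "memory": "512",
    "containerDefinitions": [
        {"name": "my-app", "environment": [{"name": "PORT", "value": "3000"}]}
    ],
}
bad = {
    "cpu": "256",
    "memory": "4096",
    "containerDefinitions": [
        {"name": "my-app", "environment": [{"name": "PORT", "value": 3000}]}
    ],
}
print(check_task_definition(good))
print(check_task_definition(bad))
```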

GitHub Actions OIDC Setup

name: Deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Configure AWS
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-role
        role-session-name: GitHubActions
        aws-region: us-east-1

OIDC Trust Policy:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/token.actions.githubusercontent.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
      },
      "StringLike": {
        "token.actions.githubusercontent.com:sub": "repo:username/repo:*"
      }
    }
  }]
}
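
OIDC federation goes through sts:AssumeRoleWithWebIdentity, and the sub condition is what limits the role to one repository. As a rough sketch, a trust policy can be linted for both before it is applied. The function name and the sample policies are illustrative, and the check assumes the sub condition is a single string rather than a list.

```python
SUB_KEY = "token.actions.githubusercontent.com:sub"


def lint_trust_policy(policy, expected_repo):
    """Return a list of problems found in a GitHub OIDC trust policy."""
    findings = []
    for statement in policy["Statement"]:
        actions = statement["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRoleWithWebIdentity" not in actions:
            findings.append("Action must include sts:AssumeRoleWithWebIdentity")
        conditions = statement.get("Condition", {})
        sub = (conditions.get("StringEquals", {}).get(SUB_KEY)
               or conditions.get("StringLike", {}).get(SUB_KEY))
        if sub is None:
            findings.append("no sub condition: any repository could assume the role")
        elif not sub.startswith(f"repo:{expected_repo}:"):
            findings.append(f"sub {sub!r} is not scoped to {expected_repo}")
    return findings


def make_policy(action, sub):
    """Build a one-statement trust policy for the demo below."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": action,
            "Condition": {"StringLike": {SUB_KEY: sub}},
        }],
    }


# Restricting sub to a branch ref is tighter than the repo-wide wildcard.
scoped = make_policy("sts:AssumeRoleWithWebIdentity",
                     "repo:username/repo:ref:refs/heads/main")
loose = make_policy("sts:AssumeRole", "repo:*")
print(lint_trust_policy(scoped, "username/repo"))
print(lint_trust_policy(loose, "username/repo"))
```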

Resource Requirements and Costs

Time Investment

  • Week 1: Docker builds locally, dependency conflicts
  • Week 2: GitHub Actions ECR integration, OIDC debugging
  • Week 3: ECS task execution, security group issues
  • Week 4: Health check configuration, environment variables
  • Week 5: Production deployment, database migration integration
  • Week 6: Monitoring setup, false alarm resolution

Total Implementation Time: 6 weeks minimum

Cost Breakdown

  • GitHub Actions: $0.008/minute - inefficient builds cost $100+/month
  • ECR: $0.10/GB/month - 50 old images = $60/month without lifecycle policies
  • Fargate: $0.04048/vCPU/hour - small app (0.25 vCPU) = about $7.40/month for CPU alone; memory is billed separately
  • Load Balancer: $16/month constant cost
  • CloudWatch Logs: Cheap with 30-day retention (ingestion and storage are billed per GB), expensive with indefinite retention
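
The arithmetic behind these figures is simple enough to script. The sketch below uses only the rates quoted in this list; the 730-hour month, the 30-day billing period, the 2 GB image size and the function names are assumptions made for illustration.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def fargate_vcpu_cost(vcpu, rate_per_vcpu_hour=0.04048):
    """Monthly CPU charge for an always-on task (memory is billed separately)."""
    return vcpu * rate_per_vcpu_hour * HOURS_PER_MONTH


def actions_cost(minutes_per_build, builds_per_day, rate_per_minute=0.008, days=30):
    """Monthly GitHub Actions charge for a given build time and frequency."""
    return minutes_per_build * builds_per_day * days * rate_per_minute


def ecr_cost(image_count, gb_per_image, rate_per_gb=0.10):
    """Monthly ECR storage charge, ignoring layer sharing between images."""
    return image_count * gb_per_image * rate_per_gb


print(f"Fargate 0.25 vCPU:        ${fargate_vcpu_cost(0.25):.2f}/month")
print(f"Actions, 45-min builds:   ${actions_cost(45, 10):.2f}/month")
print(f"Actions, 3-min builds:    ${actions_cost(3, 10):.2f}/month")
print(f"ECR, 50 images x 2 GB:    ${ecr_cost(50, 2):.2f}/month")
```

With these inputs, ten 45-minute builds a day come to about $108/month, and 50 images at 2 GB each to $10/month. A $60/month ECR bill would correspond to roughly 600 GB of stored images.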

Expertise Requirements

  • Docker: Multi-stage builds, layer caching, platform differences
  • AWS Networking: Security groups, subnets, NAT gateways
  • ECS: Task definitions, service configuration, deployment strategies
  • GitHub Actions: YAML syntax, OIDC setup, secret management

Critical Warnings and Failure Modes

Docker Build Failures

Symptom: Builds work locally, fail in CI
Root Causes:

  • Different architectures (ARM vs x86)
  • Missing .dockerignore
  • Hard-coded paths breaking on Linux
  • Dependency on local environment variables

Solution: Use --platform linux/amd64, proper .dockerignore

ECS Task Death (Exit Code 1)

Most Common Causes:

  1. Port mismatch: App listens on 3000, task definition expects 80
  2. Missing environment variables: DATABASE_URL undefined
  3. Wrong working directory: App expects /usr/src/app, Dockerfile uses /app
  4. Permission issues: Running as root locally, restricted in container

Debugging: Check CloudWatch logs, not ECS console messages
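
The first two causes can be caught before deploying by comparing the container definition with what the app expects. This is a hedged sketch: preflight and the required-variable list are hypothetical, and it assumes the app reads its listen port from a PORT variable, as in the task definition shown earlier.

```python
def preflight(container, required_env):
    """Compare a container definition against what the app expects at startup."""
    env = {var["name"]: var["value"] for var in container.get("environment", [])}
    # Values injected through "secrets" also arrive as environment variables.
    names = set(env) | {s["name"] for s in container.get("secrets", [])}
    problems = [f"missing env var {name}" for name in required_env if name not in names]
    ports = {str(m["containerPort"]) for m in container.get("portMappings", [])}
    if "PORT" in env and env["PORT"] not in ports:
        problems.append(
            f"app listens on {env['PORT']} but mapped ports are {sorted(ports)}"
        )
    return problems


container = {
    "name": "my-app",
    "portMappings": [{"containerPort": 3000}],
    "environment": [{"name": "PORT", "value": "3000"}],
}
mismatched = dict(container, portMappings=[{"containerPort": 80}])

print(preflight(container, ["PORT", "DATABASE_URL"]))
print(preflight(mismatched, ["PORT"]))
```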

Networking Failures

Symptoms: App runs but users can't reach it
Check Order:

  1. Security groups - ECS task needs outbound internet rules
  2. Load balancer target groups - health check path must exist
  3. Subnet routing - public subnets for ALB, private for ECS
  4. NAT Gateway - private subnets need internet for outbound calls

GitHub Actions Cost Explosions

Root Causes:

  • No Docker layer caching (default disabled)
  • Using npm install instead of npm ci
  • Rebuilding unchanged dependencies
  • No node_modules caching

Build Time Reduction: 45 minutes → 3 minutes with proper caching

Database Migration Disasters

Anti-Pattern: Running migrations in app container
Consequence: "App is up but database is broken" scenarios
Solution: Separate migration task definition, run before app deployment
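
The ordering can be sketched as a small gate: run the migration task, and only touch the service if it exits cleanly. The two callables are placeholders. In a real pipeline, run_migration would presumably wrap aws ecs run-task and aws ecs wait tasks-stopped and read the container exit code from aws ecs describe-tasks, while update_service would wrap aws ecs update-service.

```python
def deploy(run_migration, update_service):
    """Run the migration first; update the service only if it exits with 0."""
    exit_code = run_migration()
    if exit_code != 0:
        raise RuntimeError(f"migration exited with {exit_code}; app not deployed")
    update_service()
    return "deployed"


# Stand-ins for the real AWS calls so the ordering can be shown offline.
calls = []


def fake_migration_ok():
    calls.append("migrate")
    return 0


def fake_migration_failing():
    calls.append("migrate")
    return 1


def fake_update_service():
    calls.append("update-service")


print(deploy(fake_migration_ok, fake_update_service), calls)
```

A failed migration raises before update_service is called, so the running version keeps serving traffic against the unchanged schema.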

Decision Support Matrix

Deployment Strategies

Strategy | AWS Claims | Reality | Failure Mode | Recommendation
Rolling | "Gradual replacement" | Half traffic hits new code immediately | Health checks fail, ECS kills everything | Use unless you're Netflix
Blue-Green | "Zero downtime" | Costs 2x for 10 minutes, works perfectly | Database migrations break everything | Skip unless handling money
Canary | "Risk mitigation" | More config time than deploy time | 5% users get broken experience | Only with dedicated DevOps team
Recreate | "Simple strategy" | Site down 3 minutes | Users notice, support tickets | Never use in production

Alternative Platforms

Platform | Cost | Complexity | Best For
ECS | $100+/month | High | Fine-grained control, AWS integration
Render | $7/month | Low | Side projects, "just works"
Railway | Good free tier | Low | Similar to Render
Fly.io | Moderate | Medium | Balance of control and simplicity

Recommendation: Only use ECS if you need production-grade orchestration or AWS service integration

Implementation Checklist

Initial Setup

  • Create ECR repository with lifecycle policies
  • Set up Fargate cluster (not EC2)
  • Configure OIDC provider and IAM role
  • Create task definition with proper resource limits
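
For the lifecycle policy item, a minimal sketch of a rule that expires everything beyond the most recent N images is shown below. The keep-last count of 10 and the function name are assumptions. The JSON it prints follows the ECR lifecycle policy format and could be saved to a file and applied with aws ecr put-lifecycle-policy.

```python
import json


def lifecycle_policy(keep_last=10):
    """Build an ECR lifecycle policy that expires all but the newest images."""
    return {
        "rules": [{
            "rulePriority": 1,
            "description": f"Keep only the {keep_last} most recent images",
            "selection": {
                "tagStatus": "any",
                "countType": "imageCountMoreThan",
                "countNumber": keep_last,
            },
            "action": {"type": "expire"},
        }]
    }


print(json.dumps(lifecycle_policy(), indent=2))
```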

Docker Optimization

  • Multi-stage build structure
  • Proper .dockerignore file
  • Platform-specific builds
  • Node.js memory limits: --max-old-space-size=400
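
The 400 figure fits a 512 MiB task with some headroom left for non-heap memory. One way to derive a value for other task sizes is sketched here. The 20% headroom and the rounding step are assumptions, not a Node.js or AWS rule.

```python
def max_old_space_size(task_memory_mib, headroom=0.2, step=50):
    """Suggest a --max-old-space-size value (MiB) below the task memory limit."""
    usable = task_memory_mib * (1 - headroom)
    # Round down so the heap limit never creeps above the usable share.
    return int(usable // step) * step


for memory in (512, 1024, 2048):
    print(f"{memory} MiB task: --max-old-space-size={max_old_space_size(memory)}")
```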

GitHub Actions Configuration

  • OIDC authentication (no AWS keys in repo)
  • Docker layer caching enabled
  • Node modules caching
  • Parallel build steps where possible

Monitoring and Alerting

  • Health check endpoint: GET /health
  • CloudWatch alarms: task count, CPU, memory, error rate
  • Log retention set to 30 days
  • Cost monitoring enabled
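
The GET /health item can also be wired into the container definition as an ECS healthCheck block. The sketch below generates one. The interval, timeout, retry and start-period values are illustrative, and it assumes wget is available in the image (Alpine-based images ship the BusyBox version).

```python
import json


def container_health_check(port, path="/health"):
    """Build the healthCheck block for an ECS container definition."""
    return {
        # CMD-SHELL runs the string through the container's shell.
        "command": ["CMD-SHELL", f"wget -qO- http://localhost:{port}{path} || exit 1"],
        "interval": 30,     # seconds between checks
        "timeout": 5,       # seconds before a check counts as failed
        "retries": 3,       # consecutive failures before the task is unhealthy
        "startPeriod": 10,  # grace period while the app boots
    }


print(json.dumps(container_health_check(3000), indent=2))
```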

Security Hardening

  • Run containers as non-root user
  • Security groups properly configured
  • No secrets in environment variables
  • Regular base image updates
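
For the secrets item, ECS container definitions accept a secrets list whose entries point at Secrets Manager or SSM Parameter Store through valueFrom, instead of plain values in environment. The sketch below flags likely offenders. The name hints and the example ARN are hypothetical.

```python
# Hypothetical hints; extend to match your own naming conventions.
SECRET_HINTS = ("SECRET", "PASSWORD", "TOKEN", "API_KEY", "DATABASE_URL")


def find_plaintext_secrets(container):
    """Return names in `environment` that look like they hold secrets."""
    return [
        var["name"]
        for var in container.get("environment", [])
        if any(hint in var["name"].upper() for hint in SECRET_HINTS)
    ]


def to_secret_ref(name, arn):
    """Build the entry that belongs in the container's `secrets` list instead."""
    return {"name": name, "valueFrom": arn}


container = {
    "name": "my-app",
    "environment": [
        {"name": "PORT", "value": "3000"},
        {"name": "DATABASE_URL", "value": "postgres://user:pass@host/db"},
    ],
}
print(find_plaintext_secrets(container))
# The ARN below is a made-up placeholder, not a real resource.
print(to_secret_ref("DATABASE_URL", "arn:aws:ssm:us-east-1:ACCOUNT:parameter/my-app/database-url"))
```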

Breaking Points and Thresholds

Performance Limits

  • UI breaks at 1000+ spans: Makes debugging large distributed transactions impossible
  • Build timeout: 6 hours GitHub Actions limit with inefficient Docker builds
  • Memory limits: Without --max-old-space-size, the Node.js heap can grow past the task's memory limit and the task gets killed
  • Health check failures: Tasks are stopped and replaced in a loop, roughly every 30 seconds, if the endpoint doesn't respond

Cost Thresholds

  • $100/month: Reasonable for production workload
  • $400/month: Usually indicates misconfiguration (old images, oversized containers)
  • 10 deploys/day: $0.36 per deploy with 45-minute builds at $0.008/minute, so $3.60/day or about $108/month

Scaling Considerations

  • Fargate vs EC2: 3x cost premium for Fargate worth it for mental health
  • Single AZ vs Multi-AZ: Cross-AZ data transfer charges add up quickly
  • Auto-scaling thresholds: CPU >80%, Memory >85% trigger scaling

Common Misconceptions

  1. "ECS is like Docker Compose": ECS networking is quantum physics-level complex
  2. "Fargate is expensive": Mental health cost of EC2 debugging exceeds price difference
  3. "Health checks are optional": Behind a load balancer, a missing or failing health endpoint makes ECS replace tasks about every 30 seconds
  4. "GitHub Actions is cheap": Inefficient builds cost $100+/month
  5. "Multi-stage builds always help": They introduce dependency management complexity

Success Criteria

Working System Indicators

  • Deploy to main triggers automatic build and deployment
  • Health checks pass consistently
  • Application logs visible in CloudWatch
  • No manual server access required
  • Rollback capability through ECS service updates

Performance Benchmarks

  • Build time: <5 minutes after first run
  • Deployment time: <10 minutes end-to-end
  • Zero downtime: Rolling deployments work without service interruption
  • Cost predictability: Monthly AWS bill variance <20%

