Every serverless platform promises "deploy in minutes" until you actually try it. Modal's no different - their docs show you the happy path, but real deployments hit issues their getting started guide conveniently skips.
I've been through this setup hell multiple times. Here's what actually breaks and how to fix it without losing your sanity.
Installation That Actually Works
Skip their cute pip install modal line if you value your time. Do this instead:
# Create a clean environment first - mixing Modal with existing packages is asking for trouble
python -m venv modal-env
source modal-env/bin/activate # or modal-env\Scripts\activate on Windows
# Install with constraints to avoid the protobuf nightmare
pip install --upgrade pip setuptools wheel
# Quote the spec so the shell doesn't treat ">=" as a redirect
pip install "modal>=0.65.0"
The Protobuf Horror Story
Real error from December 2024 that the docs don't mention:
TypeError: Couldn't build proto file into descriptor pool:
field with proto3_optional was not in a oneof (modal.options.audit_target_attr)
This happens when your system has conflicting protobuf versions. The fix:
pip uninstall protobuf grpcio grpcio-status
pip install protobuf==4.25.1 grpcio==1.58.0
pip install modal
If that doesn't work, nuke everything:
pip freeze | grep -E "(protobuf|grpcio|modal)" | xargs pip uninstall -y
pip install modal
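Before you uninstall anything, it helps to see which versions are actually sitting in the environment. Here's a minimal sketch using only the standard library; the helper name installed_versions is made up for illustration and isn't part of Modal:

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The packages involved in the conflict described above
    packages = ["protobuf", "grpcio", "grpcio-status", "modal"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the virtual environment you plan to use for Modal, otherwise you're inspecting the wrong site-packages.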
Authentication That Doesn't Suck
modal setup works... until it doesn't. Common failures:
Browser Won't Open
# Skip the browser circus
modal setup --no-browser
# Copy the URL it prints and paste into your browser manually
Corporate VPN/Firewall Issues
# Set your company's proxy if needed
export HTTPS_PROXY=http://your-proxy:8080
modal setup
"Authentication failed" Errors
# Clear the broken auth state (Modal keeps its token in ~/.modal.toml)
rm -f ~/.modal.toml
modal setup
Your First Function That Actually Deploys
Forget their hello world example. Here's what you should test first:
import modal

app = modal.App("test-deploy")

@app.function()
def test_basic():
    import sys
    print(f"Python version: {sys.version}")
    print("If you see this, Modal is working")
    return "success"

@app.local_entrypoint()
def main():
    print("Testing local call...")
    print(test_basic.local())
    print("Testing remote call...")
    print(test_basic.remote())
Save as test_modal.py and run:
modal run test_modal.py
Common Import Hell and How to Escape It
ModuleNotFoundError: The Greatest Hits
"No module named 'your_custom_module'"
Modal doesn't see your local modules. Two fixes:
Option 1: Include your code in the image
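import modal

app = modal.App("my-app")
# pip_install takes package names; a requirements file needs pip_install_from_requirements
image = modal.Image.debian_slim().pip_install_from_requirements("your-requirements.txt")

# Mount your local code (newer Modal releases deprecate Mount in favor of image.add_local_dir)
@app.function(image=image, mounts=[modal.Mount.from_local_dir(".", remote_path="/app")])
def your_function():
    import sys
    sys.path.append("/app")
    import your_module  # Now this works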
Option 2: Package your shit properly
# Create a proper Python package
pip install build
python -m build
pip install dist/your_package-*.whl
"Import works locally but fails on Modal"
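Check for a Python version mismatch:
@app.function()
def debug_environment():
    import sys, platform
    print(f"Python: {sys.version}")
    print(f"Platform: {platform.platform()}")
    print(f"Path: {sys.path}")
Don't assume the container runs the same Python you do. If it's on 3.11 and you're on 3.9 locally, things break. Pin the version on the image, e.g. modal.Image.debian_slim(python_version="3.11"), so both sides agree.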
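If you'd rather not eyeball version strings, here's a small sketch that compares the "Python:" line printed by debug_environment with your local interpreter. The helper names minor_version and versions_match are hypothetical, and it assumes only the major.minor pair matters for compatibility:

```python
import sys


def minor_version(version_string):
    """Extract (major, minor) from a string like '3.11.4 (main, ...)'."""
    major, minor = version_string.split()[0].split(".")[:2]
    return int(major), int(minor)


def versions_match(local, remote):
    """True when both version strings share the same major.minor pair."""
    return minor_version(local) == minor_version(remote)


if __name__ == "__main__":
    # Paste the version string printed by the remote function here
    remote_version = "3.11.9"
    print(f"Local {sys.version.split()[0]} matches remote: "
          f"{versions_match(sys.version, remote_version)}")
```

Patch-level differences (3.11.4 vs 3.11.9) are ignored on purpose, since compiled wheels are normally tied to the minor version.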
Circular Import Hell
Modal's decorator magic chokes on circular imports that work fine locally:
ImportError: cannot import name 'function_a' from partially initialized module
Fix: Break the circular dependency or lazy import:
@app.function()
def problematic_function():
    # Don't import at module level
    from .other_module import needed_function
    return needed_function()
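To see why the lazy import helps, here's a self-contained sketch with no Modal involved. It writes two throwaway module pairs to a temp directory: one pair imports each other at module level, the other defers one import into the function body. All module and function names are made up for the demo:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway modules for the demo (hypothetical names)
MODULES = {
    # Eager pair: each module imports the other at module level
    "eager_a.py": "from eager_b import helper\n\ndef function_a():\n    return 'a'\n",
    "eager_b.py": "from eager_a import function_a\n\ndef helper():\n    return function_a()\n",
    # Lazy pair: lazy_b defers its import until helper() is called
    "lazy_a.py": "from lazy_b import helper\n\ndef function_a():\n    return 'a'\n",
    "lazy_b.py": "def helper():\n    from lazy_a import function_a\n    return function_a()\n",
}


def run_demo():
    """Return (eager_error, lazy_result) for the two module pairs."""
    with tempfile.TemporaryDirectory() as tmp:
        for filename, source in MODULES.items():
            Path(tmp, filename).write_text(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            try:
                importlib.import_module("eager_a")
                eager_error = None
            except ImportError as exc:
                eager_error = str(exc)
            lazy_a = importlib.import_module("lazy_a")
            lazy_result = lazy_a.helper()
        finally:
            sys.path.remove(tmp)
    return eager_error, lazy_result


if __name__ == "__main__":
    error, result = run_demo()
    print(f"Eager pair failed with: {error}")
    print(f"Lazy pair returned: {result}")
```

The eager pair fails because eager_b asks for function_a before eager_a has finished defining it. The lazy pair works because by the time helper() runs, lazy_a is fully loaded.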
Network and Container Failures
"Connection refused" Errors
Your container can't reach external APIs. Common causes:
- Corporate firewall blocking Modal's IPs
- API keys not available in container
- Wrong region selected
Debug with:
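# Make sure requests is in the image, or the import itself fails
@app.function(image=modal.Image.debian_slim().pip_install("requests"))
def test_network():
    import requests
    try:
        # A timeout keeps a blocked connection from hanging the function
        resp = requests.get("https://httpbin.org/ip", timeout=10)
        print(f"External IP: {resp.json()}")
        return "Network OK"
    except Exception as e:
        print(f"Network failed: {e}")
        return "Network broken"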
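A bare "Network failed" doesn't tell you which of those causes you hit. Here's a rough sketch, standard library only, that separates DNS failures from refused connections and timeouts. The function name probe and its return labels are my own invention, not anything Modal provides:

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify a TCP connection attempt as 'ok', 'dns', 'refused', 'timeout', or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.gaierror:
        return "dns"  # Hostname did not resolve
    except ConnectionRefusedError:
        return "refused"  # Host reachable, nothing accepting on that port
    except socket.timeout:
        return "timeout"  # Packets likely dropped, e.g. by a firewall
    except OSError:
        return "error"  # Anything else (unreachable network, etc.)
```

Call it from inside a Modal function with the host and port of the API you're trying to reach, e.g. probe("httpbin.org", 443). A "timeout" usually points at a firewall, while "refused" points at the target service itself.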
Container Startup Timeouts
Timeout: Function did not become ready within 300 seconds
Your image is too fucking big or your imports take forever. Fix:
# Minimize the image
image = modal.Image.debian_slim().pip_install([
    "numpy==1.24.0",  # Pin versions
    "torch==2.1.0"
])

# Don't import heavy libraries at module level
@app.function(image=image)
def lightweight_function():
    # Import only when needed
    import torch  # This happens after container starts
    return "OK"
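To find out which import is eating your startup budget, time them one at a time. This is a minimal sketch; time_import is a hypothetical helper, and it only gives meaningful numbers the first time a module is imported in a process:

```python
import importlib
import time


def time_import(module_name):
    """Import a module and return how many seconds the import took."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Swap in the heavy libraries from your own image
    for name in ["json", "decimal", "email"]:
        print(f"{name}: {time_import(name):.4f}s")
```

For a per-module breakdown, Python's built-in python -X importtime -c "import torch" prints the cost of every nested import.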
Memory and Resource Failures
OOMKilled Before You Even Start
Container killed: exit code 137 (OOMKilled)
Default Modal containers get 1GB RAM. Your imports use more. Fix:
@app.function(memory=4096)  # 4GB
def memory_hungry():
    import pandas as pd
    import torch
    # Now you won't die immediately
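Rather than guessing at a memory value, measure what your imports actually cost. This is a sketch assuming a Unix-like system (the resource module doesn't exist on Windows); peak_rss_mib is a made-up helper name:

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident memory of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports KiB, macOS reports bytes
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


if __name__ == "__main__":
    before = peak_rss_mib()
    import json  # Replace with your heavy imports, e.g. pandas or torch
    after = peak_rss_mib()
    print(f"Peak RSS before: {before:.1f} MiB, after: {after:.1f} MiB")
```

Run it locally with your real imports, then set memory to the peak plus some headroom for the actual workload.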
GPU Not Found
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA version mismatch. Modal's CUDA 12.1 doesn't work with PyTorch compiled for 11.8:
# Use NVIDIA's pre-built PyTorch image, which ships a matching CUDA/PyTorch pair
from modal import Image
# No add_python here: the image already has Python with torch installed,
# and adding a second interpreter is likely to hide that install
gpu_image = Image.from_registry("nvcr.io/nvidia/pytorch:24.01-py3")

@app.function(gpu="T4", image=gpu_image)
def gpu_function():
    import torch
    print(f"CUDA available: {torch.cuda.is_available()}")
    print(f"GPU count: {torch.cuda.device_count()}")
The Debug Hell and How to Escape
Container Logs Disappear
Modal's logging is shit for debugging. Get better logs:
import logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    force=True,  # Without this, basicConfig does nothing if handlers already exist
)

@app.function()
def debug_function():
    logging.debug("This will actually show up")
    print("Regular print still works")
    # Your code here
Interactive Debugging
When everything breaks, drop into a shell:
modal shell
Or debug specific functions:
@app.function()
def broken_function():
    # Add breakpoint for container debugging
    # (the debugger needs a terminal, so run with: modal run -i your_file.py)
    import pdb; pdb.set_trace()
    # Your broken code here
Secrets Not Loading
Your API keys aren't available in the container:
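# Create the secret in Modal dashboard first
@app.function(secrets=[modal.Secret.from_name("my-api-key")])
def api_function():
    import os
    # The variable name must match a key defined inside the secret, not the secret's name
    # Use .get() so a missing key reaches the check below instead of raising KeyError
    api_key = os.environ.get("API_KEY")
    if not api_key:
        raise ValueError("API key not found - check your secret setup")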
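If you read more than one key, a tiny helper saves you from repeating the check. A minimal sketch; require_env is a hypothetical name, and DEMO_API_KEY is just a placeholder variable:

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set - check that the secret is attached and defines this key"
        )
    return value


if __name__ == "__main__":
    os.environ["DEMO_API_KEY"] = "abc123"  # Stand-in for a value injected by a secret
    print(require_env("DEMO_API_KEY"))
```

Failing at the top of the function with the variable name in the message beats a KeyError buried halfway through a request.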
Time and Money Saving Tips
Test Locally First
Always run .local() before .remote():
# Debug locally first
result = my_function.local(test_input)
print(f"Local result: {result}")
# Only then test remote
result = my_function.remote(test_input)
Minimize Cold Starts
Modal's "sub-second" startup is bullshit for real models. Optimize:
# Pre-load heavy stuff
@app.function(
    image=my_image,
    keep_warm=1,  # Keep one container hot (newer Modal releases call this min_containers)
    memory=4096
)
def optimized_function(input_data):
    # This happens once per container
    global model
    if 'model' not in globals():
        model = load_heavy_model()
    # This happens per request
    return model.predict(input_data)
Budget Protection
Set spending limits before you accidentally train a model all weekend:
# In your Modal dashboard, set billing alerts
# CLI doesn't have budget controls (because of course it doesn't)
When to Give Up on Modal
Sometimes Modal isn't the answer:
- Complex networking requirements - their abstractions get in the way
- Bare metal performance needs - serverless overhead kills performance
- Always-on workloads - reserved instances are cheaper
- Multi-language projects - Python-only limitation
Don't force it. Use the right tool for the job.