Claude's code execution tool runs Python 3.11 in isolated containers that crash when you least expect it. You get about 950MB of RAM before the container kills your process without warning; I learned this loading a 180MB CSV that crashed six times before I figured out pandas was eating 800MB just parsing it. The environment ships the usual data science libraries (pandas 1.5.3, numpy 1.24.3, matplotlib 3.7.1, scikit-learn 1.2.2), but don't expect bleeding-edge versions; they run roughly eight months behind the latest releases.
Container Environment Limits
Python 3.11.7 running on Ubuntu Linux, locked down tight with no internet access. You can't pip install anything; what's included is what you get. I found this out the hard way when I tried to use pandas.DataFrame.map() (added in pandas 2.1.0) against their 1.5.3 version and spent three hours chasing an AttributeError, convinced I was going crazy.
What you're actually working with:
- Memory: 950MB before OOMKilled with zero warning
- Storage: 2GB temp space until you load a real dataset
- Libraries: Standard data science stack, 6-8 months behind latest
- Internet: Completely blocked for security (can't even ping 8.8.8.8)
- Container cost: $0.05/hour until you forget cleanup and hit $1200/month
- Lifetime: Max 1 hour, often dies after 45 minutes for no reason
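Those memory numbers matter before you ever send a file. In my experience pandas needs roughly 4-5x a CSV's on-disk size just to parse it (the 180MB file above peaked around 800MB), so a pre-flight check like this saves a lot of dead containers. The multiplier, headroom, and 950MB ceiling below are assumptions taken from the figures above, not documented limits:

```typescript
// Rough pre-flight check: can pandas even parse this file inside the
// container? The 950MB ceiling and ~4.5x parse overhead are assumptions
// based on the numbers observed above, not documented limits.
const CONTAINER_MEMORY_BYTES = 950 * 1024 * 1024;
const PANDAS_PARSE_MULTIPLIER = 4.5;

export function fitsInContainer(fileSizeBytes: number): boolean {
  // Leave headroom for Python itself plus imported libraries (~150MB is a guess)
  const headroom = 150 * 1024 * 1024;
  return fileSizeBytes * PANDAS_PARSE_MULTIPLIER + headroom < CONTAINER_MEMORY_BYTES;
}
```

A 25MB upload passes; the 180MB CSV that kept OOM-killing my containers fails the check before any money gets spent.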
How This Actually Works with Express
Your Express app turns into a container babysitter - managing async operations that hang forever, containers that die mid-execution, and error messages that tell you absolutely nothing useful. Simple text completion responds in 2-3 seconds, but code execution takes 30-90 seconds and fails in ways you never imagined possible.
What You Actually Need:
{
  "dependencies": {
    "@anthropic-ai/sdk": "^0.28.0", // They fix critical bugs weekly, stay current
    "express": "^4.19.2", // Any recent version works fine
    "multer": "^1.4.5", // For file uploads that will definitely break everything
    "helmet": "^7.1.0" // Basic security; containers will still bankrupt you
  }
}
The Simplest Setup That Actually Works
This handles basic Python execution. Works great for math and small datasets. Breaks horribly when you try anything complex.
What You'll Actually Build:
// services/codeExecutionService.ts
import Anthropic from '@anthropic-ai/sdk';
export class CodeExecutionService {
  // protected, not private, so subclasses like FileProcessingService can reuse the client
  protected client: Anthropic;

  constructor(apiKey: string) {
    this.client = new Anthropic({
      apiKey,
      defaultHeaders: {
        'anthropic-beta': 'code-execution-2024-10-22' // This beta header breaks if you use the wrong date
      }
    });
  }

  // TODO: this error handling is terrible but it works
  async executeAnalysis(prompt: string, containerId?: string) {
    try {
      // Cross your fingers the container doesn't randomly die mid-execution
      const response = await this.client.messages.create({
        model: 'claude-3-5-sonnet-20241022', // Current Sonnet; Opus costs 3x more for the same output
        max_tokens: 4096, // You'll need all of it for real output
        container: containerId, // Reuse containers or pay $0.05/hour for each new one
        messages: [{ role: 'user', content: prompt }],
        tools: [{
          type: 'code_execution_20241022',
          name: 'code_execution'
        }]
      });
      return {
        content: response.content,
        containerId: response.container?.id, // Save this or lose your session state
        expiresAt: response.container?.expires_at // Usually 1 hour, sometimes 47 minutes
      };
    } catch (error: any) {
      // Error messages are cryptic; log everything you can
      console.error('Container execution failed:', error);
      throw new Error(`Execution failed: ${error?.message || 'Container died for mysterious reasons'}`);
    }
  }
}
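Since containers really do die mid-execution, I ended up wrapping executeAnalysis in a retry helper. This is a generic sketch; the attempt count and backoff values are my own defaults, not anything the API prescribes:

```typescript
// Generic retry with exponential backoff. Retries everything, which is crude;
// in practice you'd skip retries on client-side (4xx-style) errors.
export async function withRetries<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 2000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // 2s, 4s, 8s... a container that dies once often comes back on the next attempt
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

Usage: `const result = await withRetries(() => codeExecution.executeAnalysis(prompt, containerId));`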
The Express Route (With Realistic Timeouts):
// routes/analysis.ts
import { Router } from 'express';
import { CodeExecutionService } from '../services/codeExecutionService';
const router = Router();
const codeExecution = new CodeExecutionService(process.env.ANTHROPIC_API_KEY!);
router.post('/analyze', async (req, res) => {
  const { prompt, containerId } = req.body;
  // Set a realistic timeout - containers hang for 3+ minutes when they feel like it
  req.setTimeout(180000); // 3 minutes max, learned this after 100+ timeouts
  try {
    const result = await codeExecution.executeAnalysis(prompt, containerId);
    res.json({
      success: true,
      result: result.content,
      container: {
        id: result.containerId, // Store this for subsequent requests or you start over
        expiresAt: result.expiresAt // Usually ~57 minutes out, not the full hour
      }
    });
  } catch (error: any) {
    // Log everything for debugging - you'll need every bit of context
    console.error('Analysis failed:', error);
    res.status(500).json({
      error: 'Container execution failed',
      message: error?.message || 'Container died for no discernible reason'
    });
  }
});
export { router as analysisRoutes };
File Processing (Where Things Get Expensive)
File uploads plus code execution works fine for 5MB test files and crashes spectacularly on anything real. Our first week in production, a user uploaded a 284MB Excel file with 900,000 rows and killed every container we had running. It took 6 hours to figure out that containers were hanging around because cleanup wasn't working right; we had 47 dead containers burning through cash, and the March bill came to $1,847.32 instead of the expected $200. Check out the Multer documentation for file handling and Node.js streams for large file processing. The Files API limits are real and will bite you.
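The cheapest defense against the 284MB upload is rejecting it before it ever reaches a container. When a file exceeds multer's configured size limit, multer raises a MulterError with code 'LIMIT_FILE_SIZE'; the sketch below keeps that logic framework-free by only inspecting the error code (the response shape and 25MB message are my choices, matching the service limit below):

```typescript
// Decide how to respond to an upload error. multer reports oversized files
// with a MulterError whose code is 'LIMIT_FILE_SIZE'; we only inspect that
// code here so the helper stays framework-free and easy to test.
interface UploadErrorResponse {
  status: number;
  body: { error: string; message: string };
}

export function classifyUploadError(err: { code?: string; message?: string }): UploadErrorResponse {
  if (err.code === 'LIMIT_FILE_SIZE') {
    return {
      status: 413, // Payload Too Large
      body: {
        error: 'File too large',
        message: 'Uploads are capped at 25MB; larger files just OOM the container anyway.'
      }
    };
  }
  return {
    status: 500,
    body: { error: 'Upload failed', message: err.message ?? 'Unknown upload error' }
  };
}
```

Wire it into an Express error-handling middleware after your multer-backed route, and oversized files bounce at the HTTP layer instead of billing you for a doomed container.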
File Processing Service:
// services/fileProcessingService.ts
import multer from 'multer';
import { Readable } from 'stream';
import { CodeExecutionService } from './codeExecutionService';

// Note: this only compiles if `client` is protected (not private) on CodeExecutionService
export class FileProcessingService extends CodeExecutionService {
  private upload = multer({
    storage: multer.memoryStorage(),
    limits: { fileSize: 25 * 1024 * 1024 } // 25MB limit, learned after pandas crashed trying to load 50MB
  });

  async uploadAndAnalyze(file: Express.Multer.File, analysisPrompt: string) {
    // Upload the file to the Claude Files API
    const uploadedFile = await this.client.files.create({
      file: new Readable({
        read() {
          this.push(file.buffer);
          this.push(null);
        }
      }),
      purpose: 'user_data'
    });

    // Process with code execution (pray it doesn't crash)
    const response = await this.client.messages.create({
      model: 'claude-3-5-sonnet-20241022', // Opus is 3x more expensive and not worth it here
      max_tokens: 8192,
      messages: [{
        role: 'user',
        content: [
          { type: 'text', text: analysisPrompt },
          { type: 'container_upload', file_id: uploadedFile.id }
        ]
      }],
      tools: [{
        type: 'code_execution_20241022',
        name: 'code_execution'
      }]
    });
    return response;
  }
}
Container Management for Stateful Workflows
Container reuse enables sophisticated workflows where previous calculations, loaded datasets, or generated files persist across requests. This is essential for iterative data analysis, multi-step processing pipelines, or maintaining user-specific work environments.
Container Lifecycle Management:
// services/containerManager.ts
import { CodeExecutionService } from './codeExecutionService';

interface ContainerInfo {
  containerId: string;
  createdAt: Date;
  lastUsed: Date;
  expiresAt: Date;
}

export class ContainerManager {
  private containers = new Map<string, ContainerInfo>();

  constructor(private codeExecution: CodeExecutionService) {}

  // Containers are created lazily by the API: the first request sent without a
  // container id gets a fresh one back in the response, and we remember it here.
  private rememberWorkspace(userId: string, containerId: string) {
    this.containers.set(userId, {
      containerId,
      createdAt: new Date(),
      lastUsed: new Date(),
      expiresAt: new Date(Date.now() + 60 * 60 * 1000) // 1 hour
    });
  }

  async getWorkspace(userId: string): Promise<string | null> {
    const container = this.containers.get(userId);
    if (!container || container.expiresAt < new Date()) {
      this.containers.delete(userId);
      return null;
    }
    container.lastUsed = new Date();
    return container.containerId;
  }

  async executeInWorkspace(userId: string, prompt: string) {
    const containerId = await this.getWorkspace(userId);
    const result = await this.codeExecution.executeAnalysis(prompt, containerId ?? undefined);
    if (result.containerId) {
      this.rememberWorkspace(userId, result.containerId);
    }
    return result;
  }
}
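The $1,847 bill came from workspace entries that never got cleaned up. A periodic sweep is the cheapest fix. This sketch prunes any map of records carrying an expiresAt, so it can sit alongside the manager above; the 5-minute interval is my own default, not a requirement:

```typescript
// Periodically prune expired workspace entries so stale container ids don't
// pile up. Generic over any map of records with an expiresAt timestamp.
export function startExpirySweep<K>(
  entries: Map<K, { expiresAt: Date }>,
  intervalMs = 5 * 60 * 1000
): NodeJS.Timeout {
  const timer = setInterval(() => {
    const now = new Date();
    for (const [key, info] of entries) {
      if (info.expiresAt < now) {
        // The container already expired server-side; drop our reference to it
        entries.delete(key);
      }
    }
  }, intervalMs);
  timer.unref?.(); // don't keep the process alive just for the sweep
  return timer;
}
```

This only tidies local bookkeeping; the containers themselves expire on the API side regardless, but stale ids in your map mean requests that silently start over in fresh (billable) containers.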
Advanced Error Handling Patterns
Code execution introduces unique failure modes: Python syntax errors, runtime exceptions, resource limits, container expiration, and network timeouts. Robust error handling distinguishes professional applications from fragile prototypes.
Comprehensive Error Handler:
// middleware/codeExecutionErrorHandler.ts
import type { NextFunction, Request, Response } from 'express';

export class CodeExecutionErrorHandler {
  static handle(error: any, req: Request, res: Response, next: NextFunction) {
    if (error.type === 'code_execution_tool_result_error') {
      switch (error.error_code) {
        case 'unavailable':
          return res.status(503).json({
            error: 'Code execution service unavailable',
            message: 'The analysis service is temporarily unavailable. Please try again.',
            retryAfter: '5 minutes'
          });
        case 'code_execution_exceeded':
          return res.status(408).json({
            error: 'Execution timeout',
            message: 'Code execution exceeded maximum allowed time. Simplify your analysis.',
            maxExecutionTime: '60 seconds'
          });
        case 'container_expired':
          return res.status(409).json({
            error: 'Session expired',
            message: 'Your analysis session expired. Start a new session.',
            action: 'create_new_session'
          });
      }
    }

    // Handle Python runtime errors surfaced in stderr
    if (error.content?.stderr) {
      const pythonError = CodeExecutionErrorHandler.parsePythonError(error.content.stderr);
      return res.status(400).json({
        error: 'Python execution error',
        message: 'Code execution failed',
        details: pythonError
      });
    }
    next(error);
  }

  // Pull the last non-empty line out of a traceback - usually "SomeError: message"
  private static parsePythonError(stderr: string): string {
    const lines = stderr.trim().split('\n').filter((line) => line.trim().length > 0);
    return lines[lines.length - 1] ?? 'Unknown Python error';
  }
}
Production Monitoring and Observability
Code execution requires specialized monitoring beyond traditional API metrics. Track execution times, container usage, error patterns, and resource consumption to maintain reliable service performance.
Monitoring Implementation:
// middleware/executionMetrics.ts
import type { NextFunction, Request, Response } from 'express';
import { metrics } from '../lib/metrics'; // whatever OTel/StatsD wrapper you already use

export class ExecutionMetrics {
  static trackExecution = (req: Request, res: Response, next: NextFunction) => {
    const startTime = Date.now();
    res.on('finish', () => {
      const duration = Date.now() - startTime;
      const containerId = res.locals.containerId;
      // Record how long the whole request took and whether it succeeded
      metrics.executionDuration.record(duration, {
        userId: (req as any).user?.id, // populated by your auth middleware
        containerId,
        success: res.statusCode < 400
      });
      // Count container churn: reused sessions vs brand-new (billable) ones
      metrics.containerUsage.increment({
        action: containerId ? 'reused' : 'created'
      });
    });
    next();
  };
}
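The `metrics` object in that middleware is whatever client you already run (OpenTelemetry, StatsD, Datadog). If you have nothing yet, an in-memory stand-in with the same shape is enough to start collecting numbers; everything below is a placeholder I made up to match the calls above, not a real metrics API:

```typescript
// Minimal in-memory stand-in matching the metrics calls used above.
// Swap for a real client (OpenTelemetry, StatsD) before trusting the numbers.
type Labels = Record<string, string | boolean | undefined>;

class Histogram {
  readonly samples: Array<{ value: number; labels: Labels }> = [];
  record(value: number, labels: Labels = {}) {
    this.samples.push({ value, labels });
  }
}

class Counter {
  readonly counts = new Map<string, number>();
  increment(labels: Labels = {}) {
    // Key each labeled series separately so 'reused' and 'created' count apart
    const key = JSON.stringify(labels);
    this.counts.set(key, (this.counts.get(key) ?? 0) + 1);
  }
}

export const metrics = {
  executionDuration: new Histogram(),
  containerUsage: new Counter()
};
```

Even this crude version answers the question that matters for the bill: how many requests created a new container versus reusing one.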
Code execution turns your Express app into something genuinely useful instead of another CRUD API pushing JSON around. Users upload datasets, ask for analysis, and get back visualizations and insights that would take weeks to build yourself. The architecture patterns are more involved than plain CRUD, no question.
Now you can build automated reporting, data pipelines, document processing, and interactive analysis tools. That beats writing custom data processing from scratch and debugging pandas memory errors for days.