Why does my Server Action time out after 15 seconds?

Next.js Server Actions have a [15-second timeout limit by default on Vercel Pro](https://nextjs.org/docs/app/api-reference/functions/server-actions) (configurable up to 5 minutes with `maxDuration`). Claude sometimes takes 20+ seconds to respond.

**Quick fix:** Use API routes instead of Server Actions for long-running Claude requests:

```typescript
// app/api/claude/route.ts
import { askClaude } from '@/lib/claude'; // adjust the path to your setup

// Route handlers take their own maxDuration (seconds, within your plan's limit)
export const maxDuration = 60;

export async function POST(req: Request) {
  const { prompt } = await req.json();
  const response = await askClaude(prompt);
  return Response.json({ response });
}
```

**Better fix:** Switch to [Claude 3 Haiku](https://docs.anthropic.com/en/docs/about-claude/models#claude-3-haiku) for faster responses, or implement streaming.

Why am I getting "Error: 429 Too Many Requests" constantly?

Claude's [rate limits](https://docs.anthropic.com/en/api/rate-limits) hit fast. You're probably hitting the requests-per-minute limit, not the tokens limit.

**Reality check:**
- Free tier: 5 requests/minute (basically useless for real apps)
- Paid tier: 4,000 requests/minute (sounds like a lot, isn't)

**Simple fix:** Just retry with exponential backoff:

```typescript
// Requires an askClaude that throws on API errors instead of catching them
export async function askClaudeWithRetry(prompt: string, retries = 2) {
  for (let i = 0; i <= retries; i++) {
    try {
      return await askClaude(prompt);
    } catch (error: any) {
      if (error.status === 429) {
        // Rate limited: wait 1s, 2s, 4s before the next attempt
        await new Promise(resolve => setTimeout(resolve, 1000 * Math.pow(2, i)));
        continue;
      }
      throw error;
    }
  }
  throw new Error('Still rate limited after retries');
}
```

My streaming connection keeps breaking randomly. What's wrong?

Streaming is fragile. Users navigate away, networks drop, connections time out. The [ReadableStream API](https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream) doesn't handle this gracefully.

**Common issues:**
- Stream hangs if user navigates away
- Memory leaks from unclosed streams
- No error handling for network drops

**Honest advice:** Skip streaming unless you actually need it. Most users prefer fast responses over streaming responses. If you must stream, use the [Vercel AI SDK](https://sdk.vercel.ai/docs) - they've handled the edge cases.

My Claude API key isn't working in production but works locally. Why?

**Most common issues:**
1. **You committed your .env.local to git** (rookie mistake)
2. **Wrong environment variable name** in production
3. **Trailing spaces in the API key** when you copy-pasted it

**Debug steps (learned from my personal hell):**

```typescript
// Add this debugging - spent 2 hours on this bullshit once
console.log('API key exists:', !!process.env.ANTHROPIC_API_KEY);
console.log('API key starts with:', process.env.ANTHROPIC_API_KEY?.slice(0, 10));

// This fucking error message tells you nothing useful:
// Error: 401 Unauthorized: authentication_error
// {"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"}}
// Could mean: wrong key, missing key, trailing whitespace, or Claude just doesn't like you today
```

**Vercel users:** Set environment variables in the dashboard, not in your code.

How much does Claude API actually cost in production?

**Real numbers from our shitshow:**
- Month 1: Around $50 (maybe 500 users, hard to tell with all the bots crawling us)
- Month 3: Holy shit $800 (went viral, forgot caching was a thing)
- Month 6: Back down to like $200 (finally got our act together with caching)

**Pricing reality:**
- [Claude 3.5 Haiku](https://docs.anthropic.com/en/docs/about-claude/models): $0.80/1M input tokens, $4/1M output (cheapest, fastest)
- [Claude 3.5 Sonnet](https://docs.anthropic.com/en/docs/about-claude/models): Around $3/1M input, $15/1M output (what most people should use)

**Cost optimization tips:**
- Use Haiku for simple stuff (summaries, formatting) - way cheaper but dumber
- Use Sonnet for most stuff - good balance of speed and smarts
- Cache everything (seriously, everything) - [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) saves up to 90%
- Shorter prompts = lower costs - every token counts

Why is my Claude API bill so high?

**Common cost killers:**
1. **No caching** - Same prompt called 100 times = 100x the cost
2. **Long prompts** - Input tokens cost money too
3. **Using Sonnet for everything** - Haiku is roughly 4x cheaper ($0.80 vs $3 per 1M input tokens, $4 vs $15 per 1M output)

**Simple caching that actually works:**

```typescript
import { unstable_cache } from 'next/cache';

const cachedAskClaude = unstable_cache(
  (prompt: string) => askClaude(prompt),
  ['claude'],
  { revalidate: 3600 } // Cache for 1 hour
);

// Same prompt called 100 times = 1 API call
```

**Our caching saved 80% on costs.** Start there before adding Redis or complex shit.

Should I use tool calling or just keep it simple?

**Skip tool calling initially.** It's complex, adds failure modes, and most apps don't need it.

**When to consider tools:**
- Your app needs Claude to actually DO things (create database records, send emails)
- Simple text responses aren't enough
- You've already mastered basic Claude integration

**Reality check:** We built our first app without tools. Added them 6 months later when we actually needed them. Don't over-engineer - I wasted 2 weeks implementing tool calling for a feature nobody fucking asked for.

Why does Claude return different results for the same prompt?

Claude isn't deterministic. Same input ≠ same output. This is by design.

**If you need consistent results:**
- Set `temperature: 0` (reduces randomness)
- Use caching for identical prompts
- Accept that AI is probabilistic, not deterministic

**This is a feature, not a bug.** Different responses can be more creative and useful.

How do I debug what's going wrong?

**Start with the basics:**
1. **Check your API key** - wrong key = authentication errors
2. **Log request/response** - see what Claude actually receives/returns
3. **Check rate limits** - 429 errors mean you're hitting limits
4. **Verify model names** - `claude-3-5-sonnet-20241022` was current as of late 2024; check the Anthropic docs for what's current now
5. **Check your hosting platform** - Netlify functions time out at 10s, Vercel free tier has limits

**Simple debug logging:**

```typescript
export async function askClaudeWithLogging(prompt: string) {
  console.log('Sending to Claude:', { prompt, timestamp: new Date() });

  try {
    const response = await askClaude(prompt);
    console.log('Claude response:', { response, timestamp: new Date() });
    return response;
  } catch (error) {
    console.error('Claude error:', error);
    throw error;
  }
}
```

**Model names (late 2024, will change):**
- `claude-3-5-sonnet-20241022` (what you probably want)
- `claude-3-5-haiku-20241022` (fastest, cheapest, good enough for simple shit)

My app works in development but breaks in production. Why?

**Common production issues (that will ruin your weekend):**
1. **Missing environment variables** - API key works locally, doesn't exist in prod. Classic.
2. **Different timeouts** - Vercel free plan gives you 10 seconds, Claude takes 15. Math is hard.
3. **Rate limiting** - 50 users hit your app simultaneously and Claude starts returning 429s
4. **Memory limits** - Serverless functions run out of RAM because of zombie streams
5. **Memory leaks** - Took down prod for 2 hours because streaming responses weren't getting cleaned up properly

**Quick production checklist:**
- ✅ API key set correctly
- ✅ Using real model names
- ✅ Error handling for 429/500 errors
- ✅ Timeout handling for slow responses
- ✅ Caching enabled to reduce API calls

**Pro tip from someone who learned the expensive way:** Test with production API keys in staging first. Create a separate key for staging - I burned through $200 in credits because I'm an idiot and used my prod key for testing.

Claude API + Next.js App Router: Production Implementation Guide

Configuration That Works in Production

Basic Setup

```typescript
// lib/claude.ts - Production-ready configuration
import Anthropic from '@anthropic-ai/sdk';

export const anthropic = new Anthropic({
  // `!` only silences the type checker; it does not check the value at runtime
  apiKey: process.env.ANTHROPIC_API_KEY!,
});

export async function askClaude(prompt: string) {
  try {
    const response = await anthropic.messages.create({
      model: 'claude-3-5-sonnet-20241022', // Current model as of late 2024
      max_tokens: 1000,
      messages: [{ role: 'user', content: prompt }]
    });
    // Content blocks are a union type; only text blocks carry `.text`
    const block = response.content[0];
    return block?.type === 'text' ? block.text : '';
  } catch (error) {
    console.error('Claude API error:', error);
    return 'Claude is having issues. Try again.';
  }
}
```

Environment Variables

# .env.local
ANTHROPIC_API_KEY=sk-ant-api03-your-key-here

Critical Warnings:

Model names change frequently - an unknown model name fails with a 404 not_found_error
API key must not have leading or trailing spaces, or requests fail with 401 Unauthorized
Environment variables that exist locally are often missing in production
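
One way to soften the model-name churn is to keep the identifier in a single place and let an environment variable override it. A minimal sketch, assuming a hypothetical `CLAUDE_MODEL` variable (the SDK does not read this on its own; the name is made up for this example):

```typescript
// Fallback taken from this guide; it will go stale, so treat it as a placeholder.
const DEFAULT_MODEL = 'claude-3-5-sonnet-20241022';

// CLAUDE_MODEL is a hypothetical variable name chosen for this example.
export function resolveModel(
  env: Record<string, string | undefined>,
): string {
  const override = env.CLAUDE_MODEL?.trim();
  // An unset or blank override falls back to the default.
  return override ? override : DEFAULT_MODEL;
}
```

Calling `resolveModel(process.env)` wherever the model is needed means a rename becomes a dashboard change instead of a redeploy.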

Resource Requirements

Cost Analysis (Real Production Data)

Month 1: $50 (500 users)
Month 3: $800 (viral traffic, no caching)
Month 6: $200 (optimized with caching)

Pricing Structure

Claude 3.5 Haiku: $0.80/1M input tokens, $4/1M output (cheapest, fastest)
Claude 3.5 Sonnet: $3/1M input, $15/1M output (recommended balance)
Caching saves 80-90% of costs with same prompts
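
To see what those per-token prices mean for a real workload, a rough estimator helps. This is a sketch using only the prices listed above (verify them against the current pricing page); the workload numbers are made up for illustration:

```typescript
// Prices in USD per 1M tokens, copied from the list above; they may be out of date.
const PRICES = {
  haiku: { input: 0.8, output: 4 },
  sonnet: { input: 3, output: 15 },
} as const;

type ModelTier = keyof typeof PRICES;

export function estimateCostUsd(
  tier: ModelTier,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = PRICES[tier];
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// Hypothetical workload: 10,000 requests, each 500 tokens in and 300 tokens out.
const requests = 10_000;
const haikuCost = estimateCostUsd('haiku', requests * 500, requests * 300);
const sonnetCost = estimateCostUsd('sonnet', requests * 500, requests * 300);
console.log({ haikuCost, sonnetCost });
```

For that workload the estimate comes out to about $16 on Haiku and about $60 on Sonnet, so the gap at these prices is a bit under 4x.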

Performance Expectations

Average response time: Under 5 seconds (can take 15+ when Claude is processing complex requests)
Timeout issues: Claude sometimes takes 20+ seconds, exceeding Next.js 15-second Server Action limit
Rate limits: 4,000 requests/minute (paid tier), hits faster than expected with concurrent users

Integration Patterns Comparison

| Pattern | Use Case | Performance | Complexity | Failure Modes |
|---|---|---|---|---|
| Server Components | Static analysis, content generation | Fast when Claude responds quickly | Low | Page hangs for 45+ seconds if Claude is slow |
| Server Actions | Form processing, mutations | Good | Low | 15-second timeout limit, cryptic error messages |
| Route Handlers | External integrations, real-time | Moderate | Medium | No built-in timeout protection |
| Streaming | Real-time chat, long responses | Good when working | High | Memory leaks, hanging connections, random disconnects |

Critical Failure Modes and Solutions

Timeout Issues

Problem: Next.js Server Actions timeout after 15 seconds, Claude can take 20+ seconds
Impact: Users see timeout errors, app appears broken
Solution:

Use API routes instead of Server Actions for long-running requests
Switch to Claude 3 Haiku for faster responses
Implement timeout handling with graceful degradation
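
Timeout handling with graceful degradation can be as small as racing the request against a timer. A minimal sketch, where `withTimeout` is a made-up helper name; note that it only stops waiting and does not cancel the underlying request:

```typescript
// Resolves with `fallback` if `work` has not settled within `ms` milliseconds.
export async function withTimeout<T>(
  work: Promise<T>,
  ms: number,
  fallback: T,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

Usage would look something like `await withTimeout(askClaude(prompt), 12_000, 'Claude is taking too long. Try again.')`, with the budget picked to sit under whatever limit your hosting platform enforces (the 12 seconds here is just an example).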

Rate Limiting (429 Errors)

Problem: Rate limits hit at 4,000 requests/minute
Frequency: Happens with 100+ concurrent users
Solution:

```typescript
// Requires an askClaude that throws on API errors instead of catching them
export async function askClaudeWithRetry(prompt: string, retries = 2) {
  for (let i = 0; i <= retries; i++) {
    try {
      return await askClaude(prompt);
    } catch (error: any) {
      if (error.status === 400) {
        // Bad request - don't retry
        return `Invalid request: ${error.message}`;
      }
      if (i === retries) break; // out of attempts
      if (error.status === 429) {
        // Rate limited - exponential backoff: 2s, then 4s
        await new Promise(resolve => setTimeout(resolve, 2000 * 2 ** i));
      }
    }
  }
  // Reached after the last failed attempt, so callers never get undefined
  return 'Claude is having issues. Try again in a few minutes.';
}
```
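
When a lot of users get rate limited at the same moment, fixed delays make them all retry at the same moment too. A common tweak is capped exponential backoff with jitter. A sketch with made-up helper names (`isRetryable`, `backoffDelayMs`); the base and cap values are assumptions to tune:

```typescript
// Treat rate limits and server-side errors as transient; everything else is final.
export function isRetryable(status: number | undefined): boolean {
  return status === 429 || (status !== undefined && status >= 500);
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped,
// then scaled by a random factor ("full jitter") so clients spread out.
export function backoffDelayMs(
  attempt: number,
  baseMs = 1000,
  capMs = 30_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

The `random` parameter exists only so the function can be tested deterministically; in the retry loop you would call `backoffDelayMs(i)` and sleep for the result.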

Streaming Memory Leaks

Problem: Long-running streams cause memory issues, zombie streams from users navigating away
Impact: Servers run out of memory, crash every 2 hours
Cost: $500/month server repeatedly failing
Solution: Avoid streaming unless absolutely necessary, use Vercel AI SDK if streaming required
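
The zombie-stream problem mostly comes down to nobody telling the upstream work to stop when the reader disappears. A minimal sketch using the standard Web Streams `cancel()` hook and an `AbortController`; `toCancellableStream` and `ChunkSource` are made-up names, and the chunk source stands in for whatever produces Claude's output in your app:

```typescript
// Hypothetical upstream: yields text chunks and should stop once the signal aborts.
type ChunkSource = (signal: AbortSignal) => AsyncIterable<string>;

export function toCancellableStream(source: ChunkSource): {
  stream: ReadableStream<string>;
  controller: AbortController;
} {
  const controller = new AbortController();
  const iterator = source(controller.signal)[Symbol.asyncIterator]();

  const stream = new ReadableStream<string>({
    // Called only when the consumer wants more data, so nothing is produced eagerly.
    async pull(streamController) {
      try {
        const result = await iterator.next();
        if (result.done) {
          streamController.close();
        } else {
          streamController.enqueue(result.value);
        }
      } catch (error) {
        streamController.error(error);
      }
    },
    // Called when the consumer goes away (for example, the client disconnects).
    async cancel() {
      controller.abort();
      await iterator.return?.();
    },
  });

  return { stream, controller };
}
```

The point of the design is that cleanup hangs off the stream itself, so any code path that cancels the stream also aborts the upstream work.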

Production vs Development Failures

Common Production Issues:

Missing environment variables (works locally, fails in production)
Different timeout limits on hosting platforms
Memory limits on serverless functions
Rate limiting with multiple concurrent users

Implementation Recommendations

Error Handling (60% of Production Code)

```typescript
export async function askClaude(prompt: string, retries = 2): Promise<string> {
  for (let i = 0; i <= retries; i++) {
    try {
      const response = await anthropic.messages.create({
        model: 'claude-3-5-sonnet-20241022',
        max_tokens: 2000,
        messages: [{ role: 'user', content: prompt }],
      });

      // Content blocks are a union type; only text blocks carry `.text`
      const block = response.content[0];
      const text = block?.type === 'text' ? block.text : '';
      return text || 'Claude returned empty response';
    } catch (error: any) {
      console.error(`Attempt ${i + 1} failed:`, error.message);

      if (error.status === 400) {
        // Bad request - don't retry
        return `Invalid request: ${error.message}`;
      }

      if (i === retries) {
        return 'Claude is having issues. Try again in a few minutes.';
      }

      if (error.status === 429) {
        // Rate limited - exponential backoff: 2s, then 4s
        await new Promise(resolve => setTimeout(resolve, 2000 * 2 ** i));
      }
    }
  }

  return 'Something went wrong. Maybe restart the server and pray.';
}
```

Cost-Effective Caching

```typescript
import { unstable_cache } from 'next/cache';

export const getCachedResponse = unstable_cache(
  askClaude,
  ['claude'],
  { revalidate: 3600 } // 1 hour cache
);

// Usage: getCachedResponse('same prompt') - second call is free
```
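
If you are outside Next.js, or want to see what the caching idea boils down to, here is a plain in-memory version. This is not `unstable_cache`; it is a swapped-in sketch built on a `Map`, with made-up names, and it only lives as long as a single server instance:

```typescript
type Entry = { value: Promise<string>; expiresAt: number };

export function createPromptCache(
  fetcher: (prompt: string) => Promise<string>,
  ttlMs: number,
  now: () => number = Date.now,
) {
  const entries = new Map<string, Entry>();

  return function cached(prompt: string): Promise<string> {
    const hit = entries.get(prompt);
    if (hit && hit.expiresAt > now()) return hit.value;

    // Store the promise itself so concurrent identical prompts share one call.
    const value = fetcher(prompt).catch((error) => {
      entries.delete(prompt); // do not keep failures around
      throw error;
    });
    entries.set(prompt, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

Usage would be something like `const cachedAsk = createPromptCache(askClaude, 3_600_000)`. The map grows without bound in this sketch, so a real version needs eviction, and on serverless platforms each instance keeps its own copy.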

Production Monitoring Requirements

Essential Metrics:

Average response time (target: under 5 seconds)
Error rate (target: under 1%, expect more during API outages)
Daily cost tracking (set alerts at budget limits)
Rate limit hits (indicates need for queuing)
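
Those metrics do not need a monitoring product on day one. A small in-process tracker is enough to see the shape of things; this is a sketch with a made-up `ClaudeMetrics` class, and real deployments would ship the numbers to whatever monitoring tool they already use:

```typescript
export class ClaudeMetrics {
  private latenciesMs: number[] = [];
  private errors = 0;
  private rateLimitHits = 0;

  // Record one call: its latency, plus the HTTP status if the call failed.
  record(latencyMs: number, errorStatus?: number): void {
    this.latenciesMs.push(latencyMs);
    if (errorStatus !== undefined) this.errors += 1;
    if (errorStatus === 429) this.rateLimitHits += 1;
  }

  snapshot() {
    const total = this.latenciesMs.length;
    const sum = this.latenciesMs.reduce((a, b) => a + b, 0);
    return {
      total,
      avgLatencyMs: total === 0 ? 0 : sum / total,
      errorRate: total === 0 ? 0 : this.errors / total,
      rateLimitHits: this.rateLimitHits,
    };
  }
}
```

Wrap each Claude call with a timer, call `record(elapsed)` on success or `record(elapsed, error.status)` on failure, and compare `snapshot()` against the targets above.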

Decision Criteria

When to Use Each Model

Claude 3.5 Haiku: Simple tasks, summaries, formatting (roughly 4x cheaper than Sonnet at the prices above)
Claude 3.5 Sonnet: Most production use cases (good balance)
Avoid tool calling initially: Adds complexity, most apps don't need it

When NOT to Use Streaming

Most users don't care about streaming vs fast responses
Adds debugging complexity and new failure modes
Memory leak risks in production
Only implement if actually needed for UX

Hosting Platform Considerations

Vercel Free: 10-second timeout limit
Netlify Functions: 10-second timeout limit
Vercel Pro: 15-second default, configurable up to 5 minutes

Troubleshooting Production Issues

Authentication Errors (401)

Causes:

Wrong API key (most common)
Missing environment variable in production
Trailing spaces in API key
Committed .env.local to git

Model Not Found Errors (404)

Cause: Using outdated or misspelled model names (the API answers with a not_found_error)
Solution: Check Anthropic docs for current model identifiers

Cost Escalation

Warning Signs: Bill jumps from $50 to $800+ suddenly
Root Causes:

No caching implementation
Using Sonnet for simple tasks that Haiku could handle
Long prompts with unnecessary content
Viral traffic without rate limiting
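
For the last cause, even a crude per-user limit in front of Claude caps how fast a traffic spike can burn money. A sketch of a fixed-window limiter with made-up names; it keeps counters in memory, so on serverless each instance counts separately and a shared store would be needed for a real limit:

```typescript
export function createRateLimiter(
  maxRequests: number,
  windowMs: number,
  now: () => number = Date.now,
) {
  const windows = new Map<string, { start: number; count: number }>();

  // Returns true if the caller identified by `key` may make another request.
  return function allow(key: string): boolean {
    const t = now();
    let current = windows.get(key);
    if (!current || t - current.start >= windowMs) {
      // First request from this key, or the previous window has expired.
      current = { start: t, count: 0 };
      windows.set(key, current);
    }
    if (current.count >= maxRequests) return false;
    current.count += 1;
    return true;
  };
}
```

Something like `const allow = createRateLimiter(10, 60_000)` and then `if (!allow(userId)) return new Response('Slow down', { status: 429 })` in the route handler; the 10-per-minute figure is only an example.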

Debugging Steps

Check API key exists and format
Log request/response to see actual data
Verify current model names
Check hosting platform timeout limits
Monitor rate limit responses

Security Best Practices

API Key Management

Never commit API keys to version control
Use separate keys for staging and production
Set environment variables in hosting platform dashboard
Validate API key format on application startup
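
The startup validation can be a few lines. A sketch with a made-up `checkApiKey` helper; the `sk-ant-` prefix is taken from the example key in this guide and is an assumption about the key format, not a guarantee:

```typescript
export type KeyCheck = { ok: true } | { ok: false; reason: string };

// Catches the three failure modes this guide keeps running into:
// missing variable, stray whitespace, and a value that is not an Anthropic key.
export function checkApiKey(raw: string | undefined): KeyCheck {
  if (!raw) {
    return { ok: false, reason: 'ANTHROPIC_API_KEY is not set' };
  }
  if (raw !== raw.trim()) {
    return { ok: false, reason: 'key has leading or trailing whitespace' };
  }
  if (!raw.startsWith('sk-ant-')) {
    // Assumed prefix, based on the example key shown in this guide.
    return { ok: false, reason: 'key does not start with sk-ant-' };
  }
  return { ok: true };
}
```

Run `checkApiKey(process.env.ANTHROPIC_API_KEY)` once at startup and log the `reason` on failure; that beats decoding a bare 401 later.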

Production Deployment Checklist

✅ API key set correctly in production environment
✅ Using current model names (check Anthropic docs)
✅ Error handling for 429/500 errors implemented
✅ Timeout handling for slow responses
✅ Caching enabled to reduce API calls
✅ Cost monitoring and alerts configured
✅ Rate limiting strategy implemented

Resource Links (Verified Useful)

Essential Documentation

Anthropic Claude API Docs - Official API reference
Next.js App Router Docs - Server Components and Actions
Anthropic Console - API keys and usage monitoring

Development Tools

Claude Token Counter - Cost estimation
Anthropic TypeScript SDK - Official SDK
Vercel AI SDK - Simplified streaming (if needed)

Support Resources

Anthropic Discord - Community support
Claude Status Page - Service status monitoring
Next.js Caching Docs - Cost optimization

Useful Links for Further Investigation

Actually Useful Resources (Not Just Link Spam)

| Link | Description |
|---|---|
| Anthropic Claude API Docs | The only docs that matter. Everything else is just people rewriting this. Has the real model names, actual pricing, and rate limits. |
| Next.js App Router Docs | Actually useful unlike most tutorials - covers the gotchas. Skip the marketing fluff, go straight to Server Components and Server Actions. |
| Anthropic Console | Where you get your API key and watch your money disappear. Has a token counter that's actually accurate. |
| Vercel AI SDK | Makes streaming easier but adds another dependency. Only use if you can't figure out streaming yourself. |
| Anthropic TypeScript SDK | Official SDK. Works fine. Nothing fancy but gets the job done. |
| Anthropic Discord | Actual humans who might help. Better than Stack Overflow for Claude-specific issues. |
| Next.js Discord | Good for Next.js App Router questions when the docs don't make sense. |
| Claude Token Counter | Figure out why your bill is so high. Spoiler: your prompts are too long. |
| Next.js Caching Docs | Read this or go broke. Caching saves 80% of API costs. |
| Claude Status Page | Check here first when everything breaks. Claude goes down sometimes and it's not your fault. |
