App Router confused the absolute shit out of me at first. Coming from Pages Router, everything felt like someone had inverted the entire framework just to mess with me. But after rage-building way too many RAG apps and nearly throwing my laptop across the room, I finally get what Vercel was thinking. It does solve real problems - you just gotta stop fighting it and accept the weirdness.
When It Finally Clicked
Traditional RAG apps are a mess of API calls. Users stare at loading spinners while your API routes talk to three different services in sequence. It's slow as shit and feels broken even when it works.
App Router lets you fetch data directly in your React components on the server. Sounds like voodoo, but it works. Your component can talk to Supabase, search Pinecone, and call OpenAI all in one server-side render. No API routes, no loading states, no waiting for network requests to finish.
But here's what the docs don't tell you: it breaks in creative ways.
What Actually Works in Production
Server Components: Great Until They're Not
Server Components are great for loading data. No more building API routes just to fetch from your database. But here's what screws you over:
The Good: Direct database access, no loading states, automatic caching. The Next.js Server Components docs cover the basics.
The Bad: Error handling is weird, debugging sucks, and TypeScript sometimes loses its mind.
```tsx
import { cookies } from 'next/headers'
import { createServerComponentClient } from '@supabase/auth-helpers-nextjs'

// This looks clean but will randomly break
export default async function DocumentsPage() {
  const supabase = createServerComponentClient({ cookies })

  // This will fail if the user isn't logged in, and the error is cryptic
  const { data: { user } } = await supabase.auth.getUser()

  // This times out once you have more than ~100 documents
  const { data: documents } = await supabase
    .from('documents')
    .select('*')
    .eq('user_id', user?.id)

  return (
    <ul>
      {documents?.map(doc => <li key={doc.id}>{doc.title}</li>)}
    </ul>
  )
}
```
What breaks in production:
- Timeouts with large datasets (Vercel kills you at 10s, learned this when my dashboard started timing out)
- Auth errors that don't surface properly (spent way too long debugging "undefined user" with zero context)
- TypeScript inference breaks with complex queries (RLS policies confuse the hell out of the type system)
- Caching gets weird with dynamic user data (user A sees user B's documents randomly)
The fix: Add proper error boundaries, use pagination, and test auth edge cases early. I learned this the hard way when our staging demo showed the wrong user's data.
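The pagination part is easy to get wrong off by one. Here's a minimal sketch of a helper (the name `pageRanges` is mine, not a Supabase API) that computes the inclusive `[from, to]` bounds Supabase's `.range(from, to)` expects, so a big table gets fetched in pages instead of one query that blows past the timeout:

```typescript
// Hypothetical helper: compute inclusive [from, to] bounds for
// Supabase's .range(from, to), one entry per page.
export function pageRanges(total: number, pageSize: number): [number, number][] {
  const ranges: [number, number][] = []
  for (let from = 0; from < total; from += pageSize) {
    // .range() bounds are inclusive, hence the -1
    ranges.push([from, Math.min(from + pageSize, total) - 1])
  }
  return ranges
}
```

Then each page is `supabase.from('documents').select('*').range(from, to)` — fetch them one at a time instead of pulling everything in a single Server Component render.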
Server Actions: The Good, Bad, and "Why Does This Timeout?"
Server Actions are great until you try to upload anything bigger than a text file. Then they become a source of pain.
What works: Simple mutations, form handling, quick database updates.
What doesn't: File processing, embedding generation, anything that takes more than 10 seconds on Hobby (60s on Pro). Vercel will kill your function.
```typescript
// This will timeout and you'll hate your life
export async function uploadDocument(formData: FormData) {
  const file = formData.get('file') as File

  // Reading large files blocks everything
  const content = await file.text() // 💀 Dies on 10MB+ files

  // This takes forever and will timeout
  const chunks = chunkText(content) // 💀 5+ seconds for long docs

  // This definitely times out
  const embeddings = await Promise.all(
    chunks.map(chunk => generateEmbedding(chunk)) // 💀 RIP
  )
}
```
Shit that broke in production:
- File uploads die at timeout limits (10s Hobby, 60s Pro - found this out the hard way when a customer's massive compliance manual just vanished)
- OpenAI randomly throttles you with zero warning (embedding generation just... stops. No error. Nothing. Took way too long to realize what was happening)
- Pinecone quota limits fail silently (documents disappear into the void, no errors logged, spent hours thinking I was losing my mind)
- Impatient users spam-click upload buttons (customers upload the same damn PDF multiple times because they think nothing's happening)
- Error messages are worse than useless ("Internal Server Error" is about as helpful as a chocolate teapot)
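The silent OpenAI throttling in particular is fixable with a dumb retry wrapper. This is a sketch of my own helper (the name `withRetry` and the injectable `sleep` parameter are mine, not an SDK feature — the injection just makes it testable): exponential backoff with a capped attempt count, so a throttled embedding call gets retried instead of a document quietly vanishing.

```typescript
// Hypothetical retry wrapper: retry a flaky async call with
// exponential backoff (baseDelay, 2x baseDelay, 4x baseDelay, ...).
export async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
  sleep: (ms: number) => Promise<void> = ms => new Promise(r => setTimeout(r, ms))
): Promise<T> {
  let lastError: unknown
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      // Back off before every attempt except the last
      if (i < attempts - 1) await sleep(baseDelayMs * 2 ** i)
    }
  }
  throw lastError
}
```

Wrap each `generateEmbedding(chunk)` call in it, and at least log `lastError` somewhere you'll actually see it.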
What actually works: Use Server Actions for saving metadata, queue background jobs for processing.
```typescript
// This survives production
export async function uploadDocument(formData: FormData) {
  const file = formData.get('file') as File

  // Save file metadata only — .select().single() gives us the new row back
  const { data: document } = await supabase
    .from('documents')
    .insert({ title: file.name, status: 'pending' })
    .select()
    .single()

  // Queue background processing (separate service)
  await fetch('/api/process-document', {
    method: 'POST',
    body: JSON.stringify({ documentId: document.id })
  })

  return { success: true }
}
```
Here's the brutal truth: Server Actions are for quick database writes, not heavy lifting. I spent way too long in denial trying to force massive PDFs through them before admitting defeat. Don't be as stubborn as I was.
Route Handlers for Streaming: When It Works, It's Magic
Streaming AI responses is pure magic when it fucking works. Users see text flowing in real-time instead of staring at loading spinners for 15 seconds. But making it work reliably nearly broke me - way too many long days fueled by spite and Red Bull.
What breaks:
- Streams randomly cut off mid-sentence
- Pinecone queries timeout and break the whole stream
- Auth cookies get stale during long conversations
- Error handling is a nightmare - users see half responses with no error message
```typescript
import { streamText, tool } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

// This works until it doesn't
export async function POST(request: Request) {
  // Auth fails randomly during long chats
  const { data: { user } } = await supabase.auth.getUser()
  const { messages } = await request.json()

  const result = await streamText({
    model: openai('gpt-4-turbo'),
    messages,
    tools: {
      search: tool({
        description: 'Search the document index',
        parameters: z.object({ query: z.string() }),
        execute: async ({ query }) => {
          // This times out randomly and kills the stream
          // (assumes pinecone client and query embedding already exist)
          const results = await pinecone.query({
            vector: embedding,
            topK: 5
          })
          // Stream dies here with zero error info
          return results.matches
        }
      })
    }
  })

  return result.toDataStreamResponse()
}
```
What I learned the hard way:
- Always wrap tool calls in try-catch or the stream dies silently
- Set aggressive timeouts on Pinecone (3s max)
- Handle auth refresh or users get logged out mid-conversation
- Add stream resumption - users will refresh the page when streams break
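The aggressive-timeout lesson can be a small `Promise.race` wrapper (my own sketch, not an SDK feature — the `withTimeout` name is mine): the slow call rejects fast with a real error message instead of hanging until it silently kills the stream.

```typescript
// Hypothetical timeout wrapper: reject after `ms` if the promise hasn't settled.
export function withTimeout<T>(
  promise: Promise<T>,
  ms: number,
  label = 'operation'
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms
    )
  })
  // Whichever settles first wins; always clear the timer afterwards
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}
```

Inside the tool's `execute`, that looks like `await withTimeout(pinecone.query({ ... }), 3000, 'pinecone')` wrapped in a try-catch that returns an "I couldn't search right now" result — the model can recover from that; a dead stream can't.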
The Vercel AI SDK is actually pretty good once you add proper error handling everywhere.
What Actually Matters
App Router isn't perfect, but it's the best way I've found to build RAG apps. Here's the deal:
Use Server Components for: Initial data loading, dashboard pages, anything that doesn't need interactivity.
Use Server Actions for: Simple mutations, form handling, triggering background jobs.
Use Route Handlers for: Streaming AI responses, webhooks, anything that needs real-time updates.
Don't use Server Actions for: File processing, embedding generation, anything that takes more than 10 seconds.
The biggest mindset shift is understanding when to use each pattern. Coming from traditional React, everything feels backwards at first. But once you get it, you won't want to go back to building APIs for everything.
Key lessons from production:
- Add error boundaries everywhere or debugging sucks
- Test auth edge cases early - they will bite you
- Use background jobs for heavy processing
- Streaming is amazing but needs proper error handling
- TypeScript can get confused with Server Components
- "use client" doesn't fix everything - you'll still hit server/client boundary issues with state
App Router makes RAG apps feel more integrated than the old API-first approach. Your frontend and backend are actually talking to each other instead of just throwing HTTP requests over the wall.
But the biggest challenge isn't the architecture patterns - it's getting AI responses to stream reliably. That's where most RAG apps shit the bed in production, and where you'll spend most of your 3am debugging sessions.