Build AI-Enhanced Web Apps

How to get reliable results with React, Next.js, and Vercel

Theo Despoudis

February 2026
ISBN 9781633436084
392 pages

Included with a Manning Online subscription

printed in black & white

Table of contents

Part 1 Building basic generative AI web apps

1 Using generative AI in web apps

1.1 What generative AI can do for web applications

1.1.1 Generative AI capabilities

1.1.2 Real-world uses of generative AI

1.2 How a generative AI web app works

1.2.1 Core components

1.2.2 The flow of user interactions

1.3 AI tools and the ecosystem

1.4 Choosing the right model

1.4.1 Model types

1.4.2 Pretrained vs. self-hosted

1.4.3 Performance considerations

1.5 Generative vs. traditional AI

1.6 Handling the concerns and implications of generative AI

1.6.1 What are the limitations of generative AI?

1.6.2 Will developers lose jobs because of AI?

1.6.3 Are generative AI outputs reliable?

2 Building your first generative AI web application

2.1 Introducing Astra

2.2 Project goal and requirements

2.2.1 Goal: Build a simple interactive AI chat interface

2.2.2 Project and technology requirements

2.2.3 Setting up

2.2.4 Running the project

2.3 Under the hood: The generative AI lifecycle

2.4 Designing for a better user experience

2.5 Building the major components

2.5.1 Frontend

2.5.2 Autoscroll

2.5.3 ChatPage

2.5.4 ChatList

2.5.5 The backend: Handling API communication

2.5.6 Tests

2.5.7 Common challenges and solutions

2.6 Assessing the app’s first iteration

2.7 Migrating the app to Next.js

2.7.1 Setting up

2.7.2 Running the project

2.8 Routing and configuration on Next.js

2.8.1 File-based routing

2.8.2 Configuration

2.8.3 Environment variables in Next.js

2.8.4 Route groups

2.8.5 Layout components

2.8.6 Route API handlers

2.8.7 Going deeper with Next.js

3 Connecting AI models with the Vercel AI SDK

3.1 Introducing the Vercel AI SDK

3.1.1 Key features and benefits

3.1.2 A strategic approach to integration

3.1.3 Practical integration: The Vercel AI SDK with Astra AI

3.2 Handling streaming responses with the Vercel AI SDK

3.2.1 Challenges and how the SDK solves streaming in web applications

3.2.2 Implementing streaming with the Vercel AI SDK

3.2.3 Integrating streaming into Astra AI

3.3 Working with multiple AI providers

3.3.1 Handling different AI providers and models

3.3.2 Using the Vercel AI SDK’s interoperability

3.3.3 Astra AI project: Integrating multiple AI providers and models

3.4 Enhancing conversational UIs with multimedia content

3.4.1 Introducing OpenAI’s vision capabilities

3.4.2 Astra AI project: Integrating Gemini vision queries

4 Managing conversation and state in your application

4.1 AI SDK React server components

4.1.1 Overview of RSCs

4.1.2 Using server actions for AI-powered RSCs

4.1.3 Updating the UI to use server actions

4.1.4 Techniques for generating and streaming UI components

4.1.5 Creating streamable UI components from LLM providers with streamUI

4.1.6 Streaming React components with createStreamableUI

4.2 Managing UI state in AI-powered applications

4.2.1 Separating AI and UI state in React/Next.js applications

4.2.2 Key components for UI state management

4.2.3 Implementing UI state management patterns

4.3 Structured data generation using the Vercel AI SDK

4.3.1 How structured data generation works

4.3.2 Techniques for generating structured data from AI responses

4.3.3 Tools for implementing type-safe AI-generated content

4.3.4 Integrating structured data generation into our web application

4.4 Tool and function calling with AI models

4.4.1 Understanding tool calling and function calling in AI models

4.4.2 Implementing custom tools and functions with the Vercel AI SDK

Part 2 Advanced generative AI techniques and deployment

5 Prompt engineering in web applications

5.1 Introducing prompt engineering

5.1.1 What exactly are prompts?

5.1.2 Prompt types

5.1.3 Organizing your prompts: Versioning, testing, and optimization

5.2 Few-shot learning

5.2.1 Examples of few-shot learning

5.2.2 General methodology for creating few-shot learning prompts

5.3 Chain-of-thought prompting: A deeper dive into reasoning

5.3.1 Example of chain-of-thought prompting

5.3.2 General methodology for creating chain-of-thought prompts

5.4 Embeddings: Giving AI a sense of meaning

5.4.1 The restaurant menu analogy: A taste of embeddings

5.4.2 Using embeddings in practice: The Vercel AI SDK

5.4.3 Use case: IT support knowledge base

5.5 Going deeper into LLM techniques

5.5.1 Tree of thoughts

5.5.2 Self-refine

5.5.3 LLM-as-a-judge

6 Building AI workflows with LangChain.js

6.1 Introducing LangChain

6.1.1 Chaining calls with LangChain

6.1.2 Integration with the Vercel AI SDK

6.2 Preparing and storing documents for retrieval using LangChain

6.2.1 Document ingestion using text splitters

6.2.2 Introducing vector stores

6.2.3 Document retrieval

6.2.4 Full example of preparing and storing documents with LangChain

6.3 Using memory components in LangChain to remember conversation history

6.4 Utilizing agents in LangChain.js

6.4.1 How LangChain agents work

6.4.2 Creating an agent using LangChain.js

6.4.3 Agent integration with the Vercel AI SDK

6.4.4 Overview of LangChain.js modules

6.5 Going deeper with LangChain.js

6.5.1 LangChain Expression Language

6.5.2 LangGraph

7 Document summarization and RAG with LangChain.js

7.1 Building a document summarization web application with LangChain.js

7.1.1 Summarization app project requirements

7.1.2 Architecture and workflow

7.1.3 Building the document summarization web application

7.1.4 Caveats and limitations of document summarization

7.1.5 Demonstrating the app

7.1.6 Additional considerations for summarizing documents

7.2 Building a RAG web application with LangChain.js

7.2.1 RAG app project requirements

7.2.2 Key architectural components of RAG

7.2.3 Technical architecture overview

7.2.4 RAG system components

7.2.5 Web app demonstration

7.2.6 Adding grounding support

8 Testing and debugging techniques

8.1 Debugging Next.js AI applications

8.1.1 Debugging common Next.js rendering issues

8.1.2 Debugging client–server problems

8.1.3 Handling state management

8.1.4 Performance monitoring

8.2 Vercel AI SDK troubleshooting

8.2.1 Handling error states in AI-generated content

8.2.2 Managing token limits and rate limiting

8.3 Troubleshooting LangChain.js

8.3.1 Chain execution errors

8.3.2 Troubleshooting model integration problems

8.4 Testing strategies for AI applications

8.4.1 Unit and integration testing in React and Next.js

8.4.2 Mocking LLM responses

8.4.3 Testing Vercel AI SDK responses

8.4.4 Testing LangChain.js

9 Deployment and security

9.1 Building a secure foundation with input validation, rate limits, and middleware

9.1.1 Input validation

9.1.2 Security middleware layer

9.2 Building a core security and data protection pipeline

9.3 Setting up authentication and authorization

9.3.1 Simple authentication with Clerk.js and Next.js

9.3.2 Practical security control: Rate limiting

9.4 API key and secrets management

9.4.1 Understanding Next.js environment variables

9.4.2 Application-level API keys

9.4.3 User-provided API keys

9.5 Data protection and compliance

9.5.1 Example: Adding anonymization to our chat messages

9.6 Deployment considerations for AI web applications

9.6.1 Deployment options

9.6.2 Production deployment checklist

9.6.3 Example deployment to Vercel

9.6.4 Alternative deployments: Netlify

9.6.5 Alternative deployments: Hugging Face Spaces

9.6.6 Next steps

Part 3 Hands-on projects

10 Building an AI interview assistant: Project walk-through

10.1 Overview of the application

10.1.1 Key features

10.1.2 Technical implementation

10.1.3 Technology stack overview

10.2 Security measures implemented

10.3 Challenges during development

10.3.1 State management considerations

10.3.2 Text-to-speech integration

10.3.3 Generating feedback

10.4 Additional considerations and improvements

11 Building an AI RAG agent: Project walk-through

11.1 Overview of the application

11.1.1 Key features

11.1.2 Technical implementation

11.1.3 Technology stack overview

11.2 Challenges during development

11.2.1 Shared vs. dedicated user data in vector stores

11.2.2 Security considerations around document management and heavy workloads

11.2.3 API design and URL structure to minimize information exposure

11.3 Additional thoughts on AI and the future of web development

Part 4 Advanced integrations and the future of AI

12 Integrating web apps with the Model Context Protocol

12.1 Why the MCP matters for AI integration

12.2 MCP architecture

12.3 Connecting Next.js and the Vercel AI SDK with the MCP

12.3.1 Architecture overview

12.3.2 Building an end-to-end integration with the MCP in Next.js

12.3.3 Benefits of using the MCP for web applications with LLMs

12.4 Inside an MCP server: Extending web applications

12.4.1 MCP server structure

12.4.2 Additional considerations for MCP servers

12.5 Integrating MCP servers with LangChain.js

12.5.1 Architecture overview

12.5.2 Building an end-to-end integration with LangChain.js

12.6 The future of the MCP: Gateways, directories, and MCP-as-a-service

12.6.1 MCP gateways

12.6.2 MCP-as-a-service

12.6.3 MCP directories and registries

12.7 Your next steps with MCP servers

Appendix

Appendix A: Running the examples

A.1 Running examples

A.2 Accessing OpenAI APIs

A.3 Accessing Google AI APIs

A.4 Accessing the Upstash Redis database

A.5 Integrating Clerk.js authentication

Overview

9 Deployment and security

Deploying AI-powered web apps compounds familiar web risks with LLM-specific threats such as prompt injection, model manipulation, and runaway API usage. The chapter frames security as a multilayered pipeline: start with careful threat modeling, treat server-side validation as the source of truth, and let every request traverse progressively stricter checks before it reaches core logic. It calls out stack-specific pitfalls (Next.js API routes and SSR surfaces, configuration leaks through abstractions like the Vercel AI SDK, and the fast-moving ecosystem around LangChain.js) and urges up-to-date dependencies, conservative defaults, and strict control of public endpoints.

Concrete defenses include robust input validation (e.g., Zod) guided by a clear threat model, plus a composable middleware layer that handles CORS, authentication, security headers, anomaly detection, and rate limiting. The book demonstrates Redis-backed rate limits and per-user quotas to prevent abuse and control costs, alongside practical auth/authorization with Clerk.js to gate sensitive routes and enforce fair-use policies. It emphasizes secrets hygiene—never exposing private keys to the client, preferring server-only environment variables, and using server-executed code paths—while warning against storing or relying on user-supplied API keys. Because data protection is paramount, it recommends encrypting sensitive data, least-privilege access, retention policies, thorough audit logging, and PII redaction/anonymization before data reaches an LLM to meet GDPR/CCPA expectations.

On deployment, the chapter surveys hosted, containerized, and self-hosted paths, recommending Vercel for a streamlined baseline with HTTPS, CI/CD, environment variable management, and solid observability, while noting Docker/Kubernetes and self-hosting for advanced control. A production checklist—scaled by traffic tiers—covers cost management, privacy and compliance, security hardening (including WAF and upstream rate limits), latency optimization, scaling and reliability, monitoring and alerting, and safe rollout strategies. Pre-deployment steps (populate env vars, verify Node/runtime, ensure local builds) and CLI-driven releases are complemented by post-deployment practices: enable analytics and logs, add external alerting if needed, apply firewall rules, and continuously test and monitor. The overarching message is to bake security, privacy, and observability into the pipeline from day one, then iterate safely as features and usage grow.

A flow diagram illustrating the multilayered security checks from user input validation through request processing, showcasing multiple defensive layers including input validation, security middleware, and rate limiting mechanisms.

A layered approach covering secure development practices, authentication and authorization, API key management, and data protection and compliance to ensure robust security and threat prevention.

Message quota implementation flow. A flowchart depicting the server quota enforcement process, where user requests are authenticated via Clerk, checked against Redis-stored daily limits, and either processed or rejected based on quota status.

Overview of the message quota implementation. The top part displays the Upstash Data Browser, showing the Redis key and value tracking the number of messages sent by a user on a given day. The bottom part shows the API response, indicating a 429 Too Many Requests status code when the user exceeds their daily message limit of 10 messages.

OpenAI API usage dashboard displaying metrics for a GPT-4 model instance, showing API requests and total token usage. The dashboard interface includes a navigation sidebar with options for monitoring various aspects of the API deployment.

Simple data anonymization feature in action. The user provides a message containing personal information (name, email address, and phone number), and the system responds by replacing the sensitive data with generic placeholders: PERSON_NAME, EMAIL_ADDRESS, and PHONE_NUMBER.

Configuration interface for project environment variables in the Vercel Dashboard, displaying several masked API keys and authentication tokens.

Vercel Dashboard build configuration settings screen showing where we set the Node.js version to 18.x for both the build step and serverless functions.

Terminal output showing successful Vercel CLI deployment of a chat application. The deployment shows both the project dashboard URL and the preview domain, along with instructions for promoting to production using the "vercel --prod" command.

Terminal output showing a failed Vercel deployment attempt with error messages. The output includes links to the deployment dashboard and preview URLs, along with instructions for viewing detailed error logs either through the web interface or via the "vercel logs" CLI command.

The Netlify UI dashboard for a successful deployment, showing the chat-deployment.netlify.app project overview, with a "Production deploys" section.

Summary

AI-powered applications face unique security threats like prompt injection, model manipulation, and API abuse, requiring specialized security measures beyond traditional software practices.
Securing AI applications involves a multi-layered approach, starting with input validation to prevent malicious data from reaching the AI models.
Establishing a threat model is crucial for identifying potential vulnerabilities in your application, including public endpoints, user input points, and the sensitivity of the data being processed.
Server validation is essential to ensure data integrity and security, as it cannot be bypassed by malicious users, unlike client validation which primarily enhances user experience.
A security middleware layer acts as a central decision point, analyzing incoming requests for potential threats using techniques such as rate limiting, signature matching, token validation, and machine learning models.
Effective security middleware should be positioned at the beginning of the request processing pipeline to intercept and analyze all incoming requests before they reach the application's core logic.
Rate limiting is a crucial security control for protecting APIs from abuse by setting a maximum threshold on the number of requests allowed within a specific timeframe.
User-provided API keys can be stored on the server (encrypted in a database) or on the client, but server storage is recommended for better security and control.
Prioritizing data protection and compliance with regulations like GDPR is critical when building AI applications that store user data.
Anonymizing or pseudonymizing chat histories helps protect user privacy and comply with data protection regulations, especially when using the data for model training.
Deploying AI web applications requires careful consideration of cost management, data privacy, and latency.
Deployment involves not only the application itself but also the configuration of databases and external services (e.g., authentication, rate limiting) for production use.
Dedicated providers like Vercel offer streamlined deployment workflows specifically designed for Next.js applications, including CDN integration, monitoring, and firewalls.
For more advanced deployment needs, Docker and Kubernetes are powerful tools for containerization and orchestration.

FAQ

What security threats are unique to AI web apps, and how does this chapter suggest mitigating them?

Common AI-specific risks: prompt injection, model manipulation, and API abuse (credit drain, DoS via LLM calls).
Mitigations follow a multilayered model: input validation (e.g., Zod schemas), security middleware (CORS, auth checks, headers), rate limiting (Upstash/edge WAF), and strict authentication/authorization with quotas.
Continuously update dependencies (e.g., LangChain.js) and validate external data sources to avoid harmful inputs.

How should I approach input validation in Next.js to reduce prompt injection and abuse?

Establish a threat model: map public endpoints, user input points, and data sensitivity.
Always do server-side validation (client-side is only UX). Treat all client input as untrusted.
Use a schema validator (e.g., Zod) to enforce constraints like length (e.g., prompt max 1000 chars) and shape before processing.
Mirror limits in the UI (e.g., textarea maxLength) for immediate feedback.
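
The server-side checks above can be sketched without any dependency. This is a hand-written stand-in for the kind of constraints a schema validator such as Zod would enforce, not the book's implementation; `validateChatInput`, `MAX_PROMPT_LENGTH`, and the `{ prompt }` body shape are assumptions chosen for illustration.

```typescript
// Hypothetical limit, mirroring the "prompt max 1000 chars" example above.
const MAX_PROMPT_LENGTH = 1000;

type ValidationResult =
  | { ok: true; prompt: string }
  | { ok: false; error: string };

// Server-side check: the request body is treated as untrusted `unknown` input.
function validateChatInput(body: unknown): ValidationResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Body must be a JSON object" };
  }
  const prompt = (body as Record<string, unknown>).prompt;
  if (typeof prompt !== "string") {
    return { ok: false, error: "prompt must be a string" };
  }
  const trimmed = prompt.trim();
  if (trimmed.length === 0) {
    return { ok: false, error: "prompt must not be empty" };
  }
  if (trimmed.length > MAX_PROMPT_LENGTH) {
    return { ok: false, error: "prompt exceeds " + MAX_PROMPT_LENGTH + " characters" };
  }
  return { ok: true, prompt: trimmed };
}
```

A route handler would call this on the parsed body and return a 400-class response whenever `ok` is false, before any model call is made.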

Where should security middleware run in Next.js, and what checks should it perform?

Place middleware at the very start of the request pipeline; on Vercel, run it at the edge for lower latency.
Typical checks: CORS, authentication/authorization (e.g., Clerk protect), rate limiting, security headers, known attack signature checks, header/user-agent/IP anomalies.
Use a composable pattern to sequence controls and short-circuit on violations.
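
The composable pattern can be illustrated with a small, framework-free sketch. The `Req`, `Res`, and `Check` types and the two example checks are hypothetical and kept synchronous for brevity; real Next.js middleware works with the framework's own request and response objects and is usually asynchronous.

```typescript
// Minimal request/response shapes (assumptions; a real app uses the framework's types).
interface Req {
  headers: Record<string, string>;
  ip: string;
}
interface Res {
  status: number;
  body: string;
}

// A check returns a response to reject the request, or null to let it continue.
type Check = (req: Req) => Res | null;

// Run checks in order and short-circuit on the first one that rejects.
function composeChecks(checks: Check[]): (req: Req, handler: (req: Req) => Res) => Res {
  return (req, handler) => {
    for (const check of checks) {
      const rejection = check(req);
      if (rejection !== null) return rejection;
    }
    return handler(req);
  };
}

// Hypothetical checks, for illustration only.
const requireAuth: Check = (req) =>
  req.headers["authorization"] ? null : { status: 401, body: "Unauthorized" };

const blockedIps = ["203.0.113.7"];
const blockKnownBadIps: Check = (req) =>
  blockedIps.indexOf(req.ip) >= 0 ? { status: 403, body: "Forbidden" } : null;

// Cheapest checks first, so abusive traffic is dropped before heavier work runs.
const guard = composeChecks([blockKnownBadIps, requireAuth]);
```

Because each control is just a function, new checks (security headers, anomaly detection) can be added to the array without touching the others.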

How do I implement rate limiting for AI endpoints with Upstash Redis?

Use @upstash/ratelimit with a sliding window (e.g., 5 requests per 10s) and identify clients by IP (or user ID for authenticated calls).
On limit breach, return 429 Too Many Requests and stop further processing.
Prefer edge-level or managed rate limiting (Cloudflare, AWS WAF, Vercel Firewall) to block abusive traffic before it reaches your app.
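
To make the sliding-window idea concrete, here is a minimal in-memory sketch. It is not the `@upstash/ratelimit` API: `createSlidingWindowLimiter` is a hypothetical helper that keeps timestamps in process memory, which would not survive across serverless invocations, so treat it only as an illustration of the logic a Redis-backed limiter applies.

```typescript
// Returns an `allow(clientId, now)` function permitting at most `limit`
// requests per `windowMs` for each client. State is in-memory only.
function createSlidingWindowLimiter(
  limit: number,
  windowMs: number
): (clientId: string, now?: number) => boolean {
  const hits: Record<string, number[]> = Object.create(null);

  return (clientId, now = Date.now()) => {
    const windowStart = now - windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (hits[clientId] || []).filter((t) => t > windowStart);
    if (recent.length >= limit) {
      hits[clientId] = recent;
      return false; // caller should respond with 429 Too Many Requests
    }
    recent.push(now);
    hits[clientId] = recent;
    return true;
  };
}

// The "5 requests per 10s" example from above, keyed by IP or user ID.
const allowRequest = createSlidingWindowLimiter(5, 10000);
```

The `now` parameter exists so the window can be exercised deterministically in tests; production callers would omit it.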

How can I enforce a daily message quota per user to control LLM costs?

Authenticate users (e.g., Clerk) and derive a per-user key such as message_count:{userId}:{YYYY-MM-DD}.
Use Redis INCR and set EXPIRE 24h on first write; if count exceeds the quota (e.g., 10), return 429 and block processing.
Place rate limiting before quota checks to reduce Redis churn; consider cooldowns to deter rapid bursts.
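
The quota flow above can be sketched against a tiny counter interface modeled on Redis INCR/EXPIRE semantics. `CounterStore`, `createMemoryStore`, `quotaKey`, and `checkDailyQuota` are hypothetical names, the store is an in-memory stand-in, and the calls are synchronous for brevity, whereas a real Redis client would be asynchronous.

```typescript
// Minimal counter-store shape modeled on Redis INCR/EXPIRE (an assumption).
interface CounterStore {
  incr(key: string): number; // returns the value after incrementing
  expire(key: string, seconds: number): void;
}

const DAILY_QUOTA = 10; // the example quota from above

// Builds a key like message_count:user_1:2026-02-01 (UTC date).
function quotaKey(userId: string, now: Date): string {
  return "message_count:" + userId + ":" + now.toISOString().slice(0, 10);
}

// Returns the HTTP status to use: 200 to proceed, 429 when over quota.
function checkDailyQuota(store: CounterStore, userId: string, now: Date): number {
  const key = quotaKey(userId, now);
  const count = store.incr(key);
  if (count === 1) {
    store.expire(key, 24 * 60 * 60); // first write of the day: let the key expire on its own
  }
  return count > DAILY_QUOTA ? 429 : 200;
}

// In-memory stand-in for Redis, for local experiments only (TTLs are recorded, not enforced).
function createMemoryStore() {
  const counts: Record<string, number> = Object.create(null);
  const ttls: Record<string, number> = Object.create(null);
  return {
    incr(key: string): number {
      const next = (counts[key] || 0) + 1;
      counts[key] = next;
      return next;
    },
    expire(key: string, seconds: number): void {
      ttls[key] = seconds;
    },
    ttl(key: string): number | undefined {
      return ttls[key];
    },
  };
}
```

Putting the date in the key means a new day starts a fresh counter even if the previous key has not expired yet.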

What are best practices for API key and secret management in Next.js?

Never expose secrets to the client. Variables prefixed with NEXT_PUBLIC_ are exposed to the browser, so never put secrets in them; keep secrets in server-only variables.
Inject secrets at deploy time (e.g., Vercel Environment Variables). Avoid committing defaults in code.
Run sensitive code server-side (e.g., with the "use server" directive) and proxy API calls through server routes.
Application-level keys: keep private and monitor usage. User-provided keys: store server-side, encrypted, and proxy access; avoid client-side storage.
Audit and rotate keys regularly.
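
One way to make the server-only rule harder to violate by accident is a small guard around secret lookups. This is a sketch under assumptions: `requireServerSecret` is a hypothetical helper, and the environment is passed in as a plain record (on the server this would typically be `process.env`).

```typescript
type Env = Record<string, string | undefined>;

// Reads a secret from the given environment and fails loudly on misuse.
function requireServerSecret(env: Env, name: string): string {
  // NEXT_PUBLIC_ variables are exposed to the browser, so they must never hold secrets.
  if (name.indexOf("NEXT_PUBLIC_") === 0) {
    throw new Error(name + " is exposed to the browser and must not hold a secret");
  }
  const value = env[name];
  if (!value) {
    throw new Error("Missing required environment variable: " + name);
  }
  return value;
}
```

Failing at startup or on first use is usually easier to diagnose than an opaque provider error caused by an undefined key.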

How do I anonymize or redact PII before sending user messages to an LLM?

Add an anonymization step in your API handler or middleware to redact PII before logging or forwarding prompts.
For demos: use a simple library like redact-pii; for production: prefer robust tools (Microsoft Presidio or Google Cloud DLP), especially for multi-language support and better accuracy.
Verify logs to ensure only redacted text reaches the LLM; if the UI shows PII, inspect client rendering logic.
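
A demo-grade version of the redaction step can be sketched with two regular expressions. This is not the `redact-pii` API; `redactPii` and both patterns are illustrative assumptions that only catch emails and phone-like digit runs, which is far less than a production tool covers (names, for example, are not detected).

```typescript
// Simple email pattern; intentionally loose, for demonstration only.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Loose phone pattern: optional "+", then at least 7 characters that start and
// end with a digit, with spaces, dots, dashes, or parentheses allowed between.
const PHONE_PATTERN = /\+?\(?\d[\d\s().-]{5,}\d/g;

// Replaces matches with generic placeholders before logging or forwarding a prompt.
function redactPii(text: string): string {
  return text
    .replace(EMAIL_PATTERN, "EMAIL_ADDRESS")
    .replace(PHONE_PATTERN, "PHONE_NUMBER");
}
```

Running the redaction in the API handler, before both the log call and the model call, keeps the raw message from reaching either destination.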

What authentication and authorization setup is recommended to prevent abuse and control costs?

Require sign-in (e.g., Clerk.js with Next.js middleware) and protect sensitive routes (e.g., /chat).
Use OAuth/social login or passwordless for convenience; add MFA for higher security or regulated apps.
Combine auth with per-user quotas and rate limiting to cap API usage and cost exposure.

What deployment options work well for AI web apps, and when should I choose each?

Vercel: best default for Next.js; provides HTTPS, edge middleware, CI/CD, logs, analytics, env var management.
Docker/Kubernetes: pick for complex, high-scale, or custom infra needs (control over autoscaling, rollouts, geo-distribution).
Self-hosting: maximum control with higher operational overhead.
Alternatives: Netlify (static + serverless), Hugging Face Spaces (AI/model-centric hosting).

What belongs on my production checklist, and how do I monitor after launch?

Checklist highlights: cost controls (LLM pricing + quotas), privacy/compliance (GDPR/CCPA; encrypt in transit/at rest; anonymize), security (WAF, rate limits, key rotation), performance (caching, latency), CI/CD, observability.
Monitoring: use Vercel Analytics and Logs; integrate Sentry/Datadog/New Relic for richer tracing and alerts; add firewall rules in Vercel Firewall.
Troubleshooting Vercel: verify Node version, env vars, local npm run build, clear cache (vercel --force), inspect build logs, and consult Vercel discussions.

pro $24.99 per month

access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
choose one free eBook per month to keep
exclusive 50% discount on all purchases
renews monthly, pause or cancel renewal anytime

lite $19.99 per month

access to all Manning books, including MEAPs!

team

5, 10 or 20 seats+ for your team - learn more

eBook

pdf, ePub, online

$47.99 $26.39

you save $21.60 (45%)

include audio $24.99 $13.74
