1 Building on quicksand: the challenges of vibe engineering
AI-assisted development delivers rapid prototyping and early learning, but speed without discipline—“vibe coding”—creates brittle systems, security holes, and code no one truly owns. The chapter argues that model upgrades won’t rescue poor process: scaling has hit diminishing returns, so advantage shifts from raw horsepower to engineering rigor. “Vibe engineering” is proposed as the remedy: combine creative LLM prototyping with professional practices—clear intent, tight abstractions, verification, and operational guardrails—so teams turn intuition into reliable, production-safe software.
Through real incidents—hacks within days of launch, destructive file operations, a supply-chain compromise, and an overzealous agent deleting production data—the text exposes a new class of systemic risks rooted in unverified, context-detached code and automation bias. The hidden cost is “trust debt”: short-term velocity that offloads verification to reviewers and senior engineers, eroding vigilance and ownership. The cure is a verify-then-merge culture anchored in executable specifications that act as contracts: specs and tests first, then generation; PR checklists and policy gates; sandboxing, canaries, and fast rollback; retrieval for grounding; guarded automation; and CI pipelines that enforce security, performance, and correctness before a human approves.
The human role shifts from line-by-line author to system designer and validator, with ownership measured by the quality of the mental model, not who typed the code. To counter the “70% problem” (easy scaffolding, hard last mile) and the comprehension bottleneck, the chapter prescribes a repeatable loop—Vibe → Specify/Plan → Task/Verify → Refactor/Own—where specifications, property tests, SLO gates, and domain invariants define success up front, and agents work within those boundaries. Tools amplify this approach, but the mindset is the real change: treat LLMs like costed compute, master context and orchestration, and elevate craft into engineering by making taste and intent explicit, auditable, and executable.
The autonomy-risk spectrum: each step grants more leverage but demands tighter verification, governance, and engineering discipline
High-velocity, AI-powered app generation without professional rigor creates brittle, misleading progress.
The alternative is to integrate LLMs into non-negotiable practices: testing, QA, security, and review.
Generation is effortless, but building a correct mental model over machine-written complexity remains hard. Real ownership depends on understanding, not just producing, code.
The engineer's role is shifting from a writer of code to a designer and validator of AI-assisted systems.
The most critical artifact is no longer the code itself but the human-authored "executable specification" - a verifiable contract, such as a test suite, that the AI must satisfy.
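The chapter's own illustration of such a contract is an ISBN-13 validator verified by a pytest suite. The sketch below shows the shape of that workflow under stated assumptions: the test functions are the human-authored spec written first, and `isbn13_valid` stands in for whatever implementation a model generates to satisfy them (the function name and test cases here are illustrative, not taken from the book).

```python
def isbn13_valid(s: str) -> bool:
    """One possible model-generated implementation; in a spec-first
    flow, this part is disposable and the tests below are not."""
    digits = s.replace("-", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    # ISBN-13 checksum: alternating weights 1 and 3 must sum to 0 mod 10.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0

# The executable specification: written before any implementation,
# model-agnostic, and the sole arbiter of correctness.
def test_known_valid_isbn():
    assert isbn13_valid("9780306406157")

def test_hyphenated_form_accepted():
    assert isbn13_valid("978-0-306-40615-7")

def test_checksum_off_by_one_rejected():
    assert not isbn13_valid("9780306406158")

def test_malformed_input_rejected():
    assert not isbn13_valid("978-not-an-isbn")
    assert not isbn13_valid("12345")
```

Any implementation from any model that passes this suite is acceptable; one that fails is rejected, regardless of how plausible the code looks.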
Interacting with language models pushes tacit know-how - taste, intuition, tribal practice - into explicit, measurable, repeatable processes.
The AI transition elevates software work to a higher level of abstraction and reliability, which requires strong communication, delegation, and planning skills.
The goal of this book is to deliver practical patterns for the AI era: migrating legacy code, defining precise prompts and contexts, collaborating with agents, building realistic cost models, adopting new team topologies, and applying staff-level techniques (e.g. squeezing out performance).
FAQ
What is “vibe coding,” and why is it risky in production?
Vibe coding is an intuition-first, LLM-powered way to spin up working software fast, often without tests, security hygiene, or deep verification. It creates an illusion of speed: code “looks functional” but is brittle, opaque, and easy to exploit. Documented failures include a startup hacked within days, an AI CLI command that effectively erased a project, a supply-chain trojan via an AI-authored PR, and an agent that “cleaned” production data by deleting thousands of records.

How does “vibe engineering” differ from vibe coding?
Vibe engineering is systematic and evidence-driven. It wraps the probabilistic core of LLMs in a deterministic shell of human intent, anchored by executable specifications. It emphasizes rigorous testing, security, error handling, edge cases, non-functional requirements (performance, scalability, reliability), and production stability. It treats the model as a replaceable component; correctness comes from the process and contracts, not the provider.

What is “trust debt,” and how does it accumulate?
Trust debt is the hidden, compounding cost of shipping AI-generated code without adequate verification. It grows under “dump-and-review” habits, where authors offload verification to reviewers and automation bias dulls vigilance. Symptoms include over-trusting green tests, late cognitive handoffs, and senior engineers spending weeks reverse-engineering AI output during incidents—costs that velocity dashboards don’t show.

Why won’t the “next, bigger model” solve these problems?
Scale now shows diminishing returns: hallucinations, context blind spots, and the need for human verification persist. Data scarcity and reuse further limit gains. Competitive advantage shifts from having the strongest model to mastering usage: clear intent, retrieval, orchestration, testing, and cost/latency-aware operations.
Process quality, not raw horsepower, closes the gap from “impressive demo” to “production-safe.”

What is an executable specification, and why is it central to reliability?
An executable spec is a human-authored, runnable contract (tests, properties, API schemas, perf gates) that defines correctness before code exists. LLMs generate implementations that must satisfy the spec. Example: an ISBN-13 validator verified by a pytest suite. Different models produced different code, but all passed the same tests—proving correctness comes from the spec, not the model.

What concrete techniques define vibe engineering in day-to-day work?
- Systematic prompt engineering with code slices and tests so outputs compile and pass immediately
- Retrieval-augmented and grounded answers with citations to cut hallucinations
- Model-driven first-pass PR review via fixed checklists (validation, auth, perf)
- Incident triage: log summarization and reversible fix drafts
- Guarded automation: agents propose PRs, CI/policy gates enforce, auto-rollback on regressions
- Sandbox, canary, and policy gates (security, compliance, licensing) with full provenance

How should teams transition from prototype to production without accruing trust debt?
Adopt the loop: Vibe → Specify/Plan → Task/Verify → Refactor/Own. Use early prototyping to learn the domain, then freeze intent in executable specs, decompose into ≤2h tasks with clear “Done” checks, verify-then-merge (not dump-and-review), and finish with refactor/own so the team understands and documents the code it ships.

What is the autonomy–risk ladder, and how should we govern it?
It progresses from token completion → block suggestions → conversational IDE agents → local autonomous agents → near-fully autonomous developer agents. Each rung adds leverage and failure complexity, demanding tighter verification, governance, and staged rollout (sandbox → canary → broad). Never scale autonomy faster than you scale verification and auditability.

What is the “70% problem,” and what lives in the hard 30%?
AI accelerates the first ~70% (scaffolding and common patterns) but struggles with the last 30%: edge cases, architectural fit, comprehensive verification (property, mutation, performance, security), compliance, and performance/scalability. That final mile requires human judgment, domain context, and adversarial thinking—exactly what executable specs and strong CI gates enforce.

How do teams maintain real code ownership when AI writes much of the code?
- Make specs the source of truth; treat prompts/specs as versioned, reviewable artifacts
- Prefer small, staged commits with rationale and provenance
- Keep humans engaged at the right level (intent, invariants, risks), not just line-reading
- Avoid “machine verifying the machine” by curating adversarial tests and mutation thresholds
- Instrument production, enable fast rollback, and ask the litmus question: “Would you go on-call for this system?”
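The CI/policy gates mentioned throughout (verify-then-merge, mutation thresholds, SLO checks) can be sketched as a deterministic policy function over build metrics. The metric names and thresholds below are illustrative assumptions, not values from the book; a real gate would read these from versioned CI configuration and scanner output.

```python
from dataclasses import dataclass

@dataclass
class BuildMetrics:
    tests_passed: bool      # did the executable spec pass?
    mutation_score: float   # fraction of injected mutants killed by tests
    p95_latency_ms: float   # measured performance, checked against an SLO
    secrets_found: int      # hits reported by a secret scanner

# Illustrative thresholds; in practice these live in reviewable policy files.
MUTATION_FLOOR = 0.80
P95_SLO_MS = 250.0

def merge_allowed(m: BuildMetrics) -> tuple[bool, list[str]]:
    """Return (verdict, reasons) so the gate is auditable, not a silent block."""
    reasons: list[str] = []
    if not m.tests_passed:
        reasons.append("executable spec failing")
    if m.mutation_score < MUTATION_FLOOR:
        reasons.append(f"mutation score {m.mutation_score:.2f} below {MUTATION_FLOOR}")
    if m.p95_latency_ms > P95_SLO_MS:
        reasons.append(f"p95 {m.p95_latency_ms}ms exceeds SLO {P95_SLO_MS}ms")
    if m.secrets_found > 0:
        reasons.append("secret scanner reported hits")
    return (not reasons, reasons)
```

An agent-proposed PR that passes returns `(True, [])`; one that fails returns `(False, [...])` with explicit reasons, giving the human approver provenance for the decision instead of a bare red X.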