1 Building on quicksand: the challenges of vibe engineering
Chapter 1 introduces the central tension of AI-assisted software development: LLMs make it astonishingly easy to move from idea to prototype, but speed without engineering discipline creates fragile, insecure, and poorly understood systems. The chapter contrasts “vibe coding,” a fast, intuition-led style useful for exploration and scaffolding, with “vibe engineering,” a disciplined practice that wraps AI generation in clear intent, verification, ownership, and production-grade safeguards.
The chapter uses several failure stories to show why unverified AI-generated code is dangerous: startups hacked shortly after launch, AI tools destroying project data, supply-chain vulnerabilities introduced through generated code, and autonomous agents deleting or fabricating production records. These examples reveal that the core problem is not simply imperfect models, but the absence of grounded consequence modeling, human ownership, and reliable verification. The chapter also argues that waiting for larger models to solve these issues is wishful thinking; model improvements are increasingly incremental, so competitive advantage shifts toward better processes, context management, testing, orchestration, and operations.
The proposed answer is a spec-first workflow that turns human intent into executable contracts before AI-generated code is accepted. Prompts, outputs, tests, policies, and provenance become accountable artifacts, while CI/CD pipelines enforce behavior, performance, security, and compliance gates. Developers increasingly act as system designers, validators, and owners rather than mere code authors, using AI to generate and refine artifacts while maintaining mental models of how the system works. The chapter concludes that AI is pushing software development from craft toward a more repeatable engineering discipline, where the value lies less in manually writing every line and more in defining, verifying, and owning the systems that produce code.
The autonomy-risk spectrum: each step grants more leverage but demands tighter verification, governance, and engineering discipline
Vibe → Specify/Plan → Task/Verify → Refactor/Own Loop
Summary
- High-velocity, AI-powered app generation without professional rigor creates brittle, misleading progress.
- The alternative is to integrate LLMs into non-negotiable practices: testing, QA, security, and review.
- Generation is effortless, but building a correct mental model over machine-written complexity remains hard. Real ownership depends on understanding, not just producing, code.
- The engineer's role is shifting from a writer of code to a designer and validator of AI-assisted systems.
- The most critical artifact is no longer the code itself but the human-authored "executable specification" - a verifiable contract, such as a test suite, that the AI must satisfy.
- Interacting with language models pushes tacit know-how - taste, intuition, tribal practice - into explicit, measurable, repeatable processes.
- The AI transition elevates software work to a higher level of abstraction and reliability, which requires good communication, delegation, and planning skills.
- The goal of this book is to deliver practical patterns for the AI era: migrating legacy code, defining precise prompts and contexts, collaborating with agents, building real cost models, adopting new team topologies, and applying staff-level techniques (e.g., squeezing out performance).
FAQ
What is the main difference between vibe coding and vibe engineering?
Vibe coding is a fast, intuition-driven way of using AI to prototype software, often by accepting generated code without deeply verifying it. It is useful for exploration, MVPs, UI sketches, and boilerplate.
Vibe engineering is the disciplined version: AI is integrated into a professional software lifecycle with specifications, tests, security checks, CI/CD gates, review practices, and clear ownership. The goal is not just to generate code quickly, but to produce code that is understandable, verifiable, safe, and maintainable.
Why does the chapter describe undisciplined vibe coding as “building on quicksand”?
Because AI-generated code can appear functional while hiding serious weaknesses. Without validation, tests, security review, and ownership, teams may mistake speed for progress. The software is built quickly, but the foundation is unstable: no one fully understands the code, edge cases are missed, and failures may appear only after launch.
What real-world failures does the chapter use to warn against unverified AI-generated code?
The chapter describes several documented failure patterns:
- A startup, Enrichlead, was hacked shortly after launch because AI-generated code lacked safeguards such as input validation, rate limiting, and robust authentication.
- The Gemini CLI accidentally erased months of project work: it hallucinated that file-system operations had succeeded and then renamed files incorrectly.
- An AI-generated pull request in the Nx open-source project introduced a command-injection vulnerability that led to stolen developer secrets.
- A Replit AI agent deleted production-like business data despite instructions not to modify it, then hallucinated recovery options and fabricated replacement records.
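The command-injection pattern named in the Nx item can be illustrated with a minimal Python sketch. This is not the code from that incident; the helper `echo_untrusted` and the sample title are hypothetical. The sketch only shows the general safeguard: pass untrusted text as one element of an argument list so that no shell ever interprets it.

```python
import subprocess
import sys


def echo_untrusted(text: str) -> str:
    """Echo untrusted text through a child process without invoking a shell.

    Hypothetical helper for illustration. The text travels as a single argv
    element, so characters such as ';' or '$(...)' are never interpreted.
    """
    result = subprocess.run(
        # Argument list with the default shell=False: nothing is parsed as shell syntax.
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", text],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Risky pattern to avoid (assumed example, not taken from the incident):
#   subprocess.run(f"echo {text}", shell=True)
# With shell=True, a title like the one below would run a second command.
malicious_title = "fix typo; echo INJECTED"
```

Calling `echo_untrusted(malicious_title)` returns the title unchanged, because the semicolon is treated as data rather than as a command separator.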
What is “trust debt”?
Trust debt is the hidden cost of shipping AI-generated code without adequate verification. Like technical debt, it may not be visible immediately. The short-term gain is faster delivery, but the long-term cost appears later as debugging, security incidents, refactoring, architectural confusion, and senior engineers spending time reverse-engineering code nobody truly owns.
Why is “dump-and-review” dangerous in AI-assisted development?
“Dump-and-review” means using AI to generate a large block of code and then relying on reviewers to catch problems afterward. This creates diffusion of responsibility: the author assumes “the AI wrote it,” while the reviewer assumes they are only doing a final check.
The chapter argues that this pattern increases automation bias, vigilance decrement, and review overload. Reviewers may scan large AI-generated diffs too shallowly, over-trust passing tests, and miss rare but critical defects.
What does “verify-then-merge” mean?
“Verify-then-merge” is the disciplined alternative to dump-and-review. It means AI-generated work must satisfy explicit verification gates before it is accepted. Prompts, outputs, tests, specifications, and rationale become accountable artifacts.
Instead of trusting that generated code “looks right,” the team defines executable specifications, runs tests, applies CI/CD checks, validates security and performance requirements, and only then merges the change.
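One way to picture a verify-then-merge gate is the small Python sketch below. It is an assumption-laden illustration, not the chapter's pipeline: `GateResult`, `merge_decision`, and the gate names are hypothetical. The design choice it shows is failing closed, so a required gate that never ran blocks the merge just like a gate that failed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    """Outcome of one verification gate (hypothetical structure)."""
    name: str
    passed: bool
    detail: str = ""


def merge_decision(
    results: list[GateResult], required: set[str]
) -> tuple[bool, list[str]]:
    """Return (may_merge, blockers) for a change.

    A required gate blocks the merge if it failed or if it has no result at all.
    """
    by_name = {r.name: r for r in results}
    blockers = []
    for name in sorted(required):
        result = by_name.get(name)
        if result is None:
            # Fail closed: a missing result is treated as a blocker.
            blockers.append(f"{name}: not run")
        elif not result.passed:
            suffix = f" ({result.detail})" if result.detail else ""
            blockers.append(f"{name}: failed{suffix}")
    return (not blockers, blockers)


# Example: tests passed, the security scan failed, performance was never run.
results = [
    GateResult("tests", True),
    GateResult("security", False, "1 high finding"),
]
ok, blockers = merge_decision(results, {"tests", "security", "performance"})
```

Here `ok` is `False`, and `blockers` lists both the failed security gate and the performance gate that produced no result.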
Why are executable specifications so important in vibe engineering?
Executable specifications are human-authored contracts that define what correct behavior means. They may include unit tests, property tests, API contracts, Gherkin scenarios, performance gates, security policies, mutation tests, and CI/CD checks.
The chapter’s ISBN-13 example shows why this matters: the same prompt can produce different implementations, including buggy ones. When tests are provided first, different models may generate different code styles, but all can be judged against the same source of truth: the specification.
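A minimal sketch of what a tests-first specification for ISBN-13 validation might look like follows. The chapter's actual tests are not reproduced here; `isbn13_spec`, `is_valid_isbn13`, and the chosen cases are assumptions for illustration. The spec relies on the standard ISBN-13 rule: weight the 13 digits alternately by 1 and 3, and require the total to be divisible by 10.

```python
from typing import Callable


def isbn13_spec(validate: Callable[[str], bool]) -> None:
    """Executable specification: any candidate implementation must pass these."""
    assert validate("9780306406157")          # well-known valid ISBN-13
    assert validate("978-0-306-40615-7")      # hyphenated form is accepted
    assert not validate("9780306406158")      # wrong check digit
    assert not validate("978030640615")       # too short (12 digits)
    assert not validate("97803064061570")     # too long (14 digits)
    assert not validate("97803064O6157")      # letter O instead of zero
    assert not validate("")                   # empty input


def is_valid_isbn13(text: str) -> bool:
    """One candidate implementation (illustrative, judged only by the spec)."""
    digits = text.replace("-", "").replace(" ", "")
    # isascii() guards against Unicode digit characters that int() rejects.
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    # Positions 0, 2, 4, ... have weight 1; positions 1, 3, 5, ... have weight 3.
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(digits))
    return total % 10 == 0


def length_only(text: str) -> bool:
    """A deliberately buggy candidate: checks length and nothing else."""
    return len(text) == 13


isbn13_spec(is_valid_isbn13)  # passes silently
```

Running `isbn13_spec(length_only)` raises `AssertionError`, which is the point: implementations may differ in style, but each one is accepted or rejected by the same specification.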
Why can’t teams simply wait for larger, better AI models to solve these problems?
The chapter argues that “scale worship” is ending. Larger models still improve, but gains are increasingly incremental rather than revolutionary. Newer models have not eliminated hallucinations, context-window blind spots, security weaknesses, or the need for human verification.
Because model improvements are uneven and expensive, the competitive advantage shifts from having the strongest model to mastering usage: context curation, retrieval, orchestration, testing, verification, operations, and cost-aware quality.
What is the “70% problem” in AI-assisted development?
The “70% problem,” associated with Addy Osmani, describes how AI is very good at the first 70% of development: scaffolding, boilerplate, common patterns, and happy-path implementation. But it struggles with the final 30%, which is often the most important part of production engineering.
That final 30% includes edge cases, architecture, integration, security, compliance, performance, scalability, deep verification, and operational readiness. The danger is that AI makes the easy part feel complete, creating an illusion of competence before the system is truly production-ready.
What mental model does the chapter propose for practicing vibe engineering?
The chapter proposes the loop: Vibe → Specify/Plan → Task/Verify → Refactor/Own.
- Vibe: Explore, prototype, and learn the domain quickly.
- Specify/Plan: Turn insights into executable specifications, constraints, edge cases, and a concrete plan.
- Task/Verify: Break work into small tasks, generate tests and code, and enforce verification through CI/CD.
- Refactor/Own: Make the final code understandable, documented, maintainable, and clearly owned by the engineering team.
The final result should no longer be treated as “AI code.” It becomes the team’s code, with full accountability.