Overview

6 Normalization and Implementation

This chapter guides you through normalizing and implementing a relational database design for an online store scenario, emphasizing data integrity and clarity. It explains functional dependency and how successive normal forms (1NF through BCNF) systematically remove redundancy and anomalies. Practical rules emerge: every table must have a primary key, avoid multi-valued columns, ensure every non-key attribute depends only on a key, and eliminate transitive dependencies. The chapter illustrates how to split tables to meet these rules (for example, moving line-item details into a junction table and separating supplier info from products) and how to detect subtle transitive dependencies, including those that appear across a cycle of relationships.

Once normalized, the design is translated into SQL with constraints that enforce the model. You apply NOT NULL to critical attributes, define PRIMARY KEYs (including composite keys) and name constraints for maintainability, and add FOREIGN KEYs to uphold referential integrity. The chapter covers referential actions—default RESTRICT/NO ACTION, CASCADE for coordinated deletes/updates, and SET NULL for optional relationships—so child rows remain consistent when parent rows change. It also shows when to use UNIQUE (single or multi-column) to guard real-world uniqueness, DEFAULT to supply sensible values (such as timestamps, with attention to time zones across RDBMS), and CHECK to constrain domain values (like U.S. state names).

Throughout, the emphasis is on preventing insertion, update, and deletion anomalies while preserving truthful history and supporting future changes. Design choices such as keeping product price and name on purchase line items protect past receipts from later catalog changes; removing redundant foreign keys resolves hidden transitive dependencies; and judicious constraint selection makes misuse difficult and intent explicit. The result is a schema in BCNF that is resilient, clear to query, and enforceable by the database engine—laying a solid foundation before tackling performance trade-offs and further refinements.

Summary

  • Normalization is the process in which you break down the database into smaller and more manageable tables, with each table representing a single entity or concept.
  • Typically, when all your tables are in BCNF, you can consider your database to be fully normalized. A table in BCNF must have the following features:
    • It has a primary key.
    • It has no multi-valued columns.
    • All columns are dependent on a key but nothing else.
    • It contains no transitive dependency (All its non-key columns are directly dependent on a key).
    • In a sense, normalization is about checking each of your tables against the guidelines of BCNF, identifying violations, and revising your database design.
  • The constraints, such as NOT NULL, primary key, or foreign key constraints, play a critical role in implementing your database design.
  • If there is ever a small chance that you need to restructure your database design or migrate your database, you had better name all your constraints that can be named.
  • When NULL values are to be avoided in a column, think of the NOT NULL constraint.
  • The UNIQUE constraint is handy if you have some non-key columns that should all contain unique values.
  • The DEFAULT constraint is handy when you are dealing with timestamps or placeholder values.
  • The CHECK constraint can help to validate data format or ranges, or any other complex business logics when other constraints fail.

FAQ

What is normalization and why is it important before implementation?Normalization organizes data to minimize redundancy and dependency while maximizing integrity. Practically, you split data into well-scoped tables (each about one concept), which prevents update/insert/delete anomalies and makes constraints enforceable. Do this before translating your E-R design into SQL.
Which normal forms does the chapter focus on, and when is a design “normalized enough”?The chapter uses 1NF → 2NF → 3NF → BCNF as a practical ladder. In real projects, getting every table to BCNF is typically “normalized enough.” A BCNF table: has a primary key; has no multivalued columns; every non-key column depends on a key (practically, the primary key) and nothing else; has no transitive dependency.
What are multivalued columns and how do I eliminate them?A multivalued column holds multiple values in a single cell (for users, each value is an indivisible attribute). Symptoms: awkward queries, repeated data, inconsistencies. Fix by redesigning: store one value per row and move the repeated attribute to its own table, often with a junction table when needed.
What is a functional dependency and how does it guide BCNF?If knowing A always determines B, we say A → B (B is functionally dependent on A). In BCNF, every non-key column must depend on a key—and on nothing else. If any column depends on something other than the table’s key, split the table so each piece contains only attributes determined by its key. Example: if supplier_contact depends on supplier_name (not the product’s key), move supplier data to a supplier table and reference it via a foreign key.
What is a transitive dependency and how do I remove it?A transitive dependency is a chain where a non-key column depends on another non-key column that depends on a key (Key → X → Y). It causes redundancy and anomalies. Fix by splitting tables so non-key columns depend directly and only on the key. A red flag is a three-table “cycle” of relationships; remove redundant links so dependencies aren’t chained through another FK.
When should I use NOT NULL, and does PRIMARY KEY imply it?Use NOT NULL for essential attributes (e.g., username, password), timestamps, and foreign keys in mandatory relationships (minimum cardinality 1). PRIMARY KEY implies NOT NULL automatically, so you don’t need to add it again for the primary key column. A DEFAULT value does not replace NOT NULL—you can still insert NULL unless NOT NULL is present.
How do I choose and declare keys and uniqueness (PK, composite PK, UNIQUE)?- Choose a primary key from your candidate keys (minimal unique identifiers). A PRIMARY KEY ensures uniqueness and non-null. - For composite keys, declare a table-level PRIMARY KEY(col1, col2, …). - Use UNIQUE when non-key attributes (single or multi-column) must be unique in the real world (e.g., username, or name+manufacturer for products). Naming constraints (e.g., pk_table, unq_username) improves clarity and maintenance.
How do foreign keys enforce referential integrity, and when should FKs be NOT NULL?A foreign key in the child table references the parent’s primary key; the database blocks inserts in the child if the referenced parent row doesn’t exist. If the relationship is mandatory (min cardinality = 1), declare the FK column NOT NULL. If optional, the FK can be nullable.
What are referential actions (RESTRICT/NO ACTION, CASCADE, SET NULL), and how do I choose?- RESTRICT/NO ACTION (default): Prevents parent delete/update if matching child rows exist; safest for integrity. - CASCADE: Propagates parent delete/update to children; useful when child rows are meaningful only with the parent (e.g., payment methods tied to a user). - SET NULL: On parent delete, nulls out child FKs (keeps child rows); use only if the relationship can become unknown/optional and queries can handle NULLs.
How should I handle defaults and timestamps, especially with UTC across RDBMS?Use DEFAULT to auto-fill values, e.g., DEFAULT CURRENT_TIMESTAMP for created_at. Pair with NOT NULL to avoid NULL inserts. Time zone handling varies: for UTC storage, MySQL stores TIMESTAMP in UTC by default; PostgreSQL requires TIMESTAMP WITH TIME ZONE to preserve time zone info with CURRENT_TIMESTAMP. Always check your RDBMS specifics.

pro $24.99 per month

  • access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
  • choose one free eBook per month to keep
  • exclusive 50% discount on all purchases
  • renews monthly, pause or cancel renewal anytime

lite $19.99 per month

  • access to all Manning books, including MEAPs!

team

5, 10 or 20 seats+ for your team - learn more


choose your plan

team

monthly
annual
$49.99
$499.99
only $41.67 per month
  • five seats for your team
  • access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
  • choose another free product every time you renew
  • choose twelve free products per year
  • exclusive 50% discount on all purchases
  • renews monthly, pause or cancel renewal anytime
  • renews annually, pause or cancel renewal anytime
  • Grokking Relational Database Design ebook for free
choose your plan

team

monthly
annual
$49.99
$499.99
only $41.67 per month
  • five seats for your team
  • access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
  • choose another free product every time you renew
  • choose twelve free products per year
  • exclusive 50% discount on all purchases
  • renews monthly, pause or cancel renewal anytime
  • renews annually, pause or cancel renewal anytime
  • Grokking Relational Database Design ebook for free
choose your plan

team

monthly
annual
$49.99
$499.99
only $41.67 per month
  • five seats for your team
  • access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
  • choose another free product every time you renew
  • choose twelve free products per year
  • exclusive 50% discount on all purchases
  • renews monthly, pause or cancel renewal anytime
  • renews annually, pause or cancel renewal anytime
  • Grokking Relational Database Design ebook for free