Relational Databases: How Structured Data Works
Every application eventually needs to store data that outlives a single request. You can start with a JSON file, or keep everything in memory, but the moment two users edit the same record at the same time — or you need to answer the question "which customers spent more than $500 last month?" — you need something more serious.
That something is a relational database. It's been the backbone of serious software for over fifty years, and despite the rise of many alternatives, it remains the right default choice for the vast majority of applications.
This article is the foundation for everything in this module. There's no code here — just mental models, vocabulary, and a map of where we're going. By the end, you'll understand what a relational database is, why the relational model is so powerful, and when it's the right tool. Every subsequent article builds directly on what we establish here.
Quick Reference
What it is: A database that organizes data into tables with rows and columns, and enforces relationships between them.
Primary language: SQL (Structured Query Language) — the same language works across PostgreSQL, MySQL, SQLite, and others.
When to use it:
- Your data has clear structure and relationships
- You need strong consistency guarantees (financial data, orders, user accounts)
- You need to query across multiple related entities
- Your team needs to ask ad-hoc questions about the data
When to look elsewhere:
- Your data has no fixed schema (consider MongoDB)
- You need extremely high write throughput with simple access patterns (consider Redis or Cassandra)
- You're storing graph relationships with arbitrary depth (consider Neo4j)
See also:
- NoSQL Databases — the parallel track for document and key-value stores
What You Need to Know First
This is the first article in the module — no database knowledge required.
You should be comfortable with:
- Basic programming concepts: variables, functions, loops, conditionals
- TypeScript/JavaScript basics: syntax, async/await (for later articles in this module)
If you're not familiar with TypeScript, the Node.js Fundamentals module is the right starting point.
What We'll Cover in This Article
By the end of this guide, you'll understand:
- What a relational database is and how it organizes data
- What tables, rows, columns, and schemas are
- What relationships are and the three types you'll encounter
- What SQL is and how it relates to specific databases like PostgreSQL
- When to choose a relational database vs. a non-relational one
- The running scenario we'll use throughout this entire module
What We'll Explain Along the Way
We'll introduce these concepts with full explanations:
- Primary keys and foreign keys (the glue of relational data)
- ACID properties (briefly — why databases are trustworthy)
- Normalization (why you split data across multiple tables)
- PostgreSQL vs. other relational databases
The Problem That Relational Databases Solve
Let's start with a story. Imagine you're building an e-commerce platform. Your first version stores everything in a JavaScript object in memory:
const store = {
users: [
{ id: 1, name: "Aisha", email: "aisha@example.com" },
{ id: 2, name: "Marco", email: "marco@example.com" },
],
orders: [
{ id: 1, userId: 1, total: 89.99, status: "shipped" },
{ id: 2, userId: 1, total: 34.5, status: "pending" },
],
};
This works fine on day one. But then the questions start arriving:
- "Which users have placed more than two orders?"
- "What's the total revenue from orders placed this week?"
- "If I update a user's email, where else does it need to change?"
- "What happens if two people place an order at exactly the same moment?"
Each of these questions is hard — or impossible — to answer reliably with an in-memory object. You'd have to write custom code for every query, manually handle concurrent writes, and hope nothing crashes while you're mid-update.
A relational database solves all of this. It gives you:
- A structured way to store data with enforced rules
- A powerful query language (SQL) to answer complex questions
- Concurrency control so two operations don't corrupt each other
- Durability so data survives crashes and restarts
Let's look at how it does this.
The Core Idea: Tables, Rows, and Columns
A relational database organizes data into tables. If you've used a spreadsheet, you already have the right mental model — a table looks exactly like a spreadsheet with column headers and rows of data.
Here's a users table:
| id | name | created_at | |
|---|---|---|---|
| 1 | Aisha | aisha@example.com | 2024-01-15 09:23:00 |
| 2 | Marco | marco@example.com | 2024-01-16 14:05:00 |
| 3 | Priya | priya@example.com | 2024-01-17 11:30:00 |
Each column defines a piece of information stored for every record, and each column has a strict data type — id is an integer, name is text, created_at is a timestamp.
Each row represents a single record — one user, in this case.
This strictness is intentional. You can't accidentally store "hello" in the id column, or leave email empty if the column is marked required. The database enforces your rules for you.
What Makes It "Relational": Relationships Between Tables
The name "relational database" doesn't mean "a database with relationships between tables" — though that is a key feature. It actually refers to the mathematical concept of a relation (a set of tuples), which is what a table is formally.
That said, the ability to express and query relationships between tables is what makes these databases so powerful in practice.
There are three types of relationships you'll encounter:
One-to-Many
The most common relationship. One record in table A relates to many records in table B.
Example: One user can have many orders. Each order belongs to exactly one user.
users orders
───── ──────
id ◄────────── user_id
name id
email total
status
Many-to-Many
Many records in table A relate to many records in table B. This requires a third junction table to express.
Example: One order can contain many products. One product can appear in many orders.
orders order_items products
────── ─────────── ────────
id ◄────────── order_id id
product_id ──────► id
quantity name
price_at_purchase price
The order_items table is the junction. It has no meaning on its own — it only exists to connect orders and products.
One-to-One
One record in table A relates to exactly one record in table B. Less common, but used to split a table that's grown too wide, or to separate data with different access patterns (e.g., a users table and a user_preferences table).
The Glue: Primary Keys and Foreign Keys
Two concepts hold relational data together, and you'll encounter them in every article that follows.
Primary key — a column (or combination of columns) that uniquely identifies each row in a table. No two rows can have the same primary key. Typically, this is an auto-incrementing integer called id.
Foreign key — a column in one table that references the primary key of another table. This is what creates the relationship.
In our orders table, the user_id column is a foreign key — it references id in the users table. The database uses this to enforce referential integrity: you can't create an order with user_id = 99 if there's no user with id = 99. The database simply refuses.
This enforcement is one of the most valuable things a relational database gives you. Your data can't quietly become inconsistent — the database is a guardian.
The Language: SQL
SQL (Structured Query Language, pronounced "sequel" or "S-Q-L" — both are fine) is the language you use to interact with a relational database. It handles everything: reading data, writing data, creating tables, and managing permissions.
Here's what SQL looks like in practice:
-- Get all orders placed by Aisha
SELECT orders.id, orders.total, orders.status
FROM orders
JOIN users ON orders.user_id = users.id
WHERE users.name = 'Aisha'
ORDER BY orders.total DESC;
Notice something important: SQL is declarative. You describe what you want, not how to get it. You're not writing a loop to scan through every order — you're stating the result you need, and the database figures out the most efficient way to retrieve it.
This takes some adjustment if you're used to procedural code. You'll stop thinking "iterate over these records and filter them" and start thinking "give me the records that match these conditions." The database is better at the iteration than you are.
SQL is also standardized — the core language works across PostgreSQL, MySQL, SQLite, SQL Server, and others. The differences are mostly in advanced features and syntax around data types. Throughout this module, we'll use PostgreSQL, which is the most feature-complete open-source option and the one most production Node.js applications reach for.
The Trust Layer: ACID Properties
Relational databases are trusted with financial transactions, medical records, and user accounts because they provide strong guarantees about what happens to your data — especially when things go wrong.
These guarantees are summarized as ACID:
- Atomicity — An operation either fully succeeds or fully fails. If you're transferring money between accounts and the server crashes halfway through, neither account changes. The partial operation is rolled back automatically.
- Consistency — The database always moves from one valid state to another. Your constraints (required fields, foreign key rules) are always enforced.
- Isolation — Concurrent operations don't interfere with each other. Two users checking out at the same moment won't corrupt each other's orders.
- Durability — Once a write is confirmed, it survives crashes. Confirmed data is written to disk, not just held in memory.
You don't need to deeply understand ACID right now — we'll revisit it concretely in a future Transactions in Node.js article. For now, know that this is why relational databases are the default choice for any application where data correctness matters.
The Running Scenario: ShopFlow
Throughout this entire module, we'll build the data layer for a fictional e-commerce platform called ShopFlow. Every concept — queries, schema design, indexes, transactions — will be applied to the same four tables:
┌─────────────┐ ┌─────────────────┐
│ users │ │ products │
│─────────────│ │─────────────────│
│ id │ │ id │
│ name │ │ name │
│ email │ │ description │
│ created_at │ │ price │
└──────┬──────┘ │ stock_quantity │
│ │ category │
│ 1:many └────────┬────────┘
│ │
▼ │ many:many
┌─────────────┐ │
│ orders │ ┌────────▼────────┐
│─────────────│ │ order_items │
│ id │ 1:many │─────────────────│
│ user_id ├────────►│ id │
│ status │ │ order_id │
│ created_at │ │ product_id │
└─────────────┘ │ quantity │
│ price_at_purchase│
└─────────────────┘
Diagram: The four ShopFlow tables and their relationships. users has many orders. Each order has many order_items. Each order_item connects to one product. The price_at_purchase column on order_items stores the price at the time of purchase — because products.price can change later.
This schema is deliberately realistic. It has a many-to-many relationship (orders ↔ products via order_items), a derived field with a business rule behind it (price_at_purchase), and enough data to make queries interesting.
By the end of this module, you'll have built every part of this schema from scratch, written efficient queries against it, added indexes, handled transactions, and connected it to a Node.js application three different ways.
Relational vs. Non-Relational: Making the Choice
The existence of NoSQL databases doesn't mean relational databases are outdated — it means different problems have different right answers. Here's an honest breakdown:
Choose a relational database when:
- Your data has consistent structure (every user has the same fields)
- You need to query across relationships frequently ("give me all orders with their users and products")
- Data integrity is non-negotiable (financial records, inventory, user accounts)
- Your team needs to run ad-hoc analytical queries
- You're not sure yet — relational is the safer default
Consider a non-relational database when:
- Your data has genuinely variable structure (a product catalog where electronics have different attributes than clothing)
- You need a simple, extremely fast key-value store for caching or sessions (Redis)
- You're storing content that's always read and written as a single unit, never queried across relationships
The most common mistake is choosing NoSQL for the wrong reason: "we might need to scale later." Most applications never reach the scale where a well-tuned PostgreSQL instance becomes the bottleneck. Premature NoSQL adoption often trades away data integrity and query power for a problem you don't actually have yet.
For a deeper look at the non-relational side, the NoSQL Databases module covers MongoDB and Redis with the same depth as this one.
Why PostgreSQL?
This module uses PostgreSQL (often called "Postgres") as the database engine. Here's why:
- Feature complete — supports advanced features like JSON columns, full-text search, and window functions that MySQL lacks or added later
- Standard compliant — follows the SQL standard closely, so your SQL knowledge transfers well to other databases
- Production proven — powers applications at Instagram, Shopify, GitHub, and countless others
- Free and open source — no licensing costs, no vendor lock-in
- Excellent Node.js support — the
pgdriver and all major ORMs (Drizzle, Prisma, Knex) have first-class PostgreSQL support
The SQL you'll write here works in MySQL and SQLite with minor modifications. Where PostgreSQL-specific syntax is used, we'll call it out.
What's Next?
You now understand what a relational database is, how it organizes data into tables and relationships, and why it's the right default choice for most applications. You've also met ShopFlow — the schema we'll be building together throughout this module.
The natural next step is SQL Basics: Reading Data with SELECT — this is where you write your first SQL queries and start pulling data out of the ShopFlow tables. Everything you just learned about tables, rows, and columns will immediately become concrete the moment you run your first SELECT.
Summary: Key Takeaways
- A relational database stores data in tables — rows and columns with strict types
- Relationships between tables are expressed through primary keys and foreign keys
- The three relationship types are one-to-many, many-to-many, and one-to-one
- SQL is the declarative language used across all relational databases — you describe what you want, not how to get it
- ACID properties are why relational databases are trusted for data that must be correct
- PostgreSQL is the recommended engine — open source, feature complete, and widely used in production
- Choose relational when data has consistent structure and integrity matters; consider NoSQL only when there's a specific reason
- Throughout this module, we'll build the ShopFlow e-commerce schema:
users,products,orders, andorder_items
Version Information
Tested with:
- PostgreSQL: 15.x, 16.x
- Node.js: v18.x, v20.x, v22.x
Last verified: May 2025