Databases

Understanding database types, design principles and data management strategies

Last updated: 8/15/2025

Master the fundamentals of data storage, retrieval and management - the backbone of every modern application.

What are Databases?

The Core Concept

Organised storage for data

A database is like a digital filing cabinet with superpowers. It not only stores information in an organised way but can also quickly search, sort and connect related information across millions of records.

Real-world analogy: Think of a database like a library. Instead of having books scattered everywhere, they're organised by category, author and title. You can quickly find what you need, see what's related and manage the entire collection efficiently.

Types of Databases

Relational Databases (SQL)

Structured data with relationships

Relational databases store data in tables with rows and columns, where relationships between data are defined through foreign keys.

Popular options:

PostgreSQL (open source, feature-rich)
MySQL (fast, widely used)
SQLite (lightweight, embedded)
Microsoft SQL Server (enterprise)
Oracle Database (enterprise)

Example table structure:

-- Users table
CREATE TABLE users (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(255) UNIQUE,
  created_at TIMESTAMP
);

-- Orders table
CREATE TABLE orders (
  id INT PRIMARY KEY,
  user_id INT,
  total DECIMAL(10,2),
  status VARCHAR(50),
  FOREIGN KEY (user_id) REFERENCES users(id)
);

NoSQL Databases

Flexible, non-relational data storage

NoSQL databases don't require a fixed schema and can handle unstructured or semi-structured data more flexibly.

Types:

Document stores: MongoDB, CouchDB
Key-value stores: Redis, DynamoDB
Column-family: Cassandra, HBase
Graph databases: Neo4j, ArangoDB

Example MongoDB document:

{
  "_id": ObjectId("..."),
  "name": "John Doe",
  "email": "john@example.com",
  "profile": {
    "age": 30,
    "location": "Sydney",
    "interests": ["coding", "music", "travel"]
  },
  "orders": [
    {
      "product": "Laptop",
      "price": 1299.99,
      "date": "2024-01-15"
    }
  ]
}

Database Design Principles

Normalisation

Organising data to reduce redundancy

Normalisation is the process of organising data to minimise duplication and dependency.

First Normal Form (1NF):

Each column contains atomic values
No repeating groups

Second Normal Form (2NF):

In 1NF
All non-key attributes depend on the entire primary key

Third Normal Form (3NF):

In 2NF
No transitive dependencies

Example of normalisation:

-- Before normalisation (redundant data)
CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_name VARCHAR(100),
  customer_email VARCHAR(255),
  customer_address TEXT,
  product_name VARCHAR(100),
  product_price DECIMAL(10,2)
);

-- After normalisation
CREATE TABLE customers (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(255),
  address TEXT
);

CREATE TABLE products (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  price DECIMAL(10,2)
);

CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT,
  product_id INT,
  quantity INT,
  FOREIGN KEY (customer_id) REFERENCES customers(id),
  FOREIGN KEY (product_id) REFERENCES products(id)
);

Denormalisation

Intentionally adding redundancy for performance

Sometimes you need to denormalise for better query performance.

When to denormalise:

Read-heavy workloads
Complex analytical queries
Performance is critical

Example:

-- Denormalised for performance
CREATE TABLE order_summary (
  id INT PRIMARY KEY,
  customer_name VARCHAR(100),
  total_amount DECIMAL(10,2),
  order_count INT,
  last_order_date DATE
);

SQL Fundamentals

Basic Queries

SELECT statement:

-- Basic select
SELECT name, email FROM users;

-- With conditions
SELECT * FROM users WHERE age > 18;

-- With sorting
SELECT name, created_at FROM users ORDER BY created_at DESC;

-- With limits
SELECT * FROM users LIMIT 10;

INSERT statement:

-- Insert single row
INSERT INTO users (name, email, age) VALUES ('John Doe', 'john@example.com', 30);

-- Insert multiple rows
INSERT INTO users (name, email, age) VALUES 
  ('Jane Smith', 'jane@example.com', 25),
  ('Bob Johnson', 'bob@example.com', 35);

UPDATE statement:

-- Update single row
UPDATE users SET age = 31 WHERE id = 1;

-- Update multiple rows
UPDATE users SET status = 'active' WHERE last_login > '2024-01-01';

DELETE statement:

-- Delete specific rows
DELETE FROM users WHERE status = 'inactive';

-- Delete all rows (be careful!)
DELETE FROM users;

Joins

Combining data from multiple tables

INNER JOIN:

SELECT u.name, o.total
FROM users u
INNER JOIN orders o ON u.id = o.user_id;

LEFT JOIN:

SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
-- Shows all users, even those without orders

RIGHT JOIN:

SELECT u.name, o.total
FROM users u
RIGHT JOIN orders o ON u.id = o.user_id;
-- Shows all orders, even if user doesn't exist

Aggregation Functions

Calculating values across multiple rows

-- Count rows
SELECT COUNT(*) FROM users;

-- Sum values
SELECT SUM(total) FROM orders;

-- Average values
SELECT AVG(price) FROM products;

-- Group by
SELECT category, COUNT(*) as count
FROM products
GROUP BY category;

Database Indexing

What are Indexes?

Speed up data retrieval

Indexes are like the index at the back of a book - they help you find information quickly without reading every page.

Types of indexes:

B-tree: Most common, good for equality and range queries
Hash: Fast for equality queries only
Full-text: For searching text content
Composite: Multiple columns

Creating indexes:

-- Single column index
CREATE INDEX idx_users_email ON users(email);

-- Composite index
CREATE INDEX idx_users_name_email ON users(name, email);

-- Unique index
CREATE UNIQUE INDEX idx_users_email ON users(email);

When to use indexes:

Columns used in WHERE clauses
Columns used in JOIN conditions
Columns used in ORDER BY
Columns used in GROUP BY

When not to use indexes:

Small tables
Tables with frequent updates
Columns with low selectivity

Database Transactions

ACID Properties

Ensuring data integrity

Atomicity: All operations in a transaction succeed or fail together Consistency: Database remains in a valid state Isolation: Concurrent transactions don't interfere with each other Durability: Committed transactions are permanent

Transaction example:

BEGIN TRANSACTION;

-- Transfer money between accounts
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- Check if both updates succeeded
IF @@ERROR = 0
  COMMIT TRANSACTION;
ELSE
  ROLLBACK TRANSACTION;
END IF;

Transaction Isolation Levels

Read Uncommitted: Lowest isolation, can see uncommitted changes Read Committed: Can't see uncommitted changes (default in most databases) Repeatable Read: Consistent reads within a transaction Serializable: Highest isolation, transactions appear to run sequentially

Database Security

Authentication and Authorization

Controlling access to data

Authentication: Verifying who you are Authorization: Determining what you can do

User management:

-- Create user
CREATE USER 'app_user'@'localhost' IDENTIFIED BY 'secure_password';

-- Grant permissions
GRANT SELECT, INSERT, UPDATE ON database_name.* TO 'app_user'@'localhost';

-- Revoke permissions
REVOKE DELETE ON database_name.* FROM 'app_user'@'localhost';

Row-Level Security (RLS)

Fine-grained access control

-- Enable RLS
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;

-- Create policy
CREATE POLICY user_orders ON orders
  FOR ALL
  USING (user_id = current_user_id());

Data Encryption

Protecting sensitive information

At rest: Encrypting data stored on disk In transit: Encrypting data moving between client and database In use: Encrypting data while being processed

Performance Optimisation

Query Optimisation

Making queries faster

Use EXPLAIN to analyse queries:

EXPLAIN SELECT * FROM users WHERE email = 'john@example.com';

Common optimisations:

Use appropriate indexes
Avoid SELECT *
Use LIMIT for large result sets
Optimise JOIN order
Use appropriate data types

Connection Pooling

Managing database connections efficiently

Connection pooling keeps a pool of database connections ready for use, reducing the overhead of creating new connections.

Benefits:

Faster response times
Better resource utilisation
Improved scalability

Database Migration and Versioning

Schema Changes

Evolving your database structure

Safe migration practices:

Make changes backward compatible when possible
Use transactions for schema changes
Test migrations on staging data
Have rollback plans

Example migration:

-- Add new column with default value
ALTER TABLE users ADD COLUMN phone VARCHAR(20) DEFAULT NULL;

-- Update existing records
UPDATE users SET phone = 'N/A' WHERE phone IS NULL;

-- Make column required
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;

Database Versioning Tools

Managing schema changes

Popular tools:

Flyway: Java-based, simple migrations
Liquibase: XML/YAML-based, powerful features
Alembic: Python-based, works with SQLAlchemy
Prisma Migrate: TypeScript-based, works with Prisma

Modern Database Trends

Cloud Databases

Managed database services

Benefits:

Automatic scaling
Built-in backups
High availability
Managed security

Popular cloud databases:

AWS: RDS, Aurora, DynamoDB
Azure: SQL Database, Cosmos DB
Google Cloud: Cloud SQL, Firestore
Supabase: PostgreSQL with real-time features

Database as a Service (DBaaS)

Fully managed database solutions

Examples:

PlanetScale: MySQL-compatible, branching
Neon: Serverless PostgreSQL
Fauna: Globally distributed, ACID transactions

Real-time Databases

Live data synchronisation

Use cases:

Live dashboards
Collaborative applications
Gaming
Chat applications

Technologies:

Firebase: Real-time document database
Supabase: Real-time PostgreSQL
Pusher: Real-time messaging

Getting Started

Choose the Right Database

Consider these factors:

Data structure and relationships
Query patterns
Scale requirements
Team expertise
Budget constraints

Quick decision guide:

Relational: Complex relationships, ACID requirements
Document: Flexible schema, JSON-like data
Key-value: Simple lookups, caching
Graph: Complex relationships, network analysis

Local Development Setup

PostgreSQL:

# macOS with Homebrew
brew install postgresql
brew services start postgresql

# Create database
createdb myapp_development

MongoDB:

# macOS with Homebrew
brew install mongodb-community
brew services start mongodb-community

# Connect to MongoDB
mongosh

SQLite:

# Built into most systems
sqlite3 myapp.db

Your First Database

Step 1: Design your schema

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  username VARCHAR(50) UNIQUE NOT NULL,
  email VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Step 2: Add some data

INSERT INTO users (username, email) VALUES 
  ('john_doe', 'john@example.com'),
  ('jane_smith', 'jane@example.com');

Step 3: Query your data

SELECT * FROM users WHERE username = 'john_doe';

Summary

Databases are the foundation of data-driven applications. Understanding database types, design principles and best practices is essential for building scalable and maintainable systems.

Key takeaways:

Choose the right database type for your needs
Design schemas with normalisation in mind
Use indexes to improve performance
Implement proper security measures
Plan for schema evolution
Consider cloud and managed solutions

Remember: Good database design is an investment that pays off in performance, maintainability and scalability!

title: Databases description: Understanding database types, design principles and data management strategies category: learn order: 2 lastUpdated: 15 Aug 2025

Master the fundamentals of data storage, retrieval and management - the backbone of every modern application.

What are Databases?

The Core Concept

Organised storage for data

Types of Databases

Relational Databases (SQL)

Structured data with relationships

Relational databases store data in tables with rows and columns, where relationships between data are defined through foreign keys.

Popular options:

PostgreSQL (open source, feature-rich)
MySQL (fast, widely used)
SQLite (lightweight, embedded)
Microsoft SQL Server (enterprise)
Oracle Database (enterprise)

Example table structure:

-- Users table
CREATE TABLE users (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(255) UNIQUE,
  created_at TIMESTAMP
);

-- Orders table
CREATE TABLE orders (
  id INT PRIMARY KEY,
  user_id INT,
  total DECIMAL(10,2),
  status VARCHAR(50),
  FOREIGN KEY (user_id) REFERENCES users(id)
);

NoSQL Databases

Flexible, non-relational data storage

NoSQL databases don't require a fixed schema and can handle unstructured or semi-structured data more flexibly.

Types:

Document stores: MongoDB, CouchDB
Key-value stores: Redis, DynamoDB
Column-family: Cassandra, HBase
Graph databases: Neo4j, ArangoDB

Example MongoDB document:

{
  "_id": ObjectId("..."),
  "name": "John Doe",
  "email": "john@example.com",
  "profile": {
    "age": 30,
    "location": "Sydney",
    "interests": ["coding", "music", "travel"]
  },
  "orders": [
    {
      "product": "Laptop",
      "price": 1299.99,
      "date": "2024-01-15"
    }
  ]
}

Database Design Principles

Normalisation

Organising data to reduce redundancy

Normalisation is the process of organising data to minimise duplication and dependency.

First Normal Form (1NF):

Each column contains atomic values
No repeating groups

Second Normal Form (2NF):

In 1NF
All non-key attributes depend on the entire primary key

Third Normal Form (3NF):

In 2NF
No transitive dependencies

Example of normalisation:

-- Before normalisation (redundant data)
CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_name VARCHAR(100),
  customer_email VARCHAR(255),
  customer_address TEXT,
  product_name VARCHAR(100),
  product_price DECIMAL(10,2)
);

-- After normalisation
CREATE TABLE customers (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(255),
  address TEXT
);

CREATE TABLE products (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  price DECIMAL(10,2)
);

CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT,
  product_id INT,
  quantity INT,
  FOREIGN KEY (customer_id) REFERENCES customers(id),
  FOREIGN KEY (product_id) REFERENCES products(id)
);

Denormalisation

Intentionally adding redundancy for performance

Sometimes you need to denormalise for better query performance.

When to denormalise:

Read-heavy workloads
Complex analytical queries
Performance is critical

Example:

-- Denormalised for performance
CREATE TABLE order_summary (
  id INT PRIMARY KEY,
  customer_name VARCHAR(100),
  total_amount DECIMAL(10,2),
  order_count INT,
  last_order_date DATE
);

SQL Fundamentals

Basic Queries

SELECT statement:

-- Basic select
SELECT name, email FROM users;

-- With conditions
SELECT * FROM users WHERE age > 18;

-- With sorting
SELECT name, created_at FROM users ORDER BY created_at DESC;

-- With limits
SELECT * FROM users LIMIT 10;

INSERT statement:

-- Insert single row
INSERT INTO users (name, email, age) VALUES ('John Doe', 'john@example.com', 30);

-- Insert multiple rows
INSERT INTO users (name, email, age) VALUES 
  ('Jane Smith', 'jane@example.com', 25),
  ('Bob Johnson', 'bob@example.com', 35);

UPDATE statement:

-- Update single row
UPDATE users SET age = 31 WHERE id = 1;

-- Update multiple rows
UPDATE users SET status = 'active' WHERE last_login > '2024-01-01';

DELETE statement:

-- Delete specific rows
DELETE FROM users WHERE status = 'inactive';

-- Delete all rows (be careful!)
DELETE FROM users;

Joins

Combining data from multiple tables

INNER JOIN:

SELECT u.name, o.total
FROM users u
INNER JOIN orders o ON u.id = o.user_id;

LEFT JOIN:

SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
-- Shows all users, even those without orders

RIGHT JOIN:

SELECT u.name, o.total
FROM users u
RIGHT JOIN orders o ON u.id = o.user_id;
-- Shows all orders, even if user doesn't exist

Aggregation Functions

Calculating values across multiple rows

-- Count rows
SELECT COUNT(*) FROM users;

-- Sum values
SELECT SUM(total) FROM orders;

-- Average values
SELECT AVG(price) FROM products;

-- Group by
SELECT category, COUNT(*) as count
FROM products
GROUP BY category;

Database Indexing

What are Indexes?

Speed up data retrieval

Indexes are like the index at the back of a book - they help you find information quickly without reading every page.

Types of indexes:

B-tree: Most common, good for equality and range queries
Hash: Fast for equality queries only
Full-text: For searching text content
Composite: Multiple columns

Creating indexes:

-- Single column index
CREATE INDEX idx_users_email ON users(email);

-- Composite index
CREATE INDEX idx_users_name_email ON users(name, email);

-- Unique index
CREATE UNIQUE INDEX idx_users_email ON users(email);

When to use indexes:

Columns used in WHERE clauses
Columns used in JOIN conditions
Columns used in ORDER BY
Columns used in GROUP BY

When not to use indexes:

Small tables
Tables with frequent updates
Columns with low selectivity

Database Transactions

ACID Properties

Ensuring data integrity

Transaction example:

BEGIN TRANSACTION;

-- Transfer money between accounts
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- Check if both updates succeeded
IF @@ERROR = 0
  COMMIT TRANSACTION;
ELSE
  ROLLBACK TRANSACTION;
END IF;

Transaction Isolation Levels

Database Security

Authentication and Authorization

Controlling access to data

Authentication: Verifying who you are Authorization: Determining what you can do

User management:

-- Create user
CREATE USER 'app_user'@'localhost' IDENTIFIED BY 'secure_password';

-- Grant permissions
GRANT SELECT, INSERT, UPDATE ON database_name.* TO 'app_user'@'localhost';

-- Revoke permissions
REVOKE DELETE ON database_name.* FROM 'app_user'@'localhost';

Row-Level Security (RLS)

Fine-grained access control

-- Enable RLS
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;

-- Create policy
CREATE POLICY user_orders ON orders
  FOR ALL
  USING (user_id = current_user_id());

Data Encryption

Protecting sensitive information

At rest: Encrypting data stored on disk In transit: Encrypting data moving between client and database In use: Encrypting data while being processed

Performance Optimisation

Query Optimisation

Making queries faster

Use EXPLAIN to analyse queries:

EXPLAIN SELECT * FROM users WHERE email = 'john@example.com';

Common optimisations:

Use appropriate indexes
Avoid SELECT *
Use LIMIT for large result sets
Optimise JOIN order
Use appropriate data types

Connection Pooling

Managing database connections efficiently

Connection pooling keeps a pool of database connections ready for use, reducing the overhead of creating new connections.

Benefits:

Faster response times
Better resource utilisation
Improved scalability

Database Migration and Versioning

Schema Changes

Evolving your database structure

Safe migration practices:

Make changes backward compatible when possible
Use transactions for schema changes
Test migrations on staging data
Have rollback plans

Example migration:

-- Add new column with default value
ALTER TABLE users ADD COLUMN phone VARCHAR(20) DEFAULT NULL;

-- Update existing records
UPDATE users SET phone = 'N/A' WHERE phone IS NULL;

-- Make column required
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;

Database Versioning Tools

Managing schema changes

Popular tools:

Flyway: Java-based, simple migrations
Liquibase: XML/YAML-based, powerful features
Alembic: Python-based, works with SQLAlchemy
Prisma Migrate: TypeScript-based, works with Prisma

Modern Database Trends

Cloud Databases

Managed database services

Benefits:

Automatic scaling
Built-in backups
High availability
Managed security

Popular cloud databases:

AWS: RDS, Aurora, DynamoDB
Azure: SQL Database, Cosmos DB
Google Cloud: Cloud SQL, Firestore
Supabase: PostgreSQL with real-time features

Database as a Service (DBaaS)

Fully managed database solutions

Examples:

PlanetScale: MySQL-compatible, branching
Neon: Serverless PostgreSQL
Fauna: Globally distributed, ACID transactions

Real-time Databases

Live data synchronisation

Use cases:

Live dashboards
Collaborative applications
Gaming
Chat applications

Technologies:

Firebase: Real-time document database
Supabase: Real-time PostgreSQL
Pusher: Real-time messaging

Getting Started

Choose the Right Database

Consider these factors:

Data structure and relationships
Query patterns
Scale requirements
Team expertise
Budget constraints

Quick decision guide:

Relational: Complex relationships, ACID requirements
Document: Flexible schema, JSON-like data
Key-value: Simple lookups, caching
Graph: Complex relationships, network analysis

Local Development Setup

PostgreSQL:

# macOS with Homebrew
brew install postgresql
brew services start postgresql

# Create database
createdb myapp_development

MongoDB:

# macOS with Homebrew
brew install mongodb-community
brew services start mongodb-community

# Connect to MongoDB
mongosh

SQLite:

# Built into most systems
sqlite3 myapp.db

Your First Database

Step 1: Design your schema

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  username VARCHAR(50) UNIQUE NOT NULL,
  email VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Step 2: Add some data

INSERT INTO users (username, email) VALUES 
  ('john_doe', 'john@example.com'),
  ('jane_smith', 'jane@example.com');

Step 3: Query your data

SELECT * FROM users WHERE username = 'john_doe';

Summary

Databases are the foundation of data-driven applications. Understanding database types, design principles and best practices is essential for building scalable and maintainable systems.

Key takeaways:

Choose the right database type for your needs
Design schemas with normalisation in mind
Use indexes to improve performance
Implement proper security measures
Plan for schema evolution
Consider cloud and managed solutions

Remember: Good database design is an investment that pays off in performance, maintainability and scalability!