Databases
Understanding database types, design principles and data management strategies
Last updated: 8/15/2025
Master the fundamentals of data storage, retrieval and management - the backbone of every modern application.
What are Databases?
The Core Concept
Organised storage for data
A database is like a digital filing cabinet with superpowers. It not only stores information in an organised way but can also quickly search, sort and connect related information across millions of records.
Real-world analogy: Think of a database like a library. Instead of having books scattered everywhere, they're organised by category, author and title. You can quickly find what you need, see what's related and manage the entire collection efficiently.
Types of Databases
Relational Databases (SQL)
Structured data with relationships
Relational databases store data in tables with rows and columns, where relationships between data are defined through foreign keys.
Popular options:
- PostgreSQL (open source, feature-rich)
- MySQL (fast, widely used)
- SQLite (lightweight, embedded)
- Microsoft SQL Server (enterprise)
- Oracle Database (enterprise)
Example table structure:
-- Users table
CREATE TABLE users (
id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(255) UNIQUE,
created_at TIMESTAMP
);
-- Orders table
CREATE TABLE orders (
id INT PRIMARY KEY,
user_id INT,
total DECIMAL(10,2),
status VARCHAR(50),
FOREIGN KEY (user_id) REFERENCES users(id)
);
NoSQL Databases
Flexible, non-relational data storage
NoSQL databases don't require a fixed schema and can handle unstructured or semi-structured data more flexibly.
Types:
- Document stores: MongoDB, CouchDB
- Key-value stores: Redis, DynamoDB
- Column-family: Cassandra, HBase
- Graph databases: Neo4j, ArangoDB
Example MongoDB document:
{
"_id": ObjectId("..."),
"name": "John Doe",
"email": "john@example.com",
"profile": {
"age": 30,
"location": "Sydney",
"interests": ["coding", "music", "travel"]
},
"orders": [
{
"product": "Laptop",
"price": 1299.99,
"date": "2024-01-15"
}
]
}
Database Design Principles
Normalisation
Organising data to reduce redundancy
Normalisation is the process of organising data to minimise duplication and dependency.
First Normal Form (1NF):
- Each column contains atomic values
- No repeating groups
Second Normal Form (2NF):
- In 1NF
- All non-key attributes depend on the entire primary key
Third Normal Form (3NF):
- In 2NF
- No transitive dependencies
Example of normalisation:
-- Before normalisation (redundant data)
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_name VARCHAR(100),
customer_email VARCHAR(255),
customer_address TEXT,
product_name VARCHAR(100),
product_price DECIMAL(10,2)
);
-- After normalisation
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(255),
address TEXT
);
CREATE TABLE products (
id INT PRIMARY KEY,
name VARCHAR(100),
price DECIMAL(10,2)
);
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT,
product_id INT,
quantity INT,
FOREIGN KEY (customer_id) REFERENCES customers(id),
FOREIGN KEY (product_id) REFERENCES products(id)
);
Denormalisation
Intentionally adding redundancy for performance
Sometimes you need to denormalise for better query performance.
When to denormalise:
- Read-heavy workloads
- Complex analytical queries
- Performance is critical
Example:
-- Denormalised for performance
CREATE TABLE order_summary (
id INT PRIMARY KEY,
customer_name VARCHAR(100),
total_amount DECIMAL(10,2),
order_count INT,
last_order_date DATE
);
SQL Fundamentals
Basic Queries
SELECT statement:
-- Basic select
SELECT name, email FROM users;
-- With conditions
SELECT * FROM users WHERE age > 18;
-- With sorting
SELECT name, created_at FROM users ORDER BY created_at DESC;
-- With limits
SELECT * FROM users LIMIT 10;
INSERT statement:
-- Insert single row
INSERT INTO users (name, email, age) VALUES ('John Doe', 'john@example.com', 30);
-- Insert multiple rows
INSERT INTO users (name, email, age) VALUES
('Jane Smith', 'jane@example.com', 25),
('Bob Johnson', 'bob@example.com', 35);
UPDATE statement:
-- Update single row
UPDATE users SET age = 31 WHERE id = 1;
-- Update multiple rows
UPDATE users SET status = 'active' WHERE last_login > '2024-01-01';
DELETE statement:
-- Delete specific rows
DELETE FROM users WHERE status = 'inactive';
-- Delete all rows (be careful!)
DELETE FROM users;
Joins
Combining data from multiple tables
INNER JOIN:
SELECT u.name, o.total
FROM users u
INNER JOIN orders o ON u.id = o.user_id;
LEFT JOIN:
SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
-- Shows all users, even those without orders
RIGHT JOIN:
SELECT u.name, o.total
FROM users u
RIGHT JOIN orders o ON u.id = o.user_id;
-- Shows all orders, even if user doesn't exist
Aggregation Functions
Calculating values across multiple rows
-- Count rows
SELECT COUNT(*) FROM users;
-- Sum values
SELECT SUM(total) FROM orders;
-- Average values
SELECT AVG(price) FROM products;
-- Group by
SELECT category, COUNT(*) as count
FROM products
GROUP BY category;
Database Indexing
What are Indexes?
Speed up data retrieval
Indexes are like the index at the back of a book - they help you find information quickly without reading every page.
Types of indexes:
- B-tree: Most common, good for equality and range queries
- Hash: Fast for equality queries only
- Full-text: For searching text content
- Composite: Multiple columns
Creating indexes:
-- Single column index
CREATE INDEX idx_users_email ON users(email);
-- Composite index
CREATE INDEX idx_users_name_email ON users(name, email);
-- Unique index
CREATE UNIQUE INDEX idx_users_email ON users(email);
When to use indexes:
- Columns used in WHERE clauses
- Columns used in JOIN conditions
- Columns used in ORDER BY
- Columns used in GROUP BY
When not to use indexes:
- Small tables
- Tables with frequent updates
- Columns with low selectivity
Database Transactions
ACID Properties
Ensuring data integrity
Atomicity: All operations in a transaction succeed or fail together Consistency: Database remains in a valid state Isolation: Concurrent transactions don't interfere with each other Durability: Committed transactions are permanent
Transaction example:
BEGIN TRANSACTION;
-- Transfer money between accounts
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- Check if both updates succeeded
IF @@ERROR = 0
COMMIT TRANSACTION;
ELSE
ROLLBACK TRANSACTION;
END IF;
Transaction Isolation Levels
Read Uncommitted: Lowest isolation, can see uncommitted changes Read Committed: Can't see uncommitted changes (default in most databases) Repeatable Read: Consistent reads within a transaction Serializable: Highest isolation, transactions appear to run sequentially
Database Security
Authentication and Authorization
Controlling access to data
Authentication: Verifying who you are Authorization: Determining what you can do
User management:
-- Create user
CREATE USER 'app_user'@'localhost' IDENTIFIED BY 'secure_password';
-- Grant permissions
GRANT SELECT, INSERT, UPDATE ON database_name.* TO 'app_user'@'localhost';
-- Revoke permissions
REVOKE DELETE ON database_name.* FROM 'app_user'@'localhost';
Row-Level Security (RLS)
Fine-grained access control
-- Enable RLS
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
-- Create policy
CREATE POLICY user_orders ON orders
FOR ALL
USING (user_id = current_user_id());
Data Encryption
Protecting sensitive information
At rest: Encrypting data stored on disk In transit: Encrypting data moving between client and database In use: Encrypting data while being processed
Performance Optimisation
Query Optimisation
Making queries faster
Use EXPLAIN to analyse queries:
EXPLAIN SELECT * FROM users WHERE email = 'john@example.com';
Common optimisations:
- Use appropriate indexes
- Avoid SELECT *
- Use LIMIT for large result sets
- Optimise JOIN order
- Use appropriate data types
Connection Pooling
Managing database connections efficiently
Connection pooling keeps a pool of database connections ready for use, reducing the overhead of creating new connections.
Benefits:
- Faster response times
- Better resource utilisation
- Improved scalability
Database Migration and Versioning
Schema Changes
Evolving your database structure
Safe migration practices:
- Make changes backward compatible when possible
- Use transactions for schema changes
- Test migrations on staging data
- Have rollback plans
Example migration:
-- Add new column with default value
ALTER TABLE users ADD COLUMN phone VARCHAR(20) DEFAULT NULL;
-- Update existing records
UPDATE users SET phone = 'N/A' WHERE phone IS NULL;
-- Make column required
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;
Database Versioning Tools
Managing schema changes
Popular tools:
- Flyway: Java-based, simple migrations
- Liquibase: XML/YAML-based, powerful features
- Alembic: Python-based, works with SQLAlchemy
- Prisma Migrate: TypeScript-based, works with Prisma
Modern Database Trends
Cloud Databases
Managed database services
Benefits:
- Automatic scaling
- Built-in backups
- High availability
- Managed security
Popular cloud databases:
- AWS: RDS, Aurora, DynamoDB
- Azure: SQL Database, Cosmos DB
- Google Cloud: Cloud SQL, Firestore
- Supabase: PostgreSQL with real-time features
Database as a Service (DBaaS)
Fully managed database solutions
Examples:
- PlanetScale: MySQL-compatible, branching
- Neon: Serverless PostgreSQL
- Fauna: Globally distributed, ACID transactions
Real-time Databases
Live data synchronisation
Use cases:
- Live dashboards
- Collaborative applications
- Gaming
- Chat applications
Technologies:
- Firebase: Real-time document database
- Supabase: Real-time PostgreSQL
- Pusher: Real-time messaging
Getting Started
Choose the Right Database
Consider these factors:
- Data structure and relationships
- Query patterns
- Scale requirements
- Team expertise
- Budget constraints
Quick decision guide:
- Relational: Complex relationships, ACID requirements
- Document: Flexible schema, JSON-like data
- Key-value: Simple lookups, caching
- Graph: Complex relationships, network analysis
Local Development Setup
PostgreSQL:
# macOS with Homebrew
brew install postgresql
brew services start postgresql
# Create database
createdb myapp_development
MongoDB:
# macOS with Homebrew
brew install mongodb-community
brew services start mongodb-community
# Connect to MongoDB
mongosh
SQLite:
# Built into most systems
sqlite3 myapp.db
Your First Database
Step 1: Design your schema
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Step 2: Add some data
INSERT INTO users (username, email) VALUES
('john_doe', 'john@example.com'),
('jane_smith', 'jane@example.com');
Step 3: Query your data
SELECT * FROM users WHERE username = 'john_doe';
Summary
Databases are the foundation of data-driven applications. Understanding database types, design principles and best practices is essential for building scalable and maintainable systems.
Key takeaways:
- Choose the right database type for your needs
- Design schemas with normalisation in mind
- Use indexes to improve performance
- Implement proper security measures
- Plan for schema evolution
- Consider cloud and managed solutions
Remember: Good database design is an investment that pays off in performance, maintainability and scalability!
title: Databases description: Understanding database types, design principles and data management strategies category: learn order: 2 lastUpdated: 15 Aug 2025
Master the fundamentals of data storage, retrieval and management - the backbone of every modern application.
What are Databases?
The Core Concept
Organised storage for data
A database is like a digital filing cabinet with superpowers. It not only stores information in an organised way but can also quickly search, sort and connect related information across millions of records.
Real-world analogy: Think of a database like a library. Instead of having books scattered everywhere, they're organised by category, author and title. You can quickly find what you need, see what's related and manage the entire collection efficiently.
Types of Databases
Relational Databases (SQL)
Structured data with relationships
Relational databases store data in tables with rows and columns, where relationships between data are defined through foreign keys.
Popular options:
- PostgreSQL (open source, feature-rich)
- MySQL (fast, widely used)
- SQLite (lightweight, embedded)
- Microsoft SQL Server (enterprise)
- Oracle Database (enterprise)
Example table structure:
-- Users table
CREATE TABLE users (
id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(255) UNIQUE,
created_at TIMESTAMP
);
-- Orders table
CREATE TABLE orders (
id INT PRIMARY KEY,
user_id INT,
total DECIMAL(10,2),
status VARCHAR(50),
FOREIGN KEY (user_id) REFERENCES users(id)
);
NoSQL Databases
Flexible, non-relational data storage
NoSQL databases don't require a fixed schema and can handle unstructured or semi-structured data more flexibly.
Types:
- Document stores: MongoDB, CouchDB
- Key-value stores: Redis, DynamoDB
- Column-family: Cassandra, HBase
- Graph databases: Neo4j, ArangoDB
Example MongoDB document:
{
"_id": ObjectId("..."),
"name": "John Doe",
"email": "john@example.com",
"profile": {
"age": 30,
"location": "Sydney",
"interests": ["coding", "music", "travel"]
},
"orders": [
{
"product": "Laptop",
"price": 1299.99,
"date": "2024-01-15"
}
]
}
Database Design Principles
Normalisation
Organising data to reduce redundancy
Normalisation is the process of organising data to minimise duplication and dependency.
First Normal Form (1NF):
- Each column contains atomic values
- No repeating groups
Second Normal Form (2NF):
- In 1NF
- All non-key attributes depend on the entire primary key
Third Normal Form (3NF):
- In 2NF
- No transitive dependencies
Example of normalisation:
-- Before normalisation (redundant data)
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_name VARCHAR(100),
customer_email VARCHAR(255),
customer_address TEXT,
product_name VARCHAR(100),
product_price DECIMAL(10,2)
);
-- After normalisation
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(255),
address TEXT
);
CREATE TABLE products (
id INT PRIMARY KEY,
name VARCHAR(100),
price DECIMAL(10,2)
);
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT,
product_id INT,
quantity INT,
FOREIGN KEY (customer_id) REFERENCES customers(id),
FOREIGN KEY (product_id) REFERENCES products(id)
);
Denormalisation
Intentionally adding redundancy for performance
Sometimes you need to denormalise for better query performance.
When to denormalise:
- Read-heavy workloads
- Complex analytical queries
- Performance is critical
Example:
-- Denormalised for performance
CREATE TABLE order_summary (
id INT PRIMARY KEY,
customer_name VARCHAR(100),
total_amount DECIMAL(10,2),
order_count INT,
last_order_date DATE
);
SQL Fundamentals
Basic Queries
SELECT statement:
-- Basic select
SELECT name, email FROM users;
-- With conditions
SELECT * FROM users WHERE age > 18;
-- With sorting
SELECT name, created_at FROM users ORDER BY created_at DESC;
-- With limits
SELECT * FROM users LIMIT 10;
INSERT statement:
-- Insert single row
INSERT INTO users (name, email, age) VALUES ('John Doe', 'john@example.com', 30);
-- Insert multiple rows
INSERT INTO users (name, email, age) VALUES
('Jane Smith', 'jane@example.com', 25),
('Bob Johnson', 'bob@example.com', 35);
UPDATE statement:
-- Update single row
UPDATE users SET age = 31 WHERE id = 1;
-- Update multiple rows
UPDATE users SET status = 'active' WHERE last_login > '2024-01-01';
DELETE statement:
-- Delete specific rows
DELETE FROM users WHERE status = 'inactive';
-- Delete all rows (be careful!)
DELETE FROM users;
Joins
Combining data from multiple tables
INNER JOIN:
SELECT u.name, o.total
FROM users u
INNER JOIN orders o ON u.id = o.user_id;
LEFT JOIN:
SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
-- Shows all users, even those without orders
RIGHT JOIN:
SELECT u.name, o.total
FROM users u
RIGHT JOIN orders o ON u.id = o.user_id;
-- Shows all orders, even if user doesn't exist
Aggregation Functions
Calculating values across multiple rows
-- Count rows
SELECT COUNT(*) FROM users;
-- Sum values
SELECT SUM(total) FROM orders;
-- Average values
SELECT AVG(price) FROM products;
-- Group by
SELECT category, COUNT(*) as count
FROM products
GROUP BY category;
Database Indexing
What are Indexes?
Speed up data retrieval
Indexes are like the index at the back of a book - they help you find information quickly without reading every page.
Types of indexes:
- B-tree: Most common, good for equality and range queries
- Hash: Fast for equality queries only
- Full-text: For searching text content
- Composite: Multiple columns
Creating indexes:
-- Single column index
CREATE INDEX idx_users_email ON users(email);
-- Composite index
CREATE INDEX idx_users_name_email ON users(name, email);
-- Unique index
CREATE UNIQUE INDEX idx_users_email ON users(email);
When to use indexes:
- Columns used in WHERE clauses
- Columns used in JOIN conditions
- Columns used in ORDER BY
- Columns used in GROUP BY
When not to use indexes:
- Small tables
- Tables with frequent updates
- Columns with low selectivity
Database Transactions
ACID Properties
Ensuring data integrity
Atomicity: All operations in a transaction succeed or fail together Consistency: Database remains in a valid state Isolation: Concurrent transactions don't interfere with each other Durability: Committed transactions are permanent
Transaction example:
BEGIN TRANSACTION;
-- Transfer money between accounts
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- Check if both updates succeeded
IF @@ERROR = 0
COMMIT TRANSACTION;
ELSE
ROLLBACK TRANSACTION;
END IF;
Transaction Isolation Levels
Read Uncommitted: Lowest isolation, can see uncommitted changes Read Committed: Can't see uncommitted changes (default in most databases) Repeatable Read: Consistent reads within a transaction Serializable: Highest isolation, transactions appear to run sequentially
Database Security
Authentication and Authorization
Controlling access to data
Authentication: Verifying who you are Authorization: Determining what you can do
User management:
-- Create user
CREATE USER 'app_user'@'localhost' IDENTIFIED BY 'secure_password';
-- Grant permissions
GRANT SELECT, INSERT, UPDATE ON database_name.* TO 'app_user'@'localhost';
-- Revoke permissions
REVOKE DELETE ON database_name.* FROM 'app_user'@'localhost';
Row-Level Security (RLS)
Fine-grained access control
-- Enable RLS
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
-- Create policy
CREATE POLICY user_orders ON orders
FOR ALL
USING (user_id = current_user_id());
Data Encryption
Protecting sensitive information
At rest: Encrypting data stored on disk In transit: Encrypting data moving between client and database In use: Encrypting data while being processed
Performance Optimisation
Query Optimisation
Making queries faster
Use EXPLAIN to analyse queries:
EXPLAIN SELECT * FROM users WHERE email = 'john@example.com';
Common optimisations:
- Use appropriate indexes
- Avoid SELECT *
- Use LIMIT for large result sets
- Optimise JOIN order
- Use appropriate data types
Connection Pooling
Managing database connections efficiently
Connection pooling keeps a pool of database connections ready for use, reducing the overhead of creating new connections.
Benefits:
- Faster response times
- Better resource utilisation
- Improved scalability
Database Migration and Versioning
Schema Changes
Evolving your database structure
Safe migration practices:
- Make changes backward compatible when possible
- Use transactions for schema changes
- Test migrations on staging data
- Have rollback plans
Example migration:
-- Add new column with default value
ALTER TABLE users ADD COLUMN phone VARCHAR(20) DEFAULT NULL;
-- Update existing records
UPDATE users SET phone = 'N/A' WHERE phone IS NULL;
-- Make column required
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;
Database Versioning Tools
Managing schema changes
Popular tools:
- Flyway: Java-based, simple migrations
- Liquibase: XML/YAML-based, powerful features
- Alembic: Python-based, works with SQLAlchemy
- Prisma Migrate: TypeScript-based, works with Prisma
Modern Database Trends
Cloud Databases
Managed database services
Benefits:
- Automatic scaling
- Built-in backups
- High availability
- Managed security
Popular cloud databases:
- AWS: RDS, Aurora, DynamoDB
- Azure: SQL Database, Cosmos DB
- Google Cloud: Cloud SQL, Firestore
- Supabase: PostgreSQL with real-time features
Database as a Service (DBaaS)
Fully managed database solutions
Examples:
- PlanetScale: MySQL-compatible, branching
- Neon: Serverless PostgreSQL
- Fauna: Globally distributed, ACID transactions
Real-time Databases
Live data synchronisation
Use cases:
- Live dashboards
- Collaborative applications
- Gaming
- Chat applications
Technologies:
- Firebase: Real-time document database
- Supabase: Real-time PostgreSQL
- Pusher: Real-time messaging
Getting Started
Choose the Right Database
Consider these factors:
- Data structure and relationships
- Query patterns
- Scale requirements
- Team expertise
- Budget constraints
Quick decision guide:
- Relational: Complex relationships, ACID requirements
- Document: Flexible schema, JSON-like data
- Key-value: Simple lookups, caching
- Graph: Complex relationships, network analysis
Local Development Setup
PostgreSQL:
# macOS with Homebrew
brew install postgresql
brew services start postgresql
# Create database
createdb myapp_development
MongoDB:
# macOS with Homebrew
brew install mongodb-community
brew services start mongodb-community
# Connect to MongoDB
mongosh
SQLite:
# Built into most systems
sqlite3 myapp.db
Your First Database
Step 1: Design your schema
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Step 2: Add some data
INSERT INTO users (username, email) VALUES
('john_doe', 'john@example.com'),
('jane_smith', 'jane@example.com');
Step 3: Query your data
SELECT * FROM users WHERE username = 'john_doe';
Summary
Databases are the foundation of data-driven applications. Understanding database types, design principles and best practices is essential for building scalable and maintainable systems.
Key takeaways:
- Choose the right database type for your needs
- Design schemas with normalisation in mind
- Use indexes to improve performance
- Implement proper security measures
- Plan for schema evolution
- Consider cloud and managed solutions
Remember: Good database design is an investment that pays off in performance, maintainability and scalability!