Database Design Fundamentals for Marketplace Platforms
Complete guide to designing scalable marketplace database schemas with proven patterns, indexing strategies, and optimization techniques.
Prerequisites
- •Intermediate SQL knowledge (joins, indexes, constraints)
- •Understanding of relational database concepts
- •Familiarity with PostgreSQL or similar RDBMS
- •Basic knowledge of marketplace architecture
Database design determines whether your marketplace can handle 1,000 users or 1,000,000 users. A well-architected schema enables rapid feature development while maintaining performance at scale. This guide provides the patterns and strategies we use to design marketplace databases that scale.
Core Entity Relationships
Marketplaces require unique data modeling due to their multi-sided nature. Understanding the relationships between users, products, transactions, and supporting entities is foundational.
User and Role Architecture
Multi-sided marketplaces need flexible user models that support buyers, sellers, and administrators with varying permissions.
User Authentication Schema:
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
user_type VARCHAR(50) NOT NULL CHECK (user_type IN ('buyer', 'seller', 'admin')),
email_verified BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Index for authentication queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_type_verified ON users(user_type, email_verified);
Profile Data Separation:
Separate authentication from profile data for security and flexibility:
CREATE TABLE user_profiles (
user_id UUID PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
display_name VARCHAR(255),
avatar_url VARCHAR(500),
bio TEXT,
phone VARCHAR(50),
metadata JSONB DEFAULT '{}',
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- JSONB index for flexible attribute queries
CREATE INDEX idx_user_profiles_metadata ON user_profiles USING gin(metadata);
Design Rationale:
- •Separate tables prevent auth queries from scanning profile data
- •JSONB metadata field accommodates varying user attributes without schema changes
- •UUID primary keys enable distributed systems and prevent enumeration attacks
- •CHECK constraints enforce data integrity at database level
Product Catalog Design
Marketplace catalogs must balance flexibility with query performance.
Flexible Product Schema:
CREATE TABLE products (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
seller_id UUID REFERENCES users(id) ON DELETE CASCADE,
title VARCHAR(500) NOT NULL,
description TEXT,
price DECIMAL(10, 2) NOT NULL,
attributes JSONB DEFAULT '{}',
status VARCHAR(50) DEFAULT 'draft' CHECK (status IN ('draft', 'active', 'sold', 'archived')),
view_count INTEGER DEFAULT 0,
search_vector tsvector,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Full-text search index
CREATE INDEX idx_products_search ON products USING gin(search_vector);
-- JSONB indexes for flexible attribute queries
CREATE INDEX idx_products_attributes ON products USING gin(attributes);
-- Composite indexes for common queries
CREATE INDEX idx_products_seller_status ON products(seller_id, status);
CREATE INDEX idx_products_status_created ON products(status, created_at DESC);
Maintaining Search Vectors:
Automatically update full-text search when products change:
CREATE FUNCTION products_search_trigger() RETURNS trigger AS $$
BEGIN
NEW.search_vector :=
setweight(to_tsvector('english', COALESCE(NEW.title, '')), 'A') ||
setweight(to_tsvector('english', COALESCE(NEW.description, '')), 'B');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER products_search_update
BEFORE INSERT OR UPDATE ON products
FOR EACH ROW EXECUTE FUNCTION products_search_trigger();
JSONB Attribute Pattern:
Store varying product attributes without schema changes:
-- Example: Clothing marketplace
UPDATE products SET attributes = jsonb_build_object(
'size', 'Medium',
'color', 'Blue',
'material', 'Cotton',
'brand', 'Acme Co.'
) WHERE id = '...';
-- Query by attributes
SELECT * FROM products
WHERE attributes @> '{"color": "Blue", "size": "Medium"}';
Transaction and Order Management
Transaction schemas must support complex states, escrow functionality, and audit trails.
Order Architecture:
CREATE TABLE orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
buyer_id UUID REFERENCES users(id) ON DELETE RESTRICT,
seller_id UUID REFERENCES users(id) ON DELETE RESTRICT,
product_id UUID REFERENCES products(id) ON DELETE RESTRICT,
amount DECIMAL(10, 2) NOT NULL,
platform_fee DECIMAL(10, 2) NOT NULL,
status VARCHAR(50) NOT NULL DEFAULT 'pending',
payment_intent_id VARCHAR(255),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
completed_at TIMESTAMP,
CONSTRAINT different_users CHECK (buyer_id != seller_id)
);
-- Indexes for dashboard queries
CREATE INDEX idx_orders_buyer ON orders(buyer_id, status);
CREATE INDEX idx_orders_seller ON orders(seller_id, status);
CREATE INDEX idx_orders_status_created ON orders(status, created_at DESC);
CREATE INDEX idx_orders_payment_intent ON orders(payment_intent_id);
Order State Transitions:
Document valid state transitions and enforce with application logic:
pending → paid → processing → completed
pending → paid → processing → disputed → resolved/refunded
pending → expired (after 24 hours)
pending → cancelled (by buyer)
Escrow Support:
Track payment holds and releases:
CREATE TABLE payment_holds (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
order_id UUID REFERENCES orders(id) ON DELETE CASCADE,
amount DECIMAL(10, 2) NOT NULL,
hold_created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
hold_released_at TIMESTAMP,
release_to VARCHAR(50) CHECK (release_to IN ('seller', 'buyer', 'split')),
notes TEXT
);
CREATE INDEX idx_payment_holds_order ON payment_holds(order_id);
CREATE INDEX idx_payment_holds_unreleased ON payment_holds(hold_released_at)
WHERE hold_released_at IS NULL;
Strategic Indexing
Proper indexing is the difference between 50ms and 5000ms query times. Index based on actual query patterns, not assumptions.
Composite Index Design
Composite indexes should match query patterns with column order based on selectivity.
Query-Driven Index Creation:
-- For seller dashboard: "Show my active listings"
CREATE INDEX idx_products_seller_active
ON products(seller_id, status, created_at DESC)
WHERE status = 'active';
-- For category browsing with price sorting
CREATE INDEX idx_products_category_price
ON products(category_id, price DESC, created_at DESC)
WHERE status = 'active';
-- For buyer order history
CREATE INDEX idx_orders_buyer_recent
ON orders(buyer_id, created_at DESC)
WHERE status IN ('completed', 'processing');
Index Column Ordering:
Follow the equality-first, sorting-last principle:
- •Columns in WHERE equality conditions (highest selectivity first)
- •Columns in WHERE range conditions
- •Columns in ORDER BY clauses
Example:
-- Query: Find seller's active products sorted by price
SELECT * FROM products
WHERE seller_id = ? AND status = 'active'
ORDER BY price DESC;
-- Optimal index order:
CREATE INDEX idx ON products(seller_id, status, price DESC);
Partial Indexes
Partial indexes reduce index size and improve write performance by indexing only relevant rows.
When to Use Partial Indexes:
Use when queries consistently filter on a specific condition:
-- Index only active products (reduces index size by 60-80%)
CREATE INDEX idx_active_products
ON products(created_at DESC)
WHERE status = 'active';
-- Index only verified sellers
CREATE INDEX idx_verified_sellers
ON users(created_at DESC)
WHERE user_type = 'seller' AND email_verified = TRUE;
-- Index only pending orders (orders change status frequently)
CREATE INDEX idx_pending_orders
ON orders(created_at DESC)
WHERE status = 'pending';
Benefits:
- •Smaller index size (faster scans, less disk space)
- •Faster writes (fewer index entries to update)
- •More efficient for queries that filter on the same condition
Covering Indexes
Covering indexes include all columns needed by a query, eliminating table lookups.
Include Columns in Indexes:
PostgreSQL supports INCLUDE for covering indexes:
-- Query frequently needs title, price, and thumbnail
CREATE INDEX idx_products_listing
ON products(seller_id, status, created_at DESC)
INCLUDE (title, price, thumbnail_url);
-- Now this query uses only the index (no table lookup):
SELECT title, price, thumbnail_url
FROM products
WHERE seller_id = ? AND status = 'active'
ORDER BY created_at DESC;
When to Use Covering Indexes:
- •High-traffic queries that select specific columns
- •Dashboard/list views that don't need full row data
- •API endpoints serving JSON responses with limited fields
Trade-offs:
- •Larger index size (more disk space, slower writes)
- •Only beneficial if query is frequent and performance-critical
Query Optimization Techniques
Optimization begins with understanding query execution. Use EXPLAIN ANALYZE to identify bottlenecks.
Using EXPLAIN ANALYZE
Analyze actual query performance:
EXPLAIN ANALYZE
SELECT p.*, u.display_name, u.avatar_url
FROM products p
JOIN user_profiles u ON p.seller_id = u.user_id
WHERE p.status = 'active'
AND p.price BETWEEN 10 AND 100
ORDER BY p.created_at DESC
LIMIT 20;
Key Metrics to Review:
- •Seq Scan vs Index Scan: Sequential scans indicate missing indexes
- •Rows returned vs Rows estimated: Large discrepancies suggest outdated statistics
- •Execution time: Total query time including planning
- •Buffers: Shared vs temp buffers indicate memory usage
Common Issues and Solutions:
| Issue | Indicator | Solution |
|---|---|---|
| Missing index | Seq Scan on large table | Create appropriate index |
| Wrong index used | Index Scan on wrong column | Adjust index or query |
| Outdated statistics | Rows off by orders of magnitude | Run ANALYZE |
| Large result set | Many rows scanned | Add LIMIT, pagination |
| Complex join | Nested Loop on large tables | Consider materialized view |
Query Optimization Patterns
Avoid SELECT * in Application Code:
-- Bad: Retrieves unnecessary data
SELECT * FROM products WHERE status = 'active';
-- Good: Select only needed columns
SELECT id, title, price, thumbnail_url FROM products WHERE status = 'active';
Use EXISTS Instead of COUNT for Existence Checks:
-- Bad: Counts all matching rows
SELECT COUNT(*) FROM orders WHERE buyer_id = ? AND status = 'completed';
-- Good: Stops at first match
SELECT EXISTS(SELECT 1 FROM orders WHERE buyer_id = ? AND status = 'completed');
Batch Queries to Reduce Round Trips:
-- Bad: N+1 query problem in application code
SELECT * FROM products WHERE seller_id = ?;
-- Then for each product:
SELECT * FROM reviews WHERE product_id = ?;
-- Good: Single query with JOIN
SELECT p.*, array_agg(r.*) as reviews
FROM products p
LEFT JOIN reviews r ON p.id = r.product_id
WHERE p.seller_id = ?
GROUP BY p.id;
Query Performance Monitoring
Set up slow query logging:
-- postgresql.conf settings
log_min_duration_statement = 1000 -- Log queries over 1 second
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
log_statement = 'none'
Track slow queries and optimize systematically:
- •Identify slowest queries from logs
- •Run EXPLAIN ANALYZE on slow queries
- •Create or adjust indexes
- •Rewrite inefficient queries
- •Validate improvement with EXPLAIN ANALYZE
- •Monitor in production
Handling Soft Deletes and Audit Trails
Marketplaces require data retention for legal, financial, and analytics purposes. Implement soft deletes rather than hard deletes.
Soft Delete Pattern
Add deletion timestamp to tables:
ALTER TABLE products ADD COLUMN deleted_at TIMESTAMP;
-- Index for filtering out deleted records
CREATE INDEX idx_products_not_deleted
ON products(status, created_at DESC)
WHERE deleted_at IS NULL;
-- Queries must filter deleted records
SELECT * FROM products
WHERE deleted_at IS NULL
AND status = 'active'
ORDER BY created_at DESC;
Application-Level Considerations:
- •Create database views that automatically filter deleted_at IS NULL
- •Use ORM global scopes to filter soft-deleted records
- •Provide admin interfaces to view/restore deleted data
Audit Trail Architecture
Track all changes to critical tables:
CREATE TABLE audit_logs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
table_name VARCHAR(100) NOT NULL,
record_id UUID NOT NULL,
action VARCHAR(20) NOT NULL CHECK (action IN ('INSERT', 'UPDATE', 'DELETE')),
old_data JSONB,
new_data JSONB,
changed_by UUID REFERENCES users(id),
changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
ip_address INET,
user_agent TEXT
);
CREATE INDEX idx_audit_table_record ON audit_logs(table_name, record_id);
CREATE INDEX idx_audit_user ON audit_logs(changed_by, changed_at DESC);
Trigger-Based Audit Logging:
CREATE FUNCTION audit_trigger() RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'DELETE' THEN
INSERT INTO audit_logs (table_name, record_id, action, old_data)
VALUES (TG_TABLE_NAME, OLD.id, 'DELETE', row_to_json(OLD));
ELSIF TG_OP = 'UPDATE' THEN
INSERT INTO audit_logs (table_name, record_id, action, old_data, new_data)
VALUES (TG_TABLE_NAME, NEW.id, 'UPDATE', row_to_json(OLD), row_to_json(NEW));
ELSIF TG_OP = 'INSERT' THEN
INSERT INTO audit_logs (table_name, record_id, action, new_data)
VALUES (TG_TABLE_NAME, NEW.id, 'INSERT', row_to_json(NEW));
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Apply to critical tables
CREATE TRIGGER audit_orders
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION audit_trigger();
Scaling Strategies
Plan for scale from the beginning. Database architecture changes are expensive at 100K users.
Read Replicas
Route read-heavy queries to replica databases:
Architecture Pattern:
- •Primary (Write): All INSERT, UPDATE, DELETE operations
- •Replicas (Read): SELECT queries for analytics, reporting, dashboards
Implementation Considerations:
- •Use connection pooling (PgBouncer) to manage connections
- •Handle replication lag (typically 1-5 seconds)
- •Route traffic at application level or use read-replica proxy
- •Monitor replication lag and failover to primary if lag exceeds threshold
When to Implement:
- •Read:write ratio exceeds 10:1
- •Primary database CPU consistently over 70%
- •Dashboard queries impact transactional performance
Table Partitioning
Partition large tables to improve query performance and maintenance:
Time-Based Partitioning:
CREATE TABLE orders (
id UUID,
buyer_id UUID,
seller_id UUID,
amount DECIMAL(10, 2),
created_at TIMESTAMP NOT NULL,
PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);
-- Create partitions for each month
CREATE TABLE orders_2024_01 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE orders_2024_02 PARTITION OF orders
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- Automatically create future partitions
CREATE EXTENSION IF NOT EXISTS pg_partman;
Benefits:
- •Faster queries (scan only relevant partitions)
- •Easier data archival (drop old partitions)
- •Improved vacuum and index maintenance
When to Use:
- •Tables exceeding 100 million rows
- •Time-series data (orders, messages, analytics)
- •Queries frequently filter by time range
Connection Pooling
Database connections are expensive. Pool and reuse them:
PgBouncer Configuration:
[databases]
marketplace = host=db.example.com port=5432 dbname=marketplace
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3
Benefits:
- •Handle 1000+ concurrent app connections with 25 database connections
- •Reduce connection overhead
- •Protect database from connection exhaustion
Implementation Checklist
Use this checklist when designing marketplace databases:
Schema Design:
- • Separate authentication from profile data
- • Use UUID primary keys for distributed systems
- • Implement JSONB for flexible attributes
- • Add CHECK constraints for data integrity
- • Include soft delete columns on user-facing tables
- • Design for common query patterns
Indexing:
- • Create indexes based on actual queries, not assumptions
- • Use composite indexes matching query column order
- • Implement partial indexes for frequently filtered conditions
- • Add covering indexes for high-traffic read queries
- • Include full-text search indexes for searchable content
- • JSONB GIN indexes for attribute queries
Performance:
- • Use EXPLAIN ANALYZE to validate query performance
- • Set up slow query logging
- • Monitor index usage and remove unused indexes
- • Implement connection pooling
- • Plan for read replicas when read:write ratio exceeds 10:1
Scalability:
- • Design partition strategy for large tables
- • Plan for horizontal scaling with read replicas
- • Implement caching layer (Redis) for hot data
- • Use materialized views for complex reporting queries
Audit and Compliance:
- • Implement soft deletes, not hard deletes
- • Add audit trail for financial transactions
- • Include timestamps (created_at, updated_at) on all tables
- • Track user actions on sensitive operations
When to Use This Framework
Apply these patterns when:
- •Building multi-sided marketplace platforms
- •Designing systems for 10K+ concurrent users
- •Handling financial transactions requiring audit trails
- •Creating platforms with diverse product/service catalogs
- •Building systems requiring complex search and filtering
Adjust for smaller projects:
- •Skip partitioning for databases under 10M rows
- •Use simpler indexing for low-traffic applications
- •Consider read replicas only when primary database is saturated
Conclusion
Database design determines your marketplace's scalability ceiling. Flexible schemas using JSONB enable rapid feature development. Strategic indexing delivers consistent sub-50ms query times. Query optimization prevents performance degradation as data grows. Scaling strategies provide clear growth paths from 1K to 1M users.
The patterns in this guide come from building 200+ marketplace databases. Start with solid fundamentals, measure performance continuously, and scale incrementally based on actual metrics.
Next Steps:
- •Download the marketplace schema template
- •Apply indexing strategies to your highest-traffic queries
- •Set up slow query logging and analyze bottlenecks
- •Plan scaling strategy based on projected growth
- •Implement audit trails for critical business operations
How much should your build actually cost?
Get a personalized investment estimate based on your platform type, scope, and timeline.
Open the Investment CalculatorDownloads
About the Author

Alex Chen
CTO & Co-Founder
Senior software engineer and tech entrepreneur with 15+ years building scalable platforms. Previously led engineering at two successful marketplace exits totaling $40M+. Specializes in marketplace architecture, distributed systems, high-performance databases, and AI-powered development workflows. Architect behind Directorism's platform infrastructure serving millions of requests.
Related Resources
Database Architecture Patterns for Marketplaces
Learn how to design scalable marketplace database schemas. Includes battle-tested patterns for users, listings, transactions, reviews, and messaging systems.
The Definitive Marketplace Tech Stack Guide for 2025
Choose the right tech stack for your marketplace. Learn proven architectures from Next.js to PostgreSQL, when to use each technology, and how to scale from MVP to 1M+ users with real cost projections.
Building Scalable Architecture: Technical Framework for Marketplaces
Learn how to architect marketplace platforms that scale to millions of users. Includes microservices patterns, database sharding, caching strategies, and deployment checklists.