Shared Database Across Services

Multiple services accessing one shared database, creating tight coupling.

TL;DR

Multiple services sharing one database creates tight coupling that defeats microservice independence. Schema changes affect all services. One service's writes corrupt another's data. Can't scale independently—scaling requires scaling entire database. Services become interdependent on the database level, losing autonomy. Solution: database-per-service pattern where each service owns its data and communicates through APIs, not shared tables.

Learning Objectives

You will be able to:

Understand why shared databases break service independence
Design database boundaries aligned with service boundaries
Implement asynchronous data synchronization
Handle distributed transactions safely
Migrate from shared to independent databases
Evaluate when shared databases are appropriate (rarely)

Motivating Scenario

You have User Service, Order Service, and Payment Service sharing one PostgreSQL database. This seems efficient:

Single database to manage
Easy to query across services
ACID transactions across services

But problems arise:

Problem 1: Schema Coupling Order Service adds a discount_code column to Orders table. But Payment Service has triggers on that table. The migration breaks Payment Service for 10 minutes during deployment.

Problem 2: Independent Scaling User Service gets heavy traffic (reading user profiles). You need to scale database horizontally. But the shared database must scale for all services. Now you're scaling Payment Service's database (which has light traffic) just to handle User Service load.

Problem 3: Implicit Dependencies Order Service directly queries Payments table. Payment Service doesn't know Order Service depends on it. Changes to the schema are risky.

Problem 4: Data Corruption Payment Service bug writes bad data to Payments table. Order Service reads corrupt data and creates bad orders. Now you have data consistency problems spanning services.

You realize: this isn't microservices, it's distributed application tightly coupled through the database.

Core Explanation

Why Shared Databases Look Good

Simple: one database, ACID transactions, joins work
Efficient: no network calls, direct queries
Familiar: traditional monolith pattern

Why Shared Databases Break

Tight Coupling: Services are coupled through schema
Scaling Limitations: Can't scale services independently
Deployment Risk: Schema changes affect multiple services
Data Ownership Ambiguity: Who owns what data?
Blame Shifting: "Payment Service broke my data"

The Database-Per-Service Pattern

Each service owns its data store. Services communicate through APIs:

User Service (owns: users) ← API → Order Service (owns: orders)
Order Service ← API → Payment Service (owns: payments)

Changes to Order schema don't affect Payment Service.

Code Examples

Database Design
Python

Shared Database (Anti-pattern)
Database Per Service (Solution)

-- Single database shared by User, Order, Payment services
-- ALL services can read/write ALL tables

CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR,
    email VARCHAR
);

CREATE TABLE orders (
    id INT PRIMARY KEY,
    user_id INT REFERENCES users(id),
    total DECIMAL
);

CREATE TABLE payments (
    id INT PRIMARY KEY,
    order_id INT REFERENCES orders(id),  -- Coupling!
    amount DECIMAL,
    status VARCHAR
);

-- User Service directly queries all tables
SELECT * FROM orders WHERE user_id = ?;
SELECT * FROM payments WHERE order_id = ?;

-- Payment Service directly queries all tables
SELECT * FROM orders WHERE id = ?;
UPDATE orders SET status = 'paid' WHERE id = ?;  -- Modifying Order's data!

-- Order Service modifies Payment table
INSERT INTO payments VALUES (...);

-- All services are tightly coupled through the database
-- Problems:
// 1. Payment Service modifies orders (who owns orders?)
// 2. Schema change in payments affects all services
// 3. Can't scale database independently per service
// 4. No data isolation

-- User Service Database
CREATE DATABASE user_service;
USE user_service;

CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR,
    email VARCHAR
);

-- Order Service Database
CREATE DATABASE order_service;
USE order_service;

CREATE TABLE orders (
    id INT PRIMARY KEY,
    user_id INT,  -- No foreign key to users table!
    total DECIMAL
);

CREATE TABLE order_items (
    id INT PRIMARY KEY,
    order_id INT REFERENCES orders(id),
    product_id INT,
    quantity INT
);

-- Payment Service Database
CREATE DATABASE payment_service;
USE payment_service;

CREATE TABLE payments (
    id INT PRIMARY KEY,
    order_id INT,  -- No foreign key to orders!
    amount DECIMAL,
    status VARCHAR
);

-- Benefits:
// 1. Each service owns its tables (no cross-service writes)
// 2. Schema changes isolated to one service
// 3. Can scale each database independently
// 4. Clear data ownership
// 5. Services coupled through API contracts, not database

// Communication through APIs:
// Order Service needs user details:
//   Call: GET /api/users/{id}  (User Service)
//
// Payment Service needs order details:
//   Call: GET /api/orders/{id}  (Order Service)

Shared Database (Anti-pattern)
Database Per Service (Solution)

shared_database.py
import psycopg2

# Shared database connection for all services
shared_db = psycopg2.connect("postgresql://localhost/shared_db")

class UserService:
    def get_user(self, user_id):
        cursor = shared_db.cursor()
        cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
        return cursor.fetchone()

class OrderService:
    def create_order(self, user_id, items):
        cursor = shared_db.cursor()
        # Check if user exists by querying users table
        cursor.execute("SELECT id FROM users WHERE id = %s", (user_id,))
        if not cursor.fetchone():
            raise Exception("User not found")

        # Create order
        cursor.execute(
            "INSERT INTO orders (user_id, total) VALUES (%s, %s)",
            (user_id, self._calculate_total(items))
        )
        cursor.commit()

    def get_order(self, order_id):
        cursor = shared_db.cursor()
        cursor.execute("SELECT * FROM orders WHERE id = %s", (order_id,))
        return cursor.fetchone()

class PaymentService:
    def process_payment(self, order_id, amount):
        cursor = shared_db.cursor()
        # Payment Service directly queries and modifies Order data
        cursor.execute("SELECT * FROM orders WHERE id = %s", (order_id,))
        order = cursor.fetchone()
        if not order:
            raise Exception("Order not found")

        # Process payment
        cursor.execute(
            "INSERT INTO payments (order_id, amount, status) VALUES (%s, %s, %s)",
            (order_id, amount, 'pending')
        )

        # Payment Service modifies Order table (tight coupling!)
        cursor.execute(
            "UPDATE orders SET status = %s WHERE id = %s",
            ('paid', order_id)
        )
        cursor.commit()

# Problems:
// 1. PaymentService modifies orders table (who owns orders?)
// 2. Schema change in payments affects all services
// 3. Direct database coupling
// 4. Can't scale services independently

database_per_service.py
import psycopg2
import requests

# Each service has its own database
user_db = psycopg2.connect("postgresql://localhost/user_db")
order_db = psycopg2.connect("postgresql://localhost/order_db")
payment_db = psycopg2.connect("postgresql://localhost/payment_db")

class UserService:
    def get_user(self, user_id):
        """User Service owns users table"""
        cursor = user_db.cursor()
        cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
        return cursor.fetchone()

class OrderService:
    def create_order(self, user_id, items):
        """Order Service owns orders table"""
        # Check user exists via API, not database query
        user_resp = requests.get(f"http://user-service/users/{user_id}")
        if not user_resp.ok:
            raise Exception("User not found")

        # Create order in Order Service's database only
        cursor = order_db.cursor()
        total = sum(item['price'] * item['qty'] for item in items)
        cursor.execute(
            "INSERT INTO orders (user_id, total, status) VALUES (%s, %s, %s)",
            (user_id, total, 'pending')
        )
        order_id = cursor.fetchall()[0][0]
        cursor.commit()

        # Emit event for other services (asynchronously)
        # Payment Service listens and processes payment
        self.emit_event('OrderCreated', {
            'order_id': order_id,
            'user_id': user_id,
            'total': total
        })

        return order_id

    def get_order(self, order_id):
        """Order Service only queries its own table"""
        cursor = order_db.cursor()
        cursor.execute("SELECT * FROM orders WHERE id = %s", (order_id,))
        return cursor.fetchone()

class PaymentService:
    def process_payment(self, order_id, amount):
        """Payment Service owns payments table"""
        # Get order details via API, not direct database query
        order_resp = requests.get(f"http://order-service/orders/{order_id}")
        if not order_resp.ok:
            raise Exception("Order not found")
        order = order_resp.json()

        # Process payment in Payment Service's database only
        cursor = payment_db.cursor()
        cursor.execute(
            "INSERT INTO payments (order_id, amount, status) VALUES (%s, %s, %s)",
            (order_id, amount, 'pending')
        )
        cursor.commit()

        # Emit event for Order Service to handle
        self.emit_event('PaymentProcessed', {
            'order_id': order_id,
            'status': 'paid'
        })

# Benefits:
// 1. Clear data ownership (each service owns its tables)
// 2. Schema changes isolated (Payment change doesn't affect Order)
// 3. Services loosely coupled through APIs
// 4. Can scale databases independently
// 5. Asynchronous communication is resilient

Design Review Checklist

Does each service have its own database?
Are there any cross-service foreign keys?
Do services communicate through APIs, not shared tables?
Can you deploy one service without coordinating database changes?
Can you scale one service's database independently?
Is data ownership clear (no ambiguity)?
Are there no direct cross-database queries?
Do services use event-driven communication for consistency?
Can you change a service's schema without affecting others?

Showcase

Signals of Shared Database Coupling

Anti-Signals (Problems)

Multiple services share one database
Cross-service foreign keys
Service A queries Service B's tables
Schema change requires coordinating multiple services
Can't scale services independently

Healthy Signals (Solutions)

Database per service
No cross-service foreign keys
Services communicate via APIs
Schema changes isolated
Independent scaling possible

One Takeaway

info

Database-per-service eliminates the most insidious coupling: the database schema. Services become truly independent when they don't share data stores.

Shared Database Across Services

TL;DR​

Learning Objectives​

Motivating Scenario​

Core Explanation​

Code Examples​

Design Review Checklist​

Showcase​

Signals of Shared Database Coupling

One Takeaway​

References​