Consistency Models and Trade-offs
Choose the right consistency guarantee for your data, balancing correctness, latency, and availability
TL;DR
Consistency models form a spectrum. Strong consistency (linearizability) guarantees all reads see all completed writes, but sacrifices latency and availability. Eventual consistency guarantees all writes eventually propagate, accepting temporary inconsistency. Between them lie causal consistency and other models. Choose based on your data's importance and your tolerance for inconsistency.
Learning Objectives
- Understand the spectrum of consistency models
- Distinguish strong, causal, and eventual consistency
- Recognize the latency and availability cost of stronger consistency
- Apply per-operation and per-data consistency strategies
Motivating Scenario
Your e-commerce system needs inventory to be accurate. A customer buys the last item; you must not sell it again. But performance testing shows that strong consistency everywhere drops throughput by 60%. What if you used eventual consistency everywhere, inventory included? Risk: the same item sells twice. What if you used strong consistency only for inventory changes? Complexity: application logic must vary by operation. This is the consistency trade-off in practice.
The Consistency Spectrum
Strong Consistency (Linearizability)
Definition: Every read returns the value of the most recent completed write. All operations appear to execute atomically, in a single total order that respects real-time ordering.
Characteristics:
- All nodes agree on data values
- Reads never return stale data
- Highest latency (waiting for confirmation from replicas)
- Lowest throughput (mutations must be coordinated)
- Easiest to reason about for application developers
When to Use:
- Financial transactions (must be accurate)
- Inventory management (must prevent overselling)
- Atomic counters or versioning
- Any scenario where consistency errors cascade
Trade-off: You pay in latency and availability. During a network partition, you must choose between serving possibly stale data (losing consistency) and refusing requests (losing availability).
Example:
Time: 0 Alice writes balance = 100 to Node A
Time: 1 Bob reads from Node B
Result: Bob sees 100 (not old value)
Guarantees: Bob MUST see Alice's write
Cost: Node B must sync with Node A first
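The cost in this example can be made concrete with a toy model. The following is a minimal sketch, assuming a coordinator that applies each write to every replica before acknowledging it; `Replica` and `SyncReplicatedRegister` are hypothetical names, and production systems typically rely on a consensus protocol rather than this simple scheme.

```python
class Replica:
    """A hypothetical in-memory replica holding a single value."""

    def __init__(self, name, value=None):
        self.name = name
        self.value = value
        self.reachable = True

    def apply(self, value):
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        self.value = value


class SyncReplicatedRegister:
    """Acknowledges a write only after every replica has applied it."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, value):
        # Refuse the write up front if any replica is unreachable:
        # consistency is kept, availability is lost.
        if not all(r.reachable for r in self.replicas):
            raise ConnectionError("write refused: a replica is unreachable")
        for replica in self.replicas:
            replica.apply(value)

    def read(self, replica_index):
        # Any replica can serve the read because all of them are in sync.
        return self.replicas[replica_index].value


node_a, node_b = Replica("A", 50), Replica("B", 50)
register = SyncReplicatedRegister([node_a, node_b])
register.write(100)        # Alice writes; the call returns after both nodes apply it
print(register.read(1))    # Bob reads from Node B and sees 100
```

Marking `node_b.reachable = False` and writing again raises `ConnectionError`, which is the availability cost described in the trade-off.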
Causal Consistency
Definition: Operations that are causally related are seen by all processes in the same order. Concurrent operations with no causal link may be seen in different orders by different processes.
Characteristics:
- Weaker than strong consistency (allows some ordering divergences)
- Stronger than eventual consistency (respects causal relationships)
- Better latency than strong consistency
- Useful for operations with dependencies
When to Use:
- Message threads (replies must follow messages)
- Collaborative editing (edits depend on previous state)
- Comment threads (replies must appear after comments)
- Any scenario with ordered dependencies
Trade-off: You avoid some latency costs of strong consistency while maintaining logical ordering.
Example:
Time: 0 Alice posts message "Hello"
Time: 1 Bob reads message "Hello"
Time: 2 Bob replies "Hi there"
Time: 3 Charlie reads both message and reply
Guarantees: Charlie sees message before reply
(causal relationship preserved)
No guarantee: If Dave posts unrelated message,
order relative to Alice's message is undefined
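One way to picture the guarantee is a replica that holds back a message until the message it depends on has been delivered. The sketch below assumes each message names at most one dependency; `CausalReplica` and the message ids are hypothetical, and real systems usually track dependencies with vector clocks or similar metadata.

```python
class CausalReplica:
    """Delivers a message only after the message it depends on is delivered."""

    def __init__(self):
        self.delivered = []   # Message ids in delivery order
        self.pending = []     # (msg_id, depends_on) pairs waiting for a dependency

    def receive(self, msg_id, depends_on=None):
        self.pending.append((msg_id, depends_on))
        self._deliver_ready()

    def _deliver_ready(self):
        # Keep scanning until a full pass delivers nothing new.
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                msg_id, depends_on = msg
                if depends_on is None or depends_on in self.delivered:
                    self.delivered.append(msg_id)
                    self.pending.remove(msg)
                    progress = True


charlie = CausalReplica()
charlie.receive("bob-reply", depends_on="alice-hello")  # Arrives early, is buffered
charlie.receive("dave-post")                            # Unrelated, delivered at once
charlie.receive("alice-hello")                          # Unblocks the buffered reply
print(charlie.delivered)  # ['dave-post', 'alice-hello', 'bob-reply']
```

Charlie never sees the reply before the message it answers, while Dave's unrelated post may land anywhere in the order.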
Eventual Consistency
Definition: All updates eventually propagate to all replicas, so that once writes stop, every replica converges to the same value. There is no bound on how long propagation takes.
Characteristics:
- Highest throughput (no coordination overhead)
- Lowest latency (writes don't wait for confirmation)
- Temporary inconsistency (reads may be stale)
- Requires application logic to handle conflicts
- Simple to scale horizontally
When to Use:
- Social media feeds (eventual correctness is fine)
- Product recommendations (stale data acceptable)
- Caching layers (temporary inconsistency expected)
- Systems with high read traffic (reads scale out by adding replicas)
- Analytics and logging (eventual consistency natural)
Trade-off: Simplest to implement and scale, but application must handle temporary inconsistency.
Example:
Time: 0 Alice writes counter = 100 to Node A
(doesn't wait for Node B to acknowledge)
Time: 1 Bob reads counter from Node B
Result: Bob might see 99 (old value)
Time: 2 Update propagates to Node B
(Bob's next read sees 100)
No guarantee: When exactly Bob sees the update
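The same timeline can be sketched in code. This is a simplified model, assuming two nodes and an in-process queue standing in for the replication channel; `AsyncReplicatedCounter` is a hypothetical name.

```python
from collections import deque


class AsyncReplicatedCounter:
    """Node A acknowledges writes immediately; Node B catches up later."""

    def __init__(self, initial):
        self.node_a = initial
        self.node_b = initial
        self.replication_queue = deque()  # Updates not yet applied to Node B

    def write(self, value):
        # Acknowledge right away; replication happens in the background.
        self.node_a = value
        self.replication_queue.append(value)

    def read_from_b(self):
        return self.node_b

    def replicate(self):
        # Apply queued updates to Node B in the order they were written.
        while self.replication_queue:
            self.node_b = self.replication_queue.popleft()


counter = AsyncReplicatedCounter(initial=99)
counter.write(100)
stale = counter.read_from_b()   # 99: the update has not reached Node B yet
counter.replicate()
fresh = counter.read_from_b()   # 100: the replicas have converged
```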
Real-World Consistency Failures and How to Handle Them
Case Study 1: Overselling in E-Commerce
Problem: Two orders claim the last item in inventory at the same time
Time 0: Item inventory = 1
Time 1: Order A checks inventory (sees 1, places order)
Time 2: Order B checks inventory (sees 1, places order - race condition!)
Time 3: System processes Order A (decrements to 0)
Time 4: System processes Order B (decrements to -1, error!)
Solutions by consistency model:
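```python
import threading

# In-memory stand-ins for the database and per-item mutexes (illustrative only;
# a distributed deployment would need a distributed lock or a database transaction)
inventory_db = {"item-1": 1}
item_locks = {"item-1": threading.Lock()}

# Strong Consistency: serialize the check and the decrement with a lock
def purchase_item_strong(item_id):
    with item_locks[item_id]:  # Mutex
        inventory = inventory_db[item_id]
        if inventory > 0:
            inventory_db[item_id] = inventory - 1
            return "success"
        return "out_of_stock"

# Eventual Consistency: accept, then reconcile
def purchase_item_eventual(item_id):
    # Optimistic: assume success and decrement without checking
    inventory_db[item_id] -= 1
    return "accepted"

# Later (for example, in a background job): reconcile if inventory went negative
def reconcile_inventory(item_id, notify_customer_cancellation):
    if inventory_db[item_id] < 0:
        notify_customer_cancellation(item_id)
```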
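Where inventory lives in a relational database, the strong-consistency path can also be expressed without an application-level lock by folding the stock check into the update itself. The following is a sketch under that assumption, using Python's built-in `sqlite3` module; the `inventory` table and the `purchase_item_atomic` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (item_id TEXT PRIMARY KEY, quantity INTEGER NOT NULL)"
)
conn.execute("INSERT INTO inventory VALUES ('sku-1', 1)")
conn.commit()


def purchase_item_atomic(conn, item_id):
    """Decrement stock only if some is left; check and update are one statement."""
    with conn:  # Commits on success, rolls back if an exception is raised
        cursor = conn.execute(
            "UPDATE inventory SET quantity = quantity - 1 "
            "WHERE item_id = ? AND quantity > 0",
            (item_id,),
        )
    # rowcount is 1 if the row was updated, 0 if no stock was left
    return "success" if cursor.rowcount == 1 else "out_of_stock"


first = purchase_item_atomic(conn, "sku-1")    # "success"
second = purchase_item_atomic(conn, "sku-1")   # "out_of_stock"
```

Because the database evaluates the `WHERE` clause and applies the decrement as a single statement, the quantity cannot drop below zero even when two orders race.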
Case Study 2: Payment Processing
Problem: The customer is charged but the order is never created, and a naive retry can charge the customer twice
Strong Consistency:
- Atomic transaction: both succeed or both fail
- Guaranteed: if charged, order created
Eventual Consistency:
- Charge succeeds, order creation fails
- Must detect and handle: refund customer, retry order creation
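The strong-consistency option can be sketched with a single database transaction. This is an illustration only, assuming the charge record and the order live in the same database (a real payment provider sits outside it); the table names and `charge_and_create_order` are hypothetical, and the `fail_order` flag exists purely to simulate a failure between the two writes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE charges (customer_id TEXT, amount INTEGER)")
conn.execute("CREATE TABLE orders (customer_id TEXT, amount INTEGER)")
conn.commit()


def charge_and_create_order(conn, customer_id, amount, fail_order=False):
    """Record the charge and the order atomically; return True on success."""
    try:
        with conn:  # Commits if the block completes, rolls back if it raises
            conn.execute(
                "INSERT INTO charges (customer_id, amount) VALUES (?, ?)",
                (customer_id, amount),
            )
            if fail_order:
                # Simulated failure between the two writes
                raise RuntimeError("order creation failed")
            conn.execute(
                "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                (customer_id, amount),
            )
        return True
    except RuntimeError:
        return False


def count(conn, table):
    # Table names come from this example only, never from user input
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


ok = charge_and_create_order(conn, "alice", 500)
failed = charge_and_create_order(conn, "bob", 700, fail_order=True)
print(count(conn, "charges"), count(conn, "orders"))  # 1 1
```

The failed attempt leaves no charge row behind, so there is nothing to refund. Under eventual consistency the two writes are separate, and the detection and refund logic has to live in the application.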
ACID vs BASE
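These frameworks describe two consistency philosophies:
ACID (Atomicity, Consistency, Isolation, Durability) favors correctness on every transaction. Typical of:
- SQL databases
- Transactional systems
- Financial ledgers
BASE (Basically Available, Soft state, Eventual consistency) favors availability and accepts temporary inconsistency. Typical of:
- NoSQL databases
- Distributed caches
- Event-driven systems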
Practical Strategies
1. Hybrid Consistency
Use different consistency models for different operations:
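Python:
```python
class InventoryService:
    def __init__(self, inventory_store, recommendation_cache):
        # Both stores are illustrative; their get/put signatures are assumptions
        self.inventory_store = inventory_store
        self.recommendation_cache = recommendation_cache

    def purchase_item(self, user_id, item_id):
        """Inventory operations need strong consistency"""
        key = f"item:{item_id}"
        # Use quorum write/read (strong consistency)
        current_quantity = self.inventory_store.get(key=key, consistency='strong')
        if current_quantity <= 0:
            raise ValueError("out of stock")
        decreased_quantity = current_quantity - 1
        self.inventory_store.put(
            key=key,
            value=decreased_quantity,
            consistency='strong'
        )

    def view_recommendations(self, user_id):
        """Recommendations can use eventual consistency"""
        # Use fast local replica (eventual consistency)
        return self.recommendation_cache.get(
            key=f"recommendations:{user_id}",
            consistency='eventual'
        )
```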
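Node.js:
```javascript
class InventoryService {
  constructor(inventory, cache) {
    // Both clients are illustrative; their get/put signatures are assumptions
    this.inventory = inventory;
    this.cache = cache;
  }

  async purchaseItem(userId, itemId) {
    const key = `item:${itemId}`;
    // Strong consistency: wait for all replicas to acknowledge
    const currentQuantity = await this.inventory.get(key, { waitForReplicas: 'all' });
    if (currentQuantity <= 0) {
      throw new Error('out of stock');
    }
    await this.inventory.put(
      key,
      currentQuantity - 1,
      { waitForReplicas: 'all' }
    );
  }

  async viewRecommendations(userId) {
    // Eventual consistency: read from nearest replica
    return this.cache.get(
      `recommendations:${userId}`,
      { waitForReplicas: 'any' }
    );
  }
}
```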
2. Conflict Resolution
With eventual consistency, you need conflict resolution strategies:
- Last-Write-Wins: Timestamp wins. Simple but loses data.
- Application Logic: Custom merge logic. Preserves data but complex.
- Quorum/Voting: Replicas vote, and the value held by the majority wins.
- Operational Transformation: Track and merge concurrent edits (Google Docs).
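The difference between the first two strategies shows up as soon as two replicas accept concurrent writes. The sketch below assumes each version is a `(value, timestamp)` pair and uses a shopping cart as the data; both merge functions are hypothetical illustrations, not a library API.

```python
def merge_last_write_wins(a, b):
    """Each version is (value, timestamp); the later timestamp wins."""
    return a if a[1] >= b[1] else b


def merge_cart_union(a, b):
    """Application-specific merge: keep every item added on either replica."""
    return (a[0] | b[0], max(a[1], b[1]))


# Two replicas accepted different additions to the same cart.
replica_1 = (frozenset({"book"}), 10.0)
replica_2 = (frozenset({"pen"}), 11.0)

lww = merge_last_write_wins(replica_1, replica_2)
union = merge_cart_union(replica_1, replica_2)

print(sorted(lww[0]))    # ['pen']          (the book is lost)
print(sorted(union[0]))  # ['book', 'pen']  (both additions survive)
```

The union merge keeps both additions, but it cannot express removals on its own: an item deleted on one replica reappears after the merge. That is the extra complexity custom merge logic has to absorb.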
3. Monotonic Reads
Keep reads from moving backward in time for a given user: once a user has seen a value, later reads must not return an older one.
Time 0: Alice writes value = 100
Time 1: Alice reads value (sees 100)
Time 2: Alice's next request is routed to a different replica that has not yet received the write
Without monotonic reads: Alice sees 99 (her view moves backward)
With monotonic reads: Alice still sees 100
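One possible way to provide this guarantee is for the client session to remember the newest version it has seen and skip replicas that are behind it. The sketch assumes every replica exposes a monotonically increasing version number; `VersionedReplica` and `MonotonicSession` are hypothetical names.

```python
class VersionedReplica:
    """A replica snapshot: a value plus the version it has caught up to."""

    def __init__(self, value, version):
        self.value = value
        self.version = version


class MonotonicSession:
    """Tracks the highest version this client has seen so far."""

    def __init__(self):
        self.last_seen_version = 0

    def read(self, replicas):
        # Use the first replica that is at least as fresh as our last read.
        for replica in replicas:
            if replica.version >= self.last_seen_version:
                self.last_seen_version = replica.version
                return replica.value
        raise RuntimeError("no replica is fresh enough")


fresh = VersionedReplica(100, version=2)
lagging = VersionedReplica(99, version=1)

session = MonotonicSession()
first = session.read([fresh])             # 100
second = session.read([lagging, fresh])   # Skips the lagging replica: still 100
```

A session that has not read anything yet would accept the lagging replica and return 99, which is allowed: monotonic reads constrain only what a client sees after its earlier reads.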
Trade-off Matrix
| Property | Strong Consistency | Causal | Eventual |
|---|---|---|---|
| Latency | High | Medium | Low |
| Throughput | Low | Medium | High |
| Availability | Lower | Medium | Higher |
| Reasoning Difficulty | Easy | Medium | Hard |
| Scalability | Limited | Better | Excellent |
Implementation Patterns
Read-After-Write Consistency
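Guarantee that users see their own writes immediately:
```python
import time

class ConsistencyManager:
    # Assumed upper bound on replication lag, in seconds; tune for your system
    RECENT_WRITE_WINDOW = 1.0

    def __init__(self, primary_db, replica_db):
        self.primary_db = primary_db
        self.replica_db = replica_db
        self.user_write_times = {}  # user_id -> time of that user's last write

    def write(self, user_id, key, value):
        # Write to primary (strong consistency)
        self.primary_db.put(key, value)
        # Remember when this user last wrote (monotonic clock: safe for intervals)
        self.user_write_times[user_id] = time.monotonic()

    def read(self, user_id, key):
        # If the user wrote recently, read from the primary
        last_write = self.user_write_times.get(user_id)
        if last_write is not None and time.monotonic() - last_write < self.RECENT_WRITE_WINDOW:
            return self.primary_db.get(key)
        # Otherwise, read from a replica (faster, possibly stale)
        return self.replica_db.get(key)
```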
Quorum-Based Consistency
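Require a quorum of replicas to acknowledge each write and to answer each read:
```python
class QuorumReplicaSet:
    def __init__(self, replicas):
        # Replicas are illustrative: write(key, value, version) returns True on
        # success, and read(key) returns (value, version) or None
        self.replicas = replicas

    def write(self, key, value, version, quorum_size):
        """Send the write to every replica; succeed if a quorum acknowledges"""
        acks = 0
        for replica in self.replicas:
            if replica.write(key, value, version):
                acks += 1
        return acks >= quorum_size

    def read(self, key, quorum_size):
        """Read from a quorum of replicas and return the newest value seen"""
        responses = []
        for replica in self.replicas:
            result = replica.read(key)
            if result is not None:
                responses.append(result)
            if len(responses) >= quorum_size:
                break
        if len(responses) < quorum_size:
            raise RuntimeError("read quorum not reached")
        # Highest version wins; a majority vote on values could return stale data
        return max(responses, key=lambda response: response[1])[0]
```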
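A common rule for choosing quorum sizes is W + R > N, where N is the number of replicas, W the write quorum, and R the read quorum. When it holds, every read quorum shares at least one replica with every write quorum, so a read always reaches a replica that saw the latest acknowledged write. The sketch below checks the rule by brute force; both function names are hypothetical.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """The rule of thumb: read and write quorums must overlap."""
    return w + r > n


def overlap_by_enumeration(n, w, r):
    """Check every possible write quorum against every possible read quorum."""
    replicas = range(n)
    return all(
        set(write_quorum) & set(read_quorum)
        for write_quorum in combinations(replicas, w)
        for read_quorum in combinations(replicas, r)
    )


print(quorums_overlap(3, 2, 2), overlap_by_enumeration(3, 2, 2))  # True True
print(quorums_overlap(3, 1, 1), overlap_by_enumeration(3, 1, 1))  # False False
```

With N = 3, choosing W = 2 and R = 2 tolerates one unavailable replica for both reads and writes while keeping the overlap. W = 1 and R = 1 is faster but allows a read to miss the latest write.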
Eventual Consistency with Conflict Resolution
```python
import time

class DataStore:
    def __init__(self):
        self.data = {}  # key -> (value, timestamp)

    def put(self, key, value, timestamp=None):
        """Store with timestamp for conflict resolution"""
        if timestamp is None:
            timestamp = time.time()
        self.data[key] = (value, timestamp)

    def merge_replica(self, other_store):
        """Merge another replica, resolving conflicts by timestamp"""
        for key, (value, timestamp) in other_store.data.items():
            # Last-write-wins: take the remote value if the key is new locally
            # or the remote timestamp is later. Ties keep the local value, so
            # equal timestamps need a tie-breaker for replicas to converge.
            if key not in self.data or timestamp > self.data[key][1]:
                self.data[key] = (value, timestamp)
```
Trade-off Decision Matrix
| Scenario | Consistency | Reason |
|---|---|---|
| Bank balance | Strong | Money must be accurate; overdrafts and double-spending are costly |
| Post likes | Eventual | Approximate count OK; stale count acceptable |
| Friend list | Causal | If A adds B, then B sees A (causal relationship) |
| Cache | Eventual | Stale cached data is expected |
| E-commerce inventory | Strong | Can't sell same item twice |
| User profile name | Eventual | Small delay in name change acceptable |
| Message thread | Causal | Replies must follow messages |
| Leaderboard | Eventual | Approximate rankings acceptable |
| Session token | Strong | Must validate immediately |
Monitoring and Observability
```python
import logging

def alert(message):
    # Stand-in for a real alerting hook (pager, metrics, and so on)
    logging.warning(message)

class ConsistencyMonitor:
    def __init__(self):
        self.replication_lag_ms = []  # Track lag distribution, in milliseconds
        self.divergence_count = 0     # Count of checks where replicas disagreed

    def measure_replication_lag(self, primary, replica, key):
        """Measure how far the replica is behind the primary for one key"""
        # Assumes get_version returns the time (in seconds) of the last applied write
        primary_version = primary.get_version(key)
        replica_version = replica.get_version(key)
        lag_ms = (primary_version - replica_version) * 1000
        self.replication_lag_ms.append(lag_ms)

    def alert_if_high_lag(self, threshold_ms=100):
        if not self.replication_lag_ms:
            return  # Nothing measured yet
        avg_lag = sum(self.replication_lag_ms) / len(self.replication_lag_ms)
        if avg_lag > threshold_ms:
            alert(f"High replication lag: {avg_lag:.1f}ms")

    def detect_divergence(self, replicas, key):
        """Detect whether replicas hold different values for a key"""
        values = [r.get(key) for r in replicas]
        if len(set(values)) > 1:
            self.divergence_count += 1
            alert(f"Replicas diverged: {values}")
```
Self-Check
For each scenario, decide what consistency model to use and explain why:
- User's bank balance? Strong. Money errors have severe consequences; exact balance critical.
- Social media post likes? Eventual. Approximate counts acceptable; consistency delay fine.
- User's profile name? Eventual. Small propagation delay acceptable; not mission-critical.
- Shopping cart contents? Strong/Causal. Users must see their additions immediately.
- Product reviews? Eventual. Slight delay in review appearing acceptable.
- Authentication token validation? Strong. Must immediately reject revoked tokens.
- Recommendation feed? Eventual. Stale recommendations acceptable; data doesn't need to match perfectly.
- Distributed lock? Strong. Lock coordination requires immediate visibility.
Consistency isn't binary; it's a spectrum. Stronger consistency costs latency and availability. Choose the weakest consistency model that's correct for your use case.
Next Steps
- Handle Failures: Read Partition Tolerance and Failure Modes
- Enable Retries: Learn about Idempotency
- Implement Communication: Explore API Styles