Strategic Decomposition to Services
Break monoliths into services aligned with bounded contexts
TL;DR
Strategic decomposition breaks monoliths into services aligned with bounded contexts (a DDD principle). Identify service boundaries based on business capability, not technical layers. Use the Strangler Pattern for safe, incremental migration: build the new service alongside the monolith, route requests through an API Gateway, gradually shift load, then decommission the old code. Benefits: independent scaling, team autonomy, deployment velocity. Risks: increased complexity, eventual consistency, distributed debugging. Plan 3-6 months per service for a proper decomposition.
Learning Objectives
- Identify bounded contexts and service boundaries using DDD
- Map monolith code to contexts and dependencies
- Prioritize decomposition: which services first (ROI analysis)
- Apply Strangler Pattern for safe, incremental migration
- Manage data consistency and ownership across service boundaries
- Design APIs, contracts, and event schemas
- Monitor and validate decomposition progress
- Handle rollback and failure scenarios
Motivation: When to Decompose
Decompose when the monolith becomes:
- Hard to change: Multiple teams step on each other; deployment coordination pain
- Slow to deploy: Small change requires full testing, deployment; 30+ minutes from code to prod
- Uneven scaling: One feature needs 10x more instances; forced to scale entire monolith
- Recruitment bottleneck: New engineers scared of touching monolith; high onboarding time
- Technology blocker: Want to use different languages/frameworks for different domains
Do NOT decompose:
- Small, fast-changing systems (simplicity > benefits)
- Early-stage startups (product-market fit unclear)
- When you don't understand your domain well (boundaries fuzzy)
- Single team owning everything (no team scaling benefit)
Domain Analysis: Identifying Bounded Contexts
Before coding, understand your domains:
- Domain Analysis Process
- Real-World Example: E-commerce
Step 1: List Business Capabilities
e-commerce platform capabilities:
- Product Catalog (browse, search, recommendations)
- Orders (create, track, history)
- Payments (authorize, capture, refund)
- Shipping (calculate cost, track)
- Inventory (stock, reservations)
- Notifications (email, SMS, push)
- Customer Accounts (profile, preferences)
- Reviews & Ratings (CRUD, moderation)
Step 2: Group into Bounded Contexts
Each context has:
- Clear responsibility (one reason to change)
- Distinct ubiquitous language
- Own data model
- Limited external dependencies
Initial Contexts:
1. Catalog Context (Products, Search, Recommendations)
2. Order Context (Orders, Order Items)
3. Payment Context (Payments, Transactions)
4. Shipping Context (Shipments, Tracking)
5. Inventory Context (Stock, Reservations)
6. Notification Context (Email, SMS, Push)
7. Customer Context (Accounts, Profiles)
Step 3: Analyze Dependencies
Order Context
├─ depends on: Inventory (check stock)
├─ depends on: Product Catalog (product details)
├─ depends on: Customer (shipping address)
├─ publishes: OrderCreated, OrderShipped
├─ subscribes to: PaymentConfirmed
└─ risk: HIGH (many dependencies)
Notification Context
├─ depends on: none (subscribes to events)
├─ publishes: none
└─ risk: LOW (loosely coupled)
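The dependency analysis above can be sketched as a small script. A minimal sketch, assuming the context names and dependency map from this e-commerce example; the risk thresholds (0-1 synchronous dependencies = LOW, 2 = MEDIUM, 3+ = HIGH) are illustrative, not a standard.

```python
# Sketch: rank bounded contexts by coupling risk.
# Context data mirrors the Order/Notification example above; the
# thresholds in coupling_risk() are illustrative assumptions.

CONTEXTS = {
    "Order":        {"depends_on": ["Inventory", "Catalog", "Customer"],
                     "publishes": ["OrderCreated", "OrderShipped"],
                     "subscribes": ["PaymentConfirmed"]},
    "Notification": {"depends_on": [],
                     "publishes": [],
                     "subscribes": ["OrderCreated", "PaymentConfirmed"]},
}

def coupling_risk(context: dict) -> str:
    """Classify extraction risk by number of synchronous dependencies."""
    deps = len(context["depends_on"])
    if deps >= 3:
        return "HIGH"
    if deps == 2:
        return "MEDIUM"
    return "LOW"

for name, ctx in CONTEXTS.items():
    print(f"{name}: {len(ctx['depends_on'])} deps -> risk {coupling_risk(ctx)}")
```

Event subscriptions are deliberately excluded from the risk count: asynchronous coupling is far cheaper to break than synchronous calls, which is why the Notification context scores LOW despite subscribing to several events.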
Step 4: Calculate Extraction Cost vs Benefit
Metrics:
- Code size (lines extractable from monolith)
- Coupling to others (dependencies)
- Deployment frequency needed
- Scaling requirements
- Team size assigned to context
- Business value
Scoring:
┌───────────────┬──────┬────────┬──────┬──────────┐
│ Service       │ ROI  │ Effort │ Risk │ Order    │
├───────────────┼──────┼────────┼──────┼──────────┤
│ Payments      │ HIGH │ MED    │ MED  │ ✓ First  │
│ Notifications │ MED  │ LOW    │ LOW  │ ✓ Second │
│ Shipping      │ MED  │ HIGH   │ HIGH │ ✓ Third  │
│ Inventory     │ HIGH │ HIGH   │ HIGH │ ✗ Later  │
└───────────────┴──────┴────────┴──────┴──────────┘
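The scoring table can be turned into a first-pass ranking. A minimal sketch: the ratings come from the table above, but the numeric weights are illustrative assumptions, and a score like this is only a starting point — the final order in the table also weighs dependency complexity, which is why Inventory is deferred despite its high ROI.

```python
# Sketch: score extraction candidates so "HIGH value, LOW effort" sorts first.
# Ratings come from the table above; the weights are illustrative assumptions.

LEVEL = {"LOW": 1, "MED": 2, "HIGH": 3}

def extraction_score(roi: str, effort: str, risk: str) -> int:
    """Higher score = extract earlier: reward ROI, penalize effort and risk."""
    return LEVEL[roi] * 2 - LEVEL[effort] - LEVEL[risk]

candidates = {
    "Payments":      ("HIGH", "MED",  "MED"),
    "Notifications": ("MED",  "LOW",  "LOW"),
    "Shipping":      ("MED",  "HIGH", "HIGH"),
    "Inventory":     ("HIGH", "HIGH", "HIGH"),
}

ranked = sorted(candidates, key=lambda s: extraction_score(*candidates[s]),
                reverse=True)
print(ranked)  # Payments and Notifications lead; high-effort services trail
```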
Step 5: Prioritize Extraction Order
Ideal services to extract first:
1. HIGH value, LOW effort
2. Loosely coupled
3. HIGH business impact
4. Clear boundaries
5. High deployment frequency
MONOLITH TODAY (Single codebase):
src/
├─ domain/
│ ├─ order/
│ │ ├─ Order.java
│ │ ├─ OrderService.java
│ │ ├─ OrderRepository.java
│ │ └─ OrderStatus.java (enum)
│ ├─ payment/
│ │ ├─ Payment.java
│ │ ├─ PaymentService.java ← Logic: authorize, capture, refund
│ │ ├─ PaymentGatewayAdapter.java ← Stripe/Square integration
│ │ └─ Transaction.java
│ ├─ inventory/
│ │ ├─ Stock.java
│ │ ├─ InventoryService.java
│ │ └─ Reservation.java
│ ├─ shipping/
│ │ ├─ Shipment.java
│ │ ├─ ShippingService.java
│ │ └─ TrackingNumber.java
│ └─ notification/
│ ├─ Email.java
│ ├─ NotificationService.java
│ └─ EmailTemplate.java
├─ api/
│ └─ Controllers
├─ db/
│ └─ schema.sql
└─ config/
└─ Application.yml
DECOMPOSITION PLAN:
Phase 1 (Month 1): Extract Payment Service
New Repository: payment-service/
├─ src/main/java/com/company/payment/
│ ├─ domain/
│ │ ├─ Payment.java
│ │ ├─ PaymentService.java
│ │ └─ PaymentGatewayAdapter.java
│ ├─ api/
│ │ └─ PaymentController.java (REST API)
│ ├─ event/
│ │ ├─ PaymentConfirmed.java (publish)
│ │ └─ OrderCreated.java (subscribe)
│ ├─ db/
│ │ └─ payments_schema.sql (separate database)
│ └─ config/
│ └─ application.yml
└─ tests/
Phase 2 (Month 2-3): Extract Notification Service
New Repository: notification-service/
├─ Subscribes to events (OrderCreated, PaymentConfirmed, etc.)
├─ No external dependencies on order/payment/inventory
├─ Separate database (minimal)
└─ Template engine, email provider integration
Phase 3 (Month 3-4): Extract Shipping Service
New Repository: shipping-service/
├─ Depends on: Orders (by reference)
├─ Publishes: ShipmentCreated, ShipmentDispatched, DeliveryConfirmed
├─ Subscribes to: OrderConfirmed, PaymentConfirmed
└─ Calls: Shipping provider API (UPS, FedEx, DHL)
Remaining in Monolith:
- Order Service (complex, many dependencies)
- Inventory Service (complex, many interactions)
- Catalog Service
- Customer Service
Strangler Pattern: Incremental Migration
Safe migration in phases:
- Phase 1: Build & Test (Weeks 1-4)
- Phase 2: Route Test Traffic (Weeks 5-6)
- Phase 3: Data Migration (Weeks 7-10)
- Phase 4: Gradual Cutover (Weeks 11-14)
Build New Payment Service:
✓ Greenfield implementation
✓ All unit + integration tests
✓ Own database (separate schema)
✓ REST API matching monolith's payment API
✓ Event publisher for PaymentConfirmed
✓ Ready for requests, but not receiving any yet
Monolith Status:
✓ Payment code still present
✓ No awareness of new service
✓ Handling all payment requests
✓ Fully operational
Risk Assessment:
✓ ZERO impact on production (new service not live)
✓ Only risk: development/testing
Timeline:
Week 1: Design API, database schema, architecture
Week 2: Implement core payment logic
Week 3: Implement payment gateway integration
Week 4: Test, fix bugs, performance tune
Deliverable:
✓ Payment Service artifact (Docker image, JAR, etc.)
✓ Documentation
✓ Test coverage 80%+
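The "REST API matching monolith's payment API" requirement above can be checked with a minimal contract test before any traffic is routed. A sketch under assumptions: the field names and the two response dicts are illustrative stand-ins for real captured responses.

```python
# Sketch: a minimal contract check that the new Payment Service returns
# the same shape as the monolith's payment API. Field names and response
# contents are illustrative assumptions.

MONOLITH_RESPONSE = {"id": "pay_1", "orderId": "ord_9", "status": "CAPTURED",
                     "amountCents": 4999, "currency": "USD"}

SERVICE_RESPONSE = {"id": "pay_1", "orderId": "ord_9", "status": "CAPTURED",
                    "amountCents": 4999, "currency": "USD"}

def contract_diff(old: dict, new: dict) -> list[str]:
    """Return human-readable differences; an empty list means compatible."""
    problems = []
    for field in old:
        if field not in new:
            problems.append(f"missing field: {field}")
        elif type(old[field]) is not type(new[field]):
            problems.append(f"type changed: {field}")
    return problems

print(contract_diff(MONOLITH_RESPONSE, SERVICE_RESPONSE))  # [] = compatible
```

In practice the "old" side would be recorded monolith responses (or an OpenAPI spec) rather than a hand-written dict; the check only needs to be one-directional, since the new service may add fields without breaking existing consumers.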
Deploy Payment Service:
✓ Separate container/instance
✓ Separate database
✓ Integration with monitoring/logging
API Gateway (NEW COMPONENT):
✓ Intercepts all payment requests
✓ Decision logic (pseudocode):
    if request.path matches '/payments/*':
        if random() < 0.10:    # 10% to new service
            forward to payment-service
        else:                  # 90% to monolith (production)
            forward to monolith
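The gateway decision logic above can be sketched as a small routing function. The `route()` function and the `payment-service`/`monolith` target names are illustrative; real gateways express the same split as configuration rather than code.

```python
# Sketch: percentage-based routing as the API Gateway might perform it.
# The function and target names are illustrative assumptions; the rng
# parameter exists only so the branch taken can be pinned in tests.
import random

def route(path: str, service_traffic_pct: int, rng=random.random) -> str:
    """Send a configurable slice of /payments/* traffic to the new service."""
    if path.startswith("/payments/"):
        if rng() * 100 < service_traffic_pct:
            return "payment-service"
        return "monolith"
    return "monolith"

print(route("/payments/123", 10, rng=lambda: 0.05))  # below 10% -> payment-service
print(route("/payments/123", 10, rng=lambda: 0.50))  # above 10% -> monolith
print(route("/orders/7", 10))                        # non-payment path -> monolith
```

Raising the percentage later (25%, 50%, ...) is then a single config change, which is exactly what makes the gradual cutover in Phase 4 cheap to adjust and to roll back.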
Traffic Flow:
Client Request
├─ API Gateway decision
├─ 10% → Payment Service (test traffic)
│ └─ New database, new code
└─ 90% → Monolith (production traffic)
└─ Old database, old code
Validation:
✓ Payment Service receives requests
✓ Handles 10% of payment volume
✓ Compare payment service response vs monolith
✓ Alert if error rates differ >1%
✓ Verify database consistency
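The ">1% error-rate divergence" alert from the validation list can be sketched directly. The sample counts in the example calls are illustrative assumptions.

```python
# Sketch: alert when the new service's error rate diverges from the
# monolith's by more than 1%, per the validation checklist above.
# Sample counts are illustrative assumptions.

def error_rate(errors: int, total: int) -> float:
    return errors / total if total else 0.0

def should_alert(mono_errors: int, mono_total: int,
                 svc_errors: int, svc_total: int,
                 threshold: float = 0.01) -> bool:
    """True when the absolute error-rate gap exceeds the threshold (1%)."""
    divergence = abs(error_rate(svc_errors, svc_total)
                     - error_rate(mono_errors, mono_total))
    return divergence > threshold

print(should_alert(9, 9000, 3, 1000))   # 0.1% vs 0.3%: within 1%, no alert
print(should_alert(9, 9000, 25, 1000))  # 0.1% vs 2.5%: alert
```

Comparing rates rather than absolute counts matters here because the service only sees 10% of the volume; raw error counts would make the monolith look worse by default.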
Timeline:
Day 1: Deploy service + gateway
Day 2-7: Monitor, validate, fix issues
Day 8: Increase traffic to 25% if stable
Rollback:
If payment-service fails: immediately route 100% back to monolith
Sync Historical Data:
Copy all payments from monolith DB to service DB:
1. Dump payments from monolith (ORDER BY created_at)
2. Load into payment service database
3. Verify count matches (select count(*) both sides)
4. Validate checksums (CRC, hash)
Bidirectional Sync (Running in Parallel):
Monolith → Service:
✓ On each payment created/updated in monolith
✓ Async event published to message queue
✓ Payment service consumes event
✓ Writes to its database
Service → Monolith (reverse sync):
✓ May need to update monolith for compatibility
✓ Service publishes PaymentConfirmed
✓ Monolith consumes and updates its record
Validation:
✓ Compare row counts (monolith vs service)
✓ Spot-check 100 random payments
✓ Verify no payments lost
✓ Check consistency of state
Timeline:
Week 7: Prepare data, plan sync
Week 8: Initial copy, validate
Week 9: Run parallel, monitor divergence
Week 10: Fix any inconsistencies
Risks:
✗ Data loss if sync fails
✗ Duplicate payments if not idempotent
✓ Mitigation: test in staging, backup databases
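The "duplicate payments if not idempotent" risk above is usually mitigated with an idempotent consumer. A minimal sketch, assuming each event carries a unique `event_id`; the in-memory dedupe set stands in for a persistent store.

```python
# Sketch: idempotent event consumption -- redeliveries of the same
# event id become safe no-ops. The event shape and in-memory stores
# are illustrative assumptions (production would persist the ids).

processed_ids: set[str] = set()
applied_payments: list[str] = []

def handle_payment_event(event: dict) -> bool:
    """Apply the event exactly once, keyed on its unique event id."""
    if event["event_id"] in processed_ids:
        return False                      # duplicate delivery: ignore
    processed_ids.add(event["event_id"])
    applied_payments.append(event["payment_id"])
    return True

evt = {"event_id": "evt-42", "payment_id": "pay-1"}
print(handle_payment_event(evt))  # first delivery: applied
print(handle_payment_event(evt))  # redelivery: rejected
print(applied_payments)
```

In a real system the dedupe check and the write would happen in one transaction (or use a unique constraint on the event id), otherwise a crash between the two steps reintroduces the duplicate.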
Gradual Traffic Migration:
Week 11 (Day 1-7): 10% → service, 90% → monolith
Week 11 (Day 8-14): 25% → service, 75% → monolith
Week 12 (Day 1-7): 50% → service, 50% → monolith
Week 12 (Day 8-14): 75% → service, 25% → monolith
Week 13 (Day 1-7): 90% → service, 10% → monolith
Week 13 (Day 8-14): 100% → service, 0% → monolith
Week 14: Monitor for 2 weeks
API Gateway Config:
```yaml
routes:
  - path: /payments/*
    service: payment-service
    traffic: 100        # percentage
    fallback: monolith  # if service errors
```
Monitoring During Cutover:
✓ Payment success rate (target: 99.9%+)
✓ Latency p99 (target: under 500ms)
✓ Error rates by type
✓ Database consistency (monolith vs service)
✓ Alert thresholds:
- Error rate > 1%: alert
- Latency p99 > 1s: alert
- Consistency divergence: alert immediately
Rollback Procedure:
If issues detected at any traffic %:
- Immediate: revert traffic % back to 0%
- Debug: compare monolith vs service responses
- Fix: patch service or revert code
- Retry: once fix validated in staging
Real-World Risks Mitigated:
✓ Data consistency: running in parallel
✓ Service bugs: caught at low traffic %
✓ Performance: measured under real load
✓ Dependencies: validated with real requests
Phase 5: Decommission (Weeks 15-16)
Monolith Cleanup:
Week 15:
✓ Confirm 100% traffic on service
✓ Monitor for 1 week (stable)
✓ Verify no monolith fallback hits
✓ Remove payment logic from monolith code
Week 16:
✓ Archive old payment code (git tag)
✓ Update documentation
✓ Decommission old payment database
✓ Close monitoring/alerting on old code
Knowledge Transfer:
✓ Team now owns payment-service
✓ Document API contracts
✓ Runbook for common issues
✓ Post-mortem: lessons learned
Next Service:
✓ Apply learnings to Notification Service
✓ Process becomes faster (2-3 months)
✓ Tools/patterns established
Data Ownership and Consistency
Critical decisions when decomposing:
- Data Ownership Rules
- Saga Pattern (Distributed Transactions)
Rule 1: One Service Owns Data
✓ Payment Service owns payments table
✓ Only Payment Service writes to payments
✓ Other services READ via API (not direct DB access)
Rule 2: No Shared Database
✗ BAD: Order Service and Payment Service share schema
✓ GOOD: Each has separate database
Rule 3: Access Data via API or Events
Order Service wants to know if payment succeeded:
✓ GOOD: Call Payment Service API
GET /payments/{paymentId}
✓ GOOD: Subscribe to PaymentConfirmed event
when PaymentConfirmed(orderId, success)
update order.paymentStatus = success
✗ BAD: Query payments table directly
SELECT * FROM payments WHERE order_id = ?
Rule 4: Manage Consistency Explicitly
Order + Payment are eventually consistent:
- Create Order (state: PENDING_PAYMENT)
- Request Payment Service
- If Payment fails: Order state becomes PAYMENT_FAILED
- If Payment succeeds: Order state becomes CONFIRMED
- Payment Service publishes PaymentConfirmed
- Order Service receives and marks CONFIRMED
Transactional consistency (ACID) LOST
Eventual consistency (all updates eventually propagate) GAINED
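Rules 3 and 4 together look like this in code: the Order Service learns the payment outcome from the `PaymentConfirmed` event, never by querying the payments table. The in-memory order store and the transition guard are illustrative assumptions.

```python
# Sketch: eventually-consistent order status driven by payment events.
# The orders store is an illustrative stand-in for the Order Service's
# own database; state names follow the example above.

orders = {"ord-1": {"status": "PENDING_PAYMENT"}}

def on_payment_confirmed(order_id: str, success: bool) -> str:
    """Transition PENDING_PAYMENT -> CONFIRMED or PAYMENT_FAILED."""
    order = orders[order_id]
    if order["status"] != "PENDING_PAYMENT":
        return order["status"]            # stale or duplicate event: ignore
    order["status"] = "CONFIRMED" if success else "PAYMENT_FAILED"
    return order["status"]

print(on_payment_confirmed("ord-1", success=True))   # CONFIRMED
print(on_payment_confirmed("ord-1", success=False))  # duplicate ignored; stays CONFIRMED
```

The guard on the current state is what makes the handler safe under the at-least-once delivery most message brokers provide: a replayed or out-of-order event cannot move a confirmed order backwards.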
Problem: Order creation needs Order + Payment + Inventory all succeed
Order Created Event
├─ Order Service creates order (state: PENDING)
│
└─ Saga Orchestrator (or Choreography)
├─ Step 1: Request Payment Service
│ └─ PaymentService.authorize(orderId, amount)
│ └─ Publish: PaymentAuthorized
│
├─ Step 2: Request Inventory Service
│ └─ InventoryService.reserve(orderId, items)
│ └─ Publish: InventoryReserved
│
└─ Step 3: Confirm Order
└─ OrderService.confirm(orderId)
└─ state = CONFIRMED
Failure Handling (Compensation):
If Payment fails:
✓ Publish PaymentFailed
✓ Saga detects failure
✓ Release inventory reservation (compensation)
✓ Mark order as FAILED
✓ Customer receives notification
If Inventory fails:
✓ Release payment reservation (compensation)
✓ Mark order as FAILED
✓ Customer can retry with reduced quantity
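The orchestrated saga with compensation can be sketched as follows. The service calls are stand-in flags rather than real API calls; in practice these steps would be commands handled by a workflow engine such as Temporal or Step Functions.

```python
# Sketch: orchestrated saga for order creation with compensation.
# pay_ok/stock_ok stand in for real Payment/Inventory service calls;
# the compensation strings stand in for real undo commands.

class StepFailed(Exception):
    pass

def run_order_saga(order_id: str, pay_ok: bool, stock_ok: bool) -> str:
    """Run payment then inventory; on failure, undo completed steps in reverse."""
    compensations = []   # undo actions for steps that already succeeded
    try:
        if not pay_ok:
            raise StepFailed("payment declined")
        compensations.append(f"refund payment for {order_id}")

        if not stock_ok:
            raise StepFailed("out of stock")
        compensations.append(f"release reservation for {order_id}")

        return "CONFIRMED"
    except StepFailed:
        for undo in reversed(compensations):
            print("compensating:", undo)
        return "FAILED"

print(run_order_saga("ord-1", pay_ok=True, stock_ok=True))   # CONFIRMED
print(run_order_saga("ord-2", pay_ok=True, stock_ok=False))  # refunds, then FAILED
```

Note that each step registers its compensation only after it succeeds, so a failure at step N undoes exactly steps 1..N-1 — the invariant that replaces the atomicity a single-database transaction used to provide.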
Tools:
- Temporal (workflow orchestration)
- Axon Framework (event sourcing + sagas)
- AWS Step Functions
- Custom event choreography
Common Pitfalls and How to Avoid Them
Decomposition Checklist
- Are service boundaries clearly aligned with bounded contexts?
- Have you analyzed dependencies between contexts?
- Is data ownership clear (which service owns which tables)?
- Are inter-service dependencies documented and minimal?
- Do services communicate via APIs and events (NOT shared DB)?
- Is the strangler pattern planned (don't rewrite everything)?
- Is data migration strategy documented (copy, sync, validate)?
- Are monitoring and alerts in place for the new service?
- Is rollback procedure tested and documented?
- Can each service be deployed independently?
- Do teams understand their service ownership and dependencies?
- Is the next service already being analyzed for decomposition?
Decision Questions
Before decomposing a service, ask:
- Is this bounded context stable? (Will requirements change frequently?)
- Will this service be deployed frequently? (Multiple times per week?)
- Does this need independent scaling? (Different traffic patterns than monolith?)
- Are the dependencies minimal? (How many other services must it call?)
- Can a team own this independently? (Clear responsibility, success metrics?)
- What's the ROI? (Developer velocity gain vs operational complexity?)
If YES to 4+ questions: decompose. If NO to most: stay in monolith longer.
One Takeaway
Decompose strategically based on business domains (DDD), not technical layers. Use the Strangler Pattern to migrate safely and incrementally. Start with loosely-coupled, high-value services. Invest in monitoring and data consistency strategies before moving traffic. Decomposition is a 6-12 month investment for most systems; plan accordingly.
Next Steps
- Read Domain-Driven Design by Eric Evans (bounded contexts foundation)
- Study Building Microservices by Sam Newman (Strangler Pattern details)
- Practice domain storytelling to identify contexts
- Plan your first service decomposition (2-3 months timeline)
- Set up API Gateway and event bus infrastructure
References
- Evans, E. (2003). Domain-Driven Design. Addison-Wesley.
- Newman, S. (2015). Building Microservices. O'Reilly.
- Fowler, M. (2004). Strangler Fig Application. https://martinfowler.com/bliki/StranglerFigApplication.html
- Temporal Workflows: https://temporal.io/
- Event Sourcing: https://martinfowler.com/eaaDev/EventSourcing.html