Using multiple databases and data storage technologies in a single application, each optimized for specific use cases.
One Size Doesn't Fit All!
┌────────────────────────────────┐
│ MySQL Database │
│ (Everything stored as tables) │
└────────────────────────────────┘
↓
User Data (tables) ✅ Good
Order History (tables) ✅ Good
Session Cache (tables) ❌ Slow!
Search Index (tables) ❌ Inefficient!
Real-time Notifications (tables) ❌ Not suitable!
Graph Relations (tables) ❌ Terrible for graphs!
- Sessions stored as rows → Slow lookups
- Full-text search → Requires complex queries
- Graph relationships → Multiple JOINs (expensive)
- Real-time notifications → Polling (inefficient)
- Time-series data → Poor aggregation performance
Application
↓
┌──────────────┼──────────────┐
↓ ↓ ↓
MySQL PostgreSQL MongoDB
(Users) (Transactions) (Orders)
↓ ↓ ↓
Redis Elasticsearch Cassandra
(Cache) (Search) (Time-series)
↓ ↓
Neo4j InfluxDB
(Graphs) (Metrics)
Databases: MySQL, PostgreSQL, Oracle
// Structured data with relationships
User:
- user_id (PK)
- name
- email
Order:
- order_id (PK)
- user_id (FK)
- amount
Query:
SELECT u.name, o.amount FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.id = 123;Use Cases:
- ✅ User accounts (structured, relational)
- ✅ Transactions (ACID needed)
- ✅ Financial records
- ✅ Inventory management
Advantages:
- ACID transactions
- Complex queries with JOINs
- Data integrity with constraints
- Standardized SQL
Disadvantages:
- Difficult to scale horizontally
- Fixed schema (not flexible)
- Not ideal for unstructured data
- Slow for large scale reads
Flexible schema, JSON-like documents
// Flexible unstructured data
{
_id: ObjectId(),
name: "Product X",
description: "...",
tags: ["electronics", "gadgets"],
ratings: [4.5, 4.8, 4.2],
warehouse: {
location: "NYC",
quantity: 100
}
}Use Cases:
- ✅ Orders (flexible structure)
- ✅ Product catalogs (varying attributes)
- ✅ User profiles (different data per user)
- ✅ Logs and events
Advantages:
- Flexible schema
- Horizontal scaling (sharding)
- Fast writes
- Nested data support
Disadvantages:
- Limited transaction support (early versions)
- No JOINs (denormalization needed)
- Larger document sizes
- Eventually consistent
Ultra-fast, in-memory storage
SET user:123:name "John" // O(1) operation
GET user:123:name // "John"
HSET user:123 name "John" age 30
HGET user:123 name // "John"
// Perfect for caching
SET cache:product:456 "{...product data...}" EX 3600Use Cases:
- ✅ Session storage
- ✅ Cache layer
- ✅ Real-time counters
- ✅ Leaderboards
- ✅ Rate limiting
Advantages:
- Ultra-fast (in-memory)
- Simple operations
- Atomic operations
- TTL support
Disadvantages:
- Data lost on restart (unless persistence)
- Limited query capabilities
- Memory limited
- Not suitable for large data
Optimized for full-text search and analytics
// Indexing
PUT /products/_doc/1
{
"name": "iPhone 14",
"brand": "Apple",
"description": "Latest smartphone..."
}
// Searching
GET /products/_search
{
"query": {
"match": {
"description": "smartphone"
}
}
}
// Millisecond response for millions of documents!
Use Cases:
- ✅ Product search
- ✅ User search
- ✅ Log analysis
- ✅ Analytics
Advantages:
- Fast full-text search
- Aggregations and analytics
- Powerful query DSL
- Horizontal scaling
Disadvantages:
- Eventual consistency
- Complex setup
- Resource-heavy
- Not transactional
Optimized for time-stamped data points
User login events:
├─ 10:00:01 UTC → user_123 logged in
├─ 10:00:15 UTC → user_456 logged in
├─ 10:01:00 UTC → user_789 logged in
Queries:
- How many logins per minute?
- CPU usage over 24 hours?
- Request latency trend?
Use Cases:
- ✅ Metrics (CPU, memory, latency)
- ✅ Monitoring data
- ✅ Application performance
- ✅ Stock prices
Advantages:
- Efficient storage for time series
- Fast aggregations
- Built-in downsampling
- Time-window queries
Disadvantages:
- Only for time-stamped data
- Not suitable for relational queries
- Limited transaction support
Optimized for relationships
Social Network:
User A -[FOLLOWS]-> User B
User B -[FOLLOWS]-> User C
User C -[FOLLOWS]-> User A
Queries:
MATCH (a:User)-[:FOLLOWS*2]->(c:User)
WHERE a.name = 'Alice'
RETURN c.name; // Friends of friends
// SQL would need multiple JOINs
// Neo4j does it efficiently!
Use Cases:
- ✅ Social networks
- ✅ Recommendation engines
- ✅ Knowledge graphs
- ✅ Fraud detection
Advantages:
- Efficient relationship traversal
- Pattern matching queries
- ACID support
- Complex relationship queries fast
Disadvantages:
- Learning curve (Cypher query language)
- Smaller ecosystem
- Scaling challenges
- Smaller community
API Gateway
↓
┌─────────────────────┼─────────────────────┐
│ │ │
▼ ▼ ▼
┌────────────┐ ┌──────────┐ ┌──────────┐
│ PostgreSQL │ │ MongoDB │ │ Redis │
│ (ACID) │ │(Flexible)│ │ (Cache) │
└────────────┘ └──────────┘ └──────────┘
Users & Txns Orders & KYC Sessions & Cache
- User accounts - Order history - Session data
- Wallet balance - User profiles - Rate limiting
- Transactions - Flexible schema - OTP cache
↓ ↓ ↓
┌────────────┐ ┌──────────┐ ┌──────────┐
│Elasticsearch│ │ Cassandra│ │ InfluxDB │
│ (Search) │ │(Time-seq)│ │ (Metrics)│
└────────────┘ └──────────┘ └──────────┘
Search Event stream Metrics & Logs
- Merchant search - Payment logs - API response time
- Transaction hist - Audit trail - User activity
- Payment history - High volume - System health
↓
┌────────────┐
│ Neo4j │
│ (Graphs) │
└────────────┘
Recommendations
- Similar users
- Fraud patterns
- Merchant relations
✅ User data → PostgreSQL (relational, ACID)
✅ Orders → MongoDB (flexible, scalable)
✅ Cache → Redis (fast access)
✅ Search → Elasticsearch (full-text)
✅ Metrics → InfluxDB (time-series)
Each service owns its database:
- User Service → PostgreSQL + Redis
- Order Service → MongoDB + Elasticsearch
- Analytics Service → InfluxDB + Cassandra
When orders placed:
1. Write to MongoDB (fast, async)
2. Publish event to Kafka
3. Elasticsearch updates (eventual consistency)
4. Analytics records in InfluxDB
All services use same events for sync
Metrics tell you if right choice:
- Query latency too high? Might need caching
- Storage growing? Might need compression
- Consistency issues? Might need stronger guarantees
| Aspect | Single DB | Polyglot |
|---|---|---|
| Complexity | Low | High |
| Performance | Good for some, bad for others | Optimized for each use case |
| Scaling | Limited | Per-database scaling |
| Consistency | Easy (ACID) | Eventually consistent |
| Flexibility | Rigid schema | Flexible |
| Operations | Simple | Complex (many systems) |
✅ Use Polyglot when:
- Your app has different data patterns
- You need high performance across multiple uses
- Large scale (different needs)
- Team expertise in multiple databases
- Microservices architecture
❌ Don't use Polyglot if:
- Simple CRUD application
- Single data type
- Small scale (one DB is sufficient)
- Team unfamiliar with multiple DBs
- Startup (keep simple initially)
"One database to rule them all" ❌ "Right tool for the right job" ✅
Different data types need different optimizations:
- Transactions → PostgreSQL (ACID)
- Flexible data → MongoDB (schema-less)
- Speed → Redis (cache)
- Search → Elasticsearch (indexed)
- Time-series → InfluxDB (metrics)
- Relationships → Neo4j (graphs)
Choose based on your actual needs, not hype!