The sample answer below was generated by Claude (chat reference).
Step 1: What the System Does (The 3 Core Functions)
Our system needs to:
Accept messages - User sends a message with a delay (e.g., “send this in 2 hours”)
Wait and deliver - System waits exactly that long, then sends the message
Allow cancellation - User can cancel before it’s sent
Step 2: High-Level Architecture (The Big Picture)
┌─────────┐
│  User   │
└────┬────┘
     │
     ▼
┌─────────────────┐
│   API Server    │ ◄── Handles schedule/cancel requests
└────┬────────────┘
     │
     ├─────────────┬─────────────┐
     ▼             ▼             ▼
┌──────────┐  ┌──────────┐  ┌──────────┐
│ Database │  │  Redis   │  │ Workers  │ ◄── Deliver messages
│(Postgres)│  │ (Queue)  │  │   Pool   │
└──────────┘  └──────────┘  └─────┬────┘
                                  │
                                  ▼
                        ┌──────────────────┐
                        │  3rd Party API   │
                        │ (Final delivery) │
                        └──────────────────┘
What each component does:
API Server: Receives HTTP requests (schedule/cancel)
Database: Stores all message data permanently
Redis: Quick lookup of “what messages need to be sent NOW”
Workers: Background processes that check Redis and deliver messages
Step 3: The Flow - Schedule a Message
User Request:
POST /schedule
{
  "content": "Happy Birthday!",
  "delay_seconds": 7200   // 2 hours
}
What happens:
1. API Server receives request
↓
2. Calculate delivery time = now + 7200 seconds = 2:00 PM
↓
3. Save to Database:
   {
     id: "msg-123",
     content: "Happy Birthday!",
     scheduled_time: "2:00 PM",
     status: "PENDING"
   }
↓
4. Add to Redis sorted set:
   ZADD pending_messages 1699891200 "msg-123"
   ^                     ^          ^
   |                     |          |
   command               timestamp  message ID
                         (as score)
↓
5. Return to user: {"message_id": "msg-123"}
Why Redis Sorted Set? A Redis sorted set stores items with a “score”. We use the delivery timestamp as the score. This lets us quickly ask: “Which messages have timestamp <= now?”
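The sorted-set idea can be sketched without Redis at all: keep (member, score) pairs ordered by score, and answer "everything with score <= now" by scanning from the front. A minimal in-memory sketch (the `zadd`/`zrangebyscore` helpers here are illustrative stand-ins for Redis's ZADD/ZRANGEBYSCORE, not a real client):

```go
package main

import (
	"fmt"
	"sort"
)

// entry mirrors one member of a Redis sorted set: a message ID plus its
// score (here, the delivery timestamp in epoch seconds).
type entry struct {
	member string
	score  int64
}

// zadd inserts a member with its score, keeping the set ordered (like ZADD).
func zadd(set *[]entry, score int64, member string) {
	*set = append(*set, entry{member, score})
	sort.Slice(*set, func(i, j int) bool { return (*set)[i].score < (*set)[j].score })
}

// zrangebyscore returns members with score <= max,
// like ZRANGEBYSCORE key 0 max.
func zrangebyscore(set []entry, max int64) []string {
	var due []string
	for _, e := range set {
		if e.score > max {
			break // set is sorted, so nothing later can be due
		}
		due = append(due, e.member)
	}
	return due
}

func main() {
	var pending []entry
	zadd(&pending, 1699891200, "msg-123") // due at 2:00 PM
	zadd(&pending, 1699894800, "msg-456") // due an hour later

	// "Which messages have timestamp <= now?"
	fmt.Println(zrangebyscore(pending, 1699891200)) // [msg-123]
}
```

Because the set is kept sorted by score, the due-message query is cheap even with millions of pending entries, which is exactly why Redis's ZRANGEBYSCORE is a good fit here.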
Step 4: The Flow - Worker Delivers Message
Worker runs every 1 second:
1. Get current time: 2:00:00 PM
↓
2. Ask Redis: "Give me messages with timestamp <= 2:00:00 PM"
ZRANGEBYSCORE pending_messages 0 1699891200 LIMIT 0 100
Returns: ["msg-123", "msg-456", ...]
↓
3. For each message ID:
a. Get full message from Database
b. Check status is still "PENDING"
c. Call 3rd party API to deliver
d. If success: update status to "DELIVERED"
e. If fail: retry (up to 5 times)
↓
4. Remove from Redis:
ZREM pending_messages "msg-123"
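The worker's per-tick algorithm above can be sketched as one function. This is a simplified in-memory model, not the production loop: `queue` stands in for the Redis sorted set, `store` for the database, and the 3rd-party call and retries are elided (the names `processTick`, `Message`, and the status constants are assumptions for illustration):

```go
package main

import "fmt"

// Statuses used in the flow above.
const (
	StatusPending   = "PENDING"
	StatusDelivered = "DELIVERED"
	StatusCancelled = "CANCELLED"
)

type Message struct {
	ID            string
	Status        string
	ScheduledTime int64 // epoch seconds
}

// processTick is one iteration of the worker loop: given the current time,
// the pending queue (message ID -> scheduled time), and the message store,
// it delivers every due message that is still PENDING and returns their IDs.
func processTick(now int64, queue map[string]int64, store map[string]*Message) []string {
	var delivered []string
	for id, due := range queue {
		if due > now {
			continue // not due yet
		}
		msg, ok := store[id]
		if !ok || msg.Status != StatusPending {
			delete(queue, id) // stale entry (e.g. cancelled): just drop it
			continue
		}
		msg.Status = StatusDelivered // stands in for the 3rd-party API call
		delete(queue, id)            // the ZREM step
		delivered = append(delivered, id)
	}
	return delivered
}

func main() {
	store := map[string]*Message{
		"msg-123": {ID: "msg-123", Status: StatusPending, ScheduledTime: 100},
		"msg-456": {ID: "msg-456", Status: StatusCancelled, ScheduledTime: 100},
	}
	queue := map[string]int64{"msg-123": 100, "msg-456": 100}
	fmt.Println(processTick(100, queue, store)) // only msg-123; msg-456 was cancelled
}
```

Note how the status check in step 3b shows up as the `msg.Status != StatusPending` guard: the Redis set only says a message *might* be due, and the database remains the source of truth.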
Step 5: The Flow - Cancel a Message
User Request:
DELETE /cancel/msg-123
What happens:
1. Check Database: What's the current status?
↓
2. If status = "DELIVERED": Return error "Already sent"
↓
3. If status = "PENDING":
a. Update Database: status = "CANCELLED"
b. Remove from Redis: ZREM pending_messages "msg-123"
↓
4. Return: {"success": true}
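The cancel decision table above is small enough to sketch directly. As before this is an in-memory model (`statuses` stands in for the database, `queue` for the Redis sorted set; the `cancel` helper name is an assumption):

```go
package main

import "fmt"

// cancel applies the Step 5 rules: refuse if already sent, otherwise mark
// CANCELLED in the "database" and remove the entry from the "Redis" queue.
func cancel(statuses map[string]string, queue map[string]int64, id string) (bool, string) {
	switch statuses[id] {
	case "DELIVERED":
		return false, "Already sent"
	case "CANCELLED":
		return false, "Already cancelled"
	case "PENDING":
		statuses[id] = "CANCELLED" // update Database
		delete(queue, id)          // ZREM pending_messages <id>
		return true, ""
	default:
		return false, "Unknown message"
	}
}

func main() {
	statuses := map[string]string{"msg-123": "PENDING"}
	queue := map[string]int64{"msg-123": 1699891200}

	ok, _ := cancel(statuses, queue, "msg-123")
	fmt.Println(ok, statuses["msg-123"]) // true CANCELLED
}
```

Doing the database update *before* the Redis removal matters: even if a worker still sees the ID in Redis, the status check in the delivery path will stop it.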
Step 6: The Tricky Part - Race Conditions
The Problem: What if cancel and delivery happen at the SAME TIME?
Time: 2:00:00 PM
Thread 1 (User):          Thread 2 (Worker):
Cancel msg-123            Get msg-123 from Redis
       ↓                         ↓
Check status (PENDING)    Check status (PENDING)
       ↓                         ↓
Mark as CANCELLED         Start delivering...
       ↓                         ↓
      CONFLICT! Both think they can proceed!
The Solution: Locks
We use a “lock” - only ONE thread can hold the lock at a time.
```go
// Cancel function
func CancelMessage(messageID string) {
	lock := GetLock("lock:" + messageID) // One lock per message ID
	lock.Acquire()                       // Wait until we have it

	msg := db.GetMessage(messageID)
	if msg.Status == "PENDING" {
		db.UpdateStatus(messageID, "CANCELLED")
		redis.Delete(messageID)
	}

	lock.Release() // Let others take the lock
}

// Delivery function
func DeliverMessage(messageID string) {
	lock := GetLock("lock:" + messageID) // Get the SAME lock as cancel
	lock.Acquire()

	msg := db.GetMessage(messageID)
	if msg.Status != "PENDING" {
		lock.Release()
		return // Was cancelled!
	}
	db.UpdateStatus(messageID, "PROCESSING")
	lock.Release()

	// Now deliver (without holding the lock)
	CallThirdPartyAPI(msg)
}
```
Now they can’t conflict:
If cancel gets lock first → marks CANCELLED → delivery sees CANCELLED and stops
If delivery gets lock first → marks PROCESSING → cancel sees PROCESSING and fails
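A lock is not the only way to get this guarantee. Since both sides only need to win an atomic PENDING → something transition, a conditional update in the database does the same job, e.g. `UPDATE messages SET status = 'PROCESSING' WHERE id = $1 AND status = 'PENDING'` in Postgres: whoever's update reports one affected row wins. A sketch of that compare-and-swap rule (the `statusStore`/`casStatus` names are illustrative; a mutex stands in for the database's row-level atomicity):

```go
package main

import (
	"fmt"
	"sync"
)

// statusStore is a stand-in for the database. casStatus models
//   UPDATE messages SET status = $next WHERE id = $id AND status = $old
// i.e. an atomic compare-and-swap on the status column.
type statusStore struct {
	mu       sync.Mutex
	statuses map[string]string
}

// casStatus flips id from old to next atomically and reports whether this
// caller won (the row was still in the old status).
func (s *statusStore) casStatus(id, old, next string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.statuses[id] != old {
		return false
	}
	s.statuses[id] = next
	return true
}

func main() {
	s := &statusStore{statuses: map[string]string{"msg-123": "PENDING"}}

	// Cancel and delivery race: exactly one CAS can succeed.
	cancelWon := s.casStatus("msg-123", "PENDING", "CANCELLED")
	deliverWon := s.casStatus("msg-123", "PENDING", "PROCESSING")
	fmt.Println(cancelWon, deliverWon) // true false
}
```

The trade-off: a CAS avoids managing a separate lock service, while an explicit lock lets you protect a multi-step critical section (like the database update plus the Redis removal) as one unit.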
Step 7: Handling Errors
What if the 3rd party API fails?
```go
func DeliverWithRetry(msg Message) bool {
	for attempt := 0; attempt < 5; attempt++ {
		response := CallThirdPartyAPI(msg.Content)
		if response.Success {
			return true // Delivered!
		}
		if response.StatusCode == 500 {
			// Server error - wait and retry.
			// Note: in Go, 2^attempt is XOR, so use a shift for powers of two.
			time.Sleep(time.Duration(1<<attempt) * time.Second) // 1s, 2s, 4s, 8s, 16s
			continue
		}
		if response.StatusCode == 400 {
			// Bad request - retrying won't help
			return false
		}
	}
	return false // Failed after 5 tries
}
```
Exponential backoff: Wait 1s, then 2s, then 4s, then 8s, then 16s between retries.
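The delay computation is worth isolating, with a cap so a runaway attempt counter can't sleep for hours (the `backoff` helper and the 30-second cap are choices for this sketch, not part of the original design):

```go
package main

import (
	"fmt"
	"time"
)

// backoff returns the wait before retry number attempt (0-based):
// 1s, 2s, 4s, 8s, 16s, ... capped at maxDelay.
func backoff(attempt int) time.Duration {
	const maxDelay = 30 * time.Second
	d := time.Duration(1<<attempt) * time.Second // 2^attempt seconds via shift
	if d > maxDelay || d <= 0 {                  // also guards shift overflow
		return maxDelay
	}
	return d
}

func main() {
	for i := 0; i < 5; i++ {
		fmt.Println(backoff(i)) // 1s, 2s, 4s, 8s, 16s
	}
}
```

Many real systems also add random jitter to these delays so that a crowd of failed messages doesn't retry in lockstep against a recovering API.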
Step 8: Scaling to 10,000 Messages/Second
Problem: One Redis sorted set + one worker = bottleneck
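One common way past that bottleneck (an assumption about where this note is heading, since it cuts off here) is to shard: split `pending_messages` into N sorted sets keyed by a hash of the message ID, with a worker group polling each shard independently. The routing rule is just:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const numShards = 16 // illustrative; chosen to match expected worker count

// shardKey picks which of the N sorted sets a message belongs to. The same
// ID always hashes to the same shard, so schedule, deliver, and cancel all
// agree on where to look.
func shardKey(messageID string) string {
	h := fnv.New32a()
	h.Write([]byte(messageID))
	return fmt.Sprintf("pending_messages:%d", h.Sum32()%numShards)
}

func main() {
	fmt.Println(shardKey("msg-123"))
	fmt.Println(shardKey("msg-456"))
}
```

With 16 shards and a few workers per shard, the single hot sorted set becomes 16 smaller ones, and throughput scales roughly with the number of shards until the database or the 3rd-party API becomes the next bottleneck.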