Design Real-Time Chat Application (Slack/Discord) - Frontend System Design Interview Guide
Design a production-ready real-time chat application like Slack, Discord, or Microsoft Teams with channels, direct messages, presence indicators, and rich media support.
Backend as Black Box: Assume you have APIs for messages and real-time updates. Focus on the frontend architecture.
Key Challenges
This problem explores real-time communication challenges:
- Message Delivery: Send, receive, and sync messages reliably
- Message States: Sent → Delivered → Read with visual indicators
- Presence System: Online/Away/Offline status with typing indicators
- Offline Support: Queue messages, sync on reconnect
- Rich Content: Attachments, embeds, reactions, threads
When users send messages, they expect instant delivery, real-time presence updates, and reliable offline support—even with thousands of users per channel. This solution designs a real-time chat application that handles message delivery states, presence systems, and offline queuing while maintaining clear boundaries between server state and UI state. The key insight: a good chat app uses optimistic updates for instant feedback, cursor pagination for stable history, and graceful reconnection handling.
HLD interview focus: Requirements, architecture, tradeoffs, data flow, and scaling decisions. Any implementation snippets shown are optional unless explicitly asked.
I'll start by defining what makes a great chat experience—what questions does a user need answered as they send and receive messages? Then I'll design the architecture that handles message delivery states, presence systems, and offline queuing. Finally, I'll design clear boundaries between server state and UI state.
Why this approach?
Most candidates build a "message list with WebSocket." Strong candidates build a "chat experience"—a system that handles delivery states, presence updates, offline queuing, and graceful reconnection. The difference is thinking about the entire messaging lifecycle, not just real-time updates.
Think of this like building Slack or Discord. You don't just show messages—you handle delivery states, typing indicators, presence, offline queuing, and thread replies. Same principles apply here.
Before designing anything, let's define what success looks like. When users send messages, they need instant delivery, real-time presence updates, and reliable offline support.
Requirements Exploration Questions
What types of conversations?
- Direct messages (1:1)
- Group chats (small groups)
- Channels (large, many members)
What message types?
- Text messages
- Rich text (formatting, links)
- File attachments (images, docs)
- Reactions and threads
What real-time features?
- Message delivery status
- Typing indicators
- Presence (online/away/offline)
- Read receipts
Functional Requirements
MVP (Core Features - What I'd Design First):
- Send and receive messages in real-time
- Message status (sending → sent → delivered → read)
- Channels and direct messages (1:1 and groups)
- Presence indicators (online/away/offline)
- Typing indicators with debouncing
- Offline message queue and sync
- Clear loading/empty/error states
- Basic accessibility and keyboard support
Advanced Features (Add If Time Permits):
- Edit and delete messages
- Thread replies and reactions
- File attachments and rich media
- Message search and history
- Custom status and presence
Non-Functional Requirements
Performance:
- Message appears < 100ms (optimistic)
- Sync latency < 200ms
- Support 10K+ users per channel
- Scroll 100K+ messages smoothly
Reliability:
- No message loss
- Offline message queue
- Reconnection with sync
- Message ordering guaranteed
Scalability:
- Handle 1000+ messages/sec per channel
- Presence updates for 10K+ users
- Efficient message history loading
Accessibility:
- Keyboard navigation
- Screen reader support
- High contrast mode
Security & Compliance:
- Strict authn/authz checks on every write path
- Input validation plus XSS/CSRF protections
- TLS in transit and secure session/token handling
Observability:
- Track p95 latency, error rate, and retry rate
- Log critical client/server sync failures
- Alert on sustained degradation and queue backlog growth
WebSocket vs REST
| Feature | WebSocket | REST |
|---|---|---|
| New messages | ✓ Real-time push | ✗ Polling |
| Typing indicators | ✓ Immediate | ✗ Too slow |
| Presence updates | ✓ Push | ✗ Polling |
| Message history | Unnecessary | ✓ Paginated |
| Search | Unnecessary | ✓ Server-side |
| File upload | ✗ Use REST | ✓ Multipart |
Best approach: WebSocket for real-time + REST for historical data
Entity and interface contract shape with cache/reconciliation model for a frontend system design interview (backend treated as a black box).
1) Component prop interfaces (boundaries)
Define clear boundaries between thread rendering, composer interactions, and presence UI.
- ChatShellProps: active workspace/channel context and high-level navigation callbacks
- MessageListProps: normalized message ids + render state for virtualized history
- MessageComposerProps: draft value, send handler, attachment and mention entry points
- PresenceRailProps: online status and typing participants for the current channel
2) Hook interfaces (consumption contracts)
Use hook return contracts to describe React consumption without binding to transport internals.
Core Data Structures
Message:
interface Message {
  id: string; // Client-generated UUID
  channelId: string;
  senderId: string;
  content: string;
  createdAt: Date;
  status: MessageStatus;
  // Delivery tracking
  deliveredTo: string[]; // User IDs who received
  readBy: string[]; // User IDs who read
  // Rich content
  attachments?: Attachment[];
  replyTo?: string; // Parent message ID for threads
  reactions?: Reaction[];
  edited?: boolean;
  editedAt?: Date;
}
type MessageStatus =
  | 'sending' // Optimistic, not yet confirmed
  | 'sent' // Server acknowledged receipt
  | 'delivered' // Recipient(s) received
  | 'read' // Recipient(s) opened
  | 'failed'; // Send failed
Channel:
interface Channel {
  id: string;
  type: 'channel' | 'dm' | 'group';
  name: string;
  members: string[];
  lastMessage?: Message;
  unreadCount: number;
  lastReadAt: Date;
}
Presence:
interface UserPresence {
  userId: string;
  status: 'online' | 'away' | 'offline';
  lastSeen: Date;
  customStatus?: string;
}
Event Types
type ChatEvent =
  // Messages
  | { type: 'MESSAGE_SENT'; payload: Message }
  | { type: 'MESSAGE_DELIVERED'; payload: { messageId: string; userId: string } }
  | { type: 'MESSAGE_READ'; payload: { messageId: string; userId: string } }
  | { type: 'MESSAGE_EDITED'; payload: { messageId: string; content: string } }
  | { type: 'MESSAGE_DELETED'; payload: { messageId: string } }
  | { type: 'REACTION_ADDED'; payload: { messageId: string; reaction: string; userId: string } }
  // Presence
  | { type: 'USER_ONLINE'; payload: { userId: string } }
  | { type: 'USER_OFFLINE'; payload: { userId: string } }
  | { type: 'USER_AWAY'; payload: { userId: string } }
  | { type: 'TYPING_START'; payload: { userId: string; channelId: string } }
  | { type: 'TYPING_STOP'; payload: { userId: string; channelId: string } };
Optimistic Update Flow
1. User clicks Send
            │
            ▼
┌─────────────────────────┐
│ Generate client-side ID │ ← UUID for idempotency
└───────────┬─────────────┘
            │
            ▼
┌─────────────────────────┐
│ Add to UI immediately   │ ← status: 'sending'
└───────────┬─────────────┘
            │
            ▼
┌─────────────────────────┐
│ Send via WebSocket      │ ← Include client ID
└───────────┬─────────────┘
            │
     ┌──────┴──────┐
     │             │
     ▼             ▼
  Success       Failure
     │             │
     ▼             ▼
  'sent'        'failed'
+ server ID + retry option
Client cache shape (recommended)
- entitiesById: Record<ID, Entity>
- orderedIds: ID[] for rendering order
- pageInfo/cursor metadata for pagination or range loading
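A minimal sketch of that cache shape, under a few assumptions: `CachedMessage` is a trimmed stand-in for `Message`, the helpers `emptyCache`, `appendNewest`, and `prependOlderPage` are hypothetical names, and history pages arrive ordered oldest first. It shows how `orderedIds` can stay duplicate-free while `entitiesById` absorbs overwrites, and where the cursor from a paginated history response could be stored.

```typescript
// Trimmed stand-in for the Message type; only the fields this sketch needs.
interface CachedMessage {
  id: string;
  content: string;
}

interface MessageCache {
  entitiesById: Record<string, CachedMessage>;
  orderedIds: string[]; // oldest first, newest last
  pageInfo: { olderCursor: string | null; hasOlder: boolean };
}

function emptyCache(): MessageCache {
  return { entitiesById: {}, orderedIds: [], pageInfo: { olderCursor: null, hasOlder: true } };
}

// Append realtime messages at the newest end; ids already known are overwritten in place.
function appendNewest(cache: MessageCache, incoming: CachedMessage[]): MessageCache {
  const entitiesById = { ...cache.entitiesById };
  const seen = new Set(cache.orderedIds);
  const orderedIds = [...cache.orderedIds];
  for (const message of incoming) {
    if (!seen.has(message.id)) {
      seen.add(message.id);
      orderedIds.push(message.id);
    }
    entitiesById[message.id] = message;
  }
  return { ...cache, entitiesById, orderedIds };
}

// Prepend an older history page and remember the cursor for the next request.
function prependOlderPage(
  cache: MessageCache,
  page: CachedMessage[],
  nextCursor: string | null,
  hasMore: boolean
): MessageCache {
  const entitiesById = { ...cache.entitiesById };
  const seen = new Set(cache.orderedIds);
  const olderIds: string[] = [];
  for (const message of page) {
    if (!seen.has(message.id)) {
      seen.add(message.id);
      olderIds.push(message.id);
    }
    entitiesById[message.id] = message;
  }
  return {
    entitiesById,
    orderedIds: [...olderIds, ...cache.orderedIds],
    pageInfo: { olderCursor: nextCursor, hasOlder: hasMore },
  };
}
```

Keeping ids and entities separate means a delivery or read update touches one entry in `entitiesById` without rebuilding the ordered list.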
Consistency & reconciliation rules
- Make writes idempotent where retries are possible.
- Apply realtime updates with version/event ordering checks.
- Prefer server-authoritative reconciliation after optimistic mutations.
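One way to make these rules concrete is to treat the message status as the ordering key, so a late or repeated event can never move a message backwards. The sketch below is an assumption rather than a prescribed algorithm: `DeliveryStatus` mirrors `MessageStatus`, and `mergeStatus` and its rank table are hypothetical names.

```typescript
type DeliveryStatus = 'sending' | 'sent' | 'delivered' | 'read' | 'failed';

// Forward-only ranking for the non-failure states.
const STATUS_RANK: Record<Exclude<DeliveryStatus, 'failed'>, number> = {
  sending: 0,
  sent: 1,
  delivered: 2,
  read: 3,
};

// Merge an incoming status into the current one without ever regressing.
function mergeStatus(current: DeliveryStatus, incoming: DeliveryStatus): DeliveryStatus {
  // A failure only applies while the message is still unconfirmed.
  if (incoming === 'failed') return current === 'sending' ? 'failed' : current;
  // A failed message can be retried ('sending') or rescued by a late ack.
  if (current === 'failed') return incoming;
  return STATUS_RANK[incoming] > STATUS_RANK[current] ? incoming : current;
}
```

Because the merge is idempotent, replaying the same event after a reconnect leaves the cache unchanged, which is what makes retries safe.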
Component Boundaries
ChatShell
Owns channel selection, layout orchestration, and cross-panel callbacks.
MessageList
Renders virtualized history from normalized ids and emits anchor/load-more events.
MessageComposer
Handles draft editing, attachment intake, and send intent.
PresenceRail
Displays online state, typing indicators, and participant metadata.
export interface ChatShellProps {
  workspaceId: string;
  activeChannelId: string;
  onSelectChannel: (channelId: string) => void;
}
export interface MessageListProps {
  messageIds: string[];
  messagesById: Record<string, Message>;
  hasOlder: boolean;
  onLoadOlder: () => void;
  onJumpToMessage: (messageId: string) => void;
}
export interface MessageComposerProps {
  draft: string;
  setDraft: (value: string) => void;
  onSend: (payload: { text: string; attachmentIds?: string[]; replyToId?: string }) => Promise<void>;
  onAttach: (files: File[]) => Promise<string[]>;
}
export interface PresenceRailProps {
  members: PresenceMember[];
  typingUserIds: string[];
}
export interface UseChannelMessagesResult {
  messageIds: string[];
  messagesById: Record<string, Message>;
  hasOlder: boolean;
  loadOlder: () => void;
  isLoading: boolean;
}
export interface UseSendMessageResult {
  send: (payload: { text: string; attachmentIds?: string[]; replyToId?: string }) => Promise<void>;
  pendingCount: number;
}
export function useChannelMessages(_channelId: string): UseChannelMessagesResult {
  throw new Error('Contract-only snippet');
}
export function useSendMessage(_channelId: string): UseSendMessageResult {
  throw new Error('Contract-only snippet');
}
React interfaces & integration patterns (props, hooks, callbacks).
This section covers API contracts and React consumption patterns.
API contracts (Backend as black box)
REST API Endpoints:
Channels & Messages:
GET /api/channels
Response: { channels: Channel[]; }

GET /api/channels/:channelId/messages?cursor={cursor}&limit={limit}
Response: {
  messages: Message[];
  nextCursor: string | null;
  hasMore: boolean;
}

POST /api/channels/:channelId/messages
{
  content: string;
  replyTo?: string; // Parent message ID
  attachments?: string[]; // Attachment IDs
}
Response: { message: Message; }

PUT /api/messages/:messageId
{
  content: string;
}
Response: { message: Message; }

DELETE /api/messages/:messageId
Response: { success: boolean; }

Presence & Typing:
GET /api/channels/:channelId/presence
Response: {
  users: UserPresence[];
  typing: TypingUser[];
}

POST /api/channels/:channelId/typing
{
  action: 'start' | 'stop';
}
Response: { success: boolean; }

File Upload:
POST /api/upload
FormData: { file: File; }
Response: {
  attachmentId: string;
  url: string;
  filename: string;
  size: number;
  mimeType: string;
}

WebSocket Protocol:
// Connection
{
  type: 'CONNECT';
  token: string; // Auth token
}
// Message Events
{
  type: 'MESSAGE_SENT';
  payload: Message;
}
{
  type: 'MESSAGE_DELIVERED';
  payload: { messageId: string; userId: string; }
}
{
  type: 'MESSAGE_READ';
  payload: { messageId: string; userId: string; }
}
// Presence Events
{
  type: 'USER_ONLINE' | 'USER_OFFLINE' | 'USER_AWAY';
  payload: { userId: string; }
}
{
  type: 'TYPING_START' | 'TYPING_STOP';
  payload: { userId: string; channelId: string; }
}
Type definitions used in contracts
interface Channel {
  id: string;
  name: string;
  type: 'dm' | 'group' | 'channel';
}
interface Message {
  id: string;
  channelId: string;
  senderId: string;
  content: string;
  createdAt: string;
  status: 'sending' | 'sent' | 'delivered' | 'read' | 'failed';
}
interface UserPresence {
  userId: string;
  status: 'online' | 'away' | 'offline';
  lastSeen?: string;
}
interface TypingUser {
  userId: string;
  channelId: string;
}
3) Integration patterns (React wiring)
- Data down, events up: list/composer components emit intent; hooks own side effects.
- Optimistic messaging: insert local message with pending state, reconcile on ack/failure.
- Realtime patching: update delivery/read states in-place by message id.
- Scroll continuity: preserve viewport anchor while loading older history.
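The optimistic-messaging and realtime-patching bullets can be sketched as a small reducer. This is an illustration under assumptions, not the guide's implementation: `Msg` is a trimmed message type, and the action names `LOCAL_SEND`, `SERVER_ACK`, and `SEND_FAILED` are hypothetical (only `MESSAGE_DELIVERED` and `MESSAGE_READ` come from the event types above).

```typescript
type MsgStatus = 'sending' | 'sent' | 'delivered' | 'read' | 'failed';

interface Msg {
  id: string; // client-generated id
  content: string;
  status: MsgStatus;
}

interface ChatState {
  byId: Record<string, Msg>;
  order: string[];
}

type ChatAction =
  | { type: 'LOCAL_SEND'; id: string; content: string } // hypothetical: user pressed Send
  | { type: 'SERVER_ACK'; id: string } // hypothetical: server confirmed receipt
  | { type: 'SEND_FAILED'; id: string } // hypothetical: send timed out or errored
  | { type: 'MESSAGE_DELIVERED'; id: string }
  | { type: 'MESSAGE_READ'; id: string };

// Patch one entity by id; unknown ids leave the state untouched.
function patchStatus(state: ChatState, id: string, status: MsgStatus): ChatState {
  const existing = state.byId[id];
  if (!existing) return state;
  return { ...state, byId: { ...state.byId, [id]: { ...existing, status } } };
}

function chatReducer(state: ChatState, action: ChatAction): ChatState {
  switch (action.type) {
    case 'LOCAL_SEND': {
      if (state.byId[action.id]) return state; // same id sent twice: ignore
      const pending: Msg = { id: action.id, content: action.content, status: 'sending' };
      return { byId: { ...state.byId, [action.id]: pending }, order: [...state.order, action.id] };
    }
    case 'SERVER_ACK':
      return patchStatus(state, action.id, 'sent');
    case 'SEND_FAILED':
      return patchStatus(state, action.id, 'failed');
    case 'MESSAGE_DELIVERED':
      return patchStatus(state, action.id, 'delivered');
    case 'MESSAGE_READ':
      return patchStatus(state, action.id, 'read');
    default:
      return state;
  }
}
```

The reducer stays free of side effects, so the hook that owns the WebSocket can dispatch into it while list and composer components only emit intent.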
Integration Patterns
Optimistic delivery
Render pending message immediately, rollback or reconcile on server response.
Patch-by-id updates
Apply delivery/read/reaction updates directly to cached entities.
Reconnect-safe queues
Replay unsent actions after reconnect using idempotency keys.
Viewport stability
Keep scroll anchor stable while prepending older messages.
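Viewport stability mostly comes down to one piece of arithmetic: when older messages are prepended, the content grows above the viewport, so the scroll offset has to grow by the same amount. A minimal sketch, assuming a hypothetical helper `restoreScrollTop` and metrics read from the list container's `scrollTop` and `scrollHeight` before and after the prepend:

```typescript
interface ScrollMetrics {
  scrollTop: number; // pixels scrolled from the top of the content
  scrollHeight: number; // total content height in pixels
}

// Returns the scrollTop that keeps the same message under the user's eyes
// after content of unknown height was inserted above the viewport.
function restoreScrollTop(before: ScrollMetrics, scrollHeightAfter: number): number {
  const addedAbove = scrollHeightAfter - before.scrollHeight;
  return before.scrollTop + addedAbove;
}
```

For example, if the list was at `scrollTop = 120` with a content height of 2000px and the prepend grows it to 2600px, the new offset is 720px.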
Virtual Scrolling for Message History
Problem: Channel with 100K messages = massive DOM = jank
Solution: Only render visible messages + buffer
┌────────────────────────────────┐
│ Old messages (not rendered)    │ ← Saved in memory
├────────────────────────────────┤
│ Buffer zone (10 messages)      │ ← Smooth scrolling
├────────────────────────────────┤
│ ██████████████████████████████ │
│ ██ VISIBLE VIEWPORT ██████████ │ ← Actually in DOM
│ ██████████████████████████████ │
├────────────────────────────────┤
│ Buffer zone (10 messages)      │ ← Smooth scrolling
├────────────────────────────────┤
│ New messages (not rendered)    │ ← Lazy load on scroll
└────────────────────────────────┘
Key behaviors:
- Start at bottom (newest messages)
- Load older on scroll up
- Auto-scroll when at bottom + new message
- "Jump to bottom" button when scrolled up
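The windowing and auto-scroll behaviors can be sketched with two small functions. Both are illustrative assumptions: `visibleRange` assumes a fixed row height (real chat rows vary in height and usually need measured or estimated heights), and the 40px tolerance in `isAtBottom` is an arbitrary choice.

```typescript
interface WindowRange {
  start: number; // first index to render (inclusive)
  end: number; // last index to render (exclusive)
}

// Which message indices should be in the DOM: the visible rows plus a buffer on each side.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  buffer: number = 10
): WindowRange {
  const firstVisible = Math.floor(scrollTop / rowHeight);
  const endVisible = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, firstVisible - buffer),
    end: Math.min(totalRows, endVisible + buffer),
  };
}

// Auto-scroll on a new message only if the user is already (nearly) at the bottom.
function isAtBottom(
  scrollTop: number,
  viewportHeight: number,
  scrollHeight: number,
  tolerancePx: number = 40
): boolean {
  return scrollHeight - (scrollTop + viewportHeight) <= tolerancePx;
}
```

When `isAtBottom` returns false and a new message arrives, the list would keep its position and surface the "Jump to bottom" button instead.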
WebSocket Reconnection Strategy
Connection lost
            │
            ▼
┌─────────────────────────┐
│ Attempt 1: Wait 1s      │
└───────────┬─────────────┘
            │ Failed
            ▼
┌─────────────────────────┐
│ Attempt 2: Wait 2s      │
└───────────┬─────────────┘
            │ Failed
            ▼
┌─────────────────────────┐
│ Attempt 3: Wait 4s      │ ← Exponential backoff
└───────────┬─────────────┘
            │ Failed
            ▼
     ... continue ...
            │
            ▼
┌─────────────────────────┐
│ Max: Wait 30s           │ ← Cap at 30 seconds
│ + jitter (0-5s random)  │ ← Prevent thundering herd
└─────────────────────────┘
On reconnect:
- Re-authenticate
- Fetch missed messages (since lastMessageId)
- Update presence
- Resume subscriptions
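The delay schedule in the diagram (1s, 2s, 4s, capped at 30s, plus 0-5s of jitter) can be expressed as a pure function. This is a sketch: the name `reconnectDelayMs` is hypothetical, and applying jitter on every attempt rather than only at the cap is an assumption. The random source is injectable so the schedule can be checked deterministically.

```typescript
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;
const MAX_JITTER_MS = 5000;

// attempt is 1-based: attempt 1 waits ~1s, attempt 2 ~2s, attempt 3 ~4s, and so on.
function reconnectDelayMs(attempt: number, random: () => number = Math.random): number {
  const exponential = BASE_DELAY_MS * Math.pow(2, attempt - 1);
  const capped = Math.min(MAX_DELAY_MS, exponential);
  // Random jitter spreads out clients that all lost the connection at the same moment.
  const jitter = Math.floor(random() * MAX_JITTER_MS);
  return capped + jitter;
}
```

A caller would schedule the next connection attempt with this delay and reset the attempt counter once the socket opens and the sync steps above complete.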
Offline Message Queue
User goes offline
        │
        ▼
┌─────────────────────────────────┐
│ User sends message              │
│                                 │
│ 1. Add to UI (status: sending)  │
│ 2. Queue in IndexedDB           │
│ 3. Show offline indicator       │
└─────────────────────────────────┘
        │
        │ Network returns
        ▼
┌─────────────────────────────────┐
│ Process queue in order          │
│                                 │
│ 1. Send oldest first            │
│ 2. Update status on confirm     │
│ 3. Handle duplicates (by ID)    │
│ 4. Clear from queue             │
└─────────────────────────────────┘
Queue structure:
interface QueuedMessage {
  id: string; // Client-generated
  channelId: string;
  content: string;
  queuedAt: Date;
  retryCount: number;
}
Performance Targets
| Metric | Target | Technique |
|---|---|---|
| Message send | < 100ms perceived | Optimistic update |
| Message receive | < 200ms | WebSocket push |
| Channel switch | < 300ms | Cached messages |
| History load | < 500ms | Cursor pagination |
| Reconnect | < 3s | Exponential backoff |
| Memory (10K msgs) | < 50MB | Virtual scroll |
Why This Design Works
Event-Driven Architecture
- Clear message flow - Events represent all state changes
- Easy to extend - Add new event types for new features
- Debuggable - Log events to trace issues
Client-Generated IDs
- Optimistic updates - Show before server confirms
- Idempotency - Retry without duplicates
- Offline support - Create IDs without server
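Client-generated IDs are what make the offline queue safe to replay. The sketch below models the queue as pure functions over an in-memory array; persistence (IndexedDB in this design) and the actual network send are deliberately left out. The helper names `enqueue`, `nextToSend`, and `settle`, and the `SendOutcome` type, are hypothetical.

```typescript
interface QueuedMessage {
  id: string; // client-generated, doubles as the idempotency key
  channelId: string;
  content: string;
  queuedAt: Date;
  retryCount: number;
}

// 'duplicate' means the server already has a message with this id.
type SendOutcome = 'confirmed' | 'duplicate' | 'failed';

// Add a message unless one with the same id is already queued.
function enqueue(queue: QueuedMessage[], message: QueuedMessage): QueuedMessage[] {
  return queue.some((queued) => queued.id === message.id) ? queue : [...queue, message];
}

// Oldest message first, so replay preserves the order the user typed in.
function nextToSend(queue: QueuedMessage[]): QueuedMessage | undefined {
  return [...queue].sort((a, b) => a.queuedAt.getTime() - b.queuedAt.getTime())[0];
}

// Apply the result of one send attempt to the queue.
function settle(queue: QueuedMessage[], id: string, outcome: SendOutcome): QueuedMessage[] {
  if (outcome === 'failed') {
    // Keep the message queued and count the attempt.
    return queue.map((queued) =>
      queued.id === id ? { ...queued, retryCount: queued.retryCount + 1 } : queued
    );
  }
  // Confirmed and duplicate both mean the server has it: clear from the queue.
  return queue.filter((queued) => queued.id !== id);
}
```

On reconnect, the sending hook would loop: take `nextToSend`, transmit it, then `settle` with the outcome, stopping on a failure so ordering is preserved for the next attempt.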
Message Status Progression
- User feedback - Know if message was received
- Trust building - See when message was read
- Error recovery - Clear indication of failures
Presence Heartbeats
- Battery efficient - Don't poll constantly
- Accurate status - Know within 60 seconds
- Graceful degradation - Works with intermittent connection
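As a rough illustration of heartbeat-based presence, status can be derived from two timestamps instead of being polled. Everything here is an assumption for the sketch: the function name `derivePresence`, the separate "last activity" timestamp, and the 5-minute away threshold. Only the 60-second staleness window comes from the text above, and in practice the server would usually own this rule.

```typescript
type PresenceStatus = 'online' | 'away' | 'offline';

const OFFLINE_AFTER_MS = 60000; // no heartbeat for 60 seconds: treat as offline
const AWAY_AFTER_MS = 300000; // assumed: no user activity for 5 minutes: treat as away

function derivePresence(
  lastHeartbeatMs: number, // when the client last sent a heartbeat
  lastActivityMs: number, // when the user last interacted (assumed to be tracked)
  nowMs: number
): PresenceStatus {
  if (nowMs - lastHeartbeatMs > OFFLINE_AFTER_MS) return 'offline';
  if (nowMs - lastActivityMs > AWAY_AFTER_MS) return 'away';
  return 'online';
}
```

Because a single missed heartbeat does not flip the status until the window elapses, brief connection drops do not cause presence to flicker.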
Virtual Scrolling
- Handle any history - 100K+ messages no problem
- Smooth scrolling - Only visible items in DOM
- Memory efficient - DOM size stays constant regardless of history length
Key Takeaways
- Client-generated message IDs enable optimistic updates and offline support
- Message status progression (sending → sent → delivered → read) builds trust
- Presence relies on periodic heartbeats instead of constant polling; exponential backoff applies to reconnection
- Typing indicators need debouncing (start) and auto-timeout (stop)
- WebSocket + REST hybrid - real-time for new messages, REST for history
- Virtual scroll is essential for channels with thousands of messages
- Group consecutive messages from same sender for cleaner UI
- Reconnection with sync - fetch missed messages by lastMessageId
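The typing-indicator takeaway can be sketched as a small state machine with an injected clock. Note the swap: instead of a trailing debounce, this sketch rate-limits `TYPING_START` with a leading-edge throttle, which keeps the indicator responsive on the first keystroke. The class name `TypingNotifier` and both intervals (3s resend, 5s idle timeout) are assumptions.

```typescript
class TypingNotifier {
  private lastStartSentAt: number | null = null;
  private lastKeystrokeAt: number | null = null;
  private readonly resendEveryMs: number;
  private readonly idleStopMs: number;

  constructor(resendEveryMs: number = 3000, idleStopMs: number = 5000) {
    this.resendEveryMs = resendEveryMs;
    this.idleStopMs = idleStopMs;
  }

  // Call on every keystroke. Returns the event to send, or null to stay quiet.
  onKeystroke(nowMs: number): 'TYPING_START' | null {
    this.lastKeystrokeAt = nowMs;
    const due =
      this.lastStartSentAt === null || nowMs - this.lastStartSentAt >= this.resendEveryMs;
    if (!due) return null;
    this.lastStartSentAt = nowMs;
    return 'TYPING_START';
  }

  // Call on a timer. Emits TYPING_STOP once after the user has gone idle.
  onTick(nowMs: number): 'TYPING_STOP' | null {
    if (this.lastStartSentAt === null || this.lastKeystrokeAt === null) return null;
    if (nowMs - this.lastKeystrokeAt < this.idleStopMs) return null;
    this.lastStartSentAt = null; // re-arm so the next keystroke sends TYPING_START again
    return 'TYPING_STOP';
  }
}
```

The returned event names match the `TYPING_START` and `TYPING_STOP` types in the WebSocket protocol, so the caller only has to attach `userId` and `channelId` before sending.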