Design a Search System for Knowledge Base - Frontend System Design Interview Guide

Medium

Design a production-ready search system for a knowledge base (like Confluence, Notion, or Zendesk Help Center) with intelligent indexing, relevance ranking, and fast autocomplete.

Backend as Black Box: Assume you have a search API. Focus on the frontend architecture including client-side search capabilities.

Key Challenges

This problem goes beyond basic autocomplete to explore:

  • Client-Side Indexing: How to build a search index in the browser
  • Relevance Scoring: TF-IDF, BM25, and semantic similarity
  • Faceted Search: Filter by category, author, date ranges
  • Search-as-you-type: Sub-100ms autocomplete with highlighting
  • Offline Search: Search cached content without network

When users search a knowledge base, they expect instant suggestions, relevant results, and fast filtering—even with 100,000+ documents. This solution designs a search system that handles intelligent indexing, relevance ranking, and faceted search while balancing client-side speed with server-side completeness. The key insight: a good search system uses hybrid architecture—server-side relevance with client-side caching and refinement.

HLD interview focus: Requirements, architecture, tradeoffs, data flow, and scaling decisions. Any implementation snippets shown are optional unless explicitly asked.

Before designing anything, let's define what success looks like: instant suggestions, relevant results, and fast filtering.

Requirements Exploration Questions

Discovery
What content types should be searchable?
  • Articles, documentation pages
  • Comments and discussions
  • File attachments (PDF, Word)
  • Code snippets
What search capabilities?
  • Full-text search
  • Autocomplete/typeahead
  • Fuzzy matching for typos
  • Faceted filtering (category, author, date)
  • Saved searches
What are the latency requirements?
  • Autocomplete: < 100ms (feels instant)
  • Full results: < 500ms
  • Filter updates: < 200ms
Offline requirements?
  • Search recently viewed content offline
  • Sync when back online

Functional Requirements

Must Have

MVP (Core Features - What I'd Design First):

  • Full-text search across all documents
  • Search-as-you-type with instant suggestions (< 100ms)
  • Fuzzy matching for typos (1-2 character tolerance)
  • Faceted filtering (category, author, date)
  • Relevance-ranked results with highlighting
  • Clear loading/empty/error states
  • Basic accessibility and keyboard support

Advanced Features (Add If Time Permits):

  • Boolean operators (AND, OR, NOT) and phrase search
  • Client-side indexing for offline search
  • Saved searches and search history
  • Multi-language support

Non-Functional Requirements

Quality Bar

Performance:

  • Autocomplete: < 100ms (P95)
  • Full search: < 500ms (P95)
  • No UI jank during search
  • Handle 100K+ documents

Scalability:

  • Support 10K+ concurrent users
  • Index updates within 5 minutes
  • Handle 1000+ queries/second

Reliability:

  • Graceful degradation on API failure
  • Offline search for cached content
  • No data loss

Accessibility:

  • Keyboard navigation
  • Screen reader support
  • WCAG 2.1 AA compliance

Security & Compliance:

  • Strict authn/authz checks for scoped content access
  • Query/input sanitization and XSS-safe result rendering
  • Rate limiting and abuse protection for search endpoints

Observability:

  • Track p95 latency, error rate, and retry rate
  • Log critical client/server sync failures
  • Alert on sustained degradation and queue backlog growth

Hybrid Search Architecture

Why Hybrid?

  • Client-side: Instant results for recent/cached content
  • Server-side: Complete results across full corpus
┌─────────────────────────────────────────────────────────────────────────┐
│                              Client                                      │
│  ┌─────────────────┐    ┌──────────────────┐    ┌────────────────────┐  │
│  │   Search Input  │───►│  Search Manager  │◄──►│    Results UI      │  │
│  │   (Debounced)   │    │  (Orchestrator)  │    │   (Ranked List)    │  │
│  └─────────────────┘    └────────┬─────────┘    └────────────────────┘  │
│                                  │                                       │
│           ┌──────────────────────┼──────────────────────┐               │
│           ▼                      ▼                      ▼               │
│  ┌────────────────┐    ┌────────────────┐    ┌────────────────────┐    │
│  │ Local Index    │    │ Server API     │    │ Facet Engine       │    │
│  │ (Web Worker)   │    │ (REST/GraphQL) │    │ (Aggregations)     │    │
│  │                │    │                │    │                    │    │
│  │ • Recent docs  │    │ • Full corpus  │    │ • Category counts  │    │
│  │ • Viewed docs  │    │ • Fuzzy match  │    │ • Author counts    │    │
│  │ • < 50ms       │    │ • < 500ms      │    │ • Date histogram   │    │
│  └────────────────┘    └────────────────┘    └────────────────────┘    │
│           │                                                             │
│           ▼                                                             │
│  ┌────────────────────────────────────────────────────────────────┐    │
│  │                        IndexedDB                                │    │
│  │  • Persisted search index (offline support)                    │    │
│  │  • Recently viewed documents                                    │    │
│  │  • Search history                                               │    │
│  └────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│  Server (Black Box)                                                     │
│  • Elasticsearch / Algolia / MeiliSearch                                │
│  • Full document corpus                                                  │
│  • Advanced relevance tuning                                             │
└─────────────────────────────────────────────────────────────────────────┘

Progressive Results Strategy

User types query
        │
        ▼
┌─────────────────────────────────┐
│ 1. Search local index           │ ← Instant (< 50ms)
│    Show results immediately     │
└─────────────────────────────────┘
        │ (parallel)
        ▼
┌─────────────────────────────────┐
│ 2. Query server API             │ ← Background (< 500ms)
│    (debounced 200ms)            │
└─────────────────────────────────┘
        │
        ▼
┌─────────────────────────────────┐
│ 3. Merge & re-rank results      │ ← Seamless update
│    Dedupe, boost local matches  │
└─────────────────────────────────┘

Why This Works:

  • User sees instant feedback
  • Results improve as server responds
  • Works offline with local index only
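
The merge step can be sketched as follows. This is a minimal illustration, not a definitive ranking policy: the `ScoredResult` shape, the `mergeResults` name, and the `localBoost` value are assumptions introduced here, and scores are assumed to be comparable between the local index and the server.

```typescript
// Hypothetical result shape; `score` is assumed to be on the same scale
// for local and server results.
interface ScoredResult {
  id: string;
  title: string;
  score: number;
}

// Dedupe by id, prefer the server's payload when both sides return the same
// document, and add a small boost to documents that also matched locally.
function mergeResults(
  local: ScoredResult[],
  server: ScoredResult[],
  localBoost = 0.1, // assumed value, tune per product
): ScoredResult[] {
  const merged = new Map<string, ScoredResult>();
  for (const result of local) {
    merged.set(result.id, { ...result });
  }
  for (const result of server) {
    const boost = merged.has(result.id) ? localBoost : 0;
    merged.set(result.id, { ...result, score: result.score + boost });
  }
  // Highest score first.
  return Array.from(merged.values()).sort((a, b) => b.score - a.score);
}
```

Local-only results stay in the list until the server response arrives, which is what lets the UI update in place instead of flashing an empty state.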

Why Web Workers for Indexing?

Problem: Building search index on main thread = UI freeze

Solution: Offload to Web Worker

Main Thread                    Web Worker
     │                              │
     │──── Documents to index ─────►│
     │      (don't block UI)        │
     │                              │ ← Build index in background
     │                              │
     │◄──── Index ready ────────────│
     │                              │
     │──── Search query ───────────►│
     │                              │ ← Search in parallel
     │◄──── Results ────────────────│

Benefits:

  • UI stays responsive during indexing
  • Can index thousands of documents without jank
  • Parallel search doesn't block typing
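
One way to sketch the message flow above is a typed request/response protocol with a pure handler that the worker calls from `onmessage`. The message names (`index`, `search`, `indexed`, `results`) and the simple AND-matching index are assumptions for illustration only.

```typescript
// Hypothetical message protocol between the main thread and the index worker.
type WorkerRequest =
  | { type: "index"; docs: { id: string; text: string }[] }
  | { type: "search"; requestId: number; query: string };

type WorkerResponse =
  | { type: "indexed"; count: number }
  | { type: "results"; requestId: number; docIds: string[] };

// Worker-side state: term -> ids of documents containing it.
const termToDocIds = new Map<string, Set<string>>();
let processedDocs = 0;

function tokenizeText(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Pure handler, kept separate from the worker wiring so it can be unit tested.
function handleMessage(msg: WorkerRequest): WorkerResponse {
  if (msg.type === "index") {
    for (const doc of msg.docs) {
      for (const term of tokenizeText(doc.text)) {
        let ids = termToDocIds.get(term);
        if (!ids) {
          ids = new Set<string>();
          termToDocIds.set(term, ids);
        }
        ids.add(doc.id);
      }
    }
    processedDocs += msg.docs.length;
    return { type: "indexed", count: processedDocs };
  }
  // AND semantics: a document must contain every query term.
  let matches: string[] = [];
  tokenizeText(msg.query).forEach((term, i) => {
    const ids = termToDocIds.get(term) ?? new Set<string>();
    matches = i === 0 ? Array.from(ids) : matches.filter((id) => ids.has(id));
  });
  return { type: "results", requestId: msg.requestId, docIds: matches };
}

// Inside the worker file the wiring would be roughly:
//   self.onmessage = (e) => self.postMessage(handleMessage(e.data));
```

Echoing `requestId` back lets the main thread discard results for queries the user has already typed past.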

This section covers entity and interface contract shapes and the client cache/reconciliation model, with the backend treated as a black box.

1) Component prop interfaces (boundaries)

Define boundaries between query input, facet controls, and result surfaces.

  • SearchShellProps: route-level search context + navigation callbacks
  • SearchInputProps: controlled query state + keyboard handling
  • FacetPanelProps: selected filters and toggle handlers
  • ResultListProps: ranked ids + pagination/selection callbacks

2) Hook interfaces (consumption contracts)

Hook contracts keep data-consumption details explicit while hiding fetch internals.

Document Structure

interface Document {
  id: string;
  title: string;
  content: string;           // Full text content
  snippet: string;           // Preview snippet (200 chars)
  url: string;
  
  // Metadata for facets
  category: string;
  author: {
    id: string;
    name: string;
  };
  space: string;             // Workspace/project
  tags: string[];
  
  // Temporal
  createdAt: Date;
  updatedAt: Date;
  viewedAt?: Date;           // For recency boost
}

Inverted Index Structure

What is an Inverted Index?

Instead of: Document → Words (forward index)

Use: Word → Documents (inverted index)

Forward Index (slow):
  Doc1: ["react", "hooks", "tutorial"]
  Doc2: ["react", "state", "management"]
  
  To find "react": Scan ALL documents ❌

Inverted Index (fast):
  "react":     [Doc1, Doc2]
  "hooks":     [Doc1]
  "tutorial":  [Doc1]
  "state":     [Doc2]
  
  To find "react": O(1) lookup ✓

Index Structure:

interface SearchIndex {
  // Term → Document IDs with positions
  invertedIndex: Map<string, PostingList>;
  
  // Document metadata for ranking
  documents: Map<string, DocumentMeta>;
  
  // Pre-computed stats for scoring
  avgDocLength: number;
  totalDocs: number;
}

interface PostingList {
  docIds: string[];
  termFrequency: Map<string, number>;  // docId → count
  positions: Map<string, number[]>;     // docId → positions
}
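
As a rough sketch, a posting list of this shape could be populated as shown below. The tokenizer (lowercase, split on non-alphanumerics) and the `buildInvertedIndex` name are assumptions; a production index would also handle stemming and stop words.

```typescript
interface PostingList {
  docIds: string[];
  termFrequency: Map<string, number>; // docId -> count
  positions: Map<string, number[]>;   // docId -> token positions
}

function buildInvertedIndex(
  docs: { id: string; content: string }[],
): Map<string, PostingList> {
  const index = new Map<string, PostingList>();
  for (const doc of docs) {
    // Naive tokenizer: lowercase and split on anything that is not a-z or 0-9.
    const tokens = doc.content
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((t) => t.length > 0);
    tokens.forEach((term, position) => {
      let posting = index.get(term);
      if (!posting) {
        posting = { docIds: [], termFrequency: new Map(), positions: new Map() };
        index.set(term, posting);
      }
      // First time this term is seen in this document.
      if (!posting.termFrequency.has(doc.id)) {
        posting.docIds.push(doc.id);
        posting.positions.set(doc.id, []);
      }
      posting.termFrequency.set(doc.id, (posting.termFrequency.get(doc.id) ?? 0) + 1);
      posting.positions.get(doc.id)!.push(position);
    });
  }
  return index;
}
```

Looking up a term is then a single `index.get(term)` call, which is average-case constant time for a hash map.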

Relevance Scoring: TF-IDF vs BM25

TF-IDF (Term Frequency - Inverse Document Frequency):

Score = TF × IDF

TF = (times term appears in doc) / (total terms in doc)
IDF = log(total docs / docs containing term)
  • Common words (the, a, is) → low IDF
  • Rare words (kubernetes, authentication) → high IDF

BM25 (Best Matching 25):

Improved version that handles document length better:

  • Short docs with term → higher score
  • Long docs with term → lower score (term may be incidental)

When to use which:

  • TF-IDF: Simple, good for short documents
  • BM25: Better for varied document lengths (articles, docs)
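
For reference, a per-term BM25 score can be sketched as below. The function name and parameter names are illustrative. `k1 = 1.2` and `b = 0.75` are commonly used defaults rather than values taken from this design, and the IDF uses the non-negative variant found in Lucene-style implementations.

```typescript
interface Bm25Params {
  k1: number; // term frequency saturation
  b: number;  // strength of document length normalization (0..1)
}

// Score contribution of a single query term for a single document.
function bm25TermScore(
  tf: number,           // times the term appears in the document
  docLength: number,    // number of tokens in the document
  avgDocLength: number, // average tokens per document in the index
  totalDocs: number,    // documents in the index
  docsWithTerm: number, // documents containing the term
  params: Bm25Params = { k1: 1.2, b: 0.75 },
): number {
  if (tf === 0) return 0;
  // Rare terms get a higher IDF; the "1 +" keeps the value non-negative.
  const idf = Math.log(1 + (totalDocs - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
  // Documents longer than average are penalized, shorter ones are favored.
  const lengthNorm = 1 - params.b + params.b * (docLength / avgDocLength);
  return (idf * (tf * (params.k1 + 1))) / (tf + params.k1 * lengthNorm);
}
```

A document's score for a multi-term query is the sum of this value over the query terms.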

Search Query AST

Parse queries into Abstract Syntax Tree:

Query: react hooks category:tutorials -deprecated

{
  type: "AND",
  children: [
    { type: "TERM", value: "react" },
    { type: "TERM", value: "hooks" },
    { type: "FILTER", field: "category", value: "tutorials" },
    { type: "NOT", child: { type: "TERM", value: "deprecated" } }
  ]
}

Supported syntax:

  • term - full-text search
  • "phrase" - exact phrase match
  • field:value - field filter
  • -term - exclude term
  • OR - any match
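
A small parser for this syntax might look like the sketch below. The `PHRASE` and `OR` node types and the function names are assumptions that extend the AST shown above. Precedence is simplified: `OR` splits the query into groups, and the terms inside each group are combined with AND.

```typescript
type QueryNode =
  | { type: "TERM"; value: string }
  | { type: "PHRASE"; value: string }
  | { type: "FILTER"; field: string; value: string }
  | { type: "NOT"; child: QueryNode }
  | { type: "AND"; children: QueryNode[] }
  | { type: "OR"; children: QueryNode[] };

function parseAtom(token: string): QueryNode {
  if (token.length > 1 && token.charAt(0) === "-") {
    return { type: "NOT", child: parseAtom(token.slice(1)) };
  }
  if (token.length >= 2 && token.charAt(0) === '"' && token.charAt(token.length - 1) === '"') {
    return { type: "PHRASE", value: token.slice(1, -1) };
  }
  // field:value, where the field name is a plain identifier.
  const filter = /^([A-Za-z_]\w*):(.+)$/.exec(token);
  if (filter) {
    return { type: "FILTER", field: filter[1], value: filter[2] };
  }
  return { type: "TERM", value: token };
}

// A group with a single node does not need an AND wrapper.
function andOf(nodes: QueryNode[]): QueryNode {
  const [only] = nodes;
  return nodes.length === 1 && only ? only : { type: "AND", children: nodes };
}

function parseQuery(query: string): QueryNode {
  // Quoted phrases (optionally negated) stay as one token; the rest splits on whitespace.
  const tokens = query.match(/-?"[^"]*"|\S+/g) ?? [];
  let current: QueryNode[] = [];
  const groups: QueryNode[][] = [current];
  for (const token of tokens) {
    if (token === "OR") {
      current = [];
      groups.push(current);
    } else {
      current.push(parseAtom(token));
    }
  }
  const branches = groups.filter((g) => g.length > 0).map(andOf);
  if (branches.length <= 1) {
    return branches[0] ?? { type: "AND", children: [] };
  }
  return { type: "OR", children: branches };
}
```

For the example query above, `parseQuery("react hooks category:tutorials -deprecated")` produces the same AND node with four children.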

Client cache shape (recommended)

  • entitiesById: Record<ID, Entity>
  • orderedIds: ID[] for rendering order
  • pageInfo/cursor metadata for pagination or range loading
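
A minimal sketch of that cache shape, assuming page-based pagination, is shown below. The `ResultCache`, `PageInfo`, and `appendPage` names are hypothetical.

```typescript
interface PageInfo {
  page: number;
  hasMore: boolean;
}

interface ResultCache<T extends { id: string }> {
  entitiesById: Record<string, T>;
  orderedIds: string[];
  pageInfo: PageInfo;
}

// Returns a new cache with the page appended: entities are stored once by id,
// and orderedIds keeps render order without duplicates.
function appendPage<T extends { id: string }>(
  cache: ResultCache<T>,
  items: T[],
  pageInfo: PageInfo,
): ResultCache<T> {
  const entitiesById: Record<string, T> = { ...cache.entitiesById };
  const orderedIds = cache.orderedIds.slice();
  for (const item of items) {
    const alreadyKnown = Object.prototype.hasOwnProperty.call(entitiesById, item.id);
    if (!alreadyKnown) orderedIds.push(item.id);
    entitiesById[item.id] = item; // the newer payload wins
  }
  return { entitiesById, orderedIds, pageInfo };
}
```

Keeping entities separate from ordering means a document update touches one record instead of every list that shows it.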

Consistency & reconciliation rules

  • Make writes idempotent where retries are possible.
  • Apply realtime updates with version/event ordering checks.
  • Prefer server-authoritative reconciliation after optimistic mutations.

Component Boundaries

SearchShell

Coordinates search state, URL sync, and panel-level composition.

SearchInput

Owns query editing, keyboard controls, and suggestion open state.

FacetPanel

Manages selected filters and emits filter toggle intents.

ResultList

Renders ranked results and emits pagination/selection actions.

kb-search-component-interfaces.ts
import type * as React from 'react'; // for React.KeyboardEvent below

export interface SearchShellProps {
  initialQuery?: string;
  onOpenDocument: (docId: string) => void;
}

export interface SearchInputProps {
  query: string;
  setQuery: (query: string) => void;
  onSubmit: () => void;
  onKeyDown: (event: React.KeyboardEvent<HTMLInputElement>) => void;
}

export interface FacetPanelProps {
  selectedFilters: Record<string, string[]>;
  onToggleFilter: (facet: string, value: string) => void;
  onClearAll: () => void;
}

export interface ResultListProps {
  resultIds: string[];
  resultsById: Record<string, SearchResult>;
  hasMore: boolean;
  onLoadMore: () => void;
}

kb-search-hook-contracts.ts
export interface UseSearchQueryResult {
  resultIds: string[];
  resultsById: Record<string, SearchResult>;
  facets: FacetBucket[];
  isLoading: boolean;
  hasMore: boolean;
  loadMore: () => void;
}

export interface UseSearchInputResult {
  query: string;
  setQuery: (value: string) => void;
  activeSuggestionIndex: number;
  setActiveSuggestionIndex: (index: number) => void;
}

export function useSearchQuery(_input: SearchQueryInput): UseSearchQueryResult {
  throw new Error('Contract-only snippet');
}

export function useSearchInput(): UseSearchInputResult {
  throw new Error('Contract-only snippet');
}

React interfaces & integration patterns (props, hooks, callbacks).

This section covers API contracts and React consumption patterns.

API contracts (Backend as black box)

Search API:

GET /api/search?q={query}&page={page}&limit={limit}&facets={facets}

Request Parameters:
- q: Search query string (required)
- page: Page number (default: 1)
- limit: Results per page (default: 20)
- facets: Comma-separated facet filters (e.g., "category:tutorial,author:john")

Response:
{
  results: SearchResult[];
  total: number;
  page: number;
  hasMore: boolean;
  facets: {
    category: FacetValue[];
    author: FacetValue[];
    dateRange: FacetValue[];
    tags: FacetValue[];
  };
  query: string;                  // Echo of search query
  suggestions?: string[];         // Search suggestions
}

interface SearchResult {
  id: string;
  title: string;
  snippet: string;                // Highlighted snippet with <mark> tags
  url: string;
  category: string;
  author: string;
  publishedAt: string;
  tags: string[];
  relevanceScore: number;
}

interface FacetValue {
  value: string;
  count: number;
  selected?: boolean;
}
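
On the client, the Search API request URL could be assembled roughly as follows. The `SearchParams` type and `buildSearchUrl` name are assumptions. The defaults mirror the documented `page` and `limit` defaults, and the facet encoding assumes values contain no commas or colons.

```typescript
interface SearchParams {
  q: string;
  page?: number;
  limit?: number;
  facets?: Record<string, string[]>; // e.g. { category: ["tutorial"] }
}

function buildSearchUrl({ q, page = 1, limit = 20, facets = {} }: SearchParams): string {
  // Flatten { category: ["tutorial"], author: ["john"] } into
  // "category:tutorial,author:john" as described in the request parameters.
  const facetPairs: string[] = [];
  Object.keys(facets).forEach((facet) => {
    (facets[facet] ?? []).forEach((value) => facetPairs.push(`${facet}:${value}`));
  });
  const parts = [`q=${encodeURIComponent(q)}`, `page=${page}`, `limit=${limit}`];
  if (facetPairs.length > 0) {
    parts.push(`facets=${encodeURIComponent(facetPairs.join(","))}`);
  }
  return `/api/search?${parts.join("&")}`;
}
```

Encoding the query with `encodeURIComponent` keeps user input such as `&` or `#` from changing the meaning of the URL.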

Autocomplete/Suggestions API:

GET /api/search/suggestions?q={query}&limit={limit}

Response:
{
  suggestions: string[];
  recentSearches?: string[];
  popularSearches?: string[];
}

Document Detail API:

GET /api/documents/:documentId

Response:
{
  id: string;
  title: string;
  content: string;                // Full content or HTML
  author: string;
  category: string;
  publishedAt: string;
  updatedAt: string;
  tags: string[];
  relatedDocuments?: string[];     // Document IDs
}

Search Analytics API:

POST /api/search/analytics
{
  query: string;
  resultClicked?: string;          // Document ID if clicked
  timeSpent?: number;             // milliseconds
}

Response: { success: boolean; }

Type definitions used in contracts

interface SearchResult {
  id: string;
  title: string;
  snippet: string;                // Highlighted snippet with <mark> tags
  url: string;
  category: string;
  author: string;
  publishedAt: string;
  tags: string[];
  relevanceScore: number;
}

interface SearchSuggestion {
  value: string;
  source: 'recent' | 'popular' | 'query';
}

interface SearchAnalyticsPayload {
  query: string;
  resultClicked?: string;
  timeSpent?: number;
}

3) Integration patterns (React wiring)

  • Debounced intent pipeline: input changes produce request intent after debounce.
  • Stale response protection: guard state updates by query/request token.
  • Facet + result coherence: keep selected filters and result cache in sync.
  • Keyboard-first navigation: preserve focus and ARIA semantics across rerenders.

Integration Patterns

Debounced querying

Convert rapid input into stable request intents.

Token-based guards

Ignore stale responses that no longer match active query state.

Facet coherence

Keep facet selection and result cache snapshots aligned.

A11y-first focus flow

Maintain predictable keyboard and screen-reader behavior.
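
The token-based guard can be illustrated with a small helper. This is a sketch, and the `createRequestGuard` name and its methods are hypothetical: each request takes a token when it starts, and its response is applied only if that token is still the latest.

```typescript
function createRequestGuard() {
  let latestToken = 0;
  return {
    // Call when a request starts; the returned token identifies that request.
    begin(): number {
      latestToken += 1;
      return latestToken;
    },
    // Call when a response arrives; false means a newer request has started.
    isCurrent(token: number): boolean {
      return token === latestToken;
    },
  };
}

// Usage sketch (fetchResults and setResults are placeholders):
//   const guard = createRequestGuard();
//   async function runSearch(query: string) {
//     const token = guard.begin();
//     const results = await fetchResults(query);
//     if (guard.isCurrent(token)) setResults(results); // drop stale responses
//   }
```

Where the transport supports it, aborting the superseded request (for example with `AbortController`) complements this guard by also saving bandwidth.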

Search Performance Optimizations

1. Debouncing Search Input

Problem: API call on every keystroke = wasted requests

Solution: Wait for typing pause

User types: r-e-a-c-t (5 keystrokes)

Without debounce: 5 API calls ❌
With debounce (200ms): 1 API call ✓

Debounce timing:

  • Autocomplete: 150-200ms (fast feedback)
  • Full search: 300-400ms (more deliberate)
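
A minimal debounce helper, written from scratch for illustration rather than taken from a library, could look like this. The `debounce` name and the `cancel` method are assumptions.

```typescript
// Delays calls to `fn` until `waitMs` has passed without a new call.
function debounce<A extends unknown[]>(fn: (...args: A) => void, waitMs: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const run = (...args: A): void => {
    // Every new call resets the timer, so only the last call in a burst fires.
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };

  // Drop a pending call, for example when the component unmounts.
  const cancel = (): void => {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
  };

  return Object.assign(run, { cancel });
}

// Usage sketch: one request after the user pauses for 200ms.
//   const requestSuggestions = debounce((q: string) => fetchSuggestions(q), 200);
//   input.addEventListener("input", () => requestSuggestions(input.value));
```

Calling `cancel` on unmount avoids state updates on a component that no longer exists.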

2. Client-Side Index for Instant Results

Index recently viewed documents locally:

Document viewed → Add to local index

User searches ────────►│
                       ├── Check local index (< 50ms)
                       │         │
                       │         ▼
                       │   Show instant results
                       └── Query server (parallel)
                           Merge & update

What to index locally:

  • Recently viewed docs (last 100)
  • Frequently accessed docs
  • User's own content
  • Search history
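
Capping the local index at the most recently viewed documents can be sketched with a small LRU structure. The `RecentDocs` class is hypothetical and relies on `Map` preserving insertion order. The default capacity of 100 matches the figure above.

```typescript
class RecentDocs<T> {
  private readonly entries = new Map<string, T>();
  private readonly capacity: number;

  constructor(capacity = 100) {
    this.capacity = capacity;
  }

  // Record a view. Returns the id evicted to stay within capacity, if any,
  // so the caller can also remove that document from the search index.
  touch(id: string, doc: T): string | undefined {
    this.entries.delete(id); // re-inserting moves the id to the newest position
    this.entries.set(id, doc);
    if (this.entries.size > this.capacity) {
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
      return oldest;
    }
    return undefined;
  }

  // Ids ordered from least to most recently viewed.
  ids(): string[] {
    return Array.from(this.entries.keys());
  }
}
```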

3. Fuzzy Matching for Typos

Problem: "deploment" doesn't match "deployment"

Solution: Levenshtein distance tolerance

Query: "deploment"
     Calculate edit distance to all terms
     "deployment" - distance 1  (within threshold)
     "development" - distance 3  (too far)

Implementation options:

  • Trigram index: "men" → "deployment", "development"
  • N-gram tokenization: "dep", "epl", "plo", "loy", ...
  • Levenshtein automaton for efficient matching
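
The distance check and a trigram pre-filter can be sketched together as below. This version scans the vocabulary instead of using a precomputed trigram index, and the function names are illustrative. Terms shorter than three characters produce no trigrams and are skipped by the pre-filter.

```typescript
// Classic dynamic-programming edit distance, keeping only two rows.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost, // substitution or match
      ));
    }
    prev = curr;
  }
  return prev[b.length];
}

function trigrams(term: string): string[] {
  const grams: string[] = [];
  for (let i = 0; i + 3 <= term.length; i++) grams.push(term.slice(i, i + 3));
  return grams;
}

// Cheap filter first (shares at least one trigram), exact distance second.
function fuzzyMatch(query: string, vocabulary: string[], maxDistance = 1): string[] {
  const queryGrams = trigrams(query);
  return vocabulary.filter((term) => {
    const sharesTrigram = trigrams(term).some((g) => queryGrams.indexOf(g) !== -1);
    return sharesTrigram && levenshtein(query, term) <= maxDistance;
  });
}
```

With a real trigram index, the `sharesTrigram` scan would be replaced by lookups from each query trigram to its candidate terms.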

4. Highlighting Matches

Show users WHY a result matched:

Query: "react hooks"

Before: "Learn about React Hooks and state management"
After:  "Learn about <mark>React</mark> <mark>Hooks</mark> and state management"

Implementation:

  • Tokenize query into terms
  • Find term positions in result
  • Wrap matches with highlight tags
  • Extend context around matches (50 chars each side)
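
The first three steps can be sketched as a function that escapes HTML before wrapping matches, so that highlighted snippets stay safe to render. The function names are assumptions, and the context-window step (50 characters each side) is left out for brevity.

```typescript
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function highlight(text: string, query: string): string {
  const terms = query
    .split(/\s+/)
    .filter((t) => t.length > 0)
    .map(escapeRegExp);
  if (terms.length === 0) return escapeHtml(text);

  // Splitting on a regex with one capture group puts matches at odd indices.
  const pattern = new RegExp(`(${terms.join("|")})`, "gi");
  return text
    .split(pattern)
    .map((part, i) =>
      i % 2 === 1 ? `<mark>${escapeHtml(part)}</mark>` : escapeHtml(part),
    )
    .join("");
}
```

For example, `highlight("Learn about React Hooks and state management", "react hooks")` wraps `React` and `Hooks` in `<mark>` tags while preserving their original casing.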

5. Facet Count Optimization

Problem: Counting facets for 100K docs = slow

Solution: Pre-compute + approximate counts

Strategy 1: Pre-aggregated counts (cached)
  { "tutorials": 1234, "guides": 567, ... }

Strategy 2: Approximate for large sets
  If results > 10K, show "10K+" instead of an exact count

Strategy 3: Lazy load counts
  Show results first, load counts async
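
As a rough illustration, counting facet values over an already-loaded result set and capping the displayed number might look like this. The helper names are hypothetical, and the cap label here is derived from the threshold, which is an assumption about the desired display.

```typescript
// Count how many loaded documents fall into each facet value.
function countFacet<T>(docs: T[], getValue: (doc: T) => string): Record<string, number> {
  // Object.create(null) avoids collisions with keys such as "constructor".
  const counts: Record<string, number> = Object.create(null);
  for (const doc of docs) {
    const value = getValue(doc);
    counts[value] = (counts[value] ?? 0) + 1;
  }
  return counts;
}

// Show an approximate label once the count passes the cap.
function formatFacetCount(count: number, cap = 10000): string {
  return count > cap ? `${Math.floor(cap / 1000)}K+` : String(count);
}
```

Counts computed this way only cover documents the client has loaded; full-corpus counts still come from the server's pre-aggregated values.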

Performance Targets

Metric         Target     Technique
Autocomplete   < 100ms    Local index + debounce
Full search    < 500ms    Server-side (Elasticsearch)
Fuzzy match    < 50ms     Trigram index
Facet counts   < 200ms    Pre-aggregation
Index update   < 5s       Web Worker background

Why Local Index + Server Search?

                    ┌──────────────────────────────┐
                    │      Query Complexity        │
                    │                              │
                    │   Simple ───────► Complex    │
                    │                              │
                    │   Local       Server         │
                    │   Index       Search         │
                    │                              │
                    │   • Fast      • Complete     │
                    │   • Limited   • Slow         │
                    │   • Offline   • Online only  │
                    └──────────────────────────────┘

Optimal: Use both!
- Local for instant feedback
- Server for comprehensive results
- Merge for best of both worlds

Why This Design Works

Hybrid Search Architecture
  • Local index: Instant results for cached content
  • Server search: Complete results across full corpus
  • Progressive loading: Fast feedback → refined results
Inverted Index
  • O(1) term lookup: Don't scan all documents
  • Position tracking: Enable phrase search and highlighting
  • Term frequency: Enable relevance scoring
Web Workers
  • Non-blocking: UI stays responsive during indexing
  • Parallel search: Main thread handles typing, worker handles search
  • Large index support: Can index thousands of documents
Fuzzy Matching
  • Typo tolerance: "deploment" matches "deployment"
  • N-gram index: Fast fuzzy lookup
  • Configurable distance: 1-2 characters for short words
BM25 Scoring
  • Document length normalization: Fair ranking across varied lengths
  • Term frequency saturation: Diminishing returns for repeated terms
  • Industry standard: Used by Elasticsearch, Lucene

Key Takeaways

  1. Use hybrid search - local index for speed, server for completeness
  2. Build an inverted index - O(1) term lookup vs O(n) scan
  3. Offload indexing to Web Workers - keep UI responsive
  4. Debounce input - 200-300ms to reduce API calls
  5. Implement fuzzy matching - trigram or n-gram indexes tolerate typos
  6. Highlight matches - show users why results matched
  7. Progressive results - show local first, refine with server
  8. Use BM25 for relevance scoring - handles document length better than TF-IDF
  9. Persist index - IndexedDB for offline search