Chunking Guidelines

A breakdown of how to configure chunking settings for optimal performance in search, retrieval, summarization, and embedding pipelines.

What is Chunking?

Chunking breaks down large documents into smaller, manageable pieces that can be processed more effectively by AI models. This is essential for search and retrieval systems, embedding generation, and staying within model token limits.

Chunking Parameters Explained

Setting	Description
`Chunk Size`	Max size (in tokens or characters) for each chunk
`Chunk Overlap`	Overlapping size between chunks to preserve context
`Minimum Characters per Sentence`	Prevents splitting tiny, noisy sentences
`Minimum Sentences per Chunk`	Prevents creation of unhelpful, short fragments

Chunk Size

Controls the granularity of each chunk.

Use Case	Recommended Chunk Size
Document Search	`256–512`
Chatbot Memory	`128–256`
Legal/Financial Docs	`512–768`
Code Embedding	`128–256`
Academic Papers	`384–768`

Example:
Chunk Size = 512 → Includes ~2–3 paragraphs for better semantic understanding.

Chunk Overlap

Adds shared characters between adjacent chunks to retain context.

Use Case	Recommended Overlap
Question-Answer Bots	`64–128`
General Search	`32–64`
RAG Pipelines	`128+`
Speed-Critical Systems	`0` (no overlap)

Example:
Chunk Size = 256, Overlap = 64 → Smooth "sliding window."

Minimum Characters per Sentence

Filters out fragmented or irrelevant sentences.

Use Case	Recommended Min Characters
General Text	`50–64`
Technical Notes	`30–40`
Spoken Dialogue	`15–20`

Ensures each sentence carries meaningful information.

Minimum Sentences per Chunk

Prevents tiny, contextless chunks.

Use Case	Recommended Min Sentences
Support Logs	`2–3`
Wikipedia-style Text	`1`
Transcripts	`3–5`

1 = flexible chunking, 3+ = ensures more coherent units of thought.

Use Case Presets (Quick Reference)

Use Case	Chunk Size	Overlap	Min Sentences	Notes
Chatbots (FAQ)	`256`	`64`	`1`	Fast + good recall
Search on Long Docs	`512`	`128`	`2`	Preserve topic continuity
RAG Pipeline for QA	`384`	`128`	`3`	Chunks remain LLM-friendly
Real-Time Summarization	`128`	`32`	`1`	Keeps results concise
Support Transcripts	`512`	`64`	`3`	Captures complete interactions
Legal/Policy Documents	`768`	`128`	`2`	Avoids mid-clause cuts
Coding Documentation	`256`	`64`	`1`	Logical function or block splits

Embedding Model Considerations

Different embedding models have different token limits that affect your chunking strategy.

Model-Specific Token Limits

Model Family	Max Tokens	Recommended Chunk Size	Best For
E5 Models (small/base/large)	`512`	`384–448`	General purpose, multilingual
ModernBERT Models	`512`	`384–448`	BERT-based tasks
Snowflake Arctic Embed	`8192`	`768–2048`	Long documents, transcripts

Avoiding Truncation

Critical: Ensure your chunk size + overlap does not exceed the model's maximum token limit.

Exceeding limits causes truncation at the embedding stage
Truncation leads to context loss and reduced semantic understanding
Always verify: chunk_size + overlap ≤ model_max_tokens

Example for E5 models:
Chunk Size = 400, Overlap = 100 → Total = 500 tokens (within 512 limit)

Configuration Guidelines

Choosing Settings

Start with use case presets then adjust based on your specific content and performance requirements
Balance chunk size and overlap to maintain context while avoiding redundancy
Consider your content type when setting minimum sentence and character thresholds
Account for embedding model token limits to avoid truncation and context loss

Common Considerations

Smaller chunks: Better precision, may lose context
Larger chunks: Better context, may exceed model limits
More overlap: Better context preservation, more storage/processing
Less overlap: Faster processing, potential context loss at boundaries

Chunking Guidelines

What is Chunking?

Chunking Parameters Explained

Chunk Size

Chunk Overlap

Minimum Characters per Sentence

Minimum Sentences per Chunk

Use Case Presets (Quick Reference)

Embedding Model Considerations

Model-Specific Token Limits

Avoiding Truncation

Configuration Guidelines

Choosing Settings

Common Considerations

Some Helpful Links

<%= article.name %>

Search

Chunking Guidelines

What is Chunking?

Chunking Parameters Explained

Chunk Size

Chunk Overlap

Minimum Characters per Sentence

Minimum Sentences per Chunk

Use Case Presets (Quick Reference)

Embedding Model Considerations

Model-Specific Token Limits

Avoiding Truncation

Configuration Guidelines

Choosing Settings

Common Considerations

Some Helpful Links

<%= article.name %>