Chunking
Splitting documents into smaller segments for processing and retrieval.
Definition
Chunking is the process of dividing documents into smaller segments that an AI system can embed, index, and retrieve. Effective chunking preserves semantic meaning while keeping each piece small enough to fit an embedding model's input limit and a language model's context window. Common strategies include fixed-size chunks, sentence-based splitting, splitting on paragraph boundaries, and semantic chunking that respects document structure. Poor chunking can split important context across segments, which degrades retrieval quality.
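As a rough illustration of two of these strategies, the following Python sketch shows fixed-size chunking and sentence-based splitting. The function names, the character-based size limits, the optional overlap parameter, and the regular expression used to detect sentence boundaries are all assumptions made for this example; production systems typically measure size in tokens and use a more robust sentence splitter.

```python
import re


def fixed_size_chunks(text: str, size: int = 200, overlap: int = 0) -> list[str]:
    """Split text into windows of `size` characters.

    `overlap` (an illustrative extra, not required by the definition) repeats
    the last few characters of one chunk at the start of the next.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def sentence_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Pack whole sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` is kept whole rather than cut.
    """
    # Naive boundary detection: whitespace that follows '.', '!' or '?'.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "Chunking splits documents. Each chunk is embedded. Retrieval finds chunks."
    print(fixed_size_chunks(sample, size=30))
    print(sentence_chunks(sample, max_chars=55))
```

The fixed-size version can cut a sentence in the middle, which is the kind of split context the definition warns about; the sentence-based version avoids this at the cost of uneven chunk lengths.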
Related terms
More in Data Infrastructure
Embedding
A numerical representation of text that captures its semantic meaning.
Vector Database
A database optimised for storing and querying high-dimensional vector data.
Knowledge Graph
A structured representation of entities and their relationships.
Reranking
Reordering search results using a more sophisticated model to improve relevance.