Overview

Text splitters define how input text is divided into smaller pieces before those pieces are composed into chunks.

Every splitter follows a shared interface, so you can replace built-in strategies with custom implementations without changing your chunker API.

Built-in basic splitters

Chunkipy includes delimiter-based splitters in chunkipy.text_splitters.
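To illustrate what delimiter-based splitting means, here is a minimal sketch in plain Python. It is not Chunkipy's implementation: the function name split_on_delimiter and the choice to keep the delimiter attached to the preceding piece are assumptions made for this example only.

```python
import re


def split_on_delimiter(text: str, delimiter: str) -> list[str]:
    """Split text right after each occurrence of delimiter.

    Hypothetical helper for illustration; not part of Chunkipy.
    The delimiter stays attached to the piece before it, so joining
    the pieces reproduces the original text exactly.
    """
    if not delimiter:
        raise ValueError("delimiter must be a non-empty string")
    # A lookbehind matches the empty position just after the delimiter,
    # so re.split cuts there without consuming any characters.
    pattern = f"(?<={re.escape(delimiter)})"
    # Drop empty strings, e.g. when the text ends with the delimiter.
    return [part for part in re.split(pattern, text) if part]


pieces = split_on_delimiter("one. two. three", ".")
print(pieces)
```

Because no characters are dropped, the same input always yields the same pieces, which is the deterministic behaviour that makes delimiter-based splitting attractive.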

Semantic sentence splitters

For NLP-aware sentence segmentation, use the spaCy splitter or the Stanza splitter.

When to use what

  • Use basic splitters for lightweight, deterministic splitting.

  • Use the spaCy splitter when your project already uses spaCy pipelines.

  • Use the Stanza splitter when Stanza's language coverage fits your workflow.

  • Implement a custom splitter by extending BaseTextSplitter when your domain needs specific rules.
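The custom-splitter option can be sketched as follows. This example does not import Chunkipy: StandInTextSplitter is a hypothetical stand-in for BaseTextSplitter, and the split method name and signature are assumptions, so check the real base class before adapting it. ClauseSplitter and its numbered-clause rule are likewise invented for illustration.

```python
import re
from abc import ABC, abstractmethod


class StandInTextSplitter(ABC):
    """Hypothetical stand-in for Chunkipy's BaseTextSplitter.

    The real base class may use a different method name or signature.
    """

    @abstractmethod
    def split(self, text: str) -> list[str]:
        """Return the pieces that the input text is divided into."""


class ClauseSplitter(StandInTextSplitter):
    """Example domain rule: start a new piece at clause numbers like '1.2 '."""

    # Lookahead matches the empty position just before a clause number,
    # so the number stays at the start of its own piece.
    _CLAUSE_START = re.compile(r"(?=\b\d+\.\d+\s)")

    def split(self, text: str) -> list[str]:
        # Drop the empty string produced when the text begins with a clause.
        return [part for part in self._CLAUSE_START.split(text) if part]


def count_pieces(splitter: StandInTextSplitter, text: str) -> int:
    """Works with any splitter that follows the shared interface."""
    return len(splitter.split(text))


document = "1.1 Scope applies. 1.2 Terms follow."
print(ClauseSplitter().split(document))
print(count_pieces(ClauseSplitter(), document))
```

Code such as count_pieces depends only on the interface, which is why one splitter can be replaced with another without changing the calling code.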