Custom text splitters
Create a custom splitter when built-in separator/sentence splitters are not enough for your domain.
Base class
Extend BaseTextSplitter and implement _split(self, text: str) -> list[str].
Minimal example
from chunkipy import RecursiveTextChunker
from chunkipy.text_splitters.base_text_splitter import BaseTextSplitter


class ArrowTextSplitter(BaseTextSplitter):
    def _split(self, text: str) -> list[str]:
        return [part.strip() for part in text.split("->") if part.strip()]


splitter = ArrowTextSplitter()
chunker = RecursiveTextChunker(chunk_size=50, text_splitters=[splitter])

text = "part one -> part two -> part three"
chunks = chunker.chunk(text)
Guidelines
Return meaningful segments, not single characters, unless explicitly needed.
Keep splitter output deterministic: the same input should always produce the same segments.
Unit-test separator edge cases (leading, trailing, and consecutive separators) on texts from your domain.
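One way to follow these guidelines is to check the splitting logic on its own, before it is wired into a chunker. The sketch below is only an illustration: `split_on_arrow`, `EDGE_CASES`, and `check_edge_cases` are hypothetical names, and the function simply repeats the body of `ArrowTextSplitter._split` from the minimal example so that the checks run without chunkipy installed.

```python
def split_on_arrow(text: str) -> list[str]:
    """Hypothetical stand-alone copy of the ArrowTextSplitter._split logic."""
    return [part.strip() for part in text.split("->") if part.strip()]


# Inputs that exercise separators at the edges of the text,
# paired with the segments we expect back.
EDGE_CASES = {
    "": [],                              # empty input
    "->": [],                            # separator only
    "-> leading": ["leading"],           # separator at the start
    "trailing ->": ["trailing"],         # separator at the end
    "a ->-> b": ["a", "b"],              # consecutive separators
    "no separator": ["no separator"],    # nothing to split on
}


def check_edge_cases() -> None:
    for text, expected in EDGE_CASES.items():
        first = split_on_arrow(text)
        second = split_on_arrow(text)
        # No empty or whitespace-only segments should survive.
        assert first == expected, f"unexpected split for {text!r}: {first}"
        # Same input, same output: the splitter is deterministic.
        assert first == second


check_edge_cases()
```

In a real project the same cases would probably live in a test module and call the `_split` method of your splitter instance directly, with the inputs replaced by samples from your own domain.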