chunkipy.text_splitters

Public text splitter classes exposed by chunkipy.text_splitters.

class chunkipy.text_splitters.BaseTextSplitter

Bases: ABC

Base class for splitter strategies that divide text into smaller pieces.

split(text)

Template method for splitting text. Validates the input and delegates the actual splitting logic to the subclass.

Parameters:

text (str) – The text to be split.

Returns:

A list of text parts.

Return type:

list[str]
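
The template-method arrangement described for split() can be sketched roughly as follows. This is an illustrative stand-in, not chunkipy's source: the class names, the `_split` hook name, and the specific validation rules (type check, empty-text shortcut) are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class SketchBaseSplitter(ABC):
    """Hypothetical stand-in for BaseTextSplitter; not the library's code."""

    def split(self, text: str) -> list[str]:
        # Assumed validation: reject non-string input, return early on empty text.
        if not isinstance(text, str):
            raise TypeError("text must be a str")
        if not text:
            return []
        # Delegate the actual splitting logic to the subclass hook.
        return self._split(text)

    @abstractmethod
    def _split(self, text: str) -> list[str]:
        """Hypothetical hook name; subclasses put their splitting logic here."""


class LineSketchSplitter(SketchBaseSplitter):
    """Hypothetical subclass that splits on line boundaries."""

    def _split(self, text: str) -> list[str]:
        # keepends=True leaves each newline attached to the line it ends.
        return text.splitlines(keepends=True)


if __name__ == "__main__":
    print(LineSketchSplitter().split("first\nsecond"))
```

Keeping validation in the base class means each subclass only has to answer one question: where the text is cut.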

class chunkipy.text_splitters.ColonTextSplitter

Bases: SeparatorTextSplitter

Split text on ``: `` boundaries.

class chunkipy.text_splitters.CommaTextSplitter

Bases: SeparatorTextSplitter

Split text on ``, `` boundaries.

class chunkipy.text_splitters.FullStopTextSplitter

Bases: SeparatorTextSplitter

Split text on ``. `` sentence-like boundaries.

class chunkipy.text_splitters.NewlineTextSplitter

Bases: SeparatorTextSplitter

Split text on newline boundaries.

class chunkipy.text_splitters.SemicolonTextSplitter

Bases: SeparatorTextSplitter

Split text on ``; `` boundaries.

class chunkipy.text_splitters.SeparatorTextSplitter(separator)

Bases: BaseTextSplitter

Split text using a fixed separator while preserving the separator.

Parameters:

separator (str) – The delimiter used to split the text.

property separator: str

Return the delimiter used by the splitter.
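
One way to picture a separator-preserving splitter, and how a punctuation-specific subclass might fix its delimiter, is sketched below. The class names are hypothetical, and attaching the separator to the end of the preceding part is an assumption about what "preserving the separator" means here; the library may behave differently.

```python
class SketchSeparatorSplitter:
    """Hypothetical stand-in for SeparatorTextSplitter; not the library's code."""

    def __init__(self, separator: str) -> None:
        if not separator:
            raise ValueError("separator must be a non-empty string")
        self._separator = separator

    @property
    def separator(self) -> str:
        # Read-only access to the delimiter, mirroring the documented property.
        return self._separator

    def split(self, text: str) -> list[str]:
        pieces = text.split(self._separator)
        # Re-attach the separator to every piece except the last (assumed behavior).
        parts = [piece + self._separator for piece in pieces[:-1]]
        # Keep the final piece only if the text did not end with the separator.
        if pieces[-1]:
            parts.append(pieces[-1])
        return parts


class SketchCommaSplitter(SketchSeparatorSplitter):
    """Hypothetical subclass that fixes the separator to ', '."""

    def __init__(self) -> None:
        super().__init__(", ")


if __name__ == "__main__":
    splitter = SketchCommaSplitter()
    print(splitter.separator)
    print(splitter.split("one, two, three"))
```

Because every separator stays attached to a part, joining the parts reproduces the input exactly, which is convenient when parts are later regrouped into chunks.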

class chunkipy.text_splitters.WordTextSplitter

Bases: SeparatorTextSplitter

Split text on spaces while preserving trailing whitespace.
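
The behavior of keeping trailing whitespace attached to each word can be illustrated with a small regex-based sketch. The function name and the pattern are assumptions chosen for the example; in particular, treating only the space character as whitespace and grouping runs of spaces with the preceding word are guesses, not documented behavior.

```python
import re

# A word (run of non-space characters) plus any spaces that follow it,
# or a run of leading spaces that has no word in front of it.
_WORD_WITH_TRAILING_SPACES = re.compile(r"[^ ]+ *| +")


def split_words_keep_spaces(text: str) -> list[str]:
    """Hypothetical helper: split on spaces, keeping trailing spaces on each word."""
    return _WORD_WITH_TRAILING_SPACES.findall(text)


if __name__ == "__main__":
    text = "Hello  big world"
    parts = split_words_keep_spaces(text)
    print(parts)
    # Nothing is dropped, so the parts join back into the original text.
    print("".join(parts) == text)
```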

Modules

base_text_splitter

basic_text_splitters

semantic

Semantic splitter abstractions and implementations.