chunkipy.text_chunker.text_chunker

Classes

TextChunker([chunk_size, size_estimator, ...])

class chunkipy.text_chunker.text_chunker.TextChunker(chunk_size=None, size_estimator=None, overlap_ratio=0.0, text_splitters=[])

Bases: object

Parameters:
chunk(text)

Chunk the provided text into smaller parts based on the configured chunk size and overlap.

Parameters:

text (str) – The text to be chunked

Returns:

A list containing the chunks and, for each chunk, the list of text parts that make it up.

Return type:

Chunks
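
The interplay of chunk size and overlap can be illustrated with a minimal, standalone sketch. It does not use chunkipy and is not its implementation: `chunk_parts` and `count_words` are hypothetical names, sizes are assumed to be word counts, and the overlap is assumed to be a budget of `int(chunk_size * overlap_ratio)` size units carried over from the end of the previous chunk.

```python
from typing import Callable, List


def count_words(text: str) -> int:
    """Hypothetical size estimator: one size unit per whitespace-separated word."""
    return len(text.split())


def chunk_parts(
    parts: List[str],
    chunk_size: int,
    overlap_ratio: float = 0.0,
    size_estimator: Callable[[str], int] = count_words,
) -> List[List[str]]:
    """Greedily group parts into chunks, repeating trailing parts as overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0.0 <= overlap_ratio < 1.0:
        raise ValueError("overlap_ratio must be in [0, 1)")

    # Assumption: the overlap is a size budget derived from the chunk size.
    overlap_budget = int(chunk_size * overlap_ratio)
    chunks: List[List[str]] = []
    current: List[str] = []
    current_size = 0

    for part in parts:
        size = size_estimator(part)
        if current and current_size + size > chunk_size:
            chunks.append(current)
            # Carry over as many trailing parts as fit in the overlap budget.
            carried: List[str] = []
            carried_size = 0
            for previous in reversed(current):
                previous_size = size_estimator(previous)
                if carried_size + previous_size > overlap_budget:
                    break
                carried.insert(0, previous)
                carried_size += previous_size
            current, current_size = carried, carried_size
        current.append(part)
        current_size += size

    if current:
        chunks.append(current)
    return chunks


words = "a b c d e f g".split()
print(chunk_parts(words, chunk_size=4, overlap_ratio=0.25))
# [['a', 'b', 'c', 'd'], ['d', 'e', 'f', 'g']]
```

Each inner list holds the parts that make up one chunk, so the chunk text can be rebuilt with `" ".join(parts)`. A part larger than `chunk_size` ends up in an oversized chunk in this sketch; a real chunker would split it further first.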

split_text(text)

Split the provided text into smaller parts based on the configured text splitters and chunk size.

Parameters:

text (str) – The text to be split.

Yields:

Generator[TextPart, None, None] – A generator yielding TextPart objects, each containing a piece of text and its estimated size.

Return type:

Generator[TextPart, None, None]
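
As a rough illustration of splitter-driven splitting, the following standalone sketch yields parts lazily and only falls back to a finer splitter when a piece is still larger than the chunk size. It is an assumption about the general technique, not chunkipy's code: the `TextPart` dataclass here is a hypothetical stand-in with `text` and `size` fields, and `split_sentences`, `split_words`, and the free function `split_text` are invented for the example.

```python
import re
from dataclasses import dataclass
from typing import Callable, Generator, List, Sequence


@dataclass
class TextPart:
    """Hypothetical stand-in: a piece of text plus its estimated size."""
    text: str
    size: int


def split_sentences(text: str) -> List[str]:
    # Split on whitespace that follows sentence-ending punctuation.
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def split_words(text: str) -> List[str]:
    return text.split()


def split_text(
    text: str,
    chunk_size: int,
    splitters: Sequence[Callable[[str], List[str]]],
    size_estimator: Callable[[str], int] = lambda t: len(t.split()),
) -> Generator[TextPart, None, None]:
    """Yield parts that fit chunk_size, trying coarser splitters first."""
    size = size_estimator(text)
    # Stop when the text fits, or when no finer splitter is left to try.
    if size <= chunk_size or not splitters:
        if text:
            yield TextPart(text, size)
        return
    first, rest = splitters[0], splitters[1:]
    for piece in first(text):
        yield from split_text(piece, chunk_size, rest, size_estimator)


text = "One two three. Four five six seven eight."
for part in split_text(text, 3, [split_sentences, split_words]):
    print(part.size, repr(part.text))
```

The first sentence fits within three words and is yielded whole, while the second is too long and is broken down by the word splitter. Because the function is a generator, parts are produced on demand rather than collected into a list up front.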