chunkipy.text_chunker
- class chunkipy.text_chunker.Chunk(overlap=<factory>, content=<factory>)
Bases: object
Represents a single chunk of text, which consists of multiple text parts.
- Parameters:
overlap (Overlap): A list of TextPart objects that make up the chunk's overlap.
content (TextParts): A list of TextPart objects that make up the chunk.
- Computed properties:
text: The full text of the chunk, built by joining the 'text' values of all its text parts.
- property size: int
Calculates and returns the total size of all TextPart objects within text_parts.
- Returns:
The total size of all TextPart objects.
- Return type:
int
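The relationship between a chunk, its parts, and its computed properties can be sketched with plain dataclasses. This is an illustrative stand-in written from the descriptions above, not chunkipy's source. The class names ending in `Sketch` are hypothetical, plain lists replace the `Overlap` and `TextParts` containers, and the assumption that overlap parts come before content parts and count toward `text` and `size` is not confirmed by this page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextPartSketch:
    """Hypothetical stand-in for TextPart: a text fragment and its size."""
    size: int
    text: str


@dataclass
class ChunkSketch:
    """Hypothetical stand-in for Chunk, using plain lists as containers."""
    overlap: List[TextPartSketch] = field(default_factory=list)
    content: List[TextPartSketch] = field(default_factory=list)

    @property
    def text_parts(self) -> List[TextPartSketch]:
        # Assumption: overlap parts come first, followed by the new content.
        return [*self.overlap, *self.content]

    @property
    def text(self) -> str:
        # Join the text of every part into the chunk's full text.
        return "".join(part.text for part in self.text_parts)

    @property
    def size(self) -> int:
        # Sum the sizes of every part.
        return sum(part.size for part in self.text_parts)


chunk = ChunkSketch(
    overlap=[TextPartSketch(size=6, text="Hello ")],
    content=[TextPartSketch(size=6, text="world.")],
)
print(chunk.text, chunk.size)
```

Using `field(default_factory=list)` mirrors the `<factory>` defaults in the class signature: each instance gets its own empty container instead of sharing one mutable default.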
- class chunkipy.text_chunker.Chunks(iterable=(), /)
Bases: list
A list-like collection of chunks with utility methods for aggregation.
Inherits from ‘list’ to act as a standard list, while providing additional methods for aggregated operations.
- class chunkipy.text_chunker.Overlap
Bases: deque
A deque-like collection of TextParts with utility methods for aggregation. Inherits from deque to act as a standard deque, while providing additional methods for aggregated operations (e.g. size).
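The pattern described here, subclassing a built-in container and adding an aggregated `size`, can be sketched as follows. The names `OverlapSketch`, `TextPartSketch`, and `trim_to` are hypothetical and are not part of chunkipy. The trimming helper only illustrates why a deque suits an overlap window (cheap removal from the left). Whether the library trims its overlap this way is an assumption.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TextPartSketch:
    """Hypothetical stand-in for TextPart."""
    size: int
    text: str


class OverlapSketch(deque):
    """Hypothetical deque subclass with an aggregated size property."""

    @property
    def size(self) -> int:
        # Total size of all parts currently held in the deque.
        return sum(part.size for part in self)


def trim_to(overlap: OverlapSketch, max_size: int) -> None:
    """Drop the oldest parts until the total size fits within max_size."""
    while overlap and overlap.size > max_size:
        overlap.popleft()


overlap = OverlapSketch(
    [TextPartSketch(3, "one"), TextPartSketch(3, "two"), TextPartSketch(5, "three")]
)
trim_to(overlap, max_size=8)
print([part.text for part in overlap], overlap.size)
```

Because the class inherits from `deque`, every standard deque operation (`append`, `popleft`, iteration, `len`) keeps working unchanged.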
- class chunkipy.text_chunker.TextChunker(chunk_size=None, size_estimator=None, overlap_ratio=0.0, text_splitters=[])
Bases: object
- Parameters:
chunk_size (int)
size_estimator (BaseSizeEstimator)
overlap_ratio (float)
text_splitters (List[BaseTextSplitter])
- chunk(text)
Chunk the provided text into smaller parts based on the configured chunk size and overlap.
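To make the interaction between chunk size and overlap concrete, here is a deliberately simplified greedy chunker. It is not chunkipy's algorithm: whitespace splitting stands in for the configured `text_splitters`, character count stands in for the `size_estimator`, and the function names `chunk_words` and `tail_within` are hypothetical. The overlap budget is assumed to be `chunk_size * overlap_ratio`, rounded down.

```python
from typing import List


def tail_within(words: List[str], budget: int) -> List[str]:
    """Return the longest run of trailing words whose joined length fits budget."""
    carried: List[str] = []
    for word in reversed(words):
        if len(" ".join([word] + carried)) > budget:
            break
        carried.insert(0, word)
    return carried


def chunk_words(text: str, chunk_size: int, overlap_ratio: float = 0.0) -> List[str]:
    """Greedy word chunking; sizes are measured in characters (an assumption)."""
    overlap_budget = int(chunk_size * overlap_ratio)
    chunks: List[str] = []
    current: List[str] = []  # words of the chunk being built
    fresh = 0                # words in `current` that are not carried-over overlap

    for word in text.split():
        if fresh and len(" ".join(current + [word])) > chunk_size:
            chunks.append(" ".join(current))
            # Start the next chunk with the tail of the previous one.
            current = tail_within(current, overlap_budget)
            fresh = 0
            # Shrink the overlap if it leaves no room for the next word.
            while current and len(" ".join(current + [word])) > chunk_size:
                current.pop(0)
        # A single word longer than chunk_size still becomes its own chunk.
        current.append(word)
        fresh += 1

    if fresh:
        chunks.append(" ".join(current))
    return chunks


sentence = "the quick brown fox jumps over the lazy dog"
no_overlap = chunk_words(sentence, chunk_size=15)
with_overlap = chunk_words(sentence, chunk_size=15, overlap_ratio=0.3)
print(no_overlap)
print(with_overlap)
```

With an overlap ratio of 0.3 and a chunk size of 15, up to 4 characters of trailing text are repeated at the start of the next chunk, so neighbouring chunks share a word where one fits within that budget.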
- class chunkipy.text_chunker.TextPart(size, text)
Bases: object
Represents a fragment or segment of a complete text, along with its character size.
- Parameters:
size
text