chunkipy
- class chunkipy.Chunk(text_parts=<factory>)[source]
Bases: object
Represents a single chunk of text, which consists of multiple text parts.
Computed Properties:
text: The full text of the chunk, formed by joining the text values of its text parts.
text_parts: A list of TextPart objects that make up the chunk.
- property size: int
Calculates and returns the total size of all TextPart objects within text_parts.
Returns: int: The total size of all TextPart objects.
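The relationship between a chunk, its text parts, and the computed text and size values can be illustrated with a small stand-alone sketch. The classes SketchTextPart and SketchChunk below are hypothetical stand-ins written for this example and are not the library's implementation. In particular, joining the parts with an empty separator is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchTextPart:
    """Hypothetical stand-in for a text part: a fragment of text plus its size."""
    text: str
    size: int


@dataclass
class SketchChunk:
    """Hypothetical stand-in for a chunk built from several text parts."""
    text_parts: List[SketchTextPart] = field(default_factory=list)

    @property
    def text(self) -> str:
        # Assumption: parts are joined with no separator between them.
        return "".join(part.text for part in self.text_parts)

    @property
    def size(self) -> int:
        # The total size is the sum of the sizes of all parts.
        return sum(part.size for part in self.text_parts)


parts = [SketchTextPart("Hello, ", 7), SketchTextPart("world.", 6)]
chunk = SketchChunk(text_parts=parts)
print(chunk.text, chunk.size)
```

Storing the size on each part, rather than recomputing it from the text, lets the aggregate work the same way whether sizes are measured in characters or in some other unit.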
- class chunkipy.Chunks(iterable=(), /)[source]
Bases: list
A list-like collection of chunks with utility methods for aggregation.
Inherits from ‘list’ to act as a standard list, while providing additional methods for aggregated operations.
- class chunkipy.Overlapping[source]
Bases: deque
A deque-like collection of TextParts with utility methods for aggregation. Inherits from deque to act as a standard deque, while providing additional methods for aggregated operations (e.g. size).
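As a rough illustration of a deque that also reports an aggregated size, consider the sketch below. SketchOverlapping and Part are hypothetical names used only for this example. The sketch shows the general pattern of subclassing collections.deque and is not the library's code.

```python
from collections import deque
from typing import NamedTuple


class Part(NamedTuple):
    """Hypothetical stand-in for a text part."""
    text: str
    size: int


class SketchOverlapping(deque):
    """A deque of parts that can also report their combined size."""

    @property
    def size(self) -> int:
        # Aggregate over whatever parts are currently in the deque.
        return sum(part.size for part in self)


overlap = SketchOverlapping()
overlap.append(Part("second sentence. ", 17))
overlap.append(Part("third sentence.", 15))
overlap.appendleft(Part("first sentence. ", 16))

# Standard deque operations still work; drop the oldest part from the left.
overlap.popleft()
print(len(overlap), overlap.size)
```

A deque suits this role because parts can be added or dropped cheaply at either end while the overlap window moves through the text.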
- class chunkipy.TextChunker(chunk_size=1000, size_estimator=None, tokens=False, overlap_ratio=0.0, text_splitters=[])[source]
Bases: object
- Parameters:
chunk_size (int)
size_estimator (BaseSizeEstimator)
tokens (bool)
overlap_ratio (float)
text_splitters (List[BaseTextSplitter])
- chunk(text)[source]
Chunk the provided text into smaller parts based on the configured chunk size and overlap.
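To make the interaction between chunk size and overlap ratio concrete, here is a minimal sliding-window sketch. The function sketch_chunk is hypothetical and measures size in whitespace-separated words. It does not reproduce the library's splitter-based strategy or its size estimators. It only shows how an overlap ratio can translate into words shared between consecutive chunks.

```python
from typing import List


def sketch_chunk(text: str, chunk_size: int = 5, overlap_ratio: float = 0.0) -> List[str]:
    """Split text into word windows of chunk_size with a fractional overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0.0 <= overlap_ratio < 1.0:
        raise ValueError("overlap_ratio must be in [0, 1)")

    words = text.split()
    # Number of words repeated at the start of the next chunk.
    overlap = int(chunk_size * overlap_ratio)
    # Advance by at least one word so the loop always makes progress.
    step = max(1, chunk_size - overlap)

    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # The last window already reached the end of the text.
    return chunks


with_overlap = sketch_chunk("a b c d e f g h", chunk_size=4, overlap_ratio=0.5)
without_overlap = sketch_chunk("a b c d e f g h", chunk_size=4, overlap_ratio=0.0)
print(with_overlap)
print(without_overlap)
```

With a chunk size of 4 words and an overlap ratio of 0.5, each chunk in this sketch repeats the last 2 words of the previous one.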
- class chunkipy.TextPart(text, size)[source]
Bases: object
Represents a fragment or segment of a complete text, along with its character size.
Modules