root/local/: graphrag-chunking-3.0.2 metadata and description

Chunking utilities for GraphRAG

author Mónica Carvajal
author_email Alonso Guevara Fernández <alonsog@microsoft.com>, Andrés Morales Esquivel <andresmor@microsoft.com>, Chris Trevino <chtrevin@microsoft.com>, David Tittsworth <datittsw@microsoft.com>, Dayenne de Souza <ddesouza@microsoft.com>, Derek Worthen <deworthe@microsoft.com>, Gaudy Blanco Meneses <gaudyb@microsoft.com>, Ha Trinh <trinhha@microsoft.com>, Jonathan Larson <jolarso@microsoft.com>, Josh Bradley <joshbradley@microsoft.com>, Kate Lytvynets <kalytv@microsoft.com>, Kenny Zhang <zhangken@microsoft.com>, Nathan Evans <naevans@microsoft.com>, Rodrigo Racanicci <rracanicci@microsoft.com>, Sarah Smith <smithsarah@microsoft.com>
classifiers
  • Programming Language :: Python :: 3
  • Programming Language :: Python :: 3.11
  • Programming Language :: Python :: 3.12
  • Programming Language :: Python :: 3.13
description_content_type text/markdown
license MIT
project_urls
  • Source, https://github.com/microsoft/graphrag
requires_dist
  • graphrag-common==3.0.2
  • pydantic~=2.10
requires_python <3.14,>=3.11

Because this project isn't in the mirror_whitelist, no releases from root/pypi are included.

Files

graphrag_chunking-3.0.2-py3-none-any.whl
  • Size: 9 KB
  • Type: Python Wheel
  • Python: 3
  • Replaced 6 time(s)
  • Uploaded to root/local by root 2026-02-19 21:53:49

graphrag_chunking-3.0.2.tar.gz
  • Size: 6 KB
  • Type: Source
  • Replaced 6 time(s)
  • Uploaded to root/local by root 2026-02-19 21:54:01

GraphRAG Chunking

This package contains a collection of text chunkers, a core config model, and a factory for creating chunker instances.

Examples

Basic sentence chunking with nltk

The SentenceChunker class splits text into individual sentences by identifying sentence boundaries. It takes input text and returns a list where each element is a separate sentence, making it easy to process text at the sentence level.
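
To make the idea of sentence-boundary splitting concrete, here is a minimal sketch that does not use this package or nltk at all. The function name split_sentences and the regex rule (split after ".", "!" or "?" followed by whitespace) are assumptions chosen for illustration; a real sentence tokenizer handles abbreviations, quotes, and similar cases far better.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive stand-in for a sentence tokenizer (hypothetical helper).

    Splits on whitespace that follows '.', '!' or '?'. This is only an
    approximation of what a trained tokenizer such as nltk's would do.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop the empty string produced when the input is blank.
    return [part for part in parts if part]


sentences = split_sentences("Chunk the text first. Then index it! Why? For retrieval.")
for sentence in sentences:
    print(sentence)
```

Each element of the returned list is one sentence, which is the same input and output shape the description above gives for SentenceChunker.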

Open the notebook to explore the basic sentence example code

Token chunking

The TokenChunker splits text into fixed-size chunks based on token count rather than sentence boundaries. It uses a tokenizer to encode text into tokens, then creates chunks of a specified size with configurable overlap between chunks.
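
The fixed-size window with overlap can be sketched independently of any tokenizer. The code below is an illustration under stated assumptions, not the TokenChunker implementation: chunk_tokens, size, and overlap are hypothetical names, and whitespace splitting stands in for a real tokenizer's encode step.

```python
def chunk_tokens(tokens: list[str], size: int, overlap: int) -> list[list[str]]:
    """Slide a window of `size` tokens, advancing by `size - overlap` each step.

    Hypothetical helper: a real token chunker would work on token ids from
    a tokenizer and decode each window back to text.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        # Stop once a window reaches the end, so no trailing chunk is
        # emitted that only repeats tokens already covered.
        if start + size >= len(tokens):
            break
    return chunks


# Whitespace splitting is an assumed stand-in for tokenizer.encode().
tokens = "a b c d e f g".split()
for chunk in chunk_tokens(tokens, size=4, overlap=2):
    print(" ".join(chunk))
```

With seven tokens, a size of 4, and an overlap of 2, the window advances two tokens at a time, so consecutive chunks share their last and first two tokens and the final chunk is shorter than the rest.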

Open the notebook to explore the token chunking example code

Using the factory via helper util

The create_chunker factory function instantiates chunkers from a ChunkingConfig object that specifies the chunking strategy and its parameters. Keeping chunker configuration separate from direct instantiation means the strategy can change without touching the calling code.
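
The configuration-driven pattern can be sketched in a few lines. Everything named here is hypothetical and exists only for illustration: DemoChunkingConfig, its strategy and size fields, and demo_create_chunker are not the package's API, and the real ChunkingConfig and create_chunker may have different fields and signatures.

```python
import re
from dataclasses import dataclass
from typing import Callable

# A chunker is modeled here as a plain function from text to chunks.
Chunker = Callable[[str], list[str]]


@dataclass
class DemoChunkingConfig:
    """Hypothetical config; the real ChunkingConfig fields may differ."""

    strategy: str = "sentence"
    size: int = 4  # words per chunk, used only by the "words" strategy


def _make_sentence_chunker(config: DemoChunkingConfig) -> Chunker:
    def chunk(text: str) -> list[str]:
        # Naive split after '.', '!' or '?' followed by whitespace.
        return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    return chunk


def _make_word_chunker(config: DemoChunkingConfig) -> Chunker:
    if config.size <= 0:
        raise ValueError("size must be positive")

    def chunk(text: str) -> list[str]:
        words = text.split()
        return [
            " ".join(words[i:i + config.size])
            for i in range(0, len(words), config.size)
        ]

    return chunk


# Registry mapping a strategy name to the function that builds its chunker.
_BUILDERS: dict[str, Callable[[DemoChunkingConfig], Chunker]] = {
    "sentence": _make_sentence_chunker,
    "words": _make_word_chunker,
}


def demo_create_chunker(config: DemoChunkingConfig) -> Chunker:
    """Look up the builder for config.strategy and return a ready chunker."""
    try:
        builder = _BUILDERS[config.strategy]
    except KeyError:
        raise ValueError(f"unknown strategy: {config.strategy!r}") from None
    return builder(config)


chunker = demo_create_chunker(DemoChunkingConfig(strategy="words", size=2))
print(chunker("a b c d e"))
```

The calling code depends only on the config object and the factory, so switching from word windows to sentences is a change to one field rather than to the call site.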

Open the notebook to explore the factory helper util example code