itext2kg

Name	itext2kg
Version	0.0.9
home_page	None
Summary	Incremental Knowledge Graphs Constructor Using Large Language Models
upload_time	2025-09-01 10:39:22
maintainer	None
docs_url	None
author	Auvalab - Yassir LAIRGI
requires_python	>=3.10
license	None
keywords	kg construction llms neo4j graphs
requirements	No requirements were recorded.

# iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models

![GitHub stars](https://img.shields.io/github/stars/auvalab/itext2kg?style=social)
![GitHub forks](https://img.shields.io/github/forks/auvalab/itext2kg?style=social)
![PyPI](https://img.shields.io/pypi/dm/itext2kg)
![Total Downloads](https://img.shields.io/pepy/dt/itext2kg)
[![Paper](https://img.shields.io/badge/Paper-View-green?style=flat&logo=adobeacrobatreader)](https://arxiv.org/abs/2409.03284)
![PyPI](https://img.shields.io/pypi/v/itext2kg)
[![Demo](https://img.shields.io/badge/Demo-Available-blue)](./examples/)
![Status](https://img.shields.io/badge/Status-Work%20in%20Progress-yellow)

🎉 Accepted @ [WISE 2024](https://wise2024-qatar.com/)

<p align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="./docs/logo_white.png" width="300">
    <source media="(prefers-color-scheme: light)" srcset="./docs/logo_black.png" width="300">
    <img alt="Logo" src="./docs/logo_white.png" width="300">
  </picture>
</p>

## Overview

iText2KG is a Python package designed to incrementally construct consistent knowledge graphs with resolved entities and relations by leveraging large language models for entity and relation extraction from text documents. It features zero-shot capability, allowing for knowledge extraction across various domains without specific training. The package includes modules for document distillation, entity extraction, and relation extraction, ensuring resolved and unique entities and relationships. It continuously updates the KG with new documents and integrates them into Neo4j for visual representation.

## 🔥 News
* [29/07/2025] New Features and Enhanced Capabilities:
  - **iText2KG_Star**: Introduced a simpler and more efficient version of iText2KG that eliminates the separate entity extraction step. Instead of extracting entities and relations separately, iText2KG_Star directly extracts relationships from text, automatically deriving entities from those relationships. This approach is more efficient as it reduces processing time and token consumption and does not need to handle invented/isolated entities.
  - **Facts-Based KG Construction**: Enhanced the framework with facts-based knowledge graph construction using the Document Distiller to extract structured facts from documents, which are then used for incremental KG building. This approach provides more exhaustive and precise knowledge graphs by focusing on factual information extraction.
  - **Dynamic Knowledge Graphs**: iText2KG now supports building dynamic knowledge graphs that evolve over time. By leveraging the incremental nature of the framework and document snapshots with observation dates, users can track how knowledge changes and grows. See example: [Dynamic KG Construction](./examples/building_dynamic_kg_openai_posts.ipynb). **NB: Resolution of temporal/logical conflicts is not handled in this version, but you can apply a post-processing filter to resolve them.**

* [19/07/2025] Major Performance and Reliability Updates:
  - **Asynchronous Architecture**: Complete migration to async/await patterns for all core methods (`build_graph`, `extract_entities`, `extract_relations`, etc.) enabling better performance and non-blocking I/O operations with LLM APIs.
  - **Logging System**: Implemented comprehensive logging infrastructure to replace all print statements with structured, configurable logging (DEBUG, INFO, WARNING, ERROR levels) with timestamps and module identification.
  - **Enhanced Batch Processing**: Improved efficiency through async batch processing for multiple document handling and LLM API calls.
  - **Better Error Handling**: Enhanced error handling and retry mechanisms with proper logging for production environments.

* [07/10/2024] Latest features:
  - The entire iText2KG code has been refactored by adding data models that describe an Entity, a Relation, and a KnowledgeGraph.
  - Each entity is embedded using both its name and label to avoid merging concepts with similar names but different labels. For example, Python:Language and Python:Snake.
    - The weights for entity name embedding and entity label are configurable, with defaults set to 0.4 for the entity label and 0.6 for the entity name.
  - A max_tries parameter has been added to the iText2KG.build_graph function for entity and relation extraction to prevent hallucinatory effects in structuring the output. Additionally, a max_tries_isolated_entities parameter has been added to the same method to handle hallucinatory effects when processing isolated entities.

* [17/09/2024] Latest features: 
  - Now, iText2KG is compatible with all the chat/embeddings models supported by LangChain. For available chat models, refer to the options listed at: https://python.langchain.com/v0.2/docs/integrations/chat/. For embedding models, explore the choices at: https://python.langchain.com/v0.2/docs/integrations/text_embedding/.

  - The constructed graph can be expanded by passing the already extracted entities and relationships as arguments to the `build_graph` function in iText2KG.
  - iText2KG is compatible with all Python versions above 3.9.


* [16/07/2024] We have addressed two major LLM hallucination issues related to KG construction with LLMs when passing the entities list and context to the LLM. These issues are:

  - The LLM might invent entities that do not exist in the provided entities list. We handled this problem by replacing the invented entities with the most similar ones from the input entities list.
  - The LLM might fail to assign a relation to some entities from the input entities list, causing a "forgetting effect." We handled this problem by reprompting the LLM to extract relations for those entities.
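
The two fixes in the 16/07/2024 entry can be illustrated with a minimal sketch. It is not the package's implementation: `repair_relations` is a hypothetical helper, and `difflib` string similarity stands in for whatever similarity measure the package actually uses.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] (assumption: the package may compare embeddings instead)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def repair_relations(relations, known_entities):
    """Replace invented entity names and list entities left without any relation.

    relations: list of (start, relation_name, end) tuples returned by the LLM.
    known_entities: list of entity names that were passed to the LLM.
    """
    repaired = []
    for start, name, end in relations:
        fixed = []
        for node in (start, end):
            if node not in known_entities:
                # Invented entity: fall back to the most similar known entity.
                node = max(known_entities, key=lambda e: string_similarity(node, e))
            fixed.append(node)
        repaired.append((fixed[0], name, fixed[1]))

    # Entities that appear in no relation are candidates for reprompting.
    used = {n for s, _, e in repaired for n in (s, e)}
    forgotten = [e for e in known_entities if e not in used]
    return repaired, forgotten


known = ["OpenAI", "ChatGPT agent", "browser tools"]
llm_output = [("Open AI Inc", "announced", "ChatGPT agent")]
repaired, forgotten = repair_relations(llm_output, known)
print(repaired)   # the invented "Open AI Inc" is mapped to "OpenAI"
print(forgotten)  # "browser tools" has no relation yet
```

In this sketch, the `forgotten` list is what a second prompt would ask the LLM to relate to the rest of the graph.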


## Installation

To install iText2KG, ensure you have **Python 3.10 or higher** installed (the published package metadata declares `requires_python >=3.10`), then install it with pip:

```bash
pip install itext2kg
```

## The Overall Architecture

The ```iText2KG``` package consists of four main modules that work together to construct and visualize knowledge graphs from unstructured text:

1. **Document Distiller**: This module processes raw documents and reformulates them into semantic blocks based on a user-defined schema. It improves the signal-to-noise ratio by focusing on relevant information and structuring it in a predefined format. 

2. **Incremental Entity Extractor**: This module extracts unique entities from the semantic blocks and resolves ambiguities to ensure each entity is clearly defined. It uses cosine similarity measures to match local entities with global entities.

3. **Incremental Relation Extractor**: This module identifies relationships between the extracted entities. It can operate in two modes: using global entities to enrich the graph with potential information or using local entities for more precise relationships. 

4. **Graph Integrator and Visualization**: This module integrates the extracted entities and relationships into a Neo4j database, providing a visual representation of the knowledge graph. It allows for interactive exploration and analysis of the structured data.

![itext2kg](./docs/itext2kg.png)
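
To make the matching step of the Incremental Entity Extractor concrete, here is a minimal sketch of resolving local entities against global ones with cosine similarity. The function names, the toy two-dimensional embeddings, and the "merge if the score reaches the threshold" rule are illustrative assumptions, not the package's code.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_entities(global_entities, local_entities, threshold=0.7):
    """Resolve local entities against the global set.

    Both arguments map an entity name to its embedding (hypothetical layout).
    Returns the updated global set and a mapping local name -> resolved name.
    """
    merged = dict(global_entities)
    mapping = {}
    for name, emb in local_entities.items():
        best_name, best_score = None, -1.0
        for g_name, g_emb in merged.items():
            score = cosine_similarity(emb, g_emb)
            if score > best_score:
                best_name, best_score = g_name, score
        if best_name is not None and best_score >= threshold:
            # Close enough to an existing entity: reuse it.
            mapping[name] = best_name
        else:
            # New concept: add it to the global set.
            merged[name] = emb
            mapping[name] = name
    return merged, mapping


merged, mapping = merge_entities(
    {"OpenAI": [1.0, 0.0]},
    {"Open AI": [0.9, 0.1], "Neo4j": [0.0, 1.0]},
)
print(mapping)
```

Here "Open AI" resolves to the existing "OpenAI" entity, while "Neo4j" is added as a new global entity.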

The LLM is prompted to extract entities representing one unique concept to avoid semantically mixed entities. The following figure presents the entity and relation extraction prompts using the LangChain JSON Parser. They are categorized as follows: Blue - prompts automatically formatted by LangChain; Regular - prompts we have designed; and Italic - specifically designed prompts for entity and relation extraction. Panel (a) shows the prompts for relation extraction and panel (b) the prompts for entity extraction.

![prompts](./docs/prompts_.png)

## Modules and Examples
All the examples are provided in the following Jupyter notebooks:
- [Different LLM Models](./examples/different_llm_models.ipynb) - Basic usage with various language models
- [Dynamic Knowledge Graphs](./examples/building_dynamic_kg_openai_posts.ipynb) - Building evolving KGs with temporal context
- Additional examples showcasing facts extraction, iText2KG_Star, and more advanced features

Now, iText2KG is compatible with all language models supported by LangChain.

To use iText2KG, you will need both a chat model and an embeddings model.

For available chat models, refer to the options listed at: https://python.langchain.com/v0.2/docs/integrations/chat/. For embedding models, explore the choices at: https://python.langchain.com/v0.2/docs/integrations/text_embedding/.

Please ensure that you install the necessary package for each chat model before use.

#### Mistral


For Mistral, please set up your model using the tutorial here: https://python.langchain.com/v0.2/docs/integrations/chat/mistralai/. Similarly, for the embedding model, follow the setup guide here: https://python.langchain.com/v0.2/docs/integrations/text_embedding/mistralai/ .

```python
from langchain_mistralai import ChatMistralAI
from langchain_mistralai import MistralAIEmbeddings

mistral_api_key = "##"
mistral_llm_model = ChatMistralAI(
    api_key = mistral_api_key,
    model="mistral-large-latest",
    temperature=0,
    max_retries=2,
)


mistral_embeddings_model = MistralAIEmbeddings(
    model="mistral-embed",
    api_key = mistral_api_key
)
```

#### OpenAI

The same applies to OpenAI. Set up your chat model using the tutorial at https://python.langchain.com/v0.2/docs/integrations/chat/openai/ and your embedding model using https://python.langchain.com/v0.2/docs/integrations/text_embedding/openai/.

```python
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

openai_api_key = "##"

openai_llm_model = ChatOpenAI(
    api_key=openai_api_key,
    model="gpt-4o",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)

openai_embeddings_model = OpenAIEmbeddings(
    api_key=openai_api_key,
    model="text-embedding-3-large",
)
```

### The ```DocumentDistiller```

The Document Distiller module reformulates raw documents into predefined semantic blocks using LLMs. It uses a schema to guide the extraction of specific information from each document.

Example

```python
import asyncio
from itext2kg import DocumentDistiller
# You can define a schema or use the predefined ones from schemas.py
from itext2kg.models.schemas import Article

async def main():
    # Initialize the DocumentDistiller with llm model.
    document_distiller = DocumentDistiller(llm_model = openai_llm_model)

    # List of documents to be distilled.
    documents = ["doc1", "doc2", "doc3"]

    # Information extraction query.
    IE_query = '''
    # DIRECTIVES : 
    - Act like an experienced information extractor. 
    - You have a chunk of a scientific paper.
    - If you do not find the right information, keep its place empty.
    '''

    # Distill the documents using the defined query and output data structure.
    # Note: distill() is now async and requires await
    distilled_doc = await document_distiller.distill(documents=documents, IE_query=IE_query, output_data_structure=Article)
    
    return distilled_doc

# Run the async function
distilled_doc = asyncio.run(main())
```
The schema depends on the user's specific requirements, as it outlines the essential components to extract or emphasize during the knowledge graph construction. Since there is no universal blueprint for all use cases, its design is subjective and varies by application or context. This flexibility is crucial to making the ```iText2KG``` method adaptable across a wide range of scenarios.

You can define a custom schema using ```pydantic```. Some example schemas are available in [models/schemas.py](./itext2kg/models/schemas.py). You can use these or create new ones depending on your use case.


```python
from typing import List, Optional
from pydantic import BaseModel, Field

# Define an Author model with name and affiliation fields.
class Author(BaseModel):
    name: str = Field(description="The name of the author")
    affiliation: str = Field(description="The affiliation of the author")
    
# Define an Article model with various fields describing a scientific article.
class Article(BaseModel):
    title: str = Field(description="The title of the scientific article")
    authors: List[Author] = Field(description="The list of the article's authors and their affiliation")
    abstract: str = Field(description="The article's abstract")
    key_findings: str = Field(description="The key findings of the article")
    limitation_of_sota: str = Field(description="limitation of the existing work")
    proposed_solution: str = Field(description="The proposed solution in details")
    paper_limitations: str = Field(description="The limitations of the proposed solution of the paper")

```


### Facts-Based Knowledge Graph Construction

For more exhaustive knowledge graphs, you can use facts-based construction by extracting structured facts from documents first, then using these facts for KG building:

```python
import asyncio
from itext2kg import DocumentDistiller
from itext2kg.models.schemas import Facts

async def extract_facts():
    # Initialize the DocumentDistiller
    document_distiller = DocumentDistiller(llm_model=openai_llm_model)
    
    # Your documents
    documents = ["OpenAI announced ChatGPT agent with new capabilities...", 
                 "The new model can perform complex tasks autonomously..."]
    
    # Extract facts from each document
    IE_query = '''
    # DIRECTIVES : 
    - Act like an experienced information extractor. 
    - Extract clear, factual statements from the text.
    '''
    
    facts_list = await asyncio.gather(*[
        document_distiller.distill(
            documents=[doc], 
            IE_query=IE_query, 
            output_data_structure=Facts
        ) for doc in documents
    ])
    
    return facts_list

# Run the async function
facts = asyncio.run(extract_facts())
```
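
The example above starts one `distill` call per document at the same time. With many documents this may run into provider rate limits, so one option is to cap concurrency and retry failed calls. The sketch below uses only `asyncio`; `gather_with_limit` and `fake_distill` are hypothetical helpers and not part of itext2kg.

```python
import asyncio


async def gather_with_limit(factories, max_concurrency=3, max_tries=3, sleep_time=0.0):
    """Run coroutine factories with bounded concurrency and simple retries.

    factories: zero-argument callables that each return a fresh coroutine.
    Results are returned in the same order as the factories.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(factory):
        async with semaphore:
            for attempt in range(1, max_tries + 1):
                try:
                    return await factory()
                except Exception:
                    if attempt == max_tries:
                        raise
                    # Wait before retrying, e.g. after a rate-limit error.
                    await asyncio.sleep(sleep_time)

    return await asyncio.gather(*(run_one(f) for f in factories))


already_failed = set()


async def fake_distill(doc):
    """Stand-in for an LLM call; fails once for the document named 'flaky'."""
    if doc == "flaky" and doc not in already_failed:
        already_failed.add(doc)
        raise RuntimeError("simulated rate limit")
    return doc.upper()


documents = ["a", "flaky", "b"]
results = asyncio.run(
    gather_with_limit([lambda d=d: fake_distill(d) for d in documents], max_concurrency=2)
)
print(results)
```

To apply the same pattern to the facts example, each factory would wrap a `document_distiller.distill(...)` call instead of `fake_distill`.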

### The ```iText2KG_Star``` (Recommended)

iText2KG_Star is a simpler and more efficient version that directly extracts relationships from text and automatically derives entities from those relationships, eliminating the separate entity extraction step:

```python
import asyncio
from itext2kg import iText2KG_Star
from itext2kg.logging_config import setup_logging, get_logger

# Optional: Configure logging
setup_logging(level="INFO", log_file="itext2kg.log")
logger = get_logger(__name__)

async def build_knowledge_graph_star():
    # Initialize iText2KG_Star with the llm model and embeddings model
    itext2kg_star = iText2KG_Star(llm_model=openai_llm_model, embeddings_model=openai_embeddings_model)

    # Your text sections (can be facts from document distiller or raw text)
    sections = [
        "OpenAI announced ChatGPT agent with new capabilities for autonomous task execution.",
        "The new model integrates browser tools and terminal access for comprehensive automation.",
        "ChatGPT agent is rolling out to Pro, Plus, and Team users with enhanced safety measures."
    ]

    logger.info("Starting knowledge graph construction with iText2KG_Star...")
    
    # Build the knowledge graph - entities are automatically derived from relationships
    kg = await itext2kg_star.build_graph(
        sections=sections,
        ent_threshold=0.8,      # Higher threshold for more distinct entities
        rel_threshold=0.7,      # Threshold for relationship merging
        observation_date="2025-01-15"  # Optional: add temporal context
    )
    
    logger.info(f"Knowledge graph completed! Entities: {len(kg.entities)}, Relationships: {len(kg.relationships)}")
    return kg

# Run the async function
kg = asyncio.run(build_knowledge_graph_star())
```

### The ```iText2KG```
The iText2KG module is the original component of the package, responsible for integrating various functionalities to construct the knowledge graph. It uses the distilled semantic sections from documents to extract entities and relationships separately, and then builds the knowledge graph incrementally. 

Although it is highly recommended to pass the documents through the ```Document Distiller``` module, it is not required for graph creation. You can directly pass your chunks into the ```build_graph``` function of the ```iText2KG``` class; however, your graph may contain some noisy information.

```python
import asyncio
from itext2kg import iText2KG
from itext2kg.logging_config import setup_logging, get_logger

# Optional: Configure logging
setup_logging(level="INFO", log_file="itext2kg.log")
logger = get_logger(__name__)

async def build_knowledge_graph():
    # Initialize iText2KG with the llm model and embeddings model.
    itext2kg = iText2KG(llm_model = openai_llm_model, embeddings_model = openai_embeddings_model)

    # Format the distilled document into semantic sections.
    semantic_blocks = [f"{key} - {value}".replace("{", "[").replace("}", "]") for key, value in distilled_doc.items()]

    logger.info("Starting knowledge graph construction...")
    
    # Build the knowledge graph using the semantic sections.
    # Note: build_graph() is now async and requires await
    kg = await itext2kg.build_graph(sections=semantic_blocks)
    
    logger.info("Knowledge graph construction completed successfully!")
    return kg

# Run the async function
kg = asyncio.run(build_knowledge_graph())
```
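
The semantic-block formatting line in the example above can be tried in isolation. The sketch below assumes the distilled document behaves like a dictionary, as the `.items()` call suggests; the sample data is made up. The reason for replacing braces is not stated in the source; one plausible explanation is that curly braces could be mistaken for prompt-template placeholders.

```python
def to_semantic_blocks(distilled: dict) -> list[str]:
    """Turn each key/value pair into a 'key - value' string without curly braces."""
    return [
        f"{key} - {value}".replace("{", "[").replace("}", "]")
        for key, value in distilled.items()
    ]


# Made-up distilled document, shaped like the Article schema.
sample = {
    "title": "iText2KG",
    "authors": [{"name": "A. Author", "affiliation": "Lab"}],
}

blocks = to_semantic_blocks(sample)
for block in blocks:
    print(block)
```

Each resulting string is one element of the `sections` list passed to `build_graph`.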

### Arguments

The Arguments of ```iText2KG_Star``` (Recommended):

- `llm_model`: The language model instance to be used for extracting relationships directly from text.
- `embeddings_model`: The embeddings model instance to be used for creating vector representations of entities and relationships.
- `sleep_time (int)`: The time to wait (in seconds) when encountering rate limits or errors. Defaults to 5 seconds.

The Arguments of ```iText2KG_Star``` method ```build_graph```:

- `sections (List[str])`: A list of strings where each string represents a section of the document from which relationships will be extracted and entities derived.
- `existing_knowledge_graph (KnowledgeGraph, optional)`: An existing knowledge graph to merge with. Default is None.
- `ent_threshold (float, optional)`: The threshold for entity matching when merging sections. Default is 0.7.
- `rel_threshold (float, optional)`: The threshold for relationship matching when merging sections. Default is 0.7.
- `max_tries (int, optional)`: The maximum number of attempts to extract relationships. Defaults to 5.
- `entity_name_weight (float)`: The weight of the entity name in matching. Default is 0.6.
- `entity_label_weight (float)`: The weight of the entity label in matching. Default is 0.4.
- `observation_date (str)`: Observation date to add to relationships for temporal tracking. Defaults to "".

The Arguments of ```iText2KG```:

- `llm_model`: The language model instance to be used for extracting entities and relationships from text.
- `embeddings_model`: The embeddings model instance to be used for creating vector representations of extracted entities.
- `sleep_time (int)`: The time to wait (in seconds) when encountering rate limits or errors (for OpenAI only). Defaults to 5 seconds.

The arguments of ```iText2KG``` method ```build_graph```:

- `sections (List[str])`: A list of strings (semantic blocks) where each string represents a section of the document from which entities and relationships will be extracted.
- `ent_threshold (float, optional)`: The threshold for entity matching, used to merge entities from different sections. Default is 0.7.
- `rel_threshold (float, optional)`: The threshold for relationship matching, used to merge relationships from different sections. Default is 0.7.
- `existing_knowledge_graph (KnowledgeGraph, optional)`: An existing knowledge graph to merge the newly extracted entities and relationships into. Default is None.
- `entity_name_weight (float)`: The weight of the entity name in the entity embedding process. Default is 0.6.
- `entity_label_weight (float)`: The weight of the entity label in the entity embedding process. Default is 0.4.
- `max_tries (int, optional)`: The maximum number of attempts to extract entities and relationships. Defaults to 5.
- `max_tries_isolated_entities (int, optional)`: The maximum number of attempts to process isolated entities  (entities without relationships). Defaults to 3.
- `observation_date (str)`: Observation date to add to relationships for temporal tracking. Defaults to "".
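
To see why `entity_name_weight` and `entity_label_weight` matter, consider the Python:Language versus Python:Snake case mentioned in the news section. The sketch below assumes the entity embedding is a plain weighted sum of the name and label embeddings, which may differ from the package's actual computation, and it uses toy three-dimensional vectors.

```python
import math


def weighted_embedding(name_emb, label_emb, name_weight=0.6, label_weight=0.4):
    """Combine name and label embeddings (assumption: simple weighted sum)."""
    return [name_weight * n + label_weight * l for n, l in zip(name_emb, label_emb)]


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy embeddings: same name, orthogonal labels.
python_name = [1.0, 0.0, 0.0]
language_label = [0.0, 1.0, 0.0]
snake_label = [0.0, 0.0, 1.0]

python_language = weighted_embedding(python_name, language_label)
python_snake = weighted_embedding(python_name, snake_label)

name_only = cosine_similarity(python_name, python_name)
combined = cosine_similarity(python_language, python_snake)
print(f"name only: {name_only:.3f}, name + label: {combined:.3f}")
```

With names alone, the two entities are indistinguishable. With the default weights, the combined similarity in this toy setup falls just below the default `ent_threshold` of 0.7, so the two concepts would stay separate.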

### Dynamic Knowledge Graph Construction

Build knowledge graphs that evolve over time by processing documents with temporal context:

```python
import asyncio
from itext2kg import DocumentDistiller, iText2KG_Star
from itext2kg.models.schemas import Facts

async def build_dynamic_knowledge_graph():
    # Initialize components
    document_distiller = DocumentDistiller(llm_model=openai_llm_model)
    itext2kg_star = iText2KG_Star(llm_model=openai_llm_model, embeddings_model=openai_embeddings_model)
    
    # Sample time-series data (e.g., social media posts, news articles, reports)
    time_series_data = [
        {
            "observation_date": "2025-01-15",
            "content": "OpenAI announced ChatGPT agent with autonomous task execution capabilities."
        },
        {
            "observation_date": "2025-01-16", 
            "content": "ChatGPT agent now integrates browser tools and terminal access for enhanced automation."
        },
        {
            "observation_date": "2025-01-17",
            "content": "The new agent is rolling out to Pro, Plus, and Team users with enhanced safety measures."
        }
    ]
    
    # Extract facts from each time point
    IE_query = '''
    # DIRECTIVES : 
    - Act like an experienced information extractor.
    - Extract clear, factual statements from the text.
    '''
    
    # Process first document to initialize the KG
    facts_0 = await document_distiller.distill(
        documents=[time_series_data[0]["content"]], 
        IE_query=IE_query, 
        output_data_structure=Facts
    )
    
    # Build initial knowledge graph
    kg = await itext2kg_star.build_graph(
        sections=facts_0.facts,
        observation_date=time_series_data[0]["observation_date"],
        ent_threshold=0.8,
        rel_threshold=0.7
    )
    
    # Incrementally update with subsequent documents
    for i in range(1, len(time_series_data)):
        print(f"Processing document {i+1} from {time_series_data[i]['observation_date']}")
        
        # Extract facts from current document
        facts = await document_distiller.distill(
            documents=[time_series_data[i]["content"]], 
            IE_query=IE_query, 
            output_data_structure=Facts
        )
        
        # Update the knowledge graph incrementally
        kg = await itext2kg_star.build_graph(
            sections=facts.facts,
            observation_date=time_series_data[i]["observation_date"],
            existing_knowledge_graph=kg.model_copy(),  # Pass existing KG for incremental updates
            ent_threshold=0.8,
            rel_threshold=0.7
        )
    
    print(f"Dynamic KG completed! Entities: {len(kg.entities)}, Relationships: {len(kg.relationships)}")
    
    # Each relationship now contains observation_dates showing when it was first observed
    for rel in kg.relationships:
        if rel.properties.observation_dates:
            print(f"Relationship '{rel.name}' first observed: {rel.properties.observation_dates[0]}")
    
    return kg

# Run the dynamic KG construction
dynamic_kg = asyncio.run(build_dynamic_knowledge_graph())
```

For a complete example of dynamic KG construction from social media posts, see: [Dynamic KG Construction Example](./examples/building_dynamic_kg_openai_posts.ipynb)
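
Because temporal conflicts are not resolved by the package, a post-processing filter is left to the user. The sketch below shows one possible filter: for each (start entity, relation name) pair it keeps only the most recently observed relation. `ObservedRelation` is a hypothetical stand-in, not the package's relationship model, and the rule only makes sense for relations that should have a single current value.

```python
from dataclasses import dataclass, field


@dataclass
class ObservedRelation:
    """Hypothetical stand-in for a relationship with observation dates."""
    start: str
    name: str
    end: str
    observation_dates: list = field(default_factory=list)


def keep_latest(relations):
    """Keep, per (start, name) pair, the relation with the latest observation date."""
    latest = {}
    for rel in relations:
        key = (rel.start, rel.name)
        current = latest.get(key)
        # ISO dates (YYYY-MM-DD) compare correctly as strings.
        if current is None or max(rel.observation_dates, default="") > max(
            current.observation_dates, default=""
        ):
            latest[key] = rel
    return list(latest.values())


relations = [
    ObservedRelation("ChatGPT agent", "available_to", "Pro users", ["2025-01-15"]),
    ObservedRelation("ChatGPT agent", "available_to", "All users", ["2025-01-17"]),
    ObservedRelation("OpenAI", "announced", "ChatGPT agent", ["2025-01-15"]),
]

filtered = keep_latest(relations)
for rel in filtered:
    print(rel.start, rel.name, rel.end)
```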

## The ```GraphIntegrator```
This module integrates the extracted entities and relationships into a Neo4j graph database and provides a visualization of the knowledge graph. It allows users to explore and analyze the structured data using Neo4j's graph capabilities.

```python
from itext2kg.graph_integration import Neo4jStorage

URI = "bolt://localhost:7687"
USERNAME = "neo4j"
PASSWORD = "###"

# Note: Graph visualization remains synchronous
graph_integrator = Neo4jStorage(uri=URI, username=USERNAME, password=PASSWORD)
graph_integrator.visualize_graph(knowledge_graph=kg)
```


## Some ```iText2KG``` use-cases

The figure below shows KGs constructed for the article [seasonal](./datasets/scientific_articles/seasonal.pdf) and for a [company](https://auvalie.com/), which gave permission to publish its KG. Additionally, the Curriculum Vitae (CV) KG is based on the following generated [CV](./datasets/cvs/CV_Emily_Davis.pdf).

![text2kg](./docs/text_2_kg.png)

## Dataset
The dataset consists of five CVs generated using GPT-4, five randomly selected scientific articles representing various domains of study with diverse structures, and five company websites from different industries of varying sizes. Additionally, we have included distilled versions of the CVs and scientific articles based on predefined schemas.

Another dataset has been added, consisting of 1,500 similar entity pairs and 500 relationships, inspired by various domains (e.g., news, scientific articles, HR practices), to estimate the threshold for merging entities and relationships based on cosine similarity.

## Public Collaboration
We welcome contributions from the community to improve iText2KG.

## Citation
```bibtex
@article{lairgi2024itext2kg,
  title={iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models},
  author={Lairgi, Yassir and Moncla, Ludovic and Cazabet, R{\'e}my and Benabdeslem, Khalid and Cl{\'e}au, Pierre},
  journal={arXiv preprint arXiv:2409.03284},
  year={2024},
  note={Accepted at The International Web Information Systems Engineering conference (WISE) 2024},
  url={https://arxiv.org/abs/2409.03284},
  eprint={2409.03284},
  archivePrefix={arXiv},
  primaryClass={cs.AI}
}
```

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "itext2kg",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "kg construction, llms, neo4j, graphs",
    "author": "Auvalab - Yassir LAIRGI",
    "author_email": "<yassir.lairgi@auvalie.com>",
    "download_url": "https://files.pythonhosted.org/packages/34/a6/20c1c718230ad408aac80f1e582edfa4e2369dbfdeb9c8519ccc51905f52/itext2kg-0.0.9.tar.gz",
    "platform": null,
    "description": "# iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models\n\n![GitHub stars](https://img.shields.io/github/stars/auvalab/itext2kg?style=social)\n![GitHub forks](https://img.shields.io/github/forks/auvalab/itext2kg?style=social)\n![PyPI](https://img.shields.io/pypi/dm/itext2kg)\n![Total Downloads](https://img.shields.io/pepy/dt/itext2kg)\n[![Paper](https://img.shields.io/badge/Paper-View-green?style=flat&logo=adobeacrobatreader)](https://arxiv.org/abs/2409.03284)\n![PyPI](https://img.shields.io/pypi/v/itext2kg)\n[![Demo](https://img.shields.io/badge/Demo-Available-blue)](./examples/)\n![Status](https://img.shields.io/badge/Status-Work%20in%20Progress-yellow)\n\n\ud83c\udf89 Accepted @ [WISE 2024](https://wise2024-qatar.com/)\n\n<p align=\"center\">\n  <picture>\n    <source media=\"(prefers-color-scheme: dark)\" srcset=\"./docs/logo_white.png\" width=\"300\">\n    <source media=\"(prefers-color-scheme: light)\" srcset=\"./docs/logo_black.png\" width=\"300\">\n    <img alt=\"Logo\" src=\"./docs/logo_white.png\" width=\"300\">\n  </picture>\n</p>\n\n## Overview\n\niText2KG is a Python package designed to incrementally construct consistent knowledge graphs with resolved entities and relations by leveraging large language models for entity and relation extraction from text documents. It features zero-shot capability, allowing for knowledge extraction across various domains without specific training. The package includes modules for document distillation, entity extraction, and relation extraction, ensuring resolved and unique entities and relationships. It continuously updates the KG with new documents and integrates them into Neo4j for visual representation.\n\n## \ud83d\udd25 News\n* [29/07/2025] New Features and Enhanced Capabilities:\n  - **iText2KG_Star**: Introduced a simpler and more efficient version of iText2KG that eliminates the separate entity extraction step. 
Instead of extracting entities and relations separately, iText2KG_Star directly extracts relationships from text, automatically deriving entities from those relationships. This approach is more efficient as it reduces processing time and token consumption and does not need to handle invented/isolated entities.\n  - **Facts-Based KG Construction**: Enhanced the framework with facts-based knowledge graph construction using the Document Distiller to extract structured facts from documents, which are then used for incremental KG building. This approach provides more exhaustive and precise knowledge graphs by focusing on factual information extraction.\n  - **Dynamic Knowledge Graphs**: iText2KG now supports building dynamic knowledge graphs that evolve over time. By leveraging the incremental nature of the framework and document snapshots with observation dates, users can track how knowledge changes and grows. See example: [Dynamic KG Construction](./examples/building_dynamic_kg_openai_posts.ipynb). **NB: The temporal/logical conflicts resolution is not handled in this version. But you can apply a post processing filter to resolve them**\n\n* [19/07/2025] Major Performance and Reliability Updates:\n  - **Asynchronous Architecture**: Complete migration to async/await patterns for all core methods (`build_graph`, `extract_entities`, `extract_relations`, etc.) 
enabling better performance and non-blocking I/O operations with LLM APIs.\n  - **Logging System**: Implemented comprehensive logging infrastructure to replace all print statements with structured, configurable logging (DEBUG, INFO, WARNING, ERROR levels) with timestamps and module identification.\n  - **Enhanced Batch Processing**: Improved efficiency through async batch processing for multiple document handling and LLM API calls.\n  - **Better Error Handling**: Enhanced error handling and retry mechanisms with proper logging for production environments.\n\n* [07/10/2024] Latest features:\n  - The entire iText2KG code has been refactored by adding data models that describe an Entity, a Relation, and a KnowledgeGraph.\n  - Each entity is embedded using both its name and label to avoid merging concepts with similar names but different labels. For example, Python:Language and Python:Snake.\n    - The weights for entity name embedding and entity label are configurable, with defaults set to 0.4 for the entity label and 0.6 for the entity name.\n  - A max_tries parameter has been added to the iText2KG.build_graph function for entity and relation extraction to prevent hallucinatory effects in structuring the output. Additionally, a max_tries_isolated_entities parameter has been added to the same method to handle hallucinatory effects when processing isolated entities.\n\n* [17/09/2024] Latest features: \n  - Now, iText2KG is compatible with all the chat/embeddings models supported by LangChain. For available chat models, refer to the options listed at: https://python.langchain.com/v0.2/docs/integrations/chat/. 
For embedding models, explore the choices at: https://python.langchain.com/v0.2/docs/integrations/text_embedding/.\n\n  - The constructed graph can be expanded by passing the already extracted entities and relationships as arguments to the `build_graph` function in iText2KG.\n  - iText2KG is compatible with all Python versions above 3.9.\n\n\n* [16/07/2024] We have addressed two major LLM hallucination issues related to KG construction with LLMs when passing the entities list and context to the LLM. These issues are:\n\n  - The LLM might invent entities that do not exist in the provided entities list. We handled this problem by replacing the invented entities with the most similar ones from the input entities list.\n  - The LLM might fail to assign a relation to some entities from the input entities list, causing a \"forgetting effect.\" We handled this problem by reprompting the LLM to extract relations for those entities.\n\n\n## Installation\n\nTo install iText2KG, ensure you have **Python 3.9 or higher** installed (required for async/await functionality), then use pip to install:\n\n```bash\npip install itext2kg\n```\n\n## The Overall Architecture\n\nThe ```iText2KG``` package consists of four main modules that work together to construct and visualize knowledge graphs from unstructured text. An overview of the overall architecture:\n\n1. **Document Distiller**: This module processes raw documents and reformulates them into semantic blocks based on a user-defined schema. It improves the signal-to-noise ratio by focusing on relevant information and structuring it in a predefined format. \n\n2. **Incremental Entity Extractor**: This module extracts unique entities from the semantic blocks and resolves ambiguities to ensure each entity is clearly defined. It uses cosine similarity measures to match local entities with global entities.\n\n3. **Incremental Relation Extractor**: This module identifies relationships between the extracted entities. 
It can operate in two modes: using global entities to enrich the graph with potential information or using local entities for more precise relationships. \n\n4. **Graph Integrator and Visualization**: This module integrates the extracted entities and relationships into a Neo4j database, providing a visual representation of the knowledge graph. It allows for interactive exploration and analysis of the structured data.\n\n![itext2kg](./docs/itext2kg.png)\n\nThe LLM is prompted to extract entities representing one unique concept to avoid semantically mixed entities. The following figure presents the entity and relation extraction prompts using the LangChain JSON Parser. They are categorized as follows: Blue - prompts automatically formatted by LangChain; Regular - prompts we have designed; and Italic - specifically designed prompts for entity and relation extraction. Panel (a) shows the prompts for relation extraction and panel (b) the prompts for entity extraction.\n\n![prompts](./docs/prompts_.png)\n\n## Modules and Examples\nAll the examples are provided in the following Jupyter notebooks:\n- [Different LLM Models](./examples/different_llm_models.ipynb) - Basic usage with various language models\n- [Dynamic Knowledge Graphs](./examples/building_dynamic_kg_openai_posts.ipynb) - Building evolving KGs with temporal context\n- Additional examples showcasing facts extraction, iText2KG_Star, and more advanced features\n\niText2KG is compatible with all language models supported by LangChain.\n\nTo use iText2KG, you will need both a chat model and an embeddings model.\n\nFor available chat models, refer to the options listed at: https://python.langchain.com/v0.2/docs/integrations/chat/. 
For embedding models, explore the choices at: https://python.langchain.com/v0.2/docs/integrations/text_embedding/.\n\nPlease ensure that you install the necessary package for each chat model before use.\n\n#### Mistral\n\n\nFor Mistral, please set up your model using the tutorial here: https://python.langchain.com/v0.2/docs/integrations/chat/mistralai/. Similarly, for the embedding model, follow the setup guide here: https://python.langchain.com/v0.2/docs/integrations/text_embedding/mistralai/.\n\n```python\nfrom langchain_mistralai import ChatMistralAI\nfrom langchain_mistralai import MistralAIEmbeddings\n\nmistral_api_key = \"##\"\nmistral_llm_model = ChatMistralAI(\n    api_key = mistral_api_key,\n    model=\"mistral-large-latest\",\n    temperature=0,\n    max_retries=2,\n)\n\n\nmistral_embeddings_model = MistralAIEmbeddings(\n    model=\"mistral-embed\",\n    api_key = mistral_api_key\n)\n```\n\n#### OpenAI\nThe same applies to OpenAI. 
Please set up your chat model using the tutorial at https://python.langchain.com/v0.2/docs/integrations/chat/openai/ and your embedding model using the guide at https://python.langchain.com/v0.2/docs/integrations/text_embedding/openai/\n\n```python\nfrom langchain_openai import ChatOpenAI, OpenAIEmbeddings\n\nopenai_api_key = \"##\"\n\nopenai_llm_model = ChatOpenAI(\n    api_key = openai_api_key,\n    model=\"gpt-4o\",\n    temperature=0,\n    max_tokens=None,\n    timeout=None,\n    max_retries=2,\n)\n\nopenai_embeddings_model = OpenAIEmbeddings(\n    api_key = openai_api_key,\n    model=\"text-embedding-3-large\",\n)\n```\n\n### The ```DocumentDistiller```\n\nThe Document Distiller module reformulates raw documents into predefined semantic blocks using LLMs. It utilizes a schema to guide the extraction of specific information from each document.\n\nExample:\n\n```python\nimport asyncio\nfrom itext2kg import DocumentDistiller\n# You can define a schema or use the predefined ones from schemas.py\nfrom itext2kg.models.schemas import Article\n\nasync def main():\n    # Initialize the DocumentDistiller with llm model.\n    document_distiller = DocumentDistiller(llm_model = openai_llm_model)\n\n    # List of documents to be distilled.\n    documents = [\"doc1\", \"doc2\", \"doc3\"]\n\n    # Information extraction query.\n    IE_query = '''\n    # DIRECTIVES : \n    - Act like an experienced information extractor. 
\n    - You have a chunk of a scientific paper.\n    - If you do not find the right information, keep its place empty.\n    '''\n\n    # Distill the documents using the defined query and output data structure.\n    # Note: distill() is now async and requires await\n    distilled_doc = await document_distiller.distill(documents=documents, IE_query=IE_query, output_data_structure=Article)\n    \n    return distilled_doc\n\n# Run the async function\ndistilled_doc = asyncio.run(main())\n```\nThe schema depends on the user's specific requirements, as it outlines the essential components to extract or emphasize during the knowledge graph construction. Since there is no universal blueprint for all use cases, its design is subjective and varies by application or context. This flexibility is crucial to making the ```iText2KG``` method adaptable across a wide range of scenarios.\n\nYou can define a custom schema using  ```pydantic```. Some example schemas are available in [models/schemas.py](./itext2kg/models/schemas.py). You can use these or create new ones depending on your use-case. 
\n\n\n```python\nfrom typing import List, Optional\nfrom pydantic import BaseModel, Field\n\n# Define an Author model with name and affiliation fields.\nclass Author(BaseModel):\n    name: str = Field(description=\"The name of the author\")\n    affiliation: str = Field(description=\"The affiliation of the author\")\n    \n# Define an Article model with various fields describing a scientific article.\nclass Article(BaseModel):\n    title: str = Field(description=\"The title of the scientific article\")\n    authors: List[Author] = Field(description=\"The list of the article's authors and their affiliation\")\n    abstract: str = Field(description=\"The article's abstract\")\n    key_findings: str = Field(description=\"The key findings of the article\")\n    limitation_of_sota: str = Field(description=\"limitation of the existing work\")\n    proposed_solution: str = Field(description=\"The proposed solution in details\")\n    paper_limitations: str = Field(description=\"The limitations of the proposed solution of the paper\")\n\n```\n\n\n### Facts-Based Knowledge Graph Construction\n\nFor more exhaustive knowledge graphs, you can use facts-based construction by extracting structured facts from documents first, then using these facts for KG building:\n\n```python\nimport asyncio\nfrom itext2kg import DocumentDistiller\nfrom itext2kg.models.schemas import Facts\n\nasync def extract_facts():\n    # Initialize the DocumentDistiller\n    document_distiller = DocumentDistiller(llm_model=openai_llm_model)\n    \n    # Your documents\n    documents = [\"OpenAI announced ChatGPT agent with new capabilities...\", \n                 \"The new model can perform complex tasks autonomously...\"]\n    \n    # Extract facts from each document\n    IE_query = '''\n    # DIRECTIVES : \n    - Act like an experienced information extractor. 
\n    - Extract clear, factual statements from the text.\n    '''\n    \n    facts_list = await asyncio.gather(*[\n        document_distiller.distill(\n            documents=[doc], \n            IE_query=IE_query, \n            output_data_structure=Facts\n        ) for doc in documents\n    ])\n    \n    return facts_list\n\n# Run the async function\nfacts = asyncio.run(extract_facts())\n```\n\n### The ```iText2KG_Star``` (Recommended)\n\niText2KG_Star is a simpler and more efficient version that directly extracts relationships from text and automatically derives entities from those relationships, eliminating the separate entity extraction step:\n\n```python\nimport asyncio\nfrom itext2kg import iText2KG_Star\nfrom itext2kg.logging_config import setup_logging, get_logger\n\n# Optional: Configure logging\nsetup_logging(level=\"INFO\", log_file=\"itext2kg.log\")\nlogger = get_logger(__name__)\n\nasync def build_knowledge_graph_star():\n    # Initialize iText2KG_Star with the llm model and embeddings model\n    itext2kg_star = iText2KG_Star(llm_model=openai_llm_model, embeddings_model=openai_embeddings_model)\n\n    # Your text sections (can be facts from document distiller or raw text)\n    sections = [\n        \"OpenAI announced ChatGPT agent with new capabilities for autonomous task execution.\",\n        \"The new model integrates browser tools and terminal access for comprehensive automation.\",\n        \"ChatGPT agent is rolling out to Pro, Plus, and Team users with enhanced safety measures.\"\n    ]\n\n    logger.info(\"Starting knowledge graph construction with iText2KG_Star...\")\n    \n    # Build the knowledge graph - entities are automatically derived from relationships\n    kg = await itext2kg_star.build_graph(\n        sections=sections,\n        ent_threshold=0.8,      # Higher threshold for more distinct entities\n        rel_threshold=0.7,      # Threshold for relationship merging\n        observation_date=\"2025-01-15\"  # Optional: add temporal 
context\n    )\n    \n    logger.info(f\"Knowledge graph completed! Entities: {len(kg.entities)}, Relationships: {len(kg.relationships)}\")\n    return kg\n\n# Run the async function\nkg = asyncio.run(build_knowledge_graph_star())\n```\n\n### The ```iText2KG```\nThe iText2KG module is the original component of the package, responsible for integrating various functionalities to construct the knowledge graph. It uses the distilled semantic sections from documents to extract entities and relationships separately, and then builds the knowledge graph incrementally. \n\nAlthough it is highly recommended to pass the documents through the ```Document Distiller``` module, it is not required for graph creation. You can directly pass your chunks into the ```build_graph``` function of the ```iText2KG``` class; however, your graph may contain some noisy information.\n\n```python\nimport asyncio\nfrom itext2kg import iText2KG\nfrom itext2kg.logging_config import setup_logging, get_logger\n\n# Optional: Configure logging\nsetup_logging(level=\"INFO\", log_file=\"itext2kg.log\")\nlogger = get_logger(__name__)\n\nasync def build_knowledge_graph():\n    # Initialize iText2KG with the llm model and embeddings model.\n    itext2kg = iText2KG(llm_model = openai_llm_model, embeddings_model = openai_embeddings_model)\n\n    # Format the distilled document into semantic sections.\n    semantic_blocks = [f\"{key} - {value}\".replace(\"{\", \"[\").replace(\"}\", \"]\") for key, value in distilled_doc.items()]\n\n    logger.info(\"Starting knowledge graph construction...\")\n    \n    # Build the knowledge graph using the semantic sections.\n    # Note: build_graph() is now async and requires await\n    kg = await itext2kg.build_graph(sections=semantic_blocks)\n    \n    logger.info(\"Knowledge graph construction completed successfully!\")\n    return kg\n\n# Run the async function\nkg = asyncio.run(build_knowledge_graph())\n```\n\n### Arguments\n\nThe Arguments of ```iText2KG_Star``` 
(Recommended):\n\n- `llm_model`: The language model instance to be used for extracting relationships directly from text.\n- `embeddings_model`: The embeddings model instance to be used for creating vector representations of entities and relationships.\n- `sleep_time (int)`: The time to wait (in seconds) when encountering rate limits or errors. Defaults to 5 seconds.\n\nThe Arguments of ```iText2KG_Star``` method ```build_graph```:\n\n- `sections (List[str])`: A list of strings where each string represents a section of the document from which relationships will be extracted and entities derived.\n- `existing_knowledge_graph (KnowledgeGraph, optional)`: An existing knowledge graph to merge with. Default is None.\n- `ent_threshold (float, optional)`: The threshold for entity matching when merging sections. Default is 0.7.\n- `rel_threshold (float, optional)`: The threshold for relationship matching when merging sections. Default is 0.7.\n- `max_tries (int, optional)`: The maximum number of attempts to extract relationships. Defaults to 5.\n- `entity_name_weight (float)`: The weight of the entity name in matching. Default is 0.6.\n- `entity_label_weight (float)`: The weight of the entity label in matching. Default is 0.4.\n- `observation_date (str)`: Observation date to add to relationships for temporal tracking. Defaults to \"\".\n\nThe Arguments of ```iText2KG```:\n\n- `llm_model`: The language model instance to be used for extracting entities and relationships from text.\n- `embeddings_model`: The embeddings model instance to be used for creating vector representations of extracted entities.\n- `sleep_time (int)`: The time to wait (in seconds) when encountering rate limits or errors (for OpenAI only). 
Defaults to 5 seconds.\n\nThe Arguments of ```iText2KG``` method ```build_graph```:\n\n- `sections (List[str])`: A list of strings (semantic blocks) where each string represents a section of the document from which entities and relationships will be extracted.\n- `ent_threshold (float, optional)`: The threshold for entity matching, used to merge entities from different sections. Default is 0.7.\n- `rel_threshold (float, optional)`: The threshold for relationship matching, used to merge relationships from different sections. Default is 0.7.\n- `existing_knowledge_graph (KnowledgeGraph, optional)`: An existing knowledge graph to merge the newly extracted entities and relationships into. Default is None.\n- `entity_name_weight (float)`: The weight of the entity name in the entity embedding process. Default is 0.6.\n- `entity_label_weight (float)`: The weight of the entity label in the entity embedding process. Default is 0.4.\n- `max_tries (int, optional)`: The maximum number of attempts to extract entities and relationships. Defaults to 5.\n- `max_tries_isolated_entities (int, optional)`: The maximum number of attempts to process isolated entities (entities without relationships). Defaults to 3.\n- `observation_date (str)`: Observation date to add to relationships for temporal tracking. 
Defaults to \"\".\n\n### Dynamic Knowledge Graph Construction\n\nBuild knowledge graphs that evolve over time by processing documents with temporal context:\n\n```python\nimport asyncio\nfrom itext2kg import DocumentDistiller, iText2KG_Star\nfrom itext2kg.models.schemas import Facts\n\nasync def build_dynamic_knowledge_graph():\n    # Initialize components\n    document_distiller = DocumentDistiller(llm_model=openai_llm_model)\n    itext2kg_star = iText2KG_Star(llm_model=openai_llm_model, embeddings_model=openai_embeddings_model)\n    \n    # Sample time-series data (e.g., social media posts, news articles, reports)\n    time_series_data = [\n        {\n            \"observation_date\": \"2025-01-15\",\n            \"content\": \"OpenAI announced ChatGPT agent with autonomous task execution capabilities.\"\n        },\n        {\n            \"observation_date\": \"2025-01-16\", \n            \"content\": \"ChatGPT agent now integrates browser tools and terminal access for enhanced automation.\"\n        },\n        {\n            \"observation_date\": \"2025-01-17\",\n            \"content\": \"The new agent is rolling out to Pro, Plus, and Team users with enhanced safety measures.\"\n        }\n    ]\n    \n    # Extract facts from each time point\n    IE_query = '''\n    # DIRECTIVES : \n    - Act like an experienced information extractor.\n    - Extract clear, factual statements from the text.\n    '''\n    \n    # Process first document to initialize the KG\n    facts_0 = await document_distiller.distill(\n        documents=[time_series_data[0][\"content\"]], \n        IE_query=IE_query, \n        output_data_structure=Facts\n    )\n    \n    # Build initial knowledge graph\n    kg = await itext2kg_star.build_graph(\n        sections=facts_0.facts,\n        observation_date=time_series_data[0][\"observation_date\"],\n        ent_threshold=0.8,\n        rel_threshold=0.7\n    )\n    \n    # Incrementally update with subsequent documents\n    for i in range(1, 
len(time_series_data)):\n        print(f\"Processing document {i+1} from {time_series_data[i]['observation_date']}\")\n        \n        # Extract facts from current document\n        facts = await document_distiller.distill(\n            documents=[time_series_data[i][\"content\"]], \n            IE_query=IE_query, \n            output_data_structure=Facts\n        )\n        \n        # Update the knowledge graph incrementally\n        kg = await itext2kg_star.build_graph(\n            sections=facts.facts,\n            observation_date=time_series_data[i][\"observation_date\"],\n            existing_knowledge_graph=kg.model_copy(),  # Pass existing KG for incremental updates\n            ent_threshold=0.8,\n            rel_threshold=0.7\n        )\n    \n    print(f\"Dynamic KG completed! Entities: {len(kg.entities)}, Relationships: {len(kg.relationships)}\")\n    \n    # Each relationship now contains observation_dates showing when it was first observed\n    for rel in kg.relationships:\n        if rel.properties.observation_dates:\n            print(f\"Relationship '{rel.name}' first observed: {rel.properties.observation_dates[0]}\")\n    \n    return kg\n\n# Run the dynamic KG construction\ndynamic_kg = asyncio.run(build_dynamic_knowledge_graph())\n```\n\nFor a complete example of dynamic KG construction from social media posts, see: [Dynamic KG Construction Example](./examples/building_dynamic_kg_openai_posts.ipynb)\n\n## The ```GraphIntegrator```\nIt integrates the extracted entities and relationships into a Neo4j graph database and provides a visualization of the knowledge graph. 
This module allows users to easily explore and analyze the structured data using Neo4j's graph capabilities.\n\n```python\nfrom itext2kg.graph_integration import Neo4jStorage\n\nURI = \"bolt://localhost:7687\"\nUSERNAME = \"neo4j\"\nPASSWORD = \"###\"\n\n# Note: Graph visualization remains synchronous\ngraph_integrator = Neo4jStorage(uri=URI, username=USERNAME, password=PASSWORD)\ngraph_integrator.visualize_graph(knowledge_graph=kg)\n```\n\n\n## Some ```iText2KG``` use-cases\n\nThe figure below shows the KGs we constructed for the article [seasonal](./datasets/scientific_articles/seasonal.pdf) and for the company [company](https://auvalie.com/), published with its permission. Additionally, the Curriculum Vitae (CV) KG is based on the following generated [CV](./datasets/cvs/CV_Emily_Davis.pdf).\n\n![text2kg](./docs/text_2_kg.png)\n\n## Dataset\nThe dataset consists of five CVs generated using GPT-4, five randomly selected scientific articles representing various domains of study with diverse structures, and five company websites from different industries of varying sizes. 
Additionally, we have included distilled versions of the CVs and scientific articles based on predefined schemas.\n\nAnother dataset has been added, consisting of 1,500 similar entity pairs and 500 relationships, inspired by various domains (e.g., news, scientific articles, HR practices), to estimate the threshold for merging entities and relationships based on cosine similarity.\n\n## Public Collaboration\nWe welcome contributions from the community to improve iText2KG.\n\n## Citation\n```bibtex\n@article{lairgi2024itext2kg,\n  title={iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models},\n  author={Lairgi, Yassir and Moncla, Ludovic and Cazabet, R{\\'e}my and Benabdeslem, Khalid and Cl{\\'e}au, Pierre},\n  journal={arXiv preprint arXiv:2409.03284},\n  year={2024},\n  note={Accepted at The International Web Information Systems Engineering conference (WISE) 2024},\n  url={https://arxiv.org/abs/2409.03284},\n  eprint={2409.03284},\n  archivePrefix={arXiv},\n  primaryClass={cs.AI}\n}\n```\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Incremental Knowledge Graphs Constructor Using Large Language Models",
    "version": "0.0.9",
    "project_urls": null,
    "split_keywords": [
        "kg construction",
        " llms",
        " neo4j",
        " graphs"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "97f1cae98ac942601af528f44a0b9e58d1d72432a35b462b782f9607ac1ef46b",
                "md5": "d7ff2b1a014a3d4efde38854052bf58b",
                "sha256": "8e2da5bad8efd970c6297152936ff8188ffe18937a2d59f8bd9320e486ae86f3"
            },
            "downloads": -1,
            "filename": "itext2kg-0.0.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d7ff2b1a014a3d4efde38854052bf58b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 53415,
            "upload_time": "2025-09-01T10:39:21",
            "upload_time_iso_8601": "2025-09-01T10:39:21.653277Z",
            "url": "https://files.pythonhosted.org/packages/97/f1/cae98ac942601af528f44a0b9e58d1d72432a35b462b782f9607ac1ef46b/itext2kg-0.0.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "34a620c1c718230ad408aac80f1e582edfa4e2369dbfdeb9c8519ccc51905f52",
                "md5": "30ddee48acd76c8fb3d13ff0435f4f93",
                "sha256": "ea481cf3d6f982912023b85ae7346a0a3085ac37fc7e0b847107fd3eb5d6645e"
            },
            "downloads": -1,
            "filename": "itext2kg-0.0.9.tar.gz",
            "has_sig": false,
            "md5_digest": "30ddee48acd76c8fb3d13ff0435f4f93",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 50558,
            "upload_time": "2025-09-01T10:39:22",
            "upload_time_iso_8601": "2025-09-01T10:39:22.840581Z",
            "url": "https://files.pythonhosted.org/packages/34/a6/20c1c718230ad408aac80f1e582edfa4e2369dbfdeb9c8519ccc51905f52/itext2kg-0.0.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-01 10:39:22",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "itext2kg"
}

Auvalab - Yassir LAIRGI