struct-ie


| Field | Value |
| --- | --- |
| Name | struct-ie |
| Version | 0.0.2 |
| home_page | None |
| Summary | A Python library for structured information extraction with LLMs. |
| upload_time | 2024-08-13 16:46:22 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.8 |
| license | MIT |
| keywords | named-entity-recognition, ner, data-science, natural-language-processing, artificial-intelligence, nlp, machine-learning, transformers |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# `Struct-IE`: Structured Information Extraction with Large Language Models

`struct-ie` is a Python library for named entity extraction using a transformer-based model.

## Installation

You can install the `struct-ie` library from PyPI:

```bash
pip install struct_ie
```

## To-Do List

- [x] Implement batch prediction
- [ ] Implement a Trainer for Instruction Tuning
- [ ] PrefixLM for Instruction Tuning
- [ ] Add RelationExtractor
- [ ] Add GraphExtractor
- [ ] Add JsonExtractor


## Usage

You can try it on Google Colab: <a href="https://colab.research.google.com/drive/1RjtZ8xWg6KU4ztHiRfSSrEr1UeZr6eZ2?usp=sharing">
        <img align="center" src="https://colab.research.google.com/assets/colab-badge.svg" />
</a>

The examples below show how to use the `EntityExtractor`:

### 1. Basic Usage

```python
from struct_ie import EntityExtractor

# Define the entity types with descriptions (optional)
entity_types_with_descriptions = {
    "Name": "Names of individuals like 'Jane Doe'",
    "Award": "Names of awards or honors such as the 'Nobel Prize' or the 'Pulitzer Prize'",
    "Date": None,
    "Competition": "Names of competitions or tournaments like the 'World Cup' or the 'Olympic Games'",
    "Team": None
}

# Initialize the EntityExtractor
extractor = EntityExtractor("Qwen/Qwen2-0.5B-Instruct", entity_types_with_descriptions, device="cpu")

# Example text for entity extraction
text = "Cristiano Ronaldo won the Ballon d'Or. He was the top scorer in the UEFA Champions League in 2018."

# Extract entities from the text
entities = extractor.extract_entities(text)
print(entities)
```
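
The return format of `extract_entities` is not documented in this README. Assuming it yields `(text, label)` pairs shaped like the few-shot `output` shown in section 3, a minimal sketch for grouping the results by entity type could look like this. The helper `group_by_label` and the sample list are hypothetical and not part of `struct-ie`; the sample stands in for real model output so the snippet runs without downloading a model.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_label(entities: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (text, label) pairs by label, keeping first-seen order and dropping duplicates."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for span, label in entities:
        if span not in grouped[label]:
            grouped[label].append(span)
    return dict(grouped)


# Hand-written stand-in for extractor output (an assumption, not real model output).
sample_entities = [
    ("Cristiano Ronaldo", "Name"),
    ("Ballon d'Or", "Award"),
    ("UEFA Champions League", "Competition"),
    ("2018", "Date"),
    ("Cristiano Ronaldo", "Name"),  # repeated mention, collapsed by the helper
]

grouped = group_by_label(sample_entities)
print(grouped)
```

If the actual output has a different shape (for example a list of dicts), only the unpacking inside the loop would need to change.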

### 2. Usage with a Custom Prompt

```python
from struct_ie import EntityExtractor

# Define the entity types with descriptions (optional)
entity_types_with_descriptions = {
    "Name": "Names of individuals like 'Jean-Luc Picard' or 'Jane Doe'",
    "Award": "Names of awards or honors such as the 'Nobel Prize' or the 'Pulitzer Prize'",
    "Date": None,
    "Competition": "Names of competitions or tournaments like the 'World Cup' or the 'Olympic Games'",
    "Team": "Names of sports teams or organizations like 'Manchester United' or 'FC Barcelona'"
}

# Initialize the EntityExtractor
extractor = EntityExtractor("Qwen/Qwen2-0.5B-Instruct", entity_types_with_descriptions, device="cpu")

# Example text for entity extraction
text = "Cristiano Ronaldo won the Ballon d'Or. He was the top scorer in the UEFA Champions League in 2018."

# Custom prompt for entity extraction
prompt = "You are an expert on Named Entity Recognition. Extract entities from this text."

# Extract entities from the text using a custom prompt
entities = extractor.extract_entities(text, prompt=prompt)
print(entities)
```
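
A custom prompt can also be composed programmatically. The sketch below builds an instruction string from the same kind of entity-type dictionary used above. `build_prompt` is a hypothetical helper, not a `struct-ie` function, and this README does not say whether the library already adds the entity types to the prompt on its own, so treat the result as one possible value for the `prompt` argument rather than a recommended format.

```python
from typing import Dict, Optional


def build_prompt(entity_types: Dict[str, Optional[str]]) -> str:
    """Compose an instruction string listing each entity type and its description, if any."""
    lines = [
        "You are an expert on Named Entity Recognition.",
        "Extract entities of the following types from this text:",
    ]
    for label, description in entity_types.items():
        # Types without a description (None) are listed by name only.
        lines.append(f"- {label}: {description}" if description else f"- {label}")
    return "\n".join(lines)


entity_types = {"Name": "Names of individuals like 'Jane Doe'", "Date": None}
prompt = build_prompt(entity_types)
print(prompt)
```

The resulting string could then be passed as `extractor.extract_entities(text, prompt=prompt)`, as in the example above.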

### 3. Usage with Few-shot Examples

```python
from struct_ie import EntityExtractor

# Define the entity types with descriptions (optional)
entity_types_with_descriptions = {
    "Name": "Names of individuals like 'Jean-Luc Picard' or 'Jane Doe'",
    "Award": "Names of awards or honors such as the 'Nobel Prize' or the 'Pulitzer Prize'",
    "Date": None,
    "Competition": "Names of competitions or tournaments like the 'World Cup' or the 'Olympic Games'",
    "Team": "Names of sports teams or organizations like 'Manchester United' or 'FC Barcelona'"
}

# Initialize the EntityExtractor
extractor = EntityExtractor("Qwen/Qwen2-0.5B-Instruct", entity_types_with_descriptions, device="cpu")

# Example text for entity extraction
text = "Cristiano Ronaldo won the Ballon d'Or. He was the top scorer in the UEFA Champions League in 2018."

# Few-shot examples for improved entity extraction
demonstrations = [
    {"input": "Lionel Messi won the Ballon d'Or 7 times.", "output": [("Lionel Messi", "Name"), ("Ballon d'Or", "Award")]}
]

# Extract entities from the text using few-shot examples
entities = extractor.extract_entities(text, few_shot_examples=demonstrations)
print(entities)
```
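
Few-shot examples are easy to get subtly wrong, for instance by using a label that is not among the declared entity types or a span that does not occur in the input. The following sketch checks demonstrations in the `{"input": ..., "output": [(text, label), ...]}` format shown above before they are passed to the extractor. `check_demonstrations` is a hypothetical helper written for illustration; it is not provided by `struct-ie`, and the second demonstration is deliberately invalid.

```python
from typing import Any, Dict, List


def check_demonstrations(
    demonstrations: List[Dict[str, Any]], entity_types: Dict[str, Any]
) -> List[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for i, demo in enumerate(demonstrations):
        for span, label in demo["output"]:
            if label not in entity_types:
                problems.append(f"example {i}: unknown label {label!r}")
            if span not in demo["input"]:
                problems.append(f"example {i}: span {span!r} not found in input")
    return problems


entity_types = {"Name": None, "Award": None, "Date": None, "Competition": None, "Team": None}
demonstrations = [
    {"input": "Lionel Messi won the Ballon d'Or 7 times.",
     "output": [("Lionel Messi", "Name"), ("Ballon d'Or", "Award")]},
    {"input": "Spain won the World Cup.",
     "output": [("World Cup", "Trophy")]},  # "Trophy" is not a declared type
]

problems = check_demonstrations(demonstrations, entity_types)
print(problems)
```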

## License

This project is licensed under the Apache-2.0 license. Note that the package metadata published on PyPI lists the license as MIT.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "struct-ie",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Urchade Zaratiana <urchade.zaratiana@gmail.com>",
    "keywords": "named-entity-recognition, ner, data-science, natural-language-processing, artificial-intelligence, nlp, machine-learning, transformers",
    "author": null,
    "author_email": "Urchade Zaratiana <urchade.zaratiana@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/42/14/f3afd3999a978fe7296dd8571d13ba18135f8e2a26d006f58aaf2944455c/struct_ie-0.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Python library for structured information extraction with LLMs.",
    "version": "0.0.2",
    "project_urls": {
        "Homepage": "https://github.com/urchade/struct_ie",
        "Repository": "https://github.com/urchade/struct_ie"
    },
    "split_keywords": [
        "named-entity-recognition",
        " ner",
        " data-science",
        " natural-language-processing",
        " artificial-intelligence",
        " nlp",
        " machine-learning",
        " transformers"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d29df99456e35982224b6bb06939ab52c34ba20177913836deadd9154e10292a",
                "md5": "eacc83f843e4fea648e6a4a5cd2c5b62",
                "sha256": "80a7a43c37f19871fa22478f27fe73ba5f6b1e41abb6150e2ada66b73da4fab6"
            },
            "downloads": -1,
            "filename": "struct_ie-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "eacc83f843e4fea648e6a4a5cd2c5b62",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 8380,
            "upload_time": "2024-08-13T16:46:19",
            "upload_time_iso_8601": "2024-08-13T16:46:19.820354Z",
            "url": "https://files.pythonhosted.org/packages/d2/9d/f99456e35982224b6bb06939ab52c34ba20177913836deadd9154e10292a/struct_ie-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4214f3afd3999a978fe7296dd8571d13ba18135f8e2a26d006f58aaf2944455c",
                "md5": "7a3414ace856a44651ccb4cc476fdf25",
                "sha256": "156c24c128b88b4c7c047bbfb668465ec3cd3ee25b9d78be9b29d66741e24633"
            },
            "downloads": -1,
            "filename": "struct_ie-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "7a3414ace856a44651ccb4cc476fdf25",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 8058,
            "upload_time": "2024-08-13T16:46:22",
            "upload_time_iso_8601": "2024-08-13T16:46:22.607111Z",
            "url": "https://files.pythonhosted.org/packages/42/14/f3afd3999a978fe7296dd8571d13ba18135f8e2a26d006f58aaf2944455c/struct_ie-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-13 16:46:22",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "urchade",
    "github_project": "struct_ie",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "struct-ie"
}
        