gliner-spacy


Name: gliner-spacy
Version: 0.0.10
Home page: https://github.com/theirstory/gliner-spacy
Summary: A SpaCy wrapper for the GLiNER model for enhanced Named Entity Recognition capabilities
Upload time: 2024-07-16 13:12:42
Maintainer: None
Docs URL: None
Author: William J. B. Mattingly
Requires Python: >=3.7
License: None
Requirements: spacy, gliner, seaborn, matplotlib
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.

# GLiNER SpaCy Wrapper

## Introduction
This project wraps [GLiNER](https://github.com/urchade/GLiNER), a Named Entity Recognition (NER) model, as a component for the spaCy Natural Language Processing (NLP) library. GLiNER is a generalist NER model: it extracts entities for whatever labels you supply at run time rather than for a fixed label set. The wrapper adds GLiNER to a spaCy pipeline, so its predictions are available through the usual spaCy entity and span interfaces.

**For GLiNER to work properly, use Python 3.7 to 3.10.**
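
A small guard at program start can make this constraint explicit. The sketch below assumes the 3.7 to 3.10 range stated above; the helper name `is_supported_python` is hypothetical and not part of gliner-spacy.

```python
import sys

# Supported range taken from the note above (inclusive on both ends).
MIN_SUPPORTED = (3, 7)
MAX_SUPPORTED = (3, 10)


def is_supported_python(version=None):
    """Return True if the (major, minor) version lies in the stated range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_SUPPORTED <= (major, minor) <= MAX_SUPPORTED


# Warn instead of failing, since the range is a recommendation in this README.
if not is_supported_python():
    print(
        "Warning: Python %d.%d is outside the 3.7-3.10 range stated for GLiNER."
        % sys.version_info[:2]
    )
```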

## Features
- Integrates GLiNER with spaCy for advanced NER tasks.
- Customizable chunk size for processing large texts.
- Support for user-chosen entity labels such as 'person' and 'organization'.
- Configurable output style for entity recognition results.
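
To illustrate why a chunk size matters for large texts, here is a minimal sketch of splitting a text into chunks while tracking character offsets, so that a span found inside a chunk can be mapped back to the full text. This shows the general technique only and is not the wrapper's actual implementation; the function names, the choice to treat the chunk size as a character count, and the whitespace-based cut are all assumptions made for illustration.

```python
def chunk_text(text, chunk_size=250):
    """Split text into pieces of at most chunk_size characters.

    Returns a list of (start_offset, chunk) pairs. Cuts at the last
    space inside the window when possible so words stay intact.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Prefer to cut after a space rather than in the middle of a word.
            cut = text.rfind(" ", start, end)
            if cut > start:
                end = cut + 1
        chunks.append((start, text[start:end]))
        start = end
    return chunks


def to_document_offsets(chunk_start, span_start, span_end):
    """Map a span found inside a chunk back to offsets in the full text."""
    return chunk_start + span_start, chunk_start + span_end


text = "alpha beta gamma delta"
for offset, chunk in chunk_text(text, chunk_size=10):
    print(offset, repr(chunk))
```

Because every chunk carries its starting offset, the chunks concatenate back to the original text and any chunk-local span can be shifted into document coordinates.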

## Installation
Install the library with pip:

```bash
pip install gliner-spacy
```

## Usage
To use this wrapper in your spaCy pipeline, follow these steps:

1. Import spaCy.
2. Create a spaCy `Language` instance.
3. Add the `gliner_spacy` component to the spaCy pipeline.
4. Process text using the pipeline.

Example code:

```python
import spacy

nlp = spacy.blank("en")
nlp.add_pipe("gliner_spacy")
text = "This is a text about Bill Gates and Microsoft."
doc = nlp(text)

for ent in doc.ents:
    print(ent.text, ent.label_)
```

### Expected Output

```
Bill Gates person
Microsoft organization
```

## Example with Custom Configs

```python
import spacy

custom_spacy_config = {
    "gliner_model": "urchade/gliner_multi",
    "chunk_size": 250,
    "labels": ["people", "company"],
    "style": "ent",
}
nlp = spacy.blank("en")
nlp.add_pipe("gliner_spacy", config=custom_spacy_config)

text = "This is a text about Bill Gates and Microsoft."
doc = nlp(text)

for ent in doc.ents:
    print(ent.text, ent.label_, ent._.score)

# Output:
# Bill Gates people 0.9967108964920044
# Microsoft company 0.9966742992401123
```
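
Because each entity carries a confidence score, results can also be filtered after the pipeline has run. The sketch below works on plain `(text, label, score)` tuples, using the scores from the example output, so it runs without loading a model; `filter_by_score` is a hypothetical helper, not part of gliner-spacy.

```python
from typing import List, Tuple

ScoredEntity = Tuple[str, str, float]


def filter_by_score(entities: List[ScoredEntity], min_score: float) -> List[ScoredEntity]:
    """Keep only entities whose confidence score is at least min_score."""
    return [ent for ent in entities if ent[2] >= min_score]


# Scores copied from the example output above.
entities = [
    ("Bill Gates", "people", 0.9967108964920044),
    ("Microsoft", "company", 0.9966742992401123),
]

# Only "Bill Gates" clears this cutoff.
for text, label, score in filter_by_score(entities, min_score=0.9967):
    print(text, label, round(score, 4))
```

With real pipeline output, the tuples could be built as `(ent.text, ent.label_, ent._.score)` for each `ent` in `doc.ents`.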

## Configuration
Override the wrapper's defaults by passing a `config` dictionary to `nlp.add_pipe`. The configurable parameters are:
- `gliner_model`: The GLiNER model to be used.
- `chunk_size`: Size of the text chunk to be processed at once.
- `labels`: The entity labels to be recognized.
- `style`: The style of output for the entities (either 'ent' or 'span').
- `threshold`: The minimum confidence score a GLiNER prediction must reach to be kept as an entity.
- `map_location`: The device on which to run the model: `cpu` or `cuda`
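
One way to catch configuration mistakes early is to validate the values before handing them to the pipeline. The following is a minimal sketch under stated assumptions: `build_gliner_config` is a hypothetical helper, its default values are illustrative placeholders rather than the wrapper's real defaults, and the 0 to 1 range for `threshold` is assumed from the scores shown in the example output.

```python
ALLOWED_STYLES = {"ent", "span"}
ALLOWED_DEVICES = {"cpu", "cuda"}


def build_gliner_config(
    gliner_model="urchade/gliner_multi",
    chunk_size=250,
    labels=("people", "company"),
    style="ent",
    threshold=0.5,
    map_location="cpu",
):
    """Check the documented parameters and return a plain config dict.

    The default values are illustrative placeholders, not necessarily
    the wrapper's own defaults.
    """
    if not isinstance(chunk_size, int) or chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    if not labels:
        raise ValueError("labels must contain at least one label")
    if style not in ALLOWED_STYLES:
        raise ValueError("style must be 'ent' or 'span'")
    # Assumption: the threshold is a confidence score between 0 and 1.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    if map_location not in ALLOWED_DEVICES:
        raise ValueError("map_location must be 'cpu' or 'cuda'")
    return {
        "gliner_model": gliner_model,
        "chunk_size": chunk_size,
        "labels": list(labels),
        "style": style,
        "threshold": threshold,
        "map_location": map_location,
    }


config = build_gliner_config(labels=["person", "organization"], threshold=0.3)
print(config)
```

The resulting dictionary could then be passed as `nlp.add_pipe("gliner_spacy", config=config)`, in the same way as the custom configuration example.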

## Contributing
Contributions to this project are welcome. Please ensure that your code adheres to the project's coding standards and includes tests for new features.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/theirstory/gliner-spacy",
    "name": "gliner-spacy",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": null,
    "author": "William J. B. Mattingly",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/31/4c/19f9f2abb3aae6a8df1f860d9565cff3d31b86682bdb0c4e6bc2e156085c/gliner-spacy-0.0.10.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A SpaCy wrapper for the GLiNER model for enhanced Named Entity Recognition capabilities",
    "version": "0.0.10",
    "project_urls": {
        "Homepage": "https://github.com/theirstory/gliner-spacy"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3a1219d20532e91b35e2529f6cb3dd3ff927cacce679c86d51bac073d295fb72",
                "md5": "9b6475a069d700e6b1d5aca0fd38d65e",
                "sha256": "88532eaf43baa744ae807983cb4276ec678c5f5bee75e0ba40d33da1a05bcb10"
            },
            "downloads": -1,
            "filename": "gliner_spacy-0.0.10-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9b6475a069d700e6b1d5aca0fd38d65e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 6571,
            "upload_time": "2024-07-16T13:12:41",
            "upload_time_iso_8601": "2024-07-16T13:12:41.386486Z",
            "url": "https://files.pythonhosted.org/packages/3a/12/19d20532e91b35e2529f6cb3dd3ff927cacce679c86d51bac073d295fb72/gliner_spacy-0.0.10-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "314c19f9f2abb3aae6a8df1f860d9565cff3d31b86682bdb0c4e6bc2e156085c",
                "md5": "4dce0be4655b06bab0f312317c3f0281",
                "sha256": "4ae1a7aea3d81872ea2ac5640d318bd4923c7d4eb1af719f1adecb64514ee46a"
            },
            "downloads": -1,
            "filename": "gliner-spacy-0.0.10.tar.gz",
            "has_sig": false,
            "md5_digest": "4dce0be4655b06bab0f312317c3f0281",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 6033,
            "upload_time": "2024-07-16T13:12:42",
            "upload_time_iso_8601": "2024-07-16T13:12:42.668898Z",
            "url": "https://files.pythonhosted.org/packages/31/4c/19f9f2abb3aae6a8df1f860d9565cff3d31b86682bdb0c4e6bc2e156085c/gliner-spacy-0.0.10.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-16 13:12:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "theirstory",
    "github_project": "gliner-spacy",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "spacy",
            "specs": []
        },
        {
            "name": "gliner",
            "specs": []
        },
        {
            "name": "seaborn",
            "specs": []
        },
        {
            "name": "matplotlib",
            "specs": []
        }
    ],
    "lcname": "gliner-spacy"
}
        