extr

| Field | Value |
| --- | --- |
| Name | extr |
| Version | 0.0.44 |
| home_page | https://github.com/dpasse/extr |
| Summary | Named Entity Recognition (NER) and Relation Extraction (RE) library using Regular Expressions |
| upload_time | 2023-06-02 00:36:57 |
| maintainer | |
| docs_url | None |
| author | |
| requires_python | |
| license | |
| keywords | named entity recognition, relation extraction, entity linking, ner, re, nlp |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# Extr
> Named Entity Recognition (NER) and Relation Extraction (RE) library using Regular Expressions

<br />

## Install

```
pip install extr
```

## Example

```python
text = 'Ted is a Pitcher.'
```

### 1. Entity Extraction
> Find named entities in text.

```python
import re  # needed for the re.IGNORECASE flag used below

from extr import RegEx, RegExLabel
from extr.entities import EntityExtractor

entity_extractor = EntityExtractor([
    RegExLabel('PERSON', [
        RegEx([r'ted'], re.IGNORECASE)
    ]),
    RegExLabel('POSITION', [
        RegEx([r'pitcher'], re.IGNORECASE)
    ]),
])

entities = entity_extractor.get_entities(text)

## entities == [
##      <Entity label="POSITION" text="Pitcher" span=(9, 16)>,
##      <Entity label="PERSON" text="Ted" span=(0, 3)>
## ]
```
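
To make the `span` values in the output above concrete, here is a minimal sketch that reproduces them with only Python's standard `re` module. It is not how `extr` is implemented; the `find_spans` helper and the `patterns` mapping are hypothetical names used purely for illustration.

```python
import re

text = 'Ted is a Pitcher.'

# Hypothetical label -> pattern mapping mirroring the RegExLabel definitions above.
patterns = {
    'PERSON': re.compile(r'ted', re.IGNORECASE),
    'POSITION': re.compile(r'pitcher', re.IGNORECASE),
}

def find_spans(text, patterns):
    """Return (label, matched_text, (start, end)) for every match of every pattern."""
    found = []
    for label, pattern in patterns.items():
        for match in pattern.finditer(text):
            found.append((label, match.group(), match.span()))
    return found

spans = find_spans(text, patterns)
```

Spans are half-open character offsets, so `text[9:16]` is `'Pitcher'` and `text[0:3]` is `'Ted'`.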

**<i> or add a knowledge base</i>**

```python
import re  # needed for the re.IGNORECASE flag used below

from extr import RegEx, RegExLabel
from extr.entities import create_entity_extractor

entity_extractor = create_entity_extractor(
    [
        RegExLabel('POSITION', [
            RegEx([r'pitcher'], re.IGNORECASE)
        ]),
    ],
    kb={
        'PERSON': ['Ted']
    }
)

entities = entity_extractor.get_entities(text)

## entities == [
##      <Entity label="POSITION" text="Pitcher" span=(9, 16)>,
##      <Entity label="PERSON" text="Ted" span=(0, 3)>
## ]
```
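
One way to think about a knowledge base is as a list of literal terms per label that gets turned into patterns. The sketch below shows that idea with the standard library only. How `create_entity_extractor` actually handles `kb` is not documented here, so the `kb_to_patterns` helper, the word-boundary anchoring, and the case-sensitive matching are all assumptions.

```python
import re

# Hypothetical knowledge base in the same shape as the `kb` argument above.
kb = {'PERSON': ['Ted']}

def kb_to_patterns(kb):
    """Build one case-sensitive alternation pattern per label from literal terms."""
    compiled = {}
    for label, terms in kb.items():
        # Longest terms first, so a longer term wins over its own prefix.
        ordered = sorted(terms, key=len, reverse=True)
        # re.escape keeps characters such as '.' or '+' in a term literal.
        alternation = '|'.join(re.escape(term) for term in ordered)
        compiled[label] = re.compile(r'\b(?:' + alternation + r')\b')
    return compiled

kb_patterns = kb_to_patterns(kb)
match = kb_patterns['PERSON'].search('Ted is a Pitcher.')
```

With these assumptions, `match.span()` is `(0, 3)`, the same span the regex-based extractor reports for `Ted`.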

### 2. Visualize Entities in HTML
> Annotate text to display in HTML.

```python
from extr.entities.viewers import HtmlViewer

viewer = HtmlViewer()
viewer.append(text, entities)

html = viewer.create_view(custom_styles="""
    .lb-PERSON {
        background-color: orange;
    }

    .lb-POSITION {
        background-color: yellow;
    }
""")
```

![Entity annotations rendered as HTML](https://github.com/dpasse/extr/blob/main/docs/images/annotations.JPG)
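
The `.lb-PERSON` and `.lb-POSITION` selectors suggest that each entity is wrapped in an element carrying an `lb-<LABEL>` class. The following is a rough sketch of that kind of annotation using only the standard library. The `annotate_html` helper and the exact markup are assumptions; the HTML that `HtmlViewer` really produces may differ.

```python
from html import escape

text = 'Ted is a Pitcher.'
# (label, (start, end)) pairs matching the entities shown earlier.
entity_spans = [('PERSON', (0, 3)), ('POSITION', (9, 16))]

def annotate_html(text, entity_spans):
    """Wrap each non-overlapping span in <span class="lb-LABEL">, escaping all text."""
    pieces = []
    cursor = 0
    # Walk the spans left to right so plain text and entities alternate.
    for label, (start, end) in sorted(entity_spans, key=lambda item: item[1]):
        pieces.append(escape(text[cursor:start]))
        pieces.append(
            '<span class="lb-{}">{}</span>'.format(label, escape(text[start:end]))
        )
        cursor = end
    pieces.append(escape(text[cursor:]))
    return ''.join(pieces)

html_fragment = annotate_html(text, entity_spans)
```

The sketch assumes spans do not overlap; overlapping entities would need to be resolved before this step.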

### 3. Relation Extraction
> Annotate text and extract relationships between entities.

```python
from extr.entities import EntityAnnotator
from extr.relations import RelationExtractor, RegExRelationLabelBuilder

## define relationship between PERSON and POSITION
relationship = RegExRelationLabelBuilder('is_a') \
    .add_e1_to_e2(
        'PERSON', ## e1
        [
            ## regex for the text that connects e1 to e2
            r'\s+is\s+a\s+',
        ],
        'POSITION' ## e2
    ) \
    .build()

relations_to_extract = [relationship]

## `entities` comes from 'Entity Extraction' above
annotated_text = EntityAnnotator().annotate(text, entities)
relations = RelationExtractor(relations_to_extract).extract(annotated_text, entities)

## relations == [
##      <Relation e1="Ted" r="is_a" e2="Pitcher">
## ]

```
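
The core idea behind the `is_a` relation above is that the text connecting a PERSON to a POSITION must match `\s+is\s+a\s+`. Here is a minimal sketch of that check using entity spans and the standard `re` module. It skips the annotation step that `extr` performs, and the `is_related` helper is a hypothetical name, so treat it as an illustration of the concept and not as the library's algorithm.

```python
import re

text = 'Ted is a Pitcher.'
# (label, (start, end)) pairs matching the entities shown earlier.
person = ('PERSON', (0, 3))
position = ('POSITION', (9, 16))
# Same connecting pattern as in the relationship definition above.
connector = re.compile(r'\s+is\s+a\s+')

def is_related(text, e1, e2, connector):
    """True when the text between e1 and e2 is matched in full by `connector`."""
    e1_end = e1[1][1]
    e2_start = e2[1][0]
    if e1_end > e2_start:
        return False  # e1 must come before e2 for an e1-to-e2 relation
    return connector.fullmatch(text[e1_end:e2_start]) is not None

related = is_related(text, person, position, connector)
```

Here `text[3:9]` is `' is a '`, which the pattern matches in full, so `related` is `True`. Swapping the two entities yields `False` because the relation is directional.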



            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/dpasse/extr",
    "name": "extr",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "Named Entity Recognition,Relation Extraction,Entity Linking,NER,RE,NLP",
    "author": "",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/ae/55/5303faafa47b5ad9c3242c723690cfad3cd387cf0539d068c9e0afa55d88/extr-0.0.44.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Named Entity Recognition (NER) and Relation Extraction (RE) library using Regular Expressions",
    "version": "0.0.44",
    "project_urls": {
        "Homepage": "https://github.com/dpasse/extr"
    },
    "split_keywords": [
        "named entity recognition",
        "relation extraction",
        "entity linking",
        "ner",
        "re",
        "nlp"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a76b5c9c0281700418196fbd93ac38e033c4a347cfe37a7869c9031c32c2b535",
                "md5": "9c67c27be01a04b9b237f1b025b5c3c8",
                "sha256": "93f1ce73482208fc0beab2f79b8c8e997f0c2261eb7b5f903f0e3ba379a74842"
            },
            "downloads": -1,
            "filename": "extr-0.0.44-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9c67c27be01a04b9b237f1b025b5c3c8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 16410,
            "upload_time": "2023-06-02T00:36:55",
            "upload_time_iso_8601": "2023-06-02T00:36:55.146448Z",
            "url": "https://files.pythonhosted.org/packages/a7/6b/5c9c0281700418196fbd93ac38e033c4a347cfe37a7869c9031c32c2b535/extr-0.0.44-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ae555303faafa47b5ad9c3242c723690cfad3cd387cf0539d068c9e0afa55d88",
                "md5": "34b979b226d50bc646d30e8fbbfa17bc",
                "sha256": "08330bf28c496b5743c5a51c04f1b4f8de1d6b87cabdfbb794339b6c2fc07673"
            },
            "downloads": -1,
            "filename": "extr-0.0.44.tar.gz",
            "has_sig": false,
            "md5_digest": "34b979b226d50bc646d30e8fbbfa17bc",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 11560,
            "upload_time": "2023-06-02T00:36:57",
            "upload_time_iso_8601": "2023-06-02T00:36:57.183007Z",
            "url": "https://files.pythonhosted.org/packages/ae/55/5303faafa47b5ad9c3242c723690cfad3cd387cf0539d068c9e0afa55d88/extr-0.0.44.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-02 00:36:57",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "dpasse",
    "github_project": "extr",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "extr"
}
        