:Name: nerblackbox
:Version: 1.0.0
:Home page: https://pypi.org/project/nerblackbox
:Summary: a high-level library for named entity recognition in python
:Upload time: 2023-08-20 16:54:48
:Author: Felix Stollenwerk
:Requires Python: >=3.8
:License: Apache 2.0
:Keywords: nlp, ner, named entity recognition, bert, transformer, pytorch

===========
nerblackbox
===========

A High-level Library for Named Entity Recognition in Python.

.. image:: https://img.shields.io/pypi/v/nerblackbox
    :target: https://pypi.org/project/nerblackbox
    :alt: PyPI

.. image:: https://img.shields.io/pypi/pyversions/nerblackbox
    :target: https://www.python.org/doc/versions/
    :alt: PyPI - Python Version

.. image:: https://github.com/flxst/nerblackbox/actions/workflows/python-package.yml/badge.svg
    :target: https://github.com/flxst/nerblackbox/actions/workflows/python-package.yml
    :alt: CI

.. image:: https://coveralls.io/repos/github/flxst/nerblackbox/badge.svg?branch=master
    :target: https://coveralls.io/github/flxst/nerblackbox?branch=master

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
    :target: https://github.com/psf/black

.. image:: https://img.shields.io/pypi/l/nerblackbox
    :target: https://github.com/flxst/nerblackbox/blob/latest/LICENSE.txt
    :alt: PyPI - License

Resources
=========

- Source Code: https://github.com/flxst/nerblackbox
- Documentation: https://flxst.github.io/nerblackbox
- PyPI: https://pypi.org/project/nerblackbox

Installation
============

::

    pip install nerblackbox

About
=====

.. image:: https://raw.githubusercontent.com/flxst/nerblackbox/master/docs/docs/images/nerblackbox_sources.png

Take a dataset from one of many available sources.
Then train, evaluate and apply a language model
in a few simple steps.

1. Data
"""""""

- Choose a dataset from **HuggingFace (HF)**, the **Local Filesystem (LF)**, an **Annotation Tool (AT)** server, or a **Built-in (BI)** dataset

::

    from nerblackbox import Dataset

    dataset = Dataset("conll2003",  source="HF")  # HuggingFace
    dataset = Dataset("my_dataset", source="LF")  # Local Filesystem
    dataset = Dataset("swe_nerc",   source="BI")  # Built-in

- Set up the dataset

::

    dataset.set_up()


2. Training
"""""""""""

- Define the training by choosing a pretrained model and a dataset

::

    from nerblackbox import Training
    training = Training("my_training", model="bert-base-cased", dataset="conll2003")

- Run the training and get the performance of the fine-tuned model

::

    training.run()
    training.get_result(metric="f1", level="entity", phase="test")
    # 0.9045
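
To make ``level="entity"`` more concrete, here is a minimal, library-independent sketch of entity-level F1 on BIO tags. It is not nerblackbox's own implementation; the helper names ``bio_to_spans`` and ``entity_f1`` are hypothetical. An entity counts as correct only if both its boundaries and its tag match exactly.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start token, end token (exclusive), tag)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (hypothetical helper).

    A stray "I-" tag without a preceding "B-" is treated as outside.
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # the sentinel flushes the last span
        inside = tag.startswith("I-") and label == tag[2:]
        if not inside:
            if label is not None:
                spans.add((start, i, label))
                start, label = None, None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """F1 over exact span matches between gold and predicted tags."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = ["B-ORG", "I-ORG", "O", "B-LOC"]
pred = ["B-ORG", "O", "O", "B-LOC"]  # the ORG entity is cut short
print(entity_f1(gold, pred))  # 0.5
```

Three of the four tokens are tagged correctly, yet only one of the two entities matches exactly, which is why the entity-level score is lower than a token-level score would be.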


3. Evaluation
"""""""""""""

- Load the model

::

    from nerblackbox import Model
    model = Model.from_training("my_training")

- Evaluate the model

::

    results = model.evaluate_on_dataset("ehealth_kd", phase="test")
    results["micro"]["entity"]["f1"]
    # 0.9045
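
The ``"micro"`` key refers to micro-averaging. As a rough illustration (plain Python, not the library's code), micro-averaging pools the counts of all classes before computing F1, whereas macro-averaging computes F1 per class and then takes the mean. The per-class counts below are made-up placeholder values, and the function names are hypothetical.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 expressed directly in terms of counts: 2*TP / (2*TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def micro_f1(per_class: Dict[str, Counts]) -> float:
    """Pool the counts of all classes, then compute a single F1."""
    tp = sum(counts[0] for counts in per_class.values())
    fp = sum(counts[1] for counts in per_class.values())
    fn = sum(counts[2] for counts in per_class.values())
    return f1_from_counts(tp, fp, fn)


def macro_f1(per_class: Dict[str, Counts]) -> float:
    """Compute F1 per class, then take the unweighted mean."""
    scores = [f1_from_counts(*counts) for counts in per_class.values()]
    return sum(scores) / len(scores)


# Placeholder counts: a frequent class (PER) and a rare, harder class (LOC).
counts = {"PER": (9, 1, 1), "LOC": (1, 1, 1)}
print(round(micro_f1(counts), 4))  # 0.8333
print(round(macro_f1(counts), 4))  # 0.7
```

Micro-averaging is dominated by frequent classes, so it is worth checking per-class scores as well when rare entity types matter.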


4. Inference
""""""""""""

- Load the model

::

    model = Model.from_training("my_training")

- Let the model predict

::

    model.predict("The United Nations has never recognised Jakarta's move.")
    # [[
    #  {'char_start': '4', 'char_end': '18', 'token': 'United Nations', 'tag': 'ORG'},
    #  {'char_start': '40', 'char_end': '47', 'token': 'Jakarta', 'tag': 'LOC'}
    # ]]
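
The character offsets in the output can be checked against the input text with plain slicing. The sketch below copies the example output from above and assumes two things: that ``char_end`` is exclusive, and that the outer list holds one entry per input text. Since the offsets appear as strings in the example, they are converted with ``int``. The helper ``extract_spans`` is hypothetical.

```python
from typing import Dict, List, Tuple

text = "The United Nations has never recognised Jakarta's move."

# Copied from the example output above.
predictions = [[
    {"char_start": "4", "char_end": "18", "token": "United Nations", "tag": "ORG"},
    {"char_start": "40", "char_end": "47", "token": "Jakarta", "tag": "LOC"},
]]


def extract_spans(text: str, entities: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Slice each entity out of the text and pair it with its tag."""
    spans = []
    for entity in entities:
        start, end = int(entity["char_start"]), int(entity["char_end"])
        surface = text[start:end]
        # Sanity check: the slice should reproduce the reported token.
        if surface != entity["token"]:
            raise ValueError(f"offset mismatch: {surface!r} != {entity['token']!r}")
        spans.append((surface, entity["tag"]))
    return spans


print(extract_spans(text, predictions[0]))
# [('United Nations', 'ORG'), ('Jakarta', 'LOC')]
```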

There is much more to it than that! See the `documentation <https://flxst.github.io/nerblackbox>`__ to get started.

Features
========

*Data*

* Integration of Datasets from Multiple Sources (HuggingFace, Annotation Tools, ...)
* Support for Multiple Dataset Types (Standard, Pretokenized)
* Support for Multiple Annotation Schemes (IO, BIO, BILOU)
* Text Encoding
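
As an illustration of how the annotation schemes relate to each other, the following sketch converts a BIO tag sequence to BILOU and to IO. It is a standalone example with hypothetical function names, not the conversion code used by the library, and it assumes well-formed BIO input.

```python
from typing import List


def bio_to_bilou(tags: List[str]) -> List[str]:
    """BILOU additionally marks the Last token and Unit-length entities."""
    bilou = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bilou.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == f"I-{label}"  # does the entity go on?
        if prefix == "B":
            bilou.append(f"B-{label}" if continues else f"U-{label}")
        else:  # prefix == "I"
            bilou.append(f"I-{label}" if continues else f"L-{label}")
    return bilou


def bio_to_io(tags: List[str]) -> List[str]:
    """IO drops the begin marker, so adjacent same-type entities merge."""
    return [tag if tag == "O" else "I-" + tag.split("-", 1)[1] for tag in tags]


bio = ["O", "B-ORG", "I-ORG", "O", "B-LOC"]
print(bio_to_bilou(bio))  # ['O', 'B-ORG', 'L-ORG', 'O', 'U-LOC']
print(bio_to_io(bio))     # ['O', 'I-ORG', 'I-ORG', 'O', 'I-LOC']
```

Going from BIO to IO loses information (entity boundaries between neighbouring entities of the same type), while BIO and BILOU can be converted into each other without loss.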

*Training*

* Adaptive Fine-tuning
* Hyperparameter Search
* Multiple Runs with Different Random Seeds
* Detailed Analysis of Training Results
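
Running the same training with several random seeds is mainly useful for reporting a mean score together with its uncertainty. A minimal sketch of that aggregation, independent of the library, could look as follows; ``summarize_runs`` is a hypothetical helper and the scores are placeholder values, not real results.

```python
import math
import statistics
from typing import List, Tuple


def summarize_runs(scores: List[float]) -> Tuple[float, float]:
    """Return the mean and the standard error of the mean over several runs."""
    if len(scores) < 2:
        raise ValueError("at least two runs are needed to estimate the spread")
    mean = statistics.mean(scores)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    standard_error = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, standard_error


# Placeholder F1 scores, one per random seed.
f1_per_seed = [0.90, 0.91, 0.92]
mean, standard_error = summarize_runs(f1_per_seed)
print(f"f1 = {mean:.4f} +- {standard_error:.4f}")
```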

*Evaluation*

* Evaluation of Any Model on Any Dataset

*Inference*

* Versatile Model Inference (Entity/Word Level, Probabilities, ...)

*Other*

* Full Compatibility with HuggingFace
* GPU Support
* Language Agnosticism

See the `documentation <https://flxst.github.io/nerblackbox>`__ for details.

Citation
========

::

    @misc{nerblackbox,
      author = {Stollenwerk, Felix},
      title  = {nerblackbox: a high-level library for named entity recognition in python},
      year   = {2021},
      url    = {https://github.com/flxst/nerblackbox},
    }



            
