===========
nerblackbox
===========

A High-level Library for Named Entity Recognition in Python.

.. image:: https://img.shields.io/pypi/v/nerblackbox
    :target: https://pypi.org/project/nerblackbox
    :alt: PyPI

.. image:: https://img.shields.io/pypi/pyversions/nerblackbox
    :target: https://www.python.org/doc/versions/
    :alt: PyPI - Python Version

.. image:: https://github.com/flxst/nerblackbox/actions/workflows/python-package.yml/badge.svg
    :target: https://github.com/flxst/nerblackbox/actions/workflows/python-package.yml
    :alt: CI

.. image:: https://coveralls.io/repos/github/flxst/nerblackbox/badge.svg?branch=master
    :target: https://coveralls.io/github/flxst/nerblackbox?branch=master

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
    :target: https://github.com/psf/black

.. image:: https://img.shields.io/pypi/l/nerblackbox
    :target: https://github.com/flxst/nerblackbox/blob/latest/LICENSE.txt
    :alt: PyPI - License

Resources
=========

- Source Code: https://github.com/flxst/nerblackbox
- Documentation: https://flxst.github.io/nerblackbox
- PyPI: https://pypi.org/project/nerblackbox

Installation
============

::

    pip install nerblackbox

About
=====

.. image:: https://raw.githubusercontent.com/flxst/nerblackbox/master/docs/docs/images/nerblackbox_sources.png

Take a dataset from one of many available sources.
Then train, evaluate, and apply a language model
in a few simple steps.

1. Data
"""""""

- Choose a dataset from **HuggingFace (HF)**, the **Local Filesystem (LF)**, an **Annotation Tool (AT)** server, or the **Built-in (BI)** datasets

::

    from nerblackbox import Dataset

    dataset = Dataset("conll2003", source="HF")   # HuggingFace
    dataset = Dataset("my_dataset", source="LF")  # Local Filesystem
    dataset = Dataset("swe_nerc", source="BI")    # Built-in

- Set up the dataset

::

    dataset.set_up()

2. Training
"""""""""""

- Define the training by choosing a pretrained model and a dataset

::

    from nerblackbox import Training
    training = Training("my_training", model="bert-base-cased", dataset="conll2003")

- Run the training and get the performance of the fine-tuned model

::

    training.run()
    training.get_result(metric="f1", level="entity", phase="test")
    # 0.9045

3. Evaluation
"""""""""""""

- Load the model

::

    from nerblackbox import Model
    model = Model.from_training("my_training")

- Evaluate the model

::

    results = model.evaluate_on_dataset("ehealth_kd", phase="test")
    results["micro"]["entity"]["f1"]
    # 0.9045
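
To make the ``results["micro"]["entity"]["f1"]`` lookup more concrete, the following is a minimal sketch of what a micro-averaged, entity-level F1 score measures. It is an illustration in plain Python and not nerblackbox's own implementation; the helper names ``extract_entities`` and ``micro_f1`` are hypothetical, and BIO-tagged input is assumed.

```python
from typing import List, Set, Tuple

# (sentence index, start token, end token (exclusive), tag)
Entity = Tuple[int, int, int, str]


def extract_entities(tag_sequences: List[List[str]]) -> Set[Entity]:
    """Collect entity spans from BIO-tagged sentences."""
    entities: Set[Entity] = set()
    for sent_idx, tags in enumerate(tag_sequences):
        start, label = None, None
        # The trailing "O" is a sentinel that closes an entity at sentence end.
        for i, tag in enumerate(tags + ["O"]):
            inside = tag.startswith("I-") and label == tag[2:]
            if start is not None and not inside:
                entities.add((sent_idx, start, i, label))
                start, label = None, None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
    return entities


def micro_f1(true_tags: List[List[str]], pred_tags: List[List[str]]) -> float:
    """Entity-level F1, pooled over all entity types (micro average)."""
    true_entities = extract_entities(true_tags)
    pred_entities = extract_entities(pred_tags)
    true_positives = len(true_entities & pred_entities)
    precision = true_positives / len(pred_entities) if pred_entities else 0.0
    recall = true_positives / len(true_entities) if true_entities else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


true_tags = [["B-ORG", "I-ORG", "O", "B-LOC"], ["B-PER", "O"]]
pred_tags = [["B-ORG", "I-ORG", "O", "O"], ["B-PER", "O"]]
score = micro_f1(true_tags, pred_tags)
print(round(score, 4))
```

An entity counts as correct only if its span and its tag both match exactly, which is why entity-level scores are typically lower than token-level ones.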

4. Inference
""""""""""""

- Load the model

::

    from nerblackbox import Model
    model = Model.from_training("my_training")

- Let the model predict

::

    model.predict("The United Nations has never recognised Jakarta's move.")
    # [[
    #  {'char_start': '4', 'char_end': '18', 'token': 'United Nations', 'tag': 'ORG'},
    #  {'char_start': '40', 'char_end': '47', 'token': 'Jakarta', 'tag': 'LOC'}
    # ]]
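
The prediction output above is a list (one entry per input text) of entity dictionaries. As a minimal sketch of how such output could be post-processed, the snippet below groups the predicted entities by tag. It works on a hard-coded copy of the example output, so it runs without a trained model; the helper ``entities_by_tag`` is hypothetical and not part of nerblackbox.

```python
from typing import Dict, List

text = "The United Nations has never recognised Jakarta's move."

# Copied from the example output above; note that the offsets are strings there.
predictions = [[
    {"char_start": "4", "char_end": "18", "token": "United Nations", "tag": "ORG"},
    {"char_start": "40", "char_end": "47", "token": "Jakarta", "tag": "LOC"},
]]


def entities_by_tag(text: str, entities: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group entity surface forms by tag, slicing them out of the input text."""
    grouped: Dict[str, List[str]] = {}
    for entity in entities:
        start, end = int(entity["char_start"]), int(entity["char_end"])
        grouped.setdefault(entity["tag"], []).append(text[start:end])
    return grouped


grouped = entities_by_tag(text, predictions[0])
print(grouped)
```

Slicing the original text with the character offsets, rather than reading the ``token`` field, is a simple way to check that the offsets and the surface form agree.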

There is much more to it than that! See the `documentation <https://flxst.github.io/nerblackbox>`__ to get started.

Features
========

*Data*

* Integration of Datasets from Multiple Sources (HuggingFace, Annotation Tools, ...)
* Support for Multiple Dataset Types (Standard, Pretokenized)
* Support for Multiple Annotation Schemes (IO, BIO, BILOU)
* Text Encoding
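
For readers unfamiliar with the annotation schemes named in the list above, here is a minimal sketch of how the same tag sequence looks in IO, BIO, and BILOU. It is a plain-Python illustration, not the conversion code used by nerblackbox; the function names ``bio_to_io`` and ``bio_to_bilou`` are hypothetical, and well-formed BIO input is assumed.

```python
from typing import List


def bio_to_io(tags: List[str]) -> List[str]:
    """IO keeps only inside/outside: every entity token becomes I-<label>."""
    return ["O" if tag == "O" else "I-" + tag[2:] for tag in tags]


def bio_to_bilou(tags: List[str]) -> List[str]:
    """BILOU additionally marks the Last token and Unit-length entities."""
    bilou: List[str] = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bilou.append("O")
            continue
        prefix, label = tag[0], tag[2:]
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == "I-" + label  # does the entity go on?
        if prefix == "B":
            bilou.append(("B-" if continues else "U-") + label)
        else:  # prefix == "I"
            bilou.append(("I-" if continues else "L-") + label)
    return bilou


bio = ["O", "B-ORG", "I-ORG", "O", "B-LOC"]
io = bio_to_io(bio)
bilou = bio_to_bilou(bio)
print(io)
print(bilou)
```

IO is the most compact scheme but cannot separate two adjacent entities of the same type; BIO and BILOU encode the entity boundaries explicitly.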

*Training*

* Adaptive Fine-tuning
* Hyperparameter Search
* Multiple Runs with Different Random Seeds
* Detailed Analysis of Training Results

*Evaluation*

* Evaluation of Any Model on Any Dataset

*Inference*

* Versatile Model Inference (Entity/Word Level, Probabilities, ...)

*Other*

* Full Compatibility with HuggingFace
* GPU Support
* Language Agnosticism

See the `documentation <https://flxst.github.io/nerblackbox>`__ for details.

Citation
========

::

    @misc{nerblackbox,
      author = {Stollenwerk, Felix},
      title  = {nerblackbox: a high-level library for named entity recognition in python},
      year   = {2021},
      url    = {https://github.com/flxst/nerblackbox},
    }