# TEXTA CRF Extractor



## Requirements
* Python >= 3.8
* A SciPy installation, needed by scikit-learn (SciPy itself requires the BLAS and LAPACK system libraries).
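
The requirements above can be checked before installing. The following is a minimal sketch using only the standard library; the helper name `check_requirements` and its parameters are illustrative and are not part of texta-crf-extractor.

```python
import importlib.util
import sys

# Minimum Python version stated in the requirements list.
MIN_PYTHON = (3, 8)


def check_requirements(min_python=MIN_PYTHON, modules=("scipy",)):
    """Return a list of problems found; an empty list means nothing is missing.

    Hypothetical helper for illustration only, not part of the package.
    """
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python >= {wanted} is required")
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"module '{name}' is not importable")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

An empty result only means that Python is recent enough and SciPy can be located; it does not verify that BLAS and LAPACK are linked correctly.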
## Installation:
```
# On Debian-based systems (e.g. debian:buster), install the binary dependencies first.
apt-get update && apt-get install python3-scipy

# Install without MLP
pip install texta-crf-extractor

# Install with MLP (the quotes keep shells such as zsh from expanding the brackets)
pip install "texta-crf-extractor[mlp]"
```
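
To confirm that the installation succeeded, the installed version can be read from the package metadata. This is a small sketch based on the standard-library `importlib.metadata` module (available from Python 3.8); the helper name `installed_version` is an assumption made for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="texta-crf-extractor"):
    """Return the installed version string of `package`, or None if it is absent.

    Hypothetical helper for illustration only, not part of the package.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found is not None else "texta-crf-extractor is not installed")
```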
## Usage:
```
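from texta_crf_extractor.crf_extractor import CRFExtractor
from texta_mlp.mlp import MLP  # needs the texta_mlp package (see "Install with MLP")

# Load the MLP preprocessor for English
mlp = MLP(language_codes=["en"], default_language_code="en")

# Prepare data ("foo" and "bar" are placeholders for real training texts)
texts = ["foo", "bar"]
mlp_docs = [mlp.process(text) for text in texts]

# Create the extractor
extractor = CRFExtractor(mlp=mlp)

# Train the CRF model
extractor.train(mlp_docs)

# Tag a new text ("Tere maailm!" is Estonian for "Hello world!")
extractor.tag("Tere maailm!")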
```
Raw data
{
"_id": null,
"home_page": "https://git.texta.ee/texta/texta-crf-extractor-python",
"name": "texta-crf-extractor",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "TEXTA",
"author_email": "info@texta.ee",
"download_url": "https://files.pythonhosted.org/packages/30/17/531bfc401a61e69dbff19269d0ebd21c59ef5731b236c65987941b168c0c/texta-crf-extractor-2.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPLv3",
"summary": "texta-crf-extractor",
"version": "2.2.0",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "3017531bfc401a61e69dbff19269d0ebd21c59ef5731b236c65987941b168c0c",
"md5": "ed67638b555dd41a74827e0d3ef22057",
"sha256": "b7cfe1fe5e7b0ba45a32a3eec819b3ea3a7aac067898cc7152cbca7588a538ab"
},
"downloads": -1,
"filename": "texta-crf-extractor-2.2.0.tar.gz",
"has_sig": false,
"md5_digest": "ed67638b555dd41a74827e0d3ef22057",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20727,
"upload_time": "2023-01-24T06:47:40",
"upload_time_iso_8601": "2023-01-24T06:47:40.154637Z",
"url": "https://files.pythonhosted.org/packages/30/17/531bfc401a61e69dbff19269d0ebd21c59ef5731b236c65987941b168c0c/texta-crf-extractor-2.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-01-24 06:47:40",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "texta-crf-extractor"
}