<img src="https://github.com/natasha/natasha-logos/blob/master/yargy.svg">
![CI](https://github.com/natasha/yargy/actions/workflows/test.yml/badge.svg)
Yargy uses rules and dictionaries to extract structured information from Russian texts. Yargy is similar to <a href="https://yandex.ru/dev/tomita">Tomita parser</a>.
## Install
Yargy supports Python 3.7+ and PyPy 3, and depends only on <a href="http://github.com/pymorphy2/pymorphy2">Pymorphy2</a>.
```bash
$ pip install yargy
```
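To check the requirement above and confirm the install, here is a minimal sketch that uses only the standard library. The helper names `python_is_supported` and `yargy_is_installed` are illustrative and are not part of Yargy.

```python
import sys
from importlib import util

# Minimum interpreter version stated in this README
MIN_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the stated minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def yargy_is_installed():
    """Return True if a top-level `yargy` package can be found (no import is run)."""
    return util.find_spec('yargy') is not None


print('Python supported:', python_is_supported())
print('yargy importable:', yargy_is_installed())
```

`find_spec` only locates the package, so the check stays cheap and does not load Pymorphy2 dictionaries.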
## Usage
```python
from yargy import Parser, rule, and_, not_
from yargy.interpretation import fact
from yargy.predicates import gram
from yargy.relations import gnc_relation
from yargy.pipelines import morph_pipeline


# Fact types that hold the extracted fields
Name = fact(
    'Name',
    ['first', 'last'],
)
Person = fact(
    'Person',
    ['position', 'name']
)

# Surname and first name by grammeme, excluding abbreviations
LAST = and_(
    gram('Surn'),
    not_(gram('Abbr')),
)
FIRST = and_(
    gram('Name'),
    not_(gram('Abbr')),
)

# Dictionary of positions, matched by morphological form
POSITION = morph_pipeline([
    'управляющий директор',
    'вице-мэр'
])

# Require agreement in gender, number and case
gnc = gnc_relation()
NAME = rule(
    FIRST.interpretation(
        Name.first
    ).match(gnc),
    LAST.interpretation(
        Name.last
    ).match(gnc)
).interpretation(
    Name
)

PERSON = rule(
    POSITION.interpretation(
        Person.position
    ).match(gnc),
    NAME.interpretation(
        Person.name
    )
).interpretation(
    Person
)

parser = Parser(PERSON)

match = parser.match('управляющий директор Иван Ульянов')
print(match.fact)

# The extracted fact, shown indented for readability:
# Person(
#     position='управляющий директор',
#     name=Name(
#         first='Иван',
#         last='Ульянов'
#     )
# )
```
## Documentation
All materials are in Russian:
* <a href="https://habr.com/ru/post/349864/">Overview</a>
* <a href="https://www.youtube.com/watch?v=NQxzx0qYgK8">Video from workshop</a>
* <a href="https://nbviewer.jupyter.org/github/natasha/yargy/blob/master/docs/index.ipynb">Getting started</a>
* <a href="https://nbviewer.jupyter.org/github/natasha/yargy/blob/master/docs/ref.ipynb">Reference</a>
* <a href="https://nbviewer.jupyter.org/github/natasha/yargy/blob/master/docs/cookbook.ipynb">Cookbook</a>
* <a href="https://github.com/natasha/yargy-examples">Examples</a>
* <a href="https://github.com/natasha/natasha-usage#yargy">Code snippets</a>
## Support
- Chat — https://t.me/natural_language_processing
- Issues — https://github.com/natasha/yargy/issues
- Commercial support — https://lab.alexkuk.ru
## Development
Dev env
```bash
brew install graphviz
python -m venv ~/.venvs/natasha-yargy
source ~/.venvs/natasha-yargy/bin/activate
pip install -r requirements/dev.txt
pip install -e .
python -m ipykernel install --user --name natasha-yargy
```
Test + lint
```bash
make test
```
Update docs
```bash
make exec-docs
# Manually check git diff docs/, commit
```
Release
```bash
# Update setup.py version
git commit -am 'Up version'
git tag v0.16.0
git push
git push --tags
# GitHub Action builds dist and publishes to PyPI
```
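The release steps rely on the tag matching the version in `setup.py`. A minimal sketch of that check follows. It assumes `setup.py` passes the version as a string literal (`version='...'`), which is not confirmed here, and the helpers `read_version` and `release_tag` are hypothetical.

```python
import re

# Assumption: setup.py contains a literal keyword such as version='0.16.0'
VERSION_PATTERN = re.compile(r"""version\s*=\s*['"]([^'"]+)['"]""")


def read_version(setup_source):
    """Extract the version string from the text of a setup.py file."""
    match = VERSION_PATTERN.search(setup_source)
    if match is None:
        raise ValueError('no literal version found in setup.py source')
    return match.group(1)


def release_tag(version):
    """Build the git tag in the v<version> form used in the release steps."""
    return 'v' + version


# Illustrative input, not the real setup.py
sample = "setup(name='yargy', version='0.16.0')"
print(release_tag(read_version(sample)))  # prints v0.16.0
```

Comparing `release_tag(read_version(...))` with the tag about to be pushed catches a forgotten version bump before the GitHub Action publishes.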
Raw data
{
"_id": null,
"home_page": "https://github.com/natasha/yargy",
"name": "yargy",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "natural language processing,russian morphology,glr,parser",
"author": "Yargy contributors",
"author_email": "d.a.veselov@yandex.ru, alex@alexkuk.ru",
"download_url": "https://files.pythonhosted.org/packages/87/ff/0ac3b2ae6aca6026e1acc872c1c371182662e94b1c1ab0b9c68854472670/yargy-0.16.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Rule-based facts extraction for Russian language",
"version": "0.16.0",
"project_urls": {
"Homepage": "https://github.com/natasha/yargy"
},
"split_keywords": [
"natural language processing",
"russian morphology",
"glr",
"parser"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b755d065a9812c619889fbe01a1863743ee45f7c60c462fc95b19576972ee9e4",
"md5": "5ccec641d27d5fc53207666a83f2159f",
"sha256": "7ca469fa47b336367fab49e8f33ccc195584f69ab758e8196f2fdaa7492adf22"
},
"downloads": -1,
"filename": "yargy-0.16.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5ccec641d27d5fc53207666a83f2159f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 33972,
"upload_time": "2023-07-24T09:57:21",
"upload_time_iso_8601": "2023-07-24T09:57:21.536516Z",
"url": "https://files.pythonhosted.org/packages/b7/55/d065a9812c619889fbe01a1863743ee45f7c60c462fc95b19576972ee9e4/yargy-0.16.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "87ff0ac3b2ae6aca6026e1acc872c1c371182662e94b1c1ab0b9c68854472670",
"md5": "4d60e6f3ebc5567a69e85c752a61d29b",
"sha256": "c917eefb32a40c23c46b6ca88d68927072dd00ab94e90fd5dc6ab0a62b59b593"
},
"downloads": -1,
"filename": "yargy-0.16.0.tar.gz",
"has_sig": false,
"md5_digest": "4d60e6f3ebc5567a69e85c752a61d29b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 68218,
"upload_time": "2023-07-24T09:57:23",
"upload_time_iso_8601": "2023-07-24T09:57:23.191829Z",
"url": "https://files.pythonhosted.org/packages/87/ff/0ac3b2ae6aca6026e1acc872c1c371182662e94b1c1ab0b9c68854472670/yargy-0.16.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-07-24 09:57:23",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "natasha",
"github_project": "yargy",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "yargy"
}
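The `sha256` values under `digests` above can be used to verify a downloaded archive. Here is a minimal sketch with the standard library. The helper names `sha256_of` and `matches_digest` are illustrative, and the local file path in the usage note is an assumption.

```python
import hashlib

# sha256 of yargy-0.16.0.tar.gz, copied from the metadata above
EXPECTED_SDIST_SHA256 = (
    'c917eefb32a40c23c46b6ca88d68927072dd00ab94e90fd5dc6ab0a62b59b593'
)


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, 'rb') as file:
        for chunk in iter(lambda: file.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare the file's sha256 with an expected hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.lower()
```

For example, `matches_digest('yargy-0.16.0.tar.gz', EXPECTED_SDIST_SHA256)` should return `True` for an intact copy of the sdist saved under that name.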