# yargy

- Name: yargy
- Version: 0.16.0
- Home page: https://github.com/natasha/yargy
- Summary: Rule-based facts extraction for Russian language
- Upload time: 2023-07-24 09:57:23
- Author: Yargy contributors
- License: MIT
- Keywords: natural language processing, russian morphology, glr, parser

<img src="https://github.com/natasha/natasha-logos/blob/master/yargy.svg">

![CI](https://github.com/natasha/yargy/actions/workflows/test.yml/badge.svg)

Yargy uses rules and dictionaries to extract structured information from Russian texts. Yargy is similar to <a href="https://yandex.ru/dev/tomita">Tomita parser</a>.

## Install

Yargy supports Python 3.7+ and PyPy 3. Its only dependency is <a href="http://github.com/pymorphy2/pymorphy2">Pymorphy2</a>.

```bash
$ pip install yargy
```

## Usage

```python
from yargy import Parser, rule, and_, not_
from yargy.interpretation import fact
from yargy.predicates import gram
from yargy.relations import gnc_relation
from yargy.pipelines import morph_pipeline


Name = fact(
    'Name',
    ['first', 'last'],
)
Person = fact(
    'Person',
    ['position', 'name']
)

LAST = and_(
    gram('Surn'),
    not_(gram('Abbr')),
)
FIRST = and_(
    gram('Name'),
    not_(gram('Abbr')),
)

POSITION = morph_pipeline([
    'управляющий директор',
    'вице-мэр'
])

# gnc_relation() requires the matched words to agree in gender, number and case
gnc = gnc_relation()
NAME = rule(
    FIRST.interpretation(
        Name.first
    ).match(gnc),
    LAST.interpretation(
        Name.last
    ).match(gnc)
).interpretation(
    Name
)

PERSON = rule(
    POSITION.interpretation(
        Person.position
    ).match(gnc),
    NAME.interpretation(
        Person.name
    )
).interpretation(
    Person
)

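parser = Parser(PERSON)

# The extracted fact is available as match.fact
match = parser.match('управляющий директор Иван Ульянов')
print(match.fact)

# Output (formatted for readability):
# Person(
#     position='управляющий директор',
#     name=Name(
#         first='Иван',
#         last='Ульянов'
#     )
# )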

```

## Documentation

All materials are in Russian:

* <a href="https://habr.com/ru/post/349864/">Overview</a>
* <a href="https://www.youtube.com/watch?v=NQxzx0qYgK8">Video from workshop</a>
* <a href="https://nbviewer.jupyter.org/github/natasha/yargy/blob/master/docs/index.ipynb">Getting started</a>
* <a href="https://nbviewer.jupyter.org/github/natasha/yargy/blob/master/docs/ref.ipynb">Reference</a>
* <a href="https://nbviewer.jupyter.org/github/natasha/yargy/blob/master/docs/cookbook.ipynb">Cookbook</a>
* <a href="https://github.com/natasha/yargy-examples">Examples</a>
* <a href="https://github.com/natasha/natasha-usage#yargy">Code snippets</a>

## Support

- Chat — https://t.me/natural_language_processing
- Issues — https://github.com/natasha/yargy/issues
- Commercial support — https://lab.alexkuk.ru

## Development

Dev env

```bash
brew install graphviz

python -m venv ~/.venvs/natasha-yargy
source ~/.venvs/natasha-yargy/bin/activate

pip install -r requirements/dev.txt
pip install -e .

python -m ipykernel install --user --name natasha-yargy
```

Test + lint

```bash
make test
```

Update docs

```bash
make exec-docs

# Manually check git diff docs/, commit
```

Release

```bash
# Update setup.py version

git commit -am 'Up version'
git tag v0.16.0

git push
git push --tags

# GitHub Actions builds the dist and publishes it to PyPI
```

            

## Raw data

{
    "_id": null,
    "home_page": "https://github.com/natasha/yargy",
    "name": "yargy",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "natural language processing,russian morphology,glr,parser",
    "author": "Yargy contributors",
    "author_email": "d.a.veselov@yandex.ru, alex@alexkuk.ru",
    "download_url": "https://files.pythonhosted.org/packages/87/ff/0ac3b2ae6aca6026e1acc872c1c371182662e94b1c1ab0b9c68854472670/yargy-0.16.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Rule-based facts extraction for Russian language",
    "version": "0.16.0",
    "project_urls": {
        "Homepage": "https://github.com/natasha/yargy"
    },
    "split_keywords": [
        "natural language processing",
        "russian morphology",
        "glr",
        "parser"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b755d065a9812c619889fbe01a1863743ee45f7c60c462fc95b19576972ee9e4",
                "md5": "5ccec641d27d5fc53207666a83f2159f",
                "sha256": "7ca469fa47b336367fab49e8f33ccc195584f69ab758e8196f2fdaa7492adf22"
            },
            "downloads": -1,
            "filename": "yargy-0.16.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5ccec641d27d5fc53207666a83f2159f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 33972,
            "upload_time": "2023-07-24T09:57:21",
            "upload_time_iso_8601": "2023-07-24T09:57:21.536516Z",
            "url": "https://files.pythonhosted.org/packages/b7/55/d065a9812c619889fbe01a1863743ee45f7c60c462fc95b19576972ee9e4/yargy-0.16.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "87ff0ac3b2ae6aca6026e1acc872c1c371182662e94b1c1ab0b9c68854472670",
                "md5": "4d60e6f3ebc5567a69e85c752a61d29b",
                "sha256": "c917eefb32a40c23c46b6ca88d68927072dd00ab94e90fd5dc6ab0a62b59b593"
            },
            "downloads": -1,
            "filename": "yargy-0.16.0.tar.gz",
            "has_sig": false,
            "md5_digest": "4d60e6f3ebc5567a69e85c752a61d29b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 68218,
            "upload_time": "2023-07-24T09:57:23",
            "upload_time_iso_8601": "2023-07-24T09:57:23.191829Z",
            "url": "https://files.pythonhosted.org/packages/87/ff/0ac3b2ae6aca6026e1acc872c1c371182662e94b1c1ab0b9c68854472670/yargy-0.16.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-24 09:57:23",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "natasha",
    "github_project": "yargy",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "yargy"
}
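
The `digests` entries above can be used to check a downloaded archive. The following is a minimal sketch, assuming `yargy-0.16.0.tar.gz` has already been downloaded to a local path. The helper names `sha256_of_file` and `matches_expected` are hypothetical and are not part of yargy or pip.

```python
import hashlib
import hmac

# sha256 digest of yargy-0.16.0.tar.gz, copied from the "digests" entry above
EXPECTED_SHA256 = 'c917eefb32a40c23c46b6ca88d68927072dd00ab94e90fd5dc6ab0a62b59b593'


def sha256_of_file(path, chunk_size=65536):
    """Return the hex sha256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as file:
        # Read fixed-size chunks so large archives are not loaded into memory at once
        for chunk in iter(lambda: file.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's sha256 equals the expected hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected.lower())
```

Calling `matches_expected('yargy-0.16.0.tar.gz')` (the path is an assumption about where the file was saved) returns `True` only if the local file hashes to the sha256 value listed for the sdist.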
        