uaddress


| Field                     | Value                                         |
| :-------------            | :-------------                                |
| Name                      | uaddress                                      |
| Version                   | 1.0.7                                         |
| Home page                 | https://github.com/RapidappsIT/uaddress       |
| Summary                   | Ukrainian address parser                      |
| Upload time               | 2023-10-27 10:43:51                           |
| Docs URL                  | None                                          |
| Author                    | Evgen Kytonin                                 |
| License                   | MIT                                           |
| Keywords                  | nlp, ukraine, address, research, parsing      |
| Requirements              | No requirements were recorded.                |

![header](https://github.com/RapidappsIT/uaddress/raw/master/doc/header.png)
# Description
[![PyPI version](https://badge.fury.io/py/uaddress.svg)](https://badge.fury.io/py/uaddress)

Parses an address into typed components. An adaptation of the [usaddress](https://github.com/datamade/usaddress) library for Ukrainian addresses.

> Read this in other languages: [English](README.en.md), [Русский](README.md), [Український](README.ua.md)

# Requirements
* python3
* [jackmartin.parserator](https://github.com/martinjack/parserator)

# Installation
```sh
pip3 install uaddress
```
# Local installation
```sh
python3 setup.py install --user
```

# Training the model
```shell
parserator train training/data.xml uaddress
```
### When the model is stored in a different location
```shell
parserator train training/data.xml uaddress --modelfile anotherpath/uaddr.crfsuite
```

# Testing the model
```shell
parserator label training/raw.csv training/data.xml uaddress
```
### When the model is stored in a different location
```shell
parserator label training/raw.csv training/data.xml uaddress --modelfile anotherpath/uaddr.crfsuite
```

# Structure
| File                      | Description                                   |
| :-------------            | :-------------                                |
| training/data.xml         | Dataset for the model                         |
| training/raw.csv          | List of addresses for training or validation  |
| uaddress/uaddr.crfsuite   | NLP model                                     |
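
Before retraining, it can help to see how often each label occurs in the dataset. The following is a minimal sketch, assuming `training/data.xml` follows the layout parserator uses for usaddress: a root element containing one element per address, whose children are tokens tagged with their label. The element names `AddressCollection` and `AddressString`, the sample addresses, and the helper `count_labels` are all assumptions made for illustration; only the label names are taken from this README.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample. The root and per-address element names are assumptions
# borrowed from usaddress; the real training/data.xml may name them differently.
SAMPLE_XML = """
<AddressCollection>
  <AddressString><LocalityType>м.</LocalityType> <Locality>Київ</Locality> <StreetType>вул.</StreetType> <Street>Хрещатик</Street> <HouseNumber>1</HouseNumber></AddressString>
  <AddressString><Locality>Львів</Locality> <StreetType>вул.</StreetType> <Street>Городоцька</Street> <HouseNumber>15</HouseNumber></AddressString>
</AddressCollection>
"""


def count_labels(xml_text: str) -> Counter:
    """Count how many tokens carry each label in a labeled-address XML document."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for sequence in root:        # one element per labeled address
        for token in sequence:   # one child element per token; its tag is the label
            counts[token.tag] += 1
    return counts


counts = count_labels(SAMPLE_XML)
for label, n in counts.most_common():
    print(label, n)
```

To run it against the real dataset, the file contents could be passed in instead of the sample, for example `count_labels(open("training/data.xml", encoding="utf-8").read())`. The function depends only on the nesting, not on the names of the outer elements.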

# Examples
![example1](https://github.com/RapidappsIT/uaddress/raw/master/doc/example1.gif)

## Example script
```sh 
python3 example.py
```
![example2](https://github.com/RapidappsIT/uaddress/raw/master/doc/example2.gif)

# Types
| Name                      | Description                                   |
| :-------------            | :-------------                                |
| Country                   | Country                                       |
| RegionType                | Region (oblast) type                          |
| Region                    | Region (oblast)                               |
| CountyType                | District (raion) type                         |
| County                    | District (raion)                              |
| SubLocalityType           | Sub-district type                             |
| SubLocality               | Sub-district                                  |
| LocalityType              | Locality type                                 |
| Locality                  | Locality                                      |
| StreetType                | Street type                                   |
| Street                    | Street                                        |
| HousingType               | Building block (korpus) type                  |
| Housing                   | Building block (korpus)                       |
| HostelType                | Dormitory type                                |
| Hostel                    | Dormitory                                     |
| HouseNumberType           | House number type                             |
| HouseNumber               | House number                                  |
| HouseNumberAdditionally   | Additional house number                       |
| SectionType               | Section type                                  |
| Section                   | Section                                       |
| ApartmentType             | Apartment type                                |
| Apartment                 | Apartment                                     |
| RoomType                  | Room type                                     |
| Room                      | Room                                          |
| Sector                    | Sector                                        |
| EntranceType              | Entrance type                                 |
| Entrance                  | Entrance number                               |
| FloorType                 | Floor type                                    |
| Floor                     | Floor                                         |
| PostCode                  | Postal code                                   |
| Manually                  | Set of types for further address parsing      |
| NotAddress                | Not an address                                |
| Comment                   | Comment                                       |
| AdditionalData            | Additional data                               |
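
The types above label individual tokens, so a multi-word street name comes back as several tokens with the same label. The following is a minimal sketch of collapsing labeled tokens into one string per type. It assumes the parser output is a list of `(token, label)` pairs, which is the shape usaddress's `parse` returns; whether uaddress returns exactly this shape is an assumption. The helper `group_by_type` and the sample list are hypothetical, and the sample is hand-written, not actual model output.

```python
from typing import Dict, Iterable, List, Tuple


def group_by_type(labeled_tokens: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Join all tokens that share a label, keeping labels in first-seen order."""
    grouped: Dict[str, List[str]] = {}
    for token, label in labeled_tokens:
        grouped.setdefault(label, []).append(token)
    return {label: " ".join(tokens) for label, tokens in grouped.items()}


# Hand-written illustration of the assumed (token, label) shape.
sample = [
    ("м.", "LocalityType"),
    ("Київ", "Locality"),
    ("вул.", "StreetType"),
    ("Велика", "Street"),
    ("Васильківська", "Street"),
    ("буд.", "HouseNumberType"),
    ("5", "HouseNumber"),
    ("кв.", "ApartmentType"),
    ("12", "Apartment"),
]

components = group_by_type(sample)
for label, text in components.items():
    print(f"{label}: {text}")
```

Grouping by label loses the position of repeated labels, so an address in which the same type appears in two separate places would have those tokens merged. For such inputs it may be better to keep the original pair list.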
            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/RapidappsIT/uaddress",
    "name": "uaddress",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "nlp,ukraine,address,research,parsing",
    "author": "Evgen Kytonin",
    "author_email": "killfess@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/0a/08/be39f1b09f8b0b5cf3846783fd911e04f0f1118150f9a6372bf45428c959/uaddress-1.0.7.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Ukrainian address parser",
    "version": "1.0.7",
    "project_urls": {
        "Homepage": "https://github.com/RapidappsIT/uaddress"
    },
    "split_keywords": [
        "nlp",
        "ukraine",
        "address",
        "research",
        "parsing"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0a08be39f1b09f8b0b5cf3846783fd911e04f0f1118150f9a6372bf45428c959",
                "md5": "4e7543932f48605566a5ee261fc9cfb2",
                "sha256": "85316ba8ae0016c1749c473c7232d6a83d2b1080b033131870f864022c136255"
            },
            "downloads": -1,
            "filename": "uaddress-1.0.7.tar.gz",
            "has_sig": false,
            "md5_digest": "4e7543932f48605566a5ee261fc9cfb2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 78668,
            "upload_time": "2023-10-27T10:43:51",
            "upload_time_iso_8601": "2023-10-27T10:43:51.644362Z",
            "url": "https://files.pythonhosted.org/packages/0a/08/be39f1b09f8b0b5cf3846783fd911e04f0f1118150f9a6372bf45428c959/uaddress-1.0.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-10-27 10:43:51",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "RapidappsIT",
    "github_project": "uaddress",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "uaddress"
}
        