pyap2


:Name: pyap2
:Version: 0.1.9
:Home page: https://github.com/argyle-engineering/pyap
:Summary: Pyap2 is a maintained fork of pyap, a regex-based library for parsing US, CA, and UK addresses. The fork adds typing support, handles more address formats and edge cases.
:Upload time: 2024-10-31 16:36:55
:Author: Argyle Developers
:Requires Python: <4.0,>=3.9
:License: MIT
:Keywords: address, parser, regex

Pyap2: Python address parser
============================


Pyap2 is a maintained fork of Pyap, a regex-based Python library for
detecting and parsing addresses. It currently supports US 🇺🇸, Canadian 🇨🇦 and British 🇬🇧 addresses.


.. code-block:: python

    >>> import pyap
    >>> test_address = """
    ...     Lorem ipsum
    ...     225 E. John Carpenter Freeway,
    ...     Suite 1500 Irving, Texas 75062
    ...     Dorem sit amet
    ...     """
    >>> addresses = pyap.parse(test_address, country='US')
    >>> for address in addresses:
    ...     # shows found address
    ...     print(address)
    ...     # shows address parts
    ...     print(address.as_dict())
    ...




Installation
------------

To install Pyap2, run:

.. code-block:: bash

    $ pip install pyap2



About
-----
We started improving the original ``pyap`` by adopting Poetry and adding typing support.
The fork was then tested extensively in web-scraping operations on thousands of US addresses.
Gradually, we added support for many rarer address formats and edge cases, as well
as the ability to parse a partial address where only street info is available.


Typical workflow
----------------
Use Pyap2 as a first pass when you need to detect addresses
inside a text and do not know in advance whether the text contains
any addresses at all.
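
A minimal sketch of such a first pass is shown below. It is an illustration rather than
part of the library: ``first_pass`` and ``fake_detect`` are hypothetical names, and the
detector is passed in as a plain callable so that the sketch runs without Pyap2 installed.
The only assumption about the detector is that it returns an empty result when no address is found.

.. code-block:: python

    from typing import Callable, Iterable, List, Sequence, Tuple


    def first_pass(
        texts: Iterable[str],
        detect: Callable[[str], Sequence],
    ) -> Tuple[List[Tuple[str, Sequence]], List[str]]:
        """Split texts into (text, matches) pairs and texts with no match."""
        hits: List[Tuple[str, Sequence]] = []
        misses: List[str] = []
        for text in texts:
            found = detect(text)
            if found:  # an empty result means no address was detected
                hits.append((text, found))
            else:
                misses.append(text)
        return hits, misses


    def fake_detect(text: str) -> List[str]:
        """Hypothetical stand-in detector so the sketch runs without pyap2."""
        marker = "225 E. John Carpenter Freeway"
        return [marker] if marker in text else []


    samples = ["Visit us at 225 E. John Carpenter Freeway, Irving", "Lorem ipsum"]
    hits, misses = first_pass(samples, fake_detect)
    print(len(hits), len(misses))

With Pyap2 installed, ``detect`` could instead be ``lambda text: pyap.parse(text, country='US')``,
the call used in the example at the top of this README.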


Limitations
-----------
Because Pyap2 (like Pyap) is based on regular expressions, it returns results quickly.
This is also a limitation: the regular expressions intentionally use little
context when detecting an address.

In other words, to detect a US address the library does not consult
a list of US cities or a list of typical street names. It
looks for a pattern that is most likely to be an address.

For example, the string below would be detected as a valid address:
"1 SPIRITUAL HEALER DR SHARIF NSAMBU SPECIALISING IN"

This happens because the string has all the components of a valid
address: street number "1", street name "SPIRITUAL HEALER" followed
by a street identifier "DR" (Drive), city "SHARIF NSAMBU SPECIALISING"
and a state name abbreviation "IN" (Indiana).
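
The effect can be reproduced with a deliberately simplified pattern. The sketch below is
**not** the regular expression Pyap2 uses. ``TOY_ADDRESS``, ``STREET_TYPES`` and ``STATES``
are made-up names with tiny hard-coded lists, chosen only to show how a purely structural
match accepts the string above.

.. code-block:: python

    import re

    # Toy pattern, not the one pyap2 ships: number, street name, street type,
    # city words, then a two-letter state abbreviation.
    STREET_TYPES = r"(?:DR|ST|AVE|RD|BLVD)"
    STATES = r"(?:IN|TX|CA|NY)"
    TOY_ADDRESS = re.compile(
        rf"(?P<number>\d+)\s+"
        rf"(?P<street>[A-Z]+(?:\s+[A-Z]+)*?)\s+"
        rf"(?P<type>{STREET_TYPES})\s+"
        rf"(?P<city>[A-Z]+(?:\s+[A-Z]+)*?)\s+"
        rf"(?P<state>{STATES})\b"
    )

    text = "1 SPIRITUAL HEALER DR SHARIF NSAMBU SPECIALISING IN"
    match = TOY_ADDRESS.search(text)
    # The pattern only checks shape, so every "component" is filled in.
    parts = match.groupdict() if match else {}
    print(parts)

Nothing in the pattern knows that "DR" here means "Doctor" or that "IN" is a preposition,
which is the trade-off described above.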

The good news is that such false positives are **quite rare**.
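
Where even rare false positives matter, a caller could cross-check parsed results against
data it trusts. The sketch below is one possible approach, not a feature of the library.
``keep_plausible`` and ``KNOWN_CITIES`` are hypothetical, and the ``city`` and ``region1``
keys are assumed to mirror what ``address.as_dict()`` returns.

.. code-block:: python

    from typing import Dict, Iterable, List, Set

    # Hypothetical allow-list; a real one would come from a source you trust.
    KNOWN_CITIES: Set[str] = {"IRVING", "DALLAS", "AUSTIN"}


    def keep_plausible(
        parsed: Iterable[Dict[str, str]],
        known_cities: Set[str],
    ) -> List[Dict[str, str]]:
        """Keep only parsed addresses whose city appears in known_cities."""
        return [
            item
            for item in parsed
            if (item.get("city") or "").strip().upper() in known_cities
        ]


    # Plain dicts stand in for parsed addresses (key names are assumptions).
    candidates = [
        {"city": "Irving", "region1": "TX"},
        {"city": "SHARIF NSAMBU SPECIALISING", "region1": "IN"},
    ]
    plausible = keep_plausible(candidates, KNOWN_CITIES)
    print(plausible)

This trades some recall for precision: a genuine address in a city missing from the list
would also be dropped.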




            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/argyle-engineering/pyap",
    "name": "pyap2",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.9",
    "maintainer_email": null,
    "keywords": "address, parser, regex",
    "author": "Argyle Developers",
    "author_email": "developers@argyle.io",
    "download_url": "https://files.pythonhosted.org/packages/0d/b7/1a4728faefc66783a76e254f8fd1d70eb6b13f837eb164bfaa6d96fc507d/pyap2-0.1.9.tar.gz",
    "platform": null,
    "description": "Pyap2: Python address parser\n============================\n\n\nPyap2 is a maintained fork of Pyap, a regex-based python library for\ndetecting and parsing addresses. Currently it supports US \ud83c\uddfa\ud83c\uddf8, Canadian \ud83c\udde8\ud83c\udde6 and British \ud83c\uddec\ud83c\udde7 addresses. \n\n\n.. code-block:: python\n\n    >>> import pyap\n    >>> test_address = \"\"\"\n        Lorem ipsum\n        225 E. John Carpenter Freeway, \n        Suite 1500 Irving, Texas 75062\n        Dorem sit amet\n        \"\"\"\n    >>> addresses = pyap.parse(test_address, country='US')\n    >>> for address in addresses:\n            # shows found address\n            print(address)\n            # shows address parts\n            print(address.as_dict())\n    ...\n\n\n\n\nInstallation\n------------\n\nTo install Pyap2, simply:\n\n.. code-block:: bash\n\n    $ pip install pyap2\n\n\n\nAbout\n-----\nWe started improving the original `pyap` by adopting poetry and adding typing support. \nIt was extensively tested in web-scraping operations on thousands of US addresses. \nGradually, we added support for many rarer address formats and edge cases, as well \nas the ability to parse a partial address where only street info is available. \n\n\nTypical workflow\n----------------\nPyap should be used as a first thing when you need to detect an address\ninside a text when you don't know for sure whether the text contains\naddresses or not.\n\n\nLimitations\n-----------\nBecause Pyap2 (and Pyap) is based on regular expressions it provides fast results.\nThis is also a limitation because regexps intentionally do not use too\nmuch context to detect an address.\n\nIn other words in order to detect US address, the library doesn't\nuse any list of US cities or a list of typical street names. 
It\nlooks for a pattern which is most likely to be an address.\n\nFor example the string below would be detected as a valid address: \n\"1 SPIRITUAL HEALER DR SHARIF NSAMBU SPECIALISING IN\"\n\nIt happens because this string has all the components of a valid\naddress: street number \"1\", street name \"SPIRITUAL HEALER\" followed\nby a street identifier \"DR\" (Drive), city \"SHARIF NSAMBU SPECIALISING\"\nand a state name abbreviation \"IN\" (Indiana).\n\nThe good news is that the above mentioned errors are **quite rare**.\n\n\n\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Pyap2 is a maintained fork of pyap, a regex-based library for parsing US, CA, and UK addresses. The fork adds typing support, handles more address formats and edge cases.",
    "version": "0.1.9",
    "project_urls": {
        "Documentation": "https://github.com/argyle-engineering/pyap",
        "Homepage": "https://github.com/argyle-engineering/pyap",
        "Repository": "https://github.com/argyle-engineering/pyap"
    },
    "split_keywords": [
        "address",
        " parser",
        " regex"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8622aed56d62102d85de1ccbfbba68baf1416b6684fd0e4475b046d66563c709",
                "md5": "981c937754582e5299555e3be64029ee",
                "sha256": "2b9ab4951e748d3a802caa3c58091a5d30ca1e25e18c5c605287917f61f62e19"
            },
            "downloads": -1,
            "filename": "pyap2-0.1.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "981c937754582e5299555e3be64029ee",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 22613,
            "upload_time": "2024-10-31T16:36:54",
            "upload_time_iso_8601": "2024-10-31T16:36:54.200284Z",
            "url": "https://files.pythonhosted.org/packages/86/22/aed56d62102d85de1ccbfbba68baf1416b6684fd0e4475b046d66563c709/pyap2-0.1.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0db71a4728faefc66783a76e254f8fd1d70eb6b13f837eb164bfaa6d96fc507d",
                "md5": "e0ab19ef1ce84c866ec672cf1050f3c5",
                "sha256": "37b327f55cf8c062e8a9c4c5ece4aa7099440c7d1d7c66d7d638d18ea97d4395"
            },
            "downloads": -1,
            "filename": "pyap2-0.1.9.tar.gz",
            "has_sig": false,
            "md5_digest": "e0ab19ef1ce84c866ec672cf1050f3c5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 18975,
            "upload_time": "2024-10-31T16:36:55",
            "upload_time_iso_8601": "2024-10-31T16:36:55.387284Z",
            "url": "https://files.pythonhosted.org/packages/0d/b7/1a4728faefc66783a76e254f8fd1d70eb6b13f837eb164bfaa6d96fc507d/pyap2-0.1.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-31 16:36:55",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "argyle-engineering",
    "github_project": "pyap",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "tox": true,
    "lcname": "pyap2"
}
        