whereabouts


| Field | Value |
| ----- | ----- |
| Name | whereabouts |
| Version | 0.3.7 |
| Summary | Open source geocoding in Python |
| Upload time | 2024-04-07 06:47:59 |
| Author | alex2718 |
| Requires Python | <3.13,>=3.10 |
| Keywords | geocoding, geospatial, record linkage |
| Home page | None |
| Maintainer | None |
| Docs URL | None |
| License | None |
| Requirements | No requirements were recorded. |

# Whereabouts
Fast, scalable geocoding for Python using DuckDB. The geocoding algorithms are based on the following papers:
- https://arxiv.org/abs/1708.01402
- https://arxiv.org/abs/1712.09691

## Description
Geocode addresses and reverse geocode coordinates directly from Python in your own environment. 
- No additional database setup required. Uses DuckDB to run all queries
- No need to send data to an external geocoding API
- Fast: geocodes thousands of addresses per second and reverse geocodes hundreds of thousands of coordinates per second
- Robust to typographical errors


## Requirements
- Python 3.10 to 3.12 (the package metadata declares `requires_python` as `<3.13,>=3.10`)
- Poetry (for package management)

## Installation
Once Poetry is installed and you are in the project directory:

```
poetry shell
poetry install
```

## Create a geocoder database
Before you can geocode, you need to create a geocoding database from a reference dataset containing addresses and their corresponding latitude and longitude values.

The reference file should be a single CSV file with at least three fields: the complete address, the latitude, and the longitude. Specify these fields in a `setup.yml` file. An example is included.

Once `setup.yml` is created and a reference dataset is available, create the geocoding database using the `setup_geocoder` function from `whereabouts.utils`.

The current process for using Australian data from the GNAF is as follows:
1) Download the latest version of GNAF core from https://geoscape.com.au/data/g-naf-core/
2) Update the `setup.yml` file to point to the location of the GNAF core file
3) Finally, set up the geocoder. This creates the required reference tables

```
python -m whereabouts setup_geocoder setup.yml
```

To use address data from another country, the file should have the following columns:

| Column name | Description |
| ----------- | ----------- |
| ADDRESS_DETAIL_PID | Unique identifier for address |
| ADDRESS_LABEL | The full address |
| ADDRESS_SITE_NAME | Name of the site. This is usually null |
| LOCALITY_NAME | Name of the suburb or locality |
| POSTCODE | Postcode of address |
| STATE | State |
| LATITUDE | Latitude of geocoded address |
| LONGITUDE | Longitude of geocoded address |
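
Before running the setup, it can help to confirm that a reference file actually has these columns. The following is a minimal sketch and not part of whereabouts: the helper `missing_columns` is hypothetical, the file is assumed to be comma-delimited with a header row, and the sample header is made up for illustration.

```python
import csv
import io

# Column names taken from the table above.
REQUIRED_COLUMNS = [
    "ADDRESS_DETAIL_PID",
    "ADDRESS_LABEL",
    "ADDRESS_SITE_NAME",
    "LOCALITY_NAME",
    "POSTCODE",
    "STATE",
    "LATITUDE",
    "LONGITUDE",
]


def missing_columns(csv_file):
    """Return the required columns absent from the header of an open CSV file."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Hypothetical in-memory sample; a real check would open the reference file.
sample = io.StringIO(
    "ADDRESS_DETAIL_PID,ADDRESS_LABEL,LOCALITY_NAME,POSTCODE,STATE,LATITUDE,LONGITUDE\n"
)
print(missing_columns(sample))  # ['ADDRESS_SITE_NAME']
```

For a file on disk, pass an open file handle instead, for example `open(path, newline="")`.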

## Examples

Geocode a list of addresses 
```
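from whereabouts.Matcher import Matcher

# Placeholder address strings; replace with your own data
addresslist = ['123 Example St, Sometown NSW 2000']

# db_name refers to the geocoder database created during setup
matcher = Matcher(db_name='gnaf_au')
matcher.geocode(addresslist, how='standard')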
```

For more accurate geocoding, you can use trigram phrases rather than token phrases. Note that the trigram option must have been specified in the `setup.yml` file as part of the setup:
```
matcher.geocode(addresslist, how='trigram')
```
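
To give some intuition for why trigram matching can tolerate typos better than token matching, here is a small standalone sketch. It is not how whereabouts builds its phrases internally. It assumes "trigram" means three-character substrings, and the helper names and sample addresses are made up for illustration.

```python
def tokens(text):
    """Split an address into a set of lowercase word tokens."""
    return set(text.lower().split())


def trigrams(text):
    """Return the set of 3-character substrings of the normalised text."""
    s = " ".join(text.lower().split())
    return {s[i:i + 3] for i in range(len(s) - 2)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


reference = "12 george street sydney"
typo = "12 goerge street sydney"  # two letters transposed

token_score = jaccard(tokens(reference), tokens(typo))
trigram_score = jaccard(trigrams(reference), trigrams(typo))

# A single typo invalidates a whole token but only a few trigrams,
# so the trigram score is the higher of the two for this pair.
print(token_score, trigram_score)
```

The trade-off is that an address produces many more trigrams than tokens, so trigram matching generally needs more storage and more work per query.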

Creating a `Matcher` object also builds the KD-tree used for fast reverse geocoding. A list of latitude, longitude values can then be reverse geocoded as follows:
```
matcher.reverse_geocode(coordinates)
```
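
The idea behind a KD-tree lookup can be sketched in pure Python. This is an illustration of the general technique only, not the implementation used by whereabouts. The function names and sample coordinates are hypothetical, and distance is measured as plain Euclidean distance in degrees, which is a simplification.

```python
import math


def build_kdtree(points, depth=0):
    """Recursively build a 2-d tree from (lat, lon) tuples."""
    if not points:
        return None
    axis = depth % 2  # alternate between latitude and longitude
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, target, depth=0, best=None):
    """Return the stored point closest to target, or None for an empty tree."""
    if node is None:
        return best
    point = node["point"]
    if best is None or math.dist(point, target) < math.dist(best, target):
        best = point
    axis = depth % 2
    diff = target[axis] - point[axis]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, target, depth + 1, best)
    # Search the far side only if the splitting line is closer than the best match.
    if abs(diff) < math.dist(best, target):
        best = nearest(far, target, depth + 1, best)
    return best


# Hypothetical reference coordinates, for illustration only.
reference_points = [(-33.87, 151.21), (-37.81, 144.96), (-27.47, 153.03), (-31.95, 115.86)]
tree = build_kdtree(reference_points)
print(nearest(tree, (-33.9, 151.2)))  # (-33.87, 151.21)
```

Because most branches are pruned, a lookup typically touches far fewer points than a full scan, which is what makes high-volume reverse geocoding practical.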
            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "whereabouts",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.10",
    "maintainer_email": null,
    "keywords": "geocoding, geospatial, record linkage",
    "author": "alex2718",
    "author_email": "ajlee3141@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/ce/ab/631f6b54fe49cd120463dce79e08d2440830aa3fbe151fed1dc085f7b155/whereabouts-0.3.7.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Open source geocoding in Python",
    "version": "0.3.7",
    "project_urls": null,
    "split_keywords": [
        "geocoding",
        " geospatial",
        " record linkage"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2bfdcdcef8f687750c27f38d9b7dbc3f9de26127656e8be2ac2c9abc933ce998",
                "md5": "7657c226b46dd9c5f6ae78474de13981",
                "sha256": "33540bc5a08c9ee445e14c18e1ef68f72be5cbb147f56d0785768a66f931c0cc"
            },
            "downloads": -1,
            "filename": "whereabouts-0.3.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7657c226b46dd9c5f6ae78474de13981",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.10",
            "size": 27237,
            "upload_time": "2024-04-07T06:47:56",
            "upload_time_iso_8601": "2024-04-07T06:47:56.967782Z",
            "url": "https://files.pythonhosted.org/packages/2b/fd/cdcef8f687750c27f38d9b7dbc3f9de26127656e8be2ac2c9abc933ce998/whereabouts-0.3.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ceab631f6b54fe49cd120463dce79e08d2440830aa3fbe151fed1dc085f7b155",
                "md5": "184419593f99eb8ac3381a4f3b1f5359",
                "sha256": "e4d0c25606ef5cd14dd0be66df879ace1bdf5fba3a84bb27c4f98f889895c388"
            },
            "downloads": -1,
            "filename": "whereabouts-0.3.7.tar.gz",
            "has_sig": false,
            "md5_digest": "184419593f99eb8ac3381a4f3b1f5359",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.10",
            "size": 14067,
            "upload_time": "2024-04-07T06:47:59",
            "upload_time_iso_8601": "2024-04-07T06:47:59.943467Z",
            "url": "https://files.pythonhosted.org/packages/ce/ab/631f6b54fe49cd120463dce79e08d2440830aa3fbe151fed1dc085f7b155/whereabouts-0.3.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-07 06:47:59",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "whereabouts"
}
        