fastpynuts


- Name: fastpynuts
- Version: 1.0.0
- Summary: A fast implementation of querying for NUTS regions by location.
- Upload time: 2024-05-13 15:52:22
- Requires Python: >=3.8
- Keywords: eurostat, nuts, nomenclature of territorial units for statistics
# FastPyNUTS
A fast implementation for querying the [NUTS - Nomenclature of territorial units for statistics](https://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/nuts) dataset by location, particularly useful for large-scale applications.


![Figure: NUTS levels (Eurostat)](img/levels.gif) <br>
Figure: [_Eurostat_](https://ec.europa.eu/eurostat/documents/7116161/7117206/NUTS-layers.gif)


## Features
- fast querying of NUTS regions (~0.3ms/query)
- find all NUTS regions of a point or query user-defined NUTS-levels (0-3)
- use your own custom NUTS dataset (other CRS, enriched metadata, etc.)


## Installation
```cmd
pip install fastpynuts
```
`FastPyNUTS` requires `numpy`, `shapely`, `treelib` and `rtree`.


## Usage

#### Initialization and finding NUTS regions
The `NUTSfinder` class is the main tool to determine the NUTS regions of a point. It can be initialized from a local file
containing the NUTS regions, or via automatic download from [Eurostat](https://gisco-services.ec.europa.eu/distribution/v2/nuts).
```python
from fastpynuts import NUTSfinder

# construct from local file
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson")

# retrieve data automatically (the file is downloaded to '.data', or read from there if it already exists)
nf = NUTSfinder.from_web(scale=1, year=2021, epsg=4326)


# find NUTS regions
point = (11.57, 48.13)                      # (x, y) in the dataset's CRS; here (lon, lat) for EPSG:4326
regions = nf.find(*point)                   # find all regions
regions3 = nf.find_level(*point, 3)         # only find NUTS-3 regions
```

#### Assessing the results
The NUTS regions will be returned as an ordered list of `NUTSregion` objects.
```python
>>> regions
[NUTS0: DE, NUTS1: DE2, NUTS2: DE21, NUTS3: DE212]
```
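
The IDs in this list reflect the hierarchical NUTS coding scheme: a two-letter country code followed by one additional character per level, so the level equals the ID length minus two. The following is a minimal sketch of that scheme and is independent of `FastPyNUTS`; the helper names `nuts_level` and `nuts_ancestors` are hypothetical and only serve as illustration.

```python
from typing import List


def nuts_level(nuts_id: str) -> int:
    """Return the NUTS level (0-3) implied by the length of a NUTS ID."""
    level = len(nuts_id) - 2
    if not 0 <= level <= 3:
        raise ValueError(f"not a valid NUTS ID: {nuts_id!r}")
    return level


def nuts_ancestors(nuts_id: str) -> List[str]:
    """Return the IDs of all enclosing regions, from NUTS-0 down to the ID itself."""
    return [nuts_id[: 2 + level] for level in range(nuts_level(nuts_id) + 1)]


print(nuts_ancestors("DE212"))  # ['DE', 'DE2', 'DE21', 'DE212']
```

This can serve as a quick plausibility check on query results, since each entry of the returned list should be a prefix of the next.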

Each region object holds information about
- its ID and NUTS level
```python
>>> region = regions[0]
>>> region.id
DE
>>> region.level
0
```
- its geometry (a `shapely` Polygon or MultiPolygon) and the corresponding bounding box
```python
>>> region.geom
<MULTIPOLYGON (((10.454 47.556, 10.44 47.525, 10.441 47.514, 10.432 47.504, ...>
>>> region.bbox
(5.867697, 47.270114, 15.04116, 55.058165)
```
- further fields from the NUTS dataset and the original input feature in GeoJSON format
```python
>>> region.properties
{
    "NUTS_ID": "DE",
    "LEVL_CODE": 0,
    "CNTR_CODE": "DE",
    "NAME_LATN": "Deutschland",
    "NUTS_NAME": "Deutschland",
    "MOUNT_TYPE": 0,
    "URBN_TYPE": 0,
    "COAST_TYPE": 0,
    "FID": "DE"
}
>>> region.feature
{
    'type': 'Feature',
    'geometry': {
        'type': 'MultiPolygon',
        'coordinates': [
            [
                [
                    [10.454439, 47.555797],
                    ...
                ]
            ]
        ],
    },
    'properties': {
        'NUTS_ID': 'DE',
        ...
    }
}
```
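
The `bbox` tuple shown above appears to follow the common `(min_x, min_y, max_x, max_y)` order. As a minimal sketch, assuming a feature dict shaped like the one above, a bounding box of that form can be derived from the GeoJSON coordinates with the standard library alone. The helpers `iter_points` and `feature_bbox` and the toy feature are hypothetical and not part of `FastPyNUTS`.

```python
from typing import Iterator, Tuple


def iter_points(geometry: dict) -> Iterator[Tuple[float, float]]:
    """Yield every (x, y) vertex of a GeoJSON Polygon or MultiPolygon."""
    coords = geometry["coordinates"]
    # A Polygon is a list of rings; a MultiPolygon is a list of polygons.
    polygons = [coords] if geometry["type"] == "Polygon" else coords
    for polygon in polygons:
        for ring in polygon:
            for x, y in ring:
                yield x, y


def feature_bbox(feature: dict) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) of a GeoJSON feature."""
    xs, ys = zip(*iter_points(feature["geometry"]))
    return min(xs), min(ys), max(xs), max(ys)


# Toy feature with made-up coordinates (not real NUTS data)
toy = {
    "type": "Feature",
    "geometry": {
        "type": "MultiPolygon",
        "coordinates": [
            [[[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 0.0]]],
            [[[5.0, 5.0], [6.0, 5.0], [6.0, 7.0], [5.0, 5.0]]],
        ],
    },
    "properties": {"NUTS_ID": "XX"},
}
print(feature_bbox(toy))  # (0.0, 0.0, 6.0, 7.0)
```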

## Advanced Usage
```python
# apply a buffer to the input regions to catch points on the boundary (for further info on the buffering, see the documentation)
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson", buffer_geoms=1e-5)

# only load certain levels of regions (here levels 2 and 3)
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson", min_level=2, max_level=3)


# if the point to be queried is guaranteed to lie within a NUTS region, setting valid_point to True may speed up the runtime
regions = nf.find(*point, valid_point=True)
```


## Runtime Comparison
`FastPyNUTS` is optimized for query speed and result correctness, at the cost of a longer initialization time.

An R-tree-based approach proved to be the fastest option:
<table>
 <tr>
    <td> <img src="img/benchmark_1.png" alt="Benchmark for scale 1."> </td>
    <td> <img src="img/benchmark_20_zoom.png" alt="Benchmark for scale 20 (zoomed)."> </td>
  </tr>
</table>
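
The general idea behind a spatial index lookup is filter-and-refine: a cheap bounding-box test narrows down the candidate regions, and only those candidates undergo the exact point-in-polygon test. The sketch below illustrates this with a plain linear scan over bounding boxes in place of an R-tree, and a ray-casting test in place of `shapely`. It is not the implementation used by `FastPyNUTS`; all function names and the toy regions are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Ring = Sequence[Point]
BBox = Tuple[float, float, float, float]


def bbox_of(ring: Ring) -> BBox:
    """Return (min_x, min_y, max_x, max_y) of a ring of vertices."""
    xs, ys = zip(*ring)
    return min(xs), min(ys), max(xs), max(ys)


def in_bbox(x: float, y: float, bbox: BBox) -> bool:
    """Cheap filter step: is the point inside the bounding box?"""
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def in_ring(x: float, y: float, ring: Ring) -> bool:
    """Exact refine step: ray casting, toggling on each edge crossed to the right of the point."""
    inside = False
    closed = list(ring[1:]) + list(ring[:1])
    for (x1, y1), (x2, y2) in zip(ring, closed):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find(x: float, y: float, regions: Dict[str, Ring], boxes: Dict[str, BBox]) -> List[str]:
    """Filter by bounding box first, then refine with the exact test."""
    candidates = [rid for rid, box in boxes.items() if in_bbox(x, y, box)]
    return [rid for rid in candidates if in_ring(x, y, regions[rid])]


# Toy regions with made-up coordinates (not real NUTS data)
regions = {
    "A": [(0, 0), (4, 0), (0, 4)],          # triangle
    "B": [(3, 3), (6, 3), (6, 6), (3, 6)],  # square
}
boxes = {rid: bbox_of(ring) for rid, ring in regions.items()}  # built once, reused per query

print(find(3.5, 3.5, regions, boxes))  # ['B'] (A passes the bbox filter but fails the exact test)
print(find(1.0, 1.0, regions, boxes))  # ['A']
```

An R-tree replaces the linear scan over `boxes` with a tree traversal, so the filter step no longer has to touch every region. Building that tree is a one-time cost paid at initialization.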

Compared to other packages like [nuts-finder](https://github.com/nestauk/nuts_finder), a large performance boost can be achieved:

![](img/benchmark_other.png)

**Tips**:
- if interested only in certain levels (0-3) of the NUTS dataset, initialize the `NUTSfinder` using its `min_level` and `max_level` arguments
- if it's known beforehand that the queried point lies within the interior of a NUTS region, use `find(valid_point=True)`

For a full runtime analysis, see [benchmark.ipynb](benchmark.ipynb).
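
For a rough check of the per-query time on your own data, a small timing loop is often sufficient. The sketch below measures the mean wall-clock time per call of any `find(x, y)`-style callable. The helper `mean_query_ms` is hypothetical, and `dummy_find` is a stand-in for a real call such as `nf.find`, so the printed number says nothing about `FastPyNUTS` itself.

```python
import random
import time
from typing import Callable, Iterable, Tuple


def mean_query_ms(find: Callable[[float, float], object],
                  points: Iterable[Tuple[float, float]]) -> float:
    """Return the mean wall-clock time per call to find(x, y), in milliseconds."""
    points = list(points)
    start = time.perf_counter()
    for x, y in points:
        find(x, y)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(points)


def dummy_find(x: float, y: float) -> list:
    """Stand-in for a real query function (assumption: same (x, y) call shape)."""
    return []


# Reproducible random points in a rough lon/lat box (made-up test data)
rng = random.Random(0)
points = [(rng.uniform(-10, 30), rng.uniform(35, 70)) for _ in range(10_000)]
print(f"{mean_query_ms(dummy_find, points):.4f} ms/query")
```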



## Contributors
- [Colin Moldenhauer](https://github.com/ColinMoldenhauer/)
- [meengel](https://github.com/meengel)

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "fastpynuts",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "eurostat, NUTS, nomenclature of territorial units for statistics",
    "author": null,
    "author_email": "Colin Moldenhauer <colin.moldenhauer@tum.de>, Michael Engel <m.engel@tum.de>",
    "download_url": "https://files.pythonhosted.org/packages/b0/16/b80f7f88fad4f9fdfadca982e15a6678f3992e625442182f016bc0ea9f02/fastpynuts-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A fast implementation of querying for NUTS regions by location.",
    "version": "1.0.0",
    "project_urls": {
        "Homepage": "https://github.com/ColinMoldenhauer/FastPyNUTS",
        "Issues": "https://github.com/ColinMoldenhauer/FastPyNUTS/issues"
    },
    "split_keywords": [
        "eurostat",
        " nuts",
        " nomenclature of territorial units for statistics"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "93d028ff2fac0e39fafe61c99d3eea6b9f42e74da6a7b0426d4579c82e18a03b",
                "md5": "e8ad52621d5949ae09f844aa8c4e9320",
                "sha256": "6cf51bd43c08cbdd251687e45d64c9baf55c97959bc427c9ee38b86977216e0d"
            },
            "downloads": -1,
            "filename": "fastpynuts-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e8ad52621d5949ae09f844aa8c4e9320",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 23764,
            "upload_time": "2024-05-13T15:52:20",
            "upload_time_iso_8601": "2024-05-13T15:52:20.405193Z",
            "url": "https://files.pythonhosted.org/packages/93/d0/28ff2fac0e39fafe61c99d3eea6b9f42e74da6a7b0426d4579c82e18a03b/fastpynuts-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b016b80f7f88fad4f9fdfadca982e15a6678f3992e625442182f016bc0ea9f02",
                "md5": "ad5fd32abcfbfef77f6c015c748dcc97",
                "sha256": "f6b06cae8fd91fd660f13136a681b3ccc088455124acfd329495c846ee4f176b"
            },
            "downloads": -1,
            "filename": "fastpynuts-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "ad5fd32abcfbfef77f6c015c748dcc97",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 26081,
            "upload_time": "2024-05-13T15:52:22",
            "upload_time_iso_8601": "2024-05-13T15:52:22.401966Z",
            "url": "https://files.pythonhosted.org/packages/b0/16/b80f7f88fad4f9fdfadca982e15a6678f3992e625442182f016bc0ea9f02/fastpynuts-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-13 15:52:22",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ColinMoldenhauer",
    "github_project": "FastPyNUTS",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "fastpynuts"
}
        