fastpynuts


Name: fastpynuts
Version: 1.1.0
Summary: A fast implementation of querying for NUTS regions by location.
Upload time: 2024-06-30 13:43:07
Requires Python: >=3.8
Keywords: eurostat, NUTS, nomenclature of territorial units for statistics
Requirements: numpy, shapely, rtree, treelib

# FastPyNUTS
A fast implementation of querying the [NUTS - Nomenclature of territorial units for statistics](https://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/nuts) dataset by location, particularly useful for large-scale applications.


![Figure: NUTS levels (Eurostat)](img/levels.gif) <br>
Figure: [_Eurostat_](https://ec.europa.eu/eurostat/documents/7116161/7117206/NUTS-layers.gif)


## Features
- fast querying of NUTS regions (~0.3ms/query)
- find all NUTS regions of a point or query user-defined NUTS-levels (0-3)
- use your own custom NUTS dataset (other CRS, enriched metadata, etc.)


## Installation
```cmd
pip install fastpynuts
```
`FastPyNUTS` requires `numpy`, `shapely`, `treelib` and `rtree`.


## Usage

#### Initialization and finding NUTS regions
The `NUTSfinder` class is the main tool to determine the NUTS regions of a point. It can be initialized from a local file
containing the NUTS regions, or via automatic download from [Eurostat](https://gisco-services.ec.europa.eu/distribution/v2/nuts).
```python
from fastpynuts import NUTSfinder

# construct from local file
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson")

# retrieve data automatically (the file is downloaded to '.data', or read from there if it already exists)
nf = NUTSfinder.from_web(scale=1, year=2021, epsg=4326)


# find NUTS regions
point = (11.57, 48.13)
regions = nf.find(*point)                   # find all regions via a point

bbox = (11.57, 48.13, 11.62, 49.)           # lon_min, lat_min, lon_max, lat_max
regions = nf.find_bbox(*bbox)               # find all regions via a bbox

geom = {
    "type": "Polygon",
    "coordinates": [
        [
            [11.595733032762524, 48.11837184946995],
            [11.631858436052113, 48.14289890153063],
            [11.627498473585405, 48.16409081247133],
            [11.595733032762524, 48.11837184946995]
        ]
    ]
}
regions = nf.find_bbox()                    # find all regions via a GeoJSON geometry (supports shapely geometries and all objects that can be converted into one)


# filter for regions of specific levels
level3 = nf.filter_levels(regions, 3)
level2or3 = nf.filter_levels(regions, 2, 3)
```

#### Assessing the results
The NUTS regions are returned as a list of `NUTSregion` objects, ordered by NUTS level.
```python
>>> regions
[NUTS0: DE, NUTS1: DE2, NUTS2: DE21, NUTS3: DE212]
```
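The ordering mirrors the NUTS code hierarchy: a level-0 ID is the two-character country code, and each deeper level appends one character. The following sketch is plain Python and independent of `FastPyNUTS`; the helper name `nuts_ancestors` is made up for illustration. It derives the chain of parent IDs from a single NUTS ID.

```python
from typing import List


def nuts_ancestors(nuts_id: str) -> List[str]:
    """Return the IDs of all levels from NUTS0 down to the given ID."""
    if not 2 <= len(nuts_id) <= 5:
        raise ValueError(f"not a NUTS0-3 ID: {nuts_id!r}")
    # level 0 has 2 characters; every further level adds exactly one
    return [nuts_id[:n] for n in range(2, len(nuts_id) + 1)]


chain = nuts_ancestors("DE212")
for level, code in enumerate(chain):
    print(f"NUTS{level}: {code}")
```

This can be handy for cross-checking a query result: the IDs of the returned regions should form such a chain.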

Each region object holds information about
- its ID and NUTS level
```python
>>> region = regions[0]
>>> region.id
'DE'
>>> region.level
0
```
- its geometry (a `shapely` Polygon or MultiPolygon) and the corresponding bounding box
```python
>>> region.geom
<MULTIPOLYGON (((10.454 47.556, 10.44 47.525, 10.441 47.514, 10.432 47.504, ...>
>>> region.bbox
(5.867697, 47.270114, 15.04116, 55.058165)
```
- further fields from the NUTS dataset and the original input feature in GeoJSON format
```python
>>> region.properties
{
    "NUTS_ID": "DE",
    "LEVL_CODE": 0,
    "CNTR_CODE": "DE",
    "NAME_LATN": "Deutschland",
    "NUTS_NAME": "Deutschland",
    "MOUNT_TYPE": 0,
    "URBN_TYPE": 0,
    "COAST_TYPE": 0,
    "FID": "DE"
}
>>> region.feature
{
    'type': 'Feature',
    'geometry': {
        'type': 'MultiPolygon',
        'coordinates': [
            [
                [
                    [10.454439, 47.555797],
                    ...
                ]
            ]
        ],
    },
    'properties': {
        "NUTS_ID": "DE",
        ...
    }
}
```
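Because `region.feature` is plain GeoJSON, the same field names can be used to inspect a dataset file directly. A minimal sketch, assuming a FeatureCollection whose features carry the `NUTS_ID` and `LEVL_CODE` properties shown above; the helper `ids_by_level` and the tiny inline collection are illustrative and not part of `FastPyNUTS`.

```python
from collections import defaultdict


def ids_by_level(collection):
    """Group the NUTS IDs of a GeoJSON FeatureCollection by their level."""
    grouped = defaultdict(list)
    for feature in collection["features"]:
        props = feature["properties"]
        grouped[props["LEVL_CODE"]].append(props["NUTS_ID"])
    # sort levels and IDs so the result is deterministic
    return {level: sorted(ids) for level, ids in sorted(grouped.items())}


# illustrative stand-in for a loaded NUTS file (geometries omitted)
collection = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"NUTS_ID": "DE212", "LEVL_CODE": 3}},
        {"type": "Feature", "geometry": None,
         "properties": {"NUTS_ID": "DE", "LEVL_CODE": 0}},
        {"type": "Feature", "geometry": None,
         "properties": {"NUTS_ID": "DE21", "LEVL_CODE": 2}},
        {"type": "Feature", "geometry": None,
         "properties": {"NUTS_ID": "DE211", "LEVL_CODE": 3}},
    ],
}

levels = ids_by_level(collection)
print(levels)
```

For a real file, the collection would come from `json.load` on the GeoJSON path instead of the inline dictionary.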

## Advanced Usage
```python
# apply a buffer to the input regions to catch points on the boundary (for further info on the buffering, see the documentation)
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson", buffer_geoms=1e-5)

# only load certain levels of regions (here levels 2 and 3)
nf = NUTSfinder("PATH_TO_LOCAL_FILE.geojson", min_level=2, max_level=3)


# if the point to be queried is guaranteed to lie within a NUTS region, setting valid_point to True may speed up the runtime
regions = nf.find(*point, valid_point=True)
```


## Runtime Comparison
`FastPyNUTS` is optimized for query speed and result correctness, at the cost of a longer initialization time.

An R-tree-based approach proved to be the fastest option:
<table>
 <tr>
    <td> <img src="img/benchmark_1.png" alt="Benchmark for scale 1."> </td>
    <td> <img src="img/benchmark_20_zoom.png" alt="Benchmark for scale 20 (zoomed)."> </td>
  </tr>
</table>
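
The idea behind a spatial index is to discard most regions with a cheap bounding-box comparison and to run the exact point-in-polygon test only on the few remaining candidates. The sketch below illustrates that two-step filter in pure Python. It is not the package's implementation: a linear scan over bounding boxes stands in for the R-tree, polygons are simple rings without holes, and all function names and the two toy regions are assumptions made for this example.

```python
def bbox_of(ring):
    """Bounding box (xmin, ymin, xmax, ymax) of a ring of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_ring(x, y, ring):
    """Ray casting: toggle on every edge crossed by a ray going right from (x, y)."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):                      # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_regions(x, y, regions):
    """Return the names of all regions containing the point."""
    hits = []
    for name, ring in regions.items():
        xmin, ymin, xmax, ymax = bbox_of(ring)
        if xmin <= x <= xmax and ymin <= y <= ymax:   # cheap candidate filter
            if point_in_ring(x, y, ring):             # exact test on candidates only
                hits.append(name)
    return hits


regions = {
    "A": [(0, 0), (4, 0), (4, 4), (0, 4)],            # square
    "B": [(3, 3), (6, 3), (6, 6)],                    # triangle
}

print(find_regions(1, 1, regions))
print(find_regions(3.5, 3.2, regions))
print(find_regions(3.2, 3.5, regions))                # inside B's bbox, outside B
```

An R-tree replaces the linear scan over bounding boxes with a tree lookup, so the candidate filter no longer has to touch every region.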

Compared to other packages like [nuts-finder](https://github.com/nestauk/nuts_finder), a large performance boost can be achieved:

![](img/benchmark_other.png)

**Tips**:
- if interested only in certain levels (0-3) of the NUTS dataset, initialize the `NUTSfinder` using its `min_level` and `max_level` arguments
- if it's known beforehand that the queried point lies within the interior of a NUTS region, pass `valid_point=True` to `find`

For a full runtime analysis, see [benchmark.ipynb](benchmark.ipynb).
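
To get a rough per-query figure on your own machine, a small timing loop is often enough. The following is a minimal sketch and not the benchmark used for the plots above; `mean_query_time_ms` and `dummy_query` are hypothetical names, and the dummy stands in for a real lookup so that the snippet runs without any dataset.

```python
import time


def mean_query_time_ms(query, points, repeats=3):
    """Best-of-`repeats` mean wall-clock time per `query(lon, lat)` call, in ms."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for lon, lat in points:
            query(lon, lat)
        best = min(best, time.perf_counter() - start)
    return best / len(points) * 1e3


def dummy_query(lon, lat):
    """Placeholder for a real lookup; always reports no regions."""
    return []


# 1000 points along a short line of longitude offsets
points = [(11.57 + 0.001 * i, 48.13) for i in range(1000)]
ms = mean_query_time_ms(dummy_query, points)
print(f"{ms:.4f} ms/query")
```

With an initialized finder, `nf.find` can be passed in place of `dummy_query`, since it is called with a longitude and a latitude in the same way.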



## Contributors
- [Colin Moldenhauer](https://github.com/ColinMoldenhauer/)
- [meengel](https://github.com/meengel)

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "fastpynuts",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "eurostat, NUTS, nomenclature of territorial units for statistics",
    "author": null,
    "author_email": "Colin Moldenhauer <colin.moldenhauer@tum.de>, Michael Engel <m.engel@tum.de>",
    "download_url": "https://files.pythonhosted.org/packages/ba/53/fd6036195c76d1624080e86d06779761073d016263f4646c6a4f890d999c/fastpynuts-1.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A fast implementation of querying for NUTS regions by location.",
    "version": "1.1.0",
    "project_urls": {
        "Homepage": "https://github.com/ColinMoldenhauer/FastPyNUTS",
        "Issues": "https://github.com/ColinMoldenhauer/FastPyNUTS/issues"
    },
    "split_keywords": [
        "eurostat",
        " nuts",
        " nomenclature of territorial units for statistics"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ce6d2f24590dcce27061f603af4229608e022e8cf17c71e79bd46384221b4e71",
                "md5": "636e854e55357e85b2b74081a937e1e9",
                "sha256": "063bbb00dc4d73a17216486c3c37048f7ae9a6befef57e061905e02be564fd3a"
            },
            "downloads": -1,
            "filename": "fastpynuts-1.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "636e854e55357e85b2b74081a937e1e9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 25012,
            "upload_time": "2024-06-30T13:43:05",
            "upload_time_iso_8601": "2024-06-30T13:43:05.182405Z",
            "url": "https://files.pythonhosted.org/packages/ce/6d/2f24590dcce27061f603af4229608e022e8cf17c71e79bd46384221b4e71/fastpynuts-1.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ba53fd6036195c76d1624080e86d06779761073d016263f4646c6a4f890d999c",
                "md5": "0e9c522162093bbcae26ce74e12ebc26",
                "sha256": "71f543557d27b71e389d9838ea24c864419da7dd11e0c842b047b0312f30f675"
            },
            "downloads": -1,
            "filename": "fastpynuts-1.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "0e9c522162093bbcae26ce74e12ebc26",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 28200,
            "upload_time": "2024-06-30T13:43:07",
            "upload_time_iso_8601": "2024-06-30T13:43:07.048627Z",
            "url": "https://files.pythonhosted.org/packages/ba/53/fd6036195c76d1624080e86d06779761073d016263f4646c6a4f890d999c/fastpynuts-1.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-30 13:43:07",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ColinMoldenhauer",
    "github_project": "FastPyNUTS",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "numpy",
            "specs": []
        },
        {
            "name": "shapely",
            "specs": []
        },
        {
            "name": "rtree",
            "specs": []
        },
        {
            "name": "treelib",
            "specs": []
        }
    ],
    "lcname": "fastpynuts"
}
        