# GEOPIP: Geojson Point in Polygon (PIP)
[![CI](https://github.com/tammoippen/geopip/actions/workflows/CI.yml/badge.svg)](https://github.com/tammoippen/geopip/actions/workflows/CI.yml)
[![Coverage Status](https://coveralls.io/repos/github/tammoippen/geopip/badge.svg?branch=master)](https://coveralls.io/github/tammoippen/geopip?branch=master)
[![Tested CPython Versions](https://img.shields.io/badge/cpython-3.9%2C%203.10%2C%203.11%2C%203.12%2C%203.13-brightgreen.svg)](https://img.shields.io/badge/cpython-3.9%2C%203.10%2C%203.11%2C%203.12%2C%203.13-brightgreen.svg)
[![Tested PyPy Versions](https://img.shields.io/badge/pypy-3.9%2C%203.10-brightgreen.svg)](https://img.shields.io/badge/pypy-3.9%2C%203.10%2C%203.10-brightgreen.svg)
[![PyPi version](https://img.shields.io/pypi/v/geopip.svg)](https://pypi.python.org/pypi/geopip)
[![PyPi license](https://img.shields.io/pypi/l/geopip.svg)](https://pypi.python.org/pypi/geopip)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
Reverse geocode a lng/lat coordinate within a geojson `FeatureCollection` and return information about the containing country (polygon).
You can use any [geojson](https://tools.ietf.org/html/rfc7946) file whose top level is a `FeatureCollection` for reverse geocoding: set the environment variable `REVERSE_GEOCODE_DATA` to the path of the geojson file. Only `Polygon` and `MultiPolygon` features will be used! If a point is found to be in a feature, the `properties` of that feature will be returned.
In other words, provide a geojson with postcode boundaries, and you can query for the postcode in which a coordinate is. Provide timezone boundaries and you can find the timezone for a coordinate. Be creative :).
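As a minimal sketch of the setup described above, the following writes a tiny `FeatureCollection` to a temporary file and points `REVERSE_GEOCODE_DATA` at it. The feature, its `postcode` value and the file name `zones.geo.json` are made up for illustration, and the sketch assumes the variable is set before geopip is first used in the process.

```python
import json
import os
import tempfile

# Hypothetical one-feature collection: a unit square with a made-up postcode.
collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"postcode": "00000"},
            "geometry": {
                "type": "Polygon",
                # One exterior ring in (lng, lat) order, closed (first == last).
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
            },
        }
    ],
}

# Write the collection to a temporary geojson file.
path = os.path.join(tempfile.mkdtemp(), "zones.geo.json")
with open(path, "w") as f:
    json.dump(collection, f)

# Assumption: geopip reads this variable when it first loads its data,
# so it is set before any geopip call.
os.environ["REVERSE_GEOCODE_DATA"] = path

# With geopip installed, a lookup could then look like this (not executed here):
#   import geopip
#   geopip.search(lng=0.5, lat=0.5)
```

According to the description above, a successful lookup inside the square should return the `properties` dictionary of that feature.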
The default shape data (contained within the package) is from [thematicmapping](http://thematicmapping.org/downloads/world_borders.php) (the simple shapes). Each feature represents one country and carries the following metadata (`properties`):
```
FIPS String(2) FIPS 10-4 Country Code
ISO2 String(2) ISO 3166-1 Alpha-2 Country Code
ISO3 String(3) ISO 3166-1 Alpha-3 Country Code
UN Short Integer(3) ISO 3166-1 Numeric-3 Country Code
NAME String(50) Name of country/area
AREA Long Integer(7) Land area, FAO Statistics (2002)
POP2005 Double(10,0) Population, World Population Prospects (2005)
REGION Short Integer(3) Macro geographical (continental region), UN Statistics
SUBREGION Short Integer(3) Geographical sub-region, UN Statistics
LON FLOAT (7,3) Longitude
LAT FLOAT (6,3) Latitude
```
Hence, you can use this package as an *offline reverse geocoder on the country level* (by default):
```python
In [1]: import geopip
In [2]: geopip.search(lng=4.910248, lat=50.850981)
Out[2]:
{'AREA': 0,
'FIPS': 'BE',
'ISO2': 'BE',
'ISO3': 'BEL',
'LAT': 50.643,
'LON': 4.664,
'NAME': 'Belgium',
'POP2005': 10398049,
'REGION': 150,
'SUBREGION': 155,
'UN': 56}
```
**NOTE**: Since the polygons for each country are quite simple, reverse geocoding at the border between two countries is **not** exact. Use polygons with higher resolution for these use cases (see [Data](#data)).
The `shapely` package will be used if it is installed. Otherwise, a pure Python implementation will be used (on the basis of [winding numbers](https://en.wikipedia.org/wiki/Winding_number)). See [here](https://www.toptal.com/python/computational-geometry-in-python-from-theory-to-implementation), [here](http://geomalgorithms.com/a03-_inclusion.html) and [here](http://www.dgp.toronto.edu/~mac/e-stuff/point_in_polygon.py) for more information and example implementations. Especially for larger features, the shapely implementation might give performance improvements (measured with the default shape data on a 2.6 GHz Intel Core i7, Python 3.6.2, and the cythonized version of [geohash-hilbert](https://github.com/tammoippen/geohash-hilbert)):
*Pure*:
```python
In [1]: import geopip
In [2]: geopip._geopip.p_in_polygon?
Signature: geopip._geopip.p_in_polygon(p, shp)
Docstring:
Test, whether point `p` is in shape `shp`.
Use the pure python implementation for this.
Parameters:
p: Tuple[float, float] Point (lng, lat) in WGS84.
shp: Dict[str, Any] Prepared shape dictionary from `geopip._pure.prepare()`.
Returns:
boolean: True, if p in shp, False otherwise
File: ~/repositories/geopip/geopip/_pure.py
Type: function
In [3]: %timeit geopip.search(4.910248, 50.850981)
25.6 µs ± 390 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
*Shapely*:
```python
In [1]: import geopip
In [2]: geopip._geopip.p_in_polygon?
Signature: geopip._geopip.p_in_polygon(p, shp)
Docstring:
Test, whether point `p` is in shape `shp`.
Use the shapely implementation for this.
Parameters:
p: Tuple[float, float] Point (lng, lat) in WGS84.
shp: Dict[str, Any] Prepared shape dictionary from `geopip._shapely.prepare()`.
Returns:
boolean: True, if p in shp, False otherwise
File: ~/repositories/geopip/geopip/_shapely.py
Type: function
In [3]: %timeit geopip.search(4.910248, 50.850981)
50 µs ± 601 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
For simple geojsons, the pure python implementation is faster, but on more complex polygons, the shapely implementation will win.
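To illustrate the winding-number idea mentioned above, here is a minimal pure Python sketch. It is not geopip's actual implementation: the function names `is_left`, `winding_number` and `point_in_polygon` are chosen for this example, and the polygon is assumed to be a list of closed rings in GeoJSON style (exterior ring first, then holes).

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Ring = Sequence[Sequence[float]]


def is_left(a: Sequence[float], b: Sequence[float], p: Sequence[float]) -> float:
    """> 0 if p is left of the directed line a->b, < 0 if right, 0 if on it."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1])


def winding_number(p: Point, ring: Ring) -> int:
    """Count how often a closed ring (first == last vertex) winds around p."""
    wn = 0
    for a, b in zip(ring, ring[1:]):
        if a[1] <= p[1]:
            # Upward crossing with p strictly left of the edge.
            if b[1] > p[1] and is_left(a, b, p) > 0:
                wn += 1
        elif b[1] <= p[1] and is_left(a, b, p) < 0:
            # Downward crossing with p strictly right of the edge.
            wn -= 1
    return wn


def point_in_polygon(p: Point, polygon: List[Ring]) -> bool:
    """True if p is inside the exterior ring and not inside any hole."""
    exterior, *holes = polygon
    if winding_number(p, exterior) == 0:
        return False
    return all(winding_number(p, hole) == 0 for hole in holes)


# A 10x10 square with a 2x2 hole in the middle, coordinates as (lng, lat).
square_with_hole = [
    [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],
    [[4, 4], [4, 6], [6, 6], [6, 4], [4, 4]],
]
print(point_in_polygon((2, 2), square_with_hole))   # inside the square
print(point_in_polygon((5, 5), square_with_hole))   # inside the hole
print(point_in_polygon((11, 5), square_with_hole))  # outside the square
```

Points that lie exactly on an edge are not treated specially in this sketch, which fits the note above that results near borders are not exact.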
## Install
```sh
pip install geopip
```
If you need extra speed because you have many and / or very detailed polygons, try installing geohash-hilbert with its Cython extension and / or (vectorized) shapely.
```sh
# make sure to have GEOS library installed (including dev extensions)
pip install numpy 'shapely[vectorized]>=1.6'
pip install cython # for building geohash-hilbert's cython extension
pip install --upgrade geohash-hilbert
```
## Data
Other interesting shape data can be found at:
- http://www.naturalearthdata.com/downloads/ : Different thematic shape files at 10m, 50m and 110m resolution.
- http://www.gadm.org/version2 : Administrative area 0 or 1 contain countries or states, respectively. Pay attention to the license!
- https://www2.census.gov/geo/tiger/: Various shape/gdb files and information for USA.
- http://guides.library.upenn.edu/c.php?g=475518&p=3254770: Links to various geoinformation data.
- http://thematicmapping.org/downloads/world_borders.php: Country borders and some interesting information. The default file is from here. There is also a higher resolution version.
- https://github.com/evansiroky/timezone-boundary-builder: Time zone boundaries. See releases for downloads.
- https://www.suche-postleitzahl.org/plz-karte-erstellen: DE postal codes + size + population (Census / OSM).
- https://www2.census.gov/geo/tiger/TIGER2010DP1/ZCTA_2010Census_DP1.zip: US postal codes + size + population (Census; for field definitions see `DP_TableDescriptions.xls` in the zip).
- https://github.com/berlinermorgenpost/Berlin-Geodaten: Geo shapes of Berlin, DE.
- https://github.com/gregoiredavid/france-geojson: Geojson of regions, arrondissements, ... France.
- https://data.opendatasoft.com/explore/dataset/arrondissements@parisdata/: Geojson of arrondissements of Paris, FR.
- https://data.opendatasoft.com/pages/home/: Lots of different data, some have geojson, see above.
**NOTE**: shapefiles / gdb databases have to be transformed into geojson. One way is to use [fiona](https://github.com/Toblerity/Fiona). Assuming the gdb files are in the directory `/data/gdb`:
```python
fio insp /data/gdb
# a python shell opens with the dataset available as `src`
>>> import json
>>> features = list(src)
>>> with open('/data/gdb.geo.json', 'w') as f:
...     json.dump(dict(type='FeatureCollection', features=features), f)
...
This writes the features of the `gdb` to the geojson file `/data/gdb.geo.json`.
# Documentation
(*TODO* more)
Basically, there are the two functions `geopip.search` and `geopip.search_all` that perform the search in the provided `FeatureCollection`. Then there is the class `geopip.GeoPIP` that accepts a `FeatureCollection` either as a file or a dictionary and provides the same search functionality:
## `search`
```python
In [1]: import geopip
In [2]: geopip.search?
Signature: geopip.search(lng, lat)
Docstring:
Reverse geocode lng/lat coordinate within the features from `instance().shapes`.
Look within the features from the `instance().shapes` function for a polygon that
contains the point (lng, lat). From the first found feature the `properties`
will be returned. `None`, if no feature contains the point.
Parameters:
lng: float Longitude (-180, 180) of point. (WGS84)
lat: float Latitude (-90, 90) of point. (WGS84)
Returns:
Dict[Any, Any] `Properties` of found feature. `None` if nothing is found.
File: ~/repositories/geopip/geopip/__init__.py
Type: function
```
## `search_all`
```python
In [1]: import geopip
In [2]: geopip.search_all?
Signature: geopip.search_all(lng, lat)
Docstring:
Reverse geocode lng/lat coordinate within the features from `instance().shapes`.
Look within the features from the `instance().shapes` function for all polygons that
contain the point (lng, lat). From all found features the `properties`
will be returned (more or less sorted from smallest to largest feature).
`None`, if no feature contains the point.
Parameters:
lng: float Longitude (-180, 180) of point. (WGS84)
lat: float Latitude (-90, 90) of point. (WGS84)
Returns:
Iterator[Dict[Any, Any]] Iterator for `properties` of found features.
File: ~/repositories/geopip/geopip/__init__.py
Type: function
```
## `GeoPIP`
```python
In [1]: import geopip
In [2]: geopip.GeoPIP?
Init signature: geopip.GeoPIP(self, *args, **kwargs)
Docstring:
GeoPIP: Geojson Point in Polygon (PIP)
Reverse geocode a lng/lat coordinate within a geojson `FeatureCollection` and
return information about the containing polygon.
Init docstring:
Provide the geojson either as a file (`filename`) or as a geojson
dict (`geojson_dict`). If neither is given, it tries to load the
file pointed to in the environment variable `REVERSE_GEOCODE_DATA`. If the
variable is not set, a default geojson will be loaded (packaged):
http://thematicmapping.org/downloads/world_borders.php
During init, the geojson will be prepared (see pure / shapely implementation)
and indexed with geohashes.
Provide the parameters as kwargs!
Allowed parameters:
filename: str Path to a geojson file.
geojson_dict: Dict[str, Any] Geojson dictionary. `FeatureCollection` required!
File: ~/repositories/geopip/geopip/_geopip.py
Type: type
```
A `GeoPIP` object provides the same `search` and `search_all` functions.
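A short sketch of how a private `GeoPIP` index might be used with an in-memory `FeatureCollection`, based on the `geojson_dict` parameter from the init docstring above. The helper names `square`, `COLLECTION` and `lookup` are hypothetical, and the sketch assumes the methods take `lng` and `lat` positionally like the module-level functions. geopip is imported lazily inside `lookup`, so the data can be built without the package installed.

```python
from typing import Any, Dict, List, Optional, Tuple


def square(lo: float, hi: float, name: str) -> Dict[str, Any]:
    """Build a square Polygon feature spanning lo..hi in both lng and lat."""
    ring = [[lo, lo], [hi, lo], [hi, hi], [lo, hi], [lo, lo]]
    return {
        "type": "Feature",
        "properties": {"name": name},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Hypothetical data: a small square nested inside a larger one.
COLLECTION: Dict[str, Any] = {
    "type": "FeatureCollection",
    "features": [square(0, 10, "outer"), square(4, 6, "inner")],
}


def lookup(lng: float, lat: float) -> Tuple[Optional[dict], List[dict]]:
    """Return (first match, all matches) for a point in COLLECTION."""
    import geopip  # requires `pip install geopip`

    # Parameters must be passed as keyword arguments (see init docstring).
    index = geopip.GeoPIP(geojson_dict=COLLECTION)
    first = index.search(lng, lat)
    # Guard against a `None` result, which the docstring mentions for no match.
    every = list(index.search_all(lng, lat) or [])
    return first, every


# Example call (needs geopip installed):
#   first, every = lookup(5.0, 5.0)
```

For a point such as `(5.0, 5.0)`, which lies in both squares, `search_all` would be expected to yield the properties of both features, while `search` returns only those of the first match.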