[![codecov](https://codecov.io/gh/wrightky/ipdw/branch/main/graph/badge.svg?token=KQTT88DN6E)](https://codecov.io/gh/wrightky/ipdw)
[![PyPi](https://img.shields.io/pypi/v/ipdw)](https://pypi.org/project/ipdw/)
[![Python Version](https://img.shields.io/pypi/pyversions/ipdw)](https://pypi.org/project/ipdw/)
[![Downloads](https://img.shields.io/pypi/dm/ipdw)](https://pypi.org/project/ipdw)
# Inverse Path-Distance Weighted Interpolation
`ipdw` is a Python package for performing [inverse-distance-weighted interpolation](https://en.wikipedia.org/wiki/Inverse_distance_weighting) using **path distances** instead of Euclidean distances. In other words, interpolation weights account for domain geometry and obstacles, which is often preferred in geophysical contexts in which "in water" distances are often more relevant to the process at hand than Euclidean distances. For example, this package allows you to quickly and easily interpolate inside a water body, accounting for complex coastlines, islands, and other barriers.
This package is inspired by [Stachelek & Madden (2015)](doi.org/10.1080/13658816.2015.1018833) and is a Python alternative to the original `R` package [`ipdw`](https://github.com/jsta/ipdw), although likely using different methodology under the hood. We rely heavily on the fast-marching approach of [`scikit-fmm`](https://github.com/scikit-fmm/scikit-fmm) to compute path distances, and rely exclusively on `matplotlib.path` and `numpy` to make sense of domain geometry information, which can be specified in either raster or vector format. _No geospatial packages are required._ Due to the use of fast-marching, vectorization, and a reliance on more efficient packages for the compututationally-expensive steps, this package should be quite fast and computationally efficient relative to the complexity of the underlying algorithm.
## Installation
Install the latest stable release from [PyPi](https://pypi.org/project/ipdw/) with pip:
```
pip install ipdw
```
If you want to install from source, clone this repository, navigate to the new directory, and run `python setup.py`.
**Requirements**: `numpy`, `matplotlib`, and `scikit-fmm`
## Background
The IPDW estimate for the value $V$ of at some location $(x,y)$ can be written:
$$V(x,y) = \frac{ \sum_{i} V_i \cdot d_i^{-P} }{ \sum_{i} d_i^{-P}}$$
for nearest locations $i = \[1:N\]$, and in which $V_i$ is the input value at location $i$, $d_i$ is the path distance to that location, $P$ is some positive exponent (often 1 or 2), and $N$ is the number of nearest locations to use in the weighting. The weight of each input point decreases for increasing distance away from that point, at a rate of $1/d^P$. If $N=1$, this is equivalent to nearest-neighbor interpolation.
## Example Usage
To use this package for interpolation, the user simply needs to provide domain geometry information and the locations/values of input data points. Domain geometry can be specified in one of two ways (i.e. vector or raster format). Regardless of input format, the output will be a raster array.
To specify domain geometry in **vector format**, the user needs to specify:
- The target `cellsize` of the output array.
- The domain `boundary`, specified as a `list` or `array` of $(x,y)$ points for each vertex along the boundary polygon, e.g. `[[x1,y1],[x2,y2],...,[xn,yn]]`. The length-scale used when specifying these coordinates should be equal to that used for the `cellsize`. If specified as an `array`, dimensions should be `(N,2)`.
- Optionally, any internal `holes` inside the domain (i.e. obstacles), specified as a `list` of holes, where each hole has the same input format requirements as `boundary`.
Then, an example `Gridded` object can be instantiated as follows:
```
import ipdw
cellsize = X
boundary = [[x1,y1],[x2,y2],...,[xn,yn]]
holes = [hole_1, hole_2, ..., hole_n]
grid = ipdw.Gridded(cellsize, boundary, holes)
```
Under the hood, this process creates a binary array of whatever size is needed to span the `boundary` coordinates at the requested `cellsize`, and then relies heavily on `matplotlib.path.Path` to query whether cell locations within that array are (1) inside the domain boundary and (2) not inside one of the specified holes. If a large number of holes are specified, this process may take a while. After the `Gridded` object has been initialized, the basemap used for interpolation is stored in the `grid.raster` attribute.
To specify domain geometry in **raster format**, the user needs to specify:
- A binary `raster` to be used as the basemap for interpolation (in other words, if this raster already exists, we do not need to create it). This raster must be binary (i.e. composed of 1's and 0's), and it will be converted to an integer array during initialization, so any `NaNs` should be set to 0.
- The geospatial `extent` of that `raster` (i.e. the footprint), specified as `[xmin, xmax, ymin, ymax]`.
Then, an example `Gridded` object can be instantiated as follows (using `rasterio` as an example way in which one might access a basemap):
```
import ipdw
import rasterio
src = rasterio.open('path_to_file')
raster = src.read(1).astype(int)
extent = [src.bounds[0], src.bounds[2], src.bounds[1], src.bounds[3]]
grid = ipdw.Gridded(raster=raster, extent=extent)
```
Note that the combination of **either** (`cellsize`, `boundary`) **or** (`raster`, `extent`) is required to create a `Gridded` object, they cannot be mixed/matched. Also, if the former pair is specified, the latter will be ignored.
Regardless of which of the two options above was used, performing an interpolation looks the same. The user needs to specify:
- `input_locations` and `input_values` for the data between which we are interpolating
- Interpolation settings, most notably `n_nearest` (the number of nearest input data points used in the weighting) and `power` (the exponent $P$ above), as well as a few other optional parameters described in the function documentation.
Then, it is as simple as:
```
input_locations = [[x1,y1],[x2,y2],...,[xn,yn]]
input_values = [V1, V2, ..., Vn]
output = grid.interpolate(input_locations, input_values, n_nearest=3, power=1)
```
in which `output` is an array with interpolated values in all viable cells, and `NaN` everywhere else. That's it!
Also, note that there exists another function, `Gridded.reinterpolate`, which can be used to perform additional interpolations using new `input_values` from the same locations (and with the same parameter settings), which is faster than re-running `interpolate`.
A few example interpolation outputs using different parameter choices are shown below.
<div class="nav3" style="width:800px;">
<img src="https://raw.githubusercontent.com/wrightky/ipdw/main/example.jpeg" alt="Image" width="100%"></a>
</div>
## Contributing
Please feel free to open an issue or a pull request!
Raw data
{
"_id": null,
"home_page": "https://github.com/wrightky/ipdw",
"name": "ipdw",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "interpolation ipdw idw",
"author": "Kyle Wright",
"author_email": "Kyle.Wright@twdb.texas.gov",
"download_url": "https://files.pythonhosted.org/packages/f8/ac/2c5f557ed91656a0c9244df95f4b586d6d107574ebbbc8d30c640b3caafe/ipdw-1.0.1.tar.gz",
"platform": null,
"description": "[![codecov](https://codecov.io/gh/wrightky/ipdw/branch/main/graph/badge.svg?token=KQTT88DN6E)](https://codecov.io/gh/wrightky/ipdw)\n[![PyPi](https://img.shields.io/pypi/v/ipdw)](https://pypi.org/project/ipdw/)\n[![Python Version](https://img.shields.io/pypi/pyversions/ipdw)](https://pypi.org/project/ipdw/)\n[![Downloads](https://img.shields.io/pypi/dm/ipdw)](https://pypi.org/project/ipdw)\n# Inverse Path-Distance Weighted Interpolation\n\n`ipdw` is a Python package for performing [inverse-distance-weighted interpolation](https://en.wikipedia.org/wiki/Inverse_distance_weighting) using **path distances** instead of Euclidean distances. In other words, interpolation weights account for domain geometry and obstacles, which is often preferred in geophysical contexts in which \"in water\" distances are often more relevant to the process at hand than Euclidean distances. For example, this package allows you to quickly and easily interpolate inside a water body, accounting for complex coastlines, islands, and other barriers.\n\nThis package is inspired by [Stachelek & Madden (2015)](doi.org/10.1080/13658816.2015.1018833) and is a Python alternative to the original `R` package [`ipdw`](https://github.com/jsta/ipdw), although likely using different methodology under the hood. We rely heavily on the fast-marching approach of [`scikit-fmm`](https://github.com/scikit-fmm/scikit-fmm) to compute path distances, and rely exclusively on `matplotlib.path` and `numpy` to make sense of domain geometry information, which can be specified in either raster or vector format. _No geospatial packages are required._ Due to the use of fast-marching, vectorization, and a reliance on more efficient packages for the compututationally-expensive steps, this package should be quite fast and computationally efficient relative to the complexity of the underlying algorithm.\n\n## Installation\n\nInstall the latest stable release from [PyPi](https://pypi.org/project/ipdw/) with pip:\n```\npip install ipdw\n```\nIf you want to install from source, clone this repository, navigate to the new directory, and run `python setup.py`.\n\n**Requirements**: `numpy`, `matplotlib`, and `scikit-fmm`\n\n## Background\n\nThe IPDW estimate for the value $V$ of at some location $(x,y)$ can be written:\n$$V(x,y) = \\frac{ \\sum_{i} V_i \\cdot d_i^{-P} }{ \\sum_{i} d_i^{-P}}$$\nfor nearest locations $i = \\[1:N\\]$, and in which $V_i$ is the input value at location $i$, $d_i$ is the path distance to that location, $P$ is some positive exponent (often 1 or 2), and $N$ is the number of nearest locations to use in the weighting. The weight of each input point decreases for increasing distance away from that point, at a rate of $1/d^P$. If $N=1$, this is equivalent to nearest-neighbor interpolation.\n\n## Example Usage\n\nTo use this package for interpolation, the user simply needs to provide domain geometry information and the locations/values of input data points. Domain geometry can be specified in one of two ways (i.e. vector or raster format). Regardless of input format, the output will be a raster array.\n\nTo specify domain geometry in **vector format**, the user needs to specify:\n- The target `cellsize` of the output array.\n- The domain `boundary`, specified as a `list` or `array` of $(x,y)$ points for each vertex along the boundary polygon, e.g. `[[x1,y1],[x2,y2],...,[xn,yn]]`. The length-scale used when specifying these coordinates should be equal to that used for the `cellsize`. If specified as an `array`, dimensions should be `(N,2)`.\n- Optionally, any internal `holes` inside the domain (i.e. obstacles), specified as a `list` of holes, where each hole has the same input format requirements as `boundary`.\n\nThen, an example `Gridded` object can be instantiated as follows:\n```\nimport ipdw\n\ncellsize = X\nboundary = [[x1,y1],[x2,y2],...,[xn,yn]]\nholes = [hole_1, hole_2, ..., hole_n]\n\ngrid = ipdw.Gridded(cellsize, boundary, holes)\n```\nUnder the hood, this process creates a binary array of whatever size is needed to span the `boundary` coordinates at the requested `cellsize`, and then relies heavily on `matplotlib.path.Path` to query whether cell locations within that array are (1) inside the domain boundary and (2) not inside one of the specified holes. If a large number of holes are specified, this process may take a while. After the `Gridded` object has been initialized, the basemap used for interpolation is stored in the `grid.raster` attribute.\n\nTo specify domain geometry in **raster format**, the user needs to specify:\n- A binary `raster` to be used as the basemap for interpolation (in other words, if this raster already exists, we do not need to create it). This raster must be binary (i.e. composed of 1's and 0's), and it will be converted to an integer array during initialization, so any `NaNs` should be set to 0. \n- The geospatial `extent` of that `raster` (i.e. the footprint), specified as `[xmin, xmax, ymin, ymax]`.\n\nThen, an example `Gridded` object can be instantiated as follows (using `rasterio` as an example way in which one might access a basemap):\n```\nimport ipdw\nimport rasterio\n\nsrc = rasterio.open('path_to_file')\nraster = src.read(1).astype(int)\nextent = [src.bounds[0], src.bounds[2], src.bounds[1], src.bounds[3]]\n\ngrid = ipdw.Gridded(raster=raster, extent=extent)\n```\nNote that the combination of **either** (`cellsize`, `boundary`) **or** (`raster`, `extent`) is required to create a `Gridded` object, they cannot be mixed/matched. Also, if the former pair is specified, the latter will be ignored.\n\nRegardless of which of the two options above was used, performing an interpolation looks the same. The user needs to specify:\n- `input_locations` and `input_values` for the data between which we are interpolating\n- Interpolation settings, most notably `n_nearest` (the number of nearest input data points used in the weighting) and `power` (the exponent $P$ above), as well as a few other optional parameters described in the function documentation.\n\nThen, it is as simple as:\n```\ninput_locations = [[x1,y1],[x2,y2],...,[xn,yn]]\ninput_values = [V1, V2, ..., Vn]\n\noutput = grid.interpolate(input_locations, input_values, n_nearest=3, power=1)\n```\nin which `output` is an array with interpolated values in all viable cells, and `NaN` everywhere else. That's it! \nAlso, note that there exists another function, `Gridded.reinterpolate`, which can be used to perform additional interpolations using new `input_values` from the same locations (and with the same parameter settings), which is faster than re-running `interpolate`.\n\nA few example interpolation outputs using different parameter choices are shown below.\n\n<div class=\"nav3\" style=\"width:800px;\">\n <img src=\"https://raw.githubusercontent.com/wrightky/ipdw/main/example.jpeg\" alt=\"Image\" width=\"100%\"></a>\n</div>\n\n## Contributing\n\nPlease feel free to open an issue or a pull request!\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Inverse-Path-Distance-Weighted Interpolation for Python",
"version": "1.0.1",
"project_urls": {
"Homepage": "https://github.com/wrightky/ipdw"
},
"split_keywords": [
"interpolation",
"ipdw",
"idw"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "047cfe8c8525e006bae1cca92d3f9f1f27b9132472d582f347c8d5b3e41fb82d",
"md5": "8db984b0b917ad444e4a3f7ff09a808b",
"sha256": "9fcc98988b7b179e8e99e8e69c1bd5151db976ad7e3d335c7717fc7a811e1f99"
},
"downloads": -1,
"filename": "ipdw-1.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "8db984b0b917ad444e4a3f7ff09a808b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 9763,
"upload_time": "2023-06-05T22:46:30",
"upload_time_iso_8601": "2023-06-05T22:46:30.104396Z",
"url": "https://files.pythonhosted.org/packages/04/7c/fe8c8525e006bae1cca92d3f9f1f27b9132472d582f347c8d5b3e41fb82d/ipdw-1.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f8ac2c5f557ed91656a0c9244df95f4b586d6d107574ebbbc8d30c640b3caafe",
"md5": "273590435603b96518cbd399640c3f83",
"sha256": "224e8321e0e22807543e920e43e00f5aa073edf701f5ad00d3e93ad394df3282"
},
"downloads": -1,
"filename": "ipdw-1.0.1.tar.gz",
"has_sig": false,
"md5_digest": "273590435603b96518cbd399640c3f83",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 12879,
"upload_time": "2023-06-05T22:46:31",
"upload_time_iso_8601": "2023-06-05T22:46:31.596747Z",
"url": "https://files.pythonhosted.org/packages/f8/ac/2c5f557ed91656a0c9244df95f4b586d6d107574ebbbc8d30c640b3caafe/ipdw-1.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-06-05 22:46:31",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "wrightky",
"github_project": "ipdw",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "ipdw"
}