geowrangler


| Field | Value |
| --- | --- |
| Name | geowrangler |
| Version | 0.5.1 |
| home_page | https://github.com/thinkingmachines/geowrangler |
| Summary | 🌏 A python package for wrangling geospatial datasets |
| upload_time | 2024-09-10 05:28:55 |
| maintainer | None |
| docs_url | None |
| author | Thinking Machines Data Sciences Inc. |
| requires_python | >=3.8 |
| license | MIT License |
| keywords | nbdev, jupyter, notebook, python |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |

# Geowrangler


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

<img src="https://raw.githubusercontent.com/thinkingmachines/geowrangler/master/images/Geowrangler.svg" alt="Geowrangler logo" style="max-width: 245px;" />

### Overview

[![License:MIT](https://img.shields.io/github/license/thinkingmachines/geowrangler?style=flat-square.png)](https://github.com/thinkingmachines/geowrangler/blob/master/LICENSE)
[![Versions](https://img.shields.io/pypi/pyversions/geowrangler.svg?style=flat-square)](https://pypi.org/project/geowrangler/)
[![Docs](https://img.shields.io/badge/docs-passing-green?style=flat-square.png)](https://geowrangler.thinkingmachin.es)

**Geowrangler** is a Python package for geodata wrangling. It helps you
build data transformation workflows that have no out-of-the-box
solutions from other geospatial libraries.

We surveyed our past geospatial projects to extract these solutions and
hope they will be useful to others as well.

Our audience is researchers, analysts, and engineers delivering
geospatial projects.

We [welcome your comments, suggestions, bug reports, and code
contributions](https://github.com/thinkingmachines/geowrangler/discussions)
to make **Geowrangler** better.

[![](https://raw.githubusercontent.com/thinkingmachines/geowrangler/master/images/github.svg "View on github button")](https://github.com/thinkingmachines/geowrangler)

### Context

**Geowrangler** was born out of our efforts to reduce the amount of
boilerplate code in wrangling geospatial data. It builds on top of
existing geospatial libraries such as geopandas, rasterio, rasterstats,
morecantile, and others. Our goals are centered on the following tasks:

- Extracting area of interest zonal statistics from vector and raster
  data
- Gridding areas of interest
- Validating geospatial datasets
- Downloading publicly available geospatial datasets (e.g., OSM,
  Ookla, and Nightlights)
- Other geospatial vector and raster data processing tasks
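
To make the first task concrete, here is a minimal, library-independent sketch of what "zonal statistics" means: values attached to points are aggregated per area of interest. This is not Geowrangler's implementation. The function name `zonal_stats`, the rectangular zones, and the sample data are all assumptions made for illustration; real workflows would use actual geometries and a spatial join.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)
Point = Tuple[float, float, float]  # (x, y, value)


def zonal_stats(zones: Dict[str, Box], points: List[Point]) -> Dict[str, dict]:
    """Aggregate point values per rectangular zone (count and mean)."""
    results: Dict[str, dict] = {}
    for name, (minx, miny, maxx, maxy) in zones.items():
        # Keep the values of points that fall inside this zone.
        values = [v for x, y, v in points if minx <= x < maxx and miny <= y < maxy]
        zone_mean: Optional[float] = mean(values) if values else None
        results[name] = {"count": len(values), "mean": zone_mean}
    return results


# Hypothetical areas of interest and point observations.
zones = {"north": (0, 5, 10, 10), "south": (0, 0, 10, 5)}
points = [(1, 1, 10.0), (2, 8, 20.0), (9, 9, 40.0)]
stats = zonal_stats(zones, points)
print(stats)
```

Each zone ends up with one row of summary values, which is the shape of output a zonal statistics step is expected to produce.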

To make it easy to document, maintain, and extend the package, we opted
to maintain the source code, tests, and documentation in Jupyter
notebooks. We use [nbdev](https://nbdev.fast.ai) to generate the Python
package and documentation from the notebooks. See this
[document](https://github.com/thinkingmachines/geowrangler/blob/master/DEVELOPMENT.md)
to learn more about our development workflow.

By doing this, we hope to make it easy for geospatial analysts,
scientists, and engineers to learn, explore, and extend this package for
their geospatial processing needs.

Aside from providing reference documentation for each module, we have
included extensive tutorials and use case examples to make the package
easy to learn and use.

### Modules

- Grid Tile Generation
- Geometry Validation
- Vector Zonal Stats
- Raster Zonal Stats
- Area Zonal Stats
- Distance Zonal Stats
- Vector to Raster Mask
- Raster to Dataframe
- Raster Processing
- Demographic and Health Survey (DHS) Processing Utils
- Geofabrik (OSM) Data Download
- Ookla Data Download
- Night Lights
- Dataset Utils
- Tile Clustering
- Spatial Join Highest Intersection

*Check [this page for more details about our
Roadmap](https://github.com/orgs/thinkingmachines/projects/17).*

### Installation

    pip install geowrangler
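
After installing, it can be useful to confirm which version ended up in the active environment. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8, the minimum version this package requires). The helper name `installed_version` is an assumption for illustration, not part of Geowrangler.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("geowrangler")
print(found if found else "geowrangler is not installed in this environment")
```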

### Exploring the Documentation

We develop the package modules alongside their documentation. Each page
comes with an *Open in Colab* button that will open the Jupyter notebook
in Colab for exploration (including this page).

Click on the *Open in Colab* button below to open this page as a Google
Colab notebook.

[![](https://colab.research.google.com/assets/colab-badge.svg "Open in Colab button")](https://colab.research.google.com/github/thinkingmachines/geowrangler/blob/master/notebooks/index.ipynb)

``` python
# view the source of a grid component (run in IPython or Jupyter)
import geowrangler.grids

# cell_size is in meters, per the constructor signature shown below
grid = geowrangler.grids.SquareGridGenerator(5000)
grid??
```

    Type:        SquareGridGenerator
    String form: <geowrangler.grids.SquareGridGenerator object>
    File:        ~/work/unicef-ai4d/geowrangler-1/geowrangler/grids.py
    Source:     
    class SquareGridGenerator:
        def __init__(
            self,
            cell_size: float,  # height and width of a square cell in meters
            grid_projection: str = "EPSG:3857",  # projection of grid output
            boundary: Union[SquareGridBoundary, List[float]] = None,  # original boundary
        ):
            self.cell_size = cell_size
            self.grid_projection = grid_projection
            self.boundary = boundary
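
To illustrate what the `cell_size` parameter controls, here is a minimal, library-independent sketch that tiles a bounding box with square cells. It is not Geowrangler's implementation. The function name `square_cells` and the sample bounds are assumptions, and coordinates are taken to be in a projected system measured in meters (as with the default `EPSG:3857`).

```python
import math
from typing import List, Tuple

Bounds = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def square_cells(bounds: Bounds, cell_size: float) -> List[Bounds]:
    """Cover `bounds` with square cells whose sides are `cell_size` long."""
    minx, miny, maxx, maxy = bounds
    # Round up so that cells on the far edges still cover the whole extent.
    n_cols = math.ceil((maxx - minx) / cell_size)
    n_rows = math.ceil((maxy - miny) / cell_size)
    cells = []
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = minx + col * cell_size
            y0 = miny + row * cell_size
            cells.append((x0, y0, x0 + cell_size, y0 + cell_size))
    return cells


# A 2.5 km by 1 km extent tiled with 1 km cells.
cells = square_cells((0.0, 0.0, 2500.0, 1000.0), cell_size=1000.0)
print(len(cells))
```

Halving `cell_size` roughly quadruples the number of cells, which is worth keeping in mind when gridding large areas of interest.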

#### Tutorials

- [Grids
  Generation](https://geowrangler.thinkingmachin.es/tutorial.grids.html)
- [Geometry
  Validation](https://geowrangler.thinkingmachin.es/tutorial.geometry_validation.html)
- [Vector Zonal
  Stats](https://geowrangler.thinkingmachin.es/tutorial.vector_zonal_stats.html)
- [Raster Zonal
  Stats](https://geowrangler.thinkingmachin.es/tutorial.raster_zonal_stats.html)
- [Area Zonal
  Stats](https://geowrangler.thinkingmachin.es/tutorial.area_zonal_stats.html)
- [Distance Zonal
  Stats](https://geowrangler.thinkingmachin.es/tutorial.distance_zonal_stats.html)
- [DHS Processing
  Utils](https://geowrangler.thinkingmachin.es/tutorial.dhs.html)
- [Dataset
  Downloads](https://geowrangler.thinkingmachin.es/tutorial.datasets.html)
- [Spatial Join Using Highest
  Intersection](https://geowrangler.thinkingmachin.es/tutorial.spatial_join_highest_intersection.html)
- [Tile
  Clustering](https://geowrangler.thinkingmachin.es/tutorial.tile_clustering.html)
- [Raster
  Processing](https://geowrangler.thinkingmachin.es/tutorial.raster_processing.html)
- [Raster to
  Dataframe](https://geowrangler.thinkingmachin.es/tutorial.raster_to_dataframe.html)

#### Reference

- [Grids Generation](https://geowrangler.thinkingmachin.es/grids.html)
- [Polygon Fill
  Algorithms](https://geowrangler.thinkingmachin.es/polygon_fill.html)
- [Geometry
  Validation](https://geowrangler.thinkingmachin.es/validation.html)
- [Vector Zonal
  Stats](https://geowrangler.thinkingmachin.es/vector_zonal_stats.html)
- [Raster Zonal
  Stats](https://geowrangler.thinkingmachin.es/raster_zonal_stats.html)
- [Area Zonal
  Stats](https://geowrangler.thinkingmachin.es/area_zonal_stats.html)
- [Distance Zonal
  Stats](https://geowrangler.thinkingmachin.es/distance_zonal_stats.html)
- [DHS Processing Utils](https://geowrangler.thinkingmachin.es/dhs.html)
- [Dataset Geofabrik
  (OSM)](https://geowrangler.thinkingmachin.es/datasets_geofabrik.html)
- [Dataset
  Ookla](https://geowrangler.thinkingmachin.es/datasets_ookla.html)

> [!NOTE]
>
> All the documentation pages (including the references) are executable
> Jupyter notebooks.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/thinkingmachines/geowrangler",
    "name": "geowrangler",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "nbdev jupyter notebook python",
    "author": "Thinking Machines Data Sciences Inc.",
    "author_email": "geowrangler@thinkingmachin.es",
    "download_url": "https://files.pythonhosted.org/packages/74/56/7e0c1d06504fe8fb8d39a444e2caecca5f355e06a5e27efb9297277f57ab/geowrangler-0.5.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "\ud83c\udf0f A python package for wrangling geospatial datasets",
    "version": "0.5.1",
    "project_urls": {
        "Homepage": "https://github.com/thinkingmachines/geowrangler"
    },
    "split_keywords": [
        "nbdev",
        "jupyter",
        "notebook",
        "python"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c07d698c12271b9ff67c581716bf4cf58a7713819f2569f6a173b763aa2504bc",
                "md5": "d175c8cc0387578078f983f36e274ea1",
                "sha256": "761cdda281284977f0cb6f4981b2cb8d9d66500df7213c21fd977eb925497fe6"
            },
            "downloads": -1,
            "filename": "geowrangler-0.5.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d175c8cc0387578078f983f36e274ea1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 55273,
            "upload_time": "2024-09-10T05:28:53",
            "upload_time_iso_8601": "2024-09-10T05:28:53.383791Z",
            "url": "https://files.pythonhosted.org/packages/c0/7d/698c12271b9ff67c581716bf4cf58a7713819f2569f6a173b763aa2504bc/geowrangler-0.5.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "74567e0c1d06504fe8fb8d39a444e2caecca5f355e06a5e27efb9297277f57ab",
                "md5": "6852fa91b743c50843a4cfa2d9a0546e",
                "sha256": "0a8722c75f1a180c4daee3bb79334b0ea9e4b09e6ec5978f91f1dd8bd3a701e8"
            },
            "downloads": -1,
            "filename": "geowrangler-0.5.1.tar.gz",
            "has_sig": false,
            "md5_digest": "6852fa91b743c50843a4cfa2d9a0546e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 67083,
            "upload_time": "2024-09-10T05:28:55",
            "upload_time_iso_8601": "2024-09-10T05:28:55.331511Z",
            "url": "https://files.pythonhosted.org/packages/74/56/7e0c1d06504fe8fb8d39a444e2caecca5f355e06a5e27efb9297277f57ab/geowrangler-0.5.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-10 05:28:55",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "thinkingmachines",
    "github_project": "geowrangler",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "lcname": "geowrangler"
}
        