# mixmasta
[![Python Tests](https://github.com/jataware/mixmasta/actions/workflows/python.yaml/badge.svg)](https://github.com/jataware/mixmasta/actions/workflows/python.yaml)
A library for common scientific model transforms. This library enables fast and intuitive transforms including:
* Converting a `geotiff` to a `csv`
* Converting a `NetCDF` to a `csv`
* Geocoding `csv`, `xls`, and `xlsx` data that contains latitude and longitude
## Setup
See `docs/docker.md` for instructions on running Mixmasta in Docker (easiest!).
Ensure you have a working installation of [GDAL](https://trac.osgeo.org/gdal/wiki/FAQInstallationAndBuilding#FAQ-InstallationandBuilding).
You also need to ensure that `numpy` is installed prior to `mixmasta` installation. This is an artifact of GDAL, which will build incorrectly if `numpy` is not already configured:
```
pip install numpy==1.20.1
pip install mixmasta
```
> Note: if you had a prior installation of GDAL you may need to run `pip install mixmasta --no-cache-dir` in a clean environment.
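Because the install order matters, it can help to confirm the prerequisites are importable before running `pip install mixmasta`. The following is a minimal sketch, not part of mixmasta: the helper name `missing_prerequisites` is hypothetical, and it assumes the GDAL Python bindings are exposed under their usual `osgeo` module name.

```python
from importlib.util import find_spec


def missing_prerequisites(modules=("numpy", "osgeo")):
    """Return the names of the given top-level modules that cannot be found.

    `osgeo` is the import name of the GDAL Python bindings.
    """
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Install these before mixmasta:", ", ".join(missing))
    else:
        print("numpy and GDAL bindings found")
```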
You must install the GADM2 and GADM3 data with:
```
mixmasta download
```
## Usage
Examples can be found in the `inputs` directory.
Convert a geotiff to a dataframe with:
```
from mixmasta import mixmasta as mix
df = mix.raster2df('chirps-v2.0.2021.01.3.tif', feature_name='rainfall', band=1)
```
If the geotiff is multi-band, specify which `band` to process. Use `feature_name` to name the feature column in the output. You may optionally specify a `date` if the geotiff has an associated date.
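To make the shape of the result concrete, here is a conceptual sketch of how one raster band can be unrolled into one row per pixel. It is not mixmasta's implementation (which relies on GDAL); the function `grid_to_rows` and its parameters are hypothetical, and it assumes a north-up raster described by a top-left origin and a pixel size.

```python
from typing import Iterator, List, Optional, Tuple


def grid_to_rows(
    grid: List[List[float]],
    origin_x: float,
    origin_y: float,
    pixel_width: float,
    pixel_height: float,
    nodata: Optional[float] = None,
) -> Iterator[Tuple[float, float, float]]:
    """Yield (longitude, latitude, value) for the center of each cell.

    (origin_x, origin_y) is the top-left corner of the raster. Rows run
    north to south, so latitude decreases as the row index grows.
    """
    for row_idx, row in enumerate(grid):
        for col_idx, value in enumerate(row):
            # Skip cells flagged as missing data.
            if nodata is not None and value == nodata:
                continue
            lon = origin_x + (col_idx + 0.5) * pixel_width
            lat = origin_y - (row_idx + 0.5) * pixel_height
            yield (lon, lat, value)


# A tiny 2x2 "band" with one missing cell.
band = [[1.0, 2.0], [-9999.0, 4.0]]
rows = list(
    grid_to_rows(band, origin_x=30.0, origin_y=10.0,
                 pixel_width=0.5, pixel_height=0.5, nodata=-9999.0)
)
for lon, lat, rainfall in rows:
    print(lon, lat, rainfall)
```

Each tuple corresponds to one row of the resulting table, with the third element playing the role of the feature column.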
Convert a NetCDF to a dataframe with:
```
from mixmasta import mixmasta as mix
df = mix.netcdf2df('tos_O1_2001-2002.nc')
```
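Conceptually, converting a NetCDF variable to a table means turning a gridded cube into long-format rows, one per coordinate combination. The sketch below illustrates that idea with plain Python lists. It is an assumption-laden illustration rather than what `netcdf2df` does internally: the function `flatten_cube` and the column names `time`, `lat`, `lon`, and `value` are hypothetical.

```python
from itertools import product


def flatten_cube(values, times, lats, lons):
    """Flatten values[t][i][j] into a list of row dictionaries.

    One row is produced for every (time, lat, lon) combination.
    """
    rows = []
    for (t, time), (i, lat), (j, lon) in product(
        enumerate(times), enumerate(lats), enumerate(lons)
    ):
        rows.append({"time": time, "lat": lat, "lon": lon, "value": values[t][i][j]})
    return rows


# Two time steps on a 2x1 grid.
times = ["2001-01", "2001-02"]
lats = [10.0, 11.0]
lons = [30.0]
values = [
    [[280.1], [281.2]],  # 2001-01
    [[282.3], [283.4]],  # 2001-02
]
table = flatten_cube(values, times, lats, lons)
for row in table:
    print(row)
```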
Geocode a dataframe:
```
from mixmasta import mixmasta as mix
# First, load in the geotiff as a dataframe
df = mix.raster2df('chirps-v2.0.2021.01.3.tif', feature_name='rainfall', band=1)
# next, we can geocode the dataframe to the admin-level desired (`admin2` or `admin3`)
# by specifying the names of the x and y columns
# in this case, we will geocode to admin2 where x,y are 'longitude' and 'latitude', respectively.
df_g = mix.geocode("admin2", df, x='longitude', y='latitude')
```
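Geocoding of this kind boils down to asking which administrative boundary contains each point. The sketch below shows that idea with a ray-casting point-in-polygon test on made-up square regions. It is not how mixmasta is implemented (it uses the downloaded GADM data); the functions, the region names, and the `admin2` output column are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: toggle `inside` for each edge crossed by a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def geocode_rows(rows, regions, x="longitude", y="latitude"):
    """Return copies of rows with an 'admin2' key naming the containing region (or None)."""
    out = []
    for row in rows:
        match = None
        for name, polygon in regions.items():
            if point_in_polygon(row[x], row[y], polygon):
                match = name
                break
        out.append({**row, "admin2": match})
    return out


# Hypothetical regions: two adjacent squares.
regions = {
    "District A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "District B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
points = [
    {"longitude": 1, "latitude": 1, "rainfall": 3.2},
    {"longitude": 3, "latitude": 1, "rainfall": 0.0},
    {"longitude": 5, "latitude": 5, "rainfall": 1.1},
]
geocoded = geocode_rows(points, regions)
for row in geocoded:
    print(row)
```

A linear scan over every region is fine for a toy example; real boundary sets are large enough that a spatial index is the usual choice.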
## Running with CLI
After cloning the repository and changing to the `mixmasta` directory, you can run mixmasta via the command line.
Set-up:
While you can point `mixmasta` to any file you would like to transform, the examples below assume your file is in the `inputs` folder; the transformed `.csv` file will be written to the `outputs` folder.
- Transform geotiff to geocoded csv:
```
mixmasta mix --xform=geotiff --input_file=chirps-v2.0.2021.01.3.tif --output_file=geotiffTEST.csv --geo=admin2 --feature_name=rainfall --band=1 --date='5/4/2010' --x=longitude --y=latitude
```
- Transform geotiff to csv:
```
mixmasta mix --xform=geotiff --input_file=maxhop1.tif --output_file=maxhopOUT.csv --geo=admin2 --feature_name=probability --band=1 --x=longitude --y=latitude
```
- Transform netcdf to geocoded csv:
```
mixmasta mix --xform=netcdf --input_file=tos_O1_2001-2002.nc --output_file=netcdf.csv --geo=admin2 --x=lon --y=lat
```
- Transform netcdf to csv:
```
mixmasta mix --xform=netcdf --input_file=tos_O1_2001-2002.nc --output_file=netcdf.csv
```
- Geocode an existing csv file:
```
mixmasta mix --xform=geocode --input_file=no_geo.csv --geo=admin3 --output_file=geoed_no_geo.csv --x=longitude --y=latitude
```
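When many files need the same transform, the commands above can be assembled programmatically. The following is a small sketch that only builds the argument list from the flags shown in the examples above; the helper `build_mix_command` is hypothetical and not part of mixmasta.

```python
import shlex


def build_mix_command(xform, input_file, output_file, **options):
    """Build the argument list for `mixmasta mix`.

    Extra keyword arguments (for example geo, x, y, band, feature_name, date)
    become `--name=value` flags; options set to None are skipped.
    """
    args = [
        "mixmasta",
        "mix",
        f"--xform={xform}",
        f"--input_file={input_file}",
        f"--output_file={output_file}",
    ]
    for name, value in options.items():
        if value is not None:
            args.append(f"--{name}={value}")
    return args


cmd = build_mix_command(
    "netcdf", "tos_O1_2001-2002.nc", "netcdf.csv", geo="admin2", x="lon", y="lat"
)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

With mixmasta installed, the list can be handed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell quoting problems with values such as dates.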
## World Modelers Specific Normalization
For the World Modelers program, it is necessary to convert arbitrary `csv`, `geotiff`, and `netcdf` files into a CauseMos-compliant format. This can be accomplished with a `mapping` annotation file and the `causemosify` command. The output is a gzipped `parquet` file. This may be invoked with:
```
mixmasta causemosify --input_file=chirps-v2.0.2021.01.3.tif --mapper=mapper.json --geo=admin3 --output_file=causemosified_example
```
This will produce a file called `causemosified_example.parquet.gzip` which can be read using Pandas with:
```
import pandas as pd
df = pd.read_parquet('causemosified_example.parquet.gzip')
```
## Other Documents
- Docker Instructions: `docs/docker.md`
- Geo Entity Resolution Description: `docs/geo-entity-resolution.md`
- Package Testing in SpaceTag Env: `docs/spacetag-test.md`
## Credits
This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and the [audreyr/cookiecutter-pypackage](https://github.com/audreyr/cookiecutter-pypackage) project template.
# History
## 0.1.0 (2021-02-24)
- First release on PyPI.
Raw data
{
"_id": null,
"home_page": "https://github.com/jataware/mixmasta",
"name": "mixmasta",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.5",
"maintainer_email": "",
"keywords": "mixmasta",
"author": "Brandon Rose",
"author_email": "brandon@jataware.com",
"download_url": "https://files.pythonhosted.org/packages/42/21/cf29f591d0a0fa76e0f4ad46febc7e7a63dcd84251054513b67d2531c5cc/mixmasta-0.6.9.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT license",
"summary": "A library for common scientific model transforms",
"version": "0.6.9",
"split_keywords": [
"mixmasta"
],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "9d404092075973ceabc3de47965135d6",
"sha256": "2b327cdddef20f8f92ec06865281e13834a7b5bf5f02a5aebbd88013a75c3b23"
},
"downloads": -1,
"filename": "mixmasta-0.6.9-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "9d404092075973ceabc3de47965135d6",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": ">=3.5",
"size": 31969,
"upload_time": "2022-12-13T17:37:53",
"upload_time_iso_8601": "2022-12-13T17:37:53.386417Z",
"url": "https://files.pythonhosted.org/packages/06/69/64f8fd00f0d8869753c5811ad805f10da936c211ab386e8a5ed7c678ad71/mixmasta-0.6.9-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"md5": "39c698c50ad9db243e3b0e7216ffdedf",
"sha256": "20d4f27c46732b1b606e18aad255612b91030968067bbdb8fccf776a3f2e81c3"
},
"downloads": -1,
"filename": "mixmasta-0.6.9.tar.gz",
"has_sig": false,
"md5_digest": "39c698c50ad9db243e3b0e7216ffdedf",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.5",
"size": 2934315,
"upload_time": "2022-12-13T17:37:55",
"upload_time_iso_8601": "2022-12-13T17:37:55.612873Z",
"url": "https://files.pythonhosted.org/packages/42/21/cf29f591d0a0fa76e0f4ad46febc7e7a63dcd84251054513b67d2531c5cc/mixmasta-0.6.9.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-12-13 17:37:55",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "jataware",
"github_project": "mixmasta",
"travis_ci": true,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "bump2version",
"specs": [
[
"==",
"1.0.1"
]
]
},
{
"name": "Click",
"specs": [
[
">=",
"7.0"
],
[
"<",
"8"
]
]
},
{
"name": "coverage",
"specs": [
[
"==",
"4.5.4"
]
]
},
{
"name": "Cython",
"specs": [
[
"==",
"0.29.23"
]
]
},
{
"name": "flake8",
"specs": [
[
"==",
"3.7.8"
]
]
},
{
"name": "fuzzywuzzy",
"specs": [
[
">=",
"0.18.0"
]
]
},
{
"name": "GDAL",
"specs": [
[
"==",
"3.1.4"
]
]
},
{
"name": "geofeather",
"specs": [
[
">=",
"0.3.0"
]
]
},
{
"name": "geopandas",
"specs": [
[
"==",
"0.8.1"
]
]
},
{
"name": "netCDF4",
"specs": [
[
"==",
"1.5.3"
]
]
},
{
"name": "numpy",
"specs": [
[
"==",
"1.20.3"
]
]
},
{
"name": "openpyxl",
"specs": [
[
"==",
"3.0.7"
]
]
},
{
"name": "pip",
"specs": [
[
">=",
"21.1"
]
]
},
{
"name": "pydantic",
"specs": [
[
">=",
"1.8.2"
]
]
},
{
"name": "pyproj",
"specs": [
[
"==",
"2.6.1.post1"
]
]
},
{
"name": "python-Levenshtein",
"specs": [
[
">=",
"0.12.2"
]
]
},
{
"name": "rasterio",
"specs": [
[
">=",
"1.1.0"
]
]
},
{
"name": "Rtree",
"specs": [
[
"==",
"0.8.3"
]
]
},
{
"name": "Shapely",
"specs": [
[
"==",
"1.7.1"
]
]
},
{
"name": "Sphinx",
"specs": [
[
"==",
"1.8.5"
]
]
},
{
"name": "tox",
"specs": [
[
"==",
"3.14.0"
]
]
},
{
"name": "tqdm",
"specs": [
[
"<",
"5.0.0"
],
[
">=",
"4.41.1"
]
]
},
{
"name": "twine",
"specs": [
[
"==",
"1.14.0"
]
]
},
{
"name": "watchdog",
"specs": [
[
"==",
"0.9.0"
]
]
},
{
"name": "wheel",
"specs": [
[
"==",
"0.33.6"
]
]
},
{
"name": "xarray",
"specs": [
[
"==",
"0.16.1"
]
]
},
{
"name": "xlrd",
"specs": [
[
"==",
"2.0.1"
]
]
}
],
"tox": true,
"lcname": "mixmasta"
}