geotorchai

Name	geotorchai
Version	0.2.0
home_page	https://github.com/DataSystemsLab/GeoTorch
Summary	GeoTorchAI, formerly GeoTorch, A Spatiotemporal Deep Learning Framework
upload_time	2023-04-10 00:16:38
maintainer
docs_url	None
author	Kanchan Chowdhury
requires_python	>=3.6
license	AGPL-3.0
keywords	spatial-machine-learning spatiotemporal-deep-learning spatial forecasting deep learning machine learning spatiotemporal forecasting temporal signal raster classification satellite classification raster segmentation satellite segmentation convlstm st-resnet deepstn+ deepsatv2 lstm temporal network eurosat representation learning
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

<img src="https://raw.githubusercontent.com/DataSystemsLab/GeoTorchAI/main/data/GoeTorchAILogo.png" class="center" width="30%">

# GeoTorchAI: A Spatiotemporal Deep Learning Framework

GeoTorchAI, formerly known as [GeoTorch](https://dl.acm.org/doi/abs/10.1145/3557915.3561036), is a spatiotemporal deep learning framework built on top of PyTorch and [Apache Sedona](https://sedona.apache.org/). It enables spatiotemporal machine learning practitioners to easily and efficiently implement deep learning models for raster imagery datasets and spatiotemporal non-imagery datasets. Deep learning applications of raster imagery datasets include satellite imagery classification and satellite image segmentation. Applications of deep learning on spatiotemporal non-imagery datasets are mainly prediction tasks, which include but are not limited to traffic volume and traffic flow prediction, taxi/bike flow/volume prediction, precipitation forecasting, and weather forecasting.

## GeoTorchAI Modules
GeoTorchAI contains various modules for deep learning and data preprocessing in both raster imagery and spatiotemporal non-imagery categories. The deep learning module offers ready-to-use raster and grid datasets, transforms, and neural network models.


<img src="https://github.com/DataSystemsLab/GeoTorchAI/blob/main/data/architecture.png?raw=true" class="center" width="60%" align="right">

* Datasets: This module contains processed popular datasets for raster data models and grid-based spatiotemporal models. Datasets are available as ready-to-use PyTorch datasets.
* Models: These are PyTorch layers for popular raster data models and grid-based spatiotemporal models.
* Transforms: Various transformation operations that can be applied to dataset samples during model training.
* Preprocessing: Supports preprocessing of raster imagery and spatiotemporal non-imagery datasets in a scalable setting on top of Apache Spark and Apache Sedona. Users don't need to learn the coding concepts of Apache Sedona and Apache Spark. They only need to write their code in Python, while the PySpark and Apache Sedona implementations are hidden. The preprocessing module allows machine learning practitioners to prepare a trainable grid-based spatiotemporal tensor from large raw datasets and to perform various transformations on raster imagery datasets.




## GeoTorchAI Design Principles

GeoTorchAI is designed to provide the necessary building blocks for developing raster and spatiotemporal DL applications within the PyTorch ecosystem. The functionalities available in the GeoTorchAI deep learning module are compatible with PyTorch core units such as neural network layers, datasets, and transformations. The deep learning module is GPU compatible, so PyTorch-provided scalability and parallelism can be achieved on GPU-configured devices.

Although the data preprocessing module has dependencies on external big data processing libraries such as PySpark and Apache Sedona, the deep learning module only depends on PyTorch. Since the datasets component of the deep learning module provides preprocessed and trainable state-of-the-art benchmark datasets, designing applications with such benchmark datasets can be completed without requiring big data-related dependencies. Furthermore, to help machine learning practitioners build raster and spatiotemporal applications with their preferred raw datasets, our preprocessing module enables raster and spatiotemporal data processing in a pure Pythonic way without requiring the coding knowledge of Apache Spark, Apache Sedona, and other big data processing libraries while providing the scalability of Apache Spark at the same time.

Our preprocessing module is designed to minimize the number of methods and classes in the API. Users can perform end-to-end spatiotemporal data preprocessing, which starts by loading raw datasets and ends by generating a trainable Tensor-shaped array, with a minimum number of method calls. This helps users learn the API quickly and reduces confusion.
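
To make the idea of a "trainable Tensor-shaped array" more concrete, the sketch below bins raw point events into a time × row × column count grid using only the Python standard library. It does not use GeoTorchAI's preprocessing API and says nothing about how that module is implemented; the function name `events_to_grid`, the `(timestamp, latitude, longitude)` event layout, and the bounding box are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical event layout for illustration: (timestamp_seconds, latitude, longitude)
Event = Tuple[float, float, float]


def events_to_grid(
    events: Sequence[Event],
    bounds: Tuple[float, float, float, float],  # (min_lat, min_lon, max_lat, max_lon)
    rows: int,
    cols: int,
    t_start: float,
    step_seconds: float,
    num_steps: int,
) -> List[List[List[int]]]:
    """Count events per (time step, grid row, grid column) cell."""
    min_lat, min_lon, max_lat, max_lon = bounds
    grid = [[[0] * cols for _ in range(rows)] for _ in range(num_steps)]
    for ts, lat, lon in events:
        t = int((ts - t_start) // step_seconds)
        if not (0 <= t < num_steps):
            continue  # outside the time window
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            continue  # outside the spatial bounding box
        # Clamp so that points on the upper boundary fall into the last cell
        r = min(int((lat - min_lat) / (max_lat - min_lat) * rows), rows - 1)
        c = min(int((lon - min_lon) / (max_lon - min_lon) * cols), cols - 1)
        grid[t][r][c] += 1
    return grid


# Three made-up events: two in the first hour, one in the second
events = [
    (0.0, 40.70, -74.00),
    (10.0, 40.71, -74.01),
    (3700.0, 40.79, -73.91),
]
grid = events_to_grid(
    events,
    bounds=(40.70, -74.02, 40.80, -73.90),
    rows=2, cols=2,
    t_start=0.0, step_seconds=3600.0, num_steps=2,
)
print(grid)  # [[[2, 0], [0, 0]], [[0, 0], [0, 1]]]
```

The nested list has shape (time steps, rows, columns), which is the kind of layout a grid-based spatiotemporal model consumes once it is converted to a tensor.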


## Documentation
Detailed documentation on installation, the API, and the programming guide is available on the [GeoTorchAI Website](https://kanchanchy.github.io/geotorchai/).

## Installation
GeoTorchAI can be installed by running the following command:
```
pip install geotorchai
```
GeoTorchAI is available on [PyPI](https://pypi.org/project/geotorchai/). For more instructions regarding the required and optional dependencies, please visit the [website](https://kanchanchy.github.io/geotorchai/installation.html).
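
After installing, one way to confirm which version ended up in the active environment is to query the package metadata. This is a minimal sketch using the standard-library `importlib.metadata` module, which assumes Python 3.8 or newer; the helper name `installed_version` is ours and is not part of GeoTorchAI.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("geotorchai")
if version is None:
    print("geotorchai is not installed in this environment")
else:
    print(f"geotorchai {version} is installed")
```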

## Example
End-to-end coding examples for various applications including model training and data preprocessing are available in our [binders](https://github.com/DataSystemsLab/GeoTorchAI/tree/main/binders) and [examples](https://github.com/DataSystemsLab/GeoTorchAI/tree/main/examples) sections.

We show a very short, step-by-step example of satellite imagery classification using GeoTorchAI below. Training a satellite imagery classification model consists of three steps: loading the dataset, initializing the model and parameters, and training the model. We pick the [DeepSatV2](https://arxiv.org/abs/1911.07747) model to classify [EuroSAT](https://github.com/phelber/EuroSAT) satellite images.
#### EuroSAT Image Classes
* Annual Crop
* Forest
* Herbaceous Vegetation
* Highway
* Industrial
* Pasture
* Permanent Crop
* Residential
* River
* SeaLake
#### Spectral Bands of a Highway Image
![Highway Image](https://github.com/DataSystemsLab/GeoTorchAI/blob/main/data/euro-highway.png)
#### Spectral Bands of an Industry Image
![Industry Image](https://github.com/DataSystemsLab/GeoTorchAI/blob/main/data/euro-industry.png)
#### Loading Training Dataset
Load the EuroSAT dataset. Setting `download=True` downloads the full dataset into the given directory. If the data is already available there, set `download=False`.
```
from geotorchai.datasets.raster import EuroSAT
full_data = EuroSAT(root="data/eurosat", download=True, include_additional_features=True)
```
#### Split data into 80% train and 20% validation parts
```
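import numpy as np
import torch

random_seed = 42  # assumed value; any fixed integer makes the shuffle reproducible
dataset_size = len(full_data)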
indices = list(range(dataset_size))
split = int(np.floor(0.2 * dataset_size))
np.random.seed(random_seed)
np.random.shuffle(indices)
train_indices, val_indices = indices[split:], indices[:split]

train_sampler = torch.utils.data.sampler.SubsetRandomSampler(train_indices)
valid_sampler = torch.utils.data.sampler.SubsetRandomSampler(val_indices)

train_loader = torch.utils.data.DataLoader(full_data, batch_size=16, sampler=train_sampler)
val_loader = torch.utils.data.DataLoader(full_data, batch_size=16, sampler=valid_sampler)
```
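
The index arithmetic in the split can be checked in isolation, without loading any data. The sketch below reproduces the same shuffle-then-slice logic with the standard-library `random` module instead of NumPy; `split_indices` is a hypothetical helper written for this illustration, and the dataset size of 1000 is made up.

```python
import math
import random
from typing import List, Tuple


def split_indices(n: int, val_fraction: float, seed: int) -> Tuple[List[int], List[int]]:
    """Shuffle range(n) reproducibly and split it into (train, validation) index lists."""
    indices = list(range(n))
    # A local RNG keeps the global random state untouched
    random.Random(seed).shuffle(indices)
    split = int(math.floor(val_fraction * n))
    return indices[split:], indices[:split]


train_idx, val_idx = split_indices(1000, val_fraction=0.2, seed=0)
print(len(train_idx), len(val_idx))        # 800 200
assert set(train_idx).isdisjoint(val_idx)  # no sample appears in both parts
```

Because both samplers draw from disjoint index lists over the same dataset object, no validation sample is ever seen during training.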
#### Train and Evaluate on GPU
If the machine used for training has GPUs available, the model, loss function, and tensors can be loaded onto the GPU. First, initialize the device as CPU or GPU depending on GPU availability.
```
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```
Later, the model, loss function, and tensors can be moved to the CPU or GPU by calling `.to(device)`, as shown in the following sections.
#### Initializing Model and Parameters
Model initialization parameters such as `in_channels`, `in_width`, `in_height`, and `num_classes` are based on the properties of the EuroSAT dataset: 13 spectral bands, 64×64-pixel images, and the 10 classes listed above.
```
from geotorchai.models.raster import DeepSatV2
model = DeepSatV2(in_channels=13, in_height=64, in_width=64, num_classes=10, num_filtered_features=len(full_data.ADDITIONAL_FEATURES))
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0002)
# Load model and loss function to GPU or CPU
model.to(device)
loss_fn.to(device)
```
#### Train the Model for One Epoch
```
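model.train()  # training mode; matters for layers such as dropout or batch norm, if the model uses them
for i, sample in enumerate(train_loader):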
    inputs, labels, features = sample
    # Load tensors to GPU or CPU
    inputs = inputs.to(device)
    features = features.type(torch.FloatTensor).to(device)
    labels = labels.to(device)
    # Forward pass
    outputs = model(inputs, features)
    loss = loss_fn(outputs, labels)
    # Backward pass and optimize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```
#### Evaluate the Model on Validation Dataset
```
model.eval()
total_sample = 0
correct = 0
# Gradients are not needed for evaluation, so disable tracking to save memory
with torch.no_grad():
    for i, sample in enumerate(val_loader):
        inputs, labels, features = sample
        # Load tensors to GPU or CPU
        inputs = inputs.to(device)
        features = features.type(torch.FloatTensor).to(device)
        labels = labels.to(device)
        # Forward pass
        outputs = model(inputs, features)
        total_sample += len(labels)
        # Predicted class is the index of the highest score per sample
        _, predicted = outputs.max(1)
        correct += predicted.eq(labels).sum().item()
val_accuracy = 100 * correct / total_sample
print("Validation Accuracy: ", val_accuracy, "%")
```
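
The accuracy bookkeeping in the loop above can be reproduced on plain Python lists, which makes it easy to verify by hand. This is only an illustrative sketch: `argmax` and `accuracy_percent` are hypothetical helpers, and the scores and labels are made up for a 3-class problem.

```python
from typing import Iterable, List, Sequence, Tuple


def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score (the first one on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy_percent(batches: Iterable[Tuple[List[List[float]], List[int]]]) -> float:
    """Accumulate correct and total counts over (outputs, labels) batches."""
    correct = 0
    total = 0
    for outputs, labels in batches:
        predicted = [argmax(row) for row in outputs]
        correct += sum(int(p == y) for p, y in zip(predicted, labels))
        total += len(labels)
    return 100 * correct / total


# Two made-up batches of per-class scores with their true labels
batches = [
    ([[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]], [1, 0]),  # both predictions correct
    ([[0.3, 0.3, 0.4], [0.6, 0.3, 0.1]], [2, 1]),    # first correct, second wrong
]
print(accuracy_percent(batches))  # 75.0
```

Accumulating counts across batches, rather than averaging per-batch accuracies, keeps the result correct when the last batch is smaller than the others.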

## Contributing to this Project
Follow the instructions available [here](https://github.com/DataSystemsLab/GeoTorchAI/blob/main/CONTRIBUTING.md).

## Other Contributions of this Project
We also contributed to [Apache Sedona](https://sedona.apache.org/) by adding transformation and write support for GeoTiff raster images. This contribution is also a part of this project. Contribution reference: [Commits](https://github.com/apache/incubator-sedona/commits?author=kanchanchy)

## Citing the Work:
Kanchan Chowdhury and Mohamed Sarwat. 2022. GeoTorch: a spatiotemporal deep learning framework. In Proceedings of the 30th International Conference on Advances in Geographic Information Systems (SIGSPATIAL '22). Association for Computing Machinery, New York, NY, USA, Article 100, 1–4. https://doi.org/10.1145/3557915.3561036

### BibTex:
```
@inproceedings{10.1145/3557915.3561036,
author = {Chowdhury, Kanchan and Sarwat, Mohamed},
title = {GeoTorch: A Spatiotemporal Deep Learning Framework},
year = {2022},
isbn = {9781450395298},
publisher = {Association for Computing Machinery},
url = {https://doi.org/10.1145/3557915.3561036},
doi = {10.1145/3557915.3561036},
articleno = {100},
numpages = {4},
location = {Seattle, Washington},
series = {SIGSPATIAL '22}
}
```

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/DataSystemsLab/GeoTorch",
    "name": "geotorchai",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "spatial-machine-learning,spatiotemporal-deep-learning,spatial forecasting,deep learning,machine learning,spatiotemporal forecasting,temporal signal,raster classification,satellite classification,raster segmentation,satellite segmentation,convlstm,st-resnet,deepstn+,deepsatv2,lstm,temporal network,eurosat,representation learning",
    "author": "Kanchan Chowdhury",
    "author_email": "kchowdh1@asu.edu",
    "download_url": "https://files.pythonhosted.org/packages/00/a2/090cc8f3ddb22f69ae852cb112514214434436005f174e1ebc78e4c150a1/geotorchai-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "AGPL-3.0",
    "summary": "GeoTorchAI, formerly GeoTorch, A Spatiotemporal Deep Learning Framework",
    "version": "0.2.0",
    "split_keywords": [
        "spatial-machine-learning",
        "spatiotemporal-deep-learning",
        "spatial forecasting",
        "deep learning",
        "machine learning",
        "spatiotemporal forecasting",
        "temporal signal",
        "raster classification",
        "satellite classification",
        "raster segmentation",
        "satellite segmentation",
        "convlstm",
        "st-resnet",
        "deepstn+",
        "deepsatv2",
        "lstm",
        "temporal network",
        "eurosat",
        "representation learning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "19b9de1362d16f29c297c490a863306f1a9c2a968715ec87ba38f012ebc00c97",
                "md5": "6465f5843c542c4eeb570f394db24a83",
                "sha256": "8a68a96cf2c582230b53a2cc9020838fd4dfc467ae519cedd46fc4ce3e9be6a2"
            },
            "downloads": -1,
            "filename": "geotorchai-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6465f5843c542c4eeb570f394db24a83",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 102043,
            "upload_time": "2023-04-10T00:16:36",
            "upload_time_iso_8601": "2023-04-10T00:16:36.395363Z",
            "url": "https://files.pythonhosted.org/packages/19/b9/de1362d16f29c297c490a863306f1a9c2a968715ec87ba38f012ebc00c97/geotorchai-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "00a2090cc8f3ddb22f69ae852cb112514214434436005f174e1ebc78e4c150a1",
                "md5": "efdf4feea0239e923a0993d37fe7cb64",
                "sha256": "15948cef53fc2a1243e1a7935c210ad101e123ca98ac1886b07a3f734781d3a8"
            },
            "downloads": -1,
            "filename": "geotorchai-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "efdf4feea0239e923a0993d37fe7cb64",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 65301,
            "upload_time": "2023-04-10T00:16:38",
            "upload_time_iso_8601": "2023-04-10T00:16:38.966716Z",
            "url": "https://files.pythonhosted.org/packages/00/a2/090cc8f3ddb22f69ae852cb112514214434436005f174e1ebc78e4c150a1/geotorchai-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-10 00:16:38",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "DataSystemsLab",
    "github_project": "GeoTorch",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "geotorchai"
}
