# AI/ML Python Package: `ml_dronebase_data_utils`
This package contains commonly used data functions for the AI/ML Team at Zeitview (formerly DroneBase).
This repo provides:
- Custom Python Package: `ml_dronebase_data_utils`
<!-- MarkdownTOC -->
- [Contributing Guidelines](#contributing-guidelines)
- [Usage](#usage)
- [Object Detection Annotation Formatting](#object-detection-annotation-formatting)
- [S3 Data Utils](#s3-data-utils)
  - [Installation from source](#installation-from-source)
  - [Installation using pip](#installation-using-pip)
- [References](#references)
<!-- /MarkdownTOC -->
<a id="contributing-guidelines"></a>
## Contributing Guidelines
Please see the [CONTRIBUTING.md](./.github/CONTRIBUTING.md) document for details on how to contribute to this repository through pull requests.
<a id="usage"></a>
## Usage
```python
import ml_dronebase_data_utils as data_utils
...
```
<a id="object-detection-annotation-formatting"></a>
## Object Detection Annotation Formatting
This package provides a Pascal VOC writer that renders `*.xml` annotation files for object detection tasks.
It supports both regular and oriented object detection annotations; oriented annotations carry an additional `<angle>`_some angle_`</angle>` element.
```python
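from ml_dronebase_data_utils import PascalVOCWriter

# Example inputs (placeholders): each box is (xmin, ymin, xmax, ymax, angle).
boxes = [(10, 20, 110, 220, 0.0)]
annotation_path = "annotation.xml"

writer = PascalVOCWriter()

for box in boxes:
    xmin, ymin, xmax, ymax, angle = box
    writer.addObject(
        name="some class name",
        xmin=xmin,
        ymin=ymin,
        xmax=xmax,
        ymax=ymax,
        angle=angle,  # Optional parameter
    )

# Write the collected objects to the annotation file.
writer.save(annotation_path)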
```
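
To make the oriented annotation format more concrete, here is a minimal sketch of a Pascal VOC `<object>` element with an optional `<angle>`, built with the standard library only. The helper `build_voc_object`, the class name `panel`, and the placement of `<angle>` inside `<bndbox>` are assumptions for illustration; the XML that `PascalVOCWriter` actually emits may be laid out differently.

```python
import xml.etree.ElementTree as ET


def build_voc_object(name, xmin, ymin, xmax, ymax, angle=None):
    """Build one Pascal VOC <object> element; <angle> is added only if given."""
    obj = ET.Element("object")
    ET.SubElement(obj, "name").text = name
    bndbox = ET.SubElement(obj, "bndbox")
    for tag, value in (("xmin", xmin), ("ymin", ymin), ("xmax", xmax), ("ymax", ymax)):
        ET.SubElement(bndbox, tag).text = str(value)
    if angle is not None:
        # Assumption: the angle lives inside <bndbox>; the real writer may differ.
        ET.SubElement(bndbox, "angle").text = str(angle)
    return obj


# Hypothetical example: one oriented box for a class called "panel".
annotation = ET.Element("annotation")
annotation.append(build_voc_object("panel", 10, 20, 110, 220, angle=15.0))
xml_text = ET.tostring(annotation, encoding="unicode")
print(xml_text)
```

Leaving `angle` unset yields a regular (axis-aligned) Pascal VOC object, which mirrors how the `angle` argument is optional in the writer example above.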
This package also provides command-line interfaces for the same tasks.
`convert_geojson` converts GeoJSON annotations to Pascal VOC format. It can also process a batch of orthos with `--batch`.
```txt
usage: convert_geojson [-h] --ortho-path ORTHO_PATH --geojson GEOJSON
                       --save-path SAVE_PATH
                       [--class-attribute CLASS_ATTRIBUTE [CLASS_ATTRIBUTE ...]]
                       [--class-mapping CLASS_MAPPING]
                       [--skip-classes SKIP_CLASSES [SKIP_CLASSES ...]]
                       [--rotated] [--batch] [--prefix PREFIX]

Convert geojson to voc format data

optional arguments:
  -h, --help            show this help message and exit
  --ortho-path ORTHO_PATH
                        The ortho path, can be local/s3
  --geojson GEOJSON     The geojson path
  --save-path SAVE_PATH
                        The save path
  --class-attribute CLASS_ATTRIBUTE [CLASS_ATTRIBUTE ...]
                        The class attribute to use from the geojson for class labels
  --class-mapping CLASS_MAPPING
                        A plain txt file containing class mappings
  --skip-classes SKIP_CLASSES [SKIP_CLASSES ...]
                        Classes to skip, specify multiple
  --rotated             Use rotated bounding box, defaults to false
  --batch               Process a batch of orthos
  --prefix PREFIX       The prefix to use when saving the annotation
```
Example:
```bash
convert_geojson \
  --ortho-path s3://ml-solar-ortho-fault-detection/orthos/tiff/PA140004_Thermal.tif \
  --geojson s3://ml-solar-ortho-fault-detection/orthos/geojson/PA140004_Thermal.geojson \
  --save-path s3://ml-solar-ortho-fault-detection/orthos/annotations/PA140004_Thermal.xml \
  --class-attribute id \
  --skip-classes 0 4 5 6 7 8 9 10 \
  --class-mapping mapping.txt
```
`mapping.txt` must contain class mappings in the format `0 = Normal`.
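
As an illustration of the mapping format, the following sketch parses text in the `0 = Normal` form into a dictionary. The function `parse_class_mapping`, the choice to keep class ids as strings, and the `Hotspot` label are assumptions made for this example; the parser used inside `convert_geojson` may behave differently.

```python
def parse_class_mapping(text):
    """Parse lines such as '0 = Normal' into a {class_id: label} dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, sep, label = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'id = label', got: {line!r}")
        # Assumption: ids are kept as strings, exactly as written in the file.
        mapping[key.strip()] = label.strip()
    return mapping


# Hypothetical file contents; only '0 = Normal' comes from the documentation.
example = "0 = Normal\n1 = Hotspot\n"
class_mapping = parse_class_mapping(example)
print(class_mapping)
```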
`visualize_converted_geojson` draws the generated annotations on the ortho for visual inspection. It can also run in batch mode with `--batch`.
```txt
usage: visualize_converted_geojson [-h] --ortho-path ORTHO_PATH --anno-path
                                   ANNO_PATH --save-path SAVE_PATH
                                   [--draw-labels] [--batch]

Visualize converted geojson for quick visual inspection

optional arguments:
  -h, --help            show this help message and exit
  --ortho-path ORTHO_PATH, -o ORTHO_PATH
                        The ortho path, can be local/s3
  --anno-path ANNO_PATH, -a ANNO_PATH
                        The ortho path, can be local/s3
  --save-path SAVE_PATH, -s SAVE_PATH
                        The ortho path, can be local/s3
  --draw-labels, -d     Draw the class labels
  --batch, -b           Run in batched mode
```
Example, where `-a` points to the generated annotation file and `-s` to the output image:
```bash
visualize_converted_geojson \
  -o s3://ml-solar-ortho-fault-detection/orthos/tiff/PA140004_Thermal.tif \
  -a s3://ml-solar-ortho-fault-detection/orthos/annotations/PA140004_Thermal.xml \
  -s s3://ml-solar-ortho-fault-detection/orthos/visual_validation/PA140004_Thermal_drawn.png \
  -d
```
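
Both CLIs accept paths that are either local or on S3. As a rough sketch of how such a path could be told apart and split into bucket and key, the hypothetical helper `split_s3_uri` below uses only the standard library; it is not part of this package's documented API.

```python
from urllib.parse import urlparse


def split_s3_uri(path):
    """Return (bucket, key) for an s3:// URI, or None for a local path."""
    parsed = urlparse(path)
    if parsed.scheme != "s3":
        return None  # treated as a local path
    return parsed.netloc, parsed.path.lstrip("/")


print(split_s3_uri("s3://ml-solar-ortho-fault-detection/orthos/tiff/PA140004_Thermal.tif"))
print(split_s3_uri("orthos/tiff/PA140004_Thermal.tif"))
```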
<a id="s3-data-utils"></a>
## S3 Data Utils
This package also provides common AWS S3 data functions, such as downloading data, uploading data or trained models, and creating train/test splits.
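
The function names for these utilities are not listed here, so the following is only a sketch of the train/test split idea applied to a list of S3 object keys. The helper `train_test_split_keys`, its parameters, and the example keys are assumptions for illustration and do not reflect this package's actual API.

```python
import random
from typing import List, Sequence, Tuple


def train_test_split_keys(
    keys: Sequence[str], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle keys reproducibly and split them into (train, test) lists."""
    if not 0.0 <= test_fraction <= 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = sorted(keys)  # sort first so the input order does not matter
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical object keys; a real listing would come from the bucket itself.
keys = [f"orthos/tiff/site_{i}.tif" for i in range(10)]
train_keys, test_keys = train_test_split_keys(keys, test_fraction=0.2, seed=42)
print(len(train_keys), len(test_keys))
```

Fixing the seed keeps the split stable across runs, so the same objects stay in the test set when the listing is unchanged.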
<a id="installation-from-source"></a>
### Installation from source
Clone this repo and `cd` into its root directory, then run the following:
```bash
pip install -e .
```
<a id="installation-using-pip"></a>
### Installation using pip
```bash
pip install ml-dronebase-data-utils
```
<a id="references"></a>
## References
- [AI/ML Confluence Wiki](https://dronebase.atlassian.net/l/cp/1bcAYU82)
- [AI/ML DevOps Confluence Wiki](https://dronebase.atlassian.net/l/cp/Am4YKXPF)
- [AI/ML MLOps Confluence Wiki](https://dronebase.atlassian.net/l/cp/FER9Q0JV)