Name | mbari-aidata |
Version | 1.56.1 |
home_page | None |
Summary | Command line tool to do extract, transform, load and download operations on AI data for a number of projects at MBARI that require detection, clustering or classification workflows. |
upload_time | 2025-08-10 02:12:55 |
maintainer | None |
docs_url | None |
author | Danelle Cline |
requires_python | <3.12,>=3.10 |
license | Apache |
*mbari-aidata* is a command line tool for extract, transform, load (ETL) and download operations
on AI data for MBARI projects that require detection, clustering, or classification
workflows.
More documentation and examples are available at [https://docs.mbari.org/internal/ai/data](https://docs.mbari.org/internal/ai/data/).
## 🚀 Features
* 🧠 Object Detection/Clustering Integration: Loads detection/classification/clustering output from SDCAT formatted results.
* Flexible Data Export: Downloads from Tator into machine learning formats like COCO, CIFAR, or PASCAL VOC.
* Real-Time Uploads: Pushes localizations to [Tator](https://www.tator.io/) via [Redis](https://redis.io/glossary/redis-queue/) queues for real-time workflows.
* Metadata Extraction: Parses image metadata such as GPS position, time, and date through a plugin-based system (extractors).
* Duplicate Detection: Supports duplicate media load checks with the `--check-duplicates` flag.
* Flexible Media References: Images or video are made accessible through a web server without needing to upload or move them from your internal NFS project mounts (e.g. Thalassa).
* Augmentation Support: Augment VOC datasets with [Albumentations](https://albumentations.ai/) to boost your object detection model performance. See examples in the [docs](https://docs.mbari.org/internal/ai/data/commands/transform/?h=aug#transform-voc-to-yolo-with-augmentations).
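The export formats named above store bounding boxes differently. As a rough illustration only (this is not taken from the mbari-aidata source, and the function names are hypothetical), the sketch below converts a PASCAL VOC corner box into the COCO and YOLO conventions:

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def voc_to_coco(xmin: float, ymin: float, xmax: float, ymax: float) -> Box:
    """VOC corners -> COCO (x, y, width, height), all in pixels."""
    return (xmin, ymin, xmax - xmin, ymax - ymin)


def voc_to_yolo(xmin: float, ymin: float, xmax: float, ymax: float,
                img_w: float, img_h: float) -> Box:
    """VOC corners -> YOLO (x_center, y_center, width, height), normalized to [0, 1]."""
    return (
        (xmin + xmax) / 2.0 / img_w,
        (ymin + ymax) / 2.0 / img_h,
        (xmax - xmin) / img_w,
        (ymax - ymin) / img_h,
    )


if __name__ == "__main__":
    # A 200x200 pixel box inside a hypothetical 400x500 image.
    print(voc_to_coco(100, 50, 300, 250))
    print(voc_to_yolo(100, 50, 300, 250, 400, 500))
```

In practice the `aidata download` and `aidata transform` commands handle these conversions; the sketch is only meant to show what differs between the formats.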
## Requirements
- Python 3.10 or 3.11 (the package metadata requires `>=3.10,<3.12`)
- A Tator API token and, optionally, a Redis password for the `.env` file. Contact the MBARI AI team for access.
- 🐳 Docker, needed only for development and testing; it can also be used instead of a local Python installation.
- For local installation, you will need to install the required Python packages listed in the `requirements.txt` file, [ffmpeg](https://ffmpeg.org/), and the mp4dump tool from https://www.bento4.com/
## 📦 Installation
Install as a Python package:
```shell
pip install mbari-aidata
```
Create a `.env` file in the root directory of the project with the following contents, setting `ENVIRONMENT` to either `testing` or `production`:
```text
TATOR_TOKEN=your_api_token
REDIS_PASSWORD=your_redis_password
ENVIRONMENT=testing or production
```
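Before running any commands, it can help to sanity-check the `.env` file. The following is a minimal sketch using only the standard library; `parse_env_text` and `check_env` are hypothetical helpers written for this example and do not reflect how mbari-aidata itself reads the file.

```python
from pathlib import Path
from typing import Dict, List

ALLOWED_ENVIRONMENTS = {"testing", "production"}


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_env(values: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    problems: List[str] = []
    if not values.get("TATOR_TOKEN"):
        problems.append("TATOR_TOKEN is missing")
    if values.get("ENVIRONMENT") not in ALLOWED_ENVIRONMENTS:
        problems.append("ENVIRONMENT must be 'testing' or 'production'")
    # REDIS_PASSWORD is described as optional, so it is not checked here.
    return problems


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        print(check_env(parse_env_text(env_path.read_text())) or "ok")
```

Note that leaving the placeholder `ENVIRONMENT=testing or production` unedited would be flagged by this check, since only one of the two values is expected.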
Create a configuration file in the root directory of the project:
```bash
touch config_cfe.yml
```
Or use a project-specific configuration from our docs server at
https://docs.mbari.org/internal/ai/projects/
The configuration file defines the project data, such as mounts, plugins, and database connections. For example, to download data using a hosted configuration:
```bash
aidata download --version Baseline --labels "Diatoms, Copepods" --config https://docs.mbari.org/internal/ai/projects/uav-901902/config_uav.yml
```
⚙️ Example configuration file:
```yaml
# config_cfe.yml
# Config file for CFE project production
mounts:
  - name: "image"
    path: "/mnt/CFElab"
    host: "https://mantis.shore.mbari.org"
    nginx_root: "/CFElab"

  - name: "video"
    path: "/mnt/CFElab"
    host: "https://mantis.shore.mbari.org"
    nginx_root: "/CFElab"

plugins:
  - name: "extractor"
    module: "mbari_aidata.plugins.extractors.tap_cfe_media"
    function: "extract_media"

redis:
  host: "doris.shore.mbari.org"
  port: 6382

vss:
  project: "902111-CFE"
  model: "google/vit-base-patch16-224"

tator:
  project: "902111-CFE"
  host: "https://mantis.shore.mbari.org"
  image:
    attributes:
      iso_datetime: #<-------Required for images
        type: datetime
      depth:
        type: float
  video:
    attributes:
      iso_start_datetime: #<-------Required for videos
        type: datetime
  box:
    attributes:
      Label:
        type: string
      score:
        type: float
      cluster:
        type: string
      saliency:
        type: float
      area:
        type: int
      exemplar:
        type: bool
```
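The configuration above marks `iso_datetime` and `iso_start_datetime` as required, and each mount pairs a local `path` with a `host` and `nginx_root`. The sketch below checks the required attributes and shows one plausible way a local file could map to a web URL. Both helpers are hypothetical, and the URL mapping in particular is an assumption about how mounts are served, not the tool's documented behavior. The dictionary stands in for what `yaml.safe_load` would return for a trimmed copy of the example file.

```python
from typing import Any, Dict, List, Optional

# Trimmed copy of the example configuration, written as a plain dict.
CONFIG: Dict[str, Any] = {
    "mounts": [
        {"name": "image", "path": "/mnt/CFElab",
         "host": "https://mantis.shore.mbari.org", "nginx_root": "/CFElab"},
    ],
    "tator": {
        "image": {"attributes": {"iso_datetime": {"type": "datetime"}}},
        "video": {"attributes": {"iso_start_datetime": {"type": "datetime"}}},
    },
}

# Attributes flagged as required in the example file's comments.
REQUIRED = {"image": "iso_datetime", "video": "iso_start_datetime"}


def missing_required(config: Dict[str, Any]) -> List[str]:
    """List required attributes that are absent from the tator section."""
    tator = config.get("tator", {})
    missing = []
    for media, attr in REQUIRED.items():
        attrs = tator.get(media, {}).get("attributes", {})
        if attr not in attrs:
            missing.append(f"tator.{media}.attributes.{attr}")
    return missing


def mount_url(config: Dict[str, Any], name: str, local_path: str) -> Optional[str]:
    """ASSUMPTION: the mount path prefix is swapped for host + nginx_root."""
    for mount in config.get("mounts", []):
        prefix = mount["path"].rstrip("/")
        inside = local_path == prefix or local_path.startswith(prefix + "/")
        if mount["name"] == name and inside:
            relative = local_path[len(prefix):]
            return mount["host"].rstrip("/") + mount["nginx_root"].rstrip("/") + relative
    return None


if __name__ == "__main__":
    print(missing_required(CONFIG))
    # The file name below is made up for illustration.
    print(mount_url(CONFIG, "image", "/mnt/CFElab/2024/img001.png"))
```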
## 🐳 Docker usage
A Docker image is also available at `mbari/aidata:latest` or `mbari/aidata:latest:cuda-124`.
For example, to download data using the Docker image:
```shell
docker run -it --rm -v $(pwd):/mnt mbari/aidata:latest aidata download --version Baseline --labels "Diatoms, Copepods" --config config_cfe.yml
```
## Commands
* `aidata download --help` - Download data, such as images and boxes, into machine learning formats such as COCO, CIFAR, or PASCAL VOC. Augmentation is supported for VOC-exported data using Albumentations.
* `aidata load --help` - Load data, such as images, boxes, or clusters, into either a Postgres or Redis database
* `aidata db --help` - Commands related to database management
* `aidata transform --help` - Commands related to transforming downloaded data
* `aidata -h` - Print help message and exit.
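When calling the tool from a script, building the argument list explicitly avoids shell-quoting problems with the comma-separated label string. The following is a minimal sketch that uses only the `--version`, `--labels`, and `--config` options shown in the examples above; `build_download_command` is a hypothetical helper, not part of mbari-aidata.

```python
import shlex
from typing import List, Sequence


def build_download_command(version: str, labels: Sequence[str], config: str) -> List[str]:
    """Assemble an `aidata download` argument list suitable for subprocess.run."""
    return [
        "aidata", "download",
        "--version", version,
        # Labels are passed as one comma-separated string, as in the examples above.
        "--labels", ", ".join(labels),
        "--config", config,
    ]


if __name__ == "__main__":
    cmd = build_download_command("Baseline", ["Diatoms", "Copepods"], "config_cfe.yml")
    # Print the shell-safe form for inspection instead of executing it.
    print(shlex.join(cmd))
    # To run it for real: subprocess.run(cmd, check=True)
```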
Source code is available at [github.com/mbari-org/aidata](https://github.com/mbari-org/aidata/).
## Development
See the [Development Guide](https://github.com/mbari-org/aidata/blob/main/DEVELOPMENT.md) for how to set up the development environment, or consult the [justfile](justfile) for the available development commands.
🗓️ Last updated: 2025-06-13
Raw data
{
"_id": null,
"home_page": null,
"name": "mbari-aidata",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.12,>=3.10",
"maintainer_email": null,
"keywords": null,
"author": "Danelle Cline",
"author_email": "dcline@mbari.org",
"download_url": "https://files.pythonhosted.org/packages/56/b2/5f69d787459f6e3e5cf5149c4c52e7631fc4e34df1112aa093258b0d6a2d/mbari_aidata-1.56.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache",
"summary": "Command line tool to do extract, transform, load and download operations on AI data for a number of projects at MBARI that require detection, clustering or classification workflows.",
"version": "1.56.1",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "001b126aee83b297ce4b869c2b1afcb14dfea5d4ca5a692584d9b94b11632bb1",
"md5": "403d595d4df79b4f7a0d6a56d3f869f6",
"sha256": "75b803eb7436af50fd9904f372ec66a86a9fc5cfd0c1d3afbd7e725a56cb490f"
},
"downloads": -1,
"filename": "mbari_aidata-1.56.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "403d595d4df79b4f7a0d6a56d3f869f6",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.12,>=3.10",
"size": 64562,
"upload_time": "2025-08-10T02:12:53",
"upload_time_iso_8601": "2025-08-10T02:12:53.922190Z",
"url": "https://files.pythonhosted.org/packages/00/1b/126aee83b297ce4b869c2b1afcb14dfea5d4ca5a692584d9b94b11632bb1/mbari_aidata-1.56.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "56b25f69d787459f6e3e5cf5149c4c52e7631fc4e34df1112aa093258b0d6a2d",
"md5": "47e3ff8fc0e49cc559739dc9c77453dc",
"sha256": "fd4bc1e230015ddf81a133b2c3e6f0db40414eee58c8bf2aec00ef2b6ce1e334"
},
"downloads": -1,
"filename": "mbari_aidata-1.56.1.tar.gz",
"has_sig": false,
"md5_digest": "47e3ff8fc0e49cc559739dc9c77453dc",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.12,>=3.10",
"size": 47057,
"upload_time": "2025-08-10T02:12:55",
"upload_time_iso_8601": "2025-08-10T02:12:55.306885Z",
"url": "https://files.pythonhosted.org/packages/56/b2/5f69d787459f6e3e5cf5149c4c52e7631fc4e34df1112aa093258b0d6a2d/mbari_aidata-1.56.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-10 02:12:55",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "mbari-aidata"
}