# HCA Smart-Sync
Intelligent S3 data synchronization for HCA Atlas source datasets and integrated objects.
## Installation
```bash
cd smart-sync
poetry install
```
## Usage
```bash
# Basic sync
poetry run hca-smart-sync sync gut-v1 --profile my-profile
# Dry run
poetry run hca-smart-sync sync gut-v1 --profile my-profile --dry-run
# Development environment
poetry run hca-smart-sync sync gut-v1 --profile my-profile --environment dev
```
## Development
```bash
# Install development dependencies
make dev
# Run tests
make test-all
# Run with coverage
make test-cov
# Run linting
make lint
# Format code
make format
```
## Features
- SHA256 checksum-based synchronization
- Manifest-driven uploads
- AWS CLI integration with progress display
- Environment-based bucket selection
- Interactive upload confirmation
- Research-grade data integrity verification
## Configuration
The `--environment` option selects the target S3 bucket:
- `prod` (default): `hca-atlas-tracker-data`
- `dev`: `hca-atlas-tracker-data-dev`
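
As a rough sketch of how this mapping might be expressed in code, the lookup below uses the two bucket names listed above. The names `BUCKETS` and `resolve_bucket` are hypothetical and are not taken from the package's source.

```python
# Bucket names are from the list above; the structure itself is an assumption.
BUCKETS = {
    "prod": "hca-atlas-tracker-data",
    "dev": "hca-atlas-tracker-data-dev",
}


def resolve_bucket(environment: str = "prod") -> str:
    """Hypothetical helper: map an environment name to its S3 bucket."""
    try:
        return BUCKETS[environment]
    except KeyError:
        valid = ", ".join(sorted(BUCKETS))
        raise ValueError(
            f"Unknown environment {environment!r}; expected one of: {valid}"
        ) from None
```

Failing loudly on an unknown environment avoids silently falling back to the production bucket.
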
## Requirements
- Python 3.10+
- AWS CLI configured with appropriate profiles
- Poetry for dependency management