biosample-enricher 0.1.0rc1

- Summary: Infer AI-friendly metadata about biosamples from multiple sources
- Upload time: 2025-10-28 00:21:44
- Requires Python: >=3.11
- Keywords: bioinformatics, biosamples, climate, elevation, enrichment, environmental-data, environmental-science, geocoding, geospatial, marine, metadata, oceanography, soil, weather

# Biosample Enricher

Infer AI-friendly environmental and geographic metadata about biosamples from multiple sources.

[![Python Version](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://python.org)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Code style: ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![Type checked: mypy](https://img.shields.io/badge/type--checked-mypy-blue.svg)](https://mypy-lang.org/)

## Overview

Biosample Enricher provides 8 specialized services for enriching biosample metadata with environmental and geographic information from authoritative data sources. Each service focuses on a specific domain (elevation, soil, weather, marine, land cover, forward geocoding, reverse geocoding, geographic features) and returns structured, type-safe data ready for analysis or AI applications.

## Features

- **8 Specialized Services**: Elevation, soil, weather, marine, land cover, forward/reverse geocoding, geographic features
- **Service-Based Architecture**: Independent services with focused responsibilities
- **Type Safety**: Full type hints with Pydantic validation and mypy checking
- **Smart Caching**: HTTP caching with coordinate canonicalization for efficiency
- **Multiple Providers**: Automatic fallback between data providers (USGS, Google, OSM, etc.)
- **Click-Based CLIs**: User-friendly command-line tools for each service
- **Flexible Installation**: Core services only, or add optional mongodb/metrics/schema extras
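
The "coordinate canonicalization" mentioned under Smart Caching can be pictured roughly as follows. This is a minimal standard-library sketch, not the package's actual implementation: the function names `canonical_coords` and `cache_key`, the 4-decimal precision, and the key layout are all assumptions made for illustration.

```python
import hashlib


def canonical_coords(lat: float, lon: float, precision: int = 4) -> tuple[float, float]:
    """Validate and round a coordinate pair so near-identical inputs compare equal."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    # Adding 0.0 turns a rounded -0.0 into 0.0 so both format identically.
    return round(lat, precision) + 0.0, round(lon, precision) + 0.0


def cache_key(service: str, lat: float, lon: float, precision: int = 4) -> str:
    """Build a stable cache key from a service name and canonical coordinates."""
    c_lat, c_lon = canonical_coords(lat, lon, precision)
    raw = f"{service}:{c_lat:.{precision}f}:{c_lon:.{precision}f}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Both inputs round to the same 4-decimal pair, so they share one cache entry.
    key_a = cache_key("elevation", 40.71280001, -74.00600002)
    key_b = cache_key("elevation", 40.7128, -74.0060)
    print(key_a == key_b)
```

Rounding before hashing means that coordinates differing only by floating-point noise hit the same cached response instead of triggering a new request.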

## Installation

### Prerequisites

- Python 3.11 or higher
- [UV package manager](https://github.com/astral-sh/uv) (recommended)

### Add to Your Project (Recommended)

```bash
# Basic installation - all 8 enrichment services
uv add biosample-enricher

# With optional dependencies
uv add biosample-enricher --extra metrics   # Metrics and visualization
uv add biosample-enricher --extra mongodb   # MongoDB support for NMDC/GOLD
uv add biosample-enricher --extra schema    # Schema analysis tools
uv add biosample-enricher --extra all       # All optional features
```

### From Source (Development)

```bash
# Clone and install
git clone https://github.com/contextualizer-ai/biosample-enricher.git
cd biosample-enricher
uv sync

# With optional extras
uv sync --extra mongodb    # MongoDB support
uv sync --extra metrics    # Metrics and visualization
uv sync --extra schema     # Schema analysis tools
uv sync --extra all        # Everything
```

## Quick Start

### Python API

The package exports all 8 services, along with the `ElevationRequest` class, from the top level:

```python
from biosample_enricher import (
    ElevationService,
    ElevationRequest,
    SoilService,
    WeatherService,
    MarineService,
    LandService,
    ReverseGeocodingService,
    ForwardGeocodingService,
    OSMFeaturesService,
)
from datetime import date

# Get elevation for a location
elevation_service = ElevationService()
request = ElevationRequest(latitude=40.7128, longitude=-74.0060)
observations = elevation_service.get_elevation(request)

for obs in observations:
    if obs.value_numeric is not None:
        print(f"{obs.provider.name}: {obs.value_numeric}m")
# Example output (providers and values may differ):
# usgs_3dep: 13.15m
# google_elevation: 13.26m
# open_topo_data: 25.0m
# osm_elevation: 51.0m

# Get weather data for a location and date
weather_service = WeatherService()
weather_result = weather_service.get_daily_weather(
    lat=37.7749,
    lon=-122.4194,
    target_date=date(2024, 1, 15)
)
print(f"Temperature: {weather_result.temperature.value}°C")
print(f"Precipitation: {weather_result.precipitation.value}mm")

# Get soil properties
soil_service = SoilService()
soil_result = soil_service.enrich_location(
    latitude=40.7128,
    longitude=-74.0060,
    depth_cm="0-5cm"
)
print(f"Provider: {soil_result.provider}")
print(f"Quality score: {soil_result.quality_score}")

# Get marine data (SST, bathymetry, chlorophyll)
marine_service = MarineService()
marine_result = marine_service.get_comprehensive_marine_data(
    latitude=36.6,
    longitude=-121.9,
    target_date=date(2024, 1, 15)
)
if marine_result.sea_surface_temperature:
    print(f"Sea surface temp: {marine_result.sea_surface_temperature.value}°C")
if marine_result.bathymetry:
    print(f"Water depth: {marine_result.bathymetry.value}m")

# Reverse geocoding (coordinates -> place names)
geocoding_service = ReverseGeocodingService()
result = geocoding_service.reverse_geocode(lat=40.7128, lon=-74.0060)
if result:
    print(f"Location: {result.get_formatted_address()}")

# Get nearby geographic features
osm_service = OSMFeaturesService()
features = osm_service.get_features_for_location(
    latitude=37.7749,
    longitude=-122.4194,
    radius_m=1000
)
if features and features.named_features:
    for feature in features.named_features[:5]:
        print(f"{feature.name} ({feature.category}): {feature.distance_km:.2f}km")
```

### CLI Usage

Each service has its own CLI command:

```bash
# Elevation lookup
uv run elevation-lookup lookup --lat 40.7128 --lon -74.0060

# Soil data
uv run soil-enricher lookup --lat 40.7128 --lon -74.0060 --depth 10

# Weather data
uv run weather-enricher lookup --lat 37.7749 --lon -122.4194 --date 2024-01-15

# Marine data
uv run marine-enricher lookup --lat 36.6 --lon -121.9 --date 2024-01-15

# Land cover
uv run land-enricher lookup --lat 40.7128 --lon -74.0060

# Batch processing from CSV
uv run elevation-lookup batch --input samples.csv --lat-col latitude --lon-col longitude

# Version info
uv run biosample-version
```
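
For batch work driven from Python rather than the CLI, the input CSV can be pre-validated before any service is called. The sketch below is an assumption about how one might do that, not a description of what `elevation-lookup batch` does internally; `iter_coordinates` is a hypothetical helper, and the default column names simply mirror the `--lat-col` and `--lon-col` values shown above.

```python
import csv
import io
from collections.abc import Iterator


def iter_coordinates(
    handle, lat_col: str = "latitude", lon_col: str = "longitude"
) -> Iterator[tuple[int, float, float]]:
    """Yield (row_number, lat, lon) for rows with valid coordinates; skip the rest."""
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(csv.DictReader(handle), start=2):
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            continue  # missing column, empty cell, or non-numeric text
        if -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0:
            yield row_number, lat, lon


if __name__ == "__main__":
    # In-memory stand-in for samples.csv; the sample IDs are made up.
    sample_csv = io.StringIO(
        "sample_id,latitude,longitude\n"
        "A,40.7128,-74.0060\n"
        "B,,10\n"
        "C,95,10\n"
        "D,36.6,-121.9\n"
    )
    for row_number, lat, lon in iter_coordinates(sample_csv):
        print(row_number, lat, lon)
```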

## Services

### 1. Elevation Service

Get elevation data from multiple providers (USGS, Google, Open Topo Data).

**Providers**: USGS (US only, free), Google (global, requires API key), Open Topo Data (global, free)

**Python**:
```python
from biosample_enricher import ElevationService, ElevationRequest

service = ElevationService()
request = ElevationRequest(latitude=40.7128, longitude=-74.0060)
observations = service.get_elevation(request)
```

**CLI**:
```bash
uv run elevation-lookup lookup --lat 40.7128 --lon -74.0060
```

### 2. Soil Service

Get soil properties (texture, pH, organic carbon, etc.).

**Providers**: SoilGrids (global coverage), USDA NRCS (US only)

**Python**:
```python
from biosample_enricher import SoilService

service = SoilService()
soil_result = service.enrich_location(
    latitude=40.7128,
    longitude=-74.0060,
    depth_cm="0-5cm"
)
```

**CLI**:
```bash
uv run soil-enricher lookup --lat 40.7128 --lon -74.0060 --depth 10
```
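
Note that the Python example passes the depth as an interval string (`"0-5cm"`), while the CLI example passes a number (`--depth 10`). How the package reconciles the two is not stated here. As an illustration only, the sketch below maps a numeric depth onto the six standard SoilGrids depth intervals; `depth_to_interval` is a hypothetical helper, and the half-open interval convention is an assumption.

```python
# Standard SoilGrids depth intervals in centimetres (top, bottom).
SOILGRIDS_INTERVALS_CM = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]


def depth_to_interval(depth_cm: float) -> str:
    """Map a depth in cm to the label of the interval [top, bottom) that contains it."""
    for top, bottom in SOILGRIDS_INTERVALS_CM:
        if top <= depth_cm < bottom:
            return f"{top}-{bottom}cm"
    raise ValueError(f"depth {depth_cm} cm is outside the 0-200 cm range")


if __name__ == "__main__":
    for depth in (0, 10, 45, 150):
        print(depth, depth_to_interval(depth))
```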

### 3. Weather Service

Get historical weather data (temperature, precipitation, humidity, etc.).

**Providers**: Open-Meteo (free, global), Meteostat (free, global)

**Python**:
```python
from biosample_enricher import WeatherService
from datetime import date

service = WeatherService()
weather_result = service.get_daily_weather(
    lat=37.7749,
    lon=-122.4194,
    target_date=date(2024, 1, 15)
)
```

**CLI**:
```bash
uv run weather-enricher lookup --lat 37.7749 --lon -122.4194 --date 2024-01-15
```

### 4. Marine Service

Get marine data (sea surface temperature, bathymetry, chlorophyll).

**Providers**: NOAA OISST (SST), GEBCO (bathymetry), ESA CCI (chlorophyll)

**Python**:
```python
from biosample_enricher import MarineService
from datetime import date

service = MarineService()
marine_result = service.get_comprehensive_marine_data(
    latitude=36.6,
    longitude=-121.9,
    target_date=date(2024, 1, 15)
)
```

**CLI**:
```bash
uv run marine-enricher lookup --lat 36.6 --lon -121.9 --date 2024-01-15
```

### 5. Land Service

Get land cover classification.

**Providers**: ESA WorldCover, MODIS, NLCD (US only)

**Python**:
```python
from biosample_enricher import LandService

service = LandService()
land_result = service.enrich_location(
    latitude=40.7128,
    longitude=-74.0060
)
```

**CLI**:
```bash
uv run land-enricher lookup --lat 40.7128 --lon -74.0060
```

### 6. Reverse Geocoding Service

Convert coordinates to human-readable addresses.

**Providers**: OSM Nominatim (free), Google Geocoding (requires API key)

**Python**:
```python
from biosample_enricher import ReverseGeocodingService

service = ReverseGeocodingService()
result = service.reverse_geocode(lat=40.7128, lon=-74.0060)
if result:
    print(result.get_formatted_address())
```

### 7. Forward Geocoding Service

Convert addresses/place names to coordinates.

**Providers**: OSM Nominatim (free), Google Geocoding (requires API key)

**Python**:
```python
from biosample_enricher import ForwardGeocodingService

service = ForwardGeocodingService()
result = service.geocode("New York City")
if result and result.locations:
    for location in result.locations[:3]:
        print(f"{location.formatted_address}: {location.latitude}, {location.longitude}")
```

### 8. OSM Features Service

Get nearby geographic features (parks, water bodies, landmarks).

**Providers**: OpenStreetMap Overpass API (free), Google Places (requires API key)

**Python**:
```python
from biosample_enricher import OSMFeaturesService

service = OSMFeaturesService()
features = service.get_features_for_location(
    latitude=37.7749,
    longitude=-122.4194,
    radius_m=1000
)
if features and features.named_features:
    for feature in features.named_features[:5]:
        print(f"{feature.name} ({feature.category})")
```
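
The Quick Start example prints a `distance_km` for each feature. How the package computes that value is not documented here; a common choice for point-to-point distances on the globe is the haversine formula, sketched below. The function name and the mean Earth radius constant are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


if __name__ == "__main__":
    # Distance between two coordinate pairs used elsewhere in this README.
    print(f"{haversine_km(37.7749, -122.4194, 36.6, -121.9):.2f} km")
```

Comparing the result against `radius_m / 1000` is enough to filter features to a search radius.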

## API Keys

API keys are required only for Google providers. These are optional, since free alternatives (such as OSM-based providers) are available for every service:

```bash
# Single API key for all Google services
export GOOGLE_MAIN_API_KEY="your-key-here"
```

All other services are free and require no authentication.

## Development

### Setup

```bash
# Clone repository
git clone https://github.com/contextualizer-ai/biosample-enricher.git
cd biosample-enricher

# Complete development setup
make dev-setup
```

### Testing

```bash
# Run fast tests (excludes network/slow tests)
make test-fast

# Run all tests with coverage
make test-cov

# Run specific test categories
make test-unit          # Unit tests only
make test-integration   # Integration tests
```

### Code Quality

```bash
# Format, lint, type-check, test
make dev-check

# Full CI validation
make check-ci

# Individual checks
make format       # Format with ruff
make lint         # Lint with ruff
make type-check   # Type check with mypy
make dep-check    # Check dependencies with deptry
```

## Project Structure

```
biosample-enricher/
├── biosample_enricher/
│   ├── __init__.py           # Public API exports
│   ├── elevation/            # Elevation service
│   ├── soil/                 # Soil service
│   ├── weather/              # Weather service
│   ├── marine/               # Marine service
│   ├── land/                 # Land cover service
│   ├── reverse_geocoding/    # Reverse geocoding
│   ├── forward_geocoding/    # Forward geocoding
│   ├── osm_features/         # Geographic features
│   ├── models.py             # Core data models
│   ├── http_cache.py         # HTTP caching
│   └── cli*.py               # CLI commands
├── tests/                    # Test suite
├── pyproject.toml            # Project configuration
└── Makefile                  # Development automation
```

## Dependencies

### Core Dependencies
- **Always installed**: pandas, rasterio, meteostat (required for weather aggregation and global soil coverage)
- CLI and data validation: click, pydantic, requests, rich, pyyaml

### Optional Dependencies
- **mongodb**: `pymongo` for fetching from NMDC/GOLD databases (evaluation/demo only)
- **metrics**: `matplotlib`, `seaborn` for visualization
- **schema**: `genson` for schema analysis

Install with: `uv sync --extra mongodb` or `uv sync --extra all`
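
To see which optional extras are usable in the current environment, one option is to probe for the modules each extra provides. This is a standard-library sketch under the assumption that the extras map to the import names listed above; `available_extras` is a hypothetical helper, not a package function.

```python
from importlib.util import find_spec

# Import names taken from the Optional Dependencies list above.
EXTRA_MODULES = {
    "mongodb": ["pymongo"],
    "metrics": ["matplotlib", "seaborn"],
    "schema": ["genson"],
}


def available_extras() -> dict[str, bool]:
    """Report, per extra, whether every module it needs can be found."""
    return {
        extra: all(find_spec(module) is not None for module in modules)
        for extra, modules in EXTRA_MODULES.items()
    }


if __name__ == "__main__":
    for extra, present in available_extras().items():
        print(f"{extra}: {'installed' if present else 'missing'}")
```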

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run checks (`make dev-check`)
5. Commit (`git commit -m 'Add amazing feature'`)
6. Push (`git push origin feature/amazing-feature`)
7. Open a Pull Request

See [CLAUDE.md](CLAUDE.md) for detailed development guidelines.

## License

MIT License - see [LICENSE](LICENSE) file for details.

## Acknowledgments

- Built with [UV](https://github.com/astral-sh/uv) for fast package management
- CLI powered by [Click](https://click.palletsprojects.com/)
- Data validation with [Pydantic](https://pydantic.dev/)
- Console output with [Rich](https://github.com/Textualize/rich)
- Caching with [requests-cache](https://github.com/requests-cache/requests-cache)

## Support

- **Issues**: [GitHub Issues](https://github.com/contextualizer-ai/biosample-enricher/issues)
- **Email**: info@contextualizer.ai

            
