biomapper


- Name: biomapper
- Version: 0.2.0
- Home page: https://github.com/arpanauts/biomapper
- Summary: A unified Python toolkit for biological data harmonization and ontology mapping
- Upload time: 2024-12-12 05:21:36
- Author: Trent Leslie
- Requires Python: <4.0,>=3.11
- License: MIT
- Keywords: bioinformatics, ontology, data mapping, biological data, standardization, harmonization
# biomapper

A unified Python toolkit for biological data harmonization and ontology mapping. `biomapper` provides a single interface for standardizing identifiers and mapping between various biological ontologies, making multi-omic data integration more accessible and reproducible.

## Features

### Core Functionality
- **ID Standardization**: Unified interface for standardizing biological identifiers
- **Ontology Mapping**: Comprehensive ontology mapping using major biological databases
- **Data Validation**: Robust validation of input data and mappings
- **Extensible Architecture**: Easy integration of new data sources and mapping services
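
To make the "unified interface" and "extensible architecture" ideas above concrete, here is a minimal sketch of how a registry of pluggable backends could look. The names `Standardizer`, `UppercaseStandardizer`, and `Registry` are hypothetical and are not part of the `biomapper` API; the toy backend only trims and upper-cases its input, where a real one would call a service such as BridgeDb.

```python
from __future__ import annotations

from typing import Protocol


class Standardizer(Protocol):
    """Hypothetical interface: any object with a standardize() method."""

    def standardize(self, identifiers: list[str]) -> dict[str, str | None]:
        ...


class UppercaseStandardizer:
    """Toy backend (assumption): normalizes whitespace and case only."""

    def standardize(self, identifiers: list[str]) -> dict[str, str | None]:
        # Empty strings map to None to signal "could not standardize".
        return {raw: raw.strip().upper() or None for raw in identifiers}


class Registry:
    """Routes calls to a named backend so callers use a single entry point."""

    def __init__(self) -> None:
        self._backends: dict[str, Standardizer] = {}

    def register(self, name: str, backend: Standardizer) -> None:
        self._backends[name] = backend

    def standardize(self, name: str, identifiers: list[str]) -> dict[str, str | None]:
        # Basic input validation before delegating to the backend.
        if name not in self._backends:
            raise KeyError(f"unknown backend: {name!r}")
        if not all(isinstance(i, str) for i in identifiers):
            raise TypeError("identifiers must be strings")
        return self._backends[name].standardize(identifiers)


registry = Registry()
registry.register("toy", UppercaseStandardizer())
result = registry.standardize("toy", [" p12345 ", ""])
print(result)
```

Adding a new data source in this sketch means writing one class with a `standardize` method and registering it; callers do not change.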

### Supported Systems

#### ID Standardization Tools
- BridgeDb
- RefMet
- RaMP-DB

#### Ontology Mapping Services
- UMLS Metathesaurus
- Ontology Lookup Service (OLS)
- BioPortal

## Installation

### Using pip
```bash
pip install biomapper
```

### Development Setup

1. Install Python 3.11 with pyenv (if not already installed):
```bash
# Install pyenv build dependencies (Debian/Ubuntu; use your distribution's package manager otherwise)
sudo apt-get update
sudo apt-get install -y make build-essential libssl-dev zlib1g-dev \
libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev \
libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl

# Install pyenv
curl https://pyenv.run | bash

# Add to your shell configuration
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc

# Reload shell configuration
source ~/.bashrc

# Install Python 3.11
pyenv install 3.11.7
pyenv local 3.11.7
```

2. Install Poetry (if not already installed):
```bash
curl -sSL https://install.python-poetry.org | python3 -

# Add Poetry to your PATH
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
```

3. Clone and set up the project:
```bash
git clone https://github.com/arpanauts/biomapper.git
cd biomapper

# Install dependencies with Poetry
poetry install
```

## Quick Start

```python
from biomapper import AnalyteMetadata
from biomapper.standardization import BridgeDBHandler, RaMPClient

# Example 1: Using BridgeDb
# Initialize metadata handler
metadata = AnalyteMetadata()

# Create standardization handler
bridge_handler = BridgeDBHandler()

# Process identifiers
results = bridge_handler.standardize(["P12345", "Q67890"])

# Example 2: Using RaMP-DB
# Initialize the RaMP client
ramp_client = RaMPClient()

# Get database versions
versions = ramp_client.get_source_versions()

# Get pathways for metabolites
# Example: Get pathways for Creatine (HMDB0000064)
pathways = ramp_client.get_pathways_from_analytes(["hmdb:HMDB0000064"])
```
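
The Quick Start passes two kinds of identifier strings: bare accessions such as `P12345` and prefixed ones such as `hmdb:HMDB0000064`. As a rough illustration of pre-flight checking before sending identifiers to a remote service, the sketch below splits an optional prefix and checks the accession shape. The helper names `split_curie` and `looks_valid` are hypothetical and not part of `biomapper`; treating un-prefixed input as a UniProt accession is an assumption made for this example, and the library's own validation may differ.

```python
from __future__ import annotations

import re

# UniProt accession pattern as published in the UniProt documentation.
UNIPROT_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
# Current HMDB accessions: "HMDB" followed by seven digits.
HMDB_RE = re.compile(r"HMDB\d{7}")


def split_curie(identifier: str) -> tuple[str | None, str]:
    """Split 'prefix:accession' into its parts; prefix is None when absent."""
    prefix, sep, accession = identifier.partition(":")
    if not sep:
        return None, identifier
    return prefix.lower(), accession


def looks_valid(identifier: str) -> bool:
    """Return True when the accession matches the pattern for its prefix."""
    prefix, accession = split_curie(identifier)
    if prefix == "hmdb":
        return HMDB_RE.fullmatch(accession) is not None
    if prefix in (None, "uniprot"):
        # Assumption for this sketch: bare identifiers are UniProt accessions.
        return UNIPROT_RE.fullmatch(accession) is not None
    return False  # Unknown prefix: this sketch does not guess.


for candidate in ["P12345", "Q67890", "hmdb:HMDB0000064", "hmdb:HMDB64"]:
    print(candidate, looks_valid(candidate))
```

Rejecting malformed identifiers locally avoids spending a network round trip on input that cannot match.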

## Development

### Using Poetry

```bash
# Activate virtual environment (Poetry 1.x; Poetry 2.x moved this command to the poetry-plugin-shell plugin)
poetry shell

# Run a command in the virtual environment
poetry run python script.py

# Add a new dependency
poetry add package-name

# Add a development dependency
poetry add --group dev package-name

# Update dependencies
poetry update

# Show currently installed packages
poetry show

# Build the package
poetry build
```

### Running Tests
```bash
# Run tests
poetry run pytest

# Run tests with coverage
poetry run pytest --cov=biomapper
```

### Code Quality
```bash
# Format code with black
poetry run black .

# Run linting
poetry run flake8 .

# Type checking
poetry run mypy .
```

## Project Structure

```
biomapper/
├── biomapper/           # Main package directory
│   ├── core/           # Core functionality
│   │   ├── metadata.py # Metadata handling
│   │   └── validators.py # Data validation
│   ├── standardization/ # ID standardization components
│   ├── mapping/        # Ontology mapping components
│   ├── utils/          # Utility functions
│   └── schemas/        # Data schemas and models
├── tests/              # Test files
├── docs/               # Documentation
├── scripts/            # Utility scripts
├── pyproject.toml      # Poetry configuration and dependencies
└── poetry.lock        # Lock file for dependencies
```

## License

This project is licensed under the MIT License, as declared in the package metadata; see the [LICENSE](LICENSE) file for details.

## Support

For support, open an issue in the [GitHub issue tracker](https://github.com/arpanauts/biomapper/issues).

## Roadmap

- [ ] Initial release with core functionality
- [ ] Add support for additional ontology services
- [ ] Implement caching layer
- [ ] Add batch processing capabilities
- [ ] Develop REST API interface

## Acknowledgments

- [BridgeDb](https://www.bridgedb.org/)
- [RefMet](https://refmet.metabolomicsworkbench.org/)
- [RaMP-DB](http://rampdb.org/)
- [UMLS](https://www.nlm.nih.gov/research/umls/index.html)
- [OLS](https://www.ebi.ac.uk/ols/index)
- [BioPortal](https://bioportal.bioontology.org/)
            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/arpanauts/biomapper",
    "name": "biomapper",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.11",
    "maintainer_email": null,
    "keywords": "bioinformatics, ontology, data mapping, biological data, standardization, harmonization",
    "author": "Trent Leslie",
    "author_email": "trent.leslie@phenomehealth.org",
    "download_url": "https://files.pythonhosted.org/packages/f8/78/994ca73b45c9894a3ce287a0103c8f870912d674b56e382702380cfac273/biomapper-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A unified Python toolkit for biological data harmonization and ontology mapping",
    "version": "0.2.0",
    "project_urls": {
        "Documentation": "https://github.com/arpanauts/biomapper/blob/main/README.md",
        "Homepage": "https://github.com/arpanauts/biomapper",
        "Repository": "https://github.com/arpanauts/biomapper"
    },
    "split_keywords": [
        "bioinformatics",
        " ontology",
        " data mapping",
        " biological data",
        " standardization",
        " harmonization"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5730309bd43b9620b4f5d151eaa065483b28b530b82d02336d8509a230646cd2",
                "md5": "51e3d775947c79ed80545d30cc0e18bb",
                "sha256": "f415d75bfdf31ec3173ea9f3735fa543787fa3125c56d941f3b2ee5a65366827"
            },
            "downloads": -1,
            "filename": "biomapper-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "51e3d775947c79ed80545d30cc0e18bb",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.11",
            "size": 37041,
            "upload_time": "2024-12-12T05:21:34",
            "upload_time_iso_8601": "2024-12-12T05:21:34.136848Z",
            "url": "https://files.pythonhosted.org/packages/57/30/309bd43b9620b4f5d151eaa065483b28b530b82d02336d8509a230646cd2/biomapper-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f878994ca73b45c9894a3ce287a0103c8f870912d674b56e382702380cfac273",
                "md5": "e68234ee88ed6310c5477160df4f670b",
                "sha256": "862235878917de4b95984bb34ed5d9974b4376624bd6a204bd75535bbc9a496c"
            },
            "downloads": -1,
            "filename": "biomapper-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "e68234ee88ed6310c5477160df4f670b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.11",
            "size": 34304,
            "upload_time": "2024-12-12T05:21:36",
            "upload_time_iso_8601": "2024-12-12T05:21:36.608327Z",
            "url": "https://files.pythonhosted.org/packages/f8/78/994ca73b45c9894a3ce287a0103c8f870912d674b56e382702380cfac273/biomapper-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-12 05:21:36",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "arpanauts",
    "github_project": "biomapper",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "biomapper"
}
        