anthroab


Name: anthroab
Version: 1.1.0
Home page: https://github.com/nagarh/AnthroAb
Summary: AnthroAb: Human antibody language model based on RoBERTa for humanization
Upload time: 2025-08-09 01:53:01
Maintainer: None
Docs URL: None
Author: Hemant Nagar
Requires Python: >=3.10
License: MIT
Keywords: anthroab, antibody humanization, roberta, biophi, antibody design, bioinformatics, protein engineering
Requirements: No requirements were recorded.
# AnthroAb: Antibody Humanization Language Model

```
              █████  ███    ██ ████████ ██   ██ ██████   ██████      █████  ██████  
             ██   ██ ████   ██    ██    ██   ██ ██   ██ ██    ██    ██   ██ ██   ██ 
             ███████ ██ ██  ██    ██    ███████ ██████  ██    ██ ██ ███████ ██████  
             ██   ██ ██  ██ ██    ██    ██   ██ ██   ██ ██    ██    ██   ██ ██   ██ 
             ██   ██ ██   ████    ██    ██   ██ ██   ██  ██████     ██   ██ ██████
```

AnthroAb is a human antibody language model based on RoBERTa, specifically trained for antibody humanization tasks.

## Features

- **Antibody Humanization**: Predict humanized versions of antibody sequences
- **Sequence Infilling**: Fill masked positions with human-like residues
- **Mutation Suggestions**: Suggest humanizing mutations for frameworks and CDRs
- **Embedding Generation**: Create vector representations of residues or sequences
- **Dual Chain Support**: Separate models for Variable Heavy (VH) and Variable Light (VL) chains

## Installation

```bash
# Create a fresh environment and install from PyPI
conda create -n anthroab python=3.10
conda activate anthroab
pip install anthroab

# Or install from source
git clone https://github.com/nagarh/AnthroAb
cd AnthroAb
pip install -e .
```
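
To confirm the install worked, a quick smoke test like the one below can help. It only checks that the package imports and exposes `predict_best_score`, the entry point used in the Quick Start.

```bash
# Smoke test: verify the package imports and exposes the Quick Start entry point
python -c "import anthroab; print(anthroab.predict_best_score)"
```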

## Quick Start

### Antibody Sequence Humanization

```python
import anthroab

# Humanize a heavy chain sequence
vh_sequence = "**QLV*SGVEVKKPGASVKVSCKASGYTFTNYYMYWVRQAPGQGLEWMGGINPSNGGTNFNEKFKNRVTLTTDSSTTTAYMELKSLQFDDTAVYYCARRDYRFDMGFDYWGQGTTVTVSS"
humanized_vh = anthroab.predict_best_score(vh_sequence, 'H')
print(f"Humanized VH: {humanized_vh}")

# Humanize a light chain sequence
vl_sequence = "DIQMTQSPSSLSASV*DRVTITCRASQSISSYLNWYQQKPGKAPKLLIYSASTLASGVPSRFSGSGSGTDF*LTISSLQPEDFATYYCQQSYSTPRTFGQGTKVEIK"
humanized_vl = anthroab.predict_best_score(vl_sequence, 'L')
print(f"Humanized VL: {humanized_vl}")
```
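
### Scoring and Embeddings

The Features section also lists per-position scoring and embedding generation. The sketch below shows how those calls could look if AnthroAb keeps the Sapiens-style helper names implied by the Acknowledgments; `predict_scores` and `predict_sequence_embedding` are assumptions here, so check `help(anthroab)` for what your installed version actually exports.

```python
import anthroab

# Same masked VH sequence as above ('*' marks positions to infill)
vh_sequence = "**QLV*SGVEVKKPGASVKVSCKASGYTFTNYYMYWVRQAPGQGLEWMGGINPSNGGTNFNEKFKNRVTLTTDSSTTTAYMELKSLQFDDTAVYYCARRDYRFDMGFDYWGQGTTVTVSS"

# Hypothetical Sapiens-style helper: per-position amino-acid scores;
# high-scoring substitutions are candidate humanizing mutations.
scores = anthroab.predict_scores(vh_sequence, 'H')

# Hypothetical Sapiens-style helper: fixed-length vector for the whole sequence.
embedding = anthroab.predict_sequence_embedding(vh_sequence, 'H')

print(scores.shape, embedding.shape)
```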

## Model Details

### Architecture
- **Base Model**: RoBERTa (trained from scratch)
- **Architecture**: RobertaForMaskedLM
- **Model Type**: Masked Language Model for antibody sequences

### Model Specifications
- **Hidden Size**: 768
- **Number of Layers**: 12
- **Number of Attention Heads**: 12
- **Intermediate Size**: 3072
- **Max Position Embeddings**: 192 (VH), 145 (VL)
- **Vocabulary Size**: 25 tokens
- **Model Size**: ~164 MB per model

### Available Models
- **VH Model**: `hemantn/roberta-base-humAb-vh` - For Variable Heavy chains
- **VL Model**: `hemantn/roberta-base-humAb-vl` - For Variable Light chains
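
For direct access to the checkpoints, they can also be loaded through Hugging Face `transformers`. This is a minimal sketch, assuming the repositories are public on the Hub and ship with tokenizer files; the `anthroab` package wraps this plumbing for you, so it is only needed for custom workflows.

```python
# Minimal sketch: load the VH checkpoint directly via Hugging Face transformers.
# Assumes the repo is public and includes tokenizer files; exact tokenization of
# amino-acid sequences depends on the checkpoint's tokenizer config.
import torch
from transformers import AutoTokenizer, RobertaForMaskedLM

model_id = "hemantn/roberta-base-humAb-vh"  # use ...-humAb-vl for light chains
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = RobertaForMaskedLM.from_pretrained(model_id)
model.eval()

sequence = "QVQLVQSGAEVKKPGASVKVSCKAS"  # toy VH fragment for illustration
inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # [1, seq_len, vocab] scores over the 25-token vocabulary
print(logits.shape)
```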



## Citation

If you use AnthroAb in your research, please cite:

```bibtex
@misc{anthroab,
  author = {Hemant Nagar},
  title = {AnthroAb: Human Antibody Language Model for Humanization},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/hemantn/roberta-base-humAb-vh}
}
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Acknowledgments

**Note**: This codebase and API design are adapted from the [Sapiens](https://github.com/Merck/Sapiens) model by Merck. AnthroAb maintains the same interface and functionality as Sapiens but uses a RoBERTa-base model trained on human antibody sequences from the OAS database (up to 2025) for antibody humanization.

### Original Sapiens Citation
> David Prihoda, Jad Maamary, Andrew Waight, Veronica Juan, Laurence Fayadat-Dilman, Daniel Svozil & Danny A. Bitton (2022) 
> BioPhi: A platform for antibody design, humanization, and humanness evaluation based on natural antibody repertoires and deep learning, mAbs, 14:1, DOI: https://doi.org/10.1080/19420862.2021.2020203

## Related Projects

- **[Sapiens](https://github.com/Merck/Sapiens)**: Original antibody language model by Merck (this codebase is based on Sapiens)
- **[BioPhi](https://github.com/Merck/BioPhi)**: Antibody design and humanization platform
- **[OAS](https://opig.stats.ox.ac.uk/webapps/oas/)**: Observed Antibody Space database

## Support

For questions, issues, or contributions, please open an issue on the GitHub repository. 

            
