tcrpmhcdataset

Name	tcrpmhcdataset JSON
Version	0.1.2 JSON
	download
home_page	None
Summary	Handling Many-to-many TCR::Peptide-MHC Data
upload_time	2024-11-10 14:46:21
maintainer	None
docs_url	None
author	None
requires_python	>=3.7
license	None
keywords	immunoinformatics t-cell receptor machine learning bioinformatics peptide-mhc tcr immunology data handling
VCS
bugtrack_url
requirements	pandas numpy scikit-learn mhcgnomes tidytcells
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            [![Tests](https://github.com/pirl-unc/TCRpMHCdataset/actions/workflows/tests.yml/badge.svg)](https://github.com/pirl-unc/TCRpMHCdataset/actions/workflows/tests.yml)
[![Coverage Status](https://coveralls.io/repos/github/pirl-unc/TCRpMHCdataset/badge.svg?branch=main)](https://coveralls.io/github/pirl-unc/TCRpMHCdataset?branch=main)
<a href="https://pypi.python.org/pypi/tcrpmhcdataset/">
<img src="https://img.shields.io/pypi/v/tcrpmhcdataset.svg?maxAge=1000" alt="PyPI" />
</a>


# TCRpMHCdataset

TCRpMHCdataset is a library to help handle the many-to-many nature of TCR::pMHC data in a sanity preserving manner. It loads tabular data (.CSV) and converts it into a dataset object that can be indexed to get pairs of TCR and pMHC objects that retain a list of all other cognate -optes. The dataset is designed with the flexibility to be used with minimal changes for most TCR:pMHC related tasks. It comes packaged with a notion of directionality: TCR -> pMHC (De-Orphanization) or pMHC -> TCR (TCR design). For training and evaluating machine learning models, the dataset object implements a `split` function which robustly handles the splitting of the data into balanced train and test splits that can be stratified by the allele frequences. These splits can also explicitly hold out epitopes, epitope::allele combinations, and even TCRs that are present in the test set from the training set. 

## Installation

----------------------------------------------------------------------

### From PyPI
```bash
pip install tcrpmhcdataset
```

### From source

Clone this repository into your working directory and run the standard installation command:
```bash
git clone https://github.com/pirl-unc/TCRpMHCdataset.git
cd TCRpMHCdataset
pip install . 
```

## Usage Examples

----------------------------------------------------------------------

### Loading a Dataset
```python
from tcrpmhcdataset import TCRpMHCdataset
# Define a TCR -> pMHC dataset
deorph_dataset = TCRpMHCdataset(source='tcr', target='pmhc', use_mhc=False, use_pseudo=True, use_cdr3=True, use_both_chains=False)
deorph_dataset.load('test_data/sampled_paired_data_cleaned.csv')

In [1]: print(deorph_dataset)
Out[1]: 'TCR:pMHC Dataset of N=6833. Mode:tcr -> pmhc.'
```

### Indexing an Item
```python
from tcrpmhcdataset import TCRpMHCdataset
# Define a pMHC -> tcr dataset
design_dataset = TCRpMHCdataset(source='pmhc', target='tcr', use_mhc=False, use_pseudo=True, use_cdr3=True, use_both_chains=False)
design_dataset.load('test_data/sampled_paired_data_cleaned.csv')
pmhc, tcr = design_dataset[0]

In [1]: pmhc
Out[1]: pMHC(peptide="LIDFYLCFL", hla_allele="HLA-A*02:01", reference={'MIRA:eEE226', 'MIRA:eEE240', 'MIRA:eOX54', 'MIRA:eEE224', 'MIRA:eXL37', 'MIRA:eOX52', 'MIRA:eOX43', 'MIRA:ePD76', 'MIRA:eHO130', 'MIRA:eQD137', 'MIRA:eXL31', 'MIRA:eHH175', 'MIRA:eOX56', 'MIRA:eXL30', 'MIRA:eXL27'}, use_pseudo=True, use_mhc=False)

In [2]: tcr
Out[2]: TCR(cdr3a="None", cdr3b="CSAQDRTSNEQFF",
                trav="None", trbv="TRBV20-1", 
                traj="None", trbj="TRBJ2-1",
                trad="None", trbd="None",
                tcra_full="None", tcrb_full="MLLLLLLLGPGISLLLPGSLAGSGLGAVVSQHPSWVICKSGTSVKIECRSLDFQATTMFWYRQFPKQSLMLMATSNEGSKATYEQGVEKDKFLINHASLTLSTLTVTSAHPEDSSFYICSAQDRTSNEQFFGPGTRLTVLEDLKNVFPPEVAVFEPSEAEISHTQKATLVCLATGFYPDHVELSWWVNGKEVHSGVSTDPQPLKEQPALNDSRYCLSSRLRVSATFWQNPRNHFRCQVQFYGLSENDEWTQDRAKPVTQIVSAEAWGRADCGFTSESYQQGVLSATILYEILLGKATLYAVLVSALVLMAMVKRKDSRG",
                reference={'MIRA:eEE226'}, use_cdr3b=True)
```

### Splitting a Dataset
```python
from tcrpmhcdataset import TCRpMHCdataset
# Define a TCR -> pMHC dataset
design_dataset = TCRpMHCdataset(source='pmhc', target='tcr', use_mhc=False, use_pseudo=True, use_cdr3=True, use_both_chains=False)
design_dataset.load('test_data/sampled_paired_data_cleaned.csv')

# Split on Epitope and then pull out a validation set
train_dataset, test_dataset = design_dataset.split(test_size=0.2, balance_on_allele=True, split_on=['Epitope'])
train_dataset, val_dataset = train_dataset.split(test_size=0.1, balance_on_allele=True, split_on=['Epitope', 'Allele'])

# Split on pMHC 
train_dataset2, test_dataset2 = design_dataset.split(test_size=0.2, balance_on_allele=True, split_on=['Epitope', 'Allele'])

# Split on TCR
train_dataset3, test_dataset3 = design_dataset.split(test_size=0.2, balance_on_allele=True, split_on=['CDR3b', 'CDR3a'])
```

## Motivation

----------------------------------------------------------------------

T-cell Receptors (TCRs) are highly specific pattern recognition receptors that allow T-cells to recognize non-self molecular motifs. Though necessarily highly specific in order to avoid self-reactivity, the phenomenon of cross-reactivity is required to maintain physiologically manageable numbers of T-cell clones while still ensuring sufficient protection (See [Why must T cells be cross-reactive](https://www.nature.com/articles/nri3279)). While fascinating from a biological perspective, handling the data for this complex many-to-many mapping is another story. This project started off as three abstractions that I wrote for [this project](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=Alnulv8AAAAJ&citation_for_view=Alnulv8AAAAJ:2osOgNQ5qMEC) to help think about TCRs and pMHCs are distinct entities with attributes pertaining to what is known about them and a dataset class that could quickly load data from a tabular format, split them in a balanced and meaningful manner, and be easily indexed for model training or evaluation. After having used these abstractions in a few projects, I reckon that they could be useful to others as well and decided to package them up and release them as a library.


## Modules

----------------------------------------------------------------------

See the [documentation](https://pirl-unc.github.io/TCRpMHCdataset/tcrpmhcdataset.html) for more information.

#### TCRpMHCdataset

The `TCRpMHCdataset` object is the main object that is used to load, index, and split the data. 

#### TCR

The TCR object is a frozen dataclass object that stores information such as the 'CDR3b' sequence, 'CDR3a', 'Vb', 'Jb', etc. but also includes a set of pMHCs that this TCR is reactive against as well as a set of references that support the TCR. 

#### pMHC

Like the TCR's implementation, each pMHC object similarly contains key information including the 'peptide', 'allele' but also includes a set of TCRs that are reactive against it as well as a set of references that support the pMHC. It also computes the full HLA-sequence as well as the pseudosequence and caches these for future reference. 

*MHC Sequence*

Major Histocompatibility Complex (MHC) sequences are mapped using the parsed allele level information using the [IMGT HLA]() database. MHC proteins are sometimes annotated with mutations in relation to a known allele:

- "HLA-B\*08:01 N80I mutant"

If picked up by `mhcgnomes` These are passed in to the pMHC object which makes the necessary changes to the sequence and caches the new sequence for future reference.

*Pseudo Sequence*

Pseudosequences are derived from the full MHC sequence that are predicted to be in contact with the peptide given proximity and a polymorphism based estimator (Introduced in [netMHCpan](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0000796#s4)). All pseudo-sequenes are 34 Amino Acids long and are used in different immuno-informatics pipelines as a reduced representation of the full MHC seqeunce. In this package, the pseudo-sequences are mapped from allele level information, similar to MHC sequences. They are not updated given a mutation, but instead use the canonical allele's pseudo-sequence, if available.

*Allele Imputation strategy*

Given the sparsity of the data, every single datapoint is of critical importance. In the event that only the serotype information was provided (e.g. HLA-A2), an "eager" imputation strategy is included to try and impute a common allele (e.g. HLA-A\*02:01). This is done in a rudimentary manner by guessing the allele field from :01 -> :10 and seeing if there exists an MHC sequence from IMGT that matches the allele. This strategy should ONLY be used if using pseudosequence level information as the pseudosequence often is highly conserved within serotype. 

## Contributing

----------------------------------------------------------------------

Community help is always appreciated. No contribution is too small, and we especially appreciate efficiency and usability improvements such as better documentation, tutorials, tests, or code cleanup. If you're looking for a place to start, check out the issues labeled "good first issue" in the issue tracker.

#### Project scope
The `TCRpMHCdataset`, `TCR`, and `pMHC` classes are designed with object oriented principles in mind to think of these protein complexes as individual units with intra-and interdependent interactions, not rows in a dataframe. To this end, any and all efforts to expand the expressivity of these objects with available data is most certainly within scope. It is our hope to soon incorporate structure level information as well. 

All committed code to `TCRpMHCdataset` should be suitable for regular research use by practioners.

If you are contemplating a large contribution, such as the addition of a new class, modality, or data-structure altogether, please first open an issue on GH (or email us at dkarthikeyan1@unc.edu) to discuss and coordinate the work.

#### Making a contribution
All contributions can be made as pull requests on Github. One of the core developers will review your contribution. As needed the core contributors will also make releases and submit to PyPI.

A few other guidelines:

 * `TCRpMHCdataset` is written for Python3 on Linux and OS X. We can't guarantee support for Windows.
 * All functional modifications should be documented using [numpy-style docstrings](https://numpydoc.readthedocs.io/en/latest/format.html) with corresponding with unit tests.
 * Please use informative commit messages.
 * Bugfixes should be accompanied with test that illustrates the bug when feasible.
 * Contributions are licensed under Apache 2.0
 * All interactions must adhere to the [Contributor Covenant Code of Conduct](https://www.contributor-covenant.org/version/1/4/code-of-conduct/).


## References

- [Conditional Generation of Antigen Specific T-cell Receptor Sequences](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=Alnulv8AAAAJ&citation_for_view=Alnulv8AAAAJ:2osOgNQ5qMEC)
- [Why must T cells be cross-reactive](https://www.nature.com/articles/nri3279)
- [Mhcgnomes](https://github.com/pirl-unc/mhcgnomes)
- [Tidytcells](https://tidytcells.readthedocs.io/en/latest/#)
- [Contributor Covenant Code of Conduct](https://www.contributor-covenant.org/version/1/4/code-of-conduct/)

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "tcrpmhcdataset",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "immunoinformatics, T-cell Receptor, machine learning, bioinformatics, peptide-MHC, TCR, immunology, data handling",
    "author": null,
    "author_email": "Dhuvi Karthikeyan <dkarthikeyan@berkeley.edu>",
    "download_url": "https://files.pythonhosted.org/packages/5e/ad/7a1d3597cb809b811e45440a80742393a8b42392402a4e218b266454ce3b/tcrpmhcdataset-0.1.2.tar.gz",
    "platform": null,
    "description": "[![Tests](https://github.com/pirl-unc/TCRpMHCdataset/actions/workflows/tests.yml/badge.svg)](https://github.com/pirl-unc/TCRpMHCdataset/actions/workflows/tests.yml)\n[![Coverage Status](https://coveralls.io/repos/github/pirl-unc/TCRpMHCdataset/badge.svg?branch=main)](https://coveralls.io/github/pirl-unc/TCRpMHCdataset?branch=main)\n<a href=\"https://pypi.python.org/pypi/tcrpmhcdataset/\">\n<img src=\"https://img.shields.io/pypi/v/tcrpmhcdataset.svg?maxAge=1000\" alt=\"PyPI\" />\n</a>\n\n\n# TCRpMHCdataset\n\nTCRpMHCdataset is a library to help handle the many-to-many nature of TCR::pMHC data in a sanity preserving manner. It loads tabular data (.CSV) and converts it into a dataset object that can be indexed to get pairs of TCR and pMHC objects that retain a list of all other cognate -optes. The dataset is designed with the flexibility to be used with minimal changes for most TCR:pMHC related tasks. It comes packaged with a notion of directionality: TCR -> pMHC (De-Orphanization) or pMHC -> TCR (TCR design). For training and evaluating machine learning models, the dataset object implements a `split` function which robustly handles the splitting of the data into balanced train and test splits that can be stratified by the allele frequences. These splits can also explicitly hold out epitopes, epitope::allele combinations, and even TCRs that are present in the test set from the training set. \n\n## Installation\n\n----------------------------------------------------------------------\n\n### From PyPI\n```bash\npip install tcrpmhcdataset\n```\n\n### From source\n\nClone this repository into your working directory and run the standard installation command:\n```bash\ngit clone https://github.com/pirl-unc/TCRpMHCdataset.git\ncd TCRpMHCdataset\npip install . \n```\n\n## Usage Examples\n\n----------------------------------------------------------------------\n\n### Loading a Dataset\n```python\nfrom tcrpmhcdataset import TCRpMHCdataset\n# Define a TCR -> pMHC dataset\ndeorph_dataset = TCRpMHCdataset(source='tcr', target='pmhc', use_mhc=False, use_pseudo=True, use_cdr3=True, use_both_chains=False)\ndeorph_dataset.load('test_data/sampled_paired_data_cleaned.csv')\n\nIn [1]: print(deorph_dataset)\nOut[1]: 'TCR:pMHC Dataset of N=6833. Mode:tcr -> pmhc.'\n```\n\n### Indexing an Item\n```python\nfrom tcrpmhcdataset import TCRpMHCdataset\n# Define a pMHC -> tcr dataset\ndesign_dataset = TCRpMHCdataset(source='pmhc', target='tcr', use_mhc=False, use_pseudo=True, use_cdr3=True, use_both_chains=False)\ndesign_dataset.load('test_data/sampled_paired_data_cleaned.csv')\npmhc, tcr = design_dataset[0]\n\nIn [1]: pmhc\nOut[1]: pMHC(peptide=\"LIDFYLCFL\", hla_allele=\"HLA-A*02:01\", reference={'MIRA:eEE226', 'MIRA:eEE240', 'MIRA:eOX54', 'MIRA:eEE224', 'MIRA:eXL37', 'MIRA:eOX52', 'MIRA:eOX43', 'MIRA:ePD76', 'MIRA:eHO130', 'MIRA:eQD137', 'MIRA:eXL31', 'MIRA:eHH175', 'MIRA:eOX56', 'MIRA:eXL30', 'MIRA:eXL27'}, use_pseudo=True, use_mhc=False)\n\nIn [2]: tcr\nOut[2]: TCR(cdr3a=\"None\", cdr3b=\"CSAQDRTSNEQFF\",\n                trav=\"None\", trbv=\"TRBV20-1\", \n                traj=\"None\", trbj=\"TRBJ2-1\",\n                trad=\"None\", trbd=\"None\",\n                tcra_full=\"None\", tcrb_full=\"MLLLLLLLGPGISLLLPGSLAGSGLGAVVSQHPSWVICKSGTSVKIECRSLDFQATTMFWYRQFPKQSLMLMATSNEGSKATYEQGVEKDKFLINHASLTLSTLTVTSAHPEDSSFYICSAQDRTSNEQFFGPGTRLTVLEDLKNVFPPEVAVFEPSEAEISHTQKATLVCLATGFYPDHVELSWWVNGKEVHSGVSTDPQPLKEQPALNDSRYCLSSRLRVSATFWQNPRNHFRCQVQFYGLSENDEWTQDRAKPVTQIVSAEAWGRADCGFTSESYQQGVLSATILYEILLGKATLYAVLVSALVLMAMVKRKDSRG\",\n                reference={'MIRA:eEE226'}, use_cdr3b=True)\n```\n\n### Splitting a Dataset\n```python\nfrom tcrpmhcdataset import TCRpMHCdataset\n# Define a TCR -> pMHC dataset\ndesign_dataset = TCRpMHCdataset(source='pmhc', target='tcr', use_mhc=False, use_pseudo=True, use_cdr3=True, use_both_chains=False)\ndesign_dataset.load('test_data/sampled_paired_data_cleaned.csv')\n\n# Split on Epitope and then pull out a validation set\ntrain_dataset, test_dataset = design_dataset.split(test_size=0.2, balance_on_allele=True, split_on=['Epitope'])\ntrain_dataset, val_dataset = train_dataset.split(test_size=0.1, balance_on_allele=True, split_on=['Epitope', 'Allele'])\n\n# Split on pMHC \ntrain_dataset2, test_dataset2 = design_dataset.split(test_size=0.2, balance_on_allele=True, split_on=['Epitope', 'Allele'])\n\n# Split on TCR\ntrain_dataset3, test_dataset3 = design_dataset.split(test_size=0.2, balance_on_allele=True, split_on=['CDR3b', 'CDR3a'])\n```\n\n## Motivation\n\n----------------------------------------------------------------------\n\nT-cell Receptors (TCRs) are highly specific pattern recognition receptors that allow T-cells to recognize non-self molecular motifs. Though necessarily highly specific in order to avoid self-reactivity, the phenomenon of cross-reactivity is required to maintain physiologically manageable numbers of T-cell clones while still ensuring sufficient protection (See [Why must T cells be cross-reactive](https://www.nature.com/articles/nri3279)). While fascinating from a biological perspective, handling the data for this complex many-to-many mapping is another story. This project started off as three abstractions that I wrote for [this project](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=Alnulv8AAAAJ&citation_for_view=Alnulv8AAAAJ:2osOgNQ5qMEC) to help think about TCRs and pMHCs are distinct entities with attributes pertaining to what is known about them and a dataset class that could quickly load data from a tabular format, split them in a balanced and meaningful manner, and be easily indexed for model training or evaluation. After having used these abstractions in a few projects, I reckon that they could be useful to others as well and decided to package them up and release them as a library.\n\n\n## Modules\n\n----------------------------------------------------------------------\n\nSee the [documentation](https://pirl-unc.github.io/TCRpMHCdataset/tcrpmhcdataset.html) for more information.\n\n#### TCRpMHCdataset\n\nThe `TCRpMHCdataset` object is the main object that is used to load, index, and split the data. \n\n#### TCR\n\nThe TCR object is a frozen dataclass object that stores information such as the 'CDR3b' sequence, 'CDR3a', 'Vb', 'Jb', etc. but also includes a set of pMHCs that this TCR is reactive against as well as a set of references that support the TCR. \n\n#### pMHC\n\nLike the TCR's implementation, each pMHC object similarly contains key information including the 'peptide', 'allele' but also includes a set of TCRs that are reactive against it as well as a set of references that support the pMHC. It also computes the full HLA-sequence as well as the pseudosequence and caches these for future reference. \n\n*MHC Sequence*\n\nMajor Histocompatibility Complex (MHC) sequences are mapped using the parsed allele level information using the [IMGT HLA]() database. MHC proteins are sometimes annotated with mutations in relation to a known allele:\n\n- \"HLA-B\\*08:01 N80I mutant\"\n\nIf picked up by `mhcgnomes` These are passed in to the pMHC object which makes the necessary changes to the sequence and caches the new sequence for future reference.\n\n*Pseudo Sequence*\n\nPseudosequences are derived from the full MHC sequence that are predicted to be in contact with the peptide given proximity and a polymorphism based estimator (Introduced in [netMHCpan](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0000796#s4)). All pseudo-sequenes are 34 Amino Acids long and are used in different immuno-informatics pipelines as a reduced representation of the full MHC seqeunce. In this package, the pseudo-sequences are mapped from allele level information, similar to MHC sequences. They are not updated given a mutation, but instead use the canonical allele's pseudo-sequence, if available.\n\n*Allele Imputation strategy*\n\nGiven the sparsity of the data, every single datapoint is of critical importance. In the event that only the serotype information was provided (e.g. HLA-A2), an \"eager\" imputation strategy is included to try and impute a common allele (e.g. HLA-A\\*02:01). This is done in a rudimentary manner by guessing the allele field from :01 -> :10 and seeing if there exists an MHC sequence from IMGT that matches the allele. This strategy should ONLY be used if using pseudosequence level information as the pseudosequence often is highly conserved within serotype. \n\n## Contributing\n\n----------------------------------------------------------------------\n\nCommunity help is always appreciated. No contribution is too small, and we especially appreciate efficiency and usability improvements such as better documentation, tutorials, tests, or code cleanup. If you're looking for a place to start, check out the issues labeled \"good first issue\" in the issue tracker.\n\n#### Project scope\nThe `TCRpMHCdataset`, `TCR`, and `pMHC` classes are designed with object oriented principles in mind to think of these protein complexes as individual units with intra-and interdependent interactions, not rows in a dataframe. To this end, any and all efforts to expand the expressivity of these objects with available data is most certainly within scope. It is our hope to soon incorporate structure level information as well. \n\nAll committed code to `TCRpMHCdataset` should be suitable for regular research use by practioners.\n\nIf you are contemplating a large contribution, such as the addition of a new class, modality, or data-structure altogether, please first open an issue on GH (or email us at dkarthikeyan1@unc.edu) to discuss and coordinate the work.\n\n#### Making a contribution\nAll contributions can be made as pull requests on Github. One of the core developers will review your contribution. As needed the core contributors will also make releases and submit to PyPI.\n\nA few other guidelines:\n\n * `TCRpMHCdataset` is written for Python3 on Linux and OS X. We can't guarantee support for Windows.\n * All functional modifications should be documented using [numpy-style docstrings](https://numpydoc.readthedocs.io/en/latest/format.html) with corresponding with unit tests.\n * Please use informative commit messages.\n * Bugfixes should be accompanied with test that illustrates the bug when feasible.\n * Contributions are licensed under Apache 2.0\n * All interactions must adhere to the [Contributor Covenant Code of Conduct](https://www.contributor-covenant.org/version/1/4/code-of-conduct/).\n\n\n## References\n\n- [Conditional Generation of Antigen Specific T-cell Receptor Sequences](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=Alnulv8AAAAJ&citation_for_view=Alnulv8AAAAJ:2osOgNQ5qMEC)\n- [Why must T cells be cross-reactive](https://www.nature.com/articles/nri3279)\n- [Mhcgnomes](https://github.com/pirl-unc/mhcgnomes)\n- [Tidytcells](https://tidytcells.readthedocs.io/en/latest/#)\n- [Contributor Covenant Code of Conduct](https://www.contributor-covenant.org/version/1/4/code-of-conduct/)\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Handling Many-to-many TCR::Peptide-MHC Data",
    "version": "0.1.2",
    "project_urls": {
        "Bug Tracker": "https://github.com/pirl-unc/TCRpMHCdataset/issues",
        "Documentation": "https://pirl-unc.github.io/TCRpMHCdataset/tcrpmhcdataset.html",
        "Homepage": "https://github.com/pirl-unc/TCRpMHCdataset"
    },
    "split_keywords": [
        "immunoinformatics",
        " t-cell receptor",
        " machine learning",
        " bioinformatics",
        " peptide-mhc",
        " tcr",
        " immunology",
        " data handling"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d0d0144631d73e128024599468dc5c2aa30a01b2f3e8ba308ba395598f871818",
                "md5": "c7f26b93015089c8efd3d9cd887405ce",
                "sha256": "572ae73245318a099210d7278200318aa8f208b147530322d900e19f62a64864"
            },
            "downloads": -1,
            "filename": "tcrpmhcdataset-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "c7f26b93015089c8efd3d9cd887405ce",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 158572,
            "upload_time": "2024-11-10T14:46:20",
            "upload_time_iso_8601": "2024-11-10T14:46:20.090324Z",
            "url": "https://files.pythonhosted.org/packages/d0/d0/144631d73e128024599468dc5c2aa30a01b2f3e8ba308ba395598f871818/tcrpmhcdataset-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5ead7a1d3597cb809b811e45440a80742393a8b42392402a4e218b266454ce3b",
                "md5": "e3bd85b8a3e19754db9c1d6b0826cf6c",
                "sha256": "d45425fd100d0972f4b00559218a504d013ce687c8242e20cefb396e3dc95b1e"
            },
            "downloads": -1,
            "filename": "tcrpmhcdataset-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "e3bd85b8a3e19754db9c1d6b0826cf6c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 155014,
            "upload_time": "2024-11-10T14:46:21",
            "upload_time_iso_8601": "2024-11-10T14:46:21.422483Z",
            "url": "https://files.pythonhosted.org/packages/5e/ad/7a1d3597cb809b811e45440a80742393a8b42392402a4e218b266454ce3b/tcrpmhcdataset-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-10 14:46:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "pirl-unc",
    "github_project": "TCRpMHCdataset",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "pandas",
            "specs": [
                [
                    "<",
                    "3.0.0"
                ],
                [
                    ">=",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ],
                [
                    "<",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ],
                [
                    "<",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "mhcgnomes",
            "specs": [
                [
                    "<",
                    "2.0.0"
                ],
                [
                    ">=",
                    "1.5.0"
                ]
            ]
        },
        {
            "name": "tidytcells",
            "specs": [
                [
                    "<",
                    "3.0.0"
                ],
                [
                    ">=",
                    "2.0.0"
                ]
            ]
        }
    ],
    "lcname": "tcrpmhcdataset"
}

None