xt-neighbor-cpu


| Field | Value |
| --- | --- |
| Name | xt-neighbor-cpu |
| Version | 0.0.7 |
| Summary | A small example package |
| Upload time | 2024-09-02 16:57:02 |
| Requires Python | >=3.8 |
| Home page | None |
| Maintainer | None |
| Docs URL | None |
| Author | None |
| License | None |
| Requirements | No requirements were recorded. |

# XT-neighbor-cpu

## Description
This is a Python wrapper package for the SymDel algorithm, which finds nearest neighbors of AIRR sequences in immunological applications. It supports both CLI and Python API usage. The algorithm is described in the [XTNeighbor paper](https://arxiv.org/abs/2403.09010), and its actual implementation lives in the [Pyrepseq package](https://github.com/andim/pyrepseq).
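
To give a feel for why a deletion-based search can avoid comparing every pair, here is a minimal sketch of the general symmetric-deletion idea that the name SymDel suggests. It is not the package's or Pyrepseq's implementation, and all function names below are hypothetical. It relies on one fact: two strings within edit distance `k` always share at least one string that can be obtained from each of them by deleting at most `k` characters, so only sequences that share such a variant need to be verified.

```python
from collections import defaultdict
from itertools import combinations

def deletion_variants(seq, max_edits):
    """All strings reachable from seq by deleting at most max_edits characters."""
    variants = {seq}
    frontier = {seq}
    for _ in range(max_edits):
        # delete one more character from every string produced so far
        frontier = {s[:k] + s[k + 1:] for s in frontier for k in range(len(s))}
        variants |= frontier
    return variants

def levenshtein(a, b):
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]

def symdel_sketch(seqs, max_edits=1):
    """Illustrative only: bucket by deletion variant, then verify candidates."""
    buckets = defaultdict(set)
    for idx, seq in enumerate(seqs):
        for variant in deletion_variants(seq, max_edits):
            buckets[variant].add(idx)
    # Sharing a variant is necessary but not sufficient, so collect candidates...
    candidates = set()
    for indices in buckets.values():
        candidates.update(combinations(sorted(indices), 2))
    # ...and confirm each one with the true edit distance.
    result = []
    for i, j in sorted(candidates):
        d = levenshtein(seqs[i], seqs[j])
        if d <= max_edits:
            result.append((i, j, d))
    return result

print(symdel_sketch(['CAA', 'CAD', 'CDA', 'CKK'], 1))
# [(0, 1, 1), (0, 2, 1)]
```

In this toy input, `CAD` and `CDA` share the variants `CA` and `CD`, so they become a candidate pair, but verification rejects them because their edit distance is 2.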

## Installation
```bash
pip install xt-neighbor-cpu
```

## Library Usage
```python
from xt_neighbor_cpu import nearest_neighbor

seqs = ['CAA', 'CAD', 'CDA', 'CKK']
distance_threshold = 1
result = nearest_neighbor(seqs, distance_threshold)
# returns [(0, 1, 1), (0, 2, 1)]
# where each triplet (i, j, d) holds the indices i and j of two sequences and their edit distance d.
```
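
The triplets are easy to post-process. As a minimal sketch, the result from the example above can be turned into a per-sequence neighbor map. The helper `group_neighbors` is hypothetical and not part of the package, and the triplets are copied from the example rather than computed. The sketch assumes each pair appears once in the result, as it does in the example.

```python
from collections import defaultdict

def group_neighbors(triplets, seqs):
    """Map each sequence to the (neighbor, distance) pairs found for it."""
    neighbors = defaultdict(list)
    for i, j, d in triplets:
        # record the pair in both directions so either sequence can be looked up
        neighbors[seqs[i]].append((seqs[j], d))
        neighbors[seqs[j]].append((seqs[i], d))
    return dict(neighbors)

seqs = ['CAA', 'CAD', 'CDA', 'CKK']
triplets = [(0, 1, 1), (0, 2, 1)]  # copied from the example above
grouped = group_neighbors(triplets, seqs)
print(grouped)
# {'CAA': [('CAD', 1), ('CDA', 1)], 'CAD': [('CAA', 1)], 'CDA': [('CAA', 1)]}
```

Sequences without any neighbor, such as `CKK` here, simply do not appear in the map.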


## Library Documentation
```python
    """
    List all neighboring sequences efficiently within the given distance using the SymDel algorithm.
    That is, given a list of AIRR sequences and an edit distance threshold, find all pairs of sequences whose edit distance is smaller than or equal to the threshold.

    If seqs2 is not provided, every sequence is compared against every other sequence, resulting in N(seqs)**2 combinations.
    Otherwise, seqs are compared against seqs2, resulting in N(seqs)*N(seqs2) combinations.

    For more information, see https://arxiv.org/abs/2403.09010.

    Parameters
    ----------
    seqs : iterable of strings
        list of CDR3B sequences
    max_edits : int
        maximum edit distance defining the neighbors
    max_returns : int or None
        maximum neighbor size
    custom_distance : Function(str1, str2) or "hamming"
        custom distance function to use, must satisfy the 4 properties of a distance (https://en.wikipedia.org/wiki/Distance#Mathematical_formalization)
    max_custom_distance : float
        maximum distance to include in the result, ignored if custom distance is not supplied
    seq2 : iterable of strings or None
        another list of CDR3B sequences to compare against
    progress : bool
        show progress bar

    Returns
    -------
    neighbors : array of 3-tuples
        neighbors along with their edit distances in the format [(x_index, y_index, edit_distance)]
    """
```
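
To make the pairing semantics and the shape of a custom distance concrete, here is a naive reference sketch. It does not call the package: `hamming` and `naive_neighbors` are hypothetical helpers, and the loop checks every pair directly instead of using SymDel. Treating sequences of different lengths as infinitely far apart under Hamming distance is an assumption of this sketch, as is reporting each unordered pair only once when no second list is given.

```python
from itertools import combinations, product

def hamming(a, b):
    """Number of mismatching positions; unequal lengths count as infinitely far (assumption)."""
    if len(a) != len(b):
        return float('inf')
    return sum(x != y for x, y in zip(a, b))

def naive_neighbors(seqs, distance, max_distance, seqs2=None):
    """Check every pair with the given distance function and keep the close ones."""
    if seqs2 is None:
        # single list: every sequence against every other, each unordered pair once
        pairs = combinations(range(len(seqs)), 2)
        other = seqs
    else:
        # two lists: i indexes seqs, j indexes seqs2
        pairs = product(range(len(seqs)), range(len(seqs2)))
        other = seqs2
    result = []
    for i, j in pairs:
        d = distance(seqs[i], other[j])
        if d <= max_distance:
            result.append((i, j, d))
    return result

print(naive_neighbors(['CAA', 'CAD', 'CDA', 'CKK'], hamming, 1))
# [(0, 1, 1), (0, 2, 1)]
print(naive_neighbors(['CAA'], hamming, 1, seqs2=['CAD', 'CKK', 'CA']))
# [(0, 0, 1)]
```

A brute-force loop like this is mainly useful as a cross-check on small inputs, since its cost grows with the number of pairs.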

## Command Line Usage
```bash
echo "Complementary Commands ===="
python -m xt_neighbor_cpu --help
python -m xt_neighbor_cpu --version

echo "Basic Usage ===="
python -m xt_neighbor_cpu -i dummy_input.txt -o output1.txt
python -m xt_neighbor_cpu -i dummy_input.txt -d 2

echo "AIRR Mode ===="
python -m xt_neighbor_cpu -a -i dummy_input_airr.tsv

echo "Comparison Mode ===="
python -m xt_neighbor_cpu -a -i dummy_input_airr.tsv -I dummy_input_airr.tsv -o output2.txt

echo "Hamming Distance Mode ===="
python -m xt_neighbor_cpu -a -i dummy_input_airr.tsv -m hamming
python -m xt_neighbor_cpu -a -i dummy_input_airr.tsv -m hamming -d 2

```
See the `test` folder for more information.
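
For trying out AIRR mode without an existing dataset, a small input file can be generated with the standard library. This is a sketch under assumptions: per the CLI help, the relevant fields are `cdr3_aa` and `duplicate_count`, but whether the tool requires other AIRR columns is not stated here. The helper `write_airr_tsv`, the file name, and the two sequences are made up for illustration.

```python
import csv
import os
import tempfile

def write_airr_tsv(path, clonotypes):
    """Write (cdr3_aa, duplicate_count) pairs as a tab-separated file with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["cdr3_aa", "duplicate_count"])
        writer.writerows(clonotypes)

# Hypothetical file name and sequences, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "my_input_airr.tsv")
write_airr_tsv(path, [("CASSLGQAYEQYF", 3), ("CASSLGQAYEQFF", 1)])

# Read the file back to confirm its layout.
with open(path, newline="") as handle:
    rows = list(csv.reader(handle, delimiter="\t"))
print(rows)
```

The resulting path could then be passed to `-i` together with `-a`.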

## Command Line Documentation
```bash
usage: xt_neighbor_cpu [-h] [-d DISTANCE] [-o OUTPUT_PATH] [-m {leven,hamming}] [-v] [-V] [-a] -i INPUT_PATH [-I QUERY_INPUT_PATH]

Perform nearest neighbor search for AIR sequences with the given distance threshold using CPU-based SymDel algorithm

optional arguments:
  -h, --help            show this help message and exit
  -d DISTANCE, --distance DISTANCE
                        distance threshold defining the neighbor (default to 1)
  -o OUTPUT_PATH, --output-path OUTPUT_PATH
                        path of the output file (default to no output)
  -m {leven,hamming}, --measurement {leven,hamming}
                        distance measurement (default to leven)
  -v, --version         print the version of the program then exit
  -V, --verbose         print extra detail as the program runs for debugging purpose
  -a, --airr            use AIRR format for input-path instead. Relevant fields are cdr3_aa and duplicate_count
  -i INPUT_PATH, --input-path INPUT_PATH
                        path of csv input file. It should contain exactly 1 column (AIR sequences) or AIRR-compatible format with
                        -a mode
  -I QUERY_INPUT_PATH, --query-input-path QUERY_INPUT_PATH
                        path of the second csv input file for comparison mode with the same format as -i mode. With this argument,
                        the returning triplets (i,j,d) would have i referencing the first inputs and j referencing the second
                        inputs
```
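
When the CLI is driven from a script, it can help to assemble the argument list in one place. The sketch below only builds the command from the flags documented above and does not run it. The helper `build_command` is hypothetical, and passing `-d` and `-m` explicitly even at their default values is a choice of this sketch.

```python
import sys

def build_command(input_path, distance=1, measurement="leven",
                  output_path=None, query_input_path=None, airr=False):
    """Assemble an argument list for `python -m xt_neighbor_cpu` from documented flags."""
    if measurement not in ("leven", "hamming"):
        raise ValueError("measurement must be 'leven' or 'hamming'")
    cmd = [sys.executable, "-m", "xt_neighbor_cpu",
           "-i", str(input_path),
           "-d", str(distance),
           "-m", measurement]
    if airr:
        cmd.append("-a")                       # AIRR-formatted input
    if output_path is not None:
        cmd += ["-o", str(output_path)]        # write results to a file
    if query_input_path is not None:
        cmd += ["-I", str(query_input_path)]   # comparison mode
    return cmd

cmd = build_command("dummy_input_airr.tsv", distance=2,
                    measurement="hamming", airr=True)
print(cmd[1:])
# ['-m', 'xt_neighbor_cpu', '-i', 'dummy_input_airr.tsv', '-d', '2', '-m', 'hamming', '-a']

# To actually run it (requires the package to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```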
            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "xt-neighbor-cpu",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": null,
    "author": null,
    "author_email": "heartnetkung <heartnetkung@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/9c/f2/f5b406956807dc1eda046bf809a95a24cc7245bab112b5cf3d01b34e9270/xt_neighbor_cpu-0.0.7.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A small example package",
    "version": "0.0.7",
    "project_urls": {
        "Homepage": "https://github.com/heartnetkung/XT-neighbor-cpu",
        "Issues": "https://github.com/heartnetkung/XT-neighbor-cpu/issues"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "14b86a5f9afdca07455e829b5f4cbaacd9e39e9792bc42dacbc175c3cfc83dee",
                "md5": "ff3479b36984e2729657744038a5b420",
                "sha256": "df824af8e764980a9e516ce26cc28798d1912ecc2410500d3cc7afb0525ce72c"
            },
            "downloads": -1,
            "filename": "xt_neighbor_cpu-0.0.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ff3479b36984e2729657744038a5b420",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 6168,
            "upload_time": "2024-09-02T16:56:21",
            "upload_time_iso_8601": "2024-09-02T16:56:21.434974Z",
            "url": "https://files.pythonhosted.org/packages/14/b8/6a5f9afdca07455e829b5f4cbaacd9e39e9792bc42dacbc175c3cfc83dee/xt_neighbor_cpu-0.0.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9cf2f5b406956807dc1eda046bf809a95a24cc7245bab112b5cf3d01b34e9270",
                "md5": "e141d79ff49ff582a9517853525c78b5",
                "sha256": "cef8041026534be7b55715b9a82a7243f85587160ebb796ea3514d2a0de8dde2"
            },
            "downloads": -1,
            "filename": "xt_neighbor_cpu-0.0.7.tar.gz",
            "has_sig": false,
            "md5_digest": "e141d79ff49ff582a9517853525c78b5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 84996449,
            "upload_time": "2024-09-02T16:57:02",
            "upload_time_iso_8601": "2024-09-02T16:57:02.946501Z",
            "url": "https://files.pythonhosted.org/packages/9c/f2/f5b406956807dc1eda046bf809a95a24cc7245bab112b5cf3d01b34e9270/xt_neighbor_cpu-0.0.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-02 16:57:02",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "heartnetkung",
    "github_project": "XT-neighbor-cpu",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "xt-neighbor-cpu"
}
        