sarand

Name	sarand
Version	1.1.1
home_page	https://github.com/beiko-lab/sarand
Summary	Tool to extract the neighborhood of the target Antimicrobial Resistance (AMR) genes from the assembly graph.
upload_time	2025-07-29 16:51:04
maintainer	None
docs_url	None
author	Somayeh Kafaie
requires_python	>=3.7
license	GPLv3
keywords	metagenomic assembly graph antimicrobial resistance context extraction
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# Sarand

![sarand](sarand/docs/sarand.png)

Sarand is a tool to identify genes within an assembly graph and extract the local graph neighbourhood.
It has primarily been developed for the analysis of Antimicrobial Resistance (AMR) genes within metagenomic assembly graphs.
The [CARD](https://card.mcmaster.ca) database is the default set of target genes whose neighborhoods are extracted, but Sarand supports any user-supplied nucleotide FASTA file of target genes.
<!--- Currently this is fixed to using the [CARD](card.mcmaster.ca) database but will be expanded in the near future to support any user-supplied nucleotide fasta file of target genes.-->


![sarand overview](sarand/docs/sarand_summary.png)

## 1. Installation

Sarand can be run using a conda environment or in a container (Docker or Singularity) and requires 4 key dependencies:

- [Bakta](https://github.com/oschwengers/bakta)
- [RGI](https://github.com/arpcard/rgi)
- [BLAST+](https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastDocs&DOC_TYPE=Download)
- [Bandage](https://github.com/rrwick/Bandage) or [GraphAligner](https://github.com/maickrau/GraphAligner)

### 1a. Docker

This is the easiest way to run Sarand. The `-v` argument maps a host directory into the Docker container.
In the command below, replace `/host/path` with the host directory containing your input GFA and `/container/path` with the path at which that directory should appear inside the container.
Note that the output is also written to this directory.

The simplest approach is to map `/host/path` and `/container/path` to the same directory so that paths stay consistent.

```shell
docker run -v /host/path:/container/path -it beiko-lab/sarand:1.1.1 -i /container/path/input.gfa -o /container/path/output
```
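One way to keep host and container paths identical is to mount the input directory at the same absolute path inside the container. The following is a minimal sketch, not part of Sarand: the file name `input.gfa` and the variable names are placeholders, and the command is only printed so that you can inspect it before running it.

```shell
# Hypothetical input location; replace with the path to your own GFA file.
input_gfa="./input.gfa"

# Resolve the absolute directory containing the GFA so that the same path
# can be used on the host and inside the container.
data_dir="$(cd "$(dirname "$input_gfa")" && pwd)"
gfa_name="$(basename "$input_gfa")"

# Build the command from the Docker example above, mounting data_dir onto itself.
cmd="docker run -v $data_dir:$data_dir -it beiko-lab/sarand:1.1.1 -i $data_dir/$gfa_name -o $data_dir/output"

# Print for inspection. To run it: eval "$cmd" (assumes paths without spaces).
echo "$cmd"
```

Because the mount source and target are the same, the output directory reported inside the container is also valid on the host.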

### 1b. Singularity

Because Singularity maps paths automatically, you can simply run it as follows:

```shell
singularity run docker://beiko-lab/sarand:1.1.1 -i input.gfa -o output
```


### 1c. Conda

As there are dependency conflicts between the tools used by sarand, you will need to create multiple conda environments.

**Creating environments:**

```shell
# 1. Create the sarand environment
conda create -n sarand-1.1.1 -c conda-forge -c bioconda -y blast=2.14.0 dna_features_viewer=3.1.2 numpy matplotlib-base gfapy=1.2.3 cd-hit=4.6.8 networkx gzip pandas python pillow biopython

# 2. Create the bakta environment
conda create -n bakta-1.8.1 -c conda-forge -c bioconda -y bakta=1.8.1

# 3.a. Create the Bandage environment
conda create -n bandage-0.8.1 -c conda-forge -c bioconda -c defaults -y bandage=0.8.1

# 3.b. Create the GraphAligner environment
conda create -n graphaligner-1.0.17b -c conda-forge -c bioconda -y graphaligner=1.0.17b

# 4. Create the RGI environment
conda create -n rgi-5.2.0 -c conda-forge -c bioconda -c defaults -y rgi=5.2.0
```
Please note that step 3.b is not required if you run the default version of Sarand, which uses Bandage for sequence alignment in the assembly graph. However, if you prefer to use GraphAligner, make sure to run command 3.b to install it.

**Downloading and updating the Bakta database:**

```shell
cd /tmp
wget https://zenodo.org/record/7669534/files/db-light.tar.gz
tar -xzvf db-light.tar.gz
rm db-light.tar.gz

# Note: Here you will need to specify a path to keep the Bakta database
# This example uses /db/bakta but you can use any path you like
mkdir -p /db/bakta
mv db-light /db/bakta
conda run -n bakta-1.8.1 amrfinder_update --force_update --database /db/bakta/db-light/amrfinderplus-db
```

**Configuring conda environments:**

Here you will specify environment variables that are specific to the `sarand-1.1.1` environment.
These will be used automatically whenever the environment is active.

```shell
conda activate sarand-1.1.1
conda env config vars set CONDA_BAKTA_NAME=bakta-1.8.1
conda env config vars set CONDA_BANDAGE_NAME=bandage-0.8.1
conda env config vars set CONDA_RGI_NAME=rgi-5.2.0
conda env config vars set BAKTA_DB=/db/bakta/db-light
# Note1: Only run the following command if you have created the graphaligner-1.0.17b conda environment in step 3.b above.
conda env config vars set CONDA_GRAPH_ALIGNER_NAME=graphaligner-1.0.17b

# Note2: Here you can specify an alternate exe (e.g. micromamba, mamba).
conda env config vars set CONDA_EXE_NAME=conda
```
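Variables set with `conda env config vars set` only become visible after the environment is re-activated. As a quick sanity check, the sketch below reports whether the four core variables from the block above are present in the current shell. The function name `check_sarand_env` is hypothetical and not part of Sarand, and `CONDA_GRAPH_ALIGNER_NAME` and `CONDA_EXE_NAME` are left out of the check.

```shell
# Report which of the variables configured above are visible in the current shell.
# Returns 0 if all are set and non-empty, 1 otherwise.
check_sarand_env() {
    missing=0
    for name in CONDA_BAKTA_NAME CONDA_BANDAGE_NAME CONDA_RGI_NAME BAKTA_DB; do
        if value="$(printenv "$name")" && [ -n "$value" ]; then
            echo "ok       $name=$value"
        else
            echo "missing  $name"
            missing=1
        fi
    done
    return "$missing"
}

# Re-activate the environment first so that newly set variables are loaded:
#   conda deactivate && conda activate sarand-1.1.1
check_sarand_env || echo "Some variables are not set; see the commands above."
```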

**Installing sarand:**

```shell
conda activate sarand-1.1.1
# python -m pip install sarand==1.1.1
pip install sarand==1.1.1
```

## 2. Testing

You can test that your installation works by running the test script via `bash test/test.sh`.
This will execute sarand on a test dataset (using the following command) and check that all the expected outputs are created correctly.

    sarand -i test/spade_output/assembly_graph_with_scaffolds.gfa -o test/test_output -a metaspades -k 55



## 3. Usage

All of sarand's parameters can be set using the command line flags.
The only required input file is an assembly graph in `.gfa` format.

This can be generated using metagenomic (or genomic) de-novo assembly tools
such as [metaSPAdes](https://github.com/ablab/spades) or [megahit](https://github.com/voutcn/megahit).
If your chosen assembly tool generates a `fastg` formatted graph, utilities such as `fastg2gfa` can be used to convert it.
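Before running Sarand, it can be worth confirming that the input really is a GFA graph with both segments and links. The sketch below is an illustration only: it writes a tiny hypothetical GFA to a temporary file and counts its segment (`S`) and link (`L`) records with `awk`. To check your own graph, point `gfa` at that file and skip the `printf` and `rm` lines.

```shell
# Tiny hypothetical GFA written to a temporary file, for illustration only.
gfa="$(mktemp)"
printf 'H\tVN:Z:1.0\n' > "$gfa"
printf 'S\t1\tACGT\n' >> "$gfa"
printf 'S\t2\tGGCA\n' >> "$gfa"
printf 'L\t1\t+\t2\t+\t0M\n' >> "$gfa"

# Count segment (S) and link (L) records; GFA fields are tab-separated.
segments="$(awk -F'\t' '$1 == "S" { n++ } END { print n + 0 }' "$gfa")"
links="$(awk -F'\t' '$1 == "L" { n++ } END { print n + 0 }' "$gfa")"
echo "segments=$segments links=$links"

# Remove the temporary example file.
rm -f "$gfa"
```

A file with zero segments is probably not a GFA graph (for example, an unconverted `fastg` file).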

```
usage: sarand [-h] [-v] -i INPUT_GFA -a ASSEMBLER
              -k MAX_KMER_SIZE [-j NUM_CORES] [-c COVERAGE_DIFFERENCE]
              [-t TARGET_GENES] [-x MIN_TARGET_IDENTITY]
              [-l NEIGHBOURHOOD_LENGTH] [-o OUTPUT_DIR] [-f]
              [--verbose] [--no_rgi | --rgi_include_loose] [--use_ga]
              [--ga] [--keep_intermediate_files] [--debug]

Identify and extract the local neighbourhood of target genes (such as AMR)
from a GFA formatted assembly graph

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -i INPUT_GFA, --input_gfa INPUT_GFA
                      Path to assembly graph (in GFA format) that you wish
                      to analyse
  -a {metaspades,bcalm,megahit,metacherchant,contig},
  --assembler {metaspades,bcalm,megahit,metacherchant,contig}
                      Assembler used to generate input GFA (required to
                      correctly parse coverage information)
  -k MAX_KMER_SIZE, --max_kmer_size MAX_KMER_SIZE
                      Maximum k-mer size used by assembler to generate
                      input GFA
  --extraction_timeout EXTRACTION_TIMEOUT
                      Maximum time to extract neighbourhood per gene in
                      minutes, -1 indicates no limit
  -j NUM_CORES, --num_cores NUM_CORES
                      Number of cores to use
  -c COVERAGE_DIFFERENCE, --coverage_difference COVERAGE_DIFFERENCE
                      Maximum coverage difference to include when filtering
                      graph neighbourhood. Use -1 to indicate no coverage
                      threshold (although this will likely lead to false
                      positive neighbourhoods).
  -t TARGET_GENES, --target_genes TARGET_GENES
                      Target genes to search for in the assembly graph
                      (fasta formatted). Default is the pre-installed CARD database
  -x MIN_TARGET_IDENTITY, --min_target_identity MIN_TARGET_IDENTITY
                      Minimum identity/coverage to identify presence of
                      target gene in assembly graph
  -l NEIGHBOURHOOD_LENGTH, --neighbourhood_length NEIGHBOURHOOD_LENGTH
                      Size of gene neighbourhood to extract from the
                      assembly graph
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                      Output folder for current run of sarand
  -f, --force         Force overwrite any previous files/output     
                      directories
  --verbose           Provide verbose debugging output when logging,
                      and keep intermediate files
  --no_rgi            Disable RGI based annotation of graph neighbourhoods
  --rgi_include_loose Include loose criteria hits if using RGI to annotate
                      graph neighbourhoods
  --use_ga            Enable GraphAligner (instead of Bandage) for  
                      sequence alignment in the graph
  --ga [GA ...]       Additional arguments to supply to graph aligner in the
                      form of --ga key value, e.g. --ga E-cutoff 0.1;
                      it should be used only if use_ga is set to True
  --keep_intermediate_files
                      Do not delete intermediate files.
  --debug             Creates additional files for debugging purposes.
  -seq SEQ_NUMBER, --max_number_seq_for_cdhit SEQ_NUMBER
                      Max Number of sequence for cd-hit
  -sim [0 1],  -similarity [0 1]
                      similarity threshold for cdhit (a number between 0 and 1)
  --meta_main_dir METACHERCHANT_MAIN_DIR
                      The main directory for metacherchant containing
                      AMR_seqs_full.fasta, all AMR sequences and the
                      extracted local graphs by metacherchant.
```

**Running for Metacherchant:**

To extract neighborhoods from Metacherchant, you first need to run Metacherchant separately on your set of target genes. For each gene, Metacherchant will generate a local neighborhood graph. However, it does not provide the actual neighborhood sequences. To extract these sequences from the generated local graphs, Sarand must be run on them.

***Required Input Files and Directories***

Ensure that the following items are placed inside a directory, which will be passed to Sarand as `meta_main_dir`:
* `AMR_seqs_full.fasta`: a FASTA file containing the names and sequences of all target genes (each on a separate line).
* `AMR_info/sequences`: a directory containing one file per target gene, each named `<geneName>.fasta` and containing the name and sequence of that gene.
* `output`: a directory generated by Metacherchant containing the local graphs produced for the genes.
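The first two items can be prepared from a single multi-FASTA file. The sketch below is one possible way to do this and is not part of Sarand: it creates a small example `AMR_seqs_full.fasta` in a temporary directory and splits it into one file per gene. It assumes that the gene name is the first word of each header line; the exact file names Sarand expects may differ, and the `output` directory must still be produced by Metacherchant.

```shell
# Hypothetical working directory; replace with your own meta_main_dir.
meta_main_dir="$(mktemp -d)"
mkdir -p "$meta_main_dir/AMR_info/sequences"

# Example AMR_seqs_full.fasta with each sequence on a single line.
printf '>geneA\nATGAAACGT\n>geneB\nATGCCCTGA\n' > "$meta_main_dir/AMR_seqs_full.fasta"

# Write each record to AMR_info/sequences/<geneName>.fasta.
# Assumption: the gene name is the first word of the header line.
awk -v dir="$meta_main_dir/AMR_info/sequences" '
    /^>/ {
        if (out) close(out)            # finish the previous gene file
        name = substr($1, 2)           # drop the leading ">"
        out = dir "/" name ".fasta"
    }
    out { print > out }                # header and sequence lines go to the current file
' "$meta_main_dir/AMR_seqs_full.fasta"

# List the per-gene files that were created.
ls "$meta_main_dir/AMR_info/sequences"
```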

***Running Sarand***
Once the required files and directories are set up, execute Sarand for Metacherchant using the following command:
```shell
sarand -o <output_dir> -a metacherchant --meta_main_dir <meta_main_dir>
```

**Running for Contigs:**

To extract neighborhoods from contigs, execute Sarand using the following command:
```shell
sarand -i <contigs_file> -o <output_dir> -a contig
```

### 3a. Output
All results will be available in the specified output directory (default is `sarand_results_` followed by a timestamp).
Here is the list of important directories and files that can be seen there and a short description of their content:
* `AMR_info`: this directory contains the list of identified AMR sequences.
    * `AMR_info/sequences/`: The sequences of the identified AMR genes, as extracted from the graph, are stored here under a name similar to their original name (the file name is generated by calling `restricted_amr_name_from_modified_name(amr_name_from_title(amr_original_name))` in `sarand/utils.py`).
    * `AMR_info/alignments/`: The alignment details for all AMR sequences are stored here.

* `sequences_info/sequences_info_{neighbourhood_length}/`: This directory stores the information of extracted neighborhood sequences from the assembly graph.
    * `sequences_info/sequences_info_{params.neighbourhood_length}/sequences/`: the extracted sequences in the neighborhood of each AMR are stored in a file like `ng_sequences_{AMR_NAME}_{params.neighbourhood_length}_{DATE}.txt`.
      For each extracted sequence, the first line gives the corresponding path, with the nodes representing the AMR sequence enclosed in `[]`. The next line gives the extracted sequence, with the AMR sequence in lower-case letters and the neighborhood in upper-case letters.
    * `sequences_info/sequences_info_{params.neighbourhood_length}/paths_info/`: For each node in the AMR neighborhood, the node name, the part of the sequence it represents (start and end position), and its coverage are stored in a file like `ng_sequences_{AMR_NAME}_{params.neighbourhood_length}_{DATE}.csv`.

* `annotations/annotations_{params.neighbourhood_length}`: The annotation details are stored in this directory.
    * `annotations/annotations_{params.neighbourhood_length}/annotation_{AMR_NAME}_{params.neighbourhood_length}`: this directory contains all annotation details for a given AMR.
    * `gene_comparison_{AMR_NAME}.png`: an image visualizing the annotations.
    * `annotation_detail_{AMR_NAME}.csv`: the list of annotations of all extracted sequences for an AMR gene.
    * `trimmed_annotation_info_{AMR_NAME}.csv`: the list of unique annotations of all extracted sequences for an AMR gene.
    * `coverage_annotation_{COVERAGE_DIFFERENCE}_{AMR_NAME}.csv`: the list of annotations for which the difference between the gene coverage and the AMR gene coverage is less than the `COVERAGE_DIFFERENCE` value.
    * `bakta_dir_extracted{NUM}_{DATE}`: contains the Bakta output for the annotation of a sequence extracted from the neighborhood of the target AMR gene in the assembly graph.
    * `rgi_dir`: contains RGI annotation details for all extracted neighborhood sequences of the target AMR gene.
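Because the AMR gene is written in lower case and its neighborhood in upper case, a sequence line from an `ng_sequences_*.txt` file can be split into upstream, gene, and downstream parts with plain shell tools. The following is a minimal sketch on a made-up sequence string. It assumes a single lower-case region per sequence, and reading the line out of a real output file is left out because the exact line layout is not shown here.

```shell
# Use the C locale so that [a-z] and A-Z match ASCII letters only.
LC_ALL=C
export LC_ALL

# Hypothetical extracted sequence: neighbourhood in upper case, AMR gene in lower case.
seq="ACGTTGacgtacgtGGCATT"

upstream="${seq%%[a-z]*}"                    # everything before the first lower-case base
downstream="${seq##*[a-z]}"                  # everything after the last lower-case base
amr="$(printf '%s' "$seq" | tr -d 'A-Z')"    # lower-case bases only

# Report the length of each part.
echo "upstream=${#upstream} amr=${#amr} downstream=${#downstream}"
# prints: upstream=6 amr=8 downstream=6
```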

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/beiko-lab/sarand",
    "name": "sarand",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "Metagenomic Assembly graph, Antimicrobial resistance, Context extraction",
    "author": "Somayeh Kafaie",
    "author_email": "so.kafaie@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/e3/7d/163ce0030f5a3efca85c40d519072b72d083e78b991c0cee19b0e09994db/sarand-1.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GPLv3",
    "summary": "Tool to extract the neighborhood of the target Antimicrobial Resistance (AMR) genes from the assembly graph.",
    "version": "1.1.1",
    "project_urls": {
        "Homepage": "https://github.com/beiko-lab/sarand"
    },
    "split_keywords": [
        "metagenomic assembly graph",
        " antimicrobial resistance",
        " context extraction"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "acadd30a8fcf0f6b82c39cda293165f16f195eab77dc222c1e745e19f148ef18",
                "md5": "445678c77744ffee9320ec38e878c32f",
                "sha256": "c6fc722da01131d90052a2002487071ac155381447b83f12de663cf549eb299d"
            },
            "downloads": -1,
            "filename": "sarand-1.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "445678c77744ffee9320ec38e878c32f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 673017,
            "upload_time": "2025-07-29T16:51:03",
            "upload_time_iso_8601": "2025-07-29T16:51:03.042231Z",
            "url": "https://files.pythonhosted.org/packages/ac/ad/d30a8fcf0f6b82c39cda293165f16f195eab77dc222c1e745e19f148ef18/sarand-1.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "e37d163ce0030f5a3efca85c40d519072b72d083e78b991c0cee19b0e09994db",
                "md5": "eaf29245c65c90dbd3563fd73e17b64b",
                "sha256": "3832eb15c8c77a2c52633ff4f62c87ab5e628e4fe086febffa85411c096b810c"
            },
            "downloads": -1,
            "filename": "sarand-1.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "eaf29245c65c90dbd3563fd73e17b64b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 589988,
            "upload_time": "2025-07-29T16:51:04",
            "upload_time_iso_8601": "2025-07-29T16:51:04.394029Z",
            "url": "https://files.pythonhosted.org/packages/e3/7d/163ce0030f5a3efca85c40d519072b72d083e78b991c0cee19b0e09994db/sarand-1.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-29 16:51:04",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "beiko-lab",
    "github_project": "sarand",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "sarand"
}
