Name | cloneArmy |
Version | 0.2.0 |
home_page | None |
Summary | Analyze haplotypes from Illumina paired-end amplicon sequencing |
upload_time | 2024-11-24 04:02:50 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.8 |
license | None |
requirements | No requirements were recorded. |
# CloneArmy
CloneArmy is a Python package for analyzing haplotypes from Illumina paired-end amplicon sequencing data. It covers the full workflow: processing FASTQ files, aligning reads, identifying sequence variants, and comparing samples.
## Features
- Fast paired-end read processing using BWA-MEM
- Quality-based filtering of bases and alignments
- Haplotype identification and frequency analysis
- Statistical comparison between samples with FDR correction
- Interactive visualization of mutation frequencies
- Rich command-line interface with progress tracking
- Comprehensive HTML reports
- Multi-threading support
- Support for full-length sequence analysis
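
As an illustration of quality-based base filtering, the sketch below decodes a FASTQ quality string and masks bases below a threshold. It assumes the Phred+33 encoding used by current Illumina instruments, and masking with `N` is an assumption made here for illustration; the function names are hypothetical and CloneArmy may handle low-quality bases differently.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def mask_low_quality(sequence, quality_string, min_base_quality=20):
    """Replace bases whose Phred score is below the threshold with 'N'.

    Hypothetical helper: masking is one possible policy, not necessarily
    the one CloneArmy applies.
    """
    scores = phred_scores(quality_string)
    return "".join(
        base if score >= min_base_quality else "N"
        for base, score in zip(sequence, scores)
    )


# 'I' encodes Phred 40 and '#' encodes Phred 2, so the third base is masked.
print(mask_low_quality("ACGT", "II#I"))
```

A Phred score of 20 corresponds to an estimated 1% error probability per base, which is why 20 is a common default threshold.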
## Installation
```bash
pip install cloneArmy
```
### Requirements
- Python ≥ 3.8
- BWA (must be installed and available in PATH)
- Samtools (must be installed and available in PATH)
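
Before running an analysis, it can help to confirm that both external tools are discoverable. A minimal sketch using the standard library, assuming the executables are named `bwa` and `samtools` (the helper name is hypothetical):

```python
import shutil


def find_missing_tools(tools=("bwa", "samtools")):
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Missing from PATH: " + ", ".join(missing))
else:
    print("All external tools found.")
```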
## Usage
### Command Line Interface
#### Basic Analysis
```bash
# Basic usage
cloneArmy run /path/to/fastq/directory reference.fasta
# With all options
cloneArmy run /path/to/fastq/directory reference.fasta \
    --threads 8 \
    --output results \
    --min-base-quality 20 \
    --min-mapping-quality 30 \
    --no-report  # Skip HTML report generation
```
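
The `run` command takes a directory of FASTQ files. This README does not describe how files in that directory are matched into pairs, so the sketch below only shows one plausible approach. It assumes the `_R1`/`_R2` naming seen in the Python API example further down, and `pair_fastq_files` is a hypothetical helper, not part of CloneArmy.

```python
import tempfile
from pathlib import Path


def pair_fastq_files(directory):
    """Map sample name -> (R1 path, R2 path) for files named <sample>_R1... / <sample>_R2...

    Assumption: mates differ only by '_R1' versus '_R2' in the file name.
    """
    pairs = {}
    for r1 in sorted(Path(directory).glob("*_R1*.fastq*")):
        r2 = r1.with_name(r1.name.replace("_R1", "_R2", 1))
        if r2.exists():  # skip R1 files that have no mate
            sample = r1.name.split("_R1", 1)[0]
            pairs[sample] = (r1, r2)
    return pairs


# Small demonstration with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sample1_R1.fastq.gz", "sample1_R2.fastq.gz", "orphan_R1.fastq.gz"):
        (Path(tmp) / name).touch()
    found = pair_fastq_files(tmp)
    print(sorted(found))
```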
#### Comparative Analysis
```bash
# Compare two samples
cloneArmy compare \
    /path/to/sample1/fastq \
    /path/to/sample2/fastq \
    reference.fasta \
    --threads 8 \
    --output comparison_results \
    --min-base-quality 20 \
    --min-mapping-quality 30 \
    --full-length-only  # Only consider full-length sequences
```
### Python API
```python
from pathlib import Path

from clone_army.processor import AmpliconProcessor
from clone_army.comparison import run_comparative_analysis

# Initialize processor
processor = AmpliconProcessor(
    reference_path="reference.fasta",
    min_base_quality=20,
    min_mapping_quality=30,
)

# Process samples
results1 = processor.process_sample(
    fastq_r1="sample1_R1.fastq.gz",
    fastq_r2="sample1_R2.fastq.gz",
    output_dir="results/sample1",
    threads=4,
)

results2 = processor.process_sample(
    fastq_r1="sample2_R1.fastq.gz",
    fastq_r2="sample2_R2.fastq.gz",
    output_dir="results/sample2",
    threads=4,
)

# Perform comparative analysis
comparison_results = run_comparative_analysis(
    results1={"sample1": results1},
    results2={"sample2": results2},
    reference_seq="ATCG...",  # Reference sequence string
    output_path="comparison_results.csv",
    full_length_only=False,
)

# Results are returned as pandas DataFrames
print(results1)            # Sample 1 haplotypes
print(comparison_results)  # Statistical comparison
```
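
`run_comparative_analysis` takes the reference as a plain string rather than a file path. One way to obtain that string from a FASTA file is sketched below. It assumes a single-amplicon reference and reads only the first record; `read_first_fasta_sequence` is a hypothetical helper, not part of CloneArmy.

```python
def read_first_fasta_sequence(path):
    """Return the first record's sequence from a FASTA file as an uppercase string."""
    chunks = []
    seen_header = False
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if seen_header:
                    break  # a second header means the first record has ended
                seen_header = True
            elif line and seen_header:
                chunks.append(line)  # sequence lines may be wrapped
    return "".join(chunks).upper()


# Possible use with the example above (requires the file to exist):
# reference_seq = read_first_fasta_sequence("reference.fasta")
```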
## Output Files
### Single Sample Analysis
- Sorted BAM file with alignments
- CSV file containing haplotype information:
  - Sequence
  - Read count
  - Frequency
  - Number of mutations
  - Full-length status
- Interactive HTML report (optional)
- Console output with summary statistics
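
To make the haplotype fields concrete, the sketch below derives them from a list of read sequences. It is a simplified illustration, not CloneArmy's implementation. It assumes reads start at reference position 0, counts substitutions only, and treats a read as full-length when it matches the reference length. The dictionary keys are hypothetical and may not match the CSV column names.

```python
from collections import Counter


def summarize_haplotypes(read_sequences, reference):
    """Group identical sequences and compute per-haplotype summary fields."""
    counts = Counter(read_sequences)
    total = sum(counts.values())
    rows = []
    for sequence, count in counts.most_common():
        rows.append({
            "sequence": sequence,
            "count": count,
            "frequency": count / total,
            # Substitutions relative to the reference over the overlapping positions.
            "mutations": sum(a != b for a, b in zip(sequence, reference)),
            "full_length": len(sequence) == len(reference),
        })
    return rows


rows = summarize_haplotypes(["ACGT", "ACGT", "ACTT", "ACG"], reference="ACGT")
for row in rows:
    print(row)
```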
### Comparative Analysis
- CSV file with statistical comparisons:
  - Mutation positions and types
  - Frequencies in each sample
  - Statistical significance (p-values)
  - FDR-corrected p-values
- Interactive HTML plot showing mutation frequency differences
- Console output with significant mutations
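
This README does not state which FDR procedure produces the corrected p-values. The sketch below assumes the Benjamini-Hochberg procedure, a common choice, purely to show how such values relate to the raw p-values; CloneArmy may use a different method.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw p-value is scaled by the number of tests divided by its rank, so the smallest p-values receive the largest multiplier, and every adjusted value is capped at 1.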
Raw data
{
"_id": null,
"home_page": null,
"name": "cloneArmy",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Jason D Limberis <Jason.Limberis@ucsf.edu>",
"download_url": "https://files.pythonhosted.org/packages/aa/5a/7e77eeecc894f2abd3aba243139ee31e056bb56f550b73a069ebdec0f2dc/clonearmy-0.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Analyze haplotypes from Illumina paired-end amplicon sequencing",
"version": "0.2.0",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "aa5a7e77eeecc894f2abd3aba243139ee31e056bb56f550b73a069ebdec0f2dc",
"md5": "923a9b6acdcaa97057300020e333e3ae",
"sha256": "220f37d88e07b90fb7bcd9277138a28fadcac89c06b86145ce9b6e6e40a26e27"
},
"downloads": -1,
"filename": "clonearmy-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "923a9b6acdcaa97057300020e333e3ae",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 15158,
"upload_time": "2024-11-24T04:02:50",
"upload_time_iso_8601": "2024-11-24T04:02:50.136715Z",
"url": "https://files.pythonhosted.org/packages/aa/5a/7e77eeecc894f2abd3aba243139ee31e056bb56f550b73a069ebdec0f2dc/clonearmy-0.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-24 04:02:50",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "clonearmy"
}