# dsRNAscan
**dsRNAscan** is a bioinformatics tool for genome-wide identification of **double-stranded RNA (dsRNA) structures**. It uses a sliding window approach to detect inverted repeats that can form dsRNA secondary structures, with special support for **G-U wobble base pairing**.
### Install from PyPI
```bash
pip install dsrnascan
```
### Basic Usage
```bash
# Scan a genome/sequence for dsRNA structures
dsrnascan input.fasta # This uses defaults of -w 10000 -s 150 --score 50
# Process specific chromosome
dsrnascan genome.fasta --chr chr21 -c 8
# Use custom parameters for sensitive detection
dsrnascan sequence.fasta -w 5000 --min 20 --score 30
```
## 📋 Requirements
- **Python 3.8+**
- **Dependencies** (automatically installed):
  - numpy ≥1.19
  - pandas ≥1.1
  - biopython ≥1.78
  - ViennaRNA ≥2.4
### Important: einverted Binary
dsRNAscan requires the `einverted` tool from EMBOSS with our **G-U wobble patch** for accurate RNA structure detection.
**Option 1: Automatic** (macOS with included binary)
- The package includes a pre-compiled einverted for macOS ARM64
- It will be used automatically on compatible systems
**Option 2: System Installation** (Linux/Other)
```bash
# Ubuntu/Debian
sudo apt-get install emboss
# macOS with Homebrew
brew install emboss
# Conda (recommended for bioinformatics workflows)
conda install -c bioconda emboss
```
**Note:** System-installed EMBOSS won't have the G-U patch. For full RNA functionality with G-U wobble pairs, compile from source:
```bash
# Compile with the G-U patch from a clone of the repository (optional but recommended)
git clone https://github.com/Bass-Lab/dsRNAscan.git
cd dsRNAscan
DSRNASCAN_COMPILE_FULL=true pip install .
```
## Detailed Usage
### Command-Line Options
```bash
dsrnascan --help
```
Key parameters:
- `-w/--window`: Window size for scanning (default: 10000)
- `-s/--step`: Step size between windows (default: 150)
- `--score`: Minimum score threshold for inverted repeats (default: 50)
- `--min/--max`: Min/max length of inverted repeats (default: 30/10000)
- `--paired_cutoff`: Minimum percentage of paired bases (default: 70%)
- `-c/--cpus`: Number of CPUs to use (default: 4)
- `--chr`: Specific chromosome to process
- `--reverse`: Scan reverse strand
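
To get a feel for how `-w` and `-s` interact, the sketch below lists the start offsets of the windows a sliding-window scan would visit. It is an illustration only: the function name `window_starts` is hypothetical, and the boundary handling (start at 0, allow a shorter final window) is an assumption rather than something taken from the dsRNAscan source.

```python
def window_starts(seq_len, window=10000, step=150):
    """Return 0-based start offsets of sliding windows over a sequence.

    Assumption (not taken from dsRNAscan itself): windows start at 0,
    advance by `step`, and the final window may be shorter than `window`.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if seq_len <= window:
        # A short sequence fits in a single window
        return [0] if seq_len > 0 else []
    # Furthest start at which a full-length window still fits
    last = seq_len - window
    starts = list(range(0, last + 1, step))
    if starts[-1] != last:
        # Add one trailing, truncated window so the sequence end is covered
        starts.append(starts[-1] + step)
    return starts


# With the default -w 10000 -s 150, count the windows for a 1 Mb sequence
print(len(window_starts(1_000_000)))
```

A smaller step means more windows and therefore more work, so the step size is the main lever for trading run time against overlap between neighbouring windows.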
### Output Files
dsRNAscan writes its output files to a timestamped directory:
1. **`*_merged_results.txt`**: Tab-delimited file with all predicted dsRNAs
   - Columns include: coordinates, scores, sequences, structures, folding energy
2. **`*.dsRNApredictions.bp`**: IGV-compatible visualization file
   - Load in IGV to visualize dsRNA locations on the genome
### Example Workflow
```bash
# 1. Basic genome scan
dsrnascan human_genome.fa -c 16 --output-dir results/
# 2. Scan specific region with sensitive parameters
dsrnascan chr21.fa -w 5000 -s 100 --score 30 --min 20
# 3. Process RNA-seq assembled transcripts
dsrnascan transcripts.fa -w 1000 --paired_cutoff 60
# 4. Scan the reverse strand
dsrnascan sequence.fa --reverse
```
## Installation Troubleshooting
### "einverted binary not found"
The package needs einverted from EMBOSS. Solutions:
1. Install EMBOSS: `conda install -c bioconda emboss`
2. Or compile during install: `DSRNASCAN_COMPILE_FULL=true pip install .`
3. Or, if you only need to confirm that the package installed correctly, run `dsrnascan --help`, which works without einverted
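
Before trying any of these fixes, it can help to check whether an `einverted` executable is visible on your `PATH`. The snippet below is a minimal sketch using only the standard library; the helper name `find_einverted` is hypothetical. It checks `PATH` only, so it will not detect a binary bundled inside the installed package, and it cannot tell whether the binary it finds carries the G-U patch.

```python
import shutil


def find_einverted(name="einverted"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


path = find_einverted()
if path is None:
    print("einverted not found on PATH; install EMBOSS or compile with the G-U patch")
else:
    print(f"einverted found at {path}")
```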
### "ModuleNotFoundError: No module named 'ViennaRNA'"
Install ViennaRNA Python bindings:
```bash
# Via conda (recommended)
conda install -c bioconda viennarna
# Via pip
pip install ViennaRNA
```
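
To confirm that the bindings are visible to the interpreter you are actually running, you can probe for the module without importing it. This is a sketch: `is_importable` is a hypothetical helper, `ViennaRNA` is the name from the error message above, and `RNA` is included on the assumption that some ViennaRNA builds expose their bindings under that name.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if a top-level module can be located by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


print(f"Interpreter: {sys.executable}")
# "RNA" is an assumed alternative name; "ViennaRNA" comes from the error message
for name in ("ViennaRNA", "RNA"):
    status = "found" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

If both names are reported missing, the bindings were probably installed into a different environment than the one running `dsrnascan`.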
### Installation on HPC/Cluster
```bash
module load python/3.8 # or your Python module
module load emboss # if available
pip install --user git+https://github.com/Bass-Lab/dsRNAscan.git
```
## Using dsRNAscan as a Python Module
While primarily designed as a command-line tool, dsRNAscan can be imported and used in Python scripts:
```python
# Method 1: Simple usage
from dsrnascan import main
import sys
# Simulate command line arguments
sys.argv = ['dsrnascan', 'input.fasta', '-w', '1000', '--score', '30']
main()
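# Method 2: Using subprocess for better control
import subprocess
result = subprocess.run(['dsrnascan', 'input.fasta', '--score', '30'],
                        capture_output=True, text=True)
print(result.returncode)  # non-zero indicates failure; details are in result.stderr

# Method 3: Parse results programmatically
import glob
import pandas as pd

# Run dsRNAscan; check=True raises CalledProcessError on a non-zero exit status
subprocess.run(['dsrnascan', 'input.fasta'], check=True)

# Take the last matching output directory
output_dir = sorted(glob.glob('dsrnascan_*'))[-1]
# pd.read_csv does not expand wildcards, so resolve the file name with glob first
results_file = glob.glob(f"{output_dir}/*_merged_results.txt")[0]
results = pd.read_csv(results_file, sep='\t')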
```
For more examples, see `using_dsrnascan_as_module.py` in the repository.
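
When calling the command-line tool from a script, it is often convenient to build the argument list from keyword arguments instead of hand-writing it. The helper below is a hypothetical convenience wrapper, not part of the dsRNAscan package; it only uses flags listed under Key parameters.

```python
def build_command(fasta, window=None, step=None, score=None,
                  cpus=None, chrom=None, reverse=False):
    """Build a dsrnascan argument list; options left as None are omitted."""
    cmd = ["dsrnascan", str(fasta)]
    options = (
        ("-w", window),
        ("-s", step),
        ("--score", score),
        ("-c", cpus),
        ("--chr", chrom),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    if reverse:
        cmd.append("--reverse")
    return cmd


cmd = build_command("input.fasta", window=1000, score=30)
print(cmd)
```

The resulting list can be passed directly to `subprocess.run(cmd, check=True)`; keeping it as a list avoids shell quoting problems with unusual file names.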
## Algorithm Details
dsRNAscan uses a multi-step approach:
1. **Window Extraction**: Divides genome into overlapping windows
2. **Inverted Repeat Detection**: Uses modified einverted with G-U wobble support
3. **Structure Prediction**: Validates structures with RNAduplex (ViennaRNA)
4. **Filtering**: Applies score and pairing percentage cutoffs
5. **Parallel Processing**: Distributes windows across multiple CPUs
The key innovation is the **G-U wobble patch** for einverted, allowing detection of RNA-specific base pairs crucial for identifying functional dsRNA structures.
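
To illustrate why G-U wobble pairs matter for the pairing-percentage filter, the sketch below scores two gap-free arms of a candidate inverted repeat with and without wobble pairs. It is a simplified illustration, not the patched einverted algorithm: the names `can_pair` and `paired_percent` are hypothetical, and bulges, gaps, and scoring weights are ignored.

```python
# Watson-Crick pairs and the G-U wobble pair, written as ordered tuples
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def can_pair(a, b, allow_wobble=True):
    """Return True if bases a and b can pair (T is treated as U)."""
    pair = (a.upper().replace("T", "U"), b.upper().replace("T", "U"))
    return pair in WATSON_CRICK or (allow_wobble and pair in WOBBLE)


def paired_percent(left, right, allow_wobble=True):
    """Percentage of positions in `left` that pair with `right`.

    Both arms are written 5'->3', so `right` is read in reverse to
    model antiparallel pairing. Arms must have equal length.
    """
    if len(left) != len(right) or not left:
        raise ValueError("arms must be non-empty and of equal length")
    pairs = sum(can_pair(a, b, allow_wobble)
                for a, b in zip(left, reversed(right)))
    return 100.0 * pairs / len(left)


left_arm, right_arm = "GGGAUC", "GAUUCU"
print(paired_percent(left_arm, right_arm))                      # wobble allowed
print(paired_percent(left_arm, right_arm, allow_wobble=False))  # Watson-Crick only
```

In this toy example two of the six positions are G-U pairs, so the arms are fully paired when wobble is allowed but fall below a 70% cutoff when only Watson-Crick pairs count.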
## Citation
If you use dsRNAscan in your research, please cite:
Comprehensive mapping of human dsRNAome reveals conservation, neuronal enrichment, and intermolecular interactions
https://doi.org/10.1101/2025.01.24.634786
## Additional Tools
- **overlap_analyzer/** - Statistical enrichment analysis for genomic features overlapping with dsRNA predictions. See [overlap_analyzer/README.md](overlap_analyzer/README.md) for details.
## License
This project is licensed under the GNU General Public License v3.0 - see the [LICENSE](LICENSE) file for details.
## Support
- **Issues**: [GitHub Issues](https://github.com/Bass-Lab/dsRNAscan/issues)
- **Documentation**: [GitHub Wiki](https://github.com/Bass-Lab/dsRNAscan/wiki)
## Acknowledgments
- EMBOSS team for the einverted tool
- ViennaRNA team for RNA folding algorithms
---
**Note**: This tool is for research purposes. Ensure you understand the parameters for your specific use case.