dsrnascan


- **Name**: dsrnascan
- **Version**: 0.3.3
- **Home page**: https://github.com/Bass-Lab/dsRNAscan
- **Summary**: A tool for genome-wide prediction of double-stranded RNA structures
- **Upload time**: 2025-08-11 09:35:42
- **Author**: Bass Lab
- **Requires Python**: >=3.7
- **License**: GPL-3.0-or-later
- **Keywords**: bioinformatics, RNA, dsRNA, secondary structure, genomics
- **Requirements**: biopython, numpy, pandas, ViennaRNA

# dsRNAscan

[![CI Tests](https://github.com/Bass-Lab/dsRNAscan/actions/workflows/ci-simple.yml/badge.svg)](https://github.com/Bass-Lab/dsRNAscan/actions/workflows/ci-simple.yml)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)

**dsRNAscan** is a bioinformatics tool for genome-wide identification of **double-stranded RNA (dsRNA) structures**. It uses a sliding window approach to detect inverted repeats that can form dsRNA secondary structures, with special support for **G-U wobble base pairing**.

### Install from PyPI 
```bash
pip install dsrnascan
```

### Basic Usage
```bash
# Scan a genome/sequence for dsRNA structures
dsrnascan input.fasta # This uses defaults of -w 10000 -s 150 --score 50

# Process specific chromosome
dsrnascan genome.fasta --chr chr21 -c 8

# Use custom parameters for sensitive detection
dsrnascan sequence.fasta -w 5000 --min 20 --score 30
```

## 📋 Requirements

- **Python 3.8+**
- **Dependencies** (automatically installed):
  - numpy ≥1.19
  - pandas ≥1.1
  - biopython ≥1.78
  - ViennaRNA ≥2.4

### Important: einverted Binary

dsRNAscan requires the `einverted` tool from EMBOSS with our **G-U wobble patch** for accurate RNA structure detection. 

**Option 1: Automatic** (macOS with included binary)
- The package includes a pre-compiled einverted for macOS ARM64
- It will be used automatically on compatible systems

**Option 2: System Installation** (Linux/Other)
```bash
# Ubuntu/Debian
sudo apt-get install emboss

# macOS with Homebrew
brew install emboss

# Conda (recommended for bioinformatics workflows)
conda install -c bioconda emboss
```

**Note:** System-installed EMBOSS won't have the G-U patch. For full RNA functionality with G-U wobble pairs, compile from source:

```bash
# Compile with G-U patch (optional but recommended)
cd dsRNAscan
DSRNASCAN_COMPILE_FULL=true pip install .
```

## Detailed Usage

### Command-Line Options

```bash
dsrnascan --help
```

Key parameters:
- `-w/--window`: Window size for scanning (default: 10000)
- `-s/--step`: Step size between windows (default: 150)
- `--score`: Minimum score threshold for inverted repeats (default: 50)
- `--min/--max`: Min/max length of inverted repeats (default: 30/10000)
- `--paired_cutoff`: Minimum percentage of paired bases (default: 70%)
- `-c/--cpus`: Number of CPUs to use (default: 4)
- `--chr`: Specific chromosome to process
- `--reverse`: Scan reverse strand
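
When dsRNAscan is driven from a script, it can help to assemble the argument list in one place. The following is a minimal sketch, not part of dsRNAscan itself: `build_dsrnascan_command` is a hypothetical helper, and it uses only the flags and defaults listed above.

```python
import shlex

def build_dsrnascan_command(fasta, window=10000, step=150, score=50,
                            min_len=30, cpus=4, chrom=None, reverse=False):
    """Assemble a dsrnascan argument list (hypothetical helper, documented flags only)."""
    cmd = ["dsrnascan", str(fasta),
           "-w", str(window), "-s", str(step),
           "--score", str(score), "--min", str(min_len),
           "-c", str(cpus)]
    if chrom is not None:
        cmd += ["--chr", chrom]      # restrict the scan to one chromosome
    if reverse:
        cmd.append("--reverse")      # scan the reverse strand
    return cmd

cmd = build_dsrnascan_command("genome.fasta", window=5000, score=30, chrom="chr21")
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with unusual file names.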

### Output Files

dsRNAscan writes its output files to a timestamped directory, including:

1. **`*_merged_results.txt`**: Tab-delimited file with all predicted dsRNAs
   - Columns include: coordinates, scores, sequences, structures, folding energy
   
2. **`*.dsRNApredictions.bp`**: IGV-compatible visualization file
   - Load in IGV to visualize dsRNA locations on genome
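
The merged results file can also be read without pandas. This is a minimal sketch that assumes the file starts with a header row; the README does not list the exact column names, so the code does not rely on any. `load_merged_results` is a hypothetical helper name.

```python
import csv
import glob
import os

def load_merged_results(output_dir):
    """Return the rows of the first *_merged_results.txt in output_dir as dicts.

    Assumes a tab-delimited file whose first line is a header row.
    """
    matches = sorted(glob.glob(os.path.join(output_dir, "*_merged_results.txt")))
    if not matches:
        raise FileNotFoundError(f"no *_merged_results.txt in {output_dir}")
    with open(matches[0], newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Calling `load_merged_results("results/")` returns a list with one dictionary per predicted dsRNA, keyed by whatever column headers the file contains.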

### Example Workflow

```bash
# 1. Basic genome scan
dsrnascan human_genome.fa -c 16 --output-dir results/

# 2. Scan specific region with sensitive parameters
dsrnascan chr21.fa -w 5000 -s 100 --score 30 --min 20

# 3. Process RNA-seq assembled transcripts
dsrnascan transcripts.fa -w 1000 --paired_cutoff 60

# 4. Scan the reverse strand
dsrnascan sequence.fa --reverse
```

## Installation Troubleshooting

### "einverted binary not found"
The package needs einverted from EMBOSS. Solutions:
1. Install EMBOSS: `conda install -c bioconda emboss`
2. Or compile during install: `DSRNASCAN_COMPILE_FULL=true pip install .`
3. Or, to check only that the package installed correctly, run `dsrnascan --help`, which works without einverted
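
Before starting a long scan, a quick check can tell whether an `einverted` executable is visible on `PATH`. This is a minimal sketch with a hypothetical helper name. It cannot tell whether the binary carries the G-U patch, and it assumes nothing about where the package keeps its bundled macOS binary, which may not be on `PATH`.

```python
import shutil

def find_einverted(name="einverted"):
    """Return the full path of the einverted executable on PATH, or None."""
    return shutil.which(name)

path = find_einverted()
if path is None:
    print("einverted not found on PATH; install EMBOSS or compile the patched version")
else:
    print(f"einverted found at {path}")
```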

### "ModuleNotFoundError: No module named 'ViennaRNA'"
Install ViennaRNA Python bindings:
```bash
# Via conda (recommended)
conda install -c bioconda viennarna

# Via pip
pip install ViennaRNA
```
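
To see which Python dependencies are missing from the active environment, a check along these lines may help. It is a sketch only: `missing_modules` is a hypothetical helper, the name `ViennaRNA` is taken from the error message above, and `Bio` is the import name of biopython.

```python
import importlib.util

def missing_modules(names):
    """Return the module names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# An empty list means every module was found in this environment.
print(missing_modules(["numpy", "pandas", "Bio", "ViennaRNA"]))
```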

### Installation on HPC/Cluster
```bash
module load python/3.8  # or your Python module
module load emboss      # if available
pip install --user git+https://github.com/Bass-Lab/dsRNAscan.git
```


## Using dsRNAscan as a Python Module

While primarily designed as a command-line tool, dsRNAscan can be imported and used in Python scripts:

```python
# Method 1: Simple usage
from dsrnascan import main
import sys

# Simulate command line arguments
sys.argv = ['dsrnascan', 'input.fasta', '-w', '1000', '--score', '30']
main()

# Method 2: Using subprocess for better control
import subprocess
result = subprocess.run(['dsrnascan', 'input.fasta', '--score', '30'], 
                       capture_output=True, text=True)

# Method 3: Parse results programmatically
import pandas as pd
import glob

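# Run dsRNAscan and stop if it exits with an error
subprocess.run(['dsrnascan', 'input.fasta'], check=True)

# Find the newest output directory, then the results file inside it
# (read_csv does not expand wildcards, so resolve the path with glob first)
output_dir = sorted(glob.glob('dsrnascan_*'))[-1]
results_file = glob.glob(f"{output_dir}/*_merged_results.txt")[0]
results = pd.read_csv(results_file, sep='\t')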
```

For more examples, see `using_dsrnascan_as_module.py` in the repository.

## Algorithm Details

dsRNAscan uses a multi-step approach:

1. **Window Extraction**: Divides genome into overlapping windows
2. **Inverted Repeat Detection**: Uses modified einverted with G-U wobble support
3. **Structure Prediction**: Validates structures with RNAduplex (ViennaRNA)
4. **Filtering**: Applies score and pairing percentage cutoffs
5. **Parallel Processing**: Distributes windows across multiple CPUs
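
The window extraction step can be illustrated with a short sketch. It is not the dsRNAscan implementation: `sliding_windows` is a hypothetical function, the coordinates are 0-based and half-open, and truncating the final window at the sequence end is an assumption.

```python
def sliding_windows(seq_len, window=10000, step=150):
    """Yield (start, end) coordinates of overlapping windows over a sequence.

    Coordinates are 0-based and half-open; the last window is truncated at seq_len.
    """
    if seq_len <= 0:
        return
    start = 0
    while True:
        end = min(start + window, seq_len)
        yield start, end
        if end == seq_len:   # the sequence end has been covered
            break
        start += step

print(list(sliding_windows(25, window=10, step=5)))
# [(0, 10), (5, 15), (10, 20), (15, 25)]
```

With the default window of 10000 and step of 150, consecutive windows overlap heavily, so most inverted repeats fall inside several windows.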

The key innovation is the **G-U wobble patch** for einverted, allowing detection of RNA-specific base pairs crucial for identifying functional dsRNA structures.
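
The effect of allowing G-U wobble pairs can be shown with a toy example. The patched einverted performs a gapped alignment; the sketch below instead compares two equal-length arms position by position with no gaps, which is a deliberate simplification. The names `can_pair` and `percent_paired` are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

def can_pair(a, b, allow_wobble=True):
    """Return True if bases a and b can pair (T is treated as U)."""
    pair = (a.upper().replace("T", "U"), b.upper().replace("T", "U"))
    return pair in WATSON_CRICK or (allow_wobble and pair in WOBBLE)

def percent_paired(left_arm, right_arm, allow_wobble=True):
    """Percentage of positions where left_arm pairs with the reversed right_arm."""
    if len(left_arm) != len(right_arm) or not left_arm:
        raise ValueError("arms must be non-empty and of equal length")
    paired = sum(can_pair(a, b, allow_wobble)
                 for a, b in zip(left_arm, reversed(right_arm)))
    return 100.0 * paired / len(left_arm)

print(percent_paired("GGGA", "UUUC"))                      # 100.0
print(percent_paired("GGGA", "UUUC", allow_wobble=False))  # 50.0
```

In this example the two G-U pairs lift the duplex from 50% to 100% paired, which is the kind of difference that decides whether a candidate passes a pairing cutoff such as `--paired_cutoff`.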

## Citation

If you use dsRNAscan in your research, please cite:
Comprehensive mapping of human dsRNAome reveals conservation, neuronal enrichment, and intermolecular interactions

https://doi.org/10.1101/2025.01.24.634786



## Additional Tools

- **overlap_analyzer/** - Statistical enrichment analysis for genomic features overlapping with dsRNA predictions. See [overlap_analyzer/README.md](overlap_analyzer/README.md) for details.

## License

This project is licensed under the GNU General Public License v3.0 - see the [LICENSE](LICENSE) file for details.

## Support

- **Issues**: [GitHub Issues](https://github.com/Bass-Lab/dsRNAscan/issues)
- **Documentation**: [GitHub Wiki](https://github.com/Bass-Lab/dsRNAscan/wiki)

## Acknowledgments

- EMBOSS team for the einverted tool
- ViennaRNA team for RNA folding algorithms

---
**Note**: This tool is for research purposes. Ensure you understand the parameters for your specific use case.

            
