# ENATool 🧬
A comprehensive Python package for downloading and managing sequencing data from the European Nucleotide Archive (ENA), usable from the terminal or through a Python interface.
## ✨ Features
- 📊 **Extract Metadata** - Get comprehensive sample information from ENA projects
- 📥 **Download FASTQ Files** - Automated download with progress tracking
- 🔄 **Auto Fallback** - Automatically tries NCBI if ENA metadata unavailable
- 📈 **Progress Bars** - Real-time progress for downloads and metadata retrieval
- 📋 **Interactive Reports** - Generate searchable HTML tables with DataTables.js
- 💾 **Export to CSV** - Save metadata in standard formats
- 🔍 **Smart Verification** - Check fastq file integrity and skip existing files
- 💻 **Command line and Python interface**
## 🚀 Quick Start
### Installation
```bash
# Install from PyPI
pip install ENATool
```
### Basic Usage in Terminal
```bash
# Custom output directory
enatool download PRJNA335681 --path data/my_project
```
### Basic Usage in Python
```python
import ENATool
# Fetch metadata AND download files in one command
info, downloads = ENATool.fetch('PRJNA335681', path='data/my_project', download=True)
```
## 📊 Example Output Files
ENATool creates organized output:
```
my_project/
├── PRJNA335681.csv              # Sample metadata
├── PRJNA335681.html             # Interactive table
├── downoad_info_table.csv       # Download tracking
└── raw_reads/                   # Downloaded FASTQ files
    ├── SRR123456/
    │   ├── SRR123456_1.fastq.gz
    │   └── SRR123456_2.fastq.gz
    └── SRR123457/
        └── SRR123457.fastq.gz
```
## 🔧 Requirements
- Python >= 3.7
- pandas >= 1.3.0
- numpy >= 1.20.0
- requests >= 2.25.0
- xmltodict >= 0.12.0
- tqdm >= 4.60.0
- lxml >= 4.6.0
## 📖 Documentation
- [Use ENATool in Terminal](#use-enatool-in-terminal)
  - [Fetching metadata](#fetching-metadata)
  - [Download reads and fetch metadata](#download-reads-and-fetch-metadata)
  - [Show project summary](#show-project-summary-stdout)
  - [Redownload corrupted files or download selected files only](#redownload-corrupted-files-or-download-only-selected-files)
  - [Leave files with incorrect md5 checksum](#leave-files-with-incorrect-md5-checksum)
  - [Process multiple projects](#process-multiple-projects)
  - [Hide banner](#hide-banner)
  - [Disable progress bar](#disable-progress-bar)
- [Use ENATool in Python](#use-enatool-in-python)
  - [Fetch Metadata](#fetch-metadata)
  - [Download FASTQ Files](#download-fastq-files)
  - [Download only a subset of samples](#download-only-a-subset-of-samples)
  - [Leave files with incorrect md5 checksum](#leave-files-with-incorrect-md5-checksum-1)
  - [Disable progress bar](#disable-progress-bar-1)
  - [Work with multiple datasets](#work-with-multiple-datasets)
  - [Python API Reference](#python-api-reference)
- [Citation](#-citation)
## Use ENATool in Terminal
### Fetching Metadata
Download metadata for all samples in an ENA project using `enatool fetch`.
**Syntax:**
```bash
enatool fetch PROJECT_ID [--path DIR]
```
**Arguments:**
- `PROJECT_ID` (required): ENA project accession (e.g., PRJNA335681)
- `--path DIR` or `-p DIR`: Output directory (default: PROJECT_ID)
**What it does:**
- Downloads sample metadata from ENA
- Tries NCBI BioSample as fallback if ENA fails
- Creates CSV file with all metadata
- Generates interactive HTML report
- Shows progress bars
**Output files:**
- `PROJECT_ID.csv` - Metadata in CSV format
- `PROJECT_ID.html` - Interactive HTML table
**Examples:**
```bash
# Basic usage - saves to PRJNA335681/
enatool fetch PRJNA335681
# Custom output directory
enatool fetch PRJNA335681 --path data/my_project
```
### Download Reads and Fetch Metadata
Download metadata for all samples in an ENA project, together with the sample files, using `enatool download`.
**Syntax:**
```bash
enatool download PROJECT_ID [--path DIR]
```
**Arguments:**
- `PROJECT_ID` (required): ENA project accession
- `--path DIR` or `-p DIR`: Output directory (default: PROJECT_ID)
**What it does:**
- Downloads metadata (same as `fetch`)
- Downloads all FASTQ files for all samples
- Uses enaDataGet tool
- Skips files that already exist
- Tracks download status
**Output files:**
- `PROJECT_ID.csv` - Metadata
- `PROJECT_ID.html` - Interactive table
- `downoad_info_table.csv` - Download tracking
- `raw_reads/` - Directory with FASTQ files
  - `SRR123456/` - One directory per run
    - `SRR123456_1.fastq.gz` - Forward reads
    - `SRR123456_2.fastq.gz` - Reverse reads (if paired-end)
**Examples:**
```bash
# Download everything
enatool download PRJNA335681
# Custom output directory
enatool download PRJNA335681 --path data/project1
```
### Show Project Summary [stdout]
Display summary information about a downloaded project using `enatool info`.
**Syntax:**
```bash
enatool info PROJECT_ID --path DIR
```
**Arguments:**
- `PROJECT_ID` (required): ENA project accession
- `--path DIR` or `-p DIR` (required): Directory containing metadata
**What it does:**
- Reads metadata from CSV file
- Shows summary statistics
- Displays organism breakdown
- Shows sequencing platforms
- Shows download status (if available)
**Examples:**
```bash
# Show info for custom directory
enatool info PRJNA335681 --path data/my_project
```
**Output:**
```
📊 Project Information: PRJNAXXXXXX
============================================================
Total samples: 50

Organisms (2):
  • Homo sapiens: 45
  • Mus musculus: 5

Sequencing Platforms:
  • ILLUMINA: 50

Library Strategies:
  • RNA-Seq: 30
  • WGS: 15
  • ChIP-Seq: 5

Library Layout:
  • PAIRED: 45
  • SINGLE: 5

Download Status:
  • OK: 48
  • Error: 2
```
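A similar breakdown can be computed directly from the metadata table with pandas. The sketch below is not how `enatool info` is implemented. It builds a small stand-in table in memory and assumes the CSV carries the `scientific_name` and `instrument_platform` columns used elsewhere in this README. The helper name `summarize_metadata` is hypothetical.

```python
import pandas as pd


def summarize_metadata(info_table, columns=("scientific_name", "instrument_platform")):
    """Return {column: {value: count}} for every requested column that exists."""
    summary = {}
    for column in columns:
        if column in info_table.columns:
            summary[column] = info_table[column].value_counts().to_dict()
    return summary


# Stand-in for pd.read_csv("data/my_project/PRJNA335681.csv") (made-up rows)
example = pd.DataFrame(
    {
        "scientific_name": ["Homo sapiens", "Homo sapiens", "Mus musculus"],
        "instrument_platform": ["ILLUMINA", "ILLUMINA", "ILLUMINA"],
    }
)

summary = summarize_metadata(example)
print(f"Total samples: {len(example)}")
for column, counts in summary.items():
    print(f"{column}:")
    for value, count in counts.items():
        print(f"  {value}: {count}")
```

Columns missing from the table are skipped rather than raising, which keeps the helper usable on filtered or partial tables.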
### Redownload Corrupted Files or Download Only Selected Files
Download FASTQ files using previously fetched metadata, or only the samples left in a subsetted metadata table, with `enatool download-files`. It also forces a redownload of files that previously ended with an error.
**Syntax:**
```bash
enatool download-files PROJECT_ID --path DIR
```
**Arguments:**
- `PROJECT_ID` (required): ENA project accession
- `--path DIR` or `-p DIR` (required): Directory containing metadata
**What it does:**
- Loads sample names from existing CSV file (`PROJECT_ID.csv`)
- Downloads FASTQ files
- Useful if you already have metadata and just want the files or for filtered metadata tables.
**Use cases:**
- You fetched metadata earlier with `enatool fetch`
- You filtered the CSV file manually
- You want to re-download after failures
**Examples:**
```bash
# First get metadata (fast)
enatool fetch PRJNA335681 --path my_project
# Later, download files
enatool download-files PRJNA335681 --path my_project
# Or after filtering CSV file
enatool download-files PRJNA335681 --path my_project
```
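The "filter the CSV manually" step can also be scripted. The following is a minimal sketch with pandas, run here on a throwaway file rather than a real `PROJECT_ID.csv`. The helper `filter_metadata_csv` is hypothetical, and the `run_accession` column name is an assumption about the table layout (`scientific_name` appears in the Python examples of this README).

```python
import os
import tempfile

import pandas as pd


def filter_metadata_csv(csv_path, column, value):
    """Keep only rows where `column` equals `value` and overwrite the CSV in place."""
    table = pd.read_csv(csv_path)
    filtered = table[table[column] == value]
    filtered.to_csv(csv_path, index=False)
    return len(table), len(filtered)


# Demonstration on a temporary file with made-up rows
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "PRJNA335681.csv")
    pd.DataFrame(
        {
            "run_accession": ["SRR123456", "SRR123457", "SRR123458"],  # assumed column
            "scientific_name": ["Homo sapiens", "Mus musculus", "Homo sapiens"],
        }
    ).to_csv(csv_path, index=False)

    before, after = filter_metadata_csv(csv_path, "scientific_name", "Homo sapiens")
    remaining = pd.read_csv(csv_path)["run_accession"].tolist()
    print(f"Kept {after} of {before} rows: {remaining}")
```

Because the helper overwrites the file, it may be worth copying the original CSV somewhere first so the full table can be restored later.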
### Leave files with incorrect md5 checksum
By default, ENATool removes every file that turned out to be corrupted or whose md5 checksum did not match. Use the `--keep-failed` parameter to prevent the removal.
**Syntax:**
```bash
# with download command
enatool download PROJECT_ID --path DIR --keep-failed
# with download-files command
enatool download-files PROJECT_ID --path DIR --keep-failed
```
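Files kept with `--keep-failed` can be inspected by hand. As a generic illustration of an md5 comparison (not ENATool's internal verification code, which is not shown here), the sketch below hashes a file in chunks and compares the result with an expected checksum. The helper names `md5_of_file` and `checksum_matches` are hypothetical, and the demo file holds placeholder bytes, not real reads.

```python
import hashlib
import os
import tempfile


def md5_of_file(file_path, chunk_size=1024 * 1024):
    """Compute the md5 hex digest of a file without loading it fully into memory."""
    digest = hashlib.md5()
    with open(file_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(file_path, expected_md5):
    """Return True when the file's md5 equals the expected hex string."""
    return md5_of_file(file_path) == expected_md5.strip().lower()


# Demonstration on a temporary file with placeholder content
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "example.fastq.gz")
    payload = b"placeholder bytes, not a real FASTQ file\n"
    with open(file_path, "wb") as handle:
        handle.write(payload)

    expected = hashlib.md5(payload).hexdigest()
    ok = checksum_matches(file_path, expected)
    bad = checksum_matches(file_path, "0" * 32)
    print(f"matching checksum: {ok}, wrong checksum: {bad}")
```

Reading in fixed-size chunks matters for FASTQ archives, which are often too large to hash from a single in-memory read.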
### Process multiple projects
For processing multiple projects:
```bash
# Simple loop
for project in PRJNA335681 PRJNA123456 PRJNA789012; do
    echo "Processing $project..."
    enatool fetch $project --path data/$project
done
# Or with download
for project in PRJNA335681 PRJNA123456; do
    echo "Downloading $project..."
    enatool download $project --path data/$project
done
```
### Hide banner
Use the global `enatool` option `--no-banner`. It goes right after `enatool` and before the action command.
**Example:**
```bash
enatool --no-banner fetch PRJNA335681
```
### Disable progress bar
Use the global `enatool` option `--no-progress-bar`. It goes right after `enatool` and before the action command.
**Example:**
```bash
enatool --no-progress-bar fetch PRJNA335681
```
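When `enatool` is called from a script, the ordering rule (global options first, then the action command and its own arguments) is easy to get wrong. The sketch below assembles the argument list in that order. The helper `build_enatool_command` is hypothetical and only uses flags documented above. Combining both global options in one call is an assumption, since the examples show them separately.

```python
import shlex


def build_enatool_command(action, project_id, path=None,
                          no_banner=False, no_progress_bar=False,
                          keep_failed=False):
    """Build an argv list with global options placed before the action command."""
    command = ["enatool"]
    # Global options go right after `enatool`
    if no_banner:
        command.append("--no-banner")
    if no_progress_bar:
        command.append("--no-progress-bar")
    # Action command and its own arguments follow
    command += [action, project_id]
    if path is not None:
        command += ["--path", path]
    if keep_failed:
        # Documented for `download` and `download-files` only
        command.append("--keep-failed")
    return command


command = build_enatool_command(
    "download", "PRJNA335681", path="data/my_project",
    no_banner=True, no_progress_bar=True,
)
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting issues with unusual directory names.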
## Use ENATool in Python
### Fetch Metadata
Use the `fetch()` function to download metadata:
```python
import ENATool
# Basic usage - just get metadata
info_table = ENATool.fetch('PRJNA335681')
# Specify custom directory
info_table = ENATool.fetch('PRJNA335681', path='data/my_project')
# Get metadata AND download files
info_table, downloads = ENATool.fetch('PRJNA335681', download=True)
# Show some basic stats
print(f"Total samples: {len(info_table)}")
print(f"Organisms: {info_table['scientific_name'].unique()}")
print(f"Platforms: {info_table['instrument_platform'].value_counts()}")
```
**What you get:**
- Sample accessions and metadata
- Run accessions and sequencing details
- FASTQ file URLs and checksums
- Organism and experimental information
- Interactive HTML report
### Download FASTQ Files
```python
import ENATool
# Get metadata AND download files
info_table, downloads = ENATool.fetch('PRJNA335681', download=True)
# Check results
print(downloads['download_status'].value_counts())
```
**Download status values:**
- `OK` - Successfully downloaded
- `Exists` - File already exists (skipped)
- `Error` - Download failed
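To act on these values, the download table can be filtered like any other DataFrame. The sketch below uses a made-up stand-in for `downloads`. The `download_status` column comes from the example above, while `run_accession` is an assumed column name that may differ in the real table.

```python
import pandas as pd

# Stand-in for the table returned by ENATool.fetch(..., download=True)
downloads = pd.DataFrame(
    {
        "run_accession": ["SRR123456", "SRR123457", "SRR123458", "SRR123459"],  # assumed column
        "download_status": ["OK", "Exists", "Error", "OK"],
    }
)

# Count each status value
status_counts = downloads["download_status"].value_counts().to_dict()

# Collect the runs that need another attempt
failed_runs = downloads.loc[
    downloads["download_status"] == "Error", "run_accession"
].tolist()

print(status_counts)
print(f"Runs to retry: {failed_runs}")
```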
### Download only a subset of samples
```python
import ENATool
# Get metadata
info = ENATool.fetch('PRJNA335681')
# Filter samples
human_samples = info[info['scientific_name'] == 'Homo sapiens']
# ! Important !
# Re-initialize for filtered table
human_samples.ena.reinit(info)
# Download only filtered samples
downloads = human_samples.ena.download()
# Save to CSV
human_samples.to_csv('human_samples.csv', index=False)
```
### Leave files with incorrect md5 checksum
Prevent ENATool from automatically removing corrupted files.
```python
import ENATool
# Can be passed to fetch()
info_table, downloads = ENATool.fetch('PRJNA335681', download=True, keep_failed=True)
# Can be passed to download()
info = ENATool.fetch('PRJNA335681')
downloads = info.ena.download(keep_failed=True)
```
### Disable progress bar
```python
import ENATool
# Can be passed to fetch()
info_table, downloads = ENATool.fetch('PRJNA335681', download=True, NO_PROGRESS_BAR=True)
# Can be passed to download()
info = ENATool.fetch('PRJNA335681')
downloads = info.ena.download(NO_PROGRESS_BAR=True)
```
### Work with multiple datasets
```python
import ENATool
projects = ['PRJNA335681', 'PRJEB2961', 'PRJEB28350']
for project_id in projects:
    try:
        info = ENATool.fetch(project_id, path=f'data/{project_id}')
        print(f"✓ {project_id}: {len(info)} samples")
    except Exception as e:
        print(f"✗ {project_id}: {e}")
```
### Python API Reference
#### `ENATool.fetch(project_id, path=None, download=False)`
Main entry point for fetching ENA data.
**Parameters:**
- `project_id` (str): ENA project accession (e.g., 'PRJNA335681')
- `path` (str, optional): Directory for outputs (defaults to project_id)
- `download` (bool, optional): Auto-download FASTQ files (default: False)
**Returns:**
- DataFrame (if download=False)
- Tuple of (info_table, download_table) (if download=True)
#### `DataFrame.ena.download()`
Download FASTQ files for samples in DataFrame.
**Returns:**
- DataFrame with download status
## 📝 Citation
If you use ENATool in your research, please cite:
```
Tikhonova, P. (2021). ENATool: European Nucleotide Archive Data Manager
(v2.0.0). Zenodo. https://doi.org/10.5281/zenodo.17443004
```
## 📜 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🔗 Links
- **PyPI:** https://pypi.org/project/ENATool/
- **GitHub:** https://github.com/PollyTikhonova/ENATool
- **Documentation:** https://github.com/PollyTikhonova/ENATool#readme
- **Bug Reports:** https://github.com/PollyTikhonova/ENATool/issues