# pmultiqc
## What is pmultiqc?
pmultiqc is a MultiQC plugin for comprehensive quality control reporting of proteomics data. It generates interactive HTML reports with visualizations and metrics to help you assess the quality of your mass spectrometry-based proteomics experiments.
### Key Features
- Works with multiple proteomics data formats and analysis pipelines
- Generates interactive HTML reports with visualizations
- Provides comprehensive QC metrics for MS data
- Supports different quantification methods (LFQ, TMT, DIA)
- Integrates with the MultiQC framework
## Supported Data Sources
pmultiqc supports the following data sources:
1. **[quantms pipeline](https://github.com/nf-core/quantms)** output files:
   - `experimental_design.tsv`: Experimental design file
   - `*.mzTab`: Results of the identification
   - `*msstats*.csv`: MSstats/MSstatsTMT input files
   - `*.mzML`: Spectra files
   - `*ms_info.tsv`: MS quality control information
   - `*.idXML`: Identification results
   - `*.yml`: Pipeline parameters (optional)
   - `diann_report.tsv` or `diann_report.parquet`: DIA-NN main report (DIA analysis only)
2. **[MaxQuant](https://www.maxquant.org)** result files:
   - `parameters.txt`: Analysis parameters
   - `proteinGroups.txt`: Protein identification results
   - `summary.txt`: Summary statistics
   - `evidence.txt`: Peptide evidence
   - `msms.txt`: MS/MS scan information
   - `msmsScans.txt`: MS/MS scan details
3. **[DIA-NN](https://aptila.bio)** result files:
   - `*ms_info.parquet`: mzML statistics after Raw-to-mzML conversion (using **[quantms-utils](https://github.com/bigbio/quantms-utils)**)
   - `report.tsv` or `report.parquet`: DIA-NN main report
4. **[ProteoBench](https://proteobench.readthedocs.io)** file:
   - `result_performance.csv`: ProteoBench result file
5. **mzIdentML** files:
   - `*.mzid`: Identification results
   - `*.mzML` or `*.mgf`: Corresponding spectra files
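Before running a report, it can help to check which of the file patterns above are actually present in a results directory. The following is a minimal sketch of such a check. `SOURCE_PATTERNS` and `detect_sources` are hypothetical names introduced here for illustration and are not part of pmultiqc, whose own file discovery may work differently.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Hypothetical mapping from data source to some of the file patterns listed above.
# Illustration only: this is not pmultiqc's internal file discovery.
SOURCE_PATTERNS = {
    "quantms": ["*.mzTab", "*msstats*.csv", "*ms_info.tsv", "*.idXML"],
    "MaxQuant": ["proteinGroups.txt", "evidence.txt", "msms.txt", "summary.txt"],
    "DIA-NN": ["report.tsv", "report.parquet", "*ms_info.parquet"],
    "ProteoBench": ["result_performance.csv"],
    "mzIdentML": ["*.mzid"],
}


def detect_sources(analysis_dir):
    """Return {source: sorted matching file names} for files under analysis_dir."""
    # Collect unique file names from the directory tree (case-sensitive matching).
    names = {p.name for p in Path(analysis_dir).rglob("*") if p.is_file()}
    found = {}
    for source, patterns in SOURCE_PATTERNS.items():
        matches = sorted(
            name for name in names
            if any(fnmatchcase(name, pattern) for pattern in patterns)
        )
        if matches:
            found[source] = matches
    return found


if __name__ == "__main__":
    for source, files in detect_sources(".").items():
        print(f"{source}: {', '.join(files)}")
```

An empty result suggests the directory does not contain any of the files listed above, which is worth checking before generating a report.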
## Installation
### Install from PyPI
```bash
# To install the stable release from PyPI:
pip install pmultiqc
```
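The package metadata for release 0.0.30 declares a Python requirement of `>=3.10,<3.13`. If `pip install` fails or the plugin does not load, a quick check like the sketch below can confirm the interpreter version and whether the package is visible to it. The helper names `python_supported` and `installed_version` are illustrative, and the version bounds are taken from that one release, so they may differ for other releases.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def python_supported(version_info=sys.version_info):
    """True when the interpreter satisfies >=3.10,<3.13 (bounds from release 0.0.30)."""
    return (3, 10) <= tuple(version_info[:2]) < (3, 13)


def installed_version(package="pmultiqc"):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_supported())
    print("pmultiqc version:", installed_version())
```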
### Install from Source (Without PyPI)
```bash
# Fork the repository on GitHub
# Clone the repository
git clone https://github.com/your-username/pmultiqc.git
cd pmultiqc
# Install the package locally
pip install .
# Now you can run pmultiqc on your own dataset
```
## Usage
pmultiqc is used as a plugin for MultiQC. After installation, you can run it using the MultiQC command-line interface.
### Basic Usage
```bash
multiqc {analysis_dir} -o {output_dir}
```
Where:
- `{analysis_dir}` is the directory containing your proteomics data files
- `{output_dir}` is the directory where you want to save the report
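When reports are generated from a script rather than an interactive shell, the same command can be assembled and launched from Python. This is a minimal sketch that assumes the `multiqc` executable is on the `PATH`. `build_multiqc_command` and `run_multiqc` are hypothetical helper names, not functions provided by pmultiqc or MultiQC.

```python
import subprocess


def build_multiqc_command(analysis_dir, output_dir, extra_flags=()):
    """Assemble the argument list for: multiqc {analysis_dir} -o {output_dir} [flags]."""
    return ["multiqc", str(analysis_dir), "-o", str(output_dir), *extra_flags]


def run_multiqc(analysis_dir, output_dir, extra_flags=()):
    """Run MultiQC and raise CalledProcessError if it exits with a non-zero status."""
    cmd = build_multiqc_command(analysis_dir, output_dir, extra_flags)
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Prints the command instead of running it, so the sketch is safe to execute.
    print(" ".join(build_multiqc_command("/path/to/results", "./report")))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.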
### Examples
#### For quantms pipeline results
```bash
# Basic usage
multiqc /path/to/quantms/results -o ./report
# With specific options
multiqc /path/to/quantms/results -o ./report --remove_decoy --condition factor
```
#### For MaxQuant results
```bash
multiqc --parse_maxquant /path/to/maxquant/results -o ./report
```
#### For DIA-NN results
```bash
multiqc /path/to/diann/results -o ./report
```
#### For ProteoBench files
```bash
multiqc --parse_proteobench /path/to/proteobench/files -o ./report
```
#### For mzIdentML files
```bash
multiqc --mzid_plugin /path/to/mzid/files -o ./report
```
### Command-line Options
| Option | Description | Default |
|--------|-------------|---------|
| `--raw` | Keep filenames in experimental design output as raw | `False` |
| `--condition` | Create conditions from provided columns | - |
| `--remove_decoy` | Remove decoy peptides when counting | `True` |
| `--decoy_affix` | Prefix or suffix that marks decoy proteins in their accession | `DECOY_` |
| `--contaminant_affix` | Prefix or suffix that marks contaminant proteins | `CONT` |
| `--affix_type` | Location of the decoy marker (prefix or suffix) | `prefix` |
| `--disable_plugin` | Disable pmultiqc plugin | `False` |
| `--quantification_method` | Quantification method for LFQ experiment | `feature_intensity` |
| `--disable_table` | Disable protein/peptide table plots for large datasets | `False` |
| `--ignored_idxml` | Ignore idXML files for faster processing | `False` |
| `--parse_maxquant` | Generate reports based on MaxQuant results | `False` |
| `--parse_proteobench` | Generate reports based on ProteoBench results | `False` |
| `--mzid_plugin` | Generate reports based on mzIdentML files | `False` |
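The `--decoy_affix`, `--contaminant_affix`, and `--affix_type` options work together: the affix is a marker string and the affix type says where in the accession it is expected. The sketch below shows one plausible reading of that behaviour using the defaults from the table. `has_affix` is a hypothetical helper, and the matching that pmultiqc performs internally may differ (for example, in case handling).

```python
def has_affix(accession, affix, affix_type="prefix"):
    """Check whether a protein accession carries the given marker.

    Assumed semantics (not taken from pmultiqc's source):
    'prefix' means the accession starts with the affix,
    'suffix' means it ends with the affix.
    """
    if affix_type == "prefix":
        return accession.startswith(affix)
    if affix_type == "suffix":
        return accession.endswith(affix)
    raise ValueError(f"affix_type must be 'prefix' or 'suffix', got {affix_type!r}")


if __name__ == "__main__":
    # Example accessions are made up for illustration.
    for acc in ["DECOY_sp|P12345|EXAMPLE", "sp|P12345|EXAMPLE", "CONT_P00000"]:
        print(acc, has_affix(acc, "DECOY_"), has_affix(acc, "CONT"))
```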
## QC Metrics and Visualizations
pmultiqc generates a comprehensive report with multiple sections:
### General Report
- **Experimental Design**: Overview of the dataset structure
- **Pipeline Performance Overview**: Key metrics including:
  - Contaminants Score
  - Peptide Intensity
  - Charge Score
  - Missed Cleavages
  - ID rate over RT
  - MS2 OverSampling
  - Peptide Missing Value
- **Summary Table**: Spectra counts, identification rates, peptide and protein counts
- **MS1 Information**: Quality metrics at MS1 level
- **Pipeline Results Statistics**: Overall identification results
- **Number of Peptides per Protein**: Distribution of peptide counts per protein
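To make one of the metrics above concrete, the sketch below counts missed cleavages in a peptide sequence. It assumes a tryptic digest with the common rule that trypsin cleaves after K or R unless the next residue is P. This is an illustration of the concept only: `missed_cleavages` is a hypothetical function, and pmultiqc may derive the metric from values reported by the search engine rather than computing it this way.

```python
def missed_cleavages(peptide):
    """Count internal tryptic cleavage sites that were not cleaved.

    Assumed rule: cleavage after K or R, except when followed by P.
    The C-terminal residue is not counted because it ends the peptide.
    """
    peptide = peptide.upper()
    return sum(
        1
        for i in range(len(peptide) - 1)
        if peptide[i] in "KR" and peptide[i + 1] != "P"
    )


if __name__ == "__main__":
    for seq in ["PEPTIDEK", "PEPKTIDER", "AKPR"]:
        print(seq, missed_cleavages(seq))
```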
### Results Tables
- **Peptide Table**: First 500 peptides in the dataset
- **PSM Table**: First 500 PSMs (Peptide-Spectrum Matches)
### Identification Statistics
- **Spectra Tracking**: Summary of identification results by file
- **Search Engine Scores**: Distribution of search engine scores
- **Precursor Charges Distribution**: Distribution of precursor ion charges
- **Number of Peaks per MS/MS Spectrum**: Peak count distribution
- **Peak Intensity Distribution**: MS2 peak intensity distribution
- **Oversampling Distribution**: Analysis of MS2 oversampling
- **Delta Mass**: Mass accuracy distribution
- **Peptide/Protein Quantification Tables**: Quantitative levels across conditions
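The Delta Mass plot summarizes how far observed masses deviate from theoretical ones. As a rough sketch of the underlying arithmetic, the functions below compute that deviation in daltons and in parts per million. Both function names are hypothetical, and the unit and exact definition used in the pmultiqc report are not stated here, so treat this as background rather than a description of the plot.

```python
def delta_mass_da(observed, theoretical):
    """Mass error in daltons: observed minus theoretical."""
    return observed - theoretical


def delta_mass_ppm(observed, theoretical):
    """Mass error relative to the theoretical mass, in parts per million."""
    if theoretical == 0:
        raise ValueError("theoretical mass must be non-zero")
    return (observed - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    # Made-up values for illustration.
    print(delta_mass_da(1000.001, 1000.0))
    print(delta_mass_ppm(1000.001, 1000.0))
```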
## Example Reports
You can find example reports on the [docs page](https://bigbio.github.io/pmultiqc).
## Development
To contribute to pmultiqc:
1. Fork the repository
2. Clone your fork: `git clone https://github.com/YOUR-USERNAME/pmultiqc`
3. Create a feature branch: `git checkout -b new-feature`
4. Make your changes
5. Install in development mode: `pip install -e .`
6. Test your changes: `cd tests && multiqc resources/LFQ -o ./`
7. Commit your changes: `git commit -am 'Add new feature'`
8. Push to the branch: `git push origin new-feature`
9. Submit a pull request
## License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/bigbio/pmultiqc/blob/main/LICENSE) file in the repository for the full terms.
## Citation
If you use pmultiqc in your research, please cite:
```
pmultiqc: A MultiQC plugin for proteomics quality control
https://github.com/bigbio/pmultiqc
```