pyquetmsMS

Name	pyquetmsMS
Version	0.1.1
home_page	None
Summary	Memory-efficient mzML to Parquet converter for mass spectrometry files
upload_time	2025-09-01 00:19:35
maintainer	None
docs_url	None
author	None
requires_python	>=3.8
license	MIT
keywords	mass spectrometry mzml parquet proteomics metabolomics
requirements	No requirements were recorded.

# pyquetms

Memory-efficient mzML to Parquet converter for mass spectrometry files.

## Overview

pyquetms converts mzML files to Parquet in a streaming fashion, so memory usage stays low and large mass spectrometry datasets can be processed without running out of memory. It began as a side project inspired by GSoC '25 with OpenMS. The goal is a simple CLI for converting .mzML files to .parquet files, which is especially important in big data projects (e.g., machine learning).

## Installation

### From PyPI

```bash
pip install pyquetms
```

### From source

```bash
git clone https://github.com/Avni2000/pyquetms.git
cd pyquetms
pip install .
```

### Development installation

```bash
git clone https://github.com/Avni2000/pyquetms.git
cd pyquetms
pip install -e ".[dev]"
```

## Usage

### CLI

Basic conversion:
```bash
pyquetms input.mzML
```
or
```bash
pyquetms ~/Downloads/input.mzML
```

Specify the output file (by default, output goes to the working directory):
```bash
pyquetms input.mzML -o output.parquet
```

Customize the batch size and compression. I recommend:
```bash
pyquetms input.mzML --batch-size 5000 --compression gzip
```
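
The README does not describe how `--batch-size` is applied internally. As a rough illustration of why batching keeps memory bounded, here is a generic sketch. It assumes the batch size is the number of spectra buffered before each write, which is a guess rather than documented behavior. `batched` and `fake_spectra` are hypothetical names for illustration, not part of pyquetms.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most batch_size items without materializing the input."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls only the next batch_size items from the stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def fake_spectra(count: int) -> Iterator[dict]:
    """Hypothetical stand-in for a streaming mzML reader."""
    for index in range(count):
        yield {"time": float(index), "mz": 100.0 + index, "intensity": 1.0}


if __name__ == "__main__":
    # Only one batch is held in memory at a time. A real converter would
    # write each batch to the Parquet file here instead of printing its size.
    for batch in batched(fake_spectra(12), batch_size=5):
        print(len(batch))
```

Under this assumption, a larger batch size means fewer writes but more spectra held in memory at once.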

Get file information without converting:
```bash
pyquetms input.mzML --info
```
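
To convert a whole directory, the CLI calls above can be wrapped in a short Python script. This is a minimal sketch, not something the package ships: `build_command` and `convert_directory` are hypothetical helpers, and it assumes that `-o`, `--batch-size`, and `--compression` can be combined in one invocation (the README only shows them in separate commands).

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(mzml_path: Path, output_dir: Path,
                  batch_size: int = 5000, compression: str = "gzip") -> List[str]:
    """Build a pyquetms command line using only the flags shown in this README."""
    # Name the output after the input file, e.g. sample.mzML -> sample.parquet.
    output_path = output_dir / (mzml_path.stem + ".parquet")
    return [
        "pyquetms", str(mzml_path),
        "-o", str(output_path),
        "--batch-size", str(batch_size),
        "--compression", compression,
    ]


def convert_directory(input_dir: Path, output_dir: Path) -> None:
    """Convert every .mzML file in input_dir, stopping on the first failure."""
    output_dir.mkdir(parents=True, exist_ok=True)
    for mzml_path in sorted(input_dir.glob("*.mzML")):
        # check=True raises CalledProcessError if pyquetms exits with an error.
        subprocess.run(build_command(mzml_path, output_dir), check=True)
```

For example, `convert_directory(Path("raw"), Path("converted"))` would run one `pyquetms` process per file in `raw/`, so each file is converted independently of the others.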

## Output Format

The columns in the converted Parquet files vary slightly depending on the type of mzML file.
Some columns may be blank, which is perfectly okay! It doesn't mean your mzML file is wrong.
The main expected values are time, m/z, and intensity.

## Contributions

It's quite a small project, so feel free to make a PR or open an issue!

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "pyquetmsMS",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "mass spectrometry, mzML, parquet, proteomics, metabolomics",
    "author": null,
    "author_email": "Avni Badiwale <avnibadiwale@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/19/25/46f14d75802d272366600c216b8bad60d6eaa6eba812dc7559adb29cdcd9/pyquetmsms-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Memory-efficient mzML to Parquet converter for mass spectrometry files",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/Avni2000/pyquetms"
    },
    "split_keywords": [
        "mass spectrometry",
        " mzml",
        " parquet",
        " proteomics",
        " metabolomics"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c8295a3bccbb278f45108ef55d768db76b53514403a33df8c436819ee40b47ec",
                "md5": "a43752007b12b7e6908f23cbcc433265",
                "sha256": "c5e39a1fd753ce8aecc3c847bf8d37a078a53fbb270424d4902cc8c6e6ae2666"
            },
            "downloads": -1,
            "filename": "pyquetmsms-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a43752007b12b7e6908f23cbcc433265",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 2369,
            "upload_time": "2025-09-01T00:19:34",
            "upload_time_iso_8601": "2025-09-01T00:19:34.243063Z",
            "url": "https://files.pythonhosted.org/packages/c8/29/5a3bccbb278f45108ef55d768db76b53514403a33df8c436819ee40b47ec/pyquetmsms-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "192546f14d75802d272366600c216b8bad60d6eaa6eba812dc7559adb29cdcd9",
                "md5": "8175613beb8446532dece9e681fb8c25",
                "sha256": "4812512f0605675a881834e968ce31dbbf445d8e634aadf1666bf448d3ecd389"
            },
            "downloads": -1,
            "filename": "pyquetmsms-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "8175613beb8446532dece9e681fb8c25",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 2817,
            "upload_time": "2025-09-01T00:19:35",
            "upload_time_iso_8601": "2025-09-01T00:19:35.657240Z",
            "url": "https://files.pythonhosted.org/packages/19/25/46f14d75802d272366600c216b8bad60d6eaa6eba812dc7559adb29cdcd9/pyquetmsms-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-01 00:19:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Avni2000",
    "github_project": "pyquetms",
    "github_not_found": true,
    "lcname": "pyquetmsms"
}
