ms-file-reader


Name: ms-file-reader
Version: 0.1.1
Summary: Reads a few common file formats for processed mass spectra and generates consistent objects from them.
Upload time: 2025-07-30 19:36:23
Author: Gregory Janesch
Requires Python: >=3.9
Keywords: mass spectrometry, jcamp, massbank, msp
[![linting: pylint](https://img.shields.io/badge/linting-pylint-yellowgreen)](https://github.com/pylint-dev/pylint)

This is a small library intended to read different types of mass spectrometry files, store them as (somewhat) standardized objects, and run some exploratory checks on the spectra as a whole.  The focus is on open, text-based file formats for already-processed spectra.  Functionality has been added for reading MSP, JCAMP-DX, and MassBank EU-styled files.

I wrote this because of a project that grabbed collections of mass spectra from a variety of sources.  A number of these sources had inconsistencies within their libraries: sometimes a field doesn't appear in all spectra, sometimes the field always appears but holds some sort of null value, and so on.  Some exploration of the data was usually necessary, and much of it was repetitive.  This library was written to streamline some of that work for anyone else in the same position.
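
As a rough illustration of the two kinds of inconsistency described above, the sketch below tallies, for each field, how many records lack it and how many carry a null-like value.  It works on plain dictionaries and does not use this library's objects or API; the `field_report` name, the `NULL_LIKE` set, and the sample records are all assumptions made for the example.

```python
from collections import Counter

# Assumed placeholders that a source might use for "no value".
NULL_LIKE = {"", "n/a", "na", "none", "null", "-"}


def field_report(records):
    """Count missing and null-like values per field across records.

    `records` is a list of dicts (field name -> string value).
    Returns {field: {"missing": int, "null_like": int}}.
    """
    # Collect every field name seen in any record.
    all_fields = set()
    for record in records:
        all_fields.update(record)

    missing = Counter()
    null_like = Counter()
    for record in records:
        for field in all_fields:
            if field not in record:
                # The field is absent from this record entirely.
                missing[field] += 1
            elif str(record[field]).strip().lower() in NULL_LIKE:
                # The field is present but holds a placeholder value.
                null_like[field] += 1

    return {
        field: {"missing": missing[field], "null_like": null_like[field]}
        for field in sorted(all_fields)
    }


# Hypothetical records, standing in for metadata pulled from a library.
records = [
    {"Name": "Caffeine", "Formula": "C8H10N4O2", "CAS": "58-08-2"},
    {"Name": "Unknown 1", "Formula": "N/A"},
    {"Name": "Ethanol", "Formula": "C2H6O", "CAS": ""},
]
report = field_report(records)
```

Separating "missing" from "null-like" matters because the two cases usually need different handling: an absent field can be filled with a default, while a placeholder value first has to be recognized as one.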

NumPy is the only dependency for this library.

Feedback on other real-world edge cases is welcome.

# Usage

Install from PyPI:

```
pip install ms-file-reader
```

To import the individual readers:

```
from ms_file_reader.jcamp import JCAMPFileReader
from ms_file_reader.massbank import MassBankFileReader
from ms_file_reader.msp import MSPFileReader
```

The individual readers, mostly the ones for JCAMP-DX and MSP, come with options for handling files that deviate from the standard format; see the docstrings for argument details.  Processing is done by the `process_file()` method of each class.  The method acts on text rather than on file handles or paths, so the content of a file has to be read in first.

A basic example:

```
from ms_file_reader.msp import MSPFileReader
with open("test.msp", "r", encoding="utf-8") as f:
    file_text = f.read()

reader = MSPFileReader(keep_empty_fields=False, max_intensity=100)
spectrum_library = reader.process_file(file_text)
```
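
Because `process_file()` takes text rather than a path, handling several files means doing the file reading yourself.  A minimal sketch of that pattern follows; `process_directory` is a hypothetical helper, not part of this library, and it assumes only that the reader object exposes `process_file(text)` as described above.

```python
from pathlib import Path


def process_directory(reader, directory, pattern="*.msp", encoding="utf-8"):
    """Read every matching file as text and pass it to reader.process_file().

    Hypothetical helper, not part of ms-file-reader.  `reader` can be any
    object with a process_file(text) method, e.g. an MSPFileReader.
    Returns a dict mapping file name -> whatever process_file() returned.
    """
    results = {}
    # Sort the paths so the processing order is deterministic.
    for path in sorted(Path(directory).glob(pattern)):
        # process_file() works on text, so read the file content first.
        file_text = path.read_text(encoding=encoding)
        results[path.name] = reader.process_file(file_text)
    return results
```

With the `reader` from the basic example, `process_directory(reader, "spectra")` would return one result per `.msp` file in that directory; the directory name is a placeholder.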

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "ms-file-reader",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "mass spectrometry, jcamp, massbank, msp",
    "author": "Gregory Janesch",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/34/fc/88f1da30d5e513c78220c8174ff5305019e53d5d330309c5ee5dbe740ff7/ms_file_reader-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Reads a few common file formats for processed mass spectra and generates consistent objects from them.",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/gjanesch/ms-file-reader",
        "Issues": "https://github.com/gjanesch/ms-file-reader/issues"
    },
    "split_keywords": [
        "mass spectrometry",
        " jcamp",
        " massbank",
        " msp"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "74055b9ff6c495bdb163aa30a8a5e6f29179d48ba9bdb28fb2d2d4e227d9493d",
                "md5": "6f6dd9a3fc079acb0f41e79eba6ca8b2",
                "sha256": "f78f5ad45d27e7a0c77821e053800f93ec764fce25a43a0e328f5e0c58699021"
            },
            "downloads": -1,
            "filename": "ms_file_reader-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6f6dd9a3fc079acb0f41e79eba6ca8b2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 11590,
            "upload_time": "2025-07-30T19:36:22",
            "upload_time_iso_8601": "2025-07-30T19:36:22.158014Z",
            "url": "https://files.pythonhosted.org/packages/74/05/5b9ff6c495bdb163aa30a8a5e6f29179d48ba9bdb28fb2d2d4e227d9493d/ms_file_reader-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "34fc88f1da30d5e513c78220c8174ff5305019e53d5d330309c5ee5dbe740ff7",
                "md5": "af8873899c5824640f0f676041ea0ffa",
                "sha256": "80f1219f7af955795a5e8bc5d8f77c335bbb8baf155a8c90500e94f78e5f14d5"
            },
            "downloads": -1,
            "filename": "ms_file_reader-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "af8873899c5824640f0f676041ea0ffa",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 11558,
            "upload_time": "2025-07-30T19:36:23",
            "upload_time_iso_8601": "2025-07-30T19:36:23.260214Z",
            "url": "https://files.pythonhosted.org/packages/34/fc/88f1da30d5e513c78220c8174ff5305019e53d5d330309c5ee5dbe740ff7/ms_file_reader-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-30 19:36:23",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "gjanesch",
    "github_project": "ms-file-reader",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "ms-file-reader"
}
        