# MaldiAMRKit

- **Version**: 0.3.1
- **Summary**: Toolkit to read and preprocess MALDI-TOF mass-spectra for AMR analyses.
- **Upload time**: 2025-10-20 21:12:11
- **Requires Python**: >=3.9
- **Keywords**: maldi, mass-spectrometry, machine-learning, scikit-learn

[![PyPI Version](https://img.shields.io/pypi/v/maldiamrkit?cacheSeconds=300)](https://pypi.org/project/maldiamrkit/)
[![PyPI Downloads](https://static.pepy.tech/badge/maldiamrkit)](https://pepy.tech/projects/maldiamrkit)
[![License](https://img.shields.io/github/license/EttoreRocchi/MaldiAMRKit)](https://github.com/EttoreRocchi/MaldiAMRKit/blob/main/LICENSE)

<p align="center">
  <img src="docs/maldiamrkit.png" alt="MaldiAMRKit" width="250"/>
</p>

<p align="center">
  <strong>A toolkit for preprocessing MALDI-TOF mass spectrometry data for antimicrobial resistance (AMR) prediction</strong>
</p>

<p align="center">
  <a href="#installation">Installation</a> •
  <a href="#features">Features</a> •
  <a href="#quick-start">Quick Start</a> •
  <a href="#license">License</a> •
  <a href="#contributing">Contributing</a>
</p>

## Installation

```bash
pip install maldiamrkit
```

## Features

- **📊 Spectrum Processing**: Load, smooth, baseline-correct, and normalize MALDI-TOF spectra
- **📦 Dataset Management**: Process multiple spectra with metadata integration
- **🔍 Peak Detection**: Automated peak finding with customizable parameters
- **📈 Spectral Alignment (Warping)**: Multiple alignment methods (shift, linear, piecewise, DTW)
- **🤖 ML-Ready**: Direct integration with scikit-learn pipelines

## Quick Start

### Load and Preprocess a Single Spectrum

```python
from maldiamrkit.spectrum import MaldiSpectrum

# Load spectrum from file
spec = MaldiSpectrum("data/spectrum.txt")

# Preprocess: smoothing, baseline removal, normalization
spec.preprocess()

# Optional: bin to reduce dimensions
spec.bin(bin_width=3)  # 3 Da bins

# Visualize
spec.plot(binned=True)
```
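
To make the normalization and binning steps more concrete, here is a minimal NumPy sketch of total-ion-current (TIC) normalization followed by fixed-width binning. It illustrates the general idea only and is not MaldiAMRKit's implementation: the helper names `tic_normalize` and `bin_spectrum`, the 2000 to 20000 Da range, and the choice to sum intensities within each bin are all assumptions made for this example.

```python
import numpy as np

def tic_normalize(intensity):
    """Scale intensities so they sum to 1 (hypothetical helper)."""
    total = intensity.sum()
    return intensity / total if total > 0 else intensity

def bin_spectrum(mz, intensity, bin_width=3, mz_min=2000, mz_max=20000):
    """Sum intensities into fixed-width m/z bins (hypothetical helper)."""
    edges = np.arange(mz_min, mz_max + bin_width, bin_width)
    binned, _ = np.histogram(mz, bins=edges, weights=intensity)
    return edges[:-1], binned

# Synthetic spectrum: two Gaussian peaks sampled on a 0.5 Da grid
mz = np.arange(2000, 20000, 0.5)
intensity = (
    np.exp(-0.5 * ((mz - 5000) / 5) ** 2)
    + 0.5 * np.exp(-0.5 * ((mz - 9000) / 5) ** 2)
)

bin_starts, binned = bin_spectrum(mz, tic_normalize(intensity), bin_width=3)
print(binned.shape)                 # one value per 3 Da bin
print(bin_starts[binned.argmax()])  # start of the bin holding the tallest peak
```

Binning trades m/z resolution for a shorter, fixed-length feature vector, which is what most scikit-learn estimators expect.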

### Build a Dataset from Multiple Spectra

```python
from maldiamrkit.dataset import MaldiSet

# Load multiple spectra with metadata
data = MaldiSet.from_directory(
    spectra_dir="data/spectra/",
    metadata_path="data/metadata.csv",
    aggregate_by={"antibiotic": "Drug", "species": "Species"},
    bin_width=3
)

# Access features and labels
X = data.X  # Feature matrix
y = data.y["Drug"]  # Target labels
```

### Machine Learning Pipeline

```python
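from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from maldiamrkit.peak_detector import MaldiPeakDetector

# Hold out a test set from the X and y built in the previous section
# (the 80/20 split is an illustrative choice)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create ML pipeline
pipe = Pipeline([
    ("peaks", MaldiPeakDetector(binary=False, prominence=0.05)),
    ("scaler", StandardScaler()),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42))
])

# Train and predict
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)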
```
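
The `prominence` and `binary` arguments passed to `MaldiPeakDetector` above suggest prominence-based peak picking. As a rough illustration of that idea, and not a description of how `MaldiPeakDetector` works internally, the sketch below applies `scipy.signal.find_peaks` to a single intensity vector. The helper `peak_features` and the toy values are hypothetical.

```python
import numpy as np
from scipy.signal import find_peaks

def peak_features(intensity, prominence=0.05, binary=False):
    """Keep only prominent peaks; zero out everything else (hypothetical helper)."""
    intensity = np.asarray(intensity, dtype=float)
    peaks, _ = find_peaks(intensity, prominence=prominence)
    out = np.zeros_like(intensity)
    # binary=True marks peak positions with 1; otherwise keep the peak height
    out[peaks] = 1.0 if binary else intensity[peaks]
    return out

# Toy vector: two clear peaks and one small bump below the prominence threshold
x = [0.0, 1.0, 0.0, 0.02, 0.0, 0.5, 0.0]
print(peak_features(x).tolist())               # [0.0, 1.0, 0.0, 0.0, 0.0, 0.5, 0.0]
print(peak_features(x, binary=True).tolist())  # [0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
```

A higher `prominence` discards more low-intensity bumps, so the resulting feature vector becomes sparser.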

### Spectral Alignment (Warping)

Align spectra to correct for mass calibration drift:

```python
from maldiamrkit.warping import Warping

# Create warping transformer with shift method
warper = Warping(
    method='shift',  # or 'linear', 'piecewise', 'dtw'
    reference='median',  # use median spectrum as reference
    max_shift=50
)

# Fit on training data and transform
warper.fit(X_train)
X_aligned = warper.transform(X_test)

# Visualize alignment results
fig, axes = warper.plot_alignment(
    X_original=X_test,
    X_aligned=X_aligned,
    indices=[0, 5, 10],  # plot multiple spectra
    xlim=(2000, 10000),  # zoom to m/z range
    show_peaks=True
)
```

**Alignment Methods:**
- `shift`: Global median shift (fast, simple)
- `linear`: Least-squares linear transformation
- `piecewise`: Local shifts across spectrum segments (most flexible)
- `dtw`: Dynamic Time Warping (best for non-linear drift)
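
To give a feel for what a global shift correction does, the following NumPy sketch estimates a single integer lag by cross-correlating a spectrum with a reference, then applies it. This is a simplified stand-in and may differ from the "global median shift" that `Warping` computes; the helpers `estimate_shift` and `apply_shift` and the toy data are assumptions for illustration.

```python
import numpy as np

def estimate_shift(spectrum, reference, max_shift=50):
    """Return the lag (in bins) that best aligns `spectrum` to `reference`."""
    lags = np.arange(-max_shift, max_shift + 1)
    scores = [np.dot(np.roll(spectrum, lag), reference) for lag in lags]
    return int(lags[int(np.argmax(scores))])

def apply_shift(spectrum, lag):
    """Shift by `lag` bins, zero-filling the bins that would wrap around."""
    shifted = np.roll(spectrum, lag)
    if lag > 0:
        shifted[:lag] = 0.0
    elif lag < 0:
        shifted[lag:] = 0.0
    return shifted

# Toy data: the reference peak sits at bin 100, the drifted copy at bin 107
bins = np.arange(500)
reference = np.exp(-0.5 * ((bins - 100) / 3) ** 2)
drifted = np.exp(-0.5 * ((bins - 107) / 3) ** 2)

lag = estimate_shift(drifted, reference, max_shift=50)
aligned = apply_shift(drifted, lag)
print(lag)                    # -7
print(int(aligned.argmax()))  # 100
```

A single lag can only correct a uniform offset. Drift that varies along the m/z axis calls for the more flexible methods listed above.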


For further details, see the [quick guide notebook](docs/quick_guide.ipynb).

## Contributing

Pull requests, bug reports, and feature ideas are welcome. Feel free to open an issue or a PR on [GitHub](https://github.com/EttoreRocchi/MaldiAMRKit).

## License

This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.

## Acknowledgements

This toolkit is inspired by and builds upon the methodology described in:

> **Weis, C., Cuénod, A., Rieck, B., et al.** (2022). *Direct antimicrobial resistance prediction from clinical MALDI-TOF mass spectra using machine learning*. **Nature Medicine**, 28, 164–174. [https://doi.org/10.1038/s41591-021-01619-9](https://doi.org/10.1038/s41591-021-01619-9)

Please consider citing this work if you find `MaldiAMRKit` useful.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "MaldiAMRKit",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "MALDI, mass-spectrometry, machine-learning, scikit-learn",
    "author": null,
    "author_email": "Ettore Rocchi <ettoreroc@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/c9/d1/56d01e016eeb4fb6bd533ce93b9e4b84c731d82b5a4224914901a7dc0f90/maldiamrkit-0.3.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Toolkit to read and preprocess MALDI-TOF mass-spectra for AMR analyses.",
    "version": "0.3.1",
    "project_urls": {
        "Documentation": "https://github.com/EttoreRocchi/MaldiAMRKit#readme",
        "Homepage": "https://github.com/EttoreRocchi/MaldiAMRKit",
        "Issues": "https://github.com/EttoreRocchi/MaldiAMRKit/issues",
        "Source": "https://github.com/EttoreRocchi/MaldiAMRKit"
    },
    "split_keywords": [
        "maldi",
        " mass-spectrometry",
        " machine-learning",
        " scikit-learn"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "60bf9b652d5ffb12217664f38bee9da5ebbdacf5f4b9400b2ab4afe1724f0005",
                "md5": "af74ebd3ddc3ad03d8d9a1455751f027",
                "sha256": "82429ade033d27a258b4fcec550b7c69fff65de97f0313ae682248a446f03ae3"
            },
            "downloads": -1,
            "filename": "maldiamrkit-0.3.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "af74ebd3ddc3ad03d8d9a1455751f027",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 23349,
            "upload_time": "2025-10-20T21:12:10",
            "upload_time_iso_8601": "2025-10-20T21:12:10.111814Z",
            "url": "https://files.pythonhosted.org/packages/60/bf/9b652d5ffb12217664f38bee9da5ebbdacf5f4b9400b2ab4afe1724f0005/maldiamrkit-0.3.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c9d156d01e016eeb4fb6bd533ce93b9e4b84c731d82b5a4224914901a7dc0f90",
                "md5": "8232d6ff53f88adc6b9e47adf643bb87",
                "sha256": "47c17adc20a40b51fcaf3c80b520e0666f624a3ec674e0075f53be303c6a41ad"
            },
            "downloads": -1,
            "filename": "maldiamrkit-0.3.1.tar.gz",
            "has_sig": false,
            "md5_digest": "8232d6ff53f88adc6b9e47adf643bb87",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 19947,
            "upload_time": "2025-10-20T21:12:11",
            "upload_time_iso_8601": "2025-10-20T21:12:11.244893Z",
            "url": "https://files.pythonhosted.org/packages/c9/d1/56d01e016eeb4fb6bd533ce93b9e4b84c731d82b5a4224914901a7dc0f90/maldiamrkit-0.3.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-20 21:12:11",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "EttoreRocchi",
    "github_project": "MaldiAMRKit#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "maldiamrkit"
}
        