# MaldiAMRKit
[PyPI](https://pypi.org/project/maldiamrkit/)
[Downloads](https://pepy.tech/projects/maldiamrkit)
[License](https://github.com/EttoreRocchi/MaldiAMRKit/blob/main/LICENSE)
<p align="center">
<img src="docs/maldiamrkit.png" alt="MaldiAMRKit" width="250"/>
</p>
<p align="center">
<strong>A toolkit for preprocessing MALDI-TOF mass spectrometry data for antimicrobial resistance (AMR) prediction</strong>
</p>
<p align="center">
<a href="#installation">Installation</a> •
<a href="#features">Features</a> •
<a href="#quick-start">Quick Start</a> •
<a href="#license">License</a> •
<a href="#contributing">Contributing</a>
</p>
## Installation
```bash
pip install maldiamrkit
```
## Features
- **📊 Spectrum Processing**: Load, smooth, baseline correct, and normalize MALDI-TOF spectra
- **📦 Dataset Management**: Process multiple spectra with metadata integration
- **🔍 Peak Detection**: Automated peak finding with customizable parameters
- **📈 Spectral Alignment (Warping)**: Multiple alignment methods (shift, linear, piecewise, DTW)
- **🤖 ML-Ready**: Direct integration with scikit-learn pipelines
## Quick Start
### Load and Preprocess a Single Spectrum
```python
from maldiamrkit.spectrum import MaldiSpectrum
# Load spectrum from file
spec = MaldiSpectrum("data/spectrum.txt")
# Preprocess: smoothing, baseline removal, normalization
spec.preprocess()
# Optional: bin to reduce dimensions
spec.bin(bin_width=3) # 3 Da bins
# Visualize
spec.plot(binned=True)
```
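To make the three preprocessing stages concrete, here is a minimal sketch written with NumPy and SciPy only. It is not the code behind `MaldiSpectrum.preprocess()`; the function name `preprocess_sketch`, the window sizes, the Savitzky-Golay smoother, the morphological-opening baseline, and the total-ion-current normalization are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.ndimage import maximum_filter1d, minimum_filter1d
from scipy.signal import savgol_filter


def preprocess_sketch(intensity, smooth_window=21, baseline_window=201):
    """Hypothetical helper: smooth, remove baseline, normalize to unit sum."""
    y = np.asarray(intensity, dtype=float)

    # 1. Smoothing: Savitzky-Golay filter (cubic polynomial, odd window).
    y = savgol_filter(y, window_length=smooth_window, polyorder=3)

    # 2. Baseline removal: morphological opening (rolling min, then rolling max)
    #    tracks the slowly varying background but not narrow peaks.
    baseline = maximum_filter1d(
        minimum_filter1d(y, size=baseline_window), size=baseline_window
    )
    y = np.clip(y - baseline, 0.0, None)

    # 3. Normalization: divide by the total ion current (sum of intensities).
    total = y.sum()
    return y / total if total > 0 else y


# Synthetic spectrum: decaying background, one peak at 6000 Da, Gaussian noise.
rng = np.random.default_rng(0)
mz = np.linspace(2000, 20000, 6000)
raw = (
    50 * np.exp(-mz / 8000)
    + 100 * np.exp(-0.5 * ((mz - 6000) / 10) ** 2)
    + rng.normal(0, 0.5, mz.size)
)
processed = preprocess_sketch(raw)
```

After processing, the intensities are non-negative and sum to one, and the synthetic peak near 6000 Da remains the dominant feature.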
### Build a Dataset from Multiple Spectra
```python
from maldiamrkit.dataset import MaldiSet
# Load multiple spectra with metadata
data = MaldiSet.from_directory(
    spectra_dir="data/spectra/",
    metadata_path="data/metadata.csv",
    aggregate_by={"antibiotic": "Drug", "species": "Species"},
    bin_width=3
)
# Access features and labels
X = data.X  # Feature matrix
y = data.y["Drug"]  # Target labels
```
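The `bin_width=3` argument turns each spectrum into a fixed-length feature vector. The sketch below shows one common way to do fixed-width binning with NumPy; the helper `bin_spectrum`, the default 2000 to 20000 Da range, and the choice to sum intensities per bin are assumptions for illustration, not a description of how `MaldiSet` bins internally.

```python
import numpy as np


def bin_spectrum(mz, intensity, bin_width=3.0, mz_min=2000.0, mz_max=20000.0):
    """Hypothetical helper: sum intensities into fixed-width m/z bins."""
    # Bin edges every `bin_width` Da, covering at least [mz_min, mz_max].
    edges = np.arange(mz_min, mz_max + bin_width, bin_width)
    # A weighted histogram sums the intensities that fall into each bin.
    binned, _ = np.histogram(mz, bins=edges, weights=intensity)
    return edges[:-1], binned


# Four raw points; the first two share the 2000-2003 Da bin.
mz = np.array([2000.5, 2001.0, 2004.2, 2999.9])
intensity = np.array([1.0, 2.0, 4.0, 8.0])
bin_starts, features = bin_spectrum(
    mz, intensity, bin_width=3, mz_min=2000, mz_max=3000
)
```

Because every spectrum is mapped onto the same bin edges, the resulting vectors can be stacked row by row into a feature matrix.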
### Machine Learning Pipeline
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from maldiamrkit.peak_detector import MaldiPeakDetector

# Hold out a test set (X and y come from the MaldiSet example above)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create ML pipeline
pipe = Pipeline([
    ("peaks", MaldiPeakDetector(binary=False, prominence=0.05)),
    ("scaler", StandardScaler()),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42))
])

# Train and predict
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)
```
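For intuition about what a prominence-based peak step does to a feature matrix, the following sketch uses `scipy.signal.find_peaks` directly. The helper `keep_peaks` is hypothetical, and its reading of `prominence` and `binary` (absolute prominence threshold; peak intensities versus a 0/1 mask) is an assumption and may differ from `MaldiPeakDetector`.

```python
import numpy as np
from scipy.signal import find_peaks


def keep_peaks(X, prominence=0.05, binary=False):
    """Hypothetical helper: zero out everything except detected peaks."""
    X = np.asarray(X, dtype=float)
    out = np.zeros_like(X)
    for i, row in enumerate(X):
        # Indices of local maxima whose prominence exceeds the threshold.
        idx, _ = find_peaks(row, prominence=prominence)
        # Keep the peak intensity, or just mark presence with 1.0.
        out[i, idx] = 1.0 if binary else row[idx]
    return out


X_demo = np.array([
    [0.0, 0.2, 0.0, 0.01, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0],
])
peaks = keep_peaks(X_demo, prominence=0.05)
mask = keep_peaks(X_demo, prominence=0.05, binary=True)
```

In the first row the small bump of height 0.01 falls below the prominence threshold and is discarded, while the peaks of height 0.2 and 0.5 are kept.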
### Align Spectra to Correct for Mass Calibration Drift
```python
from maldiamrkit.warping import Warping

# X_train and X_test are feature matrices, e.g. a train/test split of data.X

# Create warping transformer with shift method
warper = Warping(
    method='shift',      # or 'linear', 'piecewise', 'dtw'
    reference='median',  # use median spectrum as reference
    max_shift=50
)
# Fit on training data and transform
warper.fit(X_train)
X_aligned = warper.transform(X_test)
# Visualize alignment results
fig, axes = warper.plot_alignment(
    X_original=X_test,
    X_aligned=X_aligned,
    indices=[0, 5, 10],  # plot multiple spectra
    xlim=(2000, 10000),  # zoom to m/z range
    show_peaks=True
)
```
**Alignment Methods:**
- `shift`: Global median shift (fast, simple)
- `linear`: Least-squares linear transformation
- `piecewise`: Local shifts across spectrum segments (most flexible)
- `dtw`: Dynamic Time Warping (best for non-linear drift)
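As a rough illustration of the simplest case, a single global shift, the sketch below estimates the integer offset that best matches a spectrum to a reference and then applies it. It scores candidate shifts by cross-correlation, which is a stand-in technique: the README describes `shift` as a global median shift without further detail, so the helpers `estimate_shift` and `apply_shift` are hypothetical and not the library's implementation.

```python
import numpy as np


def estimate_shift(spectrum, reference, max_shift=50):
    """Hypothetical helper: integer shift (in bins) maximizing overlap."""
    shifts = np.arange(-max_shift, max_shift + 1)
    # Score each candidate by the dot product with the reference.
    # np.roll wraps around, which is acceptable for scoring small shifts.
    scores = [np.dot(np.roll(spectrum, s), reference) for s in shifts]
    return int(shifts[int(np.argmax(scores))])


def apply_shift(spectrum, shift):
    """Hypothetical helper: shift a spectrum and zero the wrapped region."""
    aligned = np.roll(spectrum, shift)  # np.roll returns a new array
    if shift > 0:
        aligned[:shift] = 0.0
    elif shift < 0:
        aligned[shift:] = 0.0
    return aligned


# Reference peak at bin 200; the drifted copy sits 7 bins to the right.
x = np.arange(500)
reference = np.exp(-0.5 * ((x - 200) / 3.0) ** 2)
drifted = np.exp(-0.5 * ((x - 207) / 3.0) ** 2)

shift = estimate_shift(drifted, reference, max_shift=50)
aligned = apply_shift(drifted, shift)
```

Here the estimated shift is -7 bins, which moves the drifted peak back onto bin 200. A median reference spectrum could be obtained with `np.median(X_train, axis=0)` before calling the helpers.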
For further details please see the [quick guide notebook](docs/quick_guide.ipynb).
## Contributing
Pull requests, bug reports, and feature ideas are welcome: feel free to open a PR!
## License
This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.
## Acknowledgements
This toolkit is inspired by and builds upon the methodology described in:
> **Weis, C., Cuénod, A., Rieck, B., et al.** (2022). *Direct antimicrobial resistance prediction from clinical MALDI-TOF mass spectra using machine learning*. **Nature Medicine**, 28, 164–174. [https://doi.org/10.1038/s41591-021-01619-9](https://doi.org/10.1038/s41591-021-01619-9)
Please consider citing this work if you find `MaldiAMRKit` useful.