# Mixed Membership Stochastic Block Models
[![PyPI version](https://badge.fury.io/py/mmsbm.svg)](https://badge.fury.io/py/mmsbm)
[![Documentation Status](https://readthedocs.org/projects/mmsbm-docs/badge/?version=latest)](https://mmsbm-docs.readthedocs.io/en/latest/?badge=latest)
[![Python Versions](https://img.shields.io/pypi/pyversions/mmsbm.svg)](https://pypi.org/project/mmsbm/)
[![Tests](https://github.com/eudald-seeslab/mmsbm/actions/workflows/tests.yml/badge.svg)](https://github.com/eudald-seeslab/mmsbm/actions/workflows/tests.yml)
[![Coverage Status](https://coveralls.io/repos/github/eudald-seeslab/mmsbm/badge.svg?branch=main)](https://coveralls.io/github/eudald-seeslab/mmsbm?branch=main)
[![License](https://img.shields.io/badge/License-BSD_3--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![Downloads](https://pepy.tech/badge/mmsbm)](https://pepy.tech/project/mmsbm)
A Python implementation of Mixed Membership Stochastic Block Models for recommendation systems, based on the work by Godoy-Lorite et al. (2016). This library provides an efficient, vectorized implementation suitable for both research and production environments.
## Features
- Fast, vectorized implementation of MMSBM.
- Support for both simple and cross-validated fitting.
- Parallel processing for multiple sampling runs.
- Comprehensive model statistics and evaluation metrics.
- Compatible with Python 3.6 through 3.12.
## Installation
```bash
pip install mmsbm
```
## Quick Start
```python
from mmsbm import MMSBM
# Create a model
model = MMSBM(user_groups=2, item_groups=4)
# Fit and predict (train_data and test_data are pandas DataFrames; see Data Format)
model.fit(train_data)
predictions = model.predict(test_data)
# Get model results
results = model.score()
```
## Detailed Usage
### Data Format
The input data should be a pandas DataFrame with exactly 3 columns representing users, items, and ratings:
```python
import pandas as pd
from random import choice
train = pd.DataFrame(
    {
        "users": [f"user{choice(list(range(5)))}" for _ in range(100)],
        "items": [f"item{choice(list(range(10)))}" for _ in range(100)],
        "ratings": [choice(list(range(1, 6))) for _ in range(100)],
    }
)
test = pd.DataFrame(
    {
        "users": [f"user{choice(list(range(5)))}" for _ in range(50)],
        "items": [f"item{choice(list(range(10)))}" for _ in range(50)],
        "ratings": [choice(list(range(1, 6))) for _ in range(50)],
    }
)
```
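Before fitting, it can help to check that a DataFrame matches the three-column layout described above. The helper below is a minimal sketch, not part of the library: `check_ratings_frame` is a hypothetical name, and rejecting missing values is an assumption about what a clean input looks like rather than a documented requirement of `mmsbm`.

```python
import pandas as pd


def check_ratings_frame(df):
    """Raise ValueError unless df looks like a (users, items, ratings) table.

    Hypothetical helper for illustration; mmsbm may apply different checks.
    """
    if df.shape[1] != 3:
        raise ValueError(f"expected exactly 3 columns, got {df.shape[1]}")
    # Assumption: rows with missing values should be handled before fitting.
    if df.isna().any().any():
        raise ValueError("missing values found; drop or fill them before fitting")
    return df


ratings = pd.DataFrame(
    {
        "users": ["user0", "user1", "user0"],
        "items": ["item3", "item3", "item7"],
        "ratings": [5, 2, 4],
    }
)
check_ratings_frame(ratings)  # passes silently and returns the same frame
```

Calling the helper on a frame with a missing column, for example `ratings[["users", "items"]]`, raises `ValueError`.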
### Model Configuration
```python
from mmsbm import MMSBM
# Initialize the MMSBM class:
model = MMSBM(
    user_groups=2,   # Number of user groups
    item_groups=4,   # Number of item groups
    iterations=500,  # Number of EM iterations
    sampling=5,      # Number of parallel runs
    seed=1,          # Random seed for reproducibility
    debug=False,     # Enable debug logging
)
```
### Training Methods
#### Simple Fit
```python
# `model` is the MMSBM instance created in Model Configuration
model.fit(train)
```
#### Cross-Validation Fit
```python
import numpy as np

accuracies = model.cv_fit(train, folds=5)
print(f"Mean accuracy: {np.mean(accuracies):.3f} ± {np.std(accuracies):.3f}")
```
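For readers unfamiliar with k-fold cross-validation, the idea behind `folds=5` is that every row is held out exactly once while the model trains on the remaining rows. How `cv_fit` splits the data internally is not described in this README, so the following is only an illustrative sketch using the standard library; `kfold_indices` is a hypothetical helper, not a library function.

```python
import random


def kfold_indices(n_rows, folds=5, seed=1):
    """Yield (train_idx, test_idx) pairs; each row is in exactly one test fold."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    for f in range(folds):
        # Fold f takes every `folds`-th shuffled index, so fold sizes
        # differ by at most one row.
        test_idx = indices[f::folds]
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        yield train_idx, test_idx


splits = list(kfold_indices(n_rows=100, folds=5))
```

With a pandas DataFrame, each pair of index lists can be turned into data with `train.iloc[train_idx]` and `train.iloc[test_idx]`.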
### Making Predictions
```python
predictions = model.predict(test)
```
### Model Evaluation
```python
results = model.score()
# Access various metrics
accuracy = results['stats']['accuracy']
mae = results['stats']['mae']
# Access model parameters
theta = results['objects']['theta'] # User group memberships
eta = results['objects']['eta'] # Item group memberships
pr = results['objects']['pr'] # Rating probabilities
```
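In the model of Godoy-Lorite et al. [1], the probability that user `u` gives item `i` the rating `r` is the sum over all group pairs `(k, l)` of `theta[u][k] * eta[i][l] * pr[k][l][r]`. The sketch below evaluates that sum with plain Python lists. The nesting order used here (`pr[user group][item group][rating]`) and the function name `rating_distribution` are assumptions for illustration; the arrays returned by `model.score()` may be laid out differently.

```python
def rating_distribution(theta_u, eta_i, pr):
    """Return P(rating = r) for one user-item pair under the mixed-membership model.

    theta_u: membership weights of the user over user groups (sums to 1)
    eta_i:   membership weights of the item over item groups (sums to 1)
    pr:      pr[k][l][r], rating probabilities for group pair (k, l) (assumed layout)
    """
    n_ratings = len(pr[0][0])
    dist = [0.0] * n_ratings
    for k, weight_user in enumerate(theta_u):
        for l, weight_item in enumerate(eta_i):
            for r in range(n_ratings):
                dist[r] += weight_user * weight_item * pr[k][l][r]
    return dist


# Toy example: 2 user groups, 2 item groups, 2 possible ratings.
theta_u = [0.75, 0.25]
eta_i = [1.0, 0.0]
pr = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.0, 1.0], [0.5, 0.5]],
]
dist = rating_distribution(theta_u, eta_i, pr)
# Index of the most probable rating for this pair.
most_likely = max(range(len(dist)), key=dist.__getitem__)
```

Because the item belongs entirely to the first item group, the result is just the user's membership-weighted mix of `pr[0][0]` and `pr[1][0]`.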
## Performance Considerations
- Computation is vectorized for efficient processing of large datasets.
- Multiple sampling runs are executed in parallel.
- Computational cost scales primarily with the number of unique items rather than the number of users.
- Memory usage scales primarily with the number of unique users and items.
## Running Tests
To run the tests:
```bash
# Install development dependencies
pip install -e ".[dev]"
# Run tests
python -m pytest tests/*
```
## Contributing
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## TODO
- Progress bars do not work in Jupyter notebooks.
- Add a procedure for optimizing `user_groups` and `item_groups`.
## References
[1] Godoy-Lorite, Antonia, et al. "Accurate and scalable social recommendation using mixed-membership stochastic block models." Proceedings of the National Academy of Sciences 113.50 (2016): 14207-14212.
Raw data
{
"_id": null,
"home_page": "https://github.com/eudald-seeslab/mmsbm",
"name": "mmsbm",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "bayesian analysis, recommender systems, network analysis, python",
"author": "Eudald Correig",
"author_email": "eudald.correig@urv.cat",
"download_url": "https://files.pythonhosted.org/packages/d5/4b/7dc9c96f702e2d52278e1a6114b45addc2e071d9fb59d533f5ed022c2077/mmsbm-0.3.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD-3-Clause License",
"summary": "Compute Mixed Membership Stochastic Block Models.",
"version": "0.3.1",
"project_urls": {
"Homepage": "https://github.com/eudald-seeslab/mmsbm"
},
"split_keywords": [
"bayesian analysis",
" recommender systems",
" network analysis",
" python"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "cc81c93486f95868364a784a30c6bc0adb65abc35bfd75b8087dcc794915ebe6",
"md5": "d21a910be789f094b6e2ba0b9c5d975a",
"sha256": "880bbfbd3d39f16d0bef1bb015981569fd6d8e9331ee6a447a929a65c23c83fb"
},
"downloads": -1,
"filename": "mmsbm-0.3.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d21a910be789f094b6e2ba0b9c5d975a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 13465,
"upload_time": "2024-11-17T10:46:56",
"upload_time_iso_8601": "2024-11-17T10:46:56.916888Z",
"url": "https://files.pythonhosted.org/packages/cc/81/c93486f95868364a784a30c6bc0adb65abc35bfd75b8087dcc794915ebe6/mmsbm-0.3.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d54b7dc9c96f702e2d52278e1a6114b45addc2e071d9fb59d533f5ed022c2077",
"md5": "ad0dfa8d1a4bf52f827c8562b86636c5",
"sha256": "cfb7e8f5ee32b30f06110e92b663e61afc66e313a2aa0034a5dadfbf170b2b16"
},
"downloads": -1,
"filename": "mmsbm-0.3.1.tar.gz",
"has_sig": false,
"md5_digest": "ad0dfa8d1a4bf52f827c8562b86636c5",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 17537,
"upload_time": "2024-11-17T10:46:58",
"upload_time_iso_8601": "2024-11-17T10:46:58.758253Z",
"url": "https://files.pythonhosted.org/packages/d5/4b/7dc9c96f702e2d52278e1a6114b45addc2e071d9fb59d533f5ed022c2077/mmsbm-0.3.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-17 10:46:58",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "eudald-seeslab",
"github_project": "mmsbm",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"lcname": "mmsbm"
}