# SPMF
Python wrapper for the [SPMF Java library](http://www.philippe-fournier-viger.com/spmf).
## Information
This module contains Python wrappers for the pattern mining algorithms implemented in the SPMF Java library. Each algorithm is implemented as a standalone Python class with a fully descriptive and tested API. It also provides native support for Pandas dataframes.
Why? If you're in a Python pipeline, it can be cumbersome to use Java as an intermediate step. Using `spmf-wrapper`, you can stay in your pipeline as though Java were never involved.
## Installation
[`pip install spmf-wrapper`](https://pypi.org/project/spmf-wrapper/)
A Java Runtime Environment (JRE) is required to run this wrapper. If no existing installation is detected, JRE v21 is installed automatically at `$HOME/.jre/jdk-21.0.2+13-jre` using the `install-jdk` Python module. If you prefer to install the Java Runtime manually, follow the instructions [here](https://www.java.com/en/download/help/download_options.html). Test the installation by running the following command in a terminal (the version reported depends on your installation):
```
> java -version
java version "1.8.0_391"
Java(TM) SE Runtime Environment (build 1.8.0_391-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.391-b13, mixed mode)
```
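The same check can be made from inside a Python session. The following is a minimal sketch using only the standard library; the helper name `java_version_banner` is hypothetical and not part of `spmf-wrapper`, and it only looks for `java` on the `PATH`, which may differ from how the wrapper itself detects an installation.

```python
import shutil
import subprocess


def java_version_banner():
    """Return the text printed by `java -version`, or None if Java is not on PATH.

    Hypothetical helper for illustration; not part of spmf-wrapper.
    """
    java_path = shutil.which("java")
    if java_path is None:
        return None
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True, check=False
    )
    # `java -version` writes its banner to stderr rather than stdout.
    return (result.stderr or result.stdout).strip()


banner = java_version_banner()
print(banner if banner is not None else "No Java runtime found on PATH")
```

If this reports that no runtime was found, the automatic installation described above may still apply, since it places the JRE under `$HOME/.jre` rather than on the `PATH`.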
## Usage
Example:
```python
from spmf import EMMA
emma = EMMA(min_support=2, max_window=2, timestamp_present=True, transform=True)
output = emma.run_pandas(input_df)
```
Input:
| | Time points | Itemset
| ---- | ------ | -------
| 0 | 1 | a
| 1 | 2 | a
| 2 | 3 | a
| 3 | 3 | b
| 4 | 6 | a
| 5 | 7 | a
| 6 | 7 | b
| 7 | 8 | c
| 8 | 9 | b
| 9 | 11 | d
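For reference, the input table above could be built as a dataframe roughly as follows. This is a sketch: the column names and values are copied from the table, but whether the wrapper requires exactly these column names is an assumption, so check the linked examples before relying on it.

```python
import pandas as pd

# One row per (time point, item) pair, copied from the input table.
# Column names are taken from the table header; treat them as an assumption.
input_df = pd.DataFrame(
    {
        "Time points": [1, 2, 3, 3, 6, 7, 7, 8, 9, 11],
        "Itemset": ["a", "a", "a", "b", "a", "a", "b", "c", "b", "d"],
    }
)

print(input_df)
```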
Output:
| | Frequent episode | Support
| --- | ---------------- | -------
|0 | a | 5
|1 | b | 3
|2 | a b | 2
|3 | a -> a | 3
|4 | a -> b | 2
|5 | a -> a b | 2
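Some rows of the output can be checked by hand. The sketch below uses plain Python and does not call SPMF or reimplement EMMA: it counts single events and counts the time points at which `a` and `b` occur together, which matches the `a`, `b`, and `a b` rows for this particular input. The serial episodes (`a -> a`, `a -> b`, `a -> a b`) depend on EMMA's window semantics and are not reproduced here.

```python
from collections import Counter, defaultdict

# (time point, item) pairs copied from the input table.
events = [
    (1, "a"), (2, "a"), (3, "a"), (3, "b"), (6, "a"),
    (7, "a"), (7, "b"), (8, "c"), (9, "b"), (11, "d"),
]
min_support = 2

# Support of a single event: the number of time points at which it occurs.
single_counts = Counter(item for _, item in events)
frequent_single = {
    item: count for item, count in single_counts.items() if count >= min_support
}

# Support of the simultaneous itemset "a b": time points containing both items.
items_at = defaultdict(set)
for time_point, item in events:
    items_at[time_point].add(item)
support_a_b = sum(1 for items in items_at.values() if {"a", "b"} <= items)

print(frequent_single)  # {'a': 5, 'b': 3}
print(support_a_b)      # 2
```

Events `c` and `d` each occur once, so they fall below `min_support=2` and do not appear in the output.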
See the [examples](https://github.com/AakashVasudevan/Py-SPMF/tree/main/examples) for more details.
For a detailed explanation of the algorithm and parameters, refer to the corresponding webpage in the SPMF [documentation](http://www.philippe-fournier-viger.com/spmf/index.php?link=documentation.php).
## Implementation Checklist
### Sequential Pattern Mining
| Algorithm| Type | Implemented
| -------- | ------- | ---------
| PrefixSpan | Frequent Sequential Pattern | ✓
| GSP | Frequent Sequential Pattern |
| SPADE | Frequent Sequential Pattern | ✓
| CM-SPADE | Frequent Sequential Pattern | ✓
| SPAM | Frequent Sequential Pattern | ✓
| CM-SPAM | Frequent Sequential Pattern |
| FAST | Frequent Sequential Pattern |
| LAPIN | Frequent Sequential Pattern |
| ClaSP | Frequent Closed Sequential Pattern | ✓
| CM-ClaSP | Frequent Closed Sequential Pattern | ✓
| CloFAST | Frequent Closed Sequential Pattern |
| CloSpan | Frequent Closed Sequential Pattern |
| BIDE+ | Frequent Closed Sequential Pattern |
| Post Processing SPAM or PrefixSpan | Frequent Closed Sequential Pattern |
| MaxSP | Frequent Maximal Sequential Pattern |
| VMSP | Frequent Maximal Sequential Pattern | ✓
| FEAT | Frequent Sequential Generator Pattern |
| FSGP | Frequent Sequential Generator Pattern |
| VGEN | Frequent Sequential Generator Pattern | ✓
| NOSEP | Non-overlapping Sequential Pattern | ✓
| GoKrimp | Compressing Sequential Pattern |
| TKS | Top-k Frequent Sequential Pattern | ✓
| TSP | Top-k Frequent Sequential Pattern |
### Episode Mining
| Algorithm| Type | Implemented
| -------- | ------- | ---------
| EMMA | Frequent Episode | ✓
| AFEM | Frequent Episode | ✓
| MINEPI | Frequent Episode |
| MINEPI+ | Frequent Episode | ✓
| TKE | Top-k Frequent Episodes | ✓
| MaxFEM | Maximal Frequent Episodes | ✓
| POERM | Episode Rules |
| POERM-ALL | Episode Rules |
| POERMH | Episode Rules |
| NONEPI | Episode Rules | ✓
| TKE-Rules | Episode Rules | ✓
| AFEM-Rules | Episode Rules | ✓
| EMMA-Rules | Episode Rules | ✓
| MINEPI+-Rules | Episode Rules |
| HUE-SPAN | High Utility Episodes |
| US-SPAN | High Utility Episodes |
| TUP | Top-k High Utility Episodes |
## Bibliography
```
Fournier-Viger, P., Lin, C.W., Gomariz, A., Gueniche, T., Soltani, A., Deng, Z., Lam, H. T. (2016).
The SPMF Open-Source Data Mining Library Version 2.
Proc. 19th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2016) Part III, Springer LNCS 9853, pp. 36-40.
```