recommenders


- Name: recommenders
- Version: 1.2.0
- Home page: https://github.com/recommenders-team/recommenders
- Summary: Recommenders - Python utilities for building recommendation systems
- Upload time: 2024-05-01 18:45:31
- Maintainer: None
- Docs URL: None
- Author: Recommenders contributors
- Requires Python: >=3.6
- License: None
- Keywords: recommendations recommendation recommenders recommender system engine machine learning python spark gpu

            <!--
Copyright (c) Recommenders contributors.
Licensed under the MIT License.
-->

# Recommender Utilities

This package contains functions that simplify common tasks in developing and evaluating recommender systems. A short description of the submodules is provided below. For more details about which functions are available and how to use them, see the docstrings provided with the code or the [online documentation](https://readthedocs.org/projects/microsoft-recommenders/).

# Installation

## Pre-requisites
Some dependencies require compilation during pip installation. On Linux, this can be supported by installing the build-essential dependencies:
```bash
sudo apt-get install -y build-essential libpython<version>
``` 
where `<version>` should be the Python version (e.g. `3.8`).

On Windows, you will need [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/).

For more details about the software requirements that must be pre-installed on each supported platform, see the [setup guide](https://github.com/microsoft/recommenders/blob/main/SETUP.md).   

## Basic installation

To install core utilities, CPU-based algorithms, and dependencies:
```bash
pip install --upgrade pip setuptools
pip install recommenders
```

## Optional Dependencies

By default `recommenders` does not install all dependencies used throughout the code and the notebook examples in this repo. Instead we require a bare minimum set of dependencies needed to execute functionality in the `recommenders` package (excluding Spark, GPU and Jupyter functionality). We also allow the user to specify which groups of dependencies are needed at installation time (or later if updating the pip installation). The following groups are provided:

- examples: dependencies related to Jupyter needed to run [example notebooks](https://github.com/microsoft/recommenders/tree/main/examples)
- gpu: dependencies to enable GPU functionality (PyTorch & TensorFlow)
- spark: dependencies to enable Apache Spark functionality used in dataset, splitting, evaluation and certain algorithms
- dev: dependencies such as `black` and `pytest` required only for development or testing
- all: all of the above dependencies
- experimental: current experimental dependencies that are being evaluated (e.g. libraries that require advanced build requirements or might conflict with libraries from other options)
- nni: dependencies for NNI tuning framework.

Note that, currently, xLearn and Vowpal Wabbit are in the experimental group.

These groups can be installed alone or in combination:
```bash
# install recommenders with core requirements, support for CPU-based recommender algorithms, and example notebooks
# (the quotes keep shells such as zsh from interpreting the square brackets)
pip install "recommenders[examples]"

# add GPU functionality on top of the example notebooks
pip install "recommenders[examples,gpu]"
```
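
After installing one or more groups, it can be useful to check which optional functionality is actually importable in the current environment. The following is a minimal sketch using only the standard library. The mapping from group name to module name (`GROUP_MODULES`) is an assumption made for illustration, not the package's own dependency list; the authoritative lists are in the repository's `setup.py`.

```python
from importlib.util import find_spec

# Hypothetical mapping from an extras group to one representative module.
# These module names are assumptions for illustration only.
GROUP_MODULES = {
    "examples": "ipykernel",
    "gpu": "torch",
    "spark": "pyspark",
}


def available_groups(group_modules=GROUP_MODULES):
    """Return {group: bool}, True when the group's module can be found."""
    return {
        group: find_spec(module) is not None
        for group, module in group_modules.items()
    }


if __name__ == "__main__":
    for group, present in available_groups().items():
        print(f"{group}: {'available' if present else 'missing'}")
```

A missing entry only indicates that the representative module was not found; it does not verify versions or GPU drivers.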

## GPU Support

You will need CUDA Toolkit v11.2 and cuDNN v8.1 to enable both TensorFlow and PyTorch to use the GPU. For example, if you are using a conda environment, these can be installed with:
```bash
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1
```
For a virtual environment, you may use a [Docker container by NVIDIA](../SETUP.md#using-a-virtual-environment).

For manual installation of the necessary requirements see [TensorFlow](https://www.tensorflow.org/install/gpu#software_requirements) and [PyTorch](https://pytorch.org/get-started/locally/) installation pages.

When installing with GPU support, you will need to point to the PyTorch index to ensure you are downloading a version of PyTorch compiled with CUDA support. This can be done with the `--find-links` (or `-f`) option, as shown below.

`pip install recommenders[gpu] -f https://download.pytorch.org/whl/cu111/torch_stable.html`

## Experimental dependencies

We are currently evaluating inclusion of the following dependencies:

 - vowpalwabbit: current examples show how to use Vowpal Wabbit after it has been installed on the command line; using the [PyPI package](https://pypi.org/project/vowpalwabbit/) with the scikit-learn interface will facilitate easier integration into Python environments
 - xlearn: on some platforms, xLearn requires pre-installation of cmake.

## Other dependencies

Some dependencies are not available via the recommenders PyPI package, but can be installed in the following ways:
 - pymanopt: this dependency is required for the RLRMC and GeoIMC algorithms; a version of this code compatible with TensorFlow 2 can be installed with `pip install "pymanopt@https://github.com/pymanopt/pymanopt/archive/fb36a272cdeecb21992cfd9271eb82baafeb316d.zip"`.

## NNI dependencies

For NNI a more recent version can be installed but is untested.


## Installing the utilities from a local copy

If you want to use a version of the source code that is not published on PyPI, one alternative is to install from a clone of the source code on your machine. To this end, a [setup.py](../setup.py) file is provided to simplify installing the utilities in this repo from the main directory.

This still requires an environment to be installed as described in the [setup guide](../SETUP.md). Once the necessary dependencies are installed, you can use the following command to install `recommenders` as a python package.

    pip install -e .

It is also possible to install directly from GitHub, either from the default branch or from a specific branch:

    pip install -e git+https://github.com/microsoft/recommenders/#egg=pkg
    pip install -e git+https://github.com/microsoft/recommenders/@staging#egg=pkg

**NOTE** - The pip installation does not install all of the pre-requisites; it is assumed that the environment has already been set up according to the [setup guide](../SETUP.md), for the utilities to be used.


# Contents

## [Datasets](datasets)

The datasets module includes helper functions for pulling different datasets and formatting them appropriately, as well as utilities for splitting data for training and testing.

### Data Loading

There are data loaders for several datasets. For example, the movielens module lets you load the MovieLens dataset as a pandas or Spark dataframe, in sizes of 100k, 1M, 10M, or 20M, to test algorithms and evaluate performance benchmarks.

```python
from recommenders.datasets import movielens
df = movielens.load_pandas_df(size="100k")
```

### Splitting Techniques

Three methods are currently available for splitting datasets. All of them support splitting by user or by item, and filtering out users or items with too few samples (for instance, users who have not rated enough items, or items that have not been rated by enough users).

- Random: this is the basic approach where entries are randomly assigned to each group based on the ratio desired
- Chronological: this uses provided timestamps to order the data and selects a cut-off time that will split the desired ratio of data to train before that time and test after that time
- Stratified: this is similar to random sampling, but the splits are stratified. For example, if the datasets are split by user, the splitting approach will attempt to maintain the same ratio of items used in both training and test splits. The converse is true if splitting by item.
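
To make the three strategies concrete, here is a minimal standard-library sketch that splits by user. It is an illustration under stated assumptions, not the package's own splitters: rows are assumed to be `(user, item, rating, timestamp)` tuples, and the function names `random_split`, `chrono_split`, and `stratified_split` are hypothetical.

```python
import random
from collections import defaultdict


def _group_by_user(rows):
    """Group (user, item, rating, timestamp) rows by their user id."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return groups


def random_split(rows, ratio=0.75, seed=42):
    """Shuffle all rows together and cut once at the desired ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * ratio)
    return shuffled[:cut], shuffled[cut:]


def chrono_split(rows, ratio=0.75):
    """Per user: the oldest interactions go to train, the newest to test."""
    train, test = [], []
    for user_rows in _group_by_user(rows).values():
        ordered = sorted(user_rows, key=lambda r: r[3])
        cut = int(len(ordered) * ratio)
        train.extend(ordered[:cut])
        test.extend(ordered[cut:])
    return train, test


def stratified_split(rows, ratio=0.75, seed=42):
    """Per user: random assignment that keeps each user's ratio near `ratio`."""
    rng = random.Random(seed)
    train, test = [], []
    for user_rows in _group_by_user(rows).values():
        shuffled = list(user_rows)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * ratio)
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


if __name__ == "__main__":
    # Two users with four timestamped ratings each (made-up data).
    data = [(u, i, 5.0, 100 * u + i) for u in (1, 2) for i in range(4)]
    train, test = chrono_split(data)
    print(len(train), len(test))
```

The only difference between the chronological and stratified sketches is how each user's rows are ordered before the cut: by timestamp in one case, at random in the other.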

## [Evaluation](evaluation)

The evaluation submodule includes functionality for calculating common recommendation metrics directly in Python or in a Spark environment using PySpark.

Currently available metrics include:

- Root Mean Squared Error
- Mean Absolute Error
- R<sup>2</sup>
- Explained Variance
- Precision at K
- Recall at K
- Normalized Discounted Cumulative Gain at K
- Mean Average Precision at K
- Area Under Curve
- Logistic Loss
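
As a rough illustration of what some of these metrics compute, the following standard-library sketch implements RMSE, MAE, and binary-relevance versions of Precision, Recall, and NDCG at K for a single user. The function names and conventions here (for example, dividing precision by `k`) are assumptions for illustration; the package's evaluators operate on dataframes and may use different conventions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: DCG of the top-k divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d"]   # items ordered by predicted score
    liked = {"a", "c", "e"}         # items the user actually interacted with
    print(precision_at_k(ranked, liked, 3), recall_at_k(ranked, liked, 3))
    print(ndcg_at_k(ranked, liked, 3))
```

In practice the ranking metrics are averaged over all users; the sketch shows the per-user computation only.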

## [Models](models)

The models submodule contains implementations of various algorithms that can be used in addition to external packages to evaluate and develop new recommender system approaches. A description of all the algorithms can be found in [this table](../README.md#algorithms). The following is a list of the algorithm utilities:

* Cornac
* DeepRec
  *  Convolutional Sequence Embedding Recommendation (CASER)
  *  Deep Knowledge-Aware Network (DKN)
  *  Extreme Deep Factorization Machine (xDeepFM)
  *  GRU
  *  LightGCN
  *  Next Item Recommendation (NextItNet)
  *  Short-term and Long-term Preference Integrated Recommender (SLi-Rec)
  *  Multi-Interest-Aware Sequential User Modeling (SUM)
* FastAI
* GeoIMC
* LightFM
* LightGBM
* NCF
* NewsRec
  * Neural Recommendation with Long- and Short-term User Representations (LSTUR)
  * Neural Recommendation with Attentive Multi-View Learning (NAML)
  * Neural Recommendation with Personalized Attention (NPA)
  * Neural Recommendation with Multi-Head Self-Attention (NRMS)
* Restricted Boltzmann Machines (RBM)
* Riemannian Low-rank Matrix Completion (RLRMC)
* Simple Algorithm for Recommendation (SAR)
* Self-Attentive Sequential Recommendation (SASRec)
* Sequential Recommendation Via Personalized Transformer (SSEPT)
* Surprise
* Term Frequency - Inverse Document Frequency (TF-IDF)
* Variational Autoencoders (VAE)
  * Multinomial
  * Standard
* Vowpal Wabbit (VW)
* Wide and Deep
* xLearn
  * Factorization Machine (FM)
  * Field-Aware FM (FFM)

## [Tuning](tuning)

This submodule contains utilities for performing hyperparameter tuning.
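
As a generic illustration of what hyperparameter tuning involves, the sketch below runs an exhaustive grid search with the standard library. It does not use the package's tuning utilities; `param_grid`, `grid_search`, and the toy scoring function are hypothetical names introduced for this example.

```python
from itertools import product


def param_grid(search_space):
    """Yield one dict per combination of the listed hyperparameter values."""
    names = list(search_space)
    for values in product(*(search_space[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(search_space, score_fn):
    """Return (best_params, best_score); a higher score is treated as better."""
    best_params, best_score = None, float("-inf")
    for params in param_grid(search_space):
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


if __name__ == "__main__":
    # Toy objective standing in for "train a model and evaluate it".
    space = {"rank": [8, 16, 32], "reg": [0.01, 0.1]}
    best, score = grid_search(space, lambda p: -abs(p["rank"] - 16) - p["reg"])
    print(best, score)
```

In a real workflow, `score_fn` would train a model with the given parameters and return a validation metric such as NDCG at K.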

## [Utils](utils)

This submodule contains high-level utilities for defining constants used in most algorithms as well as helper functions for managing aspects of different frameworks: GPU, Spark, Jupyter notebook.



            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/recommenders-team/recommenders",
    "name": "recommenders",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "recommendations recommendation recommenders recommender system engine machine learning python spark gpu",
    "author": "Recommenders contributors",
    "author_email": "recommenders-technical-discuss@lists.lfaidata.foundation",
    "download_url": "https://files.pythonhosted.org/packages/90/1b/0e30360bad76e8cb6c4c968cd7624ca5f6c6896a4285a919c735f202b50b/recommenders-1.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Recommenders - Python utilities for building recommendation systems",
    "version": "1.2.0",
    "project_urls": {
        "Documentation": "https://recommenders-team.github.io/recommenders/intro.html",
        "Homepage": "https://github.com/recommenders-team/recommenders",
        "Wiki": "https://github.com/recommenders-team/recommenders/wiki"
    },
    "split_keywords": [
        "recommendations",
        "recommendation",
        "recommenders",
        "recommender",
        "system",
        "engine",
        "machine",
        "learning",
        "python",
        "spark",
        "gpu"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cfcb22c4f61a3e33e9905ff2659c2f94b180bfbabc95e28efcda1d67b8340898",
                "md5": "80e990d82a512dc96fe6d4fe4da20864",
                "sha256": "77fca97da5f5c46080f10f40ac04ec1ff116b83fe78b6efb59d6d7b8c28938c9"
            },
            "downloads": -1,
            "filename": "recommenders-1.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "80e990d82a512dc96fe6d4fe4da20864",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 356008,
            "upload_time": "2024-05-01T18:45:29",
            "upload_time_iso_8601": "2024-05-01T18:45:29.011470Z",
            "url": "https://files.pythonhosted.org/packages/cf/cb/22c4f61a3e33e9905ff2659c2f94b180bfbabc95e28efcda1d67b8340898/recommenders-1.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "901b0e30360bad76e8cb6c4c968cd7624ca5f6c6896a4285a919c735f202b50b",
                "md5": "2e8a6919a9859acd9eb5c50e1fa943c7",
                "sha256": "389a862dac54829bc6eb3a9e01a0de43976daa298baadf2f1956459da93afbf6"
            },
            "downloads": -1,
            "filename": "recommenders-1.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "2e8a6919a9859acd9eb5c50e1fa943c7",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 277820,
            "upload_time": "2024-05-01T18:45:31",
            "upload_time_iso_8601": "2024-05-01T18:45:31.380349Z",
            "url": "https://files.pythonhosted.org/packages/90/1b/0e30360bad76e8cb6c4c968cd7624ca5f6c6896a4285a919c735f202b50b/recommenders-1.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-01 18:45:31",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "recommenders-team",
    "github_project": "recommenders",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "recommenders"
}
        