slune-lib


- Name: slune-lib
- Version: 0.0.2
- Summary: A package for performing hyperparameter tuning with the SLURM scheduling system.
- Upload time: 2023-12-02 14:47:28
- Requires Python: >=3.1
- License: MIT License
- Keywords: slurm, hyperparameter, tuning, machine, learning, optimisation
- Requirements: No requirements were recorded.

![PyPI - Version](https://img.shields.io/pypi/v/slune-lib)
[![license](https://img.shields.io/badge/License-MIT-purple.svg)](LICENSE)
![badge](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/h-0-0/4aa01e058fee448070c587f6967037e4/raw/CodeCovSlune.json)

![badge](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/h-0-0/4aa01e058fee448070c587f6967037e4/raw/Tests-macos.json)
![badge](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/h-0-0/4aa01e058fee448070c587f6967037e4/raw/Tests-ubuntu.json)
![badge](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/h-0-0/4aa01e058fee448070c587f6967037e4/raw/Tests-windows.json)



# slune (= slurm + tune!)
A super simple Python package for hyperparameter tuning (or, more generally, launching jobs and saving results) on a cluster using SLURM. It takes advantage of the fact that lots of jobs (including hyperparameter tuning) are embarrassingly parallel! With slune you can divide your compute into lots of separately scheduled jobs, so each small job can start running on your cluster sooner, often speeding up your workflow significantly!

Slune is super easy to use! We have helper functions which can execute everything you need done for you, letting you speed up your work without wasting time.

Slune is barebones by design. This means that you can easily write code to integrate with slune if you want to do something a bit different! You can also work out what each function is doing pretty easily.

Slune is flexible. In designing this package I've tried to make as few assumptions as possible, meaning that it can be used for lots of stuff outside hyperparameter tuning (or also within)! For example, you can get slune to give you paths for where to save things, submit lots of jobs in parallel for any sort of script, and do grid search. And there's more to come!

The docs are [here](https://h-0-0.github.io/slune/).

## Usage
Let's go through a quick example of how we can use slune. First, let's define a model that we want to train:
```python
# Simple Regularized Linear Regression without using external libraries

# Function to compute the mean of a list
def mean(values):
    return sum(values) / float(len(values))

# Function to compute the covariance between two lists
def covariance(x, mean_x, y, mean_y):
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - mean_x) * (y[i] - mean_y)
    return covar

# Function to compute the variance of a list
def variance(values, mean):
    return sum((x - mean) ** 2 for x in values)

# Function to compute coefficients for a simple regularized linear regression
def coefficients_regularized(x, y, alpha):
    mean_x, mean_y = mean(x), mean(y)
    var_x = variance(x, mean_x)
    covar = covariance(x, mean_x, y, mean_y)
    b1 = (covar + alpha * var_x) / (var_x + alpha)
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Function to make predictions with a simple regularized linear regression model
def linear_regression_regularized(train_X, train_y, test_X, alpha):
    b0, b1 = coefficients_regularized(train_X, train_y, alpha)
    predictions = [b0 + b1 * x for x in test_X]
    return predictions

# ------------------
# The above is code for a simple regularized linear regression model that we want to train.
# Now let's fit the model and use slune to save how well our model performs!
# ------------------

if __name__ == "__main__":
    # First let's load in the value for the regularization parameter alpha that has been passed to this script from the command line. We will use the slune helper function lsargs to do this. 
    # lsargs returns a tuple of the python path and a list of arguments passed to the script. We can then use this to get the alpha value.
    from slune import lsargs
    python_path, args = lsargs()
    alpha = float(args[0])

    # Mock training dataset, function is y = 1 + 1 * x
    X = [1, 2, 3, 4, 5]
    y = [2, 3, 4, 5, 6]

    # Mock test dataset
    test_X = [6, 7, 8]
    test_y = [7, 8, 9]
    test_predictions = linear_regression_regularized(X, y, test_X, alpha)

    # First let's load in a function that we can use to get a saver object that uses the default method of logging. The saving will be coordinated by a csv saver object which saves and reads results from csv files stored in a hierarchy of directories.
    from slune import get_csv_saver
    csv_saver = get_csv_saver(params = args)

    # Let's now calculate the mean squared error of our predictions and log it!
    # mean() calls len(), so we pass it a list rather than a generator.
    mse = mean([(test_y[i] - test_predictions[i])**2 for i in range(len(test_y))])
    csv_saver.log({'mse': mse})

    # Let's now save our logged results!
    csv_saver.save_collated()
```
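
Before sending anything to the cluster, it can help to sanity check the model locally. The snippet below is a minimal sketch that doesn't use slune at all: it re-implements the same toy formulas in compact form (the `fit` and `evaluate` helpers are names made up for this illustration) and prints the test MSE for each alpha in the grid we search over below.

```python
def mean(values):
    values = list(values)  # also accepts generators
    return sum(values) / float(len(values))


def fit(x, y, alpha):
    """Coefficients using the same toy formula as the model above."""
    mean_x, mean_y = mean(x), mean(y)
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    covar = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = (covar + alpha * var_x) / (var_x + alpha)
    b0 = mean_y - b1 * mean_x
    return b0, b1


def evaluate(alpha):
    """Test MSE on the mock data for one value of alpha."""
    X, y = [1, 2, 3, 4, 5], [2, 3, 4, 5, 6]
    test_X, test_y = [6, 7, 8], [7, 8, 9]
    b0, b1 = fit(X, y, alpha)
    return mean((t - (b0 + b1 * x)) ** 2 for x, t in zip(test_X, test_y))


# Hypothetical local dry run over the same alpha values as the grid search.
results = {alpha: evaluate(alpha) for alpha in [0.25, 0.5, 0.75]}
for alpha, mse in results.items():
    print(f"alpha={alpha}: mse={mse:.4f}")
best_alpha = min(results, key=results.get)
print(f"Lowest MSE at alpha={best_alpha}")
```

On this mock data the fit is exact at alpha = 0, so the error grows as alpha increases and the smallest alpha in the grid comes out on top.
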
Now let's write some code that will submit some jobs to train our model using different hyperparameters!!
```python
# Let's now load in a function that will coordinate our search! We're going to do a grid search.
# SearcherGrid is the class we can use to coordinate a grid search. We pass it a dictionary of hyperparameters and the values we want to try for each hyperparameter. We also pass it the number of runs we want to do for each combination of hyperparameters.
from slune.searchers import SearcherGrid
grid_searcher = SearcherGrid({'alpha' : [0.25, 0.5, 0.75]}, runs = 1)

# Let's now import a function which will submit jobs for our model.
# script_path is the path to the script that contains the model we want to train.
# template_path is the path to the template script that specifies the job.
# cargs is a list of constant arguments we want to pass to the script for every run.
# We set saver to None because we want every job to run, even if it has been run before.
from slune import sbatchit
script_path = 'model.py'
template_path = 'template.sh'
sbatchit(script_path, template_path, grid_searcher, cargs=[], saver=None)
```
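
To get a feel for what a grid search over SLURM boils down to, here is a rough, slune-free sketch. It expands a grid into one argument list per job and builds (but does not run) an `sbatch` command for each. The helper names `expand_grid` and `build_commands`, the `--name=value` argument format and the argument order are all assumptions made for illustration; see the docs for what `sbatchit` actually does.

```python
import itertools
import shlex


def expand_grid(grid, runs=1):
    """Return one list of '--name=value' strings per job (assumed format)."""
    names = list(grid)
    configs = []
    for values in itertools.product(*(grid[name] for name in names)):
        args = [f"--{name}={value}" for name, value in zip(names, values)]
        # Repeat each combination `runs` times.
        configs.extend(list(args) for _ in range(runs))
    return configs


def build_commands(script_path, template_path, configs, cargs=()):
    """Build one sbatch command per configuration without executing it."""
    return [
        ["sbatch", template_path, script_path, *cargs, *args]
        for args in configs
    ]


configs = expand_grid({"alpha": [0.25, 0.5, 0.75]}, runs=1)
commands = build_commands("model.py", "template.sh", configs)
for command in commands:
    # Dry run: print the command instead of submitting it.
    print(shlex.join(command))
```

Each printed line is an independent job, which is what makes the search embarrassingly parallel: the scheduler can start any of them as soon as resources free up.
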
Now that we've submitted our jobs, we wait for them to finish 🕛🕐🕑🕒🕓🕔🕕🕖🕗🕘🕙🕚🕛. Once they are finished, we can read the results!
```python
from slune import get_csv_saver
csv_saver = get_csv_saver(params = None)
params, value = csv_saver.read(params = [], metric_name = 'mse', select_by ='min')
print(f'Best hyperparameters: {params}')
print(f'Their MSE: {value}')
```
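
If you're curious what "reading the best result from a hierarchy of csv files" can look like, here is a minimal standard-library sketch. The directory layout (`alpha=<value>/results.csv`), the `best_result` helper and the numbers written to disk are all made up for this illustration; they are not slune's actual on-disk format or API.

```python
import csv
import tempfile
from pathlib import Path


def best_result(root, metric_name, select_by="min"):
    """Return (directory name, value) for the best metric found under root."""
    pick = min if select_by == "min" else max
    found = []
    for csv_path in sorted(Path(root).rglob("*.csv")):
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                if row.get(metric_name):
                    found.append((float(row[metric_name]), csv_path.parent.name))
    if not found:
        raise ValueError(f"no '{metric_name}' values found under {root}")
    value, params = pick(found, key=lambda item: item[0])
    return params, value


# Build a throwaway directory tree with made-up numbers to exercise the function.
with tempfile.TemporaryDirectory() as root:
    for alpha, mse in [(0.25, 0.8), (0.5, 3.1), (0.75, 6.6)]:
        run_dir = Path(root) / f"alpha={alpha}"
        run_dir.mkdir()
        with (run_dir / "results.csv").open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["mse"])
            writer.writerow([mse])
    best_params, best_value = best_result(root, "mse", select_by="min")

print(f"Best hyperparameters: {best_params}")
print(f"Their MSE: {best_value}")
```
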
Amazing! 🥳 We have successfully used slune to train our model. I hope this gives you a good idea of how you can use slune and how easy it is to use!

Please check out the examples folder for notebooks detailing in more depth some potential ways you can use slune and of course please check out the docs! 

## Roadmap
Still in early stages! First thing on the horizon is better integration with SLURM:
- Set-up notifications for job completion, failure, etc.
- Auto job naming, job output naming and job output location saving.
- Auto save logged results when finishing a job.
- Automatically re-submit failed jobs.
- Tools for monitoring and cancelling jobs.

After that, I'll look at adding more savers, loggers and searchers! For example: integration with TensorBoard, saving to one csv file (as opposed to a hierarchy of csv files in different directories), and different search methods like random search and cross validation. It would perhaps also be beneficial to be able to interface with other languages like R and Julia. Finally, more helper functions!

However, I am trying to keep this package as bloat-free as possible to make it easy for you to tweak and configure to your individual needs. It's written in a simple and compartmentalized manner for this reason. You can of course use the helper functions and let slune handle everything under the hood, but you can also very quickly and easily write your own classes to work with other savers, loggers and searchers to do as you please.
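
As a taste of what a home-made searcher could look like, here is a small random search sketch. It is purely hypothetical: `RandomSearchSketch` does not subclass anything from slune, and the `--name=value` argument format is an assumption, so check the docs for the interface a real searcher needs to implement.

```python
import random


class RandomSearchSketch:
    """Hypothetical searcher: yields randomly sampled argument lists."""

    def __init__(self, choices, num_samples, seed=0):
        self.choices = choices          # e.g. {"alpha": [0.25, 0.5, 0.75]}
        self.num_samples = num_samples  # how many configurations to draw
        self.seed = seed                # fixed seed so the draw is repeatable

    def __iter__(self):
        # A fresh generator per iteration makes repeated loops give the same draws.
        rng = random.Random(self.seed)
        for _ in range(self.num_samples):
            yield [
                f"--{name}={rng.choice(values)}"
                for name, values in self.choices.items()
            ]


searcher = RandomSearchSketch({"alpha": [0.25, 0.5, 0.75]}, num_samples=4)
for args in searcher:
    print(args)
```
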

## Installation
To install the latest version, use:
```bash
pip install slune-lib
```
To install the latest dev version, use (CURRENTLY RECOMMENDED):
```bash
# With https
pip install "git+https://github.com/h-0-0/slune.git#egg=slune-lib"
```
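
Since the release and dev versions can differ, it may be handy to check which one you ended up with. A small sketch using only the standard library (needs Python 3.8 or newer for `importlib.metadata`; the `installed_version` helper is a name made up here):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if slune-lib is installed, otherwise None.
print(installed_version("slune-lib"))
```
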

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "slune-lib",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.1",
    "maintainer_email": "",
    "keywords": "SLURM,hyperparameter,tuning,machine,learning,optimisation",
    "author": "",
    "author_email": "Henry Bourne <hwbourne@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/3e/07/d0c7e1591ab3bbdfbd1b0c76fd7ada6a3d7230bef3e57106a87c1257b2e4/slune-lib-0.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "A package for performing hyperparameter tuning with the SLURM scheduling system.",
    "version": "0.0.2",
    "project_urls": null,
    "split_keywords": [
        "slurm",
        "hyperparameter",
        "tuning",
        "machine",
        "learning",
        "optimisation"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0bbafcd0b8b1dd23c43025e49e094cfdb07e8d4d72e358849a9bbed64c021c28",
                "md5": "9ebc666bf030777c00a43fff958f20ee",
                "sha256": "d4e40489c0fc5c60dbb9416101eb22200f98eecefbb30ef2b821c9b216576ec2"
            },
            "downloads": -1,
            "filename": "slune_lib-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9ebc666bf030777c00a43fff958f20ee",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.1",
            "size": 18483,
            "upload_time": "2023-12-02T14:47:26",
            "upload_time_iso_8601": "2023-12-02T14:47:26.122342Z",
            "url": "https://files.pythonhosted.org/packages/0b/ba/fcd0b8b1dd23c43025e49e094cfdb07e8d4d72e358849a9bbed64c021c28/slune_lib-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3e07d0c7e1591ab3bbdfbd1b0c76fd7ada6a3d7230bef3e57106a87c1257b2e4",
                "md5": "c557f7c275ae005f608109bd3b02142f",
                "sha256": "509a1abb758eeba5711d0eef23971fc69288bbbf970a5a607391d464884488ab"
            },
            "downloads": -1,
            "filename": "slune-lib-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "c557f7c275ae005f608109bd3b02142f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.1",
            "size": 244505,
            "upload_time": "2023-12-02T14:47:28",
            "upload_time_iso_8601": "2023-12-02T14:47:28.092856Z",
            "url": "https://files.pythonhosted.org/packages/3e/07/d0c7e1591ab3bbdfbd1b0c76fd7ada6a3d7230bef3e57106a87c1257b2e4/slune-lib-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-02 14:47:28",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "slune-lib"
}
        