# findi-descent

* Name: findi-descent
* Version: 0.2.0
* Home page: <https://github.com/draktr/findi-descent>
* Summary: FinDi: Finite Difference Gradient Descent can optimize any function, including ones without an analytic form, by employing finite difference numerical differentiation within a gradient descent algorithm.
* Upload time: 2024-09-14 03:48:51
* Author: draktr
* Requires Python: >=3.8
* License: MIT License
* Keywords: optimization, gradient-descent, numerical-analysis, numerical-differentiation
            # FinDi: Finite Difference Gradient Descent

FinDi: Finite Difference Gradient Descent can optimize any function, including ones without an analytic form, by employing finite difference numerical differentiation within a gradient descent algorithm.

* Free software: MIT license
* Documentation: <https://findi-descent.readthedocs.io/en/latest/>

## Installation

The preferred way to install `findi` is through Python's package installer, pip. To install `findi`, run this command in your terminal:

```shell
pip install findi-descent
```

Alternatively, you can install the package directly from GitHub:

```shell
git clone -b development https://github.com/draktr/findi-descent.git
cd findi-descent
pip install .
```

## Finite Difference Gradient Descent - A Short Introduction

Finite Difference Gradient Descent (FDGD) is a modification of the regular GD algorithm that approximates the gradient of the objective function with finite difference derivatives, as

$$
\nabla f(v) = \frac{\partial f}{\partial X} =
\begin{bmatrix}
    \frac{\partial f}{\partial x_1} \\
    \frac{\partial f}{\partial x_2} \\
    \vdots                          \\
    \frac{\partial f}{\partial x_n} \\
\end{bmatrix}
\approx
\begin{bmatrix}
    \frac{\Delta f}{\Delta x_1} \\
    \frac{\Delta f}{\Delta x_2} \\
    \vdots                          \\
    \frac{\Delta f}{\Delta x_n} \\
\end{bmatrix}
$$
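As an illustrative sketch (not the library's internal implementation), a forward-difference approximation of the gradient can be written as:

```python
def fd_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x with forward differences."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        stepped = list(x)
        stepped[i] += h  # perturb the i-th coordinate only
        grad.append((f(stepped) - fx) / h)
    return grad

# f(x, y) = x**2 + y**2 has exact gradient (2x, 2y)
g = fd_gradient(lambda p: p[0] ** 2 + p[1] ** 2, [1.0, 2.0])
```

Each coordinate costs one extra function evaluation, so one gradient approximation takes `n + 1` evaluations for an `n`-dimensional problem.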

Analogously, the FDGD update rule is given as

$$
v_{t+1} = v_{t} - \gamma
\begin{bmatrix}
    \frac{\Delta f}{\Delta x_1} \\
    \frac{\Delta f}{\Delta x_2} \\
    \vdots                          \\
    \frac{\Delta f}{\Delta x_n} \\
\end{bmatrix}
$$

where $\gamma$ is the same as in regular GD. Given an appropriate $\gamma$, FDGD still constructs a monotonic sequence $f(v_{0}) \geq f(v_{1}) \geq f(v_{2}) \geq \cdots$; however, due to the gradient approximation, the convergence carries an error proportional to the error discussed in the *Differentiation* subsection. For more details, refer to the Mathematical Guide in the documentation.
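Putting the approximation and the update rule together, a minimal FDGD loop looks like the following (an illustrative sketch, not `findi`'s implementation):

```python
def fdgd(f, x0, gamma=0.1, h=1e-6, epochs=100):
    """Minimal finite-difference gradient descent."""
    x = [float(xi) for xi in x0]
    for _ in range(epochs):
        fx = f(x)
        # forward-difference approximation of each partial derivative
        grad = []
        for i in range(len(x)):
            stepped = list(x)
            stepped[i] += h
            grad.append((f(stepped) - fx) / h)
        # update rule: v_{t+1} = v_t - gamma * approximate gradient
        x = [xi - gamma * gi for xi, gi in zip(x, grad)]
    return x

# Minimize f(x) = (x + 2)**2; the minimizer is x = -2
solution = fdgd(lambda p: (p[0] + 2) ** 2, [5.0])
```

Note that the finite-difference fixed point is slightly offset from the true minimizer (by an amount on the order of `h`), which is the approximation error mentioned above.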

## Features

### Optimization Algorithms

* `descent()` - regular FDGD algorithm
* `partial_descent()` - FDGD algorithm where, in each epoch, `parameters_used` parameters are randomly selected to be differenced. This approach is useful when evaluating the objective function is computationally expensive
* `partially_partial_descent()` - FDGD algorithm that uses `partial_descent()` for the first `partial_epochs` epochs and `descent()` for the remaining epochs. This combines the computational efficiency of `partial_descent()` with the approximation accuracy of `descent()`
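A sketch of the idea behind partial descent (a hypothetical illustration, not the package's code): in each epoch only a random subset of coordinates is differenced and updated, so an epoch costs `parameters_used + 1` function evaluations instead of `n + 1`.

```python
import random

def partial_fdgd(f, x0, parameters_used, gamma=0.1, h=1e-6, epochs=200, seed=0):
    """FDGD that differences only a random subset of parameters each epoch."""
    rng = random.Random(seed)
    x = [float(xi) for xi in x0]
    for _ in range(epochs):
        base = list(x)
        fx = f(base)  # one baseline evaluation per epoch
        for i in rng.sample(range(len(base)), parameters_used):
            stepped = list(base)
            stepped[i] += h
            # only the sampled coordinates are differenced and updated
            x[i] = base[i] - gamma * (f(stepped) - fx) / h
    return x

# Sphere function in three dimensions; minimum at the origin
solution = partial_fdgd(lambda p: sum(v * v for v in p), [3.0, -1.0, 2.0], parameters_used=2)
```

Since every coordinate is still selected regularly over many epochs, the iterate drifts toward the minimizer, just with fewer evaluations per epoch.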

### Computational

* Numba mode - Numba just-in-time compilation is available for **all** algorithms, including automatically parallelized evaluation. This drastically decreases computation time; however, it requires the objective function to be Numba-compiled
* `joblib` parallelization - supported in Python mode. This is especially helpful for high-dimensional problems where a Numba-compiled objective function is infeasible
* `values_out()` function - exports output, parameter, and constant values for each epoch as a `Pandas` `DataFrame`. This is useful for, among other things, algorithm convergence analysis and hyperparameter (e.g. learning rate and difference) tuning
* Variable learning rates and difference values - besides scalars, the `l` and `h` arguments also accept array_like structures (e.g. lists and `Numpy` arrays). These can be constructed manually or with the [OptSchedule](https://pypi.org/project/optschedule/) library, which provides a variety of decay schedules
* `momentum` hyperparameter - accelerates gradient descent in the relevant direction and dampens oscillations. `momentum = 0` (the default) implies no acceleration or dampening. The update rule with `momentum > 0` is

$$
\begin{aligned}
v_{t} &= m \cdot v_{t-1} - l_{t} \cdot \frac{F(X_{t})-F(X_{t-1})}{h} \\
X_{t} &= X_{t-1} + v_{t}
\end{aligned}
$$
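A sketch of the momentum update above, using the same notation (an illustrative implementation, not the package internals):

```python
def fdgd_momentum(f, x0, m=0.9, l=0.01, h=1e-6, epochs=500):
    """Finite-difference gradient descent with a classical momentum term."""
    x = [float(xi) for xi in x0]
    v = [0.0] * len(x)
    for _ in range(epochs):
        fx = f(x)
        for i in range(len(x)):
            stepped = list(x)
            stepped[i] += h
            partial = (f(stepped) - fx) / h
            # v_t = m * v_{t-1} - l_t * finite-difference partial
            v[i] = m * v[i] - l * partial
        # X_t = X_{t-1} + v_t
        x = [xi + vi for xi, vi in zip(x, v)]
    return x

solution = fdgd_momentum(lambda p: (p[0] + 2) ** 2, [5.0])
```

With `m > 0`, past steps keep contributing to the current velocity, which speeds up progress along consistent descent directions.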

### Non-Mathematical Functions as Objectives

* support for **metaparameters** - FinDi accepts objective functions that take *metaparameters*. A *metaparameter* is any parameter passed to the objective function that **will not** be differenced and whose value is held constant across epochs
* support for **multiple outputs** - FinDi accepts objective functions that return more than one value. For example, if the objective function contains a convex optimization routine, FinDi allows it to return the regular objective value along with the solutions to that optimization problem. The first value of the returned structure is taken as the objective value to be minimized
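For example, an objective of the shape described above might look like this (a hypothetical objective, shown only to illustrate the conventions; `coefficient` is the metaparameter and only the first returned value would be minimized):

```python
def objective(params, coefficient):
    """`coefficient` is a metaparameter: passed in, never differenced."""
    value = coefficient * (params[0] - 1) ** 2
    diagnostic = params[0] - 1  # extra output, recorded but not minimized
    return [value, diagnostic]

# The first element of the returned list is the objective value
out = objective([3.0], coefficient=2.0)
```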

## Advantages Over Other Optimization Techniques

1) Optimizes objective functions that **cannot be expressed or solved analytically**, as well as **discontinuous functions**
2) Its implementation is **intuitive and easy to communicate**, unlike most derivative-free optimization methods
3) Works conveniently, through metaparameters, with black-box or proprietary objective functions whose source code might be inaccessible
4) Increased computational efficiency with Numba **just-in-time compilation**
5) Supports **parallelization** via the `joblib` or `numba` libraries
6) **Partial Gradient Descent** makes simple, high-dimensional problems less computationally expensive to solve
7) Built-in support for **variable learning rates and differences**

## A Quick Example

Below is a simple demonstrative example of how to use `findi`. More examples can be found in [the documentation](https://findi-descent.readthedocs.io/en/latest/), including examples of problems that can be solved by `findi` but not by other Python gradient descent implementations.

```python
import findi as fd

# Defining the objective function
def foo(params):
    return [(params[0]+2)**2]

# Descent
outputs, parameters = fd.descent(
    objective=foo,
    initial=5,
    h=0.0001,
    l=0.01,
    epochs=1000,
)

print("Solution (argmin): ", parameters[-1])
print("Objective value at solution (min): ", outputs[-1])

# Saves values of outputs and parameters as Pandas DataFrame...
values = fd.values_out(outputs, parameters, columns=["x"])
# ...to be stored as a CSV file
values.to_csv("values.csv")
```

## Project Principles

* Easy to understand and use for those with little computer science background, including scientists, researchers, and industry practitioners
* Flexibility for proprietary modifications
* Emphasis on computational efficiency
* Consistent usage across approaches (Numba vs Python, regular Gradient Descent vs Partial Gradient Descent, etc.)
* Tested
* Dedicated and detailed technical and applicative documentation
* Formatting deferred to [Black](https://github.com/psf/black)

## Future Development

Feature requests are more than welcome through the Issues forum, as are bug reports and improvement recommendations. Anyone is more than welcome to become a contributor or maintainer.

            
