caft


Name: caft
Version: 0.1.9
Home page: https://github.com/joshdunnlime/caft
Summary: Continuous Affine Feature Transformations for feature mapping.
Upload time: 2023-06-16 16:44:30
Author: Joshua Dunn
Requires Python: >=3.8
Keywords: feature-engineering, feature-mapping, anomaly-detection
# CAFT - Continuous Affine Feature Transformer

[![PyPI package](https://img.shields.io/badge/pip%20install-caft-brightgreen)](https://pypi.org/project/caft) [![version number](https://img.shields.io/pypi/v/caft?color=green&label=version)](https://pypi.org/project/caft) [![Unit Tests Status](https://github.com/joshdunnlime/caft/actions/workflows/test.yml/badge.svg)](https://github.com/joshdunnlime/caft/actions)
 [![License](https://img.shields.io/github/license/joshdunnlime/caft)](https://github.com/joshdunnlime/caft)

A custom transformer package that lets users apply affine/geometric transformations to datasets with respect to a curve that has a well-defined continuous equation.

The transformers attempt to follow the scikit-learn API. However, because they operate on both `X` and `y`, they will likely cause issues when used within a scikit-learn pipeline.
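To illustrate the limitation: a scikit-learn `Pipeline` passes only `X` from step to step and calls `transform(X)` on intermediate steps. The sketch below uses a stand-in for that calling convention and a hypothetical `ShiftXY` transformer (neither is part of `caft` or scikit-learn) to show how a transformer whose `transform` requires `y` fails under it.

```python
class ShiftXY:
    """Hypothetical transformer that, like caft's, changes both X and y."""

    def fit(self, X, y):
        self.x_mean_ = sum(X) / len(X)
        self.y_mean_ = sum(y) / len(y)
        return self

    def transform(self, X, y):
        Xt = [x - self.x_mean_ for x in X]
        yt = [v - self.y_mean_ for v in y]
        return Xt, yt


def pipeline_style_transform(step, X):
    """Stand-in for how a pipeline calls an intermediate step: X only."""
    return step.transform(X)


X = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
step = ShiftXY().fit(X, y)

# Calling the transformer directly works and returns both arrays.
Xt, yt = step.transform(X, y)

# The pipeline-style call fails because `y` is never supplied.
try:
    pipeline_style_transform(step, X)
    pipeline_error = None
except TypeError as exc:
    pipeline_error = exc
```

In practice this suggests calling `fit` and `transform` directly rather than placing the transformer inside a pipeline.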

## Installation

Install `caft` via pip with

```bash
pip install caft
```
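
To confirm which version ended up in the environment, the standard library can be queried. This is a small sketch using `importlib.metadata` (available from Python 3.8, the minimum this package requires); the helper name `installed_version` is illustrative only.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed caft version, or None if caft is missing.
print(installed_version("caft"))
```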

## Documentation

There is currently no hosted documentation, but most functions are well documented and include examples.

Alternatively, there is a thorough example in the [example.ipynb](./example.ipynb) notebook.

## Usage

The main pattern is as follows.

```python
import numpy as np
import matplotlib.pyplot as plt

from caft.odr import SympyODRegressor
from caft.affine import ContinuousAffineFeatureTransformer

np.random.seed(42)

n = 10000

# Generate data with some natural noise (not errors)
X_true = np.linspace(-2, 2, n) + np.random.uniform(-0.5, 0.5, n)

# Add random measurement errors - both small and extreme
errors_in_X = np.random.normal(0, 0.3, n)
errors_in_y = np.random.normal(0, 5, n)
y =  3 * (X_true + errors_in_X) ** 3 + errors_in_y
fx = 3 * X_true ** 3

# Add systematic error
n_errs = 100
X_outliers = -0.5 * np.ones(n_errs) + 0.2 * np.random.uniform(-0.3, 0.5, n_errs)
y_outliers = -30 * np.ones(n_errs) + np.random.normal(0, 3, n_errs)
X = np.hstack([X_true, X_outliers]).reshape(-1, 1)
y = np.hstack([y, y_outliers])

# Plot the noisy data and, in red, the noise-free function
plt.scatter(X, y)
plt.scatter(X_true, fx, color="r", s=1)
plt.show()
```

![Scatter plot of the noisy data with the noise-free function in red](https://github.com/joshdunnlime/caft/blob/main/fx_scatter_plot.png)

Here we can see the scatter plot of `X` and `y` together with the original function $y = f(x)$ without noise. Now we can create an affine transformation with respect to the original function (or at least the `SympyODRegressor` estimate of it).


```python
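# Model equation with free parameters a and b
eq = "a * x ** 3 + b"

# Scale both variables by their maximum value
X_ = X / X.max()
y_ = y / y.max()

# Fit the transformer (which wraps the regressor), then transform X and y together
sodr = SympyODRegressor(eq, beta0={"a": 0.5, "b": 1})
transformer = ContinuousAffineFeatureTransformer(sodr, optimiser="halley")
transformer.fit(X_, y_)
Xt, yt = transformer.transform(X_, y_)
Xt = Xt.reshape(-1, 1)

plt.scatter(Xt, yt, s=6)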
plt.show()
```

![Scatter plot of the transformed data](https://github.com/joshdunnlime/caft/blob/main/img/fx_scatter_plot.png)
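
The README does not describe what the `"halley"` optimiser does internally. One plausible reading, which is an assumption here and not confirmed by the source, is that each data point is related to its nearest point on the fitted curve, found with Halley's root-finding method. The sketch below shows that idea in plain Python for the cubic $y = a t^3 + b$; the function name `closest_point_on_cubic` and its parameters are hypothetical and not part of `caft`.

```python
def closest_point_on_cubic(x0, y0, a, b, tol=1e-12, max_iter=50):
    """Find the point on y = a*t**3 + b nearest to (x0, y0).

    Applies Halley's method to g(t) = (t - x0) + (f(t) - y0) * f'(t),
    which is half the derivative of the squared distance and is zero
    at the foot of the perpendicular from (x0, y0) to the curve.
    """
    t = x0  # start from the point's own x-coordinate
    for _ in range(max_iter):
        f = a * t**3 + b
        f1 = 3 * a * t**2  # f'(t)
        f2 = 6 * a * t     # f''(t)
        f3 = 6 * a         # f'''(t)
        r = f - y0         # vertical residual at t
        g = (t - x0) + r * f1
        g1 = 1 + f1**2 + r * f2    # g'(t)
        g2 = 3 * f1 * f2 + r * f3  # g''(t)
        step = 2 * g * g1 / (2 * g1**2 - g * g2)
        t -= step
        if abs(step) < tol:
            break
    return t, a * t**3 + b


# Nearest point on y = t**3 to the off-curve point (1.0, 2.0)
tx, ty = closest_point_on_cubic(1.0, 2.0, a=1.0, b=0.0)
distance = ((tx - 1.0) ** 2 + (ty - 2.0) ** 2) ** 0.5
```

The orthogonal distance found this way is shorter than the vertical residual at `x0`, which is the kind of quantity a curve-relative transformation could use as a feature.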

A more thorough example can be found in the [example.ipynb](./example.ipynb) notebook.

Nesting a regressor within a transformer is a somewhat unusual pattern. Its benefit is that each component can be used on its own: the regressor for standalone equation regression, or the transformer with a regressor of your own that supplies the curve equation.

## Development

Deploy new versions to PyPI using GitHub Actions:

Update the version number in `caft/__init__.py` to `__version__ = "X.Y.Z"`, then tag the release and push the tag:

```bash
git tag -a "vX.Y.Z" -m "deployment message"
git push --tags
```
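
Because the version string and the tag are edited by hand, they can drift apart. A minimal sketch of a pre-tag check follows, assuming the `__version__ = "X.Y.Z"` and `vX.Y.Z` conventions above; the helper names `read_version` and `tag_matches_version` are hypothetical and not part of the repository.

```python
import re


def read_version(init_text):
    """Extract X.Y.Z from a line such as `__version__ = "X.Y.Z"`."""
    match = re.search(
        r'^__version__\s*=\s*["\']([^"\']+)["\']', init_text, re.MULTILINE
    )
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def tag_matches_version(tag, init_text):
    """Return True when the tag is 'v' followed by the package version."""
    return tag == "v" + read_version(init_text)


# Example with inline text; in practice, read caft/__init__.py from disk.
example_init = '__version__ = "0.1.9"\n'
print(tag_matches_version("v0.1.9", example_init))  # True
print(tag_matches_version("v0.2.0", example_init))  # False
```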

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/joshdunnlime/caft",
    "name": "caft",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "feature-engineering feature-mapping,anomaly-detection",
    "author": "Joshua Dunn",
    "author_email": "joshua.t.dunn@hotmail.co.uk",
    "download_url": "https://files.pythonhosted.org/packages/f0/10/b5b0ecf96c5483f7d86914644c51718faf404ad2c8e3d756c0bef519e0e3/caft-0.1.9.tar.gz",
    "platform": null,
    "description": "# CAFT - Continuous Affine Feature Transformer\n\n[![PyPI package](https://img.shields.io/badge/pip%20install-caft-brightgreen)](https://pypi.org/project/caft) [![version number](https://img.shields.io/pypi/v/example-pypi-package?color=green&label=version)](https://github.com/tomchen/example_pypi_package/releases) [![Unit Tests Status](https://github.com/joshdunnlime/caft/actions/workflows/test.yml/badge.svg)](https://github.com/joshdunnlime/caft/actions)\n [![License](https://img.shields.io/github/license/tomchen/example_pypi_package)](https://github.com/tomchen/example_pypi_package/blob/main/LICENSE)\n\nA custom transformer package that allows users to make affine/geometric transformations on datasets with respect to some curve with a well defined continuous equation.\n\nThe transformers attempt to follow the scikit-learn api, however, there are limitations here based on the fact that transformers operate on both `X` and `y` variables. This will likely cause issues when used within a scikit-learn pipeline.\n\n## Installation\n\nInstall `caft` via pip with\n\n```bash\npip install caft\n```\n\n## Documentation\n\nCurrently, there is no hosted documentation but most functions are well documented, with examples.\n\nAlternatively, there is a thorough example in the [example.ipynb](./example.ipynb) notebook.\n\n## Useage\n\nThe main pattern is as follows.\n\n```python\nimport sympy as sp\nimport numpy as np\nimport matplotlib.pyplot as plt\n\nfrom caft.odr import SympyODRegressor, ODRegressor\nfrom caft.affine import ContinuousAffineFeatureTransformer\n\nnp.random.seed(42)\n\nn = 10000\n\n# Generate data with some natural noise (not errors)\nX_true = np.linspace(-2, 2, n) + np.random.uniform(-0.5, 0.5, n)\n\n# Add random measurement errors - both small and extreme\nerrors_in_X = np.random.normal(0, 0.3, n)\nerrors_in_y = np.random.normal(0, 5, n)\ny =  3 * (X_true + errors_in_X) ** 3 + errors_in_y\nfx = 3 * X_true ** 3\n\n# Add systematic 
error\nn_errs = 100\nX_outliers = -0.5 * np.ones(n_errs) + 0.2 * np.random.uniform(-0.3, 0.5, n_errs)\ny_outliers = -30 * np.ones(n_errs) + np.random.normal(0, 3, n_errs)\nX = np.hstack([X_true, X_outliers]).reshape(-1, 1)\ny = np.hstack([y, y_outliers])\n\nplt.scatter(X, y)\nplt.scatter(X_true, fx, color=\"r\", s=1,)\n```\n\n![Alt text](https://github.com/joshdunnlime/caft/blob/main/fx_scatter_plot.png)\n\nHere we can see the scatter plot of `X` and `y` and the original function $y = f(x)$ without noise. Now we can create an affine transformation with respect to the original function (or at least the SympyRegressor estimate of it).\n\n\n```python\neq = \"a * x ** 3 + b\"\n\nX_ = X / X.max()\ny_ = y / y.max()\n\nsodr = SympyODRegressor(eq, beta0={\"a\": 0.5, \"b\": 1})\ncaft = ContinuousAffineFeatureTransformer(sodr, optimiser=\"halley\")\ncaft.fit(X_, y_)\nXt, yt = caft.transform(X_, y_)\nXt = Xt.reshape(-1, 1)\n\nplt.scatter(Xt, yt, s=6,)\nplt.show()\n```\n\n![Alt text](https://github.com/joshdunnlime/caft/blob/main/img/fx_scatter_plot.png)\n\nA most thorough example can be found in [example.ipynb](./example.ipynb) notebook.\n\nThis is some what of an unusual pattern, using a nested regressor within a transformer. However, the benefit here is that it allows each component to be used individually, either for individual equation regression or by rolling your own regressors to create the regressor equation.\n\n## Development\n\nDeploy new versions to PyPI using GitHub Actions:\n\nChange version number to `__version__ = \"X.Y.Z\"` in `caft/__init__.py` then\n\n```bash\ngit tag -a \"vX.Y.Z\" -m \"deployment message\"\ngit push --tags\n```\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "Continuous Affine Feature Transformations for feature mapping.",
    "version": "0.1.9",
    "project_urls": {
        "Bug Reports": "https://github.com/joshdunnlime/caft/issues",
        "Documentation": "https://github.com/joshdunnlime/caft",
        "Homepage": "https://github.com/joshdunnlime/caft",
        "Source Code": "https://github.com/joshdunnlime/caft"
    },
    "split_keywords": [
        "feature-engineering feature-mapping",
        "anomaly-detection"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "44767720aa205d3fdfe1a3c765c63f4bce8ed1a5916a35de8fbdd6adfce76143",
                "md5": "e7b7e20787252072b3fa2731d4b47e1a",
                "sha256": "f20f13c8aef4802fc744197200871983495feb609b4eef251db5c2b312e7ab88"
            },
            "downloads": -1,
            "filename": "caft-0.1.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e7b7e20787252072b3fa2731d4b47e1a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 12786,
            "upload_time": "2023-06-16T16:44:29",
            "upload_time_iso_8601": "2023-06-16T16:44:29.359804Z",
            "url": "https://files.pythonhosted.org/packages/44/76/7720aa205d3fdfe1a3c765c63f4bce8ed1a5916a35de8fbdd6adfce76143/caft-0.1.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f010b5b0ecf96c5483f7d86914644c51718faf404ad2c8e3d756c0bef519e0e3",
                "md5": "38288455f3b26be23844e1dc5f90978a",
                "sha256": "0605e151d305556625dd808a465c731b381427cbc71d1f34e85a69b8d7f6d3e7"
            },
            "downloads": -1,
            "filename": "caft-0.1.9.tar.gz",
            "has_sig": false,
            "md5_digest": "38288455f3b26be23844e1dc5f90978a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 75864,
            "upload_time": "2023-06-16T16:44:30",
            "upload_time_iso_8601": "2023-06-16T16:44:30.605957Z",
            "url": "https://files.pythonhosted.org/packages/f0/10/b5b0ecf96c5483f7d86914644c51718faf404ad2c8e3d756c0bef519e0e3/caft-0.1.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-16 16:44:30",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "joshdunnlime",
    "github_project": "caft",
    "github_not_found": true,
    "lcname": "caft"
}
        