keras-hrp


| Field | Value |
| --- | --- |
| Name | keras-hrp |
| Version | 0.2.0 |
| Home page | http://github.com/ulf1/keras-hrp |
| Summary | Hashed Random Projection layer for TF2/Keras |
| Upload time | 2023-07-10 08:12:04 |
| Author | Ulf Hamster |
| Requires Python | >=3.7 |
| License | Apache License 2.0 |
| Requirements | tensorflow, numpy, numba, scipy |

[![PyPI version](https://badge.fury.io/py/keras-hrp.svg)](https://badge.fury.io/py/keras-hrp)
[![PyPi downloads](https://img.shields.io/pypi/dm/keras-hrp)](https://img.shields.io/pypi/dm/keras-hrp)


# keras-hrp
Hashed Random Projection layer for TF2/Keras.

## Usage
<a href="demo/Hashed Random Projections.ipynb">Hashed Random Projections (HRP), binary representations, encoding/decoding for storage</a> (notebook)


### Generate an HRP layer with a new hyperplane
The random projection matrix (the hyperplane) is initialized randomly.
Set the initial state of the PRNG via `random_state` (default: 42) to keep the projection reproducible.

```py
import keras_hrp as khrp
import tensorflow as tf

BATCH_SIZE = 32
NUM_FEATURES = 64
OUTPUT_SIZE = 1024

# demo inputs
inputs = tf.random.normal(shape=(BATCH_SIZE, NUM_FEATURES))

# instantiate layer 
layer = khrp.HashedRandomProjection(
    output_size=OUTPUT_SIZE,
    random_state=42   # Default: 42
)

# run it
outputs = layer(inputs)
assert outputs.shape == (BATCH_SIZE, OUTPUT_SIZE)
```
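To illustrate the idea behind the layer, here is a minimal, dependency-free sketch: each input vector is projected onto randomly drawn hyperplane normals, and only the sign of each projection is kept as one bit. The helper names `make_hyperplane` and `hash_vector` are hypothetical and not part of `keras-hrp`; the Gaussian initialization and the `> 0` threshold are assumptions, and the layer's actual internals may differ.

```python
import random


def make_hyperplane(num_features, output_size, seed=42):
    """Draw a (num_features x output_size) Gaussian matrix from a seeded PRNG.

    Hypothetical helper, assumed for illustration only.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(output_size)]
            for _ in range(num_features)]


def hash_vector(x, hyperplane):
    """Project x onto every column of the matrix and keep the sign as a bit."""
    output_size = len(hyperplane[0])
    bits = []
    for j in range(output_size):
        projection = sum(x_i * row[j] for x_i, row in zip(x, hyperplane))
        bits.append(1 if projection > 0 else 0)
    return bits


NUM_FEATURES, OUTPUT_SIZE = 4, 16
x = [0.5, -1.2, 0.3, 2.0]

# the same seed yields the same hyperplane, hence the same hash
plane_a = make_hyperplane(NUM_FEATURES, OUTPUT_SIZE, seed=42)
plane_b = make_hyperplane(NUM_FEATURES, OUTPUT_SIZE, seed=42)
code_a = hash_vector(x, plane_a)
code_b = hash_vector(x, plane_b)

assert code_a == code_b
assert len(code_a) == OUTPUT_SIZE
```

The sketch shows why a fixed seed matters: hashes are only reproducible if the same projection matrix is drawn again.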


### Instantiate an HRP layer with a given hyperplane

```py
import keras_hrp as khrp
import tensorflow as tf
import numpy as np

BATCH_SIZE = 32
NUM_FEATURES = 64
OUTPUT_SIZE = 1024

# demo inputs
inputs = tf.random.normal(shape=(BATCH_SIZE, NUM_FEATURES))

# create hyperplane as numpy array
myhyperplane = np.random.randn(NUM_FEATURES, OUTPUT_SIZE)

# instantiate layer 
layer = khrp.HashedRandomProjection(hyperplane=myhyperplane)

# run it
outputs = layer(inputs)
assert outputs.shape == (BATCH_SIZE, OUTPUT_SIZE)

```
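One plausible reason to pass a fixed hyperplane is that binary hashes are only comparable when they come from the same projection. The sketch below is plain Python with hypothetical helpers (`make_hyperplane`, `hash_vector`, `hamming`) that are not part of `keras-hrp`; it assumes sign thresholding at zero. It hashes three vectors with one shared matrix and compares the results by Hamming distance.

```python
import random


def make_hyperplane(num_features, output_size, seed):
    """Hypothetical helper: seeded Gaussian projection matrix."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(output_size)]
            for _ in range(num_features)]


def hash_vector(x, hyperplane):
    """One bit per column: 1 if the projection of x is positive, else 0."""
    return [1 if sum(x_i * row[j] for x_i, row in zip(x, hyperplane)) > 0 else 0
            for j in range(len(hyperplane[0]))]


def hamming(a, b):
    """Number of positions in which two bit lists differ."""
    return sum(bit_a != bit_b for bit_a, bit_b in zip(a, b))


OUTPUT_SIZE = 64
shared = make_hyperplane(3, OUTPUT_SIZE, seed=0)  # created once, reused for all inputs

x = [1.0, 2.0, -0.5]
x_scaled = [2.0 * v for v in x]   # same direction, different length
x_flipped = [-v for v in x]       # opposite direction

d_same = hamming(hash_vector(x, shared), hash_vector(x_scaled, shared))
d_opposite = hamming(hash_vector(x, shared), hash_vector(x_flipped, shared))
print(d_same, d_opposite)
```

Under these assumptions, vectors pointing in the same direction share every bit, while a vector and its negation differ in every bit (unless a projection is exactly zero). Hashes created with different hyperplanes carry no such relationship.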


### Serialize Boolean to Int8
In Python, NumPy boolean arrays store each 1-bit value as a full byte (8 bits).
Some database technologies behave similarly and use 8 times the theoretically required storage space (e.g., the Postgres `boolean` type takes 1 byte instead of 1 bit).
To save memory or storage space, chunks of 8 boolean vector elements can be packed into a single 1-byte int8 number.

```py
import keras_hrp as khrp
import numpy as np

# given boolean values
hashvalues = np.array([1, 0, 1, 0, 1, 1, 0, 0])

# serialize boolean to int8
serialized = khrp.bool_to_int8(hashvalues)

# deserialize int8 to boolean
deserialized = khrp.int8_to_bool(serialized)

# check
np.testing.assert_array_equal(deserialized, hashvalues)
```
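The packing step itself can be sketched in a few lines of plain Python. This is not the code behind `khrp.bool_to_int8`; the function names `pack_bits_to_int8` and `unpack_int8_to_bits` are hypothetical, and the bit order (first element is the most significant bit) and the signed int8 interpretation are assumptions that may differ from the library.

```python
def pack_bits_to_int8(bits):
    """Pack groups of 8 bits (first bit = most significant) into signed int8 values."""
    if len(bits) % 8 != 0:
        raise ValueError("number of bits must be a multiple of 8")
    packed = []
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | (1 if bit else 0)
        # reinterpret the unsigned value 0..255 as a signed int8 (-128..127)
        packed.append(byte - 256 if byte >= 128 else byte)
    return packed


def unpack_int8_to_bits(values):
    """Inverse of pack_bits_to_int8: expand every int8 back into 8 bits."""
    bits = []
    for value in values:
        byte = value & 0xFF  # back to the unsigned bit pattern
        bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    return bits


hashvalues = [1, 0, 1, 0, 1, 1, 0, 0]
packed = pack_bits_to_int8(hashvalues)   # 0b10101100 = 172, i.e. -84 as int8
restored = unpack_int8_to_bits(packed)

assert packed == [-84]
assert restored == hashvalues
```

With the `OUTPUT_SIZE = 1024` used in the examples above, a hash stored this way shrinks from 1024 bytes (one byte per boolean) to 128 bytes.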


## Appendix

### Installation
The `keras-hrp` [git repo](http://github.com/ulf1/keras-hrp) is available as a [PyPI package](https://pypi.org/project/keras-hrp).

```sh
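# install the released package from PyPI
pip install keras-hrp
# or install the current code from GitHub (requires SSH access to GitHub)
pip install git+ssh://git@github.com/ulf1/keras-hrp.git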
```

### Install a virtual environment

```sh
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt --no-cache-dir
pip install -r requirements-dev.txt --no-cache-dir
pip install -r requirements-demo.txt --no-cache-dir
```

(If your git repo is stored in a folder whose path contains whitespace, don't use the subfolder `.venv`. Use an absolute path without whitespace instead.)

### Python commands

* Jupyter for the examples: `jupyter lab`
* Check syntax: `flake8 --ignore=F401 --exclude=$(grep -v '^#' .gitignore | xargs | sed -e 's/ /,/g')`
* Run Unit Tests: `PYTHONPATH=. pytest`

Publish:

```sh
# pandoc README.md --from markdown --to rst -s -o README.rst
python setup.py sdist 
twine upload -r pypi dist/*
```

### Clean up 

```sh
# delete compiled files and cache folders (also works for paths containing whitespace)
find . -type f -name "*.pyc" -delete
find . -type d -name "__pycache__" -prune -exec rm -r {} +
rm -r .pytest_cache
rm -r .venv
```


### Support
Please [open an issue](https://github.com/ulf1/keras-hrp/issues/new) for support.


### Contributing
Please contribute using [GitHub Flow](https://guides.github.com/introduction/flow/). Create a branch, add commits, and [open a pull request](https://github.com/ulf1/keras-hrp/compare/).

### Acknowledgements
The "Evidence" project was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - [433249742](https://gepris.dfg.de/gepris/projekt/433249742) (GU 798/27-1; GE 1119/11-1).

### Maintenance
- Until 31.Aug.2023 (v0.1.0), the code repository was maintained within the DFG project [433249742](https://gepris.dfg.de/gepris/projekt/433249742?context=projekt&task=showDetail&id=433249742&).
- Since 01.Sep.2023 (v0.2.0), the code repository has been maintained by [@ulf1](https://github.com/ulf1).

### Citation
Please cite the arXiv preprint when using this software for any purpose.

```
@misc{hamster2023rediscovering,
      title={Rediscovering Hashed Random Projections for Efficient Quantization of Contextualized Sentence Embeddings}, 
      author={Ulf A. Hamster and Ji-Ung Lee and Alexander Geyken and Iryna Gurevych},
      year={2023},
      eprint={2304.02481},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```



            

Raw data

{
    "_id": null,
    "home_page": "http://github.com/ulf1/keras-hrp",
    "name": "keras-hrp",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "",
    "author": "Ulf Hamster",
    "author_email": "554c46@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/21/89/3a3290e28f3c2d2d7b7b60db92367a7b25e84ac6ff94ecfb835307d3fb8e/keras-hrp-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "Hashed Random Projection layer for TF2/Keras",
    "version": "0.2.0",
    "project_urls": {
        "Homepage": "http://github.com/ulf1/keras-hrp"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "21893a3290e28f3c2d2d7b7b60db92367a7b25e84ac6ff94ecfb835307d3fb8e",
                "md5": "88314389617cf68bd4531f1975dfc567",
                "sha256": "03909b40a26c2f3270c99f649cc2e8e6aceaf7dc005ba2d73e56fafed8fbb75c"
            },
            "downloads": -1,
            "filename": "keras-hrp-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "88314389617cf68bd4531f1975dfc567",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 9645,
            "upload_time": "2023-07-10T08:12:04",
            "upload_time_iso_8601": "2023-07-10T08:12:04.257847Z",
            "url": "https://files.pythonhosted.org/packages/21/89/3a3290e28f3c2d2d7b7b60db92367a7b25e84ac6ff94ecfb835307d3fb8e/keras-hrp-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-10 08:12:04",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ulf1",
    "github_project": "keras-hrp",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "tensorflow",
            "specs": [
                [
                    ">=",
                    "2.8.0"
                ],
                [
                    "<",
                    "3"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "<",
                    "2"
                ],
                [
                    ">=",
                    "1.19.5"
                ]
            ]
        },
        {
            "name": "numba",
            "specs": [
                [
                    ">=",
                    "0.53.1"
                ],
                [
                    "<",
                    "1"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    ">=",
                    "1.5.4"
                ],
                [
                    "<",
                    "2"
                ]
            ]
        }
    ],
    "lcname": "keras-hrp"
}
        