brc-pytorch

Name	brc-pytorch
Version	0.1.3
home_page	https://github.com/niklexical/brc_pytorch
Summary	Pytorch Implementation of BRC.
upload_time	2020-12-11 08:58:36
maintainer
docs_url	None
author	Nikita Janakarajan, Jannis Born
requires_python
license	MIT
keywords	pytorch deep learning rnn brc
VCS
bugtrack_url
requirements	numpy torch matplotlib seaborn pytest coverage
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

[![PyPI version](https://badge.fury.io/py/brc-pytorch.svg)](https://badge.fury.io/py/brc-pytorch)
[![Build Status](https://travis-ci.com/niklexical/brc_pytorch.svg?branch=master)](https://travis-ci.com/niklexical/brc_pytorch)
[![codecov](https://codecov.io/gh/niklexical/brc_pytorch/branch/master/graph/badge.svg?token=UQ5O5CP8KD)](https://codecov.io/gh/niklexical/brc_pytorch)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

# brc_pytorch
PyTorch implementation of the bistable recurrent cell (BRC) with baseline comparisons.

This repository contains a PyTorch implementation of the paper ["A bio-inspired bistable recurrent cell allows for long-lasting memory"](https://arxiv.org/abs/2006.05252). The original `tensorflow` implementation by the author Nicolas Vecoven can be found [here](https://github.com/nvecoven/BRC).
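
For orientation, the following is a minimal sketch of the BRC update for a single scalar unit, written from our reading of the paper's equations rather than taken from this package. The function name `brc_step` and the weight names (`u_c`, `w_c`, `u_a`, `w_a`, `u`) are hypothetical. The package itself operates on tensors with learned parameters.

```py
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def brc_step(x, h, u_c, w_c, u_a, w_a, u):
    """One BRC update for a single unit (illustrative sketch, scalar weights).

    x: current input, h: previous hidden state.
    """
    # Update gate: how much of the previous state is kept.
    c = sigmoid(u_c * x + w_c * h)
    # Feedback gain in (0, 2); values above 1 are what allow bistability.
    a = 1.0 + math.tanh(u_a * x + w_a * h)
    # Convex combination of the old state and the candidate state.
    return c * h + (1.0 - c) * math.tanh(u * x + a * h)


# Feed one input, then zeros, and watch how the state evolves.
h = 0.0
for x in [1.0, 0.0, 0.0, 0.0]:
    h = brc_step(x, h, u_c=1.0, w_c=1.0, u_a=1.0, w_a=1.0, u=1.0)
```

In the neuromodulated variant (nBRC), as described in the paper, the element-wise recurrent weights inside the two gates are replaced by full matrices, which a scalar sketch like this cannot show.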

Another important feature of this repository is a base class that builds a recurrent neural network from a given recurrent cell. Depending on the hyperparameters provided, the network can have multiple layers, be bidirectional, and accept input with or without the batch dimension first. The outputs of the network mimic those returned by the GRU/LSTM networks in PyTorch, with the additional option of returning only the hidden states from the last layer and last time step.
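
As a rough guide to the output shapes, the sketch below follows the `torch.nn.GRU` conventions. The helper `expected_shapes` is hypothetical and is not part of `brc_pytorch`. The shape used when only the last hidden state is returned is an assumption; check the package source for the exact behaviour.

```py
def expected_shapes(batch, seq_len, hidden_size, num_layers,
                    bidirectional, batch_first, return_sequences):
    """Output shapes following torch.nn.GRU conventions (hypothetical helper)."""
    num_directions = 2 if bidirectional else 1
    features = num_directions * hidden_size
    if not return_sequences:
        # Assumption: only the last layer's state at the last time step,
        # with both directions concatenated along the feature axis.
        return {"output": (batch, features)}
    if batch_first:
        output = (batch, seq_len, features)
    else:
        output = (seq_len, batch, features)
    # nn.GRU stacks the final states of all layers and directions.
    h_n = (num_layers * num_directions, batch, hidden_size)
    return {"output": output, "h_n": h_n}


shapes = expected_shapes(batch=8, seq_len=20, hidden_size=16, num_layers=3,
                         bidirectional=True, batch_first=True,
                         return_sequences=True)
```
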
## Package setup

`brc_pytorch` can be installed from PyPI:
```sh
pip install brc_pytorch
```
### Development setup
Create a `venv`, activate it, then install the dependencies and the package in editable mode:
```sh
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -e .
```

### Usage (example)
```py
from brc_pytorch.layers import BistableRecurrentCell, NeuromodulatedBistableRecurrentCell
from brc_pytorch.layers import MultiLayerBase

# Create a 3-layer nBRC (behaves like a nn.GRU)

input_size = 32
hidden_size = 16
num_layers = 3
bidirectional = True
batch_first = True
return_sequences = False

num_directions = 2 if bidirectional else 1

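# A single cell behaves like a nn.GRUCell; collect one cell per layer in a list
nbrc = [NeuromodulatedBistableRecurrentCell(input_size, hidden_size)]

# Append cells for the subsequent layers, keeping in mind that their input is
# the output of the previous layer. Assumption: that output has size
# num_directions * hidden_size, as for a bidirectional nn.GRU.
inner_input_dimensions = num_directions * hidden_size
for _ in range(num_layers - 1):
    nbrc.append(
        NeuromodulatedBistableRecurrentCell(inner_input_dimensions, hidden_size)
    )

# Assumption: MultiLayerBase takes the list of cells, one per layer
three_layer_nbrc = MultiLayerBase(
    "nBRC",
    nbrc,
    hidden_size,
    batch_first,
    bidirectional,
    return_sequences,
)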
```


## Validation studies

First, the implementations of both the BRC and nBRC are validated on the
Copy-First-Input task (Benchmark 1 from the original paper). Moreover, it is well known
that standard RNNs have difficulties with *discrete counting*, especially for
longer sequences (see this
[NeurIPS 2015 paper](http://papers.nips.cc/paper/5857-inferring-algorithmic-patterns-with-stack-augmented-recurrent-nets)).
We therefore identify **Binary Addition** as another
task on which the nBRC outperforms the LSTM and GRU, which suggests implications for a
set of tasks involving more explicit memorization. For both tasks, the
performance of the BRC and nBRC is compared with that of the LSTM and GRU cells.

### Copy-First-Input

The goal of this task is to correctly predict the number at the start of a sequence of a certain length. 
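
To make the task concrete, here is a minimal data-generation sketch. It assumes each sequence consists of Gaussian noise and that the target is simply the first element. The function name, the noise distribution and the seed handling are illustrative assumptions and may differ from what the training scripts in this repository do.

```py
import random


def make_copy_first_input(num_samples, seq_len, seed=0):
    """Generate (sequence, target) pairs where the target is the first value.

    Hypothetical helper: the noise distribution is an assumption.
    """
    rng = random.Random(seed)
    sequences, targets = [], []
    for _ in range(num_samples):
        seq = [rng.gauss(0.0, 1.0) for _ in range(seq_len)]
        sequences.append(seq)
        # The network has to remember this value until the end of the sequence.
        targets.append(seq[0])
    return sequences, targets


sequences, targets = make_copy_first_input(num_samples=4, seq_len=5)
```

The longer the sequence, the more time steps the first value has to be retained, which is why the task is run with increasing sequence lengths.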

This task is reproduced from the paper: a 2-layer model with 100 units per layer is trained on datasets with increasing sequence lengths (5, 100 and 300). The plot is obtained by taking a moving average of the training loss per gradient iteration, with a window size of 100 for lengths 100 and 300, and a window size of 20 for length 5.

The results from the Copy-First-Input task show trends similar to those in the paper, thus confirming its findings. It should, however, be noted that the absolute losses are higher than those reported in the paper. This is mostly because the training and test sets are much smaller and no hyperparameter tuning was done.

![copy-first-input](https://github.com/niklexical/brc_pytorch/raw/master/results/copy-first-input.png)

To reproduce this task:
1. Change directory to the `scripts` folder. From the terminal, run the following commands:
```sh
# The following command creates a directory with subdirectories in the scripts folder to save the models and results.
mkdir -p test_benchmark1/{models,results}
# Run the training script with your python executable. The following is an example for Anaconda.
/opt/anaconda3/envs/venv/bin/python brc_benchmark1.py test_benchmark1/models/ test_benchmark1/results/

```
Or, if training takes a very long time, run the script cell-wise, i.e., specify the cell name as an additional argument and run multiple jobs in parallel, one for each cell.
```sh
/opt/anaconda3/envs/venv/bin/python brc_benchmark1_cell.py nBRC test_benchmark1/models/ test_benchmark1/results/

```
2. Calculate the moving average for each `TrainLoss_AllE_*.npy` file in `test_benchmark1/results/` and plot it.
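
The moving average in step 2 can be sketched as follows. This is a plain-Python illustration, assuming a simple (unweighted) average over full windows only. The loss values would come from loading a `TrainLoss_AllE_*.npy` file, for example with `numpy.load`, and converting it to a flat list. The function name is hypothetical.

```py
def moving_average(values, window):
    """Unweighted moving average over full windows only.

    Returns len(values) - window + 1 points.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    # Keep a running sum so each step costs O(1).
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Toy loss curve; the README uses window sizes of 20 and 100 on real losses.
smoothed = moving_average([1, 2, 3, 4, 5], window=2)
```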

### Binary Addition

Additional testing on Binary Addition was done to test the capabilities of these cells. The goal of this task is to correctly predict the sum of two binary numbers (in integer form).
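
A minimal sketch of how such samples could be generated is shown below. The encoding (one bit of each number per time step, least significant bit first) and the helper names are assumptions for illustration. The training script in this repository may encode the inputs differently.

```py
import random


def to_bits(number, width):
    """Return the bits of `number`, least significant bit first."""
    return [(number >> i) & 1 for i in range(width)]


def make_binary_addition(num_samples, num_bits, seed=0):
    """Generate bit-pair sequences and their integer sums (hypothetical helper)."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for _ in range(num_samples):
        a = rng.randrange(2 ** num_bits)
        b = rng.randrange(2 ** num_bits)
        # Each time step carries one bit of a and one bit of b.
        inputs.append(list(zip(to_bits(a, num_bits), to_bits(b, num_bits))))
        # The target is the sum in integer form.
        targets.append(a + b)
    return inputs, targets


inputs, targets = make_binary_addition(num_samples=3, num_bits=8)
```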

Both single-layer and 2-layer models, with 100 hidden units per layer, are evaluated on the accuracy of their predictions.

The results from this task demonstrate the usefulness of both the nBRC and BRC layers, which consistently perform better than both the LSTM and GRU. Moreover, the nBRC stands out on the binary addition task, maintaining near-perfect accuracy up to a sequence length of 60. The plots are obtained by averaging the results over 5 runs of the experiment and highlighting the standard error of the mean.

![binary-addition-1layer](https://github.com/niklexical/brc_pytorch/raw/master/results/binary_addition_1layer.png)

![binary-addition-2layer](https://github.com/niklexical/brc_pytorch/raw/master/results/binary_addition_2layer.png)

While the Copy-First-Input task highlights the superior performance of these cells over the conventional LSTM and GRU, the Binary Addition task, which requires counting, shows that they are useful beyond just long-lasting memory.

To reproduce this task:

1. Change directory to the `scripts` folder. From the terminal, run the following command:
```sh
# The following command creates a directory with subdirectories in the scripts folder to save the models and results.
mkdir -p test_binary_addition/{models,results}/{test1,test2,test3,test4,test5}

```
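2. Create and run the following Python script from the same directory. The script launches every run with `sys.executable`, so start it with the Python interpreter of the environment in which `brc_pytorch` is installed.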
```py
import os
import sys
import subprocess

dir_models = 'test_binary_addition/models/'
dir_results = 'test_binary_addition/results/'

modelpaths = [
    os.path.join(dir_models,f'test{i}') for i in range(1,6)
]
resultpaths = [
    os.path.join(dir_results,f'test{i}') for i in range(1,6)
]

procs = []
for i in range(5):
    proc = subprocess.Popen(
        [
            sys.executable,
            'binary_addition_train.py',
            modelpaths[i], resultpaths[i]
        ]
    )
    procs.append(proc)

for proc in procs:
    proc.wait()
```

3. Calculate the mean and standard error of the mean over the different tests for each `test_acc_*.npy` file and plot.

For the 2-layer implementation, simply add another 100 to the `hidden_sizes` variable in the training file and repeat the steps.
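
Step 3 can be sketched as below, assuming the accuracies of each run have already been loaded (for example with `numpy.load`) into equal-length lists, one list per run. The function name and the toy numbers are purely illustrative and are not results from this repository.

```py
import math
import statistics


def mean_and_sem(runs):
    """Per-position mean and standard error of the mean across runs.

    runs: list of equal-length lists, one per run (at least two runs).
    """
    num_runs = len(runs)
    means, sems = [], []
    # zip(*runs) groups the values measured at the same position in every run.
    for values in zip(*runs):
        means.append(statistics.fmean(values))
        # Sample standard deviation divided by sqrt(number of runs).
        sems.append(statistics.stdev(values) / math.sqrt(num_runs))
    return means, sems


# Toy input: two runs, two evaluation points each.
means, sems = mean_and_sem([[1.0, 0.5], [0.0, 0.5]])
```

The means give the curve to plot and the standard errors give the width of the band around it.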

Raw data

{
    "_id": null,
    "home_page": "https://github.com/niklexical/brc_pytorch",
    "name": "brc-pytorch",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "PyTorch,Deep Learning,RNN,BRC",
    "author": "Nikita Janakarajan, Jannis Born",
    "author_email": "nikita.janakarajan907@gmail.com, jannis.born@gmx.de",
    "download_url": "https://files.pythonhosted.org/packages/7e/d3/aa4c08a52eece25df84933eb4ad930719319e77a6bfd048d2be4cdad13f8/brc_pytorch-0.1.3.tar.gz",
    "platform": "",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Pytorch Implementation of BRC.",
    "version": "0.1.3",
    "project_urls": {
        "Homepage": "https://github.com/niklexical/brc_pytorch"
    },
    "split_keywords": [
        "pytorch",
        "deep learning",
        "rnn",
        "brc"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "86110a71bcf0abc40b1d5decf9b27a1ea9dd9cb18e377ecb5387a3e09414cb0d",
                "md5": "dd9d64a75b40a16252e798842e0ff781",
                "sha256": "bc597866ec41081d58f68ed3ca590c202c243c0bc619612043d55b38509a52be"
            },
            "downloads": -1,
            "filename": "brc_pytorch-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "dd9d64a75b40a16252e798842e0ff781",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 14914,
            "upload_time": "2020-12-11T08:58:35",
            "upload_time_iso_8601": "2020-12-11T08:58:35.144729Z",
            "url": "https://files.pythonhosted.org/packages/86/11/0a71bcf0abc40b1d5decf9b27a1ea9dd9cb18e377ecb5387a3e09414cb0d/brc_pytorch-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7ed3aa4c08a52eece25df84933eb4ad930719319e77a6bfd048d2be4cdad13f8",
                "md5": "8764a5a3f3c52c184e854f08942cf8c8",
                "sha256": "797ac173ed4f4e4b4127d49d430b3775e680c453ba04c9feaee9549a6531c25f"
            },
            "downloads": -1,
            "filename": "brc_pytorch-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "8764a5a3f3c52c184e854f08942cf8c8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 13710,
            "upload_time": "2020-12-11T08:58:36",
            "upload_time_iso_8601": "2020-12-11T08:58:36.693185Z",
            "url": "https://files.pythonhosted.org/packages/7e/d3/aa4c08a52eece25df84933eb4ad930719319e77a6bfd048d2be4cdad13f8/brc_pytorch-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2020-12-11 08:58:36",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "niklexical",
    "github_project": "brc_pytorch",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.14.3"
                ]
            ]
        },
        {
            "name": "torch",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    ">=",
                    "3.1.1"
                ]
            ]
        },
        {
            "name": "seaborn",
            "specs": [
                [
                    ">=",
                    "0.9.0"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    "==",
                    "6.2.5"
                ]
            ]
        },
        {
            "name": "coverage",
            "specs": [
                [
                    "==",
                    "5.3"
                ]
            ]
        }
    ],
    "lcname": "brc-pytorch"
}
