# tf-seq2seq-losses
TensorFlow implementations of
[Connectionist Temporal Classification](https://www.cs.toronto.edu/~graves/icml_2006.pdf)
(CTC) loss functions that are fast and support second-order derivatives.
## Installation
```bash
$ pip install tf-seq2seq-losses
```
## Why Use This Package?
### 1. Faster Performance
The official CTC loss implementation,
[`tf.nn.ctc_loss`](https://www.tensorflow.org/api_docs/python/tf/nn/ctc_loss),
is significantly slower.
Our implementation is approximately 30 times faster, as shown by the following benchmark results:

| Name | Forward Time (ms) | Gradient Calculation Time (ms) |
|:------------------:|:-----------------:|:------------------------------:|
| `tf.nn.ctc_loss` | 13.2 ± 0.02 | 10.4 ± 3 |
| `classic_ctc_loss` | 0.138 ± 0.006 | 0.28 ± 0.01 |
| `simple_ctc_loss` | 0.0531 ± 0.003 | 0.119 ± 0.004 |

Tested on a single GPU (GeForce GTX 970, driver version 460.91.03, CUDA version 11.2). For the experimental setup, see
[`benchmark.py`](tests/performance_test.py).
To reproduce this benchmark, run the following command from the project root directory
(install `pytest` and `pandas` if needed):
```bash
$ pytest -o log_cli=true --log-level=INFO tests/benchmark.py
```
Here, `classic_ctc_loss` is the standard version of CTC loss with token collapsing, e.g., `a_bb_ccc_c -> abcc`.
The `simple_ctc_loss` is a simplified version that removes blanks trivially, e.g., `a_bb_ccc_c -> abbcccc`.
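To make the distinction concrete, here is a minimal plain-Python sketch of the two collapse rules (illustrative only; the actual loss functions operate on logit tensors, not strings, and `_` below stands for the blank token):
```python
def classic_collapse(tokens: str) -> str:
    """Classic CTC rule: merge repeated tokens, then drop blanks."""
    merged = [t for i, t in enumerate(tokens) if i == 0 or t != tokens[i - 1]]
    return "".join(t for t in merged if t != "_")


def simple_collapse(tokens: str) -> str:
    """Simplified rule: only drop blanks, keep repetitions."""
    return tokens.replace("_", "")


assert classic_collapse("a_bb_ccc_c") == "abcc"
assert simple_collapse("a_bb_ccc_c") == "abbcccc"
```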
### 2. Supports Second-Order Derivatives
This implementation supports second-order derivatives without using TensorFlow's automatic differentiation.
Instead, it uses a custom approach similar to the one described
[here](https://www.tensorflow.org/api_docs/python/tf/nn/ctc_loss),
with complexity $O(l^4)$, where $l$ is the sequence length.
The gradient complexity is $O(l^2)$.
Example usage:
```python
import tensorflow as tf
from tf_seq2seq_losses import classic_ctc_loss

batch_size = 2
num_tokens = 3
logit_length = 5

labels = tf.constant([[1, 2, 2, 1], [1, 2, 1, 0]], dtype=tf.int32)
label_length = tf.constant([4, 3], dtype=tf.int32)
logits = tf.zeros(shape=[batch_size, logit_length, num_tokens], dtype=tf.float32)
logit_length = tf.constant([5, 4], dtype=tf.int32)

with tf.GradientTape(persistent=True) as tape1:
    tape1.watch([logits])
    with tf.GradientTape() as tape2:
        tape2.watch([logits])
        loss = tf.reduce_sum(classic_ctc_loss(
            labels=labels,
            logits=logits,
            label_length=label_length,
            logit_length=logit_length,
            blank_index=0,
        ))
    # First-order derivative with respect to the logits.
    gradient = tape2.gradient(loss, sources=logits)

# Second-order derivative (batched Hessian), shape [2, 5, 3, 5, 3].
hessian = tape1.batch_jacobian(gradient, source=logits, experimental_use_pfor=False)
```
### 3. Numerical Stability
1. The proposed implementation is more numerically stable,
producing reasonable outputs even for logits on the order of `1e+10` and for `-tf.inf`.
2. If the logit length is too short to produce the target label sequence,
the loss is `tf.inf` for that sample (see the sketch below), unlike `tf.nn.ctc_loss`, which might output a finite value such as `707.13184`.
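For example, a minimal sketch of the second point (using the same call signature as in the usage example below):
```python
import tensorflow as tf
from tf_seq2seq_losses import classic_ctc_loss

# Two logit frames cannot emit the four-token label [1, 2, 2, 1]
# (classic CTC also needs a blank between the repeated 2s),
# so the loss for this sample is expected to be infinite.
loss = classic_ctc_loss(
    labels=tf.constant([[1, 2, 2, 1]], dtype=tf.int32),
    logits=tf.zeros(shape=[1, 2, 3], dtype=tf.float32),  # batch 1, 2 frames, 3 tokens
    label_length=tf.constant([4], dtype=tf.int32),
    logit_length=tf.constant([2], dtype=tf.int32),
    blank_index=0,
)
print(loss)  # expected: [inf]
```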
### 4. Pure Python Implementation
This is a pure Python/TensorFlow implementation, eliminating the need to build or compile any C++/CUDA components.
## Usage
The interface is identical to that of `tf.nn.ctc_loss` with `logits_time_major=False`.
Example:
```python
import tensorflow as tf
from tf_seq2seq_losses import classic_ctc_loss

batch_size = 1
num_tokens = 3  # = 2 tokens + 1 blank token
logit_length = 5

loss = classic_ctc_loss(
    labels=tf.constant([[1, 2, 2, 1]], dtype=tf.int32),
    logits=tf.zeros(shape=[batch_size, logit_length, num_tokens], dtype=tf.float32),
    label_length=tf.constant([4], dtype=tf.int32),
    logit_length=tf.constant([logit_length], dtype=tf.int32),
    blank_index=0,
)
```
## Under the Hood
The implementation uses TensorFlow operations such as `tf.while_loop` and `tf.TensorArray`.
The main computational bottleneck is the iteration over the logit length needed to calculate the forward and backward variables α and β
(as described in the original
[CTC paper](https://www.cs.toronto.edu/~graves/icml_2006.pdf)).
The expected GPU time for the gradient calculation grows linearly with the logit length.
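As a rough illustration of this pattern (a schematic sketch only, not the package's actual internals; the state layout and transition handling in `tf_seq2seq_losses` may differ), a forward-variable recursion driven by `tf.while_loop` and `tf.TensorArray` looks like this:
```python
import tensorflow as tf

# Schematic sketch: iterate over logit frames, writing each step's forward
# variable alpha into a TensorArray; all quantities are in log space.
def forward_pass_sketch(log_probs: tf.Tensor, transition: tf.Tensor) -> tf.Tensor:
    """log_probs: [time, states] per-frame log-probabilities of lattice states.
    transition: [states, states] log-transition matrix of the label lattice.
    Returns alpha: [time, states] forward log-probabilities."""
    time_steps = tf.shape(log_probs)[0]
    alphas = tf.TensorArray(dtype=tf.float32, size=time_steps)

    def body(t, alpha_prev, alphas):
        # alpha_t[j] = logsumexp_i(alpha_{t-1}[i] + transition[i, j]) + log_probs[t, j]
        alpha_t = tf.reduce_logsumexp(
            alpha_prev[:, tf.newaxis] + transition, axis=0
        ) + log_probs[t]
        return t + 1, alpha_t, alphas.write(t, alpha_t)

    init_alpha = log_probs[0]
    _, _, alphas = tf.while_loop(
        cond=lambda t, *_: t < time_steps,
        body=body,
        loop_vars=(tf.constant(1), init_alpha, alphas.write(0, init_alpha)),
    )
    return alphas.stack()
```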
## Known Issues
### 1. Warning:
> AutoGraph could not transform <function classic_ctc_loss at ...> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10
(on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Observed with TensorFlow version 2.4.1.
This warning does not affect performance; it is caused by the use of `Union` in type annotations.
### 2. UnimplementedError:
Using `tf.GradientTape.jacobian` or `tf.GradientTape.batch_jacobian` for the second derivative of `classic_ctc_loss` with
`experimental_use_pfor=False` may cause an unexpected `UnimplementedError`
in TensorFlow version 2.4.1 or later.
This can be avoided by setting `experimental_use_pfor=True`
or by using `ClassicCtcLossData.hessian` directly, without `tf.GradientTape`.
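For instance, the Hessian from the second-order example above can be requested through the parallel-for path instead (assuming `tape1`, `gradient`, and `logits` are defined as in that example):
```python
# Workaround sketch: use the parallel-for implementation of batch_jacobian
# rather than the while_loop-based one that may raise UnimplementedError.
hessian = tape1.batch_jacobian(gradient, source=logits, experimental_use_pfor=True)
```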
Feel free to reach out if you have any questions or need further clarification.