k2


| Field | Value |
| --- | --- |
| Name | k2 |
| Version | 1.24.0 |
| home_page | https://github.com/k2-fsa/k2 |
| Summary | FSA/FST algorithms, intended to (eventually) be interoperable with PyTorch and similar |
| upload_time | 2023-04-27 04:27:16 |
| maintainer | |
| docs_url | None |
| author | Daniel Povey |
| requires_python | >=3.6 |
| license | |
| keywords | k2, fsa, fst |
| VCS | |
| bugtrack_url | |
| requirements | graphviz, torch, typing_extensions |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

            <div align="center">
<a href="https://k2-fsa.github.io/k2/">
  <img src="https://raw.githubusercontent.com/k2-fsa/k2/master/docs/source/_static/logo.png" width=88>
</a>

<br/>

[![Documentation Status](https://github.com/k2-fsa/k2/actions/workflows/build-doc.yml/badge.svg)](https://k2-fsa.github.io/k2/)

</div>

# k2

The vision of k2 is to be able to seamlessly integrate Finite State Automaton
(FSA) and Finite State Transducer (FST) algorithms into autograd-based machine
learning toolkits like PyTorch and TensorFlow.  For speech recognition
applications, this should make it easy to interpolate and combine various
training objectives such as cross-entropy, CTC and MMI and to jointly optimize a
speech recognition system with multiple decoding passes including lattice
rescoring and confidence estimation.  We hope k2 will have many other
applications as well.

One of the key algorithms that we have implemented is
pruned composition of a generic FSA with a "dense" FSA (i.e. one that
corresponds to log-probs of symbols at the output of a neural network).  This
can be used as a fast implementation of decoding for ASR, and for CTC and
LF-MMI training.  This won't directly improve Word Error Rate over existing
technology; the point is to do it in a much more general and extensible
framework that allows further development of ASR technology.
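
To make the idea of pruned composition with a dense FSA more concrete, here is a toy sketch in plain Python. It is not k2's implementation or API: the arc format `(src, dst, label, score)`, the function name `pruned_viterbi`, and the convention that state 0 is the start state and the last state is final are all assumptions made for illustration. It also only tracks the best score per state (a Viterbi search with a beam) instead of building the composed lattice.

```python
import math

NEG_INF = -math.inf


def pruned_viterbi(arcs, num_states, log_probs, beam):
    """Best score from state 0 to state num_states - 1, consuming one frame per arc.

    arcs:      list of (src, dst, label, score) tuples (hypothetical format).
    log_probs: one list of per-symbol log-probs per frame (the "dense" side).
    beam:      states scoring worse than (best - beam) are dropped after each frame.
    """
    active = {0: 0.0}  # state -> best score so far
    for frame in log_probs:
        nxt = {}
        for src, dst, label, score in arcs:
            if src in active:
                cand = active[src] + score + frame[label]
                if cand > nxt.get(dst, NEG_INF):
                    nxt[dst] = cand
        if not nxt:
            return NEG_INF
        best = max(nxt.values())
        # Beam pruning: keep only states close enough to the current best.
        active = {s: v for s, v in nxt.items() if v >= best - beam}
    return active.get(num_states - 1, NEG_INF)


# Two alternative paths from state 0 to state 3, over two frames and two symbols.
arcs = [(0, 1, 0, 0.0), (0, 2, 1, 0.0), (1, 3, 0, 0.0), (2, 3, 1, 0.0)]
frames = [[-0.5, -2.0], [-4.0, -0.25]]

wide = pruned_viterbi(arcs, 4, frames, beam=10.0)   # nothing is pruned
narrow = pruned_viterbi(arcs, 4, frames, beam=1.0)  # state 2 is pruned after frame 1
```

With the wide beam the search finds the truly best path; with the narrow beam the path through state 2 is discarded early and a worse path wins. Pruning trades exactness for speed in this way.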

## Implementation

 A few key points on our implementation strategy.

 Most of the code is in C++ and CUDA.  We implement a templated class `Ragged`,
 which is quite like TensorFlow's `RaggedTensor` (actually we came up with the
 design independently, and were later told that TensorFlow was using the same
 ideas).  Despite a close similarity at the level of data structures, the
 design is quite different from TensorFlow and PyTorch.  Most of the time we
 don't use composition of simple operations, but rely on C++11 lambdas defined
 directly in the C++ implementations of algorithms.  The code in these lambdas operates
 directly on data pointers and, if the backend is CUDA, it can run in parallel
 for each element of a tensor.  (The C++ and CUDA code is mixed together and the
 CUDA kernels get instantiated via templates.)
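
As a rough illustration of the pattern described above, the following plain-Python sketch models a ragged array as a flat `values` list plus row offsets, and applies a per-element function the way a lambda would be launched over every element. The class layout, the names `row_splits`, `row_ids`, and `eval_per_element`, and the toy kernel are assumptions for illustration; the real `Ragged` class is C++/CUDA and considerably richer.

```python
class Ragged:
    """Toy ragged array: row i is values[row_splits[i]:row_splits[i + 1]]."""

    def __init__(self, rows):
        self.row_splits = [0]
        self.values = []
        for row in rows:
            self.values.extend(row)
            self.row_splits.append(len(self.values))
        # row_ids[k] is the index of the row that owns values[k].
        self.row_ids = [r for r, row in enumerate(rows) for _ in row]


def eval_per_element(n, fn):
    """Stand-in for a parallel launch: fn(i) must not depend on call order."""
    for i in range(n):
        fn(i)


ragged = Ragged([[1.0, 2.0], [], [3.0, 4.0, 5.0]])
out = [0.0] * len(ragged.values)


def kernel(i):
    # Each element finds its own row and position without looking at neighbors.
    row = ragged.row_ids[i]
    pos_in_row = i - ragged.row_splits[row]
    out[i] = ragged.values[i] * (pos_in_row + 1)


eval_per_element(len(ragged.values), kernel)
```

Because `kernel(i)` reads only precomputed index arrays and writes only `out[i]`, the calls are independent of each other, which is what lets the same body run one-thread-per-element on a GPU.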

 It is difficult to adequately describe what we are doing with these `Ragged`
 objects without going in detail through the code.  The algorithms look very
 different from the way you would code them on CPU because of the need to avoid
 sequential processing.  We are using coding patterns that make the most
 expensive parts of the computations "embarrassingly parallelizable"; the only
 somewhat nontrivial CUDA operations are generally reduction-type operations
 such as exclusive-prefix-sum, for which we use NVIDIA's `cub` library.  Our
 design is not too specific to NVIDIA hardware and the bulk of the code we
 write is fairly normal-looking C++; the nontrivial CUDA programming is mostly
 done via the `cub` library, parts of which we wrap with our own convenient
 interface.
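
The role of the exclusive prefix sum can be sketched in a few lines of plain Python. The example assumes a common use of the operation: turning per-row element counts into starting offsets, so that every row knows where to write its output without any sequential bookkeeping. The function and variable names are illustrative, and the loop below is a sequential reference, not a parallel scan.

```python
def exclusive_sum(xs):
    """out[i] = sum of xs[:i]; sequential reference for a parallel scan."""
    out, total = [], 0
    for x in xs:
        out.append(total)
        total += x
    return out


row_sizes = [2, 0, 3]
# Appending a trailing 0 yields n + 1 offsets; the last one is the total size.
row_splits = exclusive_sum(row_sizes + [0])
```

Once the offsets are known, row `i` owns the output range `row_splits[i]` to `row_splits[i + 1]`, and all rows can be filled in parallel.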

 The Finite State Automaton object is then implemented as a Ragged tensor templated
 on a specific data type (a struct representing an arc in the automaton).
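
A minimal sketch of that representation in plain Python follows. The field names on `Arc` and the helpers `build_fsa` and `arcs_leaving` are assumptions chosen for illustration, not a description of the actual C++ types: arcs are stored in one flat list sorted by source state, and a `row_splits` array marks where each state's arcs begin and end.

```python
from dataclasses import dataclass


@dataclass
class Arc:
    src_state: int
    dest_state: int
    label: int
    score: float


def build_fsa(arcs, num_states):
    """Return (row_splits, arcs) with arcs grouped by source state."""
    arcs = sorted(arcs, key=lambda a: a.src_state)  # stable sort keeps arc order per state
    row_splits = [0] * (num_states + 1)
    for a in arcs:
        row_splits[a.src_state + 1] += 1  # count arcs per state
    for s in range(num_states):
        row_splits[s + 1] += row_splits[s]  # counts -> offsets
    return row_splits, arcs


def arcs_leaving(fsa, state):
    row_splits, arcs = fsa
    return arcs[row_splits[state]:row_splits[state + 1]]


fsa = build_fsa(
    [Arc(0, 1, 1, -0.5), Arc(1, 2, 2, -0.25), Arc(0, 2, 3, -1.0)],
    num_states=3,
)
```

Here `arcs_leaving(fsa, 0)` returns the two arcs out of state 0, and a state with no outgoing arcs simply has an empty row.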

## Autograd

 If you look at the code as it exists now, you won't find any references to
 autograd.  The design is quite different from TensorFlow and PyTorch (which is
 why we didn't simply extend one of those toolkits).  Instead of making autograd
 come from the bottom up (by making individual operations differentiable) we are
 implementing it from the top down, which is much more efficient in this case
 (and will tend to have better roundoff properties).

 An example: suppose we are finding the best path of an FSA, and we need
 derivatives.  We implement this by keeping track of, for each arc in the output
 best-path, which input arc it corresponds to.  (For more complex algorithms an arc
 in the output might correspond to a sum of probabilities of a list of input arcs).
 We can make this compatible with PyTorch/TensorFlow autograd at the Python level,
 for example by defining a Function class in PyTorch that remembers this relationship
 between the arcs and does the appropriate (sparse) operations to propagate the
 derivatives back to the weights.
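
The best-path case can be sketched without any framework. In the plain-Python example below, `arc_map`, `best_path_forward`, and `best_path_backward` are hypothetical names, and the best path itself is assumed to have been found already. In PyTorch, the two functions would correspond to the `forward` and `backward` methods of a `torch.autograd.Function` subclass.

```python
def best_path_forward(in_scores, arc_map):
    """Scores of the output (best-path) arcs.

    arc_map[k] is the index of the input arc that output arc k was copied from.
    """
    return [in_scores[i] for i in arc_map]


def best_path_backward(grad_out, arc_map, num_in_arcs):
    """Propagate gradients from output arcs back to input arcs (sparse scatter-add)."""
    grad_in = [0.0] * num_in_arcs
    for k, i in enumerate(arc_map):
        grad_in[i] += grad_out[k]
    return grad_in


in_scores = [-0.5, -1.0, -0.25, -2.0]  # scores of the four input arcs
arc_map = [0, 2]                       # the best path uses input arcs 0 and 2

out_scores = best_path_forward(in_scores, arc_map)
total = sum(out_scores)                # total score of the best path

# d(total)/d(out_scores[k]) is 1 for every output arc.
grad_in = best_path_backward([1.0, 1.0], arc_map, len(in_scores))
```

Only the input arcs on the best path receive a nonzero gradient, and no intermediate operation had to be differentiated on its own. This is the sense in which the derivative comes "from the top down".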

## Current state of the code

 We have wrapped all the C++ code for Python with [pybind11](https://github.com/pybind/pybind11)
 and have finished the integration with [PyTorch](https://github.com/pytorch/pytorch).

 We are currently writing speech recognition recipes using k2, which are hosted in a
 separate repository. Please see <https://github.com/k2-fsa/icefall>.

## Plans after initial release

 We are currently trying to make k2 ready for production use (see the branch
 [v2.0-pre](https://github.com/k2-fsa/k2/tree/v2.0-pre)).

## Quick start

Want to try it out without installing anything? We have set up a [Google Colab][1].
You can find more Colab notebooks using k2 in speech recognition at
<https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html>.

[1]: https://colab.research.google.com/drive/1qbHUhNZUX7AYEpqnZyf29Lrz2IPHBGlX?usp=sharing

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/k2-fsa/k2",
    "name": "k2",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "k2,FSA,FST",
    "author": "Daniel Povey",
    "author_email": "dpovey@gmail.com",
    "download_url": "",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "FSA/FST algorithms, intended to (eventually) be interoperable with PyTorch and similar",
    "version": "1.24.0",
    "split_keywords": [
        "k2",
        "fsa",
        "fst"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "def8b20118a6ef792c6c137206df751a583730552052265c318e0196fc9ef22b",
                "md5": "3f7e7b74cd09b21e8de636a10f9c2cad",
                "sha256": "ca1e04170b7b25214622957df8295023937038467951cd4bc5787b84b4e043cd"
            },
            "downloads": -1,
            "filename": "k2-1.24.0-cp310-cp310-macosx_10_15_x86_64.whl",
            "has_sig": false,
            "md5_digest": "3f7e7b74cd09b21e8de636a10f9c2cad",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.6",
            "size": 2381428,
            "upload_time": "2023-04-27T04:27:16",
            "upload_time_iso_8601": "2023-04-27T04:27:16.396533Z",
            "url": "https://files.pythonhosted.org/packages/de/f8/b20118a6ef792c6c137206df751a583730552052265c318e0196fc9ef22b/k2-1.24.0-cp310-cp310-macosx_10_15_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "483f68edbc4df2b694a459a0f3d0f8531c293ce374c79b2560808a7cd09f7f04",
                "md5": "86652e69089898451a380b6e07f3c15e",
                "sha256": "009591072ed05ef91772a07ddf4310db6cbce08536d869b3f082743a49212f10"
            },
            "downloads": -1,
            "filename": "k2-1.24.0-cp37-cp37m-macosx_10_15_x86_64.whl",
            "has_sig": false,
            "md5_digest": "86652e69089898451a380b6e07f3c15e",
            "packagetype": "bdist_wheel",
            "python_version": "cp37",
            "requires_python": ">=3.6",
            "size": 2370697,
            "upload_time": "2023-04-27T04:27:26",
            "upload_time_iso_8601": "2023-04-27T04:27:26.089822Z",
            "url": "https://files.pythonhosted.org/packages/48/3f/68edbc4df2b694a459a0f3d0f8531c293ce374c79b2560808a7cd09f7f04/k2-1.24.0-cp37-cp37m-macosx_10_15_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a71fb30f0f0668c8d9d08097b75d4fba949a8375a2d82e58c3cb59966c63d2e8",
                "md5": "19667bd395d9dcc6741ab42c2eeb1767",
                "sha256": "f03bc05aa10a9fa9956f2253049b361e4151d5b097f7f333a1ea17af799e05ed"
            },
            "downloads": -1,
            "filename": "k2-1.24.0-cp38-cp38-macosx_10_15_x86_64.whl",
            "has_sig": false,
            "md5_digest": "19667bd395d9dcc6741ab42c2eeb1767",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.6",
            "size": 2381810,
            "upload_time": "2023-04-27T04:28:27",
            "upload_time_iso_8601": "2023-04-27T04:28:27.039459Z",
            "url": "https://files.pythonhosted.org/packages/a7/1f/b30f0f0668c8d9d08097b75d4fba949a8375a2d82e58c3cb59966c63d2e8/k2-1.24.0-cp38-cp38-macosx_10_15_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dacafe57b0f231bc9fb9fcfc5b2b75b6107e58ee7aa2e4fc41a98dd05b51740a",
                "md5": "08402c111a91ee5ca7e6d9451f224875",
                "sha256": "a2d45043d415ad4475c3a0e66c10281b668c9e710a74dfc36f5c9107d25146db"
            },
            "downloads": -1,
            "filename": "k2-1.24.0-cp39-cp39-macosx_10_15_x86_64.whl",
            "has_sig": false,
            "md5_digest": "08402c111a91ee5ca7e6d9451f224875",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.6",
            "size": 2381952,
            "upload_time": "2023-04-27T04:28:47",
            "upload_time_iso_8601": "2023-04-27T04:28:47.275603Z",
            "url": "https://files.pythonhosted.org/packages/da/ca/fe57b0f231bc9fb9fcfc5b2b75b6107e58ee7aa2e4fc41a98dd05b51740a/k2-1.24.0-cp39-cp39-macosx_10_15_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-27 04:27:16",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "k2-fsa",
    "github_project": "k2",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "graphviz",
            "specs": []
        },
        {
            "name": "torch",
            "specs": [
                [
                    ">=",
                    "1.6.0"
                ]
            ]
        },
        {
            "name": "typing_extensions",
            "specs": []
        }
    ],
    "lcname": "k2"
}
        