# da4ml: Distributed Arithmetic for Machine Learning
This project performs Constant Matrix-Vector Multiplication (CMVM) with Distributed Arithmetic (DA) for Machine Learning (ML) on Field Programmable Gate Arrays (FPGAs).
CMVM optimization is done through greedy Common Subexpression Elimination (CSE) of two-term subexpressions, optionally with Delay Constraints (DC). The optimization runs in jitted Python (Numba), and the list of optimized operations is emitted as traced Python code.
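As a rough illustration of what two-term CSE buys (this is not da4ml's internal representation), consider a constant kernel of shape `(3, 2)` with columns `[1, 1, 0]` and `[1, 1, 2]`: the shared subexpression `x0 + x1` only needs to be computed once.

```python
# Hedged illustration of two-term CSE on a tiny constant kernel.
# Kernel columns are [1, 1, 0] and [1, 1, 2]; not da4ml's actual code.

def cmvm_naive(x0: int, x1: int, x2: int) -> tuple[int, int]:
    y0 = x0 + x1              # 1 adder
    y1 = x0 + x1 + (x2 << 1)  # 2 adders; x0 + x1 is recomputed
    return y0, y1

def cmvm_cse(x0: int, x1: int, x2: int) -> tuple[int, int]:
    t = x0 + x1               # shared two-term subexpression, computed once
    y0 = t
    y1 = t + (x2 << 1)        # 1 adder; one adder saved overall
    return y0, y1
```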
At the moment, the project only generates Vitis HLS C++ code for the FPGA implementation of the optimized CMVM kernel. HDL code generation is planned for the future. Currently, the major use of this repository is through the `distributed_arithmetic` strategy in the [`hls4ml`](https://github.com/fastmachinelearning/hls4ml/) project.
## Installation
The project is available on PyPI and can be installed with pip:
```bash
pip install da4ml
```
Note that `numba>=0.60.0` is required for the project to work, and the project does not work with `python<3.10`. If the project fails to compile, try upgrading `numba` and `llvmlite` to the latest versions.
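For example, to upgrade both packages:

```bash
pip install -U numba llvmlite
```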
## `hls4ml`
The major use of this project is through the `distributed_arithmetic` strategy in `hls4ml`:
```python
model_hls = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config={
        'Model': {
            ...
            'Strategy': 'distributed_arithmetic',
        },
        ...
    },
    ...
)
```
Currently, `Dense`, `Conv1D`, and `Conv2D` layers are supported for both the `io_parallel` and `io_stream` dataflows. However, note that distributed arithmetic implies `reuse_factor=1`, as the whole kernel is implemented in combinational logic.
### Notice
Currently, only the `da4ml-v2` branch of `hls4ml` supports the `distributed_arithmetic` strategy. The `da4ml-v2` branch is not yet merged into the `main` branch of `hls4ml`, so you need to install it from the GitHub repository.
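For example, the branch can be installed directly from GitHub with pip's standard VCS-URL syntax:

```bash
pip install "git+https://github.com/fastmachinelearning/hls4ml@da4ml-v2"
```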
## Direct Usage
If you want to use it directly, you can use the `da4ml.api.fn_from_kernel` function, which creates a Python function from a 2D kernel `float[n_in, n_out]`, along with its corresponding code. The function signature is:
```python
def fn_from_kernel(
    kernel: np.ndarray,
    signs: list[bool],
    bits: list[int],
    int_bits: list[int],
    symmetrics: list[bool],
    depths: list[int] | None = None,
    n_beams: int = 1,
    dc: int | None = None,
    n_inp_max: int = -1,
    n_out_max: int = -1,
    codegen_backend: PyCodegenBackend = PyCodegenBackend(),
    signed_balanced_reduction: bool = True,
) -> tuple[Callable[[list[T]], list[T]], str]:
"""Compile a CMVM operation, with the constant kernel, into a function with only accumulation/subtraction/shift operations.
Parameters
----------
kernel : np.ndarray
The kernel to compile. Must be of shape (n_inp, n_out).
signs : list[bool]
If the input is signed. Must be of length n_inp.
bits : list[int]
The bitwidth of the inputs. Must be of length n_inp.
int_bits : list[int]
The number of integer bits in the inputs (incl. sign bit!). Must be of length n_inp.
symmetrics : list[bool]
If the input is symmetricly quantized. Must be of length n_inp.
depths : list[int]|None, optional
The depth associated with each input. Must be of length n_inp. Defaults to [0]*n_inp.
n_beams : int, optional
Number of beams to use in beam search. Defaults to 1. (Currently disabled!)
dc : int | None, optional
Delay constraint. Not implemented yet. Defaults to None.
n_inp_max : int, optional
Number of inputs to process in one block. Defaults to -1 (no limit). Decrease to improve performance, but result will be less optimal.
n_out_max : int, optional
Number of outputs to process in one block. Defaults to -1 (no limit). Decrease to improve performance, but result will be less optimal.
codegen_backend : PyCodegenBackend, optional
The codegen backend to be used. Defaults to PyCodegenBackend().
signed_balanced_reduction : bool, optional
If the reduction tree should isolate the plus and minus terms. Set to False to improve latency. Defaults to True.
Returns
-------
tuple[Callable[[list[T]], list[T]], str]
fn : Callable[[list[T]], list[T]]
The compiled python function. It takes a list of inputs and returns a list of outputs with only accumulation/subtraction/powers of 2 operations.
fn_str : str
The code of the compiled function, depending on the codegen_backend used.
"""
```
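A minimal usage sketch based on the signature above (the argument values are illustrative only, and the printed output depends on the codegen backend):

```python
# Hedged usage sketch of da4ml.api.fn_from_kernel; values are illustrative.
import numpy as np
from da4ml.api import fn_from_kernel

# Constant kernel of shape (n_inp, n_out) = (3, 2)
kernel = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 2.0]])
n_inp = kernel.shape[0]

fn, fn_str = fn_from_kernel(
    kernel,
    signs=[True] * n_inp,        # signed inputs
    bits=[8] * n_inp,            # 8-bit inputs
    int_bits=[4] * n_inp,        # 4 integer bits (incl. sign bit)
    symmetrics=[False] * n_inp,  # not symmetrically quantized
)

print(fn([1, 2, 3]))  # evaluate the compiled shift-add function on one input vector
print(fn_str)         # generated code, as produced by the default PyCodegenBackend
```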