neural-solution
===============

- Name: neural-solution
- Version: 2.4
- Home page: https://github.com/intel/neural-compressor
- Summary: Repository of Intel® Neural Compressor
- Upload time: 2023-12-15 09:52:31
- Author: Intel AIA Team
- Requires Python: >=3.7.0
- License: Apache 2.0
- Keywords: quantization, auto-tuning, post-training static quantization, post-training dynamic quantization, quantization-aware training
- Requirements: none recorded
- Travis-CI: none
- Coveralls test coverage: none
            <div align="center">

Intel® Neural Compressor
===========================
<h3> An open-source Python library supporting popular model compression techniques on all mainstream deep learning frameworks (TensorFlow, PyTorch, ONNX Runtime, and MXNet)</h3>

[![python](https://img.shields.io/badge/python-3.8%2B-blue)](https://github.com/intel/neural-compressor)
[![version](https://img.shields.io/badge/release-2.4-green)](https://github.com/intel/neural-compressor/releases)
[![license](https://img.shields.io/badge/license-Apache%202-blue)](https://github.com/intel/neural-compressor/blob/master/LICENSE)
[![coverage](https://img.shields.io/badge/coverage-85%25-green)](https://github.com/intel/neural-compressor)
[![Downloads](https://static.pepy.tech/personalized-badge/neural-compressor?period=total&units=international_system&left_color=grey&right_color=green&left_text=downloads)](https://pepy.tech/project/neural-compressor)

[Architecture](./docs/source/design.md#architecture)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Workflow](./docs/source/design.md#workflow)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Results](./docs/source/validated_model_list.md)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Examples](./examples/README.md)&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;[Documentation](https://intel.github.io/neural-compressor)

---
<div align="left">

Intel® Neural Compressor aims to provide popular model compression techniques such as quantization, pruning (sparsity), distillation, and neural architecture search on mainstream frameworks such as [TensorFlow](https://www.tensorflow.org/), [PyTorch](https://pytorch.org/), [ONNX Runtime](https://onnxruntime.ai/), and [MXNet](https://mxnet.apache.org/),
as well as Intel extensions such as [Intel Extension for TensorFlow](https://github.com/intel/intel-extension-for-tensorflow) and [Intel Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch).
In particular, the tool provides the following key features, typical examples, and open collaborations:

* Supports a wide range of Intel hardware such as [Intel Xeon Scalable Processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon/scalable.html), [Intel Xeon CPU Max Series](https://www.intel.com/content/www/us/en/products/details/processors/xeon/max-series.html), [Intel Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/flex-series.html), and [Intel Data Center GPU Max Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/max-series.html) with extensive testing; supports AMD CPU, ARM CPU, and NVIDIA GPU through ONNX Runtime with limited testing

* Validates popular LLMs such as [Llama 2](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [Falcon](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [GPT-J](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [Bloom](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [OPT](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), as well as more than 10,000 other models such as [Stable Diffusion](/examples/pytorch/nlp/huggingface_models/text-to-image/quantization), [BERT-Large](/examples/pytorch/nlp/huggingface_models/text-classification/quantization/ptq_static/fx), and [ResNet50](/examples/pytorch/image_recognition/torchvision_models/quantization/ptq/cpu/fx) from popular model hubs such as [Hugging Face](https://huggingface.co/), [Torch Vision](https://pytorch.org/vision/stable/index.html), and [ONNX Model Zoo](https://github.com/onnx/models#models), by leveraging the zero-code optimization solution [Neural Coder](/neural_coder#what-do-we-offer) and automatic [accuracy-driven](/docs/source/design.md#workflow) quantization strategies

* Collaborates with cloud marketplaces such as [Google Cloud Platform](https://console.cloud.google.com/marketplace/product/bitnami-launchpad/inc-tensorflow-intel?project=verdant-sensor-286207), [Amazon Web Services](https://aws.amazon.com/marketplace/pp/prodview-yjyh2xmggbmga#pdp-support), and [Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/bitnami.inc-tensorflow-intel); software platforms such as [Alibaba Cloud](https://www.intel.com/content/www/us/en/developer/articles/technical/quantize-ai-by-oneapi-analytics-on-alibaba-cloud.html), [Tencent TACO](https://new.qq.com/rain/a/20221202A00B9S00), and [Microsoft Olive](https://github.com/microsoft/Olive); and the open AI ecosystem, including [Hugging Face](https://huggingface.co/blog/intel), [PyTorch](https://pytorch.org/tutorials/recipes/intel_neural_compressor_for_pytorch.html), [ONNX](https://github.com/onnx/models#models), [ONNX Runtime](https://github.com/microsoft/onnxruntime), and [Lightning AI](https://github.com/Lightning-AI/lightning/blob/master/docs/source-pytorch/advanced/post_training_quantization.rst)

## Installation

### Install from PyPI
```shell
pip install neural-compressor
```
> **Note**: 
> More installation methods can be found in the [Installation Guide](https://github.com/intel/neural-compressor/blob/master/docs/source/installation_guide.md). Please check out our [FAQ](https://github.com/intel/neural-compressor/blob/master/docs/source/faq.md) for more details.
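
As a quick sanity check after installation, the package can be imported and its version printed (a minimal sketch; the version string should match this release):

```python
# Verify the installation: import the package and print its version.
import neural_compressor

print(neural_compressor.__version__)  # expected: "2.4" for this release
```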

## Getting Started
### Quantization with Python API

```shell
# Install Intel Neural Compressor and TensorFlow
pip install neural-compressor
pip install tensorflow
# Prepare fp32 model
wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_6/mobilenet_v1_1.0_224_frozen.pb
```
```python
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.data import DataLoader, Datasets
from neural_compressor.quantization import fit

# Build a dummy calibration dataset matching the model's NHWC input shape.
dataset = Datasets("tensorflow")["dummy"](shape=(1, 224, 224, 3))
dataloader = DataLoader(framework="tensorflow", dataset=dataset)

# Run post-training static quantization on the FP32 frozen graph.
q_model = fit(
    model="./mobilenet_v1_1.0_224_frozen.pb",
    conf=PostTrainingQuantConfig(),
    calib_dataloader=dataloader,
)
```
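
Beyond the default flow above, `fit` also accepts an evaluation function for accuracy-driven tuning, and the returned model can be saved to disk. A minimal sketch, reusing the setup above; the body of `eval_func` is an illustrative stand-in, not a library API:

```python
def eval_func(model):
    # Illustrative stand-in: a real eval_func runs `model` on a labeled
    # validation set and returns its accuracy as a float for the tuner
    # to compare against the FP32 baseline.
    return 1.0


# The tuner quantizes, evaluates each candidate with eval_func, and stops
# once the accuracy criterion in PostTrainingQuantConfig is met.
q_model = fit(
    model="./mobilenet_v1_1.0_224_frozen.pb",
    conf=PostTrainingQuantConfig(),
    calib_dataloader=dataloader,
    eval_func=eval_func,
)
q_model.save("./quantized_model")  # persist the tuned model
```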

## Documentation

<table class="docutils">
  <thead>
  <tr>
    <th colspan="8">Overview</th>
  </tr>
  </thead>
  <tbody>
    <tr>
      <td colspan="2" align="center"><a href="./docs/source/design.md#architecture">Architecture</a></td>
      <td colspan="2" align="center"><a href="./docs/source/design.md#workflow">Workflow</a></td>
      <td colspan="2" align="center"><a href="examples/README.md">Examples</a></td>
      <td colspan="2" align="center"><a href="https://intel.github.io/neural-compressor/latest/docs/source/api-doc/apis.html">APIs</a></td>
    </tr>
  </tbody>
  <thead>
    <tr>
      <th colspan="8">Python-based APIs</th>
    </tr>
  </thead>
  <tbody>
    <tr>
        <td colspan="2" align="center"><a href="./docs/source/quantization.md">Quantization</a></td>
        <td colspan="2" align="center"><a href="./docs/source/mixed_precision.md">Advanced Mixed Precision</a></td>
        <td colspan="2" align="center"><a href="./docs/source/pruning.md">Pruning (Sparsity)</a></td>
        <td colspan="2" align="center"><a href="./docs/source/distillation.md">Distillation</a></td>
    </tr>
    <tr>
        <td colspan="2" align="center"><a href="./docs/source/orchestration.md">Orchestration</a></td>
        <td colspan="2" align="center"><a href="./docs/source/benchmark.md">Benchmarking</a></td>
        <td colspan="2" align="center"><a href="./docs/source/distributed.md">Distributed Compression</a></td>
        <td colspan="2" align="center"><a href="./docs/source/export.md">Model Export</a></td>
    </tr>
  </tbody>
  <thead>
    <tr>
      <th colspan="8">Neural Coder (Zero-code Optimization)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
        <td colspan="2" align="center"><a href="./neural_coder/docs/PythonLauncher.md">Launcher</a></td>
        <td colspan="2" align="center"><a href="./neural_coder/extensions/neural_compressor_ext_lab/README.md">JupyterLab Extension</a></td>
        <td colspan="2" align="center"><a href="./neural_coder/extensions/neural_compressor_ext_vscode/README.md">Visual Studio Code Extension</a></td>
        <td colspan="2" align="center"><a href="./neural_coder/docs/SupportMatrix.md">Supported Matrix</a></td>
    </tr>
  </tbody>
  <thead>
      <tr>
        <th colspan="8">Advanced Topics</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td colspan="2" align="center"><a href="./docs/source/adaptor.md">Adaptor</a></td>
          <td colspan="2" align="center"><a href="./docs/source/tuning_strategies.md">Strategy</a></td>
          <td colspan="2" align="center"><a href="./docs/source/distillation_quantization.md">Distillation for Quantization</a></td>
          <td colspan="2" align="center"><a href="./docs/source/smooth_quant.md">SmoothQuant</td>
      </tr>
      <tr>
          <td colspan="4" align="center"><a href="./docs/source/quantization_weight_only.md">Weight-Only Quantization (INT8/INT4/FP4/NF4) </td>
          <td colspan="2" align="center"><a href="https://github.com/intel/neural-compressor/blob/fp8_adaptor/docs/source/fp8.md">FP8 Quantization </td>
          <td colspan="2" align="center"><a href="./docs/source/quantization_layer_wise.md">Layer-Wise Quantization </td>
      </tr>
  </tbody>
  <thead>
      <tr>
        <th colspan="8">Innovations for Productivity</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td colspan="4" align="center"><a href="./neural_insights/README.md">Neural Insights</a></td>
          <td colspan="4" align="center"><a href="./neural_solution/README.md">Neural Solution</a></td>
      </tr>
  </tbody>
</table>
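
Among the advanced topics above, weight-only quantization is driven through the same `fit` API by selecting the corresponding approach in the config. A minimal sketch, assuming the `approach="weight_only"` setting from the weight-only quantization documentation; `fp32_model` and `calib_dataloader` are placeholders for a user-supplied model and calibration data loader:

```python
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.quantization import fit

# approach="weight_only" quantizes weights (e.g. to INT4/INT8) while keeping
# activations in floating point, a common recipe for large language models.
conf = PostTrainingQuantConfig(approach="weight_only")
q_model = fit(model=fp32_model, conf=conf, calib_dataloader=calib_dataloader)
```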

> **Note**: 
> More documentation can be found in the [User Guide](https://github.com/intel/neural-compressor/blob/master/docs/source/user_guide.md).

## Selected Publications/Events
* Blog by Intel: [Effective Weight-Only Quantization for Large Language Models with Intel® Neural Compressor](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Effective-Weight-Only-Quantization-for-Large-Language-Models/post/1529552) (Oct 2023)
* EMNLP'2023 (Under Review): [TEQ: Trainable Equivalent Transformation for Quantization of LLMs](https://openreview.net/forum?id=iaI8xEINAf&referrer=%5BAuthor%20Console%5D) (Sep 2023)
* arXiv: [Efficient Post-training Quantization with FP8 Formats](https://arxiv.org/abs/2309.14592) (Sep 2023)
* arXiv: [Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs](https://arxiv.org/abs/2309.05516) (Sep 2023)
* NeurIPS'2022: [Fast DistilBERT on CPUs](https://arxiv.org/abs/2211.07715) (Oct 2022)
* NeurIPS'2022: [QuaLA-MiniLM: a Quantized Length Adaptive MiniLM](https://arxiv.org/abs/2210.17114) (Oct 2022)

> **Note**: 
> View [Full Publication List](https://github.com/intel/neural-compressor/blob/master/docs/source/publication_list.md).

## Additional Content

* [Release Information](./docs/source/releases_info.md)
* [Contribution Guidelines](./docs/source/CONTRIBUTING.md)
* [Legal Information](./docs/source/legal_information.md)
* [Security Policy](SECURITY.md)

## Communication 
- [GitHub Issues](https://github.com/intel/neural-compressor/issues): mainly for bug reports, new feature requests, and questions.
- [Email](mailto:inc.maintainers@intel.com): feel free to raise research ideas on model compression techniques by email for collaboration.
- [Discord Channel](https://discord.com/invite/Wxk3J3ZJkU): join the Discord channel for more flexible technical discussion.
- [WeChat group](/docs/source/imgs/wechat_group.jpg): scan the QR code to join the technical discussion.

            
