| Name | Version | Summary | Date |
|------|---------|---------|------|
| sparsify-nightly | 1.7.0.20240304 | Easy-to-use UI for automatically sparsifying neural networks and creating sparsification recipes for better inference performance and a smaller footprint | 2024-03-05 13:38:57 |
| nendo-plugin-quantize-core | 0.2.6 | Nendo plugin for audio quantization with grid detection and time-stretching | 2024-02-21 09:42:22 |
| auto-gptq | 0.7.0 | An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm | 2024-02-16 12:52:41 |
| clika-inference | 0.0.2 | A fake package to warn the user they are not installing the correct package | 2024-01-31 06:43:35 |
| clika-compression | 0.0.2 | A fake package to warn the user they are not installing the correct package | 2024-01-31 06:43:33 |
| clika-client | 0.0.2 | A fake package to warn the user they are not installing the correct package | 2024-01-31 06:43:31 |
| clika-ace | 0.0.2 | A fake package to warn the user they are not installing the correct package | 2024-01-31 06:43:30 |
| auto-around | 0.0 | Repository of AutoRound: advanced weight-only quantization algorithm for LLMs | 2024-01-30 09:40:20 |
| glai | 0.1.3 | Easy deployment of quantized Llama models on CPU | 2024-01-13 19:04:27 |
| gguf-modeldb | 0.0.3 | A Llama 2 quantized GGUF model DB with over 80 preconfigured models downloadable in one line; easily add your own models or adjust settings without manual downloads | 2024-01-13 18:57:53 |
| llama-memory | 0.0.1a1 | Easy deployment of quantized Llama models on CPU | 2024-01-09 01:59:55 |
| nendo_plugin_quantize_core | 0.1.4 | Nendo plugin for audio quantization with grid detection and time-stretching | 2023-12-22 11:12:58 |
| sparsify | 1.6.1 | Easy-to-use UI for automatically sparsifying neural networks and creating sparsification recipes for better inference performance and a smaller footprint | 2023-12-20 14:28:37 |
| optimum-deepsparse | 0.1.0.dev1 | Optimum DeepSparse is an extension of the Hugging Face Transformers library that integrates the DeepSparse inference runtime. DeepSparse offers GPU-class performance on CPUs, making it possible to run Transformers and other deep learning models on commodity hardware with sparsity. Optimum DeepSparse provides a framework for developers to easily integrate DeepSparse into their applications, regardless of the hardware platform | 2023-10-26 02:02:45 |
| RPQ-pytorch | 0.0.34 | Reverse Product Quantization (RPQ) of weights to reduce static memory usage | 2023-09-28 20:46:15 |
| nqlib | 0.5.1 | NQLib: library for designing noise-shaping quantizers for discrete-valued input control | 2023-09-19 05:51:16 |
| AdapterLoRa | 2.0.0 | A tool for adapting and quantizing larger Transformer-based models, built on top of the LoRa and LoRa-Torch libraries | 2023-08-26 15:15:26 |
| optimum-graphcore | 0.7.1 | Optimum is an extension of the Hugging Face Transformers library, providing a framework to integrate third-party libraries from hardware partners and interface with their specific functionality | 2023-07-31 09:34:12 |
| discrete-key-value-bottleneck-pytorch | 0.1.1 | Discrete Key / Value Bottleneck - Pytorch | 2023-07-09 23:57:56 |
| neural-compressor-full | 2.1.1 | Repository of Intel® Neural Compressor | 2023-05-11 12:19:44 |