<div align="center">
<h1> Torch Floating Point</h1>
<img src="https://raw.githubusercontent.com/SamirMoustafa/torch-floating-point/refs/heads/main/assets/torch-floating-point-logo.png"/>




</div>

A PyTorch library for custom floating point quantization with autograd support. It provides efficient implementations of arbitrary floating point formats that integrate with PyTorch's automatic differentiation.
## Features
- **Custom Floating Point Formats**: Support for arbitrary floating point configurations (sign bits, exponent bits, mantissa bits, bias)
- **Autograd Support**: Full PyTorch autograd integration for training with quantized weights
- **CUDA Support**: GPU acceleration for both forward and backward passes
- **Straight-Through Estimator**: Gradient-friendly quantization for training (see the sketch below)
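The straight-through estimator is what lets gradients pass through the non-differentiable rounding step: the forward pass rounds, while the backward pass treats rounding as the identity. As a minimal sketch of the idea (not the library's internal `Round` implementation), a plain-PyTorch version looks like this:

```python
import torch


class RoundSTE(torch.autograd.Function):
    """Illustrative straight-through estimator: round in the forward pass,
    pass the incoming gradient through unchanged in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Treat the rounding as the identity so gradients keep flowing.
        return grad_output


x = torch.randn(4, requires_grad=True)
y = RoundSTE.apply(x).sum()
y.backward()
print(x.grad)  # all ones, exactly as if no rounding had happened
```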
## Installation
### From PyPI (Recommended)
```bash
pip install torch-floating-point
```
### From Source
```bash
git clone https://github.com/SamirMoustafa/torch-floating-point.git
cd torch-floating-point
pip install -e .
```
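To verify the installation, the imports used throughout the examples below should succeed (a minimal smoke test):

```python
# Minimal smoke test: these are the imports used in the examples below.
from floating_point import FloatingPoint, Round

print("torch-floating-point imported successfully")
```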
## Quick Start
```python
import torch
from floating_point import FloatingPoint, Round
# Define a custom 8-bit floating point format (1 sign, 4 exponent, 3 mantissa bits)
fp8 = FloatingPoint(sign_bits=1, exponent_bits=4, mantissa_bits=3, bias=7, bits=8)
# Create a rounding function
rounder = Round(fp8)
# Create a tensor with gradients
x = torch.randn(10, requires_grad=True)
# Quantize the tensor
quantized = rounder(x)
# Use in training (gradients flow through)
loss = quantized.sum()
loss.backward()
print(f"Original: {x}")
print(f"Quantized: {quantized}")
print(f"Gradients: {x.grad}")
```
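Because an 8-bit format can only represent a limited set of values, it is often useful to measure how much error the rounding introduces. A short sketch using only the calls from the example above:

```python
import torch
from floating_point import FloatingPoint, Round

fp8 = FloatingPoint(sign_bits=1, exponent_bits=4, mantissa_bits=3, bias=7, bits=8)
rounder = Round(fp8)

x = torch.randn(1000)
quantized = rounder(x)

# The element-wise error introduced by rounding to the 8-bit format.
error = (x - quantized).abs()
print(f"Max absolute error:  {error.max().item():.6f}")
print(f"Mean absolute error: {error.mean().item():.6f}")
```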
## Training with Custom Floating Point Weights
```python
import torch
import torch.nn as nn
from floating_point import FloatingPoint, Round
class FloatPointLinear(nn.Module):
    def __init__(self, in_features, out_features, fp_config):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.randn(out_features))
        self.rounder = Round(fp_config)

    def forward(self, x):
        quantized_weight = self.rounder(self.weight)
        return torch.nn.functional.linear(x, quantized_weight, self.bias)
# Define custom floating point format
fp8 = FloatingPoint(sign_bits=1, exponent_bits=4, mantissa_bits=3, bias=7, bits=8)
# Create model with quantized weights
model = FloatPointLinear(10, 5, fp8)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
criterion = nn.MSELoss()
# Create simple data
x = torch.randn(32, 10)
y = torch.randn(32, 5)
# Training loop
for epoch in range(5):
    optimizer.zero_grad()

    # Forward pass
    output = model(x)
    loss = criterion(output, y)

    # Backward pass
    loss.backward()
    optimizer.step()

    print(f"Epoch {epoch + 1}: Loss = {loss.item():.6f}")
```
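One way to sanity-check the setup above is to inspect the weights after training: with an 8-bit format the quantized weights can take at most 256 distinct values. A short follow-up to the training snippet (it reuses the `model` defined there):

```python
# Compare the stored full-precision weights with the quantized weights
# that the forward pass actually uses.
with torch.no_grad():
    quantized_weight = model.rounder(model.weight)
    print(f"Distinct quantized values: {quantized_weight.unique().numel()}")
    print(f"Max rounding error: {(model.weight - quantized_weight).abs().max().item():.6f}")
```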
## Contributing
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Install development dependencies (`make setup-dev`)
4. Make your changes
5. Run tests (`make test`)
6. Run linting (`make lint`)
7. Commit your changes (`git commit -m 'Add amazing feature'`)
8. Push to the branch (`git push origin feature/amazing-feature`)
9. Open a Pull Request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Citation
If you use this library in your research, please cite:
```bibtex
@software{moustafa2025torchfloatingpoint,
  title={Torch Floating Point: A PyTorch library for custom floating point quantization},
  author={Samir Moustafa},
  year={2025},
  url={https://github.com/SamirMoustafa/torch-floating-point}
}
```
## Support
- **Issues**: [GitHub Issues](https://github.com/SamirMoustafa/torch-floating-point/issues)
- **Discussions**: [GitHub Discussions](https://github.com/SamirMoustafa/torch-floating-point/discussions)
- **Email**: samir.moustafa.97@gmail.com