# SingLoRA: A Minimal Implementation
This repository provides a minimal, single-file implementation of SingLoRA (Single Matrix Low-Rank Adaptation) as described in the paper ["SingLoRA: Low Rank Adaptation Using a Single Matrix"](https://arxiv.org/abs/2507.05566) by Bensaïd et al.
## Overview
SingLoRA is a parameter-efficient fine-tuning method that simplifies the LoRA architecture by using a single trainable matrix instead of two. This implementation demonstrates how to apply SingLoRA to transformer models using PyTorch and the Hugging Face Transformers library.
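To make the idea concrete, here is a minimal sketch of a SingLoRA-style linear layer for the square-weight case, written in plain PyTorch. It only illustrates the single-matrix update `W + (alpha / r) * u(t) * A @ A.T` with the ramp-up `u(t) = min(t/T, 1)`; it is not the repository's actual implementation, and the initialization and step bookkeeping are simplifying assumptions.

```python
import torch
import torch.nn as nn


class SingLoRALinearSketch(nn.Module):
    """Illustrative SingLoRA-style layer (square weights only, not the packaged code)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 8.0, ramp_up_steps: int = 1000):
        super().__init__()
        assert base.in_features == base.out_features, "this sketch covers the square case only"
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                    # freeze the pretrained weight
        self.A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)  # single trainable matrix
        self.scale = alpha / rank
        self.ramp_up_steps = ramp_up_steps
        self.register_buffer("step", torch.zeros(1))   # approximates the training-step counter t

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u_t = torch.clamp(self.step / self.ramp_up_steps, max=1.0)  # u(t) = min(t/T, 1)
        delta = self.scale * u_t * (self.A @ self.A.T)               # symmetric low-rank update A A^T
        if self.training:
            self.step += 1                                           # advance the ramp-up schedule
        return self.base(x) + nn.functional.linear(x, delta)
```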
## Features
- Simple, self-contained implementation in a single Python file
- Compatible with Hugging Face Transformers models
- Includes a working example with DistilBERT
- Demonstrates parameter reduction compared to full fine-tuning
## Installation
```bash
pip install -r requirements.txt
```
## Usage
### Basic Example
Here's a simple example of how to apply SingLoRA to a transformer model:
```python
import torch
from singlora import apply_singlora_to_model
from transformers import AutoModelForSequenceClassification
# Load your model
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
# Apply SingLoRA
apply_singlora_to_model(
model=model,
rank=8, # Low-rank dimension (r in the paper)
alpha=8.0, # Scaling factor
ramp_up_steps=1000, # Steps for ramp-up function u(t)
target_modules=["q_lin", "k_lin", "v_lin"] # Target attention layers
)
# Now only the SingLoRA parameters are trainable
optimizer = torch.optim.AdamW(
filter(lambda p: p.requires_grad, model.parameters()),
lr=1e-3
)
```
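Continuing the snippet above, a single training step could look like the following. The sentence and label are placeholders, and the loss comes from the standard Transformers `labels` argument rather than anything specific to SingLoRA.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Placeholder batch: one sentence with a dummy label.
batch = tokenizer("This movie was great!", return_tensors="pt")
labels = torch.tensor([1])

model.train()
outputs = model(**batch, labels=labels)  # Transformers computes the classification loss
outputs.loss.backward()                  # gradients flow only into the SingLoRA parameters
optimizer.step()
optimizer.zero_grad()
```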
### Configuration Parameters
- `rank`: The dimension of the low-rank adaptation (r). Lower values mean fewer parameters.
- `alpha`: Scaling factor for the adaptation. Higher values allow larger updates.
- `ramp_up_steps`: Number of steps (T) for the ramp-up function u(t) = min(t/T, 1); see the short sketch after this list.
- `target_modules`: List of layer names to apply SingLoRA to. Common targets:
- `["query", "key", "value"]` for standard transformers
- `["q_lin", "k_lin", "v_lin"]` for DistilBERT
- `["q_proj", "k_proj", "v_proj"]` for LLaMA models
### Parameter Efficiency
SingLoRA significantly reduces the number of trainable parameters compared to full fine-tuning:
```python
# Example parameter counts (original_model is an unmodified copy of the model,
# taken before SingLoRA was applied)
original_params = sum(p.numel() for p in original_model.parameters() if p.requires_grad)
singlora_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
reduction = 100 * (1 - singlora_params / original_params)
print(f"Parameter reduction: {reduction:.2f}%")
```
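If you would rather not keep a second copy of the model in memory, one workable approach (an assumed workflow, not taken from `example.py`) is to count trainable parameters before and after applying SingLoRA:

```python
from transformers import AutoModelForSequenceClassification
from singlora import apply_singlora_to_model

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
original_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

apply_singlora_to_model(
    model=model,
    rank=8,
    alpha=8.0,
    ramp_up_steps=1000,
    target_modules=["q_lin", "k_lin", "v_lin"],
)
singlora_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"Parameter reduction: {100 * (1 - singlora_params / original_params):.2f}%")
```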
For a complete working example, see `example.py` in the repository.
### LLaMA Example
Here's how to apply SingLoRA to LLaMA models:
```python
from singlora import apply_singlora_to_model
from transformers import LlamaForCausalLM, LlamaTokenizer
import torch
# Load LLaMA model and tokenizer
model_name = "meta-llama/Llama-2-7b-hf" # or your local path
model = LlamaForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.float16, # Use float16 for efficiency
device_map="auto" # Automatically handle model placement
)
tokenizer = LlamaTokenizer.from_pretrained(model_name)
# Apply SingLoRA to attention layers
apply_singlora_to_model(
model=model,
rank=16, # Can use larger rank for bigger models
alpha=16.0, # Increased alpha for stronger adaptation
ramp_up_steps=2000, # More steps for larger datasets
target_modules=[ # LLaMA-specific attention layer names
"q_proj",
"k_proj",
"v_proj"
]
)
# Example training setup
optimizer = torch.optim.AdamW(
filter(lambda p: p.requires_grad, model.parameters()),
lr=1e-4 # Lower learning rate for LLaMA
)
# Example inference
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
outputs = model.generate(
**inputs,
max_length=100,
temperature=0.7,
do_sample=True
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Key differences for LLaMA models:
- Use `LlamaForCausalLM` instead of standard transformer models
- Target the LLaMA-specific projection layers (`q_proj`, `k_proj`, `v_proj`)
- Consider using `float16` for memory efficiency
- Adjust hyperparameters (`rank`, `alpha`, learning rate) for larger models
- Use `device_map="auto"` for automatic model sharding on multiple GPUs
## Citation
If you use this implementation in your research, please cite the original paper:
```bibtex
@misc{bensaïd2025singloralowrankadaptation,
title={SingLoRA: Low Rank Adaptation Using a Single Matrix},
author={David Bensaïd and Noam Rotstein and Roy Velich and Daniel Bensaïd and Ron Kimmel},
year={2025},
eprint={2507.05566},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2507.05566},
}
```
## License
This project is licensed under the MIT License - see the LICENSE file for details.