# BiDoRA: Bi-Level Optimization for Parameter-Efficient Fine-Tuning
**BiDoRA** is a Python package implementing true BiDoRA (Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation) for efficient fine-tuning of Large Language Models. Specifically optimized for:
- 3D Code Generation (Rust, Blender, CAD)
- Spatial Intelligence Tasks
- Small Datasets (<10k samples)
- Automatic Hardware Adaptation (Laptop to A100)
## 🔬 What is BiDoRA?
BiDoRA uses **bi-level optimization** to separately optimize magnitude and direction components of weight updates:
```
W' = m ⊙ (W₀ + BA) / ||W₀ + BA||
     ↑        ↑
 magnitude  direction
  (upper)    (lower)
```
**Training Process:**
1. **Lower Level**: Optimize direction (A, B matrices) on training set
2. **Upper Level**: Optimize magnitude (m) on validation set via hypergradients
3. **Final Phase**: Direction fine-tuning on combined data with fixed magnitude
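
The sketch below illustrates this alternating scheme on a toy linear layer. It is a minimal, first-order stand-in for the actual trainer: it does not compute hypergradients, and none of the names are part of the bidora API.

```python
# Toy sketch of the BiDoRA-style alternating optimization (illustrative only).
import torch

W0 = torch.randn(64, 64)                                # frozen pretrained weight (out, in)
A = (0.01 * torch.randn(8, 64)).requires_grad_(True)    # direction, lower level
B = torch.zeros(64, 8, requires_grad=True)              # direction, lower level
m = torch.ones(1, 64, requires_grad=True)               # magnitude, upper level

def reparam() -> torch.Tensor:
    """W' = m ⊙ (W0 + BA) / ||W0 + BA|| with a column-wise L2 norm."""
    delta = W0 + B @ A
    return m * delta / delta.norm(dim=0, keepdim=True)

def loss_on(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    return (x @ reparam().T - y).pow(2).mean()

opt_dir = torch.optim.AdamW([A, B], lr=2e-4)            # lower-level optimizer
opt_mag = torch.optim.AdamW([m], lr=2e-4 * 2.0)         # upper level: base LR x multiplier

for step in range(100):
    # Lower level: direction step on a training batch (m is not stepped here)
    x_tr, y_tr = torch.randn(16, 64), torch.randn(16, 64)
    opt_dir.zero_grad(); loss_on(x_tr, y_tr).backward(); opt_dir.step()

    # Upper level: magnitude step on a validation batch
    # (plain gradient used here as a stand-in for the hypergradient)
    x_va, y_va = torch.randn(16, 64), torch.randn(16, 64)
    opt_mag.zero_grad(); loss_on(x_va, y_va).backward(); opt_mag.step()

# Final phase: freeze the magnitude and keep tuning A, B on train + validation data.
m.requires_grad_(False)
```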
**Benefits:**
- ✅ Reduces overfitting on small datasets (<10k samples)
- ✅ Better alignment with full fine-tuning (correlation: -8.042 vs -1.784 for DoRA)
- ✅ Statistically significant improvements on GLUE (p < 0.001)
**Important Notes:**
- ⚠️ **Training Time**: 3-4x slower than standard LoRA due to bi-level optimization
- ⚠️ **No Quantization**: BiDoRA requires full precision (bfloat16); quantization is disabled automatically
- ⚠️ **Memory**: Uses an 8-bit AdamW optimizer (~75% optimizer-state memory reduction) to compensate
- ✅ **Best For**: Small, specialized datasets where quality matters more than speed
## 🚀 Features
- ✅ **BiDoRA Bi-Level Optimization**: True magnitude-direction decomposition
- ✅ **Auto Hardware Detection**: Automatically adapts config to available hardware
- ✅ **Full Precision Training**: Optimized for bfloat16 (no quantization needed for BiDoRA)
- ✅ **Flexible Data Formats**: JSONL, HuggingFace Datasets
- ✅ **Type-Safe Config**: Pydantic-validated configuration
- ✅ **CLI Interface**: Simple command-line interface with Typer
## 📦 Installation
### From PyPI (recommended)
```bash
pip install bidora
```
### As a project dependency
```bash
# With uv (recommended)
uv add bidora
# With pip
pip install bidora
```
### From source (for development)
```bash
git clone https://github.com/bjoernbethge/bidora.git
cd bidora
uv sync --dev
```
## 🎯 Quick Start
### 1. Show hardware info
```bash
bidora info
```
Shows available hardware and recommended configuration.
### 2. Show recommended models
```bash
bidora list-models
```
### 3. Start BiDoRA training
**Important:** BiDoRA requires **separate train and validation files** for bi-level optimization.
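
If you only have a single JSONL file, a simple shuffled split like the sketch below produces the two required files. The file names are placeholders and this helper is not part of the bidora CLI.

```python
# Illustrative 90/10 split of one JSONL file into train/validation files.
import random
from pathlib import Path

lines = [ln for ln in Path("data/all.jsonl").read_text().splitlines() if ln.strip()]
random.seed(0)                      # reproducible split
random.shuffle(lines)
cut = int(0.9 * len(lines))
Path("data/train.jsonl").write_text("\n".join(lines[:cut]) + "\n")
Path("data/val.jsonl").write_text("\n".join(lines[cut:]) + "\n")
```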
#### Basic training
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-4B \
    --output ./output \
    --rank 8 \
    --epochs 3
```
#### With custom learning rates
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-4B \
    --lr 2e-4 \
    --upper-lr-mult 2.0 \
    --rank 8
```
#### With HuggingFace dataset
```bash
bidora train \
    --dataset "code_search_net" \
    --model Qwen/Qwen3-8B \
    --output ./output \
    --rank 8
```
## 📊 Data Format
### JSONL Format (Instruction-Tuning)
```json
{"instruction": "Generate a Rust function to create a 3D cube mesh", "output": "fn create_cube() -> Mesh { ... }"}
{"instruction": "Write Blender Python code to add a sphere", "input": "radius: 2.0", "output": "import bpy\nbpy.ops.mesh.primitive_uv_sphere_add(radius=2.0)"}
```
### JSONL Format (Code Completion)
```json
{"prompt": "// Generate 3D mesh\nfn create_mesh()", "completion": " -> Mesh {\n let vertices = vec![...];\n Mesh::new(vertices)\n}"}
```
### JSONL Format (Code-Only)
```json
{"code": "use bevy::prelude::*;\n\nfn setup_3d_scene(mut commands: Commands) { ... }"}
```
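
A quick sanity check of a JSONL file against the key sets above can save a failed run. This is a stand-alone illustrative script, not part of the bidora package.

```python
# Sanity-check a JSONL file against the supported record formats shown above.
import json
from pathlib import Path

SUPPORTED_KEY_SETS = [
    {"instruction", "output"},      # instruction-tuning ("input" is optional)
    {"prompt", "completion"},       # code completion
    {"code"},                       # code-only
]

def check_jsonl(path: str) -> int:
    count = 0
    for line_no, line in enumerate(Path(path).read_text().splitlines(), start=1):
        if not line.strip():
            continue
        record = json.loads(line)
        keys = set(record) - {"input"}   # "input" is an optional extra field
        if not any(required <= keys for required in SUPPORTED_KEY_SETS):
            raise ValueError(f"line {line_no}: unsupported keys {sorted(record)}")
        count += 1
    return count

print(check_jsonl("data/train.jsonl"), "records OK")
```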
## ⚙️ Hardware-Specific Setups
### Laptop (8GB GPU)
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-4B \
    --rank 4 \
    --batch-size 1 \
    --auto-hardware  # Automatic adaptation
```
**Config automatically adjusted:**
- Precision: bfloat16 (BiDoRA requirement)
- Batch Size: 1
- Gradient Accumulation: 16
- Max Seq Length: 2048
### Desktop (16GB GPU)
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-8B \
    --rank 16 \
    --batch-size 2 \
    --auto-hardware
```
**Auto-Config:**
- Precision: bfloat16 (full precision - BiDoRA requirement)
- Batch Size: 1
- Gradient Accumulation: 16
- Max Seq Length: 2048
### A100 (40GB)
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-32B \
    --rank 16 \
    --batch-size 8 \
    --auto-hardware
```
**Auto-Config:**
- Precision: bfloat16 (full precision - BiDoRA requirement)
- Batch Size: 4
- Gradient Accumulation: 4
- Max Seq Length: 4096
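
In all three presets the effective batch size works out to 16 (per-device batch size × gradient accumulation steps: 1 × 16 for the laptop and desktop profiles, 4 × 4 for the A100 profile).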
## 🎛️ Advanced Options
### All CLI Parameters
```bash
bidora train --help
```
**Most Important Parameters:**
| Parameter | Description | Default |
|-----------|-------------|---------|
| `--model, -m` | Model name or path | `Qwen/Qwen3-4B` |
| `--train-file, -t` | Training JSONL | Required |
| `--val-file, -v` | Validation JSONL | **Required for BiDoRA** |
| `--dataset, -d` | HuggingFace Dataset | - |
| `--output, -o` | Output directory | `./output` |
| `--rank, -r` | LoRA Rank | `8` |
| `--epochs, -e` | Training Epochs | `3` |
| `--batch-size, -b` | Batch Size | `4` |
| `--lr` | Learning Rate (lower level) | `2e-4` |
| `--upper-lr-mult` | Upper level LR multiplier | `2.0` |
| `--max-samples` | Max Training Samples | All |
| `--auto-hardware` | Auto-adjustment | `True` |
### Manual Config (without Auto-Hardware)
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-8B \
    --rank 16 \
    --batch-size 8 \
    --lr 3e-4 \
    --epochs 5 \
    --no-auto-hardware  # Manual config
```
## 💾 Memory Requirements
### Qwen3 Model Sizes (BiDoRA - Full Precision)
⚠️ **Note**: BiDoRA requires full precision (bfloat16); no quantization. Memory requirements are higher than for standard LoRA.
| Model | Parameters | VRAM (bf16) | Training VRAM | Recommended For |
|-------|-----------|-------------|---------------|-----------------|
| Qwen3-0.6B | 0.6B | ~2GB | ~6GB | Laptop GPU (6-8GB) |
| Qwen3-1.7B | 1.7B | ~4GB | ~10GB | **Laptop GPU (8GB+)** |
| Qwen3-4B | 4B | ~8GB | ~16GB | Desktop GPU (12-16GB) |
| Qwen3-8B | 8B | ~16GB | ~24GB | Desktop GPU (24GB+) / A100 |
| Qwen3-14B | 14B | ~28GB | ~40GB | A100 (40GB) |
| Qwen3-32B | 32B | ~64GB | ~80GB | A100 (80GB) |
💡 **Memory Optimization**: An 8-bit AdamW optimizer (~75% optimizer-state memory reduction) compensates for the full-precision requirement.
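
For reference, this is roughly how an 8-bit AdamW optimizer is constructed with bitsandbytes. bidora handles this wiring internally; the snippet is an assumption-level sketch, not its actual code.

```python
# Minimal 8-bit AdamW sketch with bitsandbytes (nn.Linear stands in for the model).
import torch.nn as nn
import bitsandbytes as bnb

model = nn.Linear(512, 512)          # placeholder for the adapted model
params = [p for p in model.parameters() if p.requires_grad]
optimizer = bnb.optim.AdamW8bit(params, lr=2e-4, weight_decay=0.01)
# Optimizer state (exp_avg, exp_avg_sq) is stored in 8-bit blocks rather than
# fp32, which is where most of the quoted memory saving comes from.
```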
### Trainable Parameters (LoRA Rank=8)
| Base Model | LoRA Params | Reduction |
|------------|-------------|-----------|
| 7B | ~2M | **3500×** |
| 14B | ~4M | **3500×** |
| 32B | ~8M | **4000×** |
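
These counts depend on the rank and on which modules receive adapters. As a back-of-the-envelope rule, each adapted weight of shape (d_out, d_in) contributes rank × (d_in + d_out) trainable parameters; the module list in this sketch is an illustrative assumption, not bidora's actual target set.

```python
def lora_param_count(shapes: list[tuple[int, int]], rank: int) -> int:
    """Sum of rank * (d_out + d_in) over the adapted weight shapes."""
    return sum(rank * (d_out + d_in) for d_out, d_in in shapes)

# e.g. one 4096x4096 projection per layer across 32 layers at rank 8:
print(lora_param_count([(4096, 4096)] * 32, rank=8))  # 2_097_152, i.e. ~2M
```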
## 🧪 Example Workflow: 3D Rust Code Fine-Tuning
### 1. Prepare data
```bash
# data/rust_3d_train.jsonl
{"instruction": "Create a three-rs mesh for a cube", "output": "use three::*;\n\nfn create_cube(size: f32) -> Mesh {\n let geometry = Geometry::cuboid(size, size, size);\n Mesh::new(geometry, Material::default())\n}"}
{"instruction": "Generate Bevy 3D scene setup", "output": "use bevy::prelude::*;\n\nfn setup(mut commands: Commands) {\n commands.spawn(Camera3dBundle::default());\n commands.spawn(PbrBundle {\n mesh: meshes.add(Mesh::from(shape::Cube { size: 1.0 })),\n ..default()\n });\n}"}
```
### 2. Start training
```bash
bidora train \
    --train-file data/rust_3d_train.jsonl \
    --val-file data/rust_3d_val.jsonl \
    --model Qwen/Qwen3-4B \
    --output ./rust_3d_model \
    --rank 8 \
    --epochs 3 \
    --batch-size 2
```
### 3. Use model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load base model with BiDoRA adapters
model = AutoModelForCausalLM.from_pretrained(
    "./rust_3d_model/final_model",
    device_map="auto",
    torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B")
# Generate
prompt = "### Instruction:\nCreate a three-rs function to render a sphere\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## 🔧 Programmatic Usage
```python
from bidora import (
    FullConfig, ModelConfig, BiDoRAConfig, TrainingConfig, DataConfig,
    load_model_and_tokenizer, prepare_bidora_model,
    load_and_prepare_dataset, prepare_dataset_for_training,
    train_bidora
)
from pathlib import Path
# Create config
config = FullConfig(
    model=ModelConfig(
        model_name="Qwen/Qwen3-4B",
        quantization="none"  # BiDoRA requires full precision (bfloat16)
    ),
    bidora=BiDoRAConfig(
        rank=8,
        use_bidora=True,  # Enable BiDoRA bi-level optimization
        upper_lr_multiplier=2.0
    ),
    training=TrainingConfig(
        batch_size=2,
        learning_rate=2e-4,
        num_epochs=3
    ),
    data=DataConfig(
        train_file=Path("data/train.jsonl"),
        val_file=Path("data/val.jsonl")  # Required for BiDoRA
    ),
    output_dir=Path("./output")
)
# Auto-adjust for hardware (will keep full precision for BiDoRA)
config.auto_adjust_for_hardware()
# Load model with BiDoRA layers
model, tokenizer = load_model_and_tokenizer(config.model)
model = prepare_bidora_model(model, config.bidora, quantized=False)
# Load data
dataset = load_and_prepare_dataset(config.data)
tokenized_dataset = prepare_dataset_for_training(
    dataset, tokenizer, config.training.max_seq_length
)
# Train with bi-level optimization
trainer = train_bidora(model, tokenizer, tokenized_dataset, config)
```
## 🐛 Troubleshooting
### CUDA Out of Memory
```bash
# Reduce batch size
bidora train --batch-size 1 ...
# Or use smaller model
bidora train --model Qwen/Qwen3-1.7B ...
# Note: BiDoRA cannot use quantization (requires full precision)
```
### Flash Attention Error
If Flash Attention 2 is not available:
- It is disabled automatically.
- Alternatively, set `use_flash_attention=False` in `ModelConfig` manually, as sketched below.
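
A minimal sketch of the manual override, assuming the `use_flash_attention` field referenced above and the other fields shown in the programmatic example:

```python
from bidora import ModelConfig

# Disable Flash Attention 2 explicitly if it is not installed in your environment.
model_config = ModelConfig(
    model_name="Qwen/Qwen3-4B",
    quantization="none",
    use_flash_attention=False,
)
```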
### Import Errors
```bash
# Reinstall dependencies
uv pip install --force-reinstall transformers accelerate peft bitsandbytes
```
## 📚 Further Resources
- [BiDoRA Paper](https://arxiv.org/abs/2410.09758) - Original bi-level optimization paper
- [LoRA Paper](https://arxiv.org/abs/2106.09685) - Low-Rank Adaptation
- [DoRA Paper](https://arxiv.org/abs/2402.09353) - Weight-Decomposed LoRA
- [Qwen3 Models](https://huggingface.co/collections/Qwen/qwen3-680edabfb790c8c34a242f95) - HuggingFace model collection
## 📖 Citation
If you use BiDoRA in your research, please cite:
```bibtex
@article{liu2024bidora,
  title={BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation},
  author={Liu, Peiran and Wang, Luning and Sun, Yanchao and Tang, Zhongwei and Xu, Dawei and Li, Jiaxi and Xu, Zhili},
  journal={arXiv preprint arXiv:2410.09758},
  year={2024}
}
```
## 📝 License
MIT License - see LICENSE file.