# BiDoRA: Bi-Level Optimization for Parameter-Efficient Fine-Tuning
**BiDoRA** is a Python package implementing true BiDoRA (Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation) for efficient fine-tuning of Large Language Models. Specifically optimized for:
- 3D Code Generation (Rust, Blender, CAD)
- Spatial Intelligence Tasks
- Small Datasets (<10k samples)
- Automatic Hardware Adaptation (Laptop to A100)
## 🔬 What is BiDoRA?
BiDoRA uses **bi-level optimization** to separately optimize magnitude and direction components of weight updates:
```
W' = m ⊙ (W₀ + BA) / ||W₀ + BA||
     ↑        ↑
 magnitude  direction
  (upper)    (lower)
```
**Training Process:**
1. **Lower Level**: Optimize direction (A, B matrices) on training set
2. **Upper Level**: Optimize magnitude (m) on validation set via hypergradients
3. **Final Phase**: Direction fine-tuning on combined data with fixed magnitude
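
The sketch below illustrates this alternating scheme on a toy linear layer. It is a minimal, first-order stand-in for the actual trainer: it does not compute hypergradients, and none of the names are part of the bidora API.

```python
# Toy sketch of the BiDoRA-style alternating optimization (illustrative only).
import torch

W0 = torch.randn(64, 64)                                # frozen pretrained weight (out, in)
A = (0.01 * torch.randn(8, 64)).requires_grad_(True)    # direction, lower level
B = torch.zeros(64, 8, requires_grad=True)              # direction, lower level
m = torch.ones(1, 64, requires_grad=True)               # magnitude, upper level

def reparam() -> torch.Tensor:
    """W' = m ⊙ (W0 + BA) / ||W0 + BA|| with a column-wise L2 norm."""
    delta = W0 + B @ A
    return m * delta / delta.norm(dim=0, keepdim=True)

def loss_on(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    return (x @ reparam().T - y).pow(2).mean()

opt_dir = torch.optim.AdamW([A, B], lr=2e-4)            # lower-level optimizer
opt_mag = torch.optim.AdamW([m], lr=2e-4 * 2.0)         # upper level: base LR x multiplier

for step in range(100):
    # Lower level: direction step on a training batch (m is not stepped here)
    x_tr, y_tr = torch.randn(16, 64), torch.randn(16, 64)
    opt_dir.zero_grad(); loss_on(x_tr, y_tr).backward(); opt_dir.step()

    # Upper level: magnitude step on a validation batch
    # (plain gradient used here as a stand-in for the hypergradient)
    x_va, y_va = torch.randn(16, 64), torch.randn(16, 64)
    opt_mag.zero_grad(); loss_on(x_va, y_va).backward(); opt_mag.step()

# Final phase: freeze the magnitude and keep tuning A, B on train + validation data.
m.requires_grad_(False)
```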
**Benefits:**
- ✅ Reduces overfitting on small datasets (<10k samples)
- ✅ Better alignment with full fine-tuning (correlation: -8.042 vs -1.784 for DoRA)
- ✅ Statistically significant improvements on GLUE (p < 0.001)
**Important Notes:**
- ⚠️ **Training Time**: 3-4x slower than standard LoRA due to bi-level optimization
- ⚠️ **No Quantization**: BiDoRA requires full precision (bfloat16); quantization is disabled automatically
- ⚠️ **Memory**: Uses an 8-bit AdamW optimizer (~75% optimizer-state memory reduction) to compensate
- ✅ **Best For**: Small, specialized datasets where quality matters more than speed
## 🚀 Features
- ✅ **BiDoRA Bi-Level Optimization**: True magnitude-direction decomposition
- ✅ **Auto Hardware Detection**: Automatically adapts config to available hardware
- ✅ **Full Precision Training**: Optimized for bfloat16 (no quantization needed for BiDoRA)
- ✅ **Flexible Data Formats**: JSONL, HuggingFace Datasets
- ✅ **Type-Safe Config**: Pydantic-validated configuration
- ✅ **CLI Interface**: Simple command-line interface with Typer
## 📦 Installation
### From PyPI (recommended)
```bash
pip install bidora
```
### As a project dependency
```bash
# With uv (recommended)
uv add bidora
# With pip
pip install bidora
```
### From source (for development)
```bash
git clone https://github.com/bjoernbethge/bidora.git
cd bidora
uv sync --dev
```
## 🎯 Quick Start
### 1. Show hardware info
```bash
bidora info
```
Shows available hardware and recommended configuration.
### 2. Show recommended models
```bash
bidora list-models
```
### 3. Start BiDoRA training
**Important:** BiDoRA requires **separate train and validation files** for bi-level optimization.
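
If you only have a single JSONL file, a simple shuffled split like the sketch below produces the two required files. The file names are placeholders and this helper is not part of the bidora CLI.

```python
# Illustrative 90/10 split of one JSONL file into train/validation files.
import random
from pathlib import Path

lines = [ln for ln in Path("data/all.jsonl").read_text().splitlines() if ln.strip()]
random.seed(0)                      # reproducible split
random.shuffle(lines)
cut = int(0.9 * len(lines))
Path("data/train.jsonl").write_text("\n".join(lines[:cut]) + "\n")
Path("data/val.jsonl").write_text("\n".join(lines[cut:]) + "\n")
```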
#### Basic training
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-4B \
    --output ./output \
    --rank 8 \
    --epochs 3
```
#### With custom learning rates
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-4B \
    --lr 2e-4 \
    --upper-lr-mult 2.0 \
    --rank 8
```
#### With HuggingFace dataset
```bash
bidora train \
    --dataset "code_search_net" \
    --model Qwen/Qwen3-8B \
    --output ./output \
    --rank 8
```
## 📊 Data Format
### JSONL Format (Instruction-Tuning)
```json
{"instruction": "Generate a Rust function to create a 3D cube mesh", "output": "fn create_cube() -> Mesh { ... }"}
{"instruction": "Write Blender Python code to add a sphere", "input": "radius: 2.0", "output": "import bpy\nbpy.ops.mesh.primitive_uv_sphere_add(radius=2.0)"}
```
### JSONL Format (Code Completion)
```json
{"prompt": "// Generate 3D mesh\nfn create_mesh()", "completion": " -> Mesh {\n let vertices = vec![...];\n Mesh::new(vertices)\n}"}
```
### JSONL Format (Code-Only)
```json
{"code": "use bevy::prelude::*;\n\nfn setup_3d_scene(mut commands: Commands) { ... }"}
```
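
A quick sanity check of a JSONL file against the key sets above can save a failed run. This is a stand-alone illustrative script, not part of the bidora package.

```python
# Sanity-check a JSONL file against the supported record formats shown above.
import json
from pathlib import Path

SUPPORTED_KEY_SETS = [
    {"instruction", "output"},      # instruction-tuning ("input" is optional)
    {"prompt", "completion"},       # code completion
    {"code"},                       # code-only
]

def check_jsonl(path: str) -> int:
    count = 0
    for line_no, line in enumerate(Path(path).read_text().splitlines(), start=1):
        if not line.strip():
            continue
        record = json.loads(line)
        keys = set(record) - {"input"}   # "input" is an optional extra field
        if not any(required <= keys for required in SUPPORTED_KEY_SETS):
            raise ValueError(f"line {line_no}: unsupported keys {sorted(record)}")
        count += 1
    return count

print(check_jsonl("data/train.jsonl"), "records OK")
```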
## ⚙️ Hardware-Specific Setups
### Laptop (8GB GPU)
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-4B \
    --rank 4 \
    --batch-size 1 \
    --auto-hardware  # Automatic adaptation
```
**Config automatically adjusted:**
- Precision: bfloat16 (BiDoRA requirement)
- Batch Size: 1
- Gradient Accumulation: 16
- Max Seq Length: 2048
### Desktop (16GB GPU)
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-8B \
    --rank 16 \
    --batch-size 2 \
    --auto-hardware
```
**Auto-Config:**
- Precision: bfloat16 (full precision - BiDoRA requirement)
- Batch Size: 1
- Gradient Accumulation: 16
- Max Seq Length: 2048
### A100 (40GB)
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-32B \
    --rank 16 \
    --batch-size 8 \
    --auto-hardware
```
**Auto-Config:**
- Precision: bfloat16 (full precision - BiDoRA requirement)
- Batch Size: 4
- Gradient Accumulation: 4
- Max Seq Length: 4096
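
In all three presets the effective batch size works out to 16 (per-device batch size × gradient accumulation steps: 1 × 16 for the laptop and desktop profiles, 4 × 4 for the A100 profile).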
## 🎛️ Advanced Options
### All CLI Parameters
```bash
bidora train --help
```
**Most Important Parameters:**
| Parameter | Description | Default |
|-----------|-------------|---------|
| `--model, -m` | Model name or path | `Qwen/Qwen3-4B` |
| `--train-file, -t` | Training JSONL | Required |
| `--val-file, -v` | Validation JSONL | **Required for BiDoRA** |
| `--dataset, -d` | HuggingFace Dataset | - |
| `--output, -o` | Output directory | `./output` |
| `--rank, -r` | LoRA Rank | `8` |
| `--epochs, -e` | Training Epochs | `3` |
| `--batch-size, -b` | Batch Size | `4` |
| `--lr` | Learning Rate (lower level) | `2e-4` |
| `--upper-lr-mult` | Upper level LR multiplier | `2.0` |
| `--max-samples` | Max Training Samples | All |
| `--auto-hardware` | Auto-adjustment | `True` |
### Manual Config (without Auto-Hardware)
```bash
bidora train \
    --train-file data/train.jsonl \
    --val-file data/val.jsonl \
    --model Qwen/Qwen3-8B \
    --rank 16 \
    --batch-size 8 \
    --lr 3e-4 \
    --epochs 5 \
    --no-auto-hardware  # Manual config
```
## 💾 Memory Requirements
### Qwen3 Model Sizes (BiDoRA - Full Precision)
⚠️ **Note**: BiDoRA requires full precision (bfloat16); no quantization. Memory requirements are higher than for standard LoRA.
| Model | Parameters | VRAM (bf16) | Training VRAM | Recommended For |
|-------|-----------|-------------|---------------|-----------------|
| Qwen3-0.6B | 0.6B | ~2GB | ~6GB | Laptop GPU (6-8GB) |
| Qwen3-1.7B | 1.7B | ~4GB | ~10GB | **Laptop GPU (8GB+)** |
| Qwen3-4B | 4B | ~8GB | ~16GB | Desktop GPU (12-16GB) |
| Qwen3-8B | 8B | ~16GB | ~24GB | Desktop GPU (24GB+) / A100 |
| Qwen3-14B | 14B | ~28GB | ~40GB | A100 (40GB) |
| Qwen3-32B | 32B | ~64GB | ~80GB | A100 (80GB) |
💡 **Memory Optimization**: An 8-bit AdamW optimizer (~75% optimizer-state memory reduction) compensates for the full-precision requirement.
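
For reference, this is roughly how an 8-bit AdamW optimizer is constructed with bitsandbytes. bidora handles this wiring internally; the snippet is an assumption-level sketch, not its actual code.

```python
# Minimal 8-bit AdamW sketch with bitsandbytes (nn.Linear stands in for the model).
import torch.nn as nn
import bitsandbytes as bnb

model = nn.Linear(512, 512)          # placeholder for the adapted model
params = [p for p in model.parameters() if p.requires_grad]
optimizer = bnb.optim.AdamW8bit(params, lr=2e-4, weight_decay=0.01)
# Optimizer state (exp_avg, exp_avg_sq) is stored in 8-bit blocks rather than
# fp32, which is where most of the quoted memory saving comes from.
```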
### Trainable Parameters (LoRA Rank=8)
| Base Model | LoRA Params | Reduction |
|------------|-------------|-----------|
| 7B | ~2M | **3500×** |
| 14B | ~4M | **3500×** |
| 32B | ~8M | **4000×** |
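
These counts depend on the rank and on which modules receive adapters. As a back-of-the-envelope rule, each adapted weight of shape (d_out, d_in) contributes rank × (d_in + d_out) trainable parameters; the module list in this sketch is an illustrative assumption, not bidora's actual target set.

```python
def lora_param_count(shapes: list[tuple[int, int]], rank: int) -> int:
    """Sum of rank * (d_out + d_in) over the adapted weight shapes."""
    return sum(rank * (d_out + d_in) for d_out, d_in in shapes)

# e.g. one 4096x4096 projection per layer across 32 layers at rank 8:
print(lora_param_count([(4096, 4096)] * 32, rank=8))  # 2_097_152, i.e. ~2M
```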
## 🧪 Example Workflow: 3D Rust Code Fine-Tuning
### 1. Prepare data
```bash
# data/rust_3d_train.jsonl
{"instruction": "Create a three-rs mesh for a cube", "output": "use three::*;\n\nfn create_cube(size: f32) -> Mesh {\n let geometry = Geometry::cuboid(size, size, size);\n Mesh::new(geometry, Material::default())\n}"}
{"instruction": "Generate Bevy 3D scene setup", "output": "use bevy::prelude::*;\n\nfn setup(mut commands: Commands) {\n commands.spawn(Camera3dBundle::default());\n commands.spawn(PbrBundle {\n mesh: meshes.add(Mesh::from(shape::Cube { size: 1.0 })),\n ..default()\n });\n}"}
```
### 2. Start training
```bash
bidora train \
    --train-file data/rust_3d_train.jsonl \
    --val-file data/rust_3d_val.jsonl \
    --model Qwen/Qwen3-4B \
    --output ./rust_3d_model \
    --rank 8 \
    --epochs 3 \
    --batch-size 2
```
### 3. Use model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load base model with BiDoRA adapters
model = AutoModelForCausalLM.from_pretrained(
    "./rust_3d_model/final_model",
    device_map="auto",
    torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B")
# Generate
prompt = "### Instruction:\nCreate a three-rs function to render a sphere\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## 🔧 Programmatic Usage
```python
from bidora import (
    FullConfig, ModelConfig, BiDoRAConfig, TrainingConfig, DataConfig,
    load_model_and_tokenizer, prepare_bidora_model,
    load_and_prepare_dataset, prepare_dataset_for_training,
    train_bidora
)
from pathlib import Path
# Create config
config = FullConfig(
    model=ModelConfig(
        model_name="Qwen/Qwen3-4B",
        quantization="none"  # BiDoRA requires full precision (bfloat16)
    ),
    bidora=BiDoRAConfig(
        rank=8,
        use_bidora=True,  # Enable BiDoRA bi-level optimization
        upper_lr_multiplier=2.0
    ),
    training=TrainingConfig(
        batch_size=2,
        learning_rate=2e-4,
        num_epochs=3
    ),
    data=DataConfig(
        train_file=Path("data/train.jsonl"),
        val_file=Path("data/val.jsonl")  # Required for BiDoRA
    ),
    output_dir=Path("./output")
)
# Auto-adjust for hardware (will keep full precision for BiDoRA)
config.auto_adjust_for_hardware()
# Load model with BiDoRA layers
model, tokenizer = load_model_and_tokenizer(config.model)
model = prepare_bidora_model(model, config.bidora, quantized=False)
# Load data
dataset = load_and_prepare_dataset(config.data)
tokenized_dataset = prepare_dataset_for_training(
    dataset, tokenizer, config.training.max_seq_length
)
# Train with bi-level optimization
trainer = train_bidora(model, tokenizer, tokenized_dataset, config)
```
## 🐛 Troubleshooting
### CUDA Out of Memory
```bash
# Reduce batch size
bidora train --batch-size 1 ...
# Or use smaller model
bidora train --model Qwen/Qwen3-1.7B ...
# Note: BiDoRA cannot use quantization (requires full precision)
```
### Flash Attention Error
If Flash Attention 2 is not available:
- It is disabled automatically.
- Alternatively, set `use_flash_attention=False` in `ModelConfig` manually, as sketched below.
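
A minimal sketch of the manual override, assuming the `use_flash_attention` field referenced above and the other fields shown in the programmatic example:

```python
from bidora import ModelConfig

# Disable Flash Attention 2 explicitly if it is not installed in your environment.
model_config = ModelConfig(
    model_name="Qwen/Qwen3-4B",
    quantization="none",
    use_flash_attention=False,
)
```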
### Import Errors
```bash
# Reinstall dependencies
uv pip install --force-reinstall transformers accelerate peft bitsandbytes
```
## 📚 Further Resources
- [BiDoRA Paper](https://arxiv.org/abs/2410.09758) - Original bi-level optimization paper
- [LoRA Paper](https://arxiv.org/abs/2106.09685) - Low-Rank Adaptation
- [DoRA Paper](https://arxiv.org/abs/2402.09353) - Weight-Decomposed LoRA
- [Qwen3 Models](https://huggingface.co/collections/Qwen/qwen3-680edabfb790c8c34a242f95) - HuggingFace model collection
## 📖 Citation
If you use BiDoRA in your research, please cite:
```bibtex
@article{liu2024bidora,
  title={BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation},
  author={Liu, Peiran and Wang, Luning and Sun, Yanchao and Tang, Zhongwei and Xu, Dawei and Li, Jiaxi and Xu, Zhili},
  journal={arXiv preprint arXiv:2410.09758},
  year={2024}
}
```
## 📝 License
MIT License - see LICENSE file.