# Phi Model Plugin for DashAI
This plugin integrates Microsoft's **Phi** language models into the DashAI framework using the `llama.cpp` backend. It provides a lightweight, efficient text generation system with support for quantized GGUF models.
## Included Models
### 1. Phi-3 Mini 4K Instruct
- A lightweight 3.8B-parameter model from the Phi-3 family
- Designed for high-quality output with strong reasoning abilities
- Trained on synthetic and filtered public datasets
- Fine-tuned with supervised techniques and direct preference optimization
- Based on [`microsoft/Phi-3-mini-4k-instruct-gguf`](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf)
- Uses GGUF file: `Phi-3-mini-4k-instruct-q4.gguf`
### 2. Phi-4
- State-of-the-art open model developed by Microsoft Research
- Trained on high-quality public domain content, academic books, and Q&A datasets
- Emphasizes precise instruction-following and strong safety alignment
- Based on [`microsoft/phi-4-gguf`](https://huggingface.co/microsoft/phi-4-gguf)
- Uses GGUF file: `phi-4-IQ3_M.gguf`
Both models use the **GGUF** format and are compatible with CPU and GPU inference.
## Components
### PhiModel
- Implements the `TextToTextGenerationTaskModel` interface from DashAI
- Uses the `llama.cpp` backend with GGUF support
- Automatically loads the correct quantized model file based on the selected model
- Performs chat-style completion with system/user/assistant messages
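The plugin's internal API is not reproduced here, but the chat-style flow it describes can be sketched with llama-cpp-python directly. The `build_messages` and `generate` helpers below are illustrative, not part of the plugin; the `repo_id`, `filename`, and parameter values are taken from the model list and defaults documented in this README.

```python
def build_messages(system_prompt: str, user_prompt: str) -> list:
    """Assemble the role-based message list used for chat-style completion."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

def generate(user_prompt: str) -> str:
    """Run one chat completion against Phi-3 Mini via llama-cpp-python."""
    from llama_cpp import Llama  # deferred import: heavy optional dependency

    # Downloads the quantized GGUF file from Hugging Face on first use.
    llm = Llama.from_pretrained(
        repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",
        filename="Phi-3-mini-4k-instruct-q4.gguf",
        n_ctx=512,
    )
    response = llm.create_chat_completion(
        messages=build_messages("You are a helpful assistant.", user_prompt),
        max_tokens=100,
        temperature=0.7,
        frequency_penalty=0.1,
    )
    return response["choices"][0]["message"]["content"]
```

The deferred import keeps the module importable on machines without `llama-cpp-python` installed; only calling `generate` requires the backend.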
## Features
- Configurable text generation with:
  - `max_tokens`: Maximum number of tokens to generate
  - `temperature`: Controls output randomness
  - `frequency_penalty`: Reduces repetition
  - `context_window`: Maximum context size in tokens
  - `device`: `"cpu"` or `"gpu"` (auto-detected)
- Efficient memory usage with quantized GGUF format
- Automatic model loading from Hugging Face
- Compatible with chat-style prompts (role-based message format)
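The README does not specify how the `device` value is auto-detected. One plausible heuristic (illustrative only — the plugin's actual logic may differ) is to report `"gpu"` when a CUDA device is visible and fall back to `"cpu"` otherwise:

```python
def detect_device() -> str:
    """Best-effort device auto-detection (illustrative heuristic).

    Prefers "gpu" when a CUDA device is visible via torch, which is
    already a dependency of the DashAI stack; otherwise returns "cpu".
    """
    try:
        import torch
        if torch.cuda.is_available():
            return "gpu"
    except ImportError:
        pass  # torch absent: CPU inference via llama.cpp still works
    return "cpu"
```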
## Model Parameters
| Parameter | Description | Default |
| ------------------- | ------------------------------------------------ | ----------------------------------------- |
| `model_name` | Model ID from Hugging Face | `"microsoft/Phi-3-mini-4k-instruct-gguf"` |
| `max_tokens` | Maximum number of tokens to generate | 100 |
| `temperature` | Sampling temperature (higher = more random) | 0.7 |
| `frequency_penalty` | Penalizes repeated tokens to encourage diversity | 0.1 |
| `context_window` | Maximum context size in tokens | 512 |
| `device` | Device for inference (`"gpu"` or `"cpu"`) | Auto-detected |
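The defaults in the table can be captured as a plain dictionary. The `resolve_params` helper below is a hypothetical sketch of how user overrides might be merged with these documented defaults; it is not the plugin's actual configuration code.

```python
# Defaults copied from the parameter table above; `device` is omitted
# because it is auto-detected rather than defaulted to a fixed value.
DEFAULTS = {
    "model_name": "microsoft/Phi-3-mini-4k-instruct-gguf",
    "max_tokens": 100,
    "temperature": 0.7,
    "frequency_penalty": 0.1,
    "context_window": 512,
}

def resolve_params(**overrides) -> dict:
    """Fill unspecified parameters with the documented defaults.

    Rejects unknown keys early so typos fail loudly instead of being
    silently ignored at inference time.
    """
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}
```

For example, `resolve_params(temperature=0.2)` keeps every default except the sampling temperature.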
## Requirements
- `DashAI`
- `llama-cpp-python`
- Model files from Hugging Face:
  - [`microsoft/Phi-3-mini-4k-instruct-gguf`](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf)
  - [`microsoft/phi-4-gguf`](https://huggingface.co/microsoft/phi-4-gguf)
## Notes
This plugin uses the **GGUF** format, introduced by the `llama.cpp` team in August 2023.
GGUF replaces the older **GGML** format and is optimized for fast inference and low memory usage.
Both Phi-3 Mini and Phi-4 models have undergone **supervised fine-tuning** and **preference optimization** to improve instruction adherence and safety.
> ⚠️ These models are designed for **inference only** and are **not intended for fine-tuning**.