# Gemma Model Plugin for DashAI
This plugin integrates Google's **Gemma 3** language models into the DashAI framework using the `llama.cpp` backend. It enables efficient, flexible text generation with GGUF-quantized models and supports access to gated models via a Hugging Face API token.
## Included Models
### 1. Gemma 3 1B IT (QAT)
- Lightweight instruction-tuned model with 1.3B parameters
- Quantized and optimized for local inference (`q4_0` format)
- Based on [`google/gemma-3-1b-it-qat-q4_0-gguf`](https://huggingface.co/google/gemma-3-1b-it-qat-q4_0-gguf)
### 2. Gemma 3 4B IT (QAT)
- Instruction-tuned model with 4B parameters
- Balanced size and capability for local or cloud deployment
- Based on [`google/gemma-3-4b-it-qat-q4_0-gguf`](https://huggingface.co/google/gemma-3-4b-it-qat-q4_0-gguf)
Both models are **instruction-tuned** for high-quality generation and support CPU or GPU inference through `llama.cpp`.
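As a rough sense of scale: the `q4_0` format stores weights in blocks of 32 four-bit values plus a 16-bit scale, i.e. 18 bytes per 32 weights (about 4.5 bits per weight). The back-of-the-envelope estimate below is illustrative only; actual GGUF files are somewhat larger because embeddings and some tensors are kept at higher precision.

```python
# Rough q4_0 weight-memory estimate: each block of 32 weights costs 18 bytes
# (32 x 4-bit values + one 16-bit scale), i.e. ~4.5 bits per weight.
def q4_0_weight_bytes(n_params: int) -> int:
    return n_params * 18 // 32

for name, n_params in [("gemma-3-1b", 1_000_000_000), ("gemma-3-4b", 4_000_000_000)]:
    gib = q4_0_weight_bytes(n_params) / 2**30
    print(f"{name}: ~{gib:.2f} GiB of quantized weights")
```

This is why even the 4B model fits comfortably in the memory of a typical laptop.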
## About Gemma
Gemma is a family of lightweight, state-of-the-art open models from **Google**, developed with the same technology as the **Gemini** models.
Key features of **Gemma 3** models:
- Multimodal: support text and image input (in general; this plugin currently handles text-only generation)
- Large context window: up to **128K tokens**
- Instruction-tuned variants available
- Multilingual: over **140 languages** supported
- Open weights with access control via Hugging Face
Gemma is designed for deployment on laptops, desktops, and cloud infrastructure, making advanced AI more accessible.
## Features
- Text generation via chat-style prompt completion
- GGUF format for optimized performance and memory usage
- Configurable generation parameters:
  - `max_tokens`: maximum number of tokens to generate
  - `temperature`: output randomness
  - `frequency_penalty`: controls repetition
  - `context_window`: maximum number of tokens in the context (prompt plus generation)
  - `device`: `"gpu"` or `"cpu"`
- Automatic login to Hugging Face to access gated models
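`llama.cpp` applies the model's chat template automatically during chat-style completion, but it can help to see the turn-based format Gemma expects. The helper below is a minimal illustration, not part of the plugin; the `<start_of_turn>`/`<end_of_turn>` control tokens come from the Gemma tokenizer.

```python
def format_gemma_prompt(user_message: str) -> str:
    """Render a single-turn prompt in Gemma's chat format (illustrative sketch)."""
    return (
        "<start_of_turn>user\n"          # the user's turn begins
        f"{user_message}<end_of_turn>\n"  # ...and is closed explicitly
        "<start_of_turn>model\n"          # the model continues from here
    )

print(format_gemma_prompt("Explain GGUF in one sentence."))
```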
## Model Parameters
| Parameter | Description | Default |
| ------------------- | -------------------------------------------------- | -------------------------------------- |
| `model_name` | Model ID from Hugging Face | `"google/gemma-3-4b-it-qat-q4_0-gguf"` |
| `huggingface_key` | Hugging Face API token to access restricted models | _Required_ |
| `max_tokens` | Maximum number of tokens to generate | 100 |
| `temperature` | Sampling temperature (higher = more random) | 0.7 |
| `frequency_penalty` | Penalizes repeated tokens to encourage diversity | 0.1 |
| `context_window` | Maximum context window (tokens in prompt) | 512 |
| `device` | Inference device (`"gpu"` or `"cpu"`) | `"gpu"` if available |
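As a hedged sketch of how the parameters above might map onto `llama-cpp-python` keyword arguments: in that library, `n_ctx` sets the context window and `n_gpu_layers=-1` offloads all layers to the GPU. The helper functions and the exact mapping are assumptions for illustration, not the plugin's actual implementation.

```python
def build_llama_kwargs(config: dict) -> dict:
    """Translate plugin-style load parameters into llama-cpp-python kwargs (illustrative)."""
    return {
        "n_ctx": config.get("context_window", 512),
        # -1 offloads every layer to the GPU; 0 keeps inference on the CPU.
        "n_gpu_layers": -1 if config.get("device", "gpu") == "gpu" else 0,
    }

def build_generation_kwargs(config: dict) -> dict:
    """Per-request sampling parameters, mirroring the defaults in the table (illustrative)."""
    return {
        "max_tokens": config.get("max_tokens", 100),
        "temperature": config.get("temperature", 0.7),
        "frequency_penalty": config.get("frequency_penalty", 0.1),
    }

print(build_llama_kwargs({"device": "cpu", "context_window": 1024}))
print(build_generation_kwargs({}))
```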
## Requirements
- `DashAI`
- `llama-cpp-python`
- Valid **Hugging Face Access Token**
- Model files from Hugging Face:
- [`google/gemma-3-1b-it-qat-q4_0-gguf`](https://huggingface.co/google/gemma-3-1b-it-qat-q4_0-gguf)
- [`google/gemma-3-4b-it-qat-q4_0-gguf`](https://huggingface.co/google/gemma-3-4b-it-qat-q4_0-gguf)
> ⚠️ **Access Notice**: You must [accept the model terms on Hugging Face](https://huggingface.co/google/gemma-3-4b-it-qat-q4_0-gguf) and use a valid Hugging Face token.
> The model repositories are publicly listed but gated: you must agree to share your contact information before you can download the model files.
## Notes
This plugin uses the **GGUF** format, developed by the `llama.cpp` team for fast inference and low memory consumption.
The models are **pretrained and instruction-tuned** for inference; the quantized GGUF files are **not designed for fine-tuning**.
Currently, this plugin supports only **text generation** (not image inputs).