# DeepSeek LLM Plugin for DashAI
This plugin integrates two DeepSeek models into the DashAI framework using the `llama.cpp` backend. It enables text generation tasks through a lightweight and efficient inference engine with support for quantized GGUF models.
## Included Models
### 1. DeepSeek LLM 7B Chat
- Pretrained chat-oriented model for general text generation
- Based on [`TheBloke/deepseek-llm-7B-chat-GGUF`](https://huggingface.co/TheBloke/deepseek-llm-7B-chat-GGUF)
- Uses quantized file: `deepseek-llm-7b-chat.Q5_K_M.gguf`
### 2. DeepSeek Coder 6.7B Instruct
- Instruction-tuned model for code-related and general instruction tasks
- Initialized from `deepseek-coder-6.7b-base`, fine-tuned on 2B instruction tokens
- Based on [`TheBloke/deepseek-coder-6.7B-instruct-GGUF`](https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF)
- Uses quantized file: `deepseek-coder-6.7b-instruct.Q5_K_M.gguf`
Both models use the **Q5_K_M** quantization method for a balance of quality and efficiency, and are compatible with both CPU and GPU inference.
## Components
### DeepSeekModel
- Implements the `TextToTextGenerationTaskModel` interface from DashAI
- Uses the `llama.cpp` backend with GGUF support
- Loads the model from Hugging Face at runtime
- Supports configurable generation parameters
- Automatically truncates long prompts and applies custom stop sequences for cleaner output (sketched below)
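The plugin's actual source is not reproduced here; the following is a minimal sketch of the same load-truncate-generate flow using `llama-cpp-python`'s public API. The class name, the truncation logic, and the defaults are illustrative assumptions, not the plugin's real implementation.

```python
# Minimal sketch of the DeepSeekModel flow using llama-cpp-python's
# public API. The class name and truncation logic are illustrative
# assumptions, not the plugin's actual source.
from llama_cpp import Llama


class DeepSeekChatSketch:
    def __init__(self, n_ctx: int = 4096, device: str = "cpu"):
        # Llama.from_pretrained downloads the GGUF file from the Hugging
        # Face Hub on first use (requires huggingface-hub to be installed).
        self.llm = Llama.from_pretrained(
            repo_id="TheBloke/deepseek-llm-7B-chat-GGUF",
            filename="deepseek-llm-7b-chat.Q5_K_M.gguf",
            n_ctx=n_ctx,
            # -1 offloads all layers to the GPU; 0 keeps inference on CPU.
            n_gpu_layers=-1 if device == "gpu" else 0,
            verbose=False,
        )

    def generate(self, prompt: str, max_tokens: int = 100) -> str:
        # Truncate the prompt so prompt + generation fit in the context
        # window; token-level truncation is an assumption of this sketch.
        tokens = self.llm.tokenize(prompt.encode("utf-8"))
        budget = max(self.llm.n_ctx() - max_tokens, 1)
        if len(tokens) > budget:
            prompt = self.llm.detokenize(tokens[:budget]).decode(
                "utf-8", errors="ignore"
            )
        # The custom stop sequence keeps the model from answering its
        # own follow-up questions.
        out = self.llm(prompt, max_tokens=max_tokens, stop=["Q:"])
        return out["choices"][0]["text"]
```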
## Features
- Configurable text generation with:
  - `max_tokens`: Number of tokens to generate
  - `temperature`: Controls output randomness
  - `frequency_penalty`: Reduces repetition
  - `n_ctx`: Context window size
  - `device`: `"cpu"` or `"gpu"`
- Efficient memory usage with GGUF quantization
- Custom stop sequence: `["Q:"]`
## Model Parameters
| Parameter | Description | Default |
| ------------------- | ------------------------------------------------ | -------------------- |
| `max_tokens` | Maximum number of tokens to generate | 100 |
| `temperature` | Sampling temperature (higher = more random) | 0.7 |
| `frequency_penalty` | Penalizes repeated tokens to encourage diversity | 0.1 |
| `n_ctx`             | Context window size (prompt plus generated tokens) | 4096              |
| `device` | Inference device | `"gpu"` if available |
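These defaults map directly onto `llama-cpp-python`'s completion API. The snippet below is a hedged illustration of that mapping (the prompt text is invented), not code from the plugin itself:

```python
from llama_cpp import Llama

# Load as in the Components sketch above (chat model, 4096-token context).
llm = Llama.from_pretrained(
    repo_id="TheBloke/deepseek-llm-7B-chat-GGUF",
    filename="deepseek-llm-7b-chat.Q5_K_M.gguf",
    n_ctx=4096,  # `n_ctx` is a load-time setting, not a per-call parameter
    verbose=False,
)

# The remaining table defaults are per-call arguments to create_completion():
response = llm.create_completion(
    prompt="Q: What does the Q5_K_M suffix mean?\nA:",
    max_tokens=100,         # maximum number of tokens to generate
    temperature=0.7,        # higher values mean more random sampling
    frequency_penalty=0.1,  # discourages repeating frequent tokens
    stop=["Q:"],            # the plugin's custom stop sequence
)
print(response["choices"][0]["text"].strip())
```

Note that `device`, like `n_ctx`, is resolved at load time (via `n_gpu_layers` in `llama-cpp-python`) rather than per call.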
## Requirements
- `DashAI`
- `llama-cpp-python`
- Model files from Hugging Face (see the download sketch after this list):
- [`TheBloke/deepseek-llm-7B-chat-GGUF`](https://huggingface.co/TheBloke/deepseek-llm-7B-chat-GGUF)
- [`TheBloke/deepseek-coder-6.7B-instruct-GGUF`](https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF)
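Both GGUF files are fetched automatically at first load, but they can also be pre-downloaded into the local Hugging Face cache (useful before going offline). A minimal sketch using `huggingface_hub`'s standard helper:

```python
# Optional: pre-fetch both quantized GGUF files into the local
# Hugging Face cache before first use.
from huggingface_hub import hf_hub_download

MODELS = [
    ("TheBloke/deepseek-llm-7B-chat-GGUF",
     "deepseek-llm-7b-chat.Q5_K_M.gguf"),
    ("TheBloke/deepseek-coder-6.7B-instruct-GGUF",
     "deepseek-coder-6.7b-instruct.Q5_K_M.gguf"),
]

for repo_id, filename in MODELS:
    path = hf_hub_download(repo_id=repo_id, filename=filename)
    print(f"cached {filename} at {path}")
```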
## Notes
This plugin uses the **GGUF** format, introduced by the `llama.cpp` team in August 2023.
GGUF replaces the older **GGML** format, which is no longer supported.
GGUF models are optimized for fast inference and lower memory consumption, which is especially valuable on devices with limited CPU or GPU resources.
Both models (`deepseek-llm-7b-chat` and `deepseek-coder-6.7b-instruct`) are distributed in the **Q5_K_M** quantized format.
This quantization method offers a solid trade-off between model size and quality, making them suitable for real-time or resource-limited environments.
> ⚠️ These models are **pretrained and instruction-tuned** for inference only. They are **not intended for fine-tuning**.