# llama-cpp-cffi
<!--
[![Build][build-image]]()
[![Status][status-image]][pypi-project-url]
[![Stable Version][stable-ver-image]][pypi-project-url]
[![Coverage][coverage-image]]()
[![Python][python-ver-image]][pypi-project-url]
[![License][mit-image]][mit-url]
-->
[![PyPI](https://img.shields.io/pypi/v/llama-cpp-cffi)](https://pypi.org/project/llama-cpp-cffi/)
[![Supported Versions](https://img.shields.io/pypi/pyversions/llama-cpp-cffi)](https://pypi.org/project/llama-cpp-cffi)
[![PyPI Downloads](https://img.shields.io/pypi/dm/llama-cpp-cffi)](https://pypistats.org/packages/llama-cpp-cffi)
[![Github Downloads](https://img.shields.io/github/downloads/tangledgroup/llama-cpp-cffi/total.svg?label=Github%20Downloads)]()
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)
**Python** 3.10+ binding for [llama.cpp](https://github.com/ggerganov/llama.cpp) using **cffi**. Supports **CPU**, **Vulkan 1.x** (AMD, Intel and Nvidia GPUs) and **CUDA 12.6** (Nvidia GPUs) runtimes on **x86_64** and **aarch64** platforms.

NOTE: **Linux** is currently the only supported operating system (`manylinux_2_28` and `musllinux_1_2` wheels), but we are working on both **Windows** and **macOS** versions.
## News
- **Jan 15 2025, v0.4.15**: Dynamically load/unload models while executing prompts in parallel.
- **Jan 14 2025, v0.4.14**: Modular llama.cpp build using the `cmake` build system; the `make` build system is deprecated.
- **Jan 1 2025, v0.3.1**: OpenAI-compatible API for **text** and **vision** models. Added support for **Qwen2-VL** models. Hot-swap of models on demand in server/API.
- **Dec 9 2024, v0.2.0**: Low-level and high-level APIs: llama, llava, clip and ggml.
- **Nov 27 2024, v0.1.22**: Support for multimodal models such as **llava** and **minicpmv**.
## Install
Basic library install:
```bash
pip install llama-cpp-cffi
```
If you want an API compatible with the [OpenAI © Chat Completions API](https://platform.openai.com/docs/overview):
```bash
pip install "llama-cpp-cffi[openai]"
```
**IMPORTANT:** To take advantage of **Nvidia** GPU acceleration, make sure you have **CUDA 12.x** installed. If you don't, follow the instructions at https://developer.nvidia.com/cuda-downloads.

Supported GPU compute capabilities are `compute_61`, `compute_70`, `compute_75`, `compute_80`, `compute_86` and `compute_89`, covering most GPUs from the **GeForce GTX 1050** to the **NVIDIA H100**. See [GPU Compute Capability](https://developer.nvidia.com/cuda-gpus).
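If you are unsure which compute capability your GPU has, `nvidia-smi` can report it. A minimal sketch (not part of the library; the `compute_cap` query field assumes a reasonably recent Nvidia driver):

```python
import subprocess

# Query the name and compute capability of each visible GPU via nvidia-smi.
out = subprocess.run(
    ['nvidia-smi', '--query-gpu=name,compute_cap', '--format=csv,noheader'],
    capture_output=True,
    text=True,
    check=True,
)

supported = {'6.1', '7.0', '7.5', '8.0', '8.6', '8.9'}  # compute_61 .. compute_89

for line in out.stdout.strip().splitlines():
    name, cap = [part.strip() for part in line.split(',')]
    status = 'supported' if cap in supported else 'not covered by the prebuilt CUDA wheels'
    print(f'{name}: compute capability {cap} ({status})')
```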
## LLM Example
```python
from llama import Model

#
# first define and load/init model
#
model = Model(
    creator_hf_repo='HuggingFaceTB/SmolLM2-1.7B-Instruct',
    hf_repo='bartowski/SmolLM2-1.7B-Instruct-GGUF',
    hf_file='SmolLM2-1.7B-Instruct-Q4_K_M.gguf',
)

model.init(n_ctx=8 * 1024, gpu_layers=99)

#
# messages
#
messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': '1 + 1 = ?'},
    {'role': 'assistant', 'content': '2'},
    {'role': 'user', 'content': 'Evaluate 1 + 2 in Python.'},
]

completions = model.completions(
    messages=messages,
    predict=1 * 1024,
    temp=0.7,
    top_p=0.8,
    top_k=100,
)

for chunk in completions:
    print(chunk, flush=True, end='')

#
# prompt
#
prompt = 'Evaluate 1 + 2 in Python. Result in Python is'

completions = model.completions(
    prompt=prompt,
    predict=1 * 1024,
    temp=0.7,
    top_p=0.8,
    top_k=100,
)

for chunk in completions:
    print(chunk, flush=True, end='')
```
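`completions` is a plain iterator over text chunks, so if you don't need streaming output you can simply join it into a single string. A usage sketch based on the example above:

```python
# Collect the streamed chunks into one response string instead of printing them.
response = ''.join(model.completions(
    messages=messages,
    predict=1 * 1024,
    temp=0.7,
    top_p=0.8,
    top_k=100,
))

print(response)
```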
### References
- `examples/llm.py`
- `examples/demo_text.py`
## VLM Example
```python
from llama import Model

#
# first define and load/init model
#
model = Model(  # 1.87B
    creator_hf_repo='vikhyatk/moondream2',
    hf_repo='vikhyatk/moondream2',
    hf_file='moondream2-text-model-f16.gguf',
    mmproj_hf_file='moondream2-mmproj-f16.gguf',
)

model.init(n_ctx=8 * 1024, gpu_layers=99)

#
# prompt
#
prompt = 'Describe this image.'
image = 'examples/llama-1.png'

completions = model.completions(
    prompt=prompt,
    image=image,
    predict=1 * 1024,
)

for chunk in completions:
    print(chunk, flush=True, end='')
```
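Since `completions` takes the image as a plain argument, describing several images is just a loop over the same call. A small sketch reusing the model initialized above (the second image path is a hypothetical placeholder):

```python
# Run the same prompt over several images, one completions() call per image.
image_paths = [
    'examples/llama-1.png',
    'examples/llama-2.png',  # hypothetical second image
]

for path in image_paths:
    print(f'--- {path} ---')

    for chunk in model.completions(prompt='Describe this image.', image=path, predict=1 * 1024):
        print(chunk, flush=True, end='')

    print()
```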
### References
- `examples/vlm.py`
- `examples/demo_llava.py`
- `examples/demo_minicpmv.py`
- `examples/demo_qwen2vl.py`
## API
### Server - llama-cpp-cffi + OpenAI API
Run the server first:
```bash
python -m llama.server
# or
python -B -u -m gunicorn --bind '0.0.0.0:11434' --timeout 900 --workers 1 --worker-class aiohttp.GunicornWebWorker 'llama.server:build_app()'
```
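The gunicorn invocation above points at `llama.server:build_app()`, which (as the aiohttp worker class implies) returns a standard aiohttp application, so the server can also be started from Python directly. A minimal sketch under that assumption:

```python
from aiohttp import web

from llama.server import build_app

# Build the aiohttp application and serve it on the same port used in the examples below.
web.run_app(build_app(), host='0.0.0.0', port=11434)
```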
### Client - llama-cpp-cffi API / curl
```bash
#
# llm
#
curl -XPOST 'http://localhost:11434/api/1.0/completions' \
-H "Content-Type: application/json" \
-d '{
    "gpu_layers": 99,
    "prompt": "Evaluate 1 + 2 in Python."
}'

curl -XPOST 'http://localhost:11434/api/1.0/completions' \
-H "Content-Type: application/json" \
-d '{
    "creator_hf_repo": "HuggingFaceTB/SmolLM2-1.7B-Instruct",
    "hf_repo": "bartowski/SmolLM2-1.7B-Instruct-GGUF",
    "hf_file": "SmolLM2-1.7B-Instruct-Q4_K_M.gguf",
    "gpu_layers": 99,
    "prompt": "Evaluate 1 + 2 in Python."
}'

curl -XPOST 'http://localhost:11434/api/1.0/completions' \
-H "Content-Type: application/json" \
-d '{
    "creator_hf_repo": "Qwen/Qwen2.5-0.5B-Instruct",
    "hf_repo": "Qwen/Qwen2.5-0.5B-Instruct-GGUF",
    "hf_file": "qwen2.5-0.5b-instruct-q4_k_m.gguf",
    "gpu_layers": 99,
    "prompt": "Evaluate 1 + 2 in Python."
}'

curl -XPOST 'http://localhost:11434/api/1.0/completions' \
-H "Content-Type: application/json" \
-d '{
    "creator_hf_repo": "Qwen/Qwen2.5-7B-Instruct",
    "hf_repo": "bartowski/Qwen2.5-7B-Instruct-GGUF",
    "hf_file": "Qwen2.5-7B-Instruct-Q4_K_M.gguf",
    "gpu_layers": 99,
    "prompt": "Evaluate 1 + 2 in Python."
}'

#
# vlm - example 1
#
image_path="examples/llama-1.jpg"
mime_type=$(file -b --mime-type "$image_path")
base64_data=$(base64 -w 0 "$image_path")

cat << EOF > /tmp/temp.json
{
    "creator_hf_repo": "Qwen/Qwen2-VL-2B-Instruct",
    "hf_repo": "bartowski/Qwen2-VL-2B-Instruct-GGUF",
    "hf_file": "Qwen2-VL-2B-Instruct-Q4_K_M.gguf",
    "mmproj_hf_file": "mmproj-Qwen2-VL-2B-Instruct-f16.gguf",
    "gpu_layers": 99,
    "prompt": "Describe this image.",
    "image": "data:$mime_type;base64,$base64_data"
}
EOF

curl -XPOST 'http://localhost:11434/api/1.0/completions' \
-H "Content-Type: application/json" \
--data-binary "@/tmp/temp.json"

#
# vlm - example 2
#
image_path="examples/llama-1.jpg"
mime_type=$(file -b --mime-type "$image_path")
base64_data=$(base64 -w 0 "$image_path")

cat << EOF > /tmp/temp.json
{
    "creator_hf_repo": "Qwen/Qwen2-VL-2B-Instruct",
    "hf_repo": "bartowski/Qwen2-VL-2B-Instruct-GGUF",
    "hf_file": "Qwen2-VL-2B-Instruct-Q4_K_M.gguf",
    "mmproj_hf_file": "mmproj-Qwen2-VL-2B-Instruct-f16.gguf",
    "gpu_layers": 99,
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "Describe this image."},
            {
                "type": "image_url",
                "image_url": {"url": "data:$mime_type;base64,$base64_data"}
            }
        ]}
    ]
}
EOF

curl -XPOST 'http://localhost:11434/api/1.0/completions' \
-H "Content-Type: application/json" \
--data-binary "@/tmp/temp.json"
```
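The same endpoint can of course be called from Python. A sketch using `requests` that mirrors the VLM curl example above; the data-URL construction uses only the standard library, and handling the reply via `iter_content` is an assumption that works whether or not the server streams its response:

```python
import base64
import mimetypes

import requests

# Build a data URL for the image, equivalent to the `file`/`base64` shell commands above.
image_path = 'examples/llama-1.jpg'
mime_type, _ = mimetypes.guess_type(image_path)

with open(image_path, 'rb') as f:
    base64_data = base64.b64encode(f.read()).decode()

payload = {
    'creator_hf_repo': 'Qwen/Qwen2-VL-2B-Instruct',
    'hf_repo': 'bartowski/Qwen2-VL-2B-Instruct-GGUF',
    'hf_file': 'Qwen2-VL-2B-Instruct-Q4_K_M.gguf',
    'mmproj_hf_file': 'mmproj-Qwen2-VL-2B-Instruct-f16.gguf',
    'gpu_layers': 99,
    'prompt': 'Describe this image.',
    'image': f'data:{mime_type};base64,{base64_data}',
}

# Print the response as it arrives; works for both streamed and non-streamed replies.
with requests.post('http://localhost:11434/api/1.0/completions', json=payload, stream=True) as resp:
    resp.raise_for_status()

    for chunk in resp.iter_content(chunk_size=None, decode_unicode=True):
        print(chunk, flush=True, end='')
```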
### Client - OpenAI © compatible Chat Completions API
```bash
#
# text
#
curl -XPOST 'http://localhost:11434/v1/chat/completions' \
-H "Content-Type: application/json" \
-d '{
    "model": "HuggingFaceTB/SmolLM2-1.7B-Instruct:bartowski/SmolLM2-1.7B-Instruct-GGUF:SmolLM2-1.7B-Instruct-Q4_K_M.gguf",
    "messages": [
        {
            "role": "user",
            "content": "Evaluate 1 + 2 in Python."
        }
    ],
    "n_ctx": 8192,
    "gpu_layers": 99
}'

#
# image
#
image_path="examples/llama-1.jpg"
mime_type=$(file -b --mime-type "$image_path")
base64_data=$(base64 -w 0 "$image_path")

cat << EOF > /tmp/temp.json
{
    "model": "Qwen/Qwen2-VL-2B-Instruct:bartowski/Qwen2-VL-2B-Instruct-GGUF:Qwen2-VL-2B-Instruct-Q4_K_M.gguf:mmproj-Qwen2-VL-2B-Instruct-f16.gguf",
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "Describe this image."},
            {
                "type": "image_url",
                "image_url": {"url": "data:$mime_type;base64,$base64_data"}
            }
        ]}
    ],
    "n_ctx": 8192,
    "gpu_layers": 99
}
EOF

curl -XPOST 'http://localhost:11434/v1/chat/completions' \
-H "Content-Type: application/json" \
--data-binary "@/tmp/temp.json"

#
# Client Python API for OpenAI
#
python -B examples/demo_openai.py
```
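The `examples/demo_openai.py` script referenced below uses the official `openai` Python client. A hedged sketch of what such a call can look like against this server: the `model` string format and the `n_ctx`/`gpu_layers` fields come from the curl example above (passed through `extra_body`), and the API key value is a placeholder since the server is local:

```python
from openai import OpenAI

# Point the official OpenAI client at the local llama-cpp-cffi server.
client = OpenAI(base_url='http://localhost:11434/v1', api_key='not-needed')

response = client.chat.completions.create(
    model='HuggingFaceTB/SmolLM2-1.7B-Instruct:bartowski/SmolLM2-1.7B-Instruct-GGUF:SmolLM2-1.7B-Instruct-Q4_K_M.gguf',
    messages=[
        {'role': 'user', 'content': 'Evaluate 1 + 2 in Python.'},
    ],
    # Server-specific options are not part of the OpenAI schema, so pass them via extra_body.
    extra_body={'n_ctx': 8192, 'gpu_layers': 99},
)

print(response.choices[0].message.content)
```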
### References
- `examples/demo_openai.py`