llm-gguf

Name: llm-gguf
Version: 0.2
Summary: Run models distributed as GGUF files
Author: Simon Willison
License: Apache-2.0
Upload time: 2024-11-21 07:17:50
# llm-gguf

[![PyPI](https://img.shields.io/pypi/v/llm-gguf.svg)](https://pypi.org/project/llm-gguf/)
[![Changelog](https://img.shields.io/github/v/release/simonw/llm-gguf?include_prereleases&label=changelog)](https://github.com/simonw/llm-gguf/releases)
[![Tests](https://github.com/simonw/llm-gguf/actions/workflows/test.yml/badge.svg)](https://github.com/simonw/llm-gguf/actions/workflows/test.yml)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/llm-gguf/blob/main/LICENSE)

Run models distributed as GGUF files using [LLM](https://llm.datasette.io/)

## Installation

Install this plugin in the same environment as LLM:
```bash
llm install llm-gguf
```
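To confirm that the plugin installed correctly, list the plugins LLM has loaded - the output should include an entry for `llm-gguf`:
```bash
llm plugins
```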
## Usage

This plugin runs models that have been distributed as GGUF files.

You can either ask the plugin to download these directly, or you can register models you have already downloaded.

To download the LM Studio GGUF of Llama 3.1 8B Instruct, run the following command:

```bash
llm gguf download-model \
  https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf \
  --alias llama-3.1-8b-instruct --alias l31i
```
The `--alias` options set aliases for that model; you can omit them if you don't want to set any.
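
Aliases can also be managed after the fact using LLM's built-in alias commands. A quick sketch - the extra `llama-8b` alias here is just an example:
```bash
# Point an additional alias at the downloaded model
llm aliases set llama-8b gguf/Meta-Llama-3.1-8B-Instruct-Q4_K_M
# List all registered aliases
llm aliases list
```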

This command will download the 4.92GB file and store it in the directory revealed by running `llm gguf models-dir` - on macOS this will be `~/Library/Application Support/io.datasette.llm/gguf/models`.
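
To check that the file landed where you expect, you can list that directory directly (assuming a POSIX shell):
```bash
ls -lh "$(llm gguf models-dir)"
```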

Run `llm models` to confirm that the model has been installed.

You can then run prompts through that model like this:
```bash
llm -m gguf/Meta-Llama-3.1-8B-Instruct-Q4_K_M 'Five great names for a pet lemur'
```
Or using one of the aliases that you set like this:
```bash
llm -m l31i 'Five great names for a pet lemur'
```
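LLM's standard options work with these models too - for example `-s` sets a system prompt. A quick sketch:
```bash
llm -m l31i -s 'Reply with exactly five names, one per line' \
  'Five great names for a pet lemur'
```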
You can start a persistent chat session with the model using `llm chat` - this will avoid having to load the model into memory for each prompt:
```bash
llm chat -m l31i
```
```
Chatting with gguf/Meta-Llama-3.1-8B-Instruct-Q4_K_M
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
> tell me a joke about a walrus, a pelican and a lemur getting lunch
Here's one: Why did the walrus, the pelican, and the lemur go to the cafeteria for lunch? ...
```
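
Outside of `llm chat`, you can also continue your most recent conversation with the `-c` flag, which keeps the earlier exchanges in context:
```bash
llm -m l31i 'tell me a joke about a walrus'
llm -c 'now make it a limerick'
```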

If you have downloaded the model already you can register it with the plugin while keeping the file in its current location like this:
```bash
llm gguf register-model \
  ~/Downloads/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf \
  --alias llama-3.1-8b-instruct --alias l31i
```

This plugin **currently only works with chat models** - these are usually distributed in files with a suffix such as `-Instruct` or `-Chat` in their name.

For non-chat models you may have better luck with the older [llm-llama-cpp plugin](https://github.com/simonw/llm-llama-cpp).

## Embedding models

This plugin also supports [embedding models](https://llm.datasette.io/en/stable/embeddings/index.html) that are distributed as GGUFs.

These are managed using the `llm gguf embed-models`, `llm gguf download-embed-model` and `llm gguf register-embed-model` commands.

For example, to start using the excellent and tiny [mxbai-embed-xsmall-v1](https://huggingface.co/mixedbread-ai/mxbai-embed-xsmall-v1) model you can download the 30.8MB GGUF version like this:

```bash
llm gguf download-embed-model \
  https://huggingface.co/mixedbread-ai/mxbai-embed-xsmall-v1/resolve/main/gguf/mxbai-embed-xsmall-v1-q8_0.gguf
```
This will store the model in the directory shown when you run `llm gguf models-dir`.

Confirm that the new model is available by running this:

```bash
llm embed-models
```
You should see `gguf/mxbai-embed-xsmall-v1-q8_0` in the list.

Then try that model out like this:

```bash
llm embed -m gguf/mxbai-embed-xsmall-v1-q8_0 -c 'hello'
```
This will output a 384-element floating point JSON array.
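
If you have `jq` installed you can confirm the vector length - a quick sanity check, not something the plugin requires:
```bash
llm embed -m gguf/mxbai-embed-xsmall-v1-q8_0 -c 'hello' | jq 'length'
# Expected output: 384
```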

Consult the [LLM documentation](https://llm.datasette.io/en/stable/embeddings/index.html) for more information on how to use these embeddings.
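
One common pattern from those docs is storing embeddings in a named collection and then searching it. A minimal sketch - the `notes` collection name and sample strings are just examples:
```bash
# Store two strings in a collection called "notes"
llm embed notes item-1 -m gguf/mxbai-embed-xsmall-v1-q8_0 -c 'GGUF is a file format for models'
llm embed notes item-2 -m gguf/mxbai-embed-xsmall-v1-q8_0 -c 'Lemurs are primates from Madagascar'
# Find stored items most similar to a query
llm similar notes -c 'model file formats'
```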

## Development

To set up this plugin locally, first check out the code. Then create a new virtual environment:
```bash
cd llm-gguf
python3 -m venv venv
source venv/bin/activate
```
Now install the dependencies and test dependencies:
```bash
llm install -e '.[test]'
```
To run the tests:
```bash
pytest
```

            
