# Ollama Data Tools
## Requirements
- Python 3.6 or later
## Installation
Clone the repository and install the necessary dependencies:
```sh
git clone https://github.com/queelius/ollama_data_tools.git
cd ollama_data_tools
pip install -r requirements.txt
pip install -e .
```
## Ollama Data Toolkit
The `OllamaData` class is the core of the Ollama Data Toolkit. It lets you work with Ollama model data programmatically and provides methods to access, search, and filter model information.
### Features
- Retrieve the schema of the OllamaData object.
- Access models by name or index.
- List all available models.
- Perform JMESPath queries and apply regex filters on the model data.
- Cache model data for efficient repeated access.
### Class Methods
#### `OllamaData.get_schema() -> Dict[str, Any]`
Returns the schema of the `OllamaData` object.
#### `OllamaData.__init__(cache_path: str = '~/.ollama_data/cache', cache_time: str = '1 day')`
Initializes the `OllamaData` object.
- `cache_path`: The path to the cache file.
- `cache_time`: The duration the cache is valid.
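The cache parameters can be pictured with a small freshness check. This is only a sketch of the idea, not the library's implementation: `parse_cache_time`, `cache_is_fresh`, and the unit table are hypothetical names, and the duration strings the real parser accepts may differ.

```python
import os
import re
import tempfile
import time
from datetime import timedelta

# Hypothetical unit table; the real parser may accept other spellings.
_UNITS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400, "week": 604800}


def parse_cache_time(text):
    """Parse strings such as '1 day' or '12 hours' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]+?)s?\s*", text.lower())
    if not match or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognized cache time: {text!r}")
    return timedelta(seconds=int(match.group(1)) * _UNITS[match.group(2)])


def cache_is_fresh(path, cache_time, now=None):
    """Return True if the cache file exists and is younger than cache_time."""
    path = os.path.expanduser(path)
    if not os.path.exists(path):
        return False
    now = time.time() if now is None else now
    age_seconds = now - os.path.getmtime(path)
    return age_seconds <= parse_cache_time(cache_time).total_seconds()


# A file created a moment ago is well within a one-day window.
with tempfile.NamedTemporaryFile() as tmp:
    print(cache_is_fresh(tmp.name, "1 day"))
```

When the check fails, a tool built this way would regenerate the model data and rewrite the cache file.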
#### `OllamaData.__len__() -> int`
Returns the number of models.
#### `OllamaData.__getitem__(index: int) -> Dict[str, Any]`
Gets a model by index.
- `index`: The index of the model.
#### `OllamaData.get_model(name: str) -> Dict[str, Any]`
Gets the model by name. Returns the most specific model that starts with the given name.
- `name`: The name of the model.
#### `OllamaData.get_models() -> Dict[str, Any]`
Gets the models. Caches the model data to avoid repeated regeneration.
#### `OllamaData.search(query: str = '[*]', regex: Optional[str] = None, regex_path: str = '@') -> Dict[str, Any]`
Queries, searches, and views the models using a JMESPath query and an optional regex filter.
- `query`: The JMESPath query to filter and provide a view of the models.
- `regex`: The regex pattern to match against the output.
- `regex_path`: The JMESPath query for the regex pattern.
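To make the interplay of `query`, `regex`, and `regex_path` concrete, here is a minimal sketch in plain Python. It does not call `OllamaData` or the `jmespath` package: `project` stands in for the query `[*].{name: name, size: total_weights_size}`, `regex_filter` is a hypothetical helper, and the sample records are made up. It assumes the regex is applied to each item of the query output, which is one reading of the parameter descriptions above.

```python
import re

# Made-up sample records; real entries carry more fields.
SAMPLE_MODELS = [
    {"name": "mistral:latest", "total_weights_size": 4_100_000_000},
    {"name": "llama3:8b", "total_weights_size": 4_700_000_000},
]


def project(models):
    """Plain-Python stand-in for `[*].{name: name, size: total_weights_size}`."""
    return [{"name": m["name"], "size": m["total_weights_size"]} for m in models]


def regex_filter(items, pattern=None, key=None):
    """Keep items whose value at `key` (the whole item if key is None) matches."""
    if pattern is None:
        return items
    compiled = re.compile(pattern)
    kept = []
    for item in items:
        target = item if key is None else item.get(key)
        if target is not None and compiled.search(str(target)):
            kept.append(item)
    return kept


print(regex_filter(project(SAMPLE_MODELS), "mistral", key="name"))
```

Here `key="name"` plays the role of `regex_path="name"`, and `key=None` corresponds to the default `@`, where the pattern is tested against the whole item.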
### Usage Example
Here is an example of how to use the `OllamaData` class programmatically:
```python
import ollama_data as od
# Initialize the OllamaData object
models = od.OllamaData(cache_path='~/.ollama_data/cache', cache_time='1 day')
# Get the schema of the OllamaData object
print("Schema:", models.get_schema())
# List all models
print("Models:", models.get_models())
# Get a specific model by name
model = models.get_model('mistral')
print("Specific Model:", model['name'])
# Search models using a JMESPath query
query_result = models.search(query="[*].{name: name, size: total_weights_size}")
print("Query Result:", query_result)
# Search models using a JMESPath query and regex filter
query_regex_result = models.search(
    query="[*].{name: name, size: total_weights_size}",
    regex="mistral", regex_path="name")
print("Query Regex Result:", query_regex_result)
```
## Ollama Data Query
The `ollama_data_query` script lets you search and filter Ollama models using JMESPath queries and regular expressions. It is designed to help you explore and retrieve specific information about the models in your Ollama registry.
### Features
- Perform JMESPath queries to filter model data.
- Use regular expressions to match specific patterns within the model data.
- Print the JSON schema of the models.
- Support for piped input queries.
### Arguments
- `query`: The JMESPath query to filter results.
- `--regex`: Regular expression to match.
- `--regex-path`: The JMESPath query for the regex pattern to apply against (default: `@`).
- `--schema`: Print the JSON schema.
- `--debug`: Set logging level to DEBUG.
- `--cache-time`: Time to keep the cache file (default: `1 hour`).
- `--cache-path`: The path to the cache file (default: `~/.ollama_data/cache`).
### Usage
To perform a JMESPath query:
```sh
ollama_data_query "max_by(@, &total_weights_size).{name: name, size: total_weights_size}"
```
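For readers less familiar with JMESPath, the `max_by` query above corresponds roughly to the following plain Python. The sample records and the `largest_model` helper are illustrative assumptions; only the field names `name` and `total_weights_size` come from the query itself.

```python
# Made-up sample records standing in for the model list.
SAMPLE_MODELS = [
    {"name": "mistral:latest", "total_weights_size": 4_100_000_000},
    {"name": "llama3:8b", "total_weights_size": 4_700_000_000},
    {"name": "phi3:mini", "total_weights_size": 2_300_000_000},
]


def largest_model(models):
    """Mirror max_by(@, &total_weights_size).{name: name, size: total_weights_size}."""
    # max_by picks the element with the largest value of the given expression.
    biggest = max(models, key=lambda m: m["total_weights_size"])
    # The trailing .{...} projects that element onto the chosen keys.
    return {"name": biggest["name"], "size": biggest["total_weights_size"]}


print(largest_model(SAMPLE_MODELS))
```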
To use a regular expression to filter results:
```sh
ollama_data_query --regex "mistral:latest" --regex-path name "[*].{name: name, size: total_weights_size}"
```
To pipe a query from a file or another command:
```sh
cat query.txt | ollama_data_query
```
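Piped input of this kind is commonly handled by falling back to standard input when no positional query is given. The sketch below shows that pattern; `resolve_query` is a hypothetical helper, and the `[*]` fallback is an assumption borrowed from the default of `OllamaData.search`, not a confirmed detail of the script.

```python
import io
import sys


def resolve_query(arg_query, stdin, default="[*]"):
    """Return the positional query if given, else piped text, else the default."""
    if arg_query:
        return arg_query
    # isatty() is False when stdin is a pipe or a redirected file.
    if not stdin.isatty():
        piped = stdin.read().strip()
        if piped:
            return piped
    return default


# io.StringIO simulates `echo "[*].name" | ollama_data_query`;
# a real script would pass sys.stdin here.
print(resolve_query(None, io.StringIO("[*].name\n")))
```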
To use `--regex` and `--regex-path` with a piped query:
```sh
echo "[*].{info: { name: name, other: weights}}" | ollama_data_query --regex 14f2 --regex-path "info.other[*].file_name"
```
### Examples
#### Query for the Largest Model
```sh
ollama_data_query "max_by(@, &total_weights_size).{name: name, sz: total_weights_size}"
```
#### Filter Models Using Regex
```sh
ollama_data_query --regex "mistral|llama3" --regex-path name "[*].{name: name, size: total_weights_size}"
```
#### Pipe a Query from a File
```sh
cat query.txt | ollama_data_query
```
#### Use Regex with a Piped Query
```sh
echo "[*].{info: { name: name, other: weights}}" | ollama_data_query --regex 14f2 --regex-path "info.other[*].file_name"
```
## Ollama Data Export
The `ollama_data_export` script allows users to export Ollama models to a specified directory. This tool creates soft links for the model weights and saves the model metadata in the output directory.
### Features
- Export specified models to a self-contained directory.
- Create soft links for model weights.
- Save model metadata in JSON format.
- Enable debug logging for detailed output.
### Arguments
- `outdir`: The output directory where the models will be exported.
- `--models`: Comma-separated list of models to export. If not specified, all models will be exported.
- `--cache-path`: The path to the cache file (default: `~/.ollama_data/cache`).
- `--cache-time`: The time to keep the cache file (default: `1 day`).
- `--debug`: Enable debug logging.
- `--hash-length`: The length of the hash to use for the weight soft-links (default: `8`).
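The export step can be pictured as follows. This is a sketch under stated assumptions rather than the script's actual behavior: `export_model` is a hypothetical function, the `metadata.json` file name and directory layout are invented for illustration, and each weight entry is assumed to hold a `file_name` pointing at a blob named like `sha256-<hex digest>`.

```python
import json
import os
import tempfile


def export_model(model, outdir, hash_length=8):
    """Write metadata JSON and symlink each weight file under a short hash name."""
    model_dir = os.path.join(outdir, model["name"].replace(":", "_"))
    os.makedirs(model_dir, exist_ok=True)
    links = []
    for weight in model["weights"]:
        # Assumption: blob files are named like 'sha256-<hex digest>'.
        digest = os.path.basename(weight["file_name"]).split("-", 1)[-1]
        link = os.path.join(model_dir, digest[:hash_length])
        if not os.path.lexists(link):
            os.symlink(weight["file_name"], link)  # soft link, no copy
        links.append(link)
    with open(os.path.join(model_dir, "metadata.json"), "w") as fh:
        json.dump(model, fh, indent=2)
    return links


with tempfile.TemporaryDirectory() as tmp:
    blob = os.path.join(tmp, "sha256-14f2abcd9876")
    open(blob, "w").close()  # empty placeholder for a weight file
    model = {"name": "mistral:latest", "weights": [{"file_name": blob}]}
    links = export_model(model, os.path.join(tmp, "export"), hash_length=4)
    print([os.path.basename(link) for link in links])
```

A shorter `hash_length` gives more readable link names at the cost of a higher chance that two weight files collide on the same prefix.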
### Usage
To export specified models to a directory:
```sh
ollama_data_export --models model1,model2 --outdir /path/to/export
```
To export all models to a directory:
```sh
ollama_data_export /path/to/export
```
### Examples
#### Export Specified Models
```sh
ollama_data_export --models mistral,llama3 --outdir /path/to/export
```
#### Export All Models
```sh
ollama_data_export /path/to/export
```
#### Enable Debug Logging
```sh
ollama_data_export --models mistral --outdir /path/to/export --debug
```
#### Specify Hash Length for Soft Links
```sh
ollama_data_export --models mistral --outdir /path/to/export --hash-length 2
```
## Ollama Data Adapter
The `ollama_data_adapter` script adapts Ollama models for use with other inference engines, such as `llamacpp`. It is designed to reduce friction when experimenting with local LLMs, and it integrates with the other tools for viewing, searching, and exporting Ollama models.
### Features
- List available engines and models.
- Run models with specified engines.
- Show the template for a given model.
- Pass additional arguments to the inference engine.
- Print debugging information for advanced users.
### Arguments
- `model`: The model to run.
- `engine`: The engine to use.
- `--engine-path`: The path to the engine (required).
- `--list-engines`: List available engines.
- `--list-models`: List available models.
- `--cache-path`: The path to the cache file (default: `~/.ollama_data/cache`).
- `--cache-time`: The time to keep the cache file (default: `1 day`).
- `--engine-args`: Arguments to pass through to the engine.
- `--debug`: Print debug information.
- `--show-template`: Show the template for the model.
### Usage
To list all available engines:
```sh
ollama_data_adapter --list-engines
```
To list all available models:
```sh
ollama_data_adapter --list-models
```
To show the template for a specific model:
```sh
ollama_data_adapter mistral --show-template
## The template for the model has the following forms:
## - [INST] {{ .System }} {{ .Prompt }} [/INST]
```
To run a specific model with an engine:
```sh
ollama_data_adapter model engine --engine-path /path/to/engine --engine-args 'arg1' ... 'argn'
```
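Each quoted `--engine-args` string may hold a flag together with its value, so it has to be split before it reaches the engine. The sketch below shows one way to do that with `shlex.split`; `build_engine_command` is a hypothetical helper, and the `--model` flag used to hand over the weights path is an assumption about the engine, not something this tool is confirmed to emit.

```python
import shlex


def build_engine_command(engine_path, model_path, engine_args):
    """Flatten pass-through strings into one argv list, e.g. for subprocess.run."""
    # '--model' is an assumed flag for passing the weights file to the engine.
    argv = [engine_path, "--model", model_path]
    for arg in engine_args:
        # shlex.split honors shell-style quoting inside each string.
        argv.extend(shlex.split(arg))
    return argv


cmd = build_engine_command(
    "/path/to/llamacpp",
    "/path/to/weights.gguf",
    ["--n-gpu-layers 40", '--prompt "[INST] Hello [/INST]"'],
)
print(cmd)
```

Building a list instead of a single shell string keeps the quoted prompt intact as one argument.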
### Example
To use the `llamacpp` inference engine with the `mistral` model (assuming
it is available in your `Ollama` registry), you might use the following
arguments:
```sh
# `mistral` also matches `mistral:latest`; `llamacpp` selects the engine.
# --engine-path is the path to the engine binary, e.g. ~/llamacpp/main.
# Everything after --engine-args is passed through to the engine.
ollama_data_adapter mistral llamacpp \
  --engine-path /path/to/llamacpp \
  --engine-args \
    '--n-gpu-layers 40' \
    '--prompt "[INST] You are a helpful AI assistant. [/INST]"'
```
The `--prompt` pass-through argument follows the template shown by
`ollama_data_adapter mistral --show-template`.
Getting the prompt format right is left to the user. These models are
sensitive to how they are prompted, so some experimentation may be
necessary.
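As a starting point for that experimentation, a prompt can be assembled from the template text programmatically. The sketch below handles only simple `{{ .Name }}` placeholders like those in the `mistral` template shown earlier; `fill_template` is a hypothetical helper, and templates that use conditionals or other Go template features would need real template handling.

```python
import re


def fill_template(template, **values):
    """Replace placeholders such as '{{ .Prompt }}' with the given values."""

    def substitute(match):
        # '{{ .System }}' looks up the keyword argument 'system'.
        return values.get(match.group(1).lower(), "")

    filled = re.sub(r"\{\{\s*\.(\w+)\s*\}\}", substitute, template)
    # Collapse the gaps left behind by empty fields.
    return re.sub(r"\s+", " ", filled).strip()


template = "[INST] {{ .System }} {{ .Prompt }} [/INST]"
print(fill_template(template, system="You are a helpful AI assistant.", prompt="Hello"))
```

Note that the whitespace collapsing also flattens newlines inside the values, which is acceptable for short prompts but not for multi-line ones.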
You may also want to use `ollama_data_query` to show the system message
or other properties of a model, so that you can further customize the
pass-through arguments to better fit its training data.
## Contributing
Contributions are welcome! Please submit a pull request or open an issue to discuss changes.
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
## Author
Alex Towell
- Email: lex@metafunctor.com
- Twitter: [@queelius](https://twitter.com/queelius)
- Website: [metafunctor](https://metafunctor.com)
- GitHub: [@queelius](https://github.com/queelius)