- Name: promptwright
- Version: 0.1.5
- Summary: A tool for generating and managing prompts for local LLMs using Ollama
- Upload time: 2024-10-27 09:15:08
- Requires Python: >=3.11

# promptwright - Large Dataset Generation

[![Tests](https://github.com/StacklokLabs/promptwright/actions/workflows/test.yml/badge.svg)](https://github.com/StacklokLabs/promptwright/actions/workflows/test.yml)
[![Python Version](https://img.shields.io/pypi/pyversions/promptwright.svg)](https://pypi.org/project/promptwright/)

<img src="https://github.com/StacklokLabs/promptwright/releases/download/0.0.0/image.png" width="350" height="350">


promptwright is a Python library from [Stacklok](https://stacklok.com) designed for generating large datasets using a local LLM via Ollama. The library offers a flexible and easy-to-use interface
for generating structured datasets.

This project was inspired by [redotvideo/pluto](https://github.com/redotvideo/pluto).
It started as a fork but ended up largely a rewrite, with a focus on generating
large datasets against a local LLM rather than OpenAI, where costs can become
prohibitive.

## Features

- **Local LLM Client Integration**: Interact with Ollama models using the `LocalDataEngine`.
- **Test and Validate Connections**: Verify the connection to the LLM and confirm that models are available before generating datasets.
- **Configurable Instructions and Prompts**: Define custom instructions and system prompts to fine-tune the behavior of the LLM.
- **Push to Hugging Face**: Push the generated dataset to Hugging Face in Parquet
format.


## Getting Started

### Prerequisites

- Python 3.11+
- `promptwright` library installed
- Ollama CLI installed and running (see [Ollama Installation](https://ollama.com/))
- A model pulled via Ollama (see [Model Compatibility](#model-compatibility))

### Installation

To install the prerequisites, you can use the following commands:

```bash
pip install promptwright
ollama serve # runs the server in the foreground; use a second terminal for the next command
ollama pull {model_name} # whichever model you want to use
```
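
Before generating data, it can help to confirm that the server is reachable and that the model has actually been pulled. The sketch below is not part of promptwright. It assumes Ollama's default address `http://localhost:11434` and uses its `/api/tags` endpoint, which lists locally available models; the helper names `installed_models` and `model_available` are hypothetical.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default address


def installed_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list[str]:
    """Return the names of locally pulled models, as reported by /api/tags."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        payload = json.load(resp)
    return [model["name"] for model in payload.get("models", [])]


def model_available(model_name: str, names: list[str]) -> bool:
    """True if model_name is among the pulled models; a bare name implies ':latest'."""
    wanted = model_name if ":" in model_name else f"{model_name}:latest"
    return wanted in names


if __name__ == "__main__":
    try:
        names = installed_models()
    except (urllib.error.URLError, OSError) as exc:
        print(f"Ollama is not reachable at {OLLAMA_URL}: {exc}")
    else:
        print("llama3.2:latest available:", model_available("llama3.2:latest", names))
```

If the script reports that the server is unreachable, start `ollama serve`; if the model is missing, run `ollama pull` for it.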

### Example Usage

There are a few examples in the `examples` directory that demonstrate how to use
the library to generate different topic-based datasets.

### Running an Example

To run an example:

1. Ensure you have started Ollama by running `ollama serve`.
2. Verify that the required model is downloaded (e.g. `llama3.2:latest`).
3. Set the `model_name` in the chosen example file to the model you have downloaded.

   ```python
   # Import path assumed: these classes are taken to be exported at package level.
   from promptwright import LocalDataEngine, LocalEngineArguments

   engine = LocalDataEngine(
       args=LocalEngineArguments(
           instructions="Generate creative writing prompts and example responses.",
           system_prompt="You are a creative writing instructor providing writing prompts and example responses.",
           model_name="llama3.2:latest",  # must match a model you have pulled
           temperature=0.9,  # Higher temperature for more creative variations
           max_retries=2,
           # Excerpt: the example file may pass further arguments here.
       )
   )
   ```
4. Run your chosen example file:
   ```bash
   python examples/creative_writing.py
   ```
5. The generated dataset is saved as a JSONL file at the path passed to `dataset.save()`.

### Prompt Output Examples

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are tasked with designing an immersive virtual reality experience that transports users to a fantastical world of wonder."
    },
    {
      "role": "user",
      "content": "Create a descriptive passage about a character discovering their hidden talents."
    },
    {
      "role": "assistant",
      "content": "As she stared at the canvas, Emma's fingers hovered above the paintbrushes, as if hesitant to unleash the colors that had been locked within her. The strokes began with bold abandon, swirling blues and greens merging into a mesmerizing dance of light and shadow. With each passing moment, she felt herself becoming the art – her very essence seeping onto the canvas like watercolors in a spring storm. The world around her melted away, leaving only the vibrant symphony of color and creation."
    }
  ]
}
```
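
Each line of the saved JSONL file is expected to hold one record shaped like the example above. As a minimal sketch (not part of promptwright; the function names are hypothetical), the following reads such a file and keeps only records whose `messages` list contains entries with a known `role` and a string `content`.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def is_valid_record(record: dict) -> bool:
    """Check that a record has a non-empty list of role/content messages."""
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    return all(
        isinstance(message, dict)
        and message.get("role") in ALLOWED_ROLES
        and isinstance(message.get("content"), str)
        for message in messages
    )


def load_valid_records(path: str) -> list[dict]:
    """Parse a JSONL file, skipping blank lines, malformed JSON and invalid records."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue
            if isinstance(record, dict) and is_valid_record(record):
                records.append(record)
    return records
```

Running a check like this before uploading makes it easier to spot generations that the model returned in an unexpected shape.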

### Library Overview

#### Classes

- **Dataset**: A class for managing generated datasets.
- **LocalDataEngine**: The main engine responsible for interacting with the LLM client and generating datasets.
- **LocalEngineArguments**: A configuration class that defines the instructions, system prompt, model name, temperature, retries, and prompt templates used for generating data.
- **OllamaClient**: A client class for interacting with the Ollama API.
- **HFUploader**: A utility class for uploading datasets to Hugging Face (pass in the path to the dataset and token).
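
To make the role of the configuration fields more concrete, here is a rough sketch of a single request to Ollama's `/api/chat` endpoint built from the same kinds of values (system prompt, instructions, model name, temperature), wrapped in a bounded retry loop. This is not promptwright's implementation: `build_chat_payload` and `chat_with_retries` are hypothetical helpers, the default address is an assumption, and the library may count or handle retries differently.

```python
import json
import urllib.request


def build_chat_payload(
    model_name: str, system_prompt: str, instructions: str, temperature: float
) -> dict:
    """Assemble a non-streaming /api/chat request body."""
    return {
        "model": model_name,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": instructions},
        ],
        "stream": False,
        "options": {"temperature": temperature},
    }


def chat_with_retries(
    payload: dict, max_retries: int = 2, base_url: str = "http://localhost:11434"
) -> str:
    """POST the payload, retrying on network errors; return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    last_error = None
    for _ in range(max_retries + 1):  # first attempt plus max_retries retries
        try:
            with urllib.request.urlopen(request, timeout=120) as resp:
                return json.load(resp)["message"]["content"]
        except OSError as exc:  # URLError and timeouts are OSError subclasses
            last_error = exc
    raise RuntimeError(f"no response after {max_retries + 1} attempts") from last_error
```

A call such as `chat_with_retries(build_chat_payload("llama3.2:latest", "You are a creative writing instructor.", "Write one prompt.", 0.9))` needs a running Ollama server with that model pulled.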

### Troubleshooting

If you encounter any errors while running the script, here are a few common troubleshooting steps:

1. **Restart Ollama**:  
   ```bash
   killall ollama && ollama serve
   ```

2. **Verify Model Installation**:  
   ```bash
   ollama pull {model_name}
   ```

3. **Check Ollama Logs**:  
   Inspect the logs for error messages that might provide more context on
   what went wrong. They can be found in the `~/.ollama/logs` directory; on
   Linux installs managed by systemd, `journalctl -u ollama` shows them instead.

## Model Compatibility

The library should work with most LLM models. It has been tested with the
following models so far:

- **LLaMA3**: The library is designed to work with the LLaMA model, specifically
the `llama3:latest` model.
- **Mistral**: The library is compatible with the Mistral model.

If you test any more models, please make a pull request to update this list!

### Contributing

If something here could be improved, please open an issue or submit a pull request.

### License

This project is licensed under the Apache License 2.0. See the `LICENSE` file for more details.

            
