lighthouz


- **Name**: lighthouz
- **Version**: 0.0.5
- **Summary**: Lighthouz AI Python SDK
- **Upload time**: 2024-02-12 07:27:52
- **Requires Python**: >=3.8
- **License**: MIT
- **Keywords**: lighthouz, python, sdk, api, development, evaluation
- **Requirements**: No requirements were recorded.

<div align="center">
  <img src="https://lighthouz.ai/lighthouz-logo.png" alt="lighthouz" width="50%"/>
</div>

<div align="center">

![PyPI - Version](https://img.shields.io/pypi/v/lighthouz?label=lighthouz&link=https%3A%2F%2Fpypi.org%2Fproject%2Flighthouz)
[![Docs](https://img.shields.io/badge/docs-lighthouz%20docs-green)](https://www.lighthouz.ai/docs/)
[![GitHub](https://img.shields.io/badge/github-Lighthouz_AI-blue)](https://github.com/Lighthouz-AI)

[Website](https://lighthouz.ai/) | [Installation](#installation) | [Quick Usage](#quick-usage) | [Documentation](https://www.lighthouz.ai/docs/)

</div>

Lighthouz AI is an AI benchmark data generation, evaluation, and security platform. It is designed to help
developers evaluate the reliability and improve the capabilities of their Large Language Model (LLM)
applications.

## Key Features

Lighthouz has the following features:

### 1. AutoBench: Create AI-assisted custom benchmarks for security, privacy, and reliability

- **Create Benchmarks**: AutoBench creates application-specific and task-specific benchmark test cases to assess
  critical security, privacy, and reliability aspects of your LLM app.
- **Flexibility**: Tailor-made benchmarks to suit your specific evaluation needs.
- **Integration with your own benchmarks**: Seamlessly upload and incorporate your pre-existing benchmarks.

### 2. Eval Studio: Evaluate LLM Applications for security, privacy, and reliability

- **Comprehensive Analysis**: Thoroughly assess your LLM application for hallucinations, toxicity, out-of-context
  responses, PII data leaks, and prompt injections.
- **Insightful Feedback**: Gain valuable insights to refine your application.
- **Comparative Analysis**: Effortlessly compare different LLM apps and versions.
- **Customization**: Test the impact on performance of prompts, LLMs, hyperparameters, etc.

## Installation

```bash
pip install lighthouz
```

## Quick Usage

### Initialization

```python
from lighthouz import Lighthouz

LH = Lighthouz("lighthouz_api_key")  # replace with your lighthouz api key
```
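
To avoid hard-coding the key in source files, it can be read from the environment first. The following is a minimal sketch; the variable name `LIGHTHOUZ_API_KEY` and the helper `read_api_key` are assumptions made for illustration and are not part of the SDK.

```python
import os
from typing import Mapping, Optional


def read_api_key(env: Optional[Mapping[str, str]] = None,
                 var_name: str = "LIGHTHOUZ_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is an assumption of this sketch, not an SDK convention.
    """
    source = os.environ if env is None else env
    key = source.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before creating the Lighthouz client")
    return key


# Usage with the SDK (requires `pip install lighthouz`):
#   from lighthouz import Lighthouz
#   LH = Lighthouz(read_api_key())
```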

### AutoBench: Create custom benchmarks

To generate a benchmark, use the `generate_benchmark` method of the `Benchmark` class.

This generates and stores a benchmark spanning the categories passed in `benchmark_categories`. The benchmark is a
collection of unit tests, called Prompt Unit Tests. Each unit test contains an input prompt, an expected response (if
applicable), context (if applicable), and the corresponding file name (if applicable).

```python
from lighthouz.benchmark import Benchmark

lh_benchmark = Benchmark(LH)  # LH: Lighthouz instance initialized with Lighthouz API key
benchmark_data = lh_benchmark.generate_benchmark(
    file_path="pdf_file_path",
    benchmark_categories=["rag_benchmark", "out_of_context", "prompt_injection", "pii_leak"]
)
benchmark_id = benchmark_data.get("benchmark_id")
print(benchmark_id)
```

The possible `benchmark_categories` options are:

* "rag_benchmark": this creates two hallucination benchmarks, namely Hallucination: direct questions and Hallucination:
  indirect questions.
* "out_of_context": this benchmark contains out-of-context prompts to test whether the LLM app responds to irrelevant
  queries.
* "prompt_injection": this benchmark contains prompt injection prompts testing whether the LLM behavior can be
  manipulated.
* "pii_leak": this benchmark contains prompts testing whether the LLM can leak PII data.

The resulting data, when viewed on the Lighthouz platform, looks as follows:
![AutoBench](https://lighthouz.ai/assets/images/autobench-ca4f6afca2405f37ce0de8f1e0c68f8e.png)
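
To make the shape of a Prompt Unit Test concrete, the four fields described earlier can be modelled as a small dataclass. This is an illustrative container only: the class and field names are assumptions and do not reflect the SDK's internal types or the keys of the data it returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PromptUnitTest:
    """Illustrative container; not a class exported by the lighthouz package."""
    input_prompt: str                        # always present
    expected_response: Optional[str] = None  # only if applicable
    context: Optional[str] = None            # only if applicable
    file_name: Optional[str] = None          # only if applicable


# A test that has an expected response and a source file but no context.
example = PromptUnitTest(
    input_prompt="What does the Mac line include?",
    expected_response="Laptops MacBook Air and MacBook Pro, as well as desktops.",
    file_name="pdf_file_path",
)
```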

### Evaluate a RAG Application on a Benchmark Dataset

The following shows how to use the `Evaluation` class to evaluate a RAG system. Create an `Evaluation` instance from a
Lighthouz instance (itself initialized with your API key), then call the `evaluate_rag_model` method with a response
function, a benchmark ID, and an app ID.

```python
from lighthouz.evaluation import Evaluation

evaluation = Evaluation(LH)  # LH: Lighthouz instance initialized with Lighthouz API key
e_single = evaluation.evaluate_rag_model(
    response_function=llamaindex_rag_query_function,
    benchmark_id="lighthouz_benchmark_id",  # replace with benchmark id
    app_id="lighthouz_app_id",  # replace with the app id
)
print(e_single)
```

The evaluation results, when viewed on the Lighthouz platform, look as follows:
![Evaluation results](https://lighthouz.ai/assets/images/eval-one-a075376733a726a70d0941e034f30a07.png)

Individual test cases are shown as:
![Evaluation test case details](https://lighthouz.ai/assets/images/eval-one-detail-677b266dfb731953f826ef91aefe83b9.png)
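
The `response_function` passed to `evaluate_rag_model` is not defined in the snippet above. The sketch below shows one possible shape for it, assuming the function receives a query string and returns the application's answer as a string; check the Lighthouz documentation for the exact contract. The toy corpus and the word-overlap retrieval are placeholders for a real RAG pipeline.

```python
import re
from typing import List, Set

# Toy corpus standing in for a real vector store or index.
DOCUMENTS: List[str] = [
    "The Mac line includes laptops MacBook Air and MacBook Pro.",
    "The iPad line includes iPad Pro, iPad Air, iPad and iPad mini.",
]


def _tokens(text: str) -> Set[str]:
    """Lower-case the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_rag_query_function(query: str) -> str:
    """Return the stored passage sharing the most words with the query."""
    query_tokens = _tokens(query)
    return max(DOCUMENTS, key=lambda doc: len(query_tokens & _tokens(doc)))
```

Under that assumption, `toy_rag_query_function` could be passed wherever `llamaindex_rag_query_function` appears above.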

### Use Lighthouz Eval Endpoint to Evaluate a Single RAG Query

Add your Lighthouz API key before running the following code:

```bash
curl -X POST "https://lighthouz.ai/api/api/evaluate_query" \
-H "api-key: YOUR LH API KEY" \
-H "Content-Type: application/json" \
-d '{
    "app_name": "gpt-4-0613",
    "query": "What is the Company'\''s line of personal computers based on its macOS operating system and what does it include?",
    "expected_response": "The Mac line includes laptops MacBook Air and MacBook Pro, as well as desktops iMac, Mac mini, Mac Studio and Mac Pro.",
    "generated_response": "The Company'\''s line of personal computers based on its macOS operating system is Mac.",
    "context": "s the Company’s line of smartphones based on its iOS operating system. The iPhone line includes iPhone 14 Pro, iPhone 14, iPhone 13, iPhone SE®, iPhone 12 and iPhone 11. Mac Mac® is the Company’s line of personal computers based on its macOS® operating system. The Mac line includes laptops MacBook Air® and MacBook Pro®, as well as desktops iMac®, Mac mini®, Mac Studio™ and Mac Pro®. iPad iPad® is the Company’s line of multipurpose tablets based on its iPadOS® operating system. The iPad line includes iPad Pro®, iPad Air®, iPad and iPad mini®. Wearables, Home and Accessories Wearables, Home and Accessories includes: •AirPods®, the Company’s wireless headphones, including AirPods, AirPods Pro® and AirPods Max™; •Apple TV®, the Company’s media streaming and gaming device based on its tvOS® operating system, including Apple TV 4K and Apple TV HD; •Apple Watch®, the Company’s line of smartwatches based on its watchOS® operating system, including Apple Watch Ultra ™, Apple Watch Series 8 and Apple Watch SE®; and •Beats® products, HomePod mini® and accessories. Apple Inc. | 2022 Form 10-K | 1"
}'
```
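
The same request can be sketched in Python with only the standard library. The URL, header names, and payload keys below are copied from the curl command; the function name `build_eval_request` is an assumption of this sketch. The function only builds the request so that it can be inspected before anything is sent.

```python
import json
import urllib.request

# URL copied verbatim from the curl command above.
EVAL_URL = "https://lighthouz.ai/api/api/evaluate_query"


def build_eval_request(api_key, app_name, query, expected_response,
                       generated_response, context):
    """Build (but do not send) the POST request shown in the curl command."""
    payload = {
        "app_name": app_name,
        "query": query,
        "expected_response": expected_response,
        "generated_response": generated_response,
        "context": context,
    }
    return urllib.request.Request(
        EVAL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# To send it and decode the JSON reply:
#   with urllib.request.urlopen(request) as resp:
#       result = json.load(resp)
```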

The returned result, in JSON format, is as follows:

```json
{
  "_id":"65c99a2a3ddb41f89115d327",
  "app_id":"65b6c0af56ecfafc9440b970",
  "app_title":"gpt-4-0613",
  "user_id":"658066787e7ab545580c0a98",
  "query":"What is the Company's line of personal computers based on its macOS operating system and what does it include?",
  "source_context":"s the Company\u2019s line of smartphones based on its iOS operating system. The iPhone line includes iPhone 14 Pro, iPhone 14, iPhone 13, iPhone SE\u00ae, iPhone 12 and iPhone 11. Mac Mac\u00ae is the Company\u2019s line of personal computers based on its macOS\u00ae operating system. The Mac line includes laptops MacBook Air\u00ae and MacBook Pro\u00ae, as well as desktops iMac\u00ae, Mac mini\u00ae, Mac Studio\u2122 and Mac Pro\u00ae. iPad iPad\u00ae is the Company\u2019s line of multipurpose tablets based on its iPadOS\u00ae operating system. The iPad line includes iPad Pro\u00ae, iPad Air\u00ae, iPad and iPad mini\u00ae. Wearables, Home and Accessories Wearables, Home and Accessories includes: \u2022AirPods\u00ae, the Company\u2019s wireless headphones, including AirPods, AirPods Pro\u00ae and AirPods Max\u2122; \u2022Apple TV\u00ae, the Company\u2019s media streaming and gaming device based on its tvOS\u00ae operating system, including Apple TV 4K and Apple TV HD; \u2022Apple Watch\u00ae, the Company\u2019s line of smartwatches based on its watchOS\u00ae operating system, including Apple Watch Ultra \u2122, Apple Watch Series 8 and Apple Watch SE\u00ae; and \u2022Beats\u00ae products, HomePod mini\u00ae and accessories. Apple Inc. | 2022 Form 10-K | 1",
  "expected_output":"The Mac line includes laptops MacBook Air and MacBook Pro, as well as desktops iMac, Mac mini, Mac Studio and Mac Pro.",
  "generated_output":"The Company's line of personal computers based on its macOS operating system is Mac.",
  "put_type":"Hallucination: Direct Question",
  "created_at":"Mon, 12 Feb 2024 04:10:18 GMT",
  "alerts":[],
  "label":"correct but incomplete",
  "passed":null,
  "similarity_score":0.6825323104858398,
  "conciseness_score":0.711864406779661,
  "query_toxicity_score":0.0008218016009777784,
  "generated_response_toxicity_score":0.0009125719661824405,
  "prompt_injection_score":0.00042253732681274414,
  "query_pii":[],
  "generated_response_pii":[],
}
```
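
Once decoded into a dictionary, a result like the one above can be screened automatically. The sketch below uses only field names that appear in the sample response. The helper `flag_result`, the `0.5` threshold, and the reading that a higher score means higher risk are all assumptions, not documented behaviour.

```python
from typing import Any, Dict, List

# Score fields from the sample response; this sketch treats higher as riskier.
RISK_SCORE_FIELDS = (
    "query_toxicity_score",
    "generated_response_toxicity_score",
    "prompt_injection_score",
)
PII_FIELDS = ("query_pii", "generated_response_pii")


def flag_result(result: Dict[str, Any], max_risk: float = 0.5) -> List[str]:
    """Return human-readable warnings for a single evaluation result."""
    warnings: List[str] = []
    for field in RISK_SCORE_FIELDS:
        if result.get(field, 0.0) > max_risk:
            warnings.append(f"{field} above {max_risk}")
    for field in PII_FIELDS:
        if result.get(field):  # a non-empty list means PII was reported
            warnings.append(f"{field} is not empty")
    return warnings
```

With the values in the sample response, every score is far below the threshold and both PII lists are empty, so no warning is produced.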

## Quick Start Examples

[Evaluation of a RAG app built with LangChain](https://lighthouz.ai/docs/examples/langchain-example)

[Evaluation of a RAG app built with LlamaIndex](https://lighthouz.ai/docs/examples/llamaindex-example)

[Evaluation of a RAG app hosted on an API endpoint](https://lighthouz.ai/docs/examples/api-example)

## Contact

For any queries, reach out to contact@lighthouz.ai.


            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "lighthouz",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "lighthouz,python,sdk,api,development,evaluation",
    "author": "",
    "author_email": "\"Lighthouz AI, Inc\" <srijan@lighthouz.ai>",
    "download_url": "https://files.pythonhosted.org/packages/2c/23/ecc0da91543d7eee63f61aebf152c3d3c98ebfae67f3ce11d389bf5eb39e/lighthouz-0.0.5.tar.gz",
    "platform": null,
    "description": "<div align=\"center\">\n  <img src=\"https://lighthouz.ai/lighthouz-logo.png\" alt=\"lighthouz\" width=\"50%\"/>\n</div>\n\n<div align=\"center\">\n\n![PyPI - Version](https://img.shields.io/pypi/v/lighthouz?label=lighthouz&link=https%3A%2F%2Fpypi.org%2Fproject%2Flighthouz)\n[![Docs](https://img.shields.io/badge/docs-lighthouz%20docs-green)](https://www.lighthouz.ai/docs/)\n[![GitHub](https://img.shields.io/badge/github-Lighthouz_AI-blue)](https://github.com/Lighthouz-AI)\n\n[Website](https://lighthouz.ai/) | [Installation](#installation) | [Quick Usage](#quick-usage) | [Documentation](https://www.lighthouz.ai/docs/)\n\n</div>\n\nLighthouz AI is a AI benchmark data generation, evaluation, and security platform. It is meticulously designed to aid\ndevelopers in both evaluating the reliability and enhancing the capabilities of their Language Learning Model (LLM)\napplications.\n\n## Key Features\n\nLighthouz has the following features:\n\n### 1. AutoBench: Create AI-assisted custom benchmarks for security, privacy, and reliability\n\n- **Create Benchmarks**: AutoBench creates application-specific and task-specific benchmark test cases to assess\n  critical security, privacy, and reliability aspects of your LLM app.\n- **Flexibility**: Tailor-made benchmarks to suit your specific evaluation needs.\n- **Integration with your own benchmarks**: Seamlessly upload and incorporate your pre-existing benchmarks.\n\n### 2. 
Eval Studio: Evaluate LLM Applications for security, privacy, and reliability\n\n- **Comprehensive Analysis**: Thoroughly assess your LLM application for hallucinations, toxicity, out-of-context\n  responses, PII data leaks, and prompt injections.\n- **Insightful Feedback**: Gain valuable insights to refine your application.\n- **Comparative Analysis**: Effortlessly compare different LLM apps and versions.\n- **Customization**: Test the impact on performance of prompts, LLMs, hyperparameters, etc.\n\n## Installation\n\n```bash\npip install lighthouz\n```\n\n## Quick Usage\n\n### Initialization\n\n```python\nfrom lighthouz import Lighthouz\n\nLH = Lighthouz(\"lighthouz_api_key\")  # replace with your lighthouz api key\n```\n\n### AutoBench: Create custom benchmarks\n\nTo generate a benchmark, use the generate_benchmark function under the Benchmark class.\n\nThis generates and stores a benchmark spanning benchmark_category categories. The benchmark is a collection of unit\ntests, called Prompt Unit Tests. 
Each unit test contains an input prompt, an expected response (if applicable),\ncontext (if applicable), and corresponding file name (if applicable).\n\n```python\nfrom lighthouz.benchmark import Benchmark\n\nlh_benchmark = Benchmark(LH)  # LH: Lighthouz instance initialized with Lighthouz API key\nbenchmark_data = lh_benchmark.generate_benchmark(\n    file_path=\"pdf_file_path\",\n    benchmark_categories=[\"rag_benchmark\", \"out_of_context\", \"prompt_injection\", \"pii_leak\"]\n)\nbenchmark_id = benchmark_data.get(\"benchmark_id\")\nprint(benchmark_id)\n```\n\nThe possible `benchmark_categories` options are:\n\n* \"rag_benchmark\": this creates two hallucination benchmarks, namely Hallucination: direct questions and Hallucination:\n  indirect questions.\n* \"out_of_context\": this benchmark contains out-of-context prompts to test whether the LLM app responds to irrelevant\n  queries.\n* \"prompt_injection\": this benchmark contains prompt injection prompts testing whether the LLM behavior can be\n  manipulated.\n* \"pii_leak\": this benchmark contains prompts testing whether the LLM can leak PII data.\n\nThe resulting data, when viewed on Lighthouz platform, looks as follows:\n![AutoBench](https://lighthouz.ai/assets/images/autobench-ca4f6afca2405f37ce0de8f1e0c68f8e.png)\n\n### Evaluate a RAG Application on a Benchmark Dataset\n\nThe following shows how to use the Evaluation class from Lighthouz to evaluate a RAG system. 
It involves initializing an\nevaluation instance with a Lighthouz API key and using the evaluate_rag_model method with a response function, benchmark\nID, and app ID.\n\n```python\nfrom lighthouz.evaluation import Evaluation\n\nevaluation = Evaluation(LH)  # LH: Lighthouz instance initialized with Lighthouz API key\ne_single = evaluation.evaluate_rag_model(\n    response_function=llamaindex_rag_query_function,\n    benchmark_id=\"lighthouz_benchmark_id\",  # replace with benchmark id\n    app_id=\"lighthouz_app_id\",  # replace with the app id\n)\nprint(e_single)\n```\n\nThe evaluation results, when viewed on Lighthouz platform, look as follows:\n![AutoBench](https://lighthouz.ai/assets/images/eval-one-a075376733a726a70d0941e034f30a07.png)\n\nIndividual test cases are shown as:\n![AutoBench](https://lighthouz.ai/assets/images/eval-one-detail-677b266dfb731953f826ef91aefe83b9.png)\n\n### Use Lighthouz Eval Endpoint to Evaluate a Single RAG Query\n\nAdd your Lighthouz API key before running the following code:\n\n```bash\ncurl -X POST \"https://lighthouz.ai/api/api/evaluate_query\" \\\n-H \"api-key: YOUR LH API KEY\" \\\n-H \"Content-Type: application/json\" \\\n-d '{\n    \"app_name\": \"gpt-4-0613\",\n    \"query\": \"What is the Company'\\''s line of personal computers based on its macOS operating system and what does it include?\",\n    \"expected_response\": \"The Mac line includes laptops MacBook Air and MacBook Pro, as well as desktops iMac, Mac mini, Mac Studio and Mac Pro.\",\n    \"generated_response\": \"The Company'\\''s line of personal computers based on its macOS operating system is Mac.\",\n    \"context\": \"s the Company\u2019s line of smartphones based on its iOS operating system. The iPhone line includes iPhone 14 Pro, iPhone 14, iPhone 13, iPhone SE\u00ae, iPhone 12 and iPhone 11. Mac Mac\u00ae is the Company\u2019s line of personal computers based on its macOS\u00ae operating system. 
The Mac line includes laptops MacBook Air\u00ae and MacBook Pro\u00ae, as well as desktops iMac\u00ae, Mac mini\u00ae, Mac Studio\u2122 and Mac Pro\u00ae. iPad iPad\u00ae is the Company\u2019s line of multipurpose tablets based on its iPadOS\u00ae operating system. The iPad line includes iPad Pro\u00ae, iPad Air\u00ae, iPad and iPad mini\u00ae. Wearables, Home and Accessories Wearables, Home and Accessories includes: \u2022AirPods\u00ae, the Company\u2019s wireless headphones, including AirPods, AirPods Pro\u00ae and AirPods Max\u2122; \u2022Apple TV\u00ae, the Company\u2019s media streaming and gaming device based on its tvOS\u00ae operating system, including Apple TV 4K and Apple TV HD; \u2022Apple Watch\u00ae, the Company\u2019s line of smartwatches based on its watchOS\u00ae operating system, including Apple Watch Ultra \u2122, Apple Watch Series 8 and Apple Watch SE\u00ae; and \u2022Beats\u00ae products, HomePod mini\u00ae and accessories. Apple Inc. | 2022 Form 10-K | 1\"\n}'\n```\n\nThe returned result, in json format, is as follows:\n\n```json\n{\n  \"_id\":\"65c99a2a3ddb41f89115d327\",\n  \"app_id\":\"65b6c0af56ecfafc9440b970\",\n  \"app_title\":\"gpt-4-0613\",\n  \"user_id\":\"658066787e7ab545580c0a98\",\n  \"query\":\"What is the Company's line of personal computers based on its macOS operating system and what does it include?\",\n  \"source_context\":\"s the Company\\u2019s line of smartphones based on its iOS operating system. The iPhone line includes iPhone 14 Pro, iPhone 14, iPhone 13, iPhone SE\\u00ae, iPhone 12 and iPhone 11. Mac Mac\\u00ae is the Company\\u2019s line of personal computers based on its macOS\\u00ae operating system. The Mac line includes laptops MacBook Air\\u00ae and MacBook Pro\\u00ae, as well as desktops iMac\\u00ae, Mac mini\\u00ae, Mac Studio\\u2122 and Mac Pro\\u00ae. iPad iPad\\u00ae is the Company\\u2019s line of multipurpose tablets based on its iPadOS\\u00ae operating system. 
The iPad line includes iPad Pro\\u00ae, iPad Air\\u00ae, iPad and iPad mini\\u00ae. Wearables, Home and Accessories Wearables, Home and Accessories includes: \\u2022AirPods\\u00ae, the Company\\u2019s wireless headphones, including AirPods, AirPods Pro\\u00ae and AirPods Max\\u2122; \\u2022Apple TV\\u00ae, the Company\\u2019s media streaming and gaming device based on its tvOS\\u00ae operating system, including Apple TV 4K and Apple TV HD; \\u2022Apple Watch\\u00ae, the Company\\u2019s line of smartwatches based on its watchOS\\u00ae operating system, including Apple Watch Ultra \\u2122, Apple Watch Series 8 and Apple Watch SE\\u00ae; and \\u2022Beats\\u00ae products, HomePod mini\\u00ae and accessories. Apple Inc. | 2022 Form 10-K | 1\",\n  \"expected_output\":\"The Mac line includes laptops MacBook Air and MacBook Pro, as well as desktops iMac, Mac mini, Mac Studio and Mac Pro.\",\n  \"generated_output\":\"The Company's line of personal computers based on its macOS operating system is Mac.\",\n  \"put_type\":\"Hallucination: Direct Question\",\n  \"created_at\":\"Mon, 12 Feb 2024 04:10:18 GMT\",\n  \"alerts\":[],\n  \"label\":\"correct but incomplete\",\n  \"passed\":null,\n  \"similarity_score\":0.6825323104858398,\n  \"conciseness_score\":0.711864406779661,\n  \"query_toxicity_score\":0.0008218016009777784,\n  \"generated_response_toxicity_score\":0.0009125719661824405,\n  \"prompt_injection_score\":0.00042253732681274414,\n  \"query_pii\":[],\n  \"generated_response_pii\":[],\n}\n```\n\n## Quick Start Examples\n\n[Evaluation of a RAG app built with LangChain](https://lighthouz.ai/docs/examples/langchain-example)\n\n[Evaluation of a RAG app built with LlamaIndex](https://lighthouz.ai/docs/examples/llamaindex-example)\n\n[Evaluation of a RAG app hosted on an API endpoint](https://lighthouz.ai/docs/examples/api-example)\n\n## Contact\n\nFor any queries, reach out to contact@lighthouz.ai\n\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Lighthouz AI Python SDK",
    "version": "0.0.5",
    "project_urls": {
        "Documentation": "https://www.lighthouz.ai/docs",
        "Homepage": "https://www.lighthouz.ai",
        "Issues": "https://github.com/Lighthouz-AI/lighthouz_sdk/issues",
        "Source": "https://github.com/Lighthouz-AI/lighthouz_sdk"
    },
    "split_keywords": [
        "lighthouz",
        "python",
        "sdk",
        "api",
        "development",
        "evaluation"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d66cc1b2b64d7f0762d5ae90cf54990970f095008b08ae5a9f61212f0386d75b",
                "md5": "05c81418ca70671d919b1bc5d3d6a8c9",
                "sha256": "e530cf182aeb038a20c10912fd3bbaced9b641cee0f39a2beaf27d07abc45ca0"
            },
            "downloads": -1,
            "filename": "lighthouz-0.0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "05c81418ca70671d919b1bc5d3d6a8c9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 9489,
            "upload_time": "2024-02-12T07:27:49",
            "upload_time_iso_8601": "2024-02-12T07:27:49.624049Z",
            "url": "https://files.pythonhosted.org/packages/d6/6c/c1b2b64d7f0762d5ae90cf54990970f095008b08ae5a9f61212f0386d75b/lighthouz-0.0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2c23ecc0da91543d7eee63f61aebf152c3d3c98ebfae67f3ce11d389bf5eb39e",
                "md5": "737dedec1b2817ce78d4b376db04c6a3",
                "sha256": "eb1c1b9be9880f42a8cc57250f4470169b6344dce6d2db064c806d3e0dcd9460"
            },
            "downloads": -1,
            "filename": "lighthouz-0.0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "737dedec1b2817ce78d4b376db04c6a3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 11573,
            "upload_time": "2024-02-12T07:27:52",
            "upload_time_iso_8601": "2024-02-12T07:27:52.726365Z",
            "url": "https://files.pythonhosted.org/packages/2c/23/ecc0da91543d7eee63f61aebf152c3d3c98ebfae67f3ce11d389bf5eb39e/lighthouz-0.0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-12 07:27:52",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Lighthouz-AI",
    "github_project": "lighthouz_sdk",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "lighthouz"
}
        