agent-evaluation

Name: agent-evaluation
Version: 0.1.0
Home page: https://github.com/awslabs/agent-evaluation
Summary: A generative AI-powered framework for testing virtual agents.
Author: Amazon Web Services
Requires Python: >=3.9
License: Apache 2.0
Upload time: 2024-05-03 22:25:11

            ![PyPI - Version](https://img.shields.io/pypi/v/agent-evaluation)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/agent-evaluation)
![GitHub License](https://img.shields.io/github/license/awslabs/agent-evaluation)
[![security: bandit](https://img.shields.io/badge/security-bandit-yellow.svg)](https://github.com/PyCQA/bandit)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Built with Material for MkDocs](https://img.shields.io/badge/Material_for_MkDocs-526CFE?style=for-the-badge&logo=MaterialForMkDocs&logoColor=white)](https://squidfunk.github.io/mkdocs-material/)

# Agent Evaluation

Agent Evaluation is a generative AI-powered framework for testing virtual agents.

Internally, Agent Evaluation implements an LLM agent (the evaluator) that orchestrates conversations with your own agent (the target) and evaluates the responses as the conversation unfolds.
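
Evaluations are driven by declarative test plans. The sketch below is illustrative only: the key names, target type, and values are assumptions made for this example, not the verified schema; consult the project documentation for the actual format.

```yaml
# Hypothetical test plan sketch -- field and type names here are
# assumptions for illustration, not the verified agent-evaluation schema.
evaluator:
  model: claude-3            # model backing the evaluator agent (assumed key)
target:
  type: bedrock-agent        # one of the built-in target integrations (assumed)
  agent_id: MY_AGENT_ID
tests:
  returns_order_status:
    steps:
      - Ask the agent for the status of order 123.
    expected_results:
      - The agent responds with the status of order 123.
```

In this model, the evaluator reads each test's steps, drives a multi-turn conversation with the target, and judges the responses against the expected results.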

## ✨ Key features

- Built-in support for popular AWS services including [Amazon Bedrock](https://aws.amazon.com/bedrock/), [Amazon Q Business](https://aws.amazon.com/q/business/), and [Amazon SageMaker](https://aws.amazon.com/sagemaker/). You can also [bring your own agent](https://awslabs.github.io/agent-evaluation/targets/custom_targets/) to test using Agent Evaluation.
- Orchestrate concurrent, multi-turn conversations with your agent while evaluating its responses.
- Define [hooks](https://awslabs.github.io/agent-evaluation/hooks/) to perform additional tasks such as integration testing.
- Incorporate Agent Evaluation into CI/CD pipelines to shorten delivery time while keeping production agents stable.
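
As one illustration of CI/CD use, a pipeline step could install the package and run a test plan, failing the build when the evaluation does not pass. The commands below are a sketch: the exact subcommands and flags are assumptions based on the project's documented CLI, so verify them against the docs before relying on them.

```shell
# Hypothetical CI step -- verify subcommands/flags against the project docs.
pip install agent-evaluation

# Run the test plan in the working directory; a failing evaluation is
# expected to exit non-zero, which fails the pipeline stage.
agenteval run
```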

## 📚 Documentation

To get started, please visit the full documentation [here](https://awslabs.github.io/agent-evaluation/). To contribute, please refer to [CONTRIBUTING.md](./CONTRIBUTING.md).

## 👏 Contributors

Shout out to these awesome contributors:

<a href="https://github.com/awslabs/agent-evaluation/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=awslabs/agent-evaluation" />
</a>

            
