


# Agent Evaluation
Agent Evaluation is a generative AI-powered framework for testing virtual agents.
Internally, Agent Evaluation implements an LLM agent (the evaluator) that orchestrates conversations with your own agent (the target) and evaluates the target's responses as the conversation proceeds.
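The evaluator/target loop can be pictured with a small, self-contained sketch. It does not use the Agent Evaluation API: `EchoTarget`, `ScriptedEvaluator`, and `run_conversation` are hypothetical names, and a scripted substring check stands in for the LLM evaluator.

```python
from dataclasses import dataclass, field


class EchoTarget:
    """Hypothetical stand-in for the agent under test (the target)."""

    def invoke(self, prompt: str) -> str:
        return f"You said: {prompt}"


@dataclass
class ScriptedEvaluator:
    """Hypothetical stand-in for the LLM evaluator: it replays scripted user
    turns and checks each response for an expected substring."""

    steps: list  # list of (prompt, expected_substring) pairs
    transcript: list = field(default_factory=list)

    def evaluate(self, response: str, expected: str) -> bool:
        return expected in response


def run_conversation(evaluator: ScriptedEvaluator, target: EchoTarget) -> bool:
    """Drive a multi-turn conversation and judge every target response."""
    for prompt, expected in evaluator.steps:
        response = target.invoke(prompt)
        evaluator.transcript.append((prompt, response))
        if not evaluator.evaluate(response, expected):
            return False  # stop at the first failing turn
    return True


evaluator = ScriptedEvaluator(steps=[("hello", "hello"), ("book a flight", "flight")])
passed = run_conversation(evaluator, EchoTarget())
print("PASS" if passed else "FAIL")
```

In the real framework the evaluator is an LLM that decides what to say next and how to judge each response; the sketch only shows the shape of the loop.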
## ✨ Key features
- Built-in support for popular AWS services including [Amazon Bedrock](https://aws.amazon.com/bedrock/), [Amazon Q Business](https://aws.amazon.com/q/business/), and [Amazon SageMaker](https://aws.amazon.com/sagemaker/). You can also [bring your own agent](https://awslabs.github.io/agent-evaluation/targets/custom_targets/) to test using Agent Evaluation.
- Orchestrate concurrent, multi-turn conversations with your agent while evaluating its responses.
- Define [hooks](https://awslabs.github.io/agent-evaluation/hooks/) to perform additional tasks such as integration testing.
- Incorporate into CI/CD pipelines to shorten time to delivery while keeping agents stable in production environments.
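As a rough illustration of running several multi-turn conversations concurrently and gating a CI/CD job on the outcome, here is a minimal sketch using only the Python standard library. It is not the Agent Evaluation API: `fake_target`, `run_test`, `run_all`, and the test dictionary layout are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_target(prompt: str) -> str:
    """Hypothetical agent under test; replace with a call to your agent."""
    return prompt.upper()


def run_test(test: dict) -> bool:
    """Run one multi-turn conversation; every turn must contain its expected text."""
    return all(expected in fake_target(prompt) for prompt, expected in test["turns"])


def run_all(tests: list, max_workers: int = 4) -> dict:
    """Run the conversations concurrently and map test name -> pass/fail."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_test, tests))
    return {test["name"]: ok for test, ok in zip(tests, results)}


tests = [
    {"name": "greeting", "turns": [("hello", "HELLO")]},
    {"name": "booking", "turns": [("book a flight", "FLIGHT"), ("to Paris", "PARIS")]},
    {"name": "refund", "turns": [("refund please", "approved")]},  # expected to fail
]
results = run_all(tests)
# A CI job could fail the build when this exit code is nonzero.
exit_code = 0 if all(results.values()) else 1
print(results, exit_code)
```

A pipeline step would typically pass `exit_code` to `sys.exit` so that any failing conversation blocks the deployment.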
## 📚 Documentation
To get started, see the [full documentation](https://awslabs.github.io/agent-evaluation/). To contribute, refer to [CONTRIBUTING.md](./CONTRIBUTING.md).
## 👏 Contributors
Shout out to these awesome contributors:
<a href="https://github.com/awslabs/agent-evaluation/graphs/contributors">
<img src="https://contrib.rocks/image?repo=awslabs/agent-evaluation" />
</a>
Raw data
{
"_id": null,
"home_page": "https://awslabs.github.io/agent-evaluation/",
"name": "agent-evaluation",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": null,
"author": "Amazon Web Services",
"author_email": "agent-evaluation-oss-core-team@amazon.com",
"download_url": "https://files.pythonhosted.org/packages/ef/d1/4da9cb0192ddc01f4417d51cef459c1b3e72591b223c2ea7e7f7a6f28adb/agent_evaluation-0.3.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache 2.0",
"summary": "A generative AI-powered framework for testing virtual agents.",
"version": "0.3.0",
"project_urls": {
"Homepage": "https://awslabs.github.io/agent-evaluation/"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "61a212d347400e6248c689c0bab63b219d57cd8fff46774f27b100bc46801bee",
"md5": "ab1207669cd16c7884d834d79f52c990",
"sha256": "a03af4d28f5bf554939ebbc9ad2ca70f2144c0d460bf45246dd259a7c9915eae"
},
"downloads": -1,
"filename": "agent_evaluation-0.3.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ab1207669cd16c7884d834d79f52c990",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 42842,
"upload_time": "2024-12-11T15:50:42",
"upload_time_iso_8601": "2024-12-11T15:50:42.915559Z",
"url": "https://files.pythonhosted.org/packages/61/a2/12d347400e6248c689c0bab63b219d57cd8fff46774f27b100bc46801bee/agent_evaluation-0.3.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "efd14da9cb0192ddc01f4417d51cef459c1b3e72591b223c2ea7e7f7a6f28adb",
"md5": "bb671bcf4e287c6cfcbe6f91c4541456",
"sha256": "3b31891d0ddfbb15a0bddeb3fb3385586132a162f08be894e789bdcdb8dd38d2"
},
"downloads": -1,
"filename": "agent_evaluation-0.3.0.tar.gz",
"has_sig": false,
"md5_digest": "bb671bcf4e287c6cfcbe6f91c4541456",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 25947,
"upload_time": "2024-12-11T15:50:45",
"upload_time_iso_8601": "2024-12-11T15:50:45.128772Z",
"url": "https://files.pythonhosted.org/packages/ef/d1/4da9cb0192ddc01f4417d51cef459c1b3e72591b223c2ea7e7f7a6f28adb/agent_evaluation-0.3.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-11 15:50:45",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "agent-evaluation"
}