# SpiralEval 🌀
## Introduction 🚀
The primary purpose of SpiralEval is to provide an evaluation method based on character traits: it measures how faithfully an LLM expresses a given character, which helps you iteratively refine that character.
Here are the main functionalities:
- Summarize the target person's character traits from their conversations (see the dataset sketch below).
- Evaluate SpiralEval itself against an evaluation dataset derived from the character-trait summary.
- Count how many sentences in the LLM-generated text effectively express the character traits.
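
Throughout this README, the reference dataset is a JSON file of question-answer pairs drawn from the target person's conversations. The exact schema isn't documented here, so the following is only a hedged sketch of one plausible shape:

```python
# Hypothetical reference dataset shape -- the README only tells us that
# reference data consists of question-answer pairs from the target
# person's conversations; the real schema may differ.
import json

reference_pairs = [
    {"question": "What do you do on weekends?",
     "answer": "Mostly tinkering with my telescope and rereading old sci-fi."},
    {"question": "How do you handle criticism?",
     "answer": "I laugh it off first, then quietly think it over for days."},
]
with open("your_reference_dataset.json", "w", encoding="utf-8") as fh:
    json.dump(reference_pairs, fh, ensure_ascii=False, indent=2)
```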
## Installation 🔧
The easiest way to install SpiralEval is simply by using pip:
```
pip install spiraleval
```
Magic! 🎩✨
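
If the install succeeded, the imports used throughout this tutorial should work:

```python
# Sanity check: these are the classes the tutorial below relies on.
from SpiralEval.spiraleval import EvalSummary, EvalLLMTrial, EvalLLM, EvalSpiralEval
```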
## Tutorial 📚
Now that you've got SpiralEval installed, let's see it in action!
To get an evaluation of your LLM, follow the steps below.
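
Every step reads your OpenAI API key from a file on disk. Judging from the `.txt` paths in the examples, a plain-text file containing just the key should do (this is an assumption; check the `examples` folder if in doubt):

```python
# Write your OpenAI API key to a plain-text file; every Eval* class
# takes this path as its first argument. The filename is up to you.
with open("your_openai_api_key_path.txt", "w", encoding="utf-8") as fh:
    fh.write("sk-...")  # placeholder -- paste your real key here
```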
### Step 1: Generate the summary of your character
For this, we'll use the script in `examples/simple_example.py`:
```python
from SpiralEval.spiraleval import EvalSummary
# First things first, let's set up the path for files
api_path = "your_openai_api_key_path.txt"
reference_path = 'your_reference_dataset.json'
# Now, let's create and run an EvalSummary instance
EvalSummary(api_path, reference_path).run()

# Boom! You should now have a file named character_summary.txt in the same directory.
```
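
Since the summary is written to disk, you can inspect (or hand-edit) it before moving on:

```python
# Read back the summary that EvalSummary.run() wrote to disk.
# The filename character_summary.txt comes from the step above.
with open("character_summary.txt", encoding="utf-8") as fh:
    print(fh.read())
```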
### Step 2: Try our evaluation 🪄
Evaluating your LLM over a full set of question-answer pairs incurs API costs.
Before committing to that, you can try the evaluation on a single QA pair.
```python
from SpiralEval.spiraleval import EvalLLMTrial
# Let's set up path for files
api_path = "your_openai_api_key_path.txt"
summary_path = "your_generated_summary.txt"
reference_path = 'your_reference_dataset.json'
# Now, let's create an EvalLLMTrial instance with the specified paths.
eval_llm_trial = EvalLLMTrial(api_path, summary_path, reference_path)
# You can get the character summary here and use it for the following prompt.
character_summary = eval_llm_trial.character_summary
instruction = f"""You will determine if the target sentence I provide is either
generated by a language model (False) or belongs to the set of reference sentences (True),
based on True or False criteria.
Both the reference and target sentences come as pairs of questions and
their answers. The following is the summary of his/her character. {character_summary}"""
eval_llm_trial.run(instruction)
# You will be prompted to enter a question and an answer to evaluate.
```
This trial run walks through a single question-answer pair, so you can see how the judgment prompt behaves before spending API credits on a full dataset.
### Step 3: Use our evaluation on your LLM 🧠
Once the trial looks right, run the full evaluation. `EvalLLM` judges every answer generated by your model against the reference dataset, using the character summary as context.
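
Before running, collect your model's answers into the target dataset file. A minimal sketch, assuming the target file uses the same question-answer schema as the reference set (the actual schema may differ):

```python
# Hypothetical target dataset preparation: ask your LLM the reference
# questions, then save its answers as question-answer pairs.
import json

def ask_my_llm(question: str) -> str:
    # Placeholder for your own model call.
    return "..."

questions = ["What do you do on weekends?", "How do you handle criticism?"]
target_pairs = [{"question": q, "answer": ask_my_llm(q)} for q in questions]

with open("your_target_dataset.json", "w", encoding="utf-8") as fh:
    json.dump(target_pairs, fh, ensure_ascii=False, indent=2)
```

With the files in place, the evaluation itself mirrors Step 2: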
```python
from SpiralEval.spiraleval import EvalLLM
# Let's set up path for files
api_path = "your_openai_api_key_path.txt"
summary_path = "your_generated_summary.txt"
# Put the answers generated by your LLM in the following file.
target_path = "your_target_dataset.json"
reference_path = 'your_reference_dataset.json'
eval_llm = EvalLLM(api_path, summary_path, target_path, reference_path)
character_summary = eval_llm.character_summary
instruction = f"""You will determine if the target sentence I provide is either
generated by a language model (False) or belongs to the set of reference sentences (True),
based on True or False criteria.
Both the reference and target sentences come as pairs of questions and
their answers. The following is the summary of his/her character. {character_summary}"""
eval_llm.run(instruction)
```
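
The introduction's third functionality is counting how many generated sentences stay in character. SpiralEval's return format isn't shown here, so the aggregation below is purely illustrative, with every name an assumption:

```python
# Purely illustrative: turn per-answer True/False judgments into the
# "how many sentences express the character" metric. `judgments` is a
# made-up list of booleans, one per target answer; the real output of
# EvalLLM.run() may look different.
judgments = [True, False, True, True]
in_character = sum(judgments)
print(f"{in_character}/{len(judgments)} answers judged in-character "
      f"({in_character / len(judgments):.0%})")
```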
### Additional Example: Test our evaluation system 🌊
If you have doubts about our evaluation system, you can test it!
```python
from SpiralEval.spiraleval import EvalSpiralEval
# Let's set up path for files
api_path = "your_openai_api_key_path.txt"
summary_path = "your_generated_summary.txt"
target_path = "your_target_dataset.json"
reference_path = 'your_reference_dataset.json'
evalspiraleval = EvalSpiralEval(api_path, summary_path, target_path, reference_path)
character_summary = evalspiraleval.character_summary
instruction = f"""You will determine if the target sentence I provide is either
generated by a language model (False) or belongs to the set of reference sentences (True),
based on True or False criteria.
Both the reference and target sentences come as pairs of questions and
their answers. The following is the summary of his/her character. {character_summary}"""
evalspiraleval.run(instruction)
```
With this, you can evaluate your LLM based on character traits.
And that's it, folks! You're now ready to start making your own epic conversational masterpieces with SpiralEval! Happy coding! 💻🚀
But wait, there's more! Be sure to check out the "examples" folder for more usage scenarios and ideas. We've packed it full of tips, tricks, and goodies to get you up and running in no time. 📚🔍
## Contribution 🤝
If you feel like giving back, we always welcome contributions. But remember, at SpiralEval, we're all about keeping it simple and transparent. We love that you're excited to add features!