| Name | sciphi-synthesizer |
| Version | 1.0.5 |
| home_page | |
| Summary | Synthesizer: A Framework for LLM Powered Data. |
| upload_time | 2024-01-03 20:54:40 |
| maintainer | |
| docs_url | None |
| author | Owen Colegrove |
| requires_python | >=3.9,<3.12 |
| license | Apache-2.0 |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# Synthesizer[ΨΦ]: A multi-purpose LLM framework 💡
<p align="center">
<img width="716" alt="SciPhi Logo" src="https://github.com/emrgnt-cmplxty/sciphi/assets/68796651/195367d8-54fd-4281-ace0-87ea8523f982">
</p>
With Synthesizer, users can:
- **Custom Data Creation**: Generate datasets via LLMs that are tailored to your needs, for LLM training, RAG, and more.
  - LLM providers: Anthropic, OpenAI, vLLM, and HuggingFace.
- **Retrieval-Augmented Generation (RAG) on Demand**: Built-in RAG Provider Interface to anchor generated data to real-world sources.
  - Turnkey integration with Agent Search API.
---
## Fast Setup
```bash
pip install sciphi-synthesizer
```
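The package metadata declares `requires_python` as `>=3.9,<3.12`, so pip will likely refuse to install it under other interpreters. A minimal sketch for checking this up front; the helper name `is_supported_python` is hypothetical and not part of Synthesizer:

```python
import sys


def is_supported_python(version=None):
    """Return True if `version` satisfies the declared range >=3.9,<3.12."""
    # Default to the running interpreter; compare only (major, minor).
    major, minor = (version or sys.version_info)[:2]
    return (3, 9) <= (major, minor) < (3, 12)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "outside the declared range"
    print(f"Python {current} is {status}")
```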
### Using Synthesizer
1. **Generate synthetic question-answer pairs**
```bash
export SCIPHI_API_KEY=MY_SCIPHI_API_KEY
python -m synthesizer.scripts.data_augmenter run --dataset="wiki_qa"
```
```bash
tail augmented_output/config_name_eq_answer_question__dataset_name_eq_wiki_qa.jsonl
{ "formatted_prompt": "... ### Question:\nwhat country did wine originate in\n\n### Input:\n1. URL: https://en.wikipedia.org/wiki/History%20of%20wine (Score: 0.85)\nTitle:History of wine....",
{ "completion": "Wine originated in the South Caucasus, which is now part of modern-day Armenia ..."
```
2. **Evaluate RAG pipeline performance**
```bash
export SCIPHI_API_KEY=MY_SCIPHI_API_KEY
python -m synthesizer.scripts.rag_harness --rag_provider="agent-search" --llm_provider_name="sciphi" --n_samples=25
```
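The `tail` output in step 1 suggests that the augmenter writes its results as JSON Lines. A minimal sketch for loading such a file in Python, assuming each line holds one complete JSON object; the `load_jsonl` helper is hypothetical, and the `completion` field name is taken from the truncated sample output rather than from a documented schema:

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Parse a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    # Path copied from the example in step 1; adjust to your own run.
    output = Path(
        "augmented_output/config_name_eq_answer_question__dataset_name_eq_wiki_qa.jsonl"
    )
    if output.exists():
        for record in load_jsonl(output)[:3]:
            # .get() avoids a KeyError if a field is absent in your output.
            print(record.get("completion"))
```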
### Documentation
For more detailed information, tutorials, and API references, please visit the official [Synthesizer Documentation](https://sciphi.readthedocs.io/en/latest/).
### Community & Support
- Engage with our vibrant community on [Discord](https://discord.gg/j9GxfbxqAe).
- For tailored inquiries or feedback, please [email us](mailto:owen@sciphi.ai).
### Developing with Synthesizer
Quickly set up RAG-augmented generation with your choice of LLM provider (OpenAI, Anthropic, vLLM, or SciPhi):
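```python
# Requires SCIPHI_API_KEY in env
from synthesizer.core import LLMProviderName, RAGProviderName
from synthesizer.interface import LLMInterfaceManager, RAGInterfaceManager
from synthesizer.llm import GenerationConfig

# Placeholder inputs: illustrative assumptions, replace with your own values
query = "what country did wine originate in"
raw_prompt = "Answer the question using the context below.\n\n{rag_context}\n"
rag_limit_hierarchical_url_results = 50
rag_limit_final_pagerank_results = 20
llm_model_name = "gpt-3.5-turbo"  # example model name, not a recommendation
llm_max_tokens_to_sample = 256
llm_temperature = 0.1
llm_top_p = 0.95

# RAG Provider Settings
rag_interface = RAGInterfaceManager.get_interface_from_args(
    RAGProviderName("agent-search"),
    limit_hierarchical_url_results=rag_limit_hierarchical_url_results,
    limit_final_pagerank_results=rag_limit_final_pagerank_results,
)
# Retrieve supporting context for the query
rag_context = rag_interface.get_rag_context(query)

# LLM Provider Settings
llm_interface = LLMInterfaceManager.get_interface_from_args(
    LLMProviderName("openai"),
)
generation_config = GenerationConfig(
    model_name=llm_model_name,
    max_tokens_to_sample=llm_max_tokens_to_sample,
    temperature=llm_temperature,
    top_p=llm_top_p,
    # other generation params here ...
)

# Insert the retrieved context into the prompt, then generate
formatted_prompt = raw_prompt.format(rag_context=rag_context)
completion = llm_interface.get_completion(formatted_prompt, generation_config)
```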
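The last two lines of the example rely on `raw_prompt` containing a `{rag_context}` placeholder that `str.format` fills in. A small sketch of that step in isolation; the template wording, the `{query}` placeholder, and the `build_prompt` helper are assumptions for illustration, not Synthesizer's own prompt:

```python
# Hypothetical prompt template; only the {rag_context} placeholder name
# comes from the example above.
RAW_PROMPT = (
    "### Question:\n{query}\n\n"
    "### Input:\n{rag_context}\n\n"
    "### Response:\n"
)


def build_prompt(query, rag_context):
    """Fill the template with the user query and the retrieved context."""
    return RAW_PROMPT.format(query=query, rag_context=rag_context)


if __name__ == "__main__":
    print(
        build_prompt(
            query="what country did wine originate in",
            rag_context="1. URL: https://en.wikipedia.org/wiki/History%20of%20wine",
        )
    )
```

Substituted values are not re-parsed by `str.format`, so retrieved context that happens to contain braces is inserted verbatim.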
Raw data
{
"_id": null,
"home_page": "",
"name": "sciphi-synthesizer",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.9,<3.12",
"maintainer_email": "",
"keywords": "",
"author": "Owen Colegrove",
"author_email": "owen@sciphi.ai",
"download_url": "https://files.pythonhosted.org/packages/46/50/08020d06f4453916627e2a9e62e873cadbb7f21455a0ce97d22c8de7813c/sciphi_synthesizer-1.0.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Synthesizer: A Framework for LLM Powered Data.",
"version": "1.0.5",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "a1ce5e9562d6b76911186f1555b21a439ced43c2a5c18c9e5693146d807ef3fa",
"md5": "794b6fe1f8d6c64ec9f0951cd59c8b9f",
"sha256": "be1a736e8e7c7ed41f2ec2d639d1f59be46dad871e52b8e28c37798419a2aa5a"
},
"downloads": -1,
"filename": "sciphi_synthesizer-1.0.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "794b6fe1f8d6c64ec9f0951cd59c8b9f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9,<3.12",
"size": 164030,
"upload_time": "2024-01-03T20:54:38",
"upload_time_iso_8601": "2024-01-03T20:54:38.198027Z",
"url": "https://files.pythonhosted.org/packages/a1/ce/5e9562d6b76911186f1555b21a439ced43c2a5c18c9e5693146d807ef3fa/sciphi_synthesizer-1.0.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "465008020d06f4453916627e2a9e62e873cadbb7f21455a0ce97d22c8de7813c",
"md5": "668df494a0e06d321549236a21ac49c0",
"sha256": "c8f5db3e4b222edb501d7096579bac6d2659fdbc180bbdd7b6e726eb6eedd6c5"
},
"downloads": -1,
"filename": "sciphi_synthesizer-1.0.5.tar.gz",
"has_sig": false,
"md5_digest": "668df494a0e06d321549236a21ac49c0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9,<3.12",
"size": 144472,
"upload_time": "2024-01-03T20:54:40",
"upload_time_iso_8601": "2024-01-03T20:54:40.378431Z",
"url": "https://files.pythonhosted.org/packages/46/50/08020d06f4453916627e2a9e62e873cadbb7f21455a0ce97d22c8de7813c/sciphi_synthesizer-1.0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-03 20:54:40",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "sciphi-synthesizer"
}
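
The `digests` entries in the raw data can be used to check a downloaded archive before installing it. A minimal sketch using only the standard library; `sha256_of` and `verify_download` are hypothetical helper names, and the local file path is an assumption:

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Compare the file's digest with the published one (case-insensitive)."""
    return sha256_of(path) == expected_sha256.lower()


if __name__ == "__main__":
    # sha256 of the sdist as listed in the raw data above.
    expected = "c8f5db3e4b222edb501d7096579bac6d2659fdbc180bbdd7b6e726eb6eedd6c5"
    path = "sciphi_synthesizer-1.0.5.tar.gz"  # assumed local download location
    if os.path.exists(path):
        print("digest OK" if verify_download(path, expected) else "digest MISMATCH")
```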