edg4llm

Name: edg4llm
Version: 1.0.14
Home page: https://github.com/alannikos/edg4llm
Summary: A unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues.
Upload time: 2025-01-11 05:58:45
Author: Alannikos
Requires Python: >=3.8
Keywords: llm, fine-tuning, data-generation, ai, nlp
Requirements: No requirements were recorded.
# EDG4LLM

<div align="center">

<!-- ![welcome](assets/welcome.png) -->
```
      __      __  __        __   ___  __     __  __   __                     
|  | |_  |   /   /  \ |\/| |_     |  /  \   |_  |  \ / _   |__| |   |   |\/| 
|/\| |__ |__ \__ \__/ |  | |__    |  \__/   |__ |__/ \__)     | |__ |__ |  | 
                                                                             
```


</div>

<div align="center">

[📘Documentation](https://github.com/Alannikos/FunGPT) |
[🛠️Quick Start](https://github.com/Alannikos/FunGPT) |
[🤔Reporting Issues](https://github.com/Alannikos/FunGPT/issues) 

</div>

<div align="center">

<!-- PROJECT SHIELDS -->
[![GitHub Issues](https://img.shields.io/github/issues/Alannikos/edg4llm?style=flat&logo=github&color=%23FF5252)](https://github.com/Alannikos/edg4llm/issues)
[![GitHub forks](https://img.shields.io/github/forks/Alannikos/edg4llm?style=flat&logo=github&color=%23FF9800)](https://github.com/Alannikos/edg4llm/forks)
![GitHub Repo stars](https://img.shields.io/github/stars/Alannikos/edg4llm?style=flat&logo=github&color=%23FFEB3B)
![GitHub License](https://img.shields.io/github/license/Alannikos/edg4llm?style=flat&logo=github&color=%234CAF50)
[![Discord](https://img.shields.io/discord/1327445853388144681?style=flat&logo=discord)](https://discord.com/channels/1327445853388144681/)
[![Bilibili](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fapi.bilibili.com%2Fx%2Frelation%2Fstat%3Fvmid%3D3494365446015137&query=%24.data.follower&style=flat&logo=bilibili&label=followers&color=%23FF69B4)](https://space.bilibili.com/3494365446015137)
[![PyPI - Version](https://img.shields.io/pypi/v/edg4llm?style=flat&logo=pypi&logoColor=blue&color=red)](https://pypi.org/project/edg4llm/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/edg4llm?color=blue&logo=pypi&logoColor=gold)](https://pypi.org/project/edg4llm/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/edg4llm?logo=python&logoColor=gold)](https://pypi.org/project/edg4llm/)
</div>


**Easy Data Generation for Large Language Models (abbreviated as EDG4LLM)** is a unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues.


## Latest News

<details open>
<summary><b>2025</b></summary>

- [2025/01/11] 👋👋 We are excited to announce [**the initial release of edg4llm v1.0.12**](https://pypi.org/project/edg4llm/1.0.12/), marking the completion of its core functionalities. 

</details>

## Table of Contents
- [Latest News](#latest-news)
- [Introduction](#introduction)
- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Requirements](#requirements)
- [Future Development Plans](#future-development-plans)
- [Acknowledgments](#acknowledgments)
- [License](#license)
- [Contact Me](#contact-me)
- [Star History](#star-history)

## Introduction
**edg4llm** is a Python library designed specifically for generating fine-tuning data using large language models. This tool aims to assist users in creating high-quality training datasets efficiently. At its current stage, it mainly supports text data generation. The generated data includes, but is not limited to:
- **Question data**
- **Answer data**
- **Dialogue data**

With **edg4llm**, users can easily produce diverse datasets tailored to fine-tuning requirements, significantly enhancing the performance of large language models in specific tasks.
## Features
EDG4LLM is a unified tool designed to simplify and accelerate the creation of fine-tuning datasets for large language models. With a focus on usability, efficiency, and adaptability, it offers a range of features to meet diverse development needs while ensuring seamless integration and robust debugging support.

1. **Simple to Use**: Provides a straightforward interface that allows users to get started without complex configurations.
2. **Lightweight**: Minimal dependencies and low resource consumption make it efficient and easy to use.
3. **Flexibility**: Supports a variety of data formats and generation options, allowing customization to meet specific needs.
4. **Compatibility**: Seamlessly integrates with mainstream large language models and is suitable for various development scenarios.
5. **Transparent Debugging**: Provides clear and detailed log outputs, making it easy to debug and trace issues effectively.

## Installation
To install **edg4llm**, simply run the following command in your terminal:


```bash
pip install edg4llm
```

### Supported Python Versions
- Python 3.8 or higher is required for compatibility with this library. Ensure your environment meets this requirement; a quick check is shown below.
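
If you want to confirm that the interpreter you are using is new enough, a minimal standard-library check (not part of the edg4llm API) looks like this:

```python
import sys

# edg4llm requires Python 3.8 or newer.
if sys.version_info < (3, 8):
    raise RuntimeError(f"Python 3.8+ required, found {sys.version.split()[0]}")
print("Python version OK:", sys.version.split()[0])
```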

### Supported LLM Providers
The current version of edg4llm supports the following large language model providers:
- [**InternLM**](https://github.com/InternLM)
    - Developer: Shanghai Artificial Intelligence Laboratory.
    - Advantages: InternLM is a series of open-source large language models that offer outstanding reasoning, long-text processing, and tool-usage capabilities.

- [**ChatGLM**](https://github.com/THUDM/)
    - Developer: Tsinghua University and Zhipu AI (jointly developed).
    - Advantages: ChatGLM is an open-source, bilingual dialogue language model based on the General Language Model (GLM) architecture. It has been trained on a large corpus of Chinese and English text, making it highly effective for generating natural and contextually relevant responses.
- [**DeepSeek**](https://github.com/deepseek-ai/)
    - Developer: The DeepSeek team.
    - Advantages: DeepSeek-V3 is a powerful and cost-effective open-source large language model. It offers top-tier performance, especially in tasks like language generation, question answering, and dialogue systems.
- [**OpenAI ChatGPT**](https://chatgpt.com/)
    - Developer: OpenAI.
    - Advantages: OpenAI's ChatGPT is a highly advanced language model known for its robust text generation capabilities. It has been trained on a vast amount of data, allowing it to generate high-quality and contextually relevant responses.

More providers will be added in future updates to extend compatibility and functionality. 

| **Model**          | **Free**                       | **Base URL**                                                           |
|--------------------|--------------------------------|------------------------------------------------------------------------|
| **InternLM**       | Yes (partly)                   | `https://internlm-chat.intern-ai.org.cn/puyu/api/v1/chat/completions`  |
| **ChatGLM**        | Yes (partly)                   | `https://open.bigmodel.cn/api/paas/v4/chat/completions/`               |
| **DeepSeek**       | Yes (free trial for new users) | `https://api.deepseek.com/chat/completions`                            |
| **OpenAI ChatGPT** | No (paid plans)                | `https://api.openai.com/v1/chat/completions`                           |
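
As a rough sketch of switching providers, the snippet below reuses the same constructor arguments shown in the Quick Start but points at the DeepSeek endpoint from the table above. The provider identifier `"deepseek"` and the model name `"deepseek-chat"` are assumptions for illustration; only `"chatglm"` appears in the examples in this README, so check the project documentation for the exact strings.

```python
from edg4llm import EDG4LLM

# Endpoint taken from the table above; the API key comes from your provider account.
base_url = "https://api.deepseek.com/chat/completions"
api_key = "xxx"

# NOTE: "deepseek" / "deepseek-chat" are illustrative values, not confirmed identifiers.
edg = EDG4LLM(
    model_provider="deepseek",
    model_name="deepseek-chat",
    base_url=base_url,
    api_key=api_key,
)
```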


## Quick Start

To get started with **edg4llm**, follow the steps below. This example demonstrates how to use the library to generate dialogue data based on a specific prompt.

### Prerequisites

1. Install the **edg4llm** package:

   ```bash
   pip install edg4llm
   ```

2. Ensure you have Python version 3.8 or higher.

3. Obtain the necessary API key and base URL for your chosen model provider (e.g., ChatGLM); a sketch of reading the key from an environment variable follows this list.
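
Hard-coding keys in demo scripts makes them easy to leak. Below is a minimal sketch of loading the key from an environment variable using only the standard library; the variable name `EDG4LLM_API_KEY` is an arbitrary choice for this example, not something the library requires.

```python
import os

# Read the provider API key from the environment instead of hard-coding it.
# Set it in your shell first, e.g.:  export EDG4LLM_API_KEY="your-key-here"
# The variable name is an arbitrary choice, not something edg4llm requires.
api_key = os.environ.get("EDG4LLM_API_KEY")
if api_key is None:
    raise RuntimeError("EDG4LLM_API_KEY is not set")

base_url = "https://open.bigmodel.cn/api/paas/v4/chat/completions"
```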

### Code Example (Chinese Version)
```python
# chatglm_demo.py

import edg4llm
print(edg4llm.__version__)

from edg4llm import EDG4LLM

api_key = "xxx"
base_url = "https://open.bigmodel.cn/api/paas/v4/chat/completions"

edg = EDG4LLM(model_provider='chatglm', model_name="glm-4-flash", base_url=base_url, api_key=api_key)
# 设置测试数据
system_prompt = """你是一个精通中国古代诗词的古文学大师"""

user_prompt = """
    目标: 1. 请生成过年为场景的连续多轮对话记录
            2. 提出的问题要多样化。
            3. 要符合人类的说话习惯。
            4. 严格遵循规则: 请以如下格式返回生成的数据, 只返回JSON格式,json模板:  
                [
                    {{
                        "input":"AAA","output":"BBB" 
                    }}
                ]
                其中input字段表示一个人的话语, output字段表示专家的话语
"""
num_samples = 1  # 只生成一个对话样本

# 调用 generate 方法生成对话
data_dialogue = edg.generate(
    task_type="dialogue",
    system_prompt=system_prompt,
    user_prompt=user_prompt,
    num_samples=num_samples
)
```
### Code Example (English Version)
```python
# chatglm_demo.py

import edg4llm
print(edg4llm.__version__)

from edg4llm import EDG4LLM

api_key = "xxx"
base_url = "https://open.bigmodel.cn/api/paas/v4/chat/completions"

edg = EDG4LLM(model_provider='chatglm', model_name="glm-4-flash", base_url=base_url, api_key=api_key)

# Set the test data
system_prompt = """You are a master of ancient Chinese literature, specializing in classical poetry."""

user_prompt = """
    Goal: 1. Please generate a multi-turn dialogue set in the context of celebrating the Lunar New Year.
          2. The questions should be diverse.
          3. The dialogue should align with natural human conversational habits.
          4. Strictly follow this rule: Please return the generated data in the following format, only in JSON format. JSON template:  
                [
                    {{
                        "input":"AAA","output":"BBB" 
                    }}
                ]
                Where the input field represents a person's dialogue, and the output field represents the expert's response.
"""
num_samples = 1  # Generate only one dialogue sample

# Call the generate method to generate the dialogue
data_dialogue = edg.generate(
    task_type="dialogue",
    system_prompt=system_prompt,
    user_prompt=user_prompt,
    num_samples=num_samples
)

```

### Explanation

1. Importing the library: import `edg4llm` and verify the version with `print(edg4llm.__version__)`.

2. Initialization: instantiate `EDG4LLM` with the appropriate model provider, model name, base URL, and API key.

3. Prompts:
    - `system_prompt` defines the behavior or role of the assistant.
    - `user_prompt` provides specific instructions for generating data.
4. Data generation: call the `generate` method with the following parameters:
    - `task_type`: defines the type of task (e.g., dialogue, question answering).
    - `system_prompt` and `user_prompt`: provide context and task-specific instructions.
    - `num_samples`: specifies how many samples to generate.
5. Output: the generated data is returned as a JSON object in the specified format; a sketch of saving it for fine-tuning follows below.
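
Since the prompt template asks the model for a list of `{"input": ..., "output": ...}` objects, a common next step is to write the result out as JSONL for a fine-tuning pipeline. The snippet below is a minimal sketch of that step; it assumes `data_dialogue` from the example above is already the parsed list of dictionaries (if your version of the library returns a raw JSON string instead, parse it with `json.loads` first).

```python
import json

# `data_dialogue` comes from the generate() call in the example above and is
# expected to be a list like [{"input": "...", "output": "..."}].
# If it is a raw JSON string, uncomment the next line:
# data_dialogue = json.loads(data_dialogue)

with open("dialogue_samples.jsonl", "w", encoding="utf-8") as f:
    for sample in data_dialogue:
        # One JSON object per line (JSONL); keep non-ASCII text readable.
        f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```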

## Requirements
This project has **minimal dependencies**, requiring only the `requests` library. Make sure the following version (or newer) is installed:

- `requests>=2.32.3`

## Future Development Plans
- [ ] Record an introduction video
- [ ] Support Gemini2
- [ ] Support local large language models
- [ ] Support other data modalities, such as images

## Acknowledgments
| Project | Description |
|---|---|
| [FunGPT](https://github.com/Alannikos/FunGPT) | An open-source Role-Play project |
| [InternLM](https://github.com/InternLM/InternLM) | A series of advanced open-source large language models |
| [ChatGLM](https://github.com/THUDM/) | A bilingual dialog language model based on the General Language Model (GLM) architecture, jointly developed by Tsinghua University and Zhipu AI. |
| [DeepSeek](https://github.com/deepseek-ai/) | A powerful and cost-effective open-source large language model, excelling in tasks such as language generation, question answering, and dialog systems. |
| [ChatGPT](https://openai.com/chatgpt/) | A highly advanced language model developed by OpenAI, known for its robust text generation capabilities. |

## License
MIT License - See [LICENSE](LICENSE) for details.

## Contact Me
Thank you for using **EDG4LLM**! Your support and feedback are invaluable in making this project better.

If you encounter any issues, have suggestions, or simply want to share your thoughts, feel free to:
- Submit an Issue: Visit the [Issues Page](https://github.com/Alannikos/edg4llm/issues) and describe the problem or suggestion.
- Email Me: You can also reach out directly via email at alannikos768@outlook.com. I'll do my best to respond promptly.

Your contributions and feedback are greatly appreciated. Thank you for helping improve this tool!

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=Alannikos/edg4llm&type=Date)](https://star-history.com/#Alannikos/edg4llm&Date)

            
