llm-analysis-assistant

- Name: llm-analysis-assistant
- Version: 0.2.3
- Summary: An LLM analysis assistant to help you understand and implement PMF
- Upload time: 2025-07-10 02:21:59
- Requires Python: >=3.10
- License: Apache-2.0
- Keywords: analysis, asgi, assistant, llm, mcp, sse, stdio, streamable-http, uvicorn, websocket
            
[English](./README.md) | [简体中文](./README_zh.md) 

# 1. Project Features
Through this proxy service, you can easily record the parameters sent to a large model and the results it returns, which makes it convenient to analyze how a client calls the model and to understand both the phenomenon and its essence.
This project is not about optimizing the large model itself; it helps you demystify the model, and understand and achieve product-market fit (PMF).

MCP is also an important part of working with LLMs, so this project can also act as an MCP client, and it supports detection of the sse and mcp-streamable-http modes.

# 🌟 Main features
### Function list:
1. **MCP client (supports stdio/sse/streamableHttp calls)**
2. **MCP initialization detection and analysis (e.g. Cherry Studio, which supports stdio/sse/streamableHttp)**
3. **Detect Ollama/OpenAI interface calls and generate analysis logs**
4. **Mock Ollama/OpenAI interface data**

### Technical features:
1. **Built with the uv tool**
2. **Served with the uvicorn framework**
3. **Async front end and async back end**
4. **Real-time log display with refresh and resumable (breakpoint) streaming**
5. **HTTP client written with raw Python sockets, supporting GET/POST and streaming output for each (see the sketch after this list)**
6. **WebSocket combined with asyncio**
7. **threading/queue usage**
8. **Python program packaged into an exe**
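
A minimal sketch of item 5's idea: an HTTP GET written over a raw socket that reads the response as a stream. The host, port, and path below are placeholders, and the project's own client handles much more (POST bodies, chunked decoding, and so on):

```python
# Minimal sketch: an HTTP GET over a raw Python socket, read as a stream.
# Host, port and path are placeholders, not project defaults.
import socket

def stream_get(host: str, port: int, path: str):
    with socket.create_connection((host, port)) as sock:
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode())
        # Yield the response as it arrives instead of buffering it all.
        while chunk := sock.recv(4096):
            yield chunk

for chunk in stream_get("127.0.0.1", 8000, "/logs"):
    print(chunk.decode(errors="replace"), end="")
```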

# 2. Project Background
Before true AGI arrives, we face a long journey and constant challenges along the way; the lives of ordinary people and professionals alike will be changed.

When it comes to using large models, however, both ordinary users and developers usually interact with them indirectly through various clients. The client hides the process of interacting with the model and returns results directly from the user's simple input, which makes the model feel mysterious, like a black box. In reality it is not: using a large model simply means calling an interface with input and output.
It should be noted that although many inference platforms provide OpenAI-format interfaces, their actual support varies. Simply put, the request and response parameters of their APIs are not exactly the same.
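To make that concrete, an OpenAI-format chat completion is just an HTTP POST with a JSON body. A minimal sketch, assuming the `requests` package and the default Ollama-compatible address mentioned later in this README (`/v1/chat/completions` is the standard OpenAI-compatible route):

```python
# Minimal sketch of an OpenAI-format chat completion: JSON request in, JSON response out.
# Assumes the `requests` package and the default Ollama address used later in this README;
# Ollama needs no api_key by default.
import requests

payload = {
    "model": "qwen2.5-coder:1.5b",  # model name reused from the LangChain example below
    "messages": [{"role": "user", "content": "Hello, world!"}],
    "stream": False,
}
resp = requests.post("http://127.0.0.1:11434/v1/chat/completions", json=payload)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```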

For detailed parameter support, see:

[Semi-standard: OpenAI API](https://platform.openai.com/docs/api-reference/responses/create)

[In development: Ollama API](https://github.com/ollama/ollama/blob/main/docs/openai.md#supported-features)

[In production: vLLM API](https://docs.vllm.ai/en/stable/api/inference_params.html#sampling-parameters)

For other platforms, please check their own documentation.

### This project uses the uvicorn framework to serve an ASGI application, with minimal dependencies: it starts fast, stays concise, and pays tribute to the classics
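
For context, the bare uvicorn + ASGI pattern looks roughly like this; a generic sketch, not this project's actual application code:

```python
# Generic sketch of the uvicorn + ASGI pattern this project relies on;
# not the project's actual entry point.
import uvicorn

async def app(scope, receive, send):
    if scope["type"] != "http":
        return
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"hello from a bare ASGI app"})

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)
```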

# 3. Installation

```sh

# Clone the repository
git clone https://github.com/xuzexin-hz/llm-analysis-assistant.git
cd llm-analysis-assistant

# Install dependencies
uv sync

```

# 4. Usage
From the project root, enter the bin directory:
- Double-click run-server.cmd to start the service
- Double-click run-build.cmd to package the service into an executable (output in the dist directory)

Or run the following commands directly from the project root:

```sh

# Default port 8000
python server.py

# You can also specify the port
python server.py --port=8001

# You can also specify the OpenAI-compatible address; the default is the Ollama address: http://127.0.0.1:11434/v1/
python server.py --base_url=https://api.openai.com
# If you point it at another API address, remember to supply the correct api_key; Ollama needs no api_key by default

# --is_mock=true turns on mock mode and returns mock data
python server.py --is_mock=true

# --mock_string customizes the returned mock data; if unset, default mock data is returned. This parameter also applies to non-streaming output
python server.py --is_mock=true --mock_string=Hello

# --mock_count sets how many chunks the mock returns when streaming; the default is 3
python server.py --is_mock=true --mock_string=Hello --mock_count=10

# --single_word controls the mock streaming effect: by default a sentence is split into 3 parts at a [2:5:3] ratio and returned in sequence; with this flag set, output streams word by word
python server.py --is_mock=true --mock_string=你好啊 --single_word=true

# --looptime sets the interval between mock streaming chunks; the default is 0.35 seconds, so looptime=1 makes the streamed output noticeably slower
python server.py --is_mock=true --mock_string=你好啊 --looptime=1

```
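
The default [2:5:3] streaming split described above can be pictured with a small sketch; this is only an illustration of the described behaviour, not code taken from the project:

```python
# Illustration only: split a mock string into three chunks at roughly 2:5:3
# proportions, mirroring the default streaming behaviour described above.
def split_2_5_3(text: str) -> list[str]:
    total = len(text)
    first = round(total * 2 / 10)
    second = first + round(total * 5 / 10)
    return [text[:first], text[first:second], text[second:]]

print(split_2_5_3("Hello, this is mock data"))
# -> ['Hello', ', this is mo', 'ck data']
```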

### Using PIP(🌟)

Alternatively you can install `llm-analysis-assistant` via pip:

```sh
pip install llm-analysis-assistant
```

After installation, you can run it as a script using:

```sh
python -m llm_analysis_assistant
```

Open http://127.0.0.1:8000/logs to view the logs in real time.

# Detecting, analyzing and calling mcp (currently supports stdio/sse/streamableHttp)

The implementation logic of the mcp client is shown below. In the interface log the calls look like sequential requests, but it is not actually a simple request-response pattern; presenting it this way simply makes it easier for users to follow.

![mcp.png](docs/imgs/mcp.png)

Details of the mcp-sse logic (for the similarities and differences with stdio/streamableHttp, please refer to other materials):

![mcp-sse.png](docs/imgs/mcp-sse.png)
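
For reference, an MCP-over-SSE handshake done with the official `mcp` Python SDK looks roughly like the sketch below; the SDK is assumed here purely for illustration, since this project ships its own client implementation:

```python
# Rough sketch of an MCP-over-SSE handshake using the official `mcp` Python SDK
# (an external package assumed for illustration; this project has its own client).
import asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    async with sse_client("http://127.0.0.1:8001/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()          # initialize request/response
            tools = await session.list_tools()  # tools/list request/response
            print([tool.name for tool in tools.tools])

asyncio.run(main())
```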

# Detection and analysis of mcp-stdio
Open the following address in the browser. On the command line, `++user=xxx` sets a system (environment) variable named `user` to the value `xxx`:

http://127.0.0.1:8000/mcp?url=stdio

Or use Cherry Studio to add the stdio service

![Cherry-Studio-mcp-stdio.png](docs/imgs/Cherry-Studio-mcp-stdio.png)

# Detection and analysis of mcp-sse
Open the following address in the browser; the `url` parameter is the SSE service address:

http://127.0.0.1:8000/mcp?url=http://127.0.0.1:8001/sse
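
The same detection can also be triggered programmatically; a minimal sketch assuming the `requests` package, where the response body is whatever the service would render in the browser:

```python
# Trigger the same detection programmatically; assumes the `requests` package.
import requests

resp = requests.get(
    "http://127.0.0.1:8000/mcp",
    params={"url": "http://127.0.0.1:8001/sse"},
)
print(resp.status_code)
print(resp.text[:500])  # first part of the detection output
```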

Or use Cherry Studio to add the mcp service

![Cherry-Studio-mcp-sse.png](docs/imgs/Cherry-Studio-mcp-sse.png)

# Detection and analysis of mcp-streamable-http
Open the following address in the browser; the `url` parameter is the streamableHttp service address:

http://127.0.0.1:8000/mcp?url=http://127.0.0.1:8001/mcp

Or use Cherry Studio to add the mcp service

![mcp-streamable-http.png](docs/imgs/mcp-streamable-http.png)

When using Cherry Studio, you can open http://127.0.0.1:8000/logs to view the logs in real time and analyze the calling logic of sse/mcp-streamable-http.

# 5. Example collection
Point the OpenAI client's base_url at this service: http://127.0.0.1:8000
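
For instance, with the official `openai` Python package (assumed installed), pointing `base_url` at the proxy looks like this; the model name and `api_key='ollama'` mirror the LangChain example below:

```python
# Point the official OpenAI client at the proxy; model name and api_key
# mirror the LangChain example below. Assumes `pip install openai`.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000", api_key="ollama")
resp = client.chat.completions.create(
    model="qwen2.5-coder:1.5b",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
print(resp.choices[0].message.content)
```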
### ⑴ Analyze langchain
### Install langchain first:
```sh

pip install langchain langchain-openai

```

```python
from langchain.chat_models import init_chat_model

model = init_chat_model(
    "qwen2.5-coder:1.5b",
    model_provider="openai",
    base_url="http://127.0.0.1:8000",
    api_key="ollama",
)
model.invoke("Hello, world!")
```
##### After running the code above, you can find the log files in the logs directory under the folder for that day; each request gets its own log file
##### Open [http://127.0.0.1:8000/logs](http://127.0.0.1:8000/logs) to view the logs in real time

### ⑵ Analysis tool set
#### 1. Tool Open WebUI
[Open WebUI.md](docs/Open%20WebUI.md)

#### 2. Tool Cherry Studio
[Cherry Studio.md](docs/Cherry%20Studio.md)

#### 3. Tool continue
[continue.md](docs/continue.md)

#### 4. Tool Navicat
[Navicat.md](docs/Navicat.md)

### ⑶ Analysis agents
#### 1. Agent: Multi-Agent Supervisor

###### An agent as a node, an agent as a tool: leader mode

![langgraph-supervisor1.png](docs/imgs/langgraph-supervisor1.png)

[langgraph-supervisor.md](docs/langgraph-supervisor.md)

#### 2. Agent: Multi-Agent Swarm
###### Professional matters are best left to professionals: teamwork mode

![langgraph-swarm1.png](docs/imgs/langgraph-swarm1.png)

[langgraph-swarm.md](docs/langgraph-swarm.md)

#### 3. Agent: CodeAct
###### Each approach has its own strengths and weaknesses (CodeAct is said to greatly improve accuracy and efficiency in some scenarios)

![langgraph-codeact1.png](docs/imgs/langgraph-codeact1.png)

[langgraph-codeact.md](docs/langgraph-codeact.md)

# License
[Apache 2.0 License.](LICENSE)
            
