<div align="center">
<img src="./fig/logo.png" width="50%" alt="FlowLine" />
<!-- [](LICENSE) -->
[中文](../readme.md) | English
</div>
FlowLine is an automated system for **GPU resource management** and **concurrent command stream scheduling**, supporting both **Command Line Interface (CLI)** and **Web Graphical User Interface (GUI)** interaction modes. It is suitable for multi-task experiments, deep learning training, or high-concurrency computing environments.
* 📘 **API Documentation**: See [API Docs](./docs/api.md)
* 🧩 **System Design Overview**: See [Design Overview](./docs/design.md)
* 🏗️ **System Architecture Details**: See [Architecture Documentation](./docs/arch.md)
The system was designed to replace the inefficient manual process of monitoring GPU status and launching commands one by one. In traditional workflows, users must continuously watch GPU VRAM availability and usage in order to manually launch Python scripts or terminate processes, which is particularly cumbersome in multi-task experimental scenarios. This project automates that loop, improving experimental efficiency and resource utilization.
## Core Features
* Real-time GPU status monitoring: Automatically detects the number of available GPUs, VRAM usage, and process information, then selects the most appropriate GPU.
* Command scheduling & resource control: Supports configuring conditions per command (required GPU count, minimum VRAM, max concurrency, etc.).
* Dynamic control mechanisms: Allows manual termination or restarting of processes for flexible task queue management.
* Concurrent multi-task execution: Supports task priority queues, failure retry policies, suitable for batch experiments.
* Dual interaction modes: CLI for scripted control and batch deployment on Linux servers; Web GUI for visual task monitoring, status tracking, and real-time intervention.
## 🚀 Quick Start Guide
### 🖥️ Using Command Line Interface (CLI Mode)
#### 1. Installation
You can either copy the `flowline` folder into your project root and reference it directly, or install it into your Python environment:
- Install via pip:
```bash
pip install fline
```
- Or install from source:
```bash
pip install -e <path_to_flowline_repository>
```
> Note: Ensure you have installed basic dependencies from `requirements.txt` (`pandas`, `psutil`, `openpyxl`, etc.).
#### 2. Create Task Control Sheet
The system uses a list file (`.xlsx`, `.csv`, or `.json` format) to define task parameters. **This is the only input method for all tasks.** Each row represents an independent task, and each column corresponds to a parameter that will be automatically mapped to `--key value` CLI format.
<details>
<summary>Example and Explanation</summary>
Example files: [`test/todo.xlsx`](./test/todo.xlsx), [`test/todo.csv`](./test/todo.csv), and [`test/todo.json`](./test/todo.json), which can be constructed using the example program [`test/task_builder.py`](./test/task_builder.py).
| *name* | lr | batch_size | *run_num* | *need_run_num* | *cmd* |
| --------- | ----- | ---------- | --------- | -------------- | ----------- |
| baseline1 | 0.01 | 64 | 0 | 1 | train_main |
| baseline2 | 0.001 | 128 | 0 | 2 | train_alt |
Field descriptions:
* `run_num`: Current execution count (automatically maintained by system, default=0).
* `need_run_num`: Total desired executions (system controls repeats based on this, default=1).
* `name`: Task identifier. Auto-generated as `Task:{row_number}` if unspecified.
* `cmd`: Reserved field (can be empty or specify main command like `train_main`). Can be used with custom `func` logic.
* Other fields can be freely defined and will be passed to the command constructor.
> Note: If reserved fields are missing, **the system will auto-complete them during loading** to ensure valid structure.
The flexible task sheet structure supports everything from parameter tuning to complex grid search automation.
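For instance, a small grid-search sheet can be generated programmatically. This is a minimal sketch: the column names follow the table above, while the output filename and parameter values are arbitrary (see `test/task_builder.py` for the project's own builder).

```python
# Sketch: generate a task sheet for a 2x2 grid search.
# Column names follow the reserved-field conventions above;
# "todo.csv" and the hyperparameter values are illustrative.
import itertools
import pandas as pd

rows = []
for i, (lr, bs) in enumerate(itertools.product([0.01, 0.001], [64, 128])):
    rows.append({
        "name": f"grid{i}",       # task identifier
        "lr": lr,                 # free-form parameter, passed as --lr <value>
        "batch_size": bs,         # free-form parameter, passed as --batch_size <value>
        "run_num": 0,             # maintained automatically by the system
        "need_run_num": 1,        # run each combination once
        "cmd": "train_main",
    })

pd.DataFrame(rows).to_csv("todo.csv", index=False)
```

Each row of the resulting file becomes one independent task in the queue.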
</details>
#### 3. Define Task Constructor `func(dict, gpu_id)`
You need to define a custom function that builds the final command string from the task parameters `dict` (one task-sheet row) and the allocated `gpu_id`.
<details>
<summary>Example and Explanation</summary>
Example:
```python
from flowline import run_cli

if __name__ == "__main__":
    def func(param_dict, gpu_id):
        cmd = "CUDA_VISIBLE_DEVICES=" + str(gpu_id) + " python -u test/test.py "
        args = " ".join([f"--{k} {v}" for k, v in param_dict.items()])
        return cmd + args

    run_cli(func, "test/todo.xlsx")
```
* `param_dict`: Dictionary built from current Excel row (keys=column names, values=cell content)
* `gpu_id`: Dynamically allocated GPU ID (ensures no conflicts)
* Returned command string executes as a subprocess (equivalent to direct CLI execution)
* Can be adapted for shell scripts, conda environments, or main command variants
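As one hedged illustration of that last point, a variant constructor might run the task inside a conda environment and drop the bookkeeping columns before building the argument string. The environment name `myenv`, the script path `train.py`, and the skipped column names are assumptions for this sketch.

```python
def func(param_dict, gpu_id):
    # Skip the reserved bookkeeping fields so that only real
    # hyperparameters become CLI arguments (column names assumed
    # as in the example task sheet).
    skip = {"name", "run_num", "need_run_num", "cmd"}
    args = " ".join(f"--{k} {v}" for k, v in param_dict.items() if k not in skip)
    return (
        f"CUDA_VISIBLE_DEVICES={gpu_id} "
        f"conda run -n myenv python -u train.py {args}"
    )
```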
<details>
<summary>About Output and python -u</summary>
💡 **About `python -u`:**
Using `-u` flag (`python -u ...`) enables **unbuffered mode**:
* `stdout`/`stderr` flush immediately
* Essential for real-time log viewing (especially when output is redirected)
* FlowLine saves each task's output to `log/` directory:
```
log/
├── 0.out   # stdout for task 0
├── 0.err   # stderr for task 0
├── 1.out
├── 1.err
...
```
Always use `-u` to ensure **real-time log writing** to these files.
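The per-task redirection itself can be sketched in plain Python. The paths mirror the layout above, but this is illustrative only; the exact mechanism inside FlowLine may differ.

```python
import pathlib
import subprocess

# Redirect one task's stdout/stderr to log/<task_id>.out and .err,
# mirroring the directory layout shown above (illustrative sketch,
# not FlowLine's actual implementation).
log_dir = pathlib.Path("log")
log_dir.mkdir(exist_ok=True)
task_id = 0
cmd = 'python -u -c "print(\'hello from task 0\')"'
with open(log_dir / f"{task_id}.out", "w") as out, \
     open(log_dir / f"{task_id}.err", "w") as err:
    subprocess.run(cmd, shell=True, stdout=out, stderr=err, check=True)
```

Because the child process is started with `-u`, each print lands in the `.out` file immediately rather than waiting in a buffer.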
</details>
</details>
#### 4. Enter `run` to start the task flow
<details>
<summary>FlowLine CLI Command Reference Table</summary>
| Command | Parameter | Description |
| -------------- | ------------------------- | ---------------------------------------------------------------------------------------------------------- |
| `run` | None | Toggles the task processing loop state (start/stop) |
| `gpu <id>` | `<id>`: GPU ID | Toggles the availability of the specified GPU (available/unavailable) |
| `killgpu <id>` | `<id>`: GPU ID | Terminates all processes running on the specified GPU |
| `kill <id>` | `<id>`: Process ID | Terminates the process with the specified process ID |
| `ls` | None | Lists all running processes, showing process ID, PID, task ID, GPU ID, status, and command |
| `gpus` | None | Displays the status of all GPUs, including utilization, memory usage, temperature, power consumption, etc. |
| `min <num>` | `<num>`: Memory size (MB) | Sets the minimum required memory for processes |
| `max <num>` | `<num>`: Process count | Sets the maximum number of concurrent processes |
| `task` | None | Lists the pending task queue, showing task ID, name, run count, etc. |
| `exit` | None | Exits the program (equivalent to `Ctrl+D`) |
| `help` or `?` | None | Displays help information |
<details>
<summary>Command Usage Examples</summary>
```bash
# Start the task processing loop
> run
# Check GPU status
> gpus
# View running processes
> ls
# Set the maximum number of concurrent processes to 4
> max 4
# Set the minimum memory requirement to 2048 MB
> min 2048
# Disable GPU 1
> gpu 1
# Terminate all processes on GPU 0
> killgpu 0
# View pending tasks
> task
# Exit the program
> exit
```
</details>
</details>
### 🌐 Using Web Interface (Visual Task Management)
> **No extra configuration needed - Works directly in SSH environments**
Besides CLI, you can use the Web GUI for **real-time monitoring and dynamic intervention**.
#### 1. Start Backend API Service
Run the Flask backend:
```bash
python main_server.py
```
#### 2. Start Frontend Service
Launch static file server:
```bash
cd web
python -m http.server 8000
```
Access the frontend at [http://localhost:8000](http://localhost:8000/). The interface communicates with the backend via RESTful APIs.
<div align="center">
<img src="./docs/fig/gpu.png" alt="GPU Monitoring" height="200px" />
<img src="./docs/fig/task.png" alt="Task Management" height="200px" />
<img src="./docs/fig/log.png" alt="Log Viewer" height="200px" />
<img src="./docs/fig/set.png" alt="Settings" height="200px" />
</div>
## 🛑 Disclaimer
This project provides **automated detection and utilization of idle GPUs** for resource-constrained environments (e.g., labs), enabling rapid task initiation without manual polling.
### 📌 Important Notes
- This tool **does NOT forcibly terminate others' tasks** or bypass permission/scheduling systems.
- Default operation is **limited to devices where user has access permissions**. Comply with institutional policies.
- **DO NOT misuse to monopolize shared resources or disrupt others' research.**
### 🚨 Risk Statement
Potential risks include, but are not limited to:
- Resource conflicts from concurrent scheduling
- Violation of lab/platform policies if abused
Developers **shall not be liable** for any direct or indirect losses, including resource conflicts, account restrictions, or data loss, resulting from use of this tool.
## 💐 Contributions

We welcome everyone to contribute code, fix bugs, or improve the documentation for this project!
- If you have any suggestions or questions, please submit an issue.
- Pull requests are welcome.
> [!TIP]
> If this project is helpful to you, please give it a **Star**!
**Thanks to all contributors!**
[](https://github.com/dramwig/FlowLine/graphs/contributors)
<a href="https://www.star-history.com/#dramwig/FlowLine&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=dramwig/FlowLine&type=Date&theme=dark" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=dramwig/FlowLine&type=Date" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=dramwig/FlowLine&type=Date" />
</picture>
</a>