# coeai `PyPI Package`
**Interact with high-capacity multimodal LLMs hosted on the COE AI GPU cluster from any Python environment.**
`coeai` is a lightweight Python wrapper around the *LLaMA-4 16x17B* model (128K context, vision-enabled) deployed on the Centre of Excellence for AI (COE AI) servers at UPES. The model is served behind a single `/generate` HTTP endpoint, making it trivial to run both text-only and image+text inference from notebooks, scripts, or backend services connected to the UPES Wi-Fi.
> **Text and image input** · **128,000-token context** · **Streaming or batch** · **Runs on the COE AI GPU node**
---
## Table of Contents
1. [Features](#features)
2. [Requirements](#requirements)
3. [Installation](#installation)
4. [Quick Start](#quick-start)
5. [API Usage](#api-usage)
6. [Model Parameters](#model-parameters)
7. [Authentication](#authentication)
8. [Troubleshooting](#troubleshooting)
9. [License](#license)
10. [Author](#author)
---
## Features
* **Ultra-long context**: up to **128K tokens** per request for long documents or multi-turn chats
* **Vision support**: send images along with text for multimodal reasoning
* **High performance**: queries are served by a dedicated GPU node inside the COE AI HPC cluster
* **Simple auth**: authenticate with a short-lived API key (valid 30 days) sent in the request header
* **Drop-in wrapper**: minimal Python API; no need to handle HTTP manually
---
## Requirements
* Python **3.8 or newer**
* Network access to `http://10.16.1.50:8000` from the UPES campus Wi-Fi
* A **valid API key** issued by the COE AI team
---
## Installation
```bash
pip install coeai
```
This pulls the latest release from PyPI.
---
## Quick Start
The wrapper exposes a single `LLMinfer` class. Initialize it with the API URL and your API key, then call `infer()`.
### Text-to-Text
```python
from coeai import LLMinfer

llm = LLMinfer(
    api_url="http://10.16.1.50:8000/generate",
    api_key="API_KEY"
)

response = llm.infer(
    mode="text-to-text",
    prompt_text="Summarize the key points of general relativity.",
    max_tokens=500,
    temperature=0.6,
    top_p=0.95,
    stream=False
)

print(response)
```
### Image + Text
```python
from coeai import LLMinfer

# Initialize the client
llm = LLMinfer(
    api_url="http://10.16.1.50:8000/generate",
    api_key="API_KEY"
)

# Run inference with image and prompt
response = llm.infer(
    mode="image-text-to-text",
    prompt_text="Describe what's happening in the image.",
    image_path="/home/konal.106904/sample.jpg",  # <-- update to a valid path
    max_tokens=512,
    temperature=0.7,
    top_p=1.0,
    stream=False
)

# Print the response
print(response)
```
---
## API Usage
### Using the Python Wrapper
The examples above show the recommended approach using the `LLMinfer` class.
### Direct API Access with cURL
You can also interact directly with the `/generate` endpoint using cURL.
#### Prerequisites
| Requirement | Purpose |
|-------------|---------|
| A running instance of the API | Default URL: `http://10.16.1.50:8000/generate` |
| Valid API key | Supply in the `X-API-Key` request header |
| cURL 7.68+ | Supports `--data @-` JSON piping |
#### Text-Only Request
```bash
curl -X POST http://10.16.1.50:8000/generate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY_HERE" \
  -d '{
    "model": "llama4",
    "messages": [
      {
        "role": "system",
        "content": "This is a chat between a user and an assistant. The assistant is helping the user with general questions."
      },
      {
        "role": "user",
        "content": "Explain what a black hole is."
      }
    ],
    "max_tokens": 512,
    "temperature": 0.7,
    "top_p": 1.0,
    "stream": false
  }'
```
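If you prefer Python without the wrapper, the same request can be issued with the `requests` library. This is a minimal sketch mirroring the cURL payload above; it assumes `requests` is installed in your environment:

```python
import requests

API_URL = "http://10.16.1.50:8000/generate"

payload = {
    "model": "llama4",
    "messages": [
        {"role": "system", "content": "This is a chat between a user and an assistant."},
        {"role": "user", "content": "Explain what a black hole is."},
    ],
    "max_tokens": 512,
    "temperature": 0.7,
    "top_p": 1.0,
    "stream": False,
}

# json= serializes the payload and sets Content-Type: application/json for us
resp = requests.post(
    API_URL,
    headers={"X-API-Key": "YOUR_API_KEY_HERE"},
    json=payload,
    timeout=120,
)
resp.raise_for_status()  # surfaces 401 Unauthorized for a missing/expired key
print(resp.json())
```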
#### Image + Text Request
For multimodal requests, include the image as a Base64-encoded Data URI in the `content` array:
> **Note:** Replace `YOUR_API_KEY_HERE` with your own API key and `PUT_BASE64_IMAGE_STRING_HERE` with the **Base64-encoded** contents of your image file.
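A representative request is sketched below. The `content` array follows the OpenAI-style multimodal message format described in the notes that follow, so treat the exact field names as assumptions if your server version differs:

```bash
curl -X POST http://10.16.1.50:8000/generate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY_HERE" \
  -d '{
    "model": "llama4",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/jpeg;base64,PUT_BASE64_IMAGE_STRING_HERE"
            }
          },
          {
            "type": "text",
            "text": "Describe what is happening in this image."
          }
        ]
      }
    ],
    "max_tokens": 512,
    "temperature": 0.7,
    "top_p": 1.0,
    "stream": false
  }'
```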
**How it Works:**
1. **Inline Image**: The `image_url` object embeds the entire image as a Data URI so no separate file upload is required
2. **Multi-Modal Prompt**: The `content` field is an array containing both the image and the accompanying text question, preserving ordering
3. **Response**: The server returns a JSON object containing the assistant's interpretation of the supplied image
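To produce the Base64 string referenced in step 1, Python's standard `base64` module is enough. In this sketch, `sample.jpg` and the `image/jpeg` MIME type are placeholders; match them to your actual file:

```python
import base64

# Read the image bytes and encode them as Base64 text
with open("sample.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode("ascii")

# Wrap the encoded bytes in a Data URI for the image_url field
data_uri = f"data:image/jpeg;base64,{b64}"
print(data_uri[:80] + "...")  # paste the full string into the request body
```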
---
## Model Parameters
### Default Parameters
| Field | Description | Default |
|----------------|----------------------------------------------|---------|
| `model` | Model name (currently fixed to `llama4`) | `llama4` |
| `stream` | Return tokens incrementally | `false` |
| `max_tokens` | Maximum new tokens to generate | `1024` |
| `temperature` | Sampling temperature (creativity) | `0.7` |
| `top_p` | Nucleus sampling | `1.0` |
| `stop` | List of stop sequences | `null` |
### Parameter Details
| Parameter | Description |
|----------------|-------------|
| `model` | The model identifier exposed by your server (here `llama4`) |
| `messages` | Conversation history, each entry containing a `role` and `content` |
| `max_tokens` | Upper bound on tokens in the assistant reply |
| `temperature` | Controls randomness; lower values yield more deterministic output |
| `top_p` | Nucleus sampling; keep at `1.0` for default behavior |
| `stream` | When `true`, the API will send incremental responses via Server-Sent Events (SSE) |
> **Note:** The server enforces a total context limit of 128K tokens (prompt + generated output). Size `max_tokens` accordingly.
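When `stream` is `true`, responses arrive as SSE `data:` lines. The exact chunk schema is server-specific, so the sketch below only demonstrates the transport, assuming OpenAI-style `data: {...}` lines terminated by `data: [DONE]`:

```python
import json
import requests

payload = {
    "model": "llama4",
    "messages": [{"role": "user", "content": "Write a haiku about GPUs."}],
    "max_tokens": 128,
    "stream": True,
}

with requests.post(
    "http://10.16.1.50:8000/generate",
    headers={"X-API-Key": "YOUR_API_KEY_HERE"},
    json=payload,
    stream=True,   # keep the connection open and read the body incrementally
    timeout=120,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue  # skip keep-alives and blank separator lines
        chunk = line[len("data: "):]
        if chunk == "[DONE]":
            break
        print(json.loads(chunk))  # chunk structure depends on the server
```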
---
## Authentication
All requests must include an **API key** issued by the COE AI team. Pass the key when constructing `LLMinfer` and the wrapper attaches it to the request headers for you; when calling the endpoint directly, supply it in the `X-API-Key` header as shown in the cURL examples above.
### Requesting an API Key
1. **Send an email** to `hpc-access@ddn.upes.ac.in` *from your official UPES account* using this template:
```
Subject: API Key Request for COE AI LLM Access

Dear COE AI Team,

I am requesting access to the LLM API for my project work.

Project Details:
- Project Name: <Your Project Name>
- Project Description: <Brief description>
- Expected Usage: <How you plan to use the LLM>
- Duration: <Timeline>

Reason for API Access:
<Research objectives or academic requirements>

Additional Information:
- Name: <Your Name>
- Email: <Your Email>
- Department/Affiliation: <Dept/Organisation>
- Student/Faculty ID: <If applicable>

Thank you for considering my request.

Best regards,
<Your Name>
```
2. Allow **2-3 business days** for processing. The team will reply with your API key.
### Key Renewal
Keys expire **after 30 days**. Email the same address with the subject:
```
Subject: API Key Renewal Request for COE AI LLM Access
```
Include your previous key and a brief usage summary.
---
## Troubleshooting
| Symptom | Possible Cause | Fix |
|---------|----------------|-----|
| `ConnectionError` | Not on UPES network | Connect to campus Wi-Fi or VPN |
| `401 Unauthorized` | Missing/expired API key | Request or renew your key |
| Long latency | Very large prompts or high `max_tokens` | Reduce prompt size or output length |
---
## License
`coeai` is released under the **MIT License**.
---
## Author
**Konal Puri**
Centre of Excellence for AI (COE AI), HPC Project, UPES
PyPI: <https://pypi.org/project/coeai>