ALLM

- Name: ALLM
- Version: 1.0.8
- Summary: A simple and efficient python library for fast inference of LLMs
- Upload time: 2024-06-15 06:13:26
- Author / maintainer: All Advance AI
- Home page / docs: none
- Requires-Python: not specified
- License: not specified
- Keywords: GGUF, GGUF Large Language Model, GGUF Large Language Models, GGUF Large Language Modeling, GGUF Large Language Modeling Library
- Requirements: none recorded
# ALLM

ALLM is a Python library designed for fast inference of GGUF-format Large Language Models (LLMs) on both CPU and GPU. It provides a convenient interface for loading pre-trained GGUF models and running inference with them. The library is aimed at applications where quick response times are crucial, such as chatbots and text generation.

## Features

- **Efficient Inference**: ALLM leverages the power of GGUF models to provide fast and accurate inference.
- **CPU and GPU Support**: The library is optimized for both CPU and GPU, allowing you to choose the best hardware for your application.
- **Simple Interface**: With straightforward command-line support, you can load models and run inference with a single command.
- **Flexible Configuration**: Customize inference settings such as temperature and model path to suit your needs.

## Installation

You can install ALLM using pip:

```bash
pip install allm
```

## Usage

You can start inference with the `allm-run` command. It takes the model name or path as a required argument, plus optional temperature, max new tokens, and additional model kwargs.

```bash
allm-run --name model_name_or_path
```
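For instance, a run with the optional settings filled in might be assembled as below. The flag spellings for temperature and max new tokens are assumptions for illustration; confirm the exact names with `allm-run --help`:

```python
import subprocess

# Hypothetical flag names for the optional settings described above --
# check `allm-run --help` for the real spellings.
cmd = [
    "allm-run",
    "--name", "Mistral",           # model name or local GGUF path
    "--temperature", "0.7",        # optional: sampling temperature
    "--max_new_tokens", "256",     # optional: cap on generated tokens
]
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment once allm is installed
```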

## API

You can launch the inference API with the `allm-serve` command. This starts the API server on the default host and port, 0.0.0.0:8001.

```bash
allm-serve
```
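Once `allm-serve` is running, the server can be queried over HTTP. The endpoint path and payload fields in this sketch are assumptions, not the documented ALLM schema; the library's docs or source define the real API:

```python
import json
import urllib.request

# Assumed endpoint path and request fields -- placeholders for illustration,
# not ALLM's documented API schema.
API_URL = "http://0.0.0.0:8001/generate"
payload = {"prompt": "Hello, world", "max_new_tokens": 64}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(req.get_full_url())
# with urllib.request.urlopen(req) as resp:  # uncomment with the server running
#     print(resp.read().decode())
```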


## ALLM Agents

### Local Agent Inference

To create a local agent, begin by loading your knowledge documents into the database with the `allm-newagent` command, specifying the agent name:

```bash
allm-newagent --doc "document_path" --agent agent_name
```

or

```bash
allm-newagent --dir "directory containing files to be ingested" --agent agent_name
```

Once the agent has been created with your knowledge documents, you can start a local agent chat with the `allm-agentchat` command:

```bash
allm-agentchat --agent agent_name
```

After your agents are created, you can also start an agent-specific API server with the `allm-agentapi` command:

```bash
allm-agentapi --agent agent_name
```

You can add more documents to an existing agent with the `allm-updateagent` command:

```bash
allm-updateagent --doc "document_path" --agent agent_name
```

## Supported Cloud Models

ALLM supports generative LLMs on Vertex AI, including Gemini 1.5 Pro, as well as Azure OpenAI models. You can start local inference of cloud-based models using the following command:

```bash
allm-run-vertex --projectid Id_of_your_GCP_project --region location_of_your_cloud_server
```

or

```bash
allm-run-azure --key key --version version --endpoint https://{your_endpoint}.openai.azure.com --model model_name
```

ALLM also supports config-based local inference of generative LLMs on Vertex AI and Azure OpenAI. You can create a JSON config file manually, or ALLM will create one for you; then start local inference of cloud-based models with:

```bash
allm-run-vertex
```
Note that for the shortened command to work, the config file must have all the necessary parameters set. The easiest way to achieve this is to run the full command, including CLI arguments, once, and then use the shortened command afterwards.

The same procedure can be followed for Azure:

```bash
allm-run-azure
```
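If you prefer to prepare the config by hand, a sketch like the following writes a JSON file. The file name and keys here are assumptions; running the full CLI command once and inspecting the file ALLM produces is the reliable way to learn the expected layout:

```python
import json
from pathlib import Path

# Assumed file name and keys -- ALLM's real config layout may differ,
# so compare against the file the full CLI command generates.
config = {
    "projectid": "my-gcp-project",
    "region": "us-central1",
}
path = Path("vertex_config.json")
path.write_text(json.dumps(config, indent=2))

print(json.loads(path.read_text())["region"])
```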


You can also pair a custom agent with your cloud-deployed model using the following commands. Note that the agent must first be created using the commands in the ALLM Agents section above.

```bash
allm-agentchat-vertex --projectid Id_of_your_GCP_project --region location_of_your_cloud_server --agent agent_name
```
or
```bash
allm-run-azure --key key --version version --endpoint https://{your_endpoint}.openai.azure.com --model model_name --agent agentname
```
`model_name` is an optional parameter for both Vertex and Azure; if it is not supplied, inference defaults to gemini-1.0-pro-002 on Vertex and gpt-35-turbo on Azure OpenAI.
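For quick reference, those defaults can be captured in a small lookup table. This is a sketch mirroring the text above, not ALLM's internal resolution logic:

```python
# Default model used when --model is omitted, per backend (from the text above).
DEFAULT_MODEL = {
    "vertex": "gemini-1.0-pro-002",
    "azure": "gpt-35-turbo",
}

def resolve_model(backend, model=None):
    """Return the explicit model if given, else the backend's default."""
    return model or DEFAULT_MODEL[backend]

print(resolve_model("vertex"))
print(resolve_model("azure", "gpt-4o"))
```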

Alternatively, if you already have an API config file ready, the shortened commands can be used:

```bash
allm-agentchat-vertex --agent agent_name
```
and
```bash
allm-agentchat-azure --agent agent_name
```

ALLM also supports serving cloud-model-based agents over the API:

```bash
allm-agentapi-vertex --projectid Id_of_your_GCP_project --region location_of_your_cloud_server --agent agent_name
```
or
```bash
allm-agentapi-vertex --agent agent_name
```

For Azure:

```bash
allm-run-azure --key key --version version --endpoint https://{your_endpoint}.openai.azure.com --model model_name --agent agentname
```
or
```bash
allm-agentapi-azure --agent agent_name
```

## ALLM-Enterprise
You can launch the UI with the following command:

```bash
allm-launch
```

## Supported Model Names
Llama3, Llama2, llama, llama2_chat, Llama_chat, Mistral, Mistral_instruct


            
