langchain-llm


Name: langchain-llm
Version: 0.4.15
Home page: None
Summary: langchain llm wrapper
Upload time: 2024-04-15 01:33:59
Maintainer: None
Docs URL: None
Author: xusenlin
Requires Python: >=3.8.1, <4.0
License: None
# Langchain LLM

## Get Started

### Install

```shell
pip install langchain_llm==0.4.15
```

## Inference Usage

### HuggingFace Inference

**Completion Usage**

```python
from langchain_llm import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="qwen-7b-chat",
    model_path="/data/checkpoints/Qwen-7B-Chat",
    load_model_kwargs={"device_map": "auto"},
)

# invoke: single completion with a ChatML-formatted prompt
prompt = "<|im_start|>user\nWho are you?<|im_end|>\n<|im_start|>assistant\n"
print(llm.invoke(prompt, stop=["<|im_end|>"]))

# Token Streaming
for chunk in llm.stream(prompt, stop=["<|im_end|>"]):
    print(chunk, end="", flush=True)

# OpenAI-compatible completion response
print(llm.call_as_openai(prompt, stop=["<|im_end|>"]))

# Streaming
for chunk in llm.call_as_openai(prompt, stop=["<|im_end|>"], stream=True):
    print(chunk.choices[0].text, end="", flush=True)
```
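
The prompt above is hand-formatted in Qwen's ChatML style (`<|im_start|>` / `<|im_end|>` markers plus an explicit stop string). The `ChatHuggingFace` wrapper below applies the chat template automatically, so plain-text queries can be passed instead.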

**Chat Completion Usage**

```python
from langchain_llm import ChatHuggingFace

chat_llm = ChatHuggingFace(llm=llm)

# invoke: single chat turn with a plain-text query
query = "Who are you?"
print(chat_llm.invoke(query))

# Token Streaming
for chunk in chat_llm.stream(query):
    print(chunk.content, end="", flush=True)

# OpenAI-compatible chat completion response
messages = [
    {"role": "user", "content": query}
]
print(chat_llm.call_as_openai(messages))

# Streaming
for chunk in chat_llm.call_as_openai(messages, stream=True):
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```
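
`call_as_openai` takes the standard OpenAI-style list of message dicts, so multi-turn history can be passed directly. A minimal sketch, assuming the model's chat template handles `assistant` turns as in the custom template shown later:

```python
# Multi-turn conversation in OpenAI message format, reusing chat_llm from above.
history = [
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "I am Qwen, a large language model."},
    {"role": "user", "content": "What can you do?"},
]
print(chat_llm.call_as_openai(history))
```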

### VLLM Inference

**Completion Usage**

```python
from langchain_llm import VLLM

llm = VLLM(
    model_name="qwen", 
    model="/data/checkpoints/Qwen-7B-Chat", 
    trust_remote_code=True,
)

# invoke: single completion with a ChatML-formatted prompt
prompt = "<|im_start|>user\nWho are you?<|im_end|>\n<|im_start|>assistant\n"
print(llm.invoke(prompt, stop=["<|im_end|>"]))

# OpenAI-compatible completion response
print(llm.call_as_openai(prompt, stop=["<|im_end|>"]))
```

**Chat Completion Usage**

```python
from langchain_llm import ChatVLLM

chat_llm = ChatVLLM(llm=llm)

# invoke: single chat turn with a plain-text query
query = "Who are you?"
print(chat_llm.invoke(query))

# OpenAI-compatible chat completion response
messages = [
    {"role": "user", "content": query}
]
print(chat_llm.call_as_openai(messages))
```


## Custom Chat Template

```python
from langchain_llm import BaseTemplate, ChatHuggingFace

class CustomTemplate(BaseTemplate):
    
    @property
    def template(self) -> str:
        return (
            "{% for message in messages %}"
            "{{ '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n' }}"
            "{% endfor %}"
            "{% if add_generation_prompt %}"
            "{{ '<|im_start|>assistant\\n' }}"
            "{% endif %}"
        )

chat_llm = ChatHuggingFace(
    llm=llm, 
    prompt_adapter=CustomTemplate()
)
```
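
The template above is a Jinja2 chat template in ChatML form. As a quick sanity check, it can be rendered directly with `jinja2` to preview the prompt string; this is an illustrative sketch only, since `langchain_llm` applies the template internally through the `prompt_adapter`:

```python
from jinja2 import Template

# Render the custom template for a one-turn conversation to see the exact
# prompt string the adapter would build.
rendered = Template(CustomTemplate().template).render(
    messages=[{"role": "user", "content": "Who are you?"}],
    add_generation_prompt=True,
)
print(rendered)
# <|im_start|>user
# Who are you?<|im_end|>
# <|im_start|>assistant
```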

## Load Model Kwargs

These options control how the model is loaded; a combined sketch follows the list.

+ `model_name_or_path`: model name or local path.
+ `use_fast_tokenizer`: whether to use the fast tokenizer; defaults to `False`.
+ `device_map`: device placement, e.g. `"auto"` or `"cuda:0"`.
+ `dtype`: weight dtype, one of `"half"`, `"bfloat16"`, `"float32"`.
+ `load_in_8bit`: load the model with 8-bit quantization.
+ `load_in_4bit`: load the model with 4-bit quantization.
+ `rope_scaling`: RoPE scaling strategy for extended context; `Literal["linear", "dynamic"]`.
+ `flash_attn`: enable FlashAttention-2.
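
A minimal sketch combining several of these options, assuming they are passed through `load_model_kwargs` as in the earlier `HuggingFaceLLM` example; support for individual keys may vary between versions:

```python
from langchain_llm import HuggingFaceLLM

# Sketch: several load options combined in load_model_kwargs.
llm = HuggingFaceLLM(
    model_name="qwen-7b-chat",
    model_path="/data/checkpoints/Qwen-7B-Chat",
    load_model_kwargs={
        "device_map": "auto",   # spread layers across available devices
        "dtype": "bfloat16",    # weight dtype
        "load_in_4bit": False,  # set True to load with 4-bit quantization
        "flash_attn": True,     # enable FlashAttention-2 if installed
    },
)
```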

## Merge LoRA Model

```python
from langchain_llm import apply_lora

# Merge the LoRA adapter weights into the base model and save the result.
apply_lora("base_model_path", "lora_path", "target_model_path")
```
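
Merging bakes the adapter weights into the base checkpoint, so the model written to `target_model_path` can then be loaded like any regular full model, for example with the `HuggingFaceLLM` or `VLLM` wrappers above.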

            
