| Field | Value |
|---|---|
| Name | langchain-llm |
| Version | 0.4.15 |
| Summary | langchain llm wrapper |
| Author | xusenlin |
| Upload time | 2024-04-15 01:33:59 |
| Requires Python | <4.0,>=3.8.1 |
| Home page | None |
| Maintainer | None |
| Docs URL | None |
| License | None |
| Keywords | None |
| Requirements | No requirements were recorded. |
# Langchain LLM
## Get Started
### Install
```shell
pip install langchain_llm==0.4.15
```
## Inference Usage
### HuggingFace Inference
**Completion Usage**
```python
from langchain_llm import HuggingFaceLLM
llm = HuggingFaceLLM(
    model_name="qwen-7b-chat",
    model_path="/data/checkpoints/Qwen-7B-Chat",
    load_model_kwargs={"device_map": "auto"},
)
# invoke method
prompt = "<|im_start|>user\n你是谁?<|im_end|>\n<|im_start|>assistant\n"
print(llm.invoke(prompt, stop=["<|im_end|>"]))
# Token Streaming
for chunk in llm.stream(prompt, stop=["<|im_end|>"]):
    print(chunk, end="", flush=True)
# openai usage
print(llm.call_as_openai(prompt, stop=["<|im_end|>"]))
# Streaming
for chunk in llm.call_as_openai(prompt, stop=["<|im_end|>"], stream=True):
    print(chunk.choices[0].text, end="", flush=True)
```
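The prompt above follows Qwen's ChatML format. As a minimal sketch (this helper is illustrative and not part of `langchain_llm`), such a prompt can be assembled from OpenAI-style messages:

```python
# Hypothetical helper: build a ChatML-format prompt like the one used above.
def build_chatml_prompt(messages):
    prompt = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    # The trailing assistant header asks the model to continue as the assistant.
    return prompt + "<|im_start|>assistant\n"

prompt = build_chatml_prompt([{"role": "user", "content": "你是谁?"}])
```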
**Chat Completion Usage**
```python
from langchain_llm import ChatHuggingFace
chat_llm = ChatHuggingFace(llm=llm)
# invoke method
query = "你是谁?"
print(chat_llm.invoke(query))
# Token Streaming
for chunk in chat_llm.stream(query):
    print(chunk.content, end="", flush=True)
# openai usage
messages = [
    {"role": "user", "content": query}
]
print(chat_llm.call_as_openai(messages))
# Streaming
for chunk in chat_llm.call_as_openai(messages, stream=True):
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```
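Since `ChatHuggingFace` exposes LangChain's standard chat-model interface (`invoke`/`stream` above), it should also compose with LangChain prompt templates. A hedged sketch, assuming the installed LangChain version supports LCEL and that `chat_llm` behaves as a Runnable (not confirmed by this README):

```python
from langchain_core.prompts import ChatPromptTemplate

# Assumption: chat_llm is a LangChain Runnable chat model.
prompt_tmpl = ChatPromptTemplate.from_messages([("user", "{question}")])
chain = prompt_tmpl | chat_llm

print(chain.invoke({"question": "你是谁?"}).content)
```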
### VLLM Inference
**Completion Usage**
```python
from langchain_llm import VLLM
llm = VLLM(
    model_name="qwen",
    model="/data/checkpoints/Qwen-7B-Chat",
    trust_remote_code=True,
)
# invoke method
prompt = "<|im_start|>user\n你是谁?<|im_end|>\n<|im_start|>assistant\n"
print(llm.invoke(prompt, stop=["<|im_end|>"]))
# openai usage
print(llm.call_as_openai(prompt, stop=["<|im_end|>"]))
```
**Chat Completion Usage**
```python
from langchain_llm import ChatVLLM
chat_llm = ChatVLLM(llm=llm)
# invoke method
query = "你是谁?"
print(chat_llm.invoke(query))
# openai usage
messages = [
    {"role": "user", "content": query}
]
print(chat_llm.call_as_openai(messages))
```
## Custom Chat Template
```python
from langchain_llm import BaseTemplate, ChatHuggingFace
class CustomTemplate(BaseTemplate):

    @property
    def template(self) -> str:
        return (
            "{% for message in messages %}"
            "{{ '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n' }}"
            "{% endfor %}"
            "{% if add_generation_prompt %}"
            "{{ '<|im_start|>assistant\\n' }}"
            "{% endif %}"
        )

chat_llm = ChatHuggingFace(
    llm=llm,
    prompt_adapter=CustomTemplate()
)
```
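The template string above uses Jinja2 syntax. To preview the prompt it produces, it can be rendered standalone with the `jinja2` package (an assumption here; `langchain_llm` applies the template internally):

```python
# Render the custom template directly to inspect the resulting prompt.
from jinja2 import Template

rendered = Template(CustomTemplate().template).render(
    messages=[{"role": "user", "content": "你是谁?"}],
    add_generation_prompt=True,
)
print(rendered)
# <|im_start|>user
# 你是谁?<|im_end|>
# <|im_start|>assistant
```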
## Load Model Kwargs
+ `model_name_or_path`: model name or local path.
+ `use_fast_tokenizer`: whether to use a fast tokenizer; defaults to `False`.
+ `device_map`: e.g. `"auto"`, `"cuda:0"`.
+ `dtype`: one of `"half"`, `"bfloat16"`, `"float32"`.
+ `load_in_8bit`: load the model with 8-bit quantization.
+ `load_in_4bit`: load the model with 4-bit quantization.
+ `rope_scaling`: which scaling strategy to adopt for the RoPE embeddings; `Literal["linear", "dynamic"]`.
+ `flash_attn`: enable FlashAttention-2.
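A hedged sketch combining several of these options through `load_model_kwargs`, mirroring the earlier `HuggingFaceLLM` example (paths and values are illustrative):

```python
from langchain_llm import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="qwen-7b-chat",
    model_path="/data/checkpoints/Qwen-7B-Chat",
    load_model_kwargs={
        "device_map": "auto",      # spread layers across available devices
        "dtype": "half",           # load weights in fp16
        "load_in_4bit": True,      # 4-bit quantization to cut memory use
        "use_fast_tokenizer": True,
    },
)
```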
## Merge LoRA Model
```python
from langchain_llm import apply_lora
apply_lora("base_model_path", "lora_path", "target_model_path")
```
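After merging, the weights written to `target_model_path` load like an ordinary checkpoint, so the merged model can be served with `HuggingFaceLLM` or `VLLM` without carrying the LoRA adapter at inference time.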