# openai-simple-embeddings
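A wrapper for embeddings services exposed through an OpenAI-compatible API. It works around compatibility problems that `langchain_community.vectorstores` runs into when used with OpenAI-compatible API services backed by models such as bge-m3 and bge-reranker-v2-m3.
## Installation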
```shell
pip install openai-simple-embeddings
```
## Usage
### Environment variables
```shell
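# OpenAI-compatible API service; it can be served by xinference, for example
# Set OPENAI_EMBEDDINGS_BASE_URL or EMBEDDINGS_BASE_URL to use a separate service address for embeddings
export OPENAI_BASE_URL="http://localhost/v1"
# API key for the OpenAI-compatible service; it usually starts with sk- and is 16 characters long
# Set OPENAI_EMBEDDINGS_API_KEY or EMBEDDINGS_API_KEY to use a separate key for embeddings
export OPENAI_API_KEY=""
# Default text embedding model
export OPENAI_EMBEDDINGS_MODEL="bge-m3"
# Vector database (redis-stack is used as the example)
export REDIS_STACK_URL="redis://localhost:6379/0"
# Maximum string length
export OPENAI_EMBEDDINGS_MAX_SIZE=1024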
```
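The comments above mention embeddings-specific variables that can override the shared ones. The following is a minimal sketch of how such a lookup could work, not the package's actual implementation. The helper `resolve_setting` is hypothetical, and the precedence order (embeddings-specific names first, shared name last) is an assumption.

```python
import os


def resolve_setting(names, environ=None, default=None):
    """Return the first non-empty value found among the given variable names."""
    environ = os.environ if environ is None else environ
    for name in names:
        value = environ.get(name)
        if value:
            return value
    return default


# Assumed precedence: embeddings-specific names first, shared name last.
BASE_URL_NAMES = (
    "OPENAI_EMBEDDINGS_BASE_URL",
    "EMBEDDINGS_BASE_URL",
    "OPENAI_BASE_URL",
)
API_KEY_NAMES = (
    "OPENAI_EMBEDDINGS_API_KEY",
    "EMBEDDINGS_API_KEY",
    "OPENAI_API_KEY",
)

base_url = resolve_setting(BASE_URL_NAMES)
api_key = resolve_setting(API_KEY_NAMES, default="")
```

With only `OPENAI_BASE_URL` exported, `base_url` takes that value. Exporting `EMBEDDINGS_BASE_URL` as well would, under the assumed order, make the embeddings calls use the separate address.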
### Getting text embeddings
*Code*
```python
from openai_simple_embeddings.base import get_text_embeddings
r1 = get_text_embeddings("hello")
print(r1)
```
*Output*
```txt
[-0.032024841755628586, 0.023251207545399666, ..., -0.037223849445581436, 0.05963246524333954]
```
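For orientation, an OpenAI-compatible service typically serves embeddings from a `POST {base_url}/embeddings` endpoint that takes `model` and `input` and returns the vector under `data[0].embedding`. The sketch below builds such a request with the standard library only. It illustrates the wire format and is not how `get_text_embeddings` is implemented. The helper names `build_embeddings_request` and `parse_embeddings_response` are hypothetical.

```python
import json
import urllib.request


def build_embeddings_request(base_url, api_key, model, text):
    """Build a POST request for the /embeddings endpoint of an OpenAI-compatible service."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embeddings",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def parse_embeddings_response(body):
    """Extract the first embedding vector from a JSON response body."""
    return json.loads(body)["data"][0]["embedding"]


# Building the request does not touch the network.
request = build_embeddings_request("http://localhost/v1", "sk-example", "bge-m3", "hello")
print(request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` and passing the response body to `parse_embeddings_response` would yield a list of floats like the output shown above, assuming the service follows the usual response layout.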
### Integrating with a vector database client
*Code*
```python
from openai_simple_embeddings.langchain_embeddings import OpenAISimpleEmbeddings
from langchain_community.vectorstores.redis import Redis as LangchainRedisVectorStore
import python_environment_settings
REDIS_STACK_URL = python_environment_settings.get("REDIS_STACK_URL")
index_name = "kb:test"
embeddings = OpenAISimpleEmbeddings()
lrvs = LangchainRedisVectorStore(
    redis_url=REDIS_STACK_URL,
    index_name=index_name,
    key_prefix=index_name,
    embedding=embeddings,
)
uids = lrvs.add_texts(["hello"])
print(uids)
```
*Output*
```txt
['kb:test:984af7f2ffea4d49952af82dd992c8f8']
```
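LangChain vector stores generally call two methods on the object passed as `embedding`: `embed_documents` for a list of texts and `embed_query` for a single text. The sketch below shows that contract with a duck-typed stand-in. The class `CallableEmbeddings` and the toy function `fake_embed` are hypothetical and only illustrate the shape of the interface; they are not the source of `OpenAISimpleEmbeddings`.

```python
from typing import Callable, List


class CallableEmbeddings:
    """Duck-typed stand-in exposing the two methods a vector store calls."""

    def __init__(self, embed_one: Callable[[str], List[float]]):
        # embed_one maps a single text to a single vector.
        self.embed_one = embed_one

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # One vector per input text, in the same order.
        return [self.embed_one(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        return self.embed_one(text)


def fake_embed(text: str) -> List[float]:
    # Toy 2-dimensional "embedding" for demonstration only.
    return [float(len(text)), float(text.count("l"))]


demo = CallableEmbeddings(fake_embed)
print(demo.embed_documents(["hello", "hi"]))
print(demo.embed_query("hello"))
```

In a real setup, `embed_one` would be replaced by a call that returns the model's vector, for example `get_text_embeddings` from the earlier section.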
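## About string length control
- Models themselves generally do not enforce a limit on string length.
- However, overly long strings increase the memory used by the model.
- So the string length is limited to 1024 characters by default.
- Set `OPENAI_EMBEDDINGS_MAX_SIZE` to change the default maximum string length.
- The maximum string length can also be specified in the function call.
- Note: any string longer than the maximum length is truncated.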
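
The truncation rule described in this list can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the function `truncate_text` and its `max_size` parameter name are hypothetical, and length is counted here in Python characters (code points), which may differ from how the package counts.

```python
import os

DEFAULT_MAX_SIZE = 1024


def truncate_text(text, max_size=None, environ=None):
    """Cut text to at most max_size characters.

    When max_size is not given, fall back to OPENAI_EMBEDDINGS_MAX_SIZE,
    then to the default of 1024.
    """
    environ = os.environ if environ is None else environ
    if max_size is None:
        max_size = int(environ.get("OPENAI_EMBEDDINGS_MAX_SIZE", DEFAULT_MAX_SIZE))
    return text[:max_size]


# An explicit per-call limit takes priority over the environment default.
print(truncate_text("hello world", max_size=5))
```

Because truncation silently drops the tail of the text, long documents are usually better split into chunks before embedding.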
## Release notes
### v0.1.0
- Initial release.
### v0.1.1
- Allow the embeddings model to use its own service address and API key.
Raw data
{
"_id": null,
"home_page": null,
"name": "openai-simple-embeddings",
"maintainer": "rRR0VrFP",
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "openai-simple-embeddings",
"author": "rRR0VrFP",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/8f/73/867446468373926ddf1ec2b73e21e39a22d9703227fb41dd812c0eddb5cd/openai-simple-embeddings-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache License, Version 2.0",
"summary": "\u57fa\u4e8eOPENAI\u517c\u5bb9API\u63a5\u53e3\u7684embeddings\u670d\u52a1\u5c01\u88c5\uff0c\u4ee5\u89e3\u51b3langchain_community.vectorstores\u5728\u4f7f\u7528bge-m3/bge-reranker-v2-m3\u7b49\u6a21\u578b\u63d0\u4f9b\u7684OPENAI\u517c\u5bb9API\u63a5\u53e3\u670d\u52a1\u65f6\u9047\u5230\u7684\u517c\u5bb9\u6027\u95ee\u9898\u3002",
"version": "0.1.1",
"project_urls": null,
"split_keywords": [
"openai-simple-embeddings"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "047e0da56855288371b72de0e9a90a1b0b4d648fcdeaa9afc1b740ad139f5131",
"md5": "b56e93a60a4d43410bf5ec9daefe7634",
"sha256": "d4dc5eb51561966cf82ed363f29c1447cae3ac69c33d343fac80db76fc1a5a34"
},
"downloads": -1,
"filename": "openai_simple_embeddings-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b56e93a60a4d43410bf5ec9daefe7634",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 9859,
"upload_time": "2025-01-15T09:50:40",
"upload_time_iso_8601": "2025-01-15T09:50:40.158238Z",
"url": "https://files.pythonhosted.org/packages/04/7e/0da56855288371b72de0e9a90a1b0b4d648fcdeaa9afc1b740ad139f5131/openai_simple_embeddings-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "8f73867446468373926ddf1ec2b73e21e39a22d9703227fb41dd812c0eddb5cd",
"md5": "3d2d3c73f3fb9e67d903db7e098f1792",
"sha256": "7b22340a55ef466380468119dd58fbf6e21c4f305249ff770b0a2e5b503b4e23"
},
"downloads": -1,
"filename": "openai-simple-embeddings-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "3d2d3c73f3fb9e67d903db7e098f1792",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 8694,
"upload_time": "2025-01-15T09:50:42",
"upload_time_iso_8601": "2025-01-15T09:50:42.676960Z",
"url": "https://files.pythonhosted.org/packages/8f/73/867446468373926ddf1ec2b73e21e39a22d9703227fb41dd812c0eddb5cd/openai-simple-embeddings-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-15 09:50:42",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "openai-simple-embeddings"
}