<div align="center">
<img src="./assets/xorbits-logo.png" width="180px" alt="xorbits" />
# Xorbits Inference: Model Serving Made Easy 🤖
<p align="center">
<a href="https://xinference.io/en">Xinference Enterprise</a> ยท
<a href="https://inference.readthedocs.io/en/latest/getting_started/installation.html#installation">Self-hosting</a> ยท
<a href="https://inference.readthedocs.io/">Documentation</a>
</p>
[](https://pypi.org/project/xinference/)
[](https://github.com/xorbitsai/inference/blob/main/LICENSE)
[](https://actions-badge.atrox.dev/xorbitsai/inference/goto?ref=main)
[](https://discord.gg/Xw9tszSkr5)
[](https://twitter.com/xorbitsio)
<p align="center">
<a href="./README.md"><img alt="README in English" src="https://img.shields.io/badge/English-454545?style=for-the-badge"></a>
<a href="./README_zh_CN.md"><img alt="็ฎไฝไธญๆ็่ช่ฟฐๆไปถ" src="https://img.shields.io/badge/ไธญๆไป็ป-d9d9d9?style=for-the-badge"></a>
<a href="./README_ja_JP.md"><img alt="ๆฅๆฌ่ชใฎREADME" src="https://img.shields.io/badge/ๆฅๆฌ่ช-d9d9d9?style=for-the-badge"></a>
</p>
</div>
<br />
Xorbits Inference (Xinference) is a powerful and versatile library designed to serve language,
speech recognition, and multimodal models. With Xorbits Inference, you can effortlessly deploy
and serve your own or state-of-the-art built-in models using just a single command. Whether you are a
researcher, developer, or data scientist, Xorbits Inference empowers you to unleash the full
potential of cutting-edge AI models.
<div align="center">
<i><a href="https://discord.gg/Xw9tszSkr5">👉 Join our Discord community!</a></i>
</div>
## 🔥 Hot Topics
### Framework Enhancements
- [Xllamacpp](https://github.com/xorbitsai/xllamacpp): new llama.cpp Python binding maintained by the Xinference team; supports continuous batching and is more production-ready: [#2997](https://github.com/xorbitsai/inference/pull/2997)
- Distributed inference: running models across workers: [#2877](https://github.com/xorbitsai/inference/pull/2877)
- vLLM enhancement: shared KV cache across multiple replicas: [#2732](https://github.com/xorbitsai/inference/pull/2732)
- Support continuous batching for the Transformers engine: [#1724](https://github.com/xorbitsai/inference/pull/1724)
- Support MLX backend for Apple Silicon chips: [#1765](https://github.com/xorbitsai/inference/pull/1765)
- Support specifying worker and GPU indexes for launching models: [#1195](https://github.com/xorbitsai/inference/pull/1195)
- Support SGLang backend: [#1161](https://github.com/xorbitsai/inference/pull/1161)
- Support LoRA for LLM and image models: [#1080](https://github.com/xorbitsai/inference/pull/1080)
### New Models
- Built-in support for [ERNIE 4.5](https://yiyan.baidu.com/blog/posts/ernie4.5/): [#3812](https://github.com/xorbitsai/inference/pull/3812)
- Built-in support for [GLM-4.1V-Thinking](https://github.com/THUDM/GLM-4.1V-Thinking/tree/main): [#3756](https://github.com/xorbitsai/inference/pull/3756)
- Built-in support for [jina-embeddings-v4](https://jina.ai/news/jina-embeddings-v4-universal-embeddings-for-multimodal-multilingual-retrieval/): [#3814](https://github.com/xorbitsai/inference/pull/3814)
- Built-in support for [FLUX.1-Kontext-dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev): [#3728](https://github.com/xorbitsai/inference/pull/3728)
- Built-in support for [Qwen3-Embedding](https://github.com/QwenLM/Qwen3-Embedding): [#3627](https://github.com/xorbitsai/inference/pull/3627)
- Built-in support for [Minicpm4](https://github.com/OpenBMB/MiniCPM): [#3609](https://github.com/xorbitsai/inference/pull/3609)
- Built-in support for [CogView4](https://github.com/THUDM/CogView4): [#3557](https://github.com/xorbitsai/inference/pull/3557)
- Built-in support for [Deepseek-R1-0528](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528): [#3539](https://github.com/xorbitsai/inference/pull/3539)
### Integrations
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
- [FastGPT](https://github.com/labring/FastGPT): a knowledge-based platform built on LLMs that offers out-of-the-box data processing and model invocation capabilities, and allows workflow orchestration through Flow visualization.
- [RAGFlow](https://github.com/infiniflow/ragflow): an open-source RAG engine based on deep document understanding.
- [MaxKB](https://github.com/1Panel-dev/MaxKB): MaxKB = Max Knowledge Base; a chatbot based on large language models (LLMs) and retrieval-augmented generation (RAG).
- [Chatbox](https://chatboxai.app/): a desktop client for multiple cutting-edge LLMs, available on Windows, Mac, and Linux.
## Key Features
🌟 **Model Serving Made Easy**: Simplify the process of serving large language, speech
recognition, and multimodal models. You can set up and deploy your models
for experimentation and production with a single command.
⚡️ **State-of-the-Art Models**: Experiment with cutting-edge built-in models using a single
command. Inference provides access to state-of-the-art open-source models!
🖥 **Heterogeneous Hardware Utilization**: Make the most of your hardware resources with
[ggml](https://github.com/ggerganov/ggml). Xorbits Inference intelligently utilizes heterogeneous
hardware, including GPUs and CPUs, to accelerate your model inference tasks.
⚙️ **Flexible API and Interfaces**: Offer multiple interfaces for interacting
with your models, supporting an OpenAI-compatible RESTful API (including Function Calling API), RPC, CLI
and WebUI for seamless model management and interaction.
🌐 **Distributed Deployment**: Excel in distributed deployment scenarios,
allowing the seamless distribution of model inference across multiple devices or machines.
🔌 **Built-in Integration with Third-Party Libraries**: Xorbits Inference seamlessly integrates
with popular third-party libraries including [LangChain](https://python.langchain.com/docs/integrations/providers/xinference), [LlamaIndex](https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window), [Dify](https://docs.dify.ai/advanced/model-configuration/xinference), and [Chatbox](https://chatboxai.app/).
## Why Xinference
| Feature                                         | Xinference | FastChat | OpenLLM | RayLLM |
|-------------------------------------------------|------------|----------|---------|--------|
| OpenAI-Compatible RESTful API                   | ✅         | ✅       | ✅      | ✅     |
| vLLM Integrations                               | ✅         | ✅       | ✅      | ✅     |
| More Inference Engines (GGML, TensorRT)         | ✅         | ❌       | ✅      | ✅     |
| More Platforms (CPU, Metal)                     | ✅         | ✅       | ❌      | ❌     |
| Multi-node Cluster Deployment                   | ✅         | ❌       | ❌      | ✅     |
| Image Models (Text-to-Image)                    | ✅         | ✅       | ❌      | ❌     |
| Text Embedding Models                           | ✅         | ❌       | ❌      | ❌     |
| Multimodal Models                               | ✅         | ❌       | ❌      | ❌     |
| Audio Models                                    | ✅         | ❌       | ❌      | ❌     |
| More OpenAI Functionalities (Function Calling)  | ✅         | ❌       | ❌      | ❌     |
## Using Xinference
- **Cloud<br />**
We host a [Xinference Cloud](https://inference.top) service for anyone to try with zero setup.
- **Self-hosting Xinference Community Edition<br />**
Quickly get Xinference running in your environment with this [starter guide](#getting-started).
Use our [documentation](https://inference.readthedocs.io/) for further references and more in-depth instructions.
- **Xinference for enterprise / organizations<br />**
We provide additional enterprise-centric features. [Send us an email](mailto:business@xprobe.io?subject=[GitHub]Business%20License%20Inquiry) to discuss enterprise needs.
## Staying Ahead
Star Xinference on GitHub and be instantly notified of new releases.
## Getting Started
* [Docs](https://inference.readthedocs.io/en/latest/index.html)
* [Built-in Models](https://inference.readthedocs.io/en/latest/models/builtin/index.html)
* [Custom Models](https://inference.readthedocs.io/en/latest/models/custom.html)
* [Deployment Docs](https://inference.readthedocs.io/en/latest/getting_started/using_xinference.html)
* [Examples and Tutorials](https://inference.readthedocs.io/en/latest/examples/index.html)
### Jupyter Notebook
The lightest way to experience Xinference is to try our [Jupyter Notebook on Google Colab](https://colab.research.google.com/github/xorbitsai/inference/blob/main/examples/Xinference_Quick_Start.ipynb).
### Docker
Nvidia GPU users can start an Xinference server using the [Xinference Docker Image](https://inference.readthedocs.io/en/latest/getting_started/using_docker_image.html). Prior to executing the installation command, ensure that both [Docker](https://docs.docker.com/get-docker/) and [CUDA](https://developer.nvidia.com/cuda-downloads) are set up on your system.
```bash
docker run --name xinference -d -p 9997:9997 -e XINFERENCE_HOME=/data -v </on/your/host>:/data --gpus all xprobe/xinference:latest xinference-local -H 0.0.0.0
```
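Once the container is up, you can confirm that the server is reachable before moving on. A minimal sketch, assuming the default port mapping from the command above; `/v1/models` is the standard model-listing route of the OpenAI-compatible API:

```bash
# Follow the container logs until Xinference reports that it is ready
docker logs -f xinference

# List the models currently available through the OpenAI-compatible API
curl http://localhost:9997/v1/models
```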
### K8s via Helm
Ensure that you have GPU support in your Kubernetes cluster, then install as follows.
```bash
# add repo
helm repo add xinference https://xorbitsai.github.io/xinference-helm-charts
# update indexes and query xinference versions
helm repo update xinference
helm search repo xinference/xinference --devel --versions
# install xinference
helm install xinference xinference/xinference -n xinference --version 0.0.1-v<xinference_release_version>
```
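A quick way to check the release is to inspect the pods and probe the API with standard kubectl commands. This is only a sketch; the service name `xinference` below is an assumption based on the release name used above:

```bash
# Check that the Xinference pods are running
kubectl get pods -n xinference

# Forward the service locally (service name assumed) and probe the API
kubectl port-forward -n xinference svc/xinference 9997:9997 &
curl http://localhost:9997/v1/models
```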
For more customized installation methods on K8s, please refer to the [documentation](https://inference.readthedocs.io/en/latest/getting_started/using_kubernetes.html).
### Quick Start
Install Xinference with pip as follows. (For more options, see the [Installation page](https://inference.readthedocs.io/en/latest/getting_started/installation.html).)
```bash
pip install "xinference[all]"
```
To start a local instance of Xinference, run the following command:
```bash
$ xinference-local
```
Once Xinference is running, there are multiple ways you can try it: via the web UI, via cURL,
via the command line, or via Xinference's Python client. Check out our [docs](https://inference.readthedocs.io/en/latest/getting_started/using_xinference.html#run-xinference-locally) for the guide.
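For example, after launching a model from the web UI (served on the same port by default) or the CLI, the server can be queried through its OpenAI-compatible REST API with cURL. A minimal sketch, assuming the default port 9997; the model name `qwen2.5-instruct` is only an illustration and must match a model you have actually launched:

```bash
# List the models that have been launched
curl http://localhost:9997/v1/models

# Send a chat request through the OpenAI-compatible endpoint
curl http://localhost:9997/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen2.5-instruct",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```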
## Getting involved
| Platform | Purpose |
|-------------------------------------------------------------------------------------------------|---------------------------------------------|
| [GitHub Issues](https://github.com/xorbitsai/inference/issues) | Reporting bugs and filing feature requests. |
| [Discord](https://discord.gg/Xw9tszSkr5) | Collaborating with other Xinference users. |
| [Twitter](https://twitter.com/xorbitsio) | Staying up-to-date on new features. |
## Citation
If this work is helpful, please cite it as follows:
```bibtex
@inproceedings{lu2024xinference,
    title     = "Xinference: Making Large Model Serving Easy",
    author    = "Lu, Weizheng and Xiong, Lingfeng and Zhang, Feng and Qin, Xuye and Chen, Yueguo",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month     = nov,
    year      = "2024",
    address   = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url       = "https://aclanthology.org/2024.emnlp-demo.30",
    pages     = "291--300",
}
```
## Contributors
<a href="https://github.com/xorbitsai/inference/graphs/contributors">
<img src="https://contrib.rocks/image?repo=xorbitsai/inference" />
</a>
## Star History
[](https://star-history.com/#xorbitsai/inference&Date)