# MLX Omni Server
[![image](https://img.shields.io/pypi/v/mlx-omni-server.svg)](https://pypi.python.org/pypi/mlx-omni-server)
![MLX Omni Server banner](docs/banner.png)
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements
OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.
## Features
- 🚀 **Apple Silicon Optimized**: Built on MLX framework, optimized for M1/M2/M3/M4 series chips
- 🔌 **OpenAI API Compatible**: Drop-in replacement for OpenAI API endpoints
- 🎯 **Multiple AI Capabilities**:
- Audio Processing (TTS & STT)
- Chat Completion
- Image Generation
- ⚡ **High Performance**: Local inference with hardware acceleration
- 🔐 **Privacy-First**: All processing happens locally on your machine
- 🛠 **SDK Support**: Works with official OpenAI SDK and other compatible clients
## Supported API Endpoints
The server implements the following OpenAI-compatible endpoints (a minimal request sketch follows the list):
- [Chat completions](https://platform.openai.com/docs/api-reference/chat): `/v1/chat/completions`
- ✅ Chat
- ✅ Tools, Function Calling
- ✅ Structured Output
- ✅ LogProbs
- 🚧 Vision
- [Audio](https://platform.openai.com/docs/api-reference/audio)
- ✅ `/v1/audio/speech` - Text-to-Speech
- ✅ `/v1/audio/transcriptions` - Speech-to-Text
- [Models](https://platform.openai.com/docs/api-reference/models/list)
- ✅ `/v1/models` - List models
- ✅ `/v1/models/{model}` - Retrieve or Delete model
- [Images](https://platform.openai.com/docs/api-reference/images)
- ✅ `/v1/images/generations` - Image generation
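Because these endpoints follow the OpenAI wire format, any HTTP client can talk to them directly once the server is installed and started (see Quick Start below). The sketch below is a minimal illustration of a raw chat completion request; it assumes the server is running on the default port 10240 and that the example model is available locally.

```python
# Minimal raw-HTTP sketch against the local chat completions endpoint.
# Assumptions: the server is running on localhost:10240 (the default port)
# and the model identifier below is available to the server.
import json
import urllib.request

payload = {
    "model": "meta-llama/Llama-3.2-3B-Instruct",  # example model from the Quick Start
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

request = urllib.request.Request(
    "http://localhost:10240/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

with urllib.request.urlopen(request) as response:
    body = json.loads(response.read().decode("utf-8"))
    print(body["choices"][0]["message"]["content"])
```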
## Installation
```bash
# Install using pip
pip install mlx-omni-server
```
## Quick Start
1. Start the server:
```bash
# If installed via pip as a package
mlx-omni-server
```
You can use `--port` to specify the port, for example `mlx-omni-server --port 10240`; the default port is 10240.
You can view more startup parameters by using `mlx-omni-server --help`.
2. Use with OpenAI SDK:
```python
from openai import OpenAI
# Configure client to use local server
client = OpenAI(
base_url="http://localhost:10240/v1", # Point to local server
api_key="not-needed" # API key is not required for local server
)
# Text-to-Speech Example
response = client.audio.speech.create(
model="lucasnewman/f5-tts-mlx",
input="Hello, welcome to MLX Omni Server!"
)
# Speech-to-Text Example
audio_file = open("speech.mp3", "rb")
transcript = client.audio.transcriptions.create(
model="mlx-community/whisper-large-v3-turbo",
file=audio_file
)
# Chat Completion Example
chat_completion = client.chat.completions.create(
model="meta-llama/Llama-3.2-3B-Instruct",
messages=[
{"role": "user", "content": "What can you do?"}
]
)
# Image Generation Example
image_response = client.images.generate(
model="argmaxinc/mlx-FLUX.1-schnell",
prompt="A serene landscape with mountains and a lake",
n=1,
size="512x512"
)
```
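The endpoint checklist above also marks model listing and tool calling as supported. The sketch below reuses the `client` configured earlier and shows how those could look through the OpenAI SDK; the `get_weather` tool schema is a hypothetical example for illustration, not part of this project.

```python
# List the models the server knows about (backed by /v1/models).
for model in client.models.list():
    print(model.id)

# Function-calling sketch: the tool definition here is a hypothetical example.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

# If the model decided to call the tool, inspect the structured call it produced.
tool_calls = completion.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)
```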
You can view more examples in [examples](examples).
## Contributing
We welcome contributions! If you're interested in contributing to MLX Omni Server, please check out our [Development Guide](docs/development_guide.md)
for detailed information about:
- Setting up the development environment
- Running the server in development mode
- Contributing guidelines
- Testing and documentation
For major changes, please open an issue first to discuss what you would like to change.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Built with [MLX](https://github.com/ml-explore/mlx) by Apple
- API design inspired by [OpenAI](https://openai.com)
- Uses [FastAPI](https://fastapi.tiangolo.com/) for the server implementation
- Chat (text generation) by [mlx-lm](https://github.com/ml-explore/mlx-examples/tree/main/llms/mlx_lm)
- Image generation by [DiffusionKit](https://github.com/argmaxinc/DiffusionKit)
- Text-to-Speech by [lucasnewman/f5-tts-mlx](https://github.com/lucasnewman/f5-tts-mlx)
- Speech-to-Text by [mlx-whisper](https://github.com/ml-explore/mlx-examples/blob/main/whisper/README.md)
## Disclaimer
This project is not affiliated with or endorsed by OpenAI or Apple. It's an independent implementation that provides OpenAI-compatible APIs using
Apple's MLX framework.
## Star History 🌟
[![Star History Chart](https://api.star-history.com/svg?repos=madroidmaq/mlx-omni-server&type=Date)](https://star-history.com/#madroidmaq/mlx-omni-server&Date)