mlx-omni-server 0.4.9 (PyPI)

- Summary: MLX Omni Server is a server that provides OpenAI-compatible APIs using Apple's MLX framework.
- Requires Python: >=3.11
- Upload time: 2025-08-20 01:48:36
- Keywords: agi, ai, aigc, mlx, openai, server, stt, tts
# MLX Omni Server

[![image](https://img.shields.io/pypi/v/mlx-omni-server.svg)](https://pypi.python.org/pypi/mlx-omni-server)
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/madroidmaq/mlx-omni-server)

![MLX Omni Server banner](docs/banner.png)

MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements
OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.

## Features

- 🚀 **Apple Silicon Optimized**: Built on MLX framework, optimized for M1/M2/M3/M4 series chips
- 🔌 **OpenAI API Compatible**: Drop-in replacement for OpenAI API endpoints
- 🎯 **Multiple AI Capabilities**:
    - Audio Processing (TTS & STT)
    - Chat Completion
    - Image Generation
- ⚡ **High Performance**: Local inference with hardware acceleration
- 🔐 **Privacy-First**: All processing happens locally on your machine
- 🛠 **SDK Support**: Works with official OpenAI SDK and other compatible clients

## Supported API Endpoints

The server implements OpenAI-compatible endpoints:

- [Chat completions](https://platform.openai.com/docs/api-reference/chat): `/v1/chat/completions`
    - ✅ Chat
    - ✅ Tools, Function Calling (see the example after this list)
    - ✅ Structured Output
    - ✅ LogProbs
    - 🚧 Vision
- [Audio](https://platform.openai.com/docs/api-reference/audio)
    - ✅ `/v1/audio/speech` - Text-to-Speech
    - ✅ `/v1/audio/transcriptions` - Speech-to-Text
- [Models](https://platform.openai.com/docs/api-reference/models/list)
    - ✅ `/v1/models` - List models
    - ✅ `/v1/models/{model}` - Retrieve or Delete model
- [Images](https://platform.openai.com/docs/api-reference/images)
    - ✅ `/v1/images/generations` - Image generation
- [Embeddings](https://platform.openai.com/docs/api-reference/embeddings)
    - ✅ `/v1/embeddings` - Create embeddings for text
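
For example, the Tools / Function Calling support can be exercised through the OpenAI SDK's standard `tools` parameter. Below is a minimal sketch, assuming a model with tool-calling support; the `get_weather` schema is hypothetical and only illustrates the request shape:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:10240/v1", api_key="not-needed")

# Hypothetical tool definition; swap in your own function schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"]
        }
    }
}]

response = client.chat.completions.create(
    model="mlx-community/Llama-3.2-3B-Instruct-4bit",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

# When the model decides to call a tool, the call arrives as structured
# data rather than plain text content.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```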



## Quick Start

Follow these simple steps to get started with MLX Omni Server:

1. Install the package

```bash
pip install mlx-omni-server
```

2. Start the server

```bash
mlx-omni-server
```

3. Run a simple chat example using curl

```bash
curl http://localhost:10240/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mlx-community/gemma-3-1b-it-4bit-DWQ",
    "messages": [
      {
        "role": "user",
        "content": "What can you do?"
      }
    ]
  }'
```

That's it! You're now running AI locally on your Mac. See [Advanced Usage](#advanced-usage) for more examples.

### Server Options

```bash
# Start with default settings (port 10240)
mlx-omni-server

# Or specify a custom port
mlx-omni-server --port 8000

# View all available options
mlx-omni-server --help
```

### Basic Client Setup

```python
from openai import OpenAI

# Connect to your local server
client = OpenAI(
    base_url="http://localhost:10240/v1",  # Point to local server
    api_key="not-needed"                   # API key not required
)

# Make a simple chat request
response = client.chat.completions.create(
    model="mlx-community/gemma-3-1b-it-4bit-DWQ",
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)
print(response.choices[0].message.content)
```

## Advanced Usage

MLX Omni Server supports several interaction methods and AI capabilities. Here's how to use each:

### API Usage Options

You can reach the server either through plain HTTP requests or through the OpenAI SDK:

#### REST API

Access the server directly using HTTP requests:

```bash
# Chat completions endpoint
curl http://localhost:10240/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mlx-community/gemma-3-1b-it-4bit-DWQ",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

# Get available models
curl http://localhost:10240/v1/models
```

#### OpenAI SDK

Use the official OpenAI Python SDK for seamless integration:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:10240/v1",  # Point to local server
    api_key="not-needed"                   # API key not required for local server
)
```

See the FAQ section for information on using TestClient for development.



### API Examples

#### Chat Completion

```python
response = client.chat.completions.create(
    model="mlx-community/Llama-3.2-3B-Instruct-4bit",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    temperature=0,
    stream=True  # stream the response chunk by chunk
)

for chunk in response:
    # The final chunk's delta.content may be None, so guard against it
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

<details>
<summary>Curl Example</summary>

```shell
curl http://localhost:10240/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mlx-community/Llama-3.2-3B-Instruct-4bit",
    "stream": true,
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
```

</details>
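
The Structured Output support listed under the chat endpoint uses the same request shape. Here is a minimal sketch, assuming the server follows OpenAI's `response_format` JSON-schema convention; the schema below is hypothetical, and exact schema support may vary by model:

```python
import json

response = client.chat.completions.create(
    model="mlx-community/Llama-3.2-3B-Instruct-4bit",
    messages=[{"role": "user", "content": "Name a city and its country."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "city_info",  # hypothetical schema name
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"}
                },
                "required": ["city", "country"]
            }
        }
    }
)

# The message content should now parse as JSON matching the schema.
print(json.loads(response.choices[0].message.content))
```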

#### Text-to-Speech

```python
speech_file_path = "mlx_example.wav"
response = client.audio.speech.create(
    model="lucasnewman/f5-tts-mlx",
    voice="alloy",  # the voice parameter is not applied for now
    input="MLX project is awesome.",
)
response.stream_to_file(speech_file_path)
```


<details>
<summary>Curl Example</summary>

```shell
curl -X POST "http://localhost:10240/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "lucasnewman/f5-tts-mlx",
    "input": "MLX project is awsome",
    "voice": "alloy"
  }' \
  --output ~/Desktop/mlx.wav
```

</details>

#### Speech-to-Text

```python
# Use a context manager so the file handle is closed after the request
with open("speech.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="mlx-community/whisper-large-v3-turbo",
        file=audio_file
    )

print(transcript.text)
```

<details>
<summary>Curl Example</summary>

```shell
curl -X POST "http://localhost:10240/v1/audio/transcriptions" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@mlx_example.wav" \
  -F "model=mlx-community/whisper-large-v3-turbo"
```

Response:

```json
{
  "text": " MLX Project is awesome!"
}
```

</details>


#### Image Generation

```python
image_response = client.images.generate(
    model="argmaxinc/mlx-FLUX.1-schnell",
    prompt="A serene landscape with mountains and a lake",
    n=1,
    size="512x512"
)
```
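
The result comes back in `image_response.data`. Which field is populated depends on the response format, so the sketch below hedges on both; treat the field handling as an assumption to verify against your server's actual response:

```python
import base64

image = image_response.data[0]
if image.b64_json:  # assumption: base64-encoded image bytes
    with open("landscape.png", "wb") as f:
        f.write(base64.b64decode(image.b64_json))
elif image.url:  # assumption: a URL pointing at the generated image
    print("Image available at:", image.url)
```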

<details>
<summary>Curl Example</summary>

```shell
curl http://localhost:10240/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "argmaxinc/mlx-FLUX.1-schnell",
    "prompt": "A cute baby sea otter",
    "n": 1,
    "size": "1024x1024"
  }'
```

</details>

#### Embeddings

```python
# Generate embedding for a single text
response = client.embeddings.create(
    model="mlx-community/all-MiniLM-L6-v2-4bit", input="I like reading"
)

# Examine the response structure
print(f"Response type: {type(response)}")
print(f"Model used: {response.model}")
print(f"Embedding dimension: {len(response.data[0].embedding)}")
```

<details>
<summary>Curl Example</summary>

```shell
curl http://localhost:10240/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mlx-community/all-MiniLM-L6-v2-4bit",
    "input": ["Hello world!", "Embeddings are useful for semantic search."]
  }'
```

</details>
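
Embeddings are mainly useful for comparing texts. As a usage sketch, the helper below (illustrative, not part of the server) computes cosine similarity between two embeddings returned by the endpoint above:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

response = client.embeddings.create(
    model="mlx-community/all-MiniLM-L6-v2-4bit",
    input=["I like reading", "Reading is my hobby"],
)
vec_a = response.data[0].embedding
vec_b = response.data[1].embedding
print(f"Similarity: {cosine_similarity(vec_a, vec_b):.3f}")
```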


For more detailed examples, check out the [examples](examples) directory.

## FAQ


### How are models managed?

MLX Omni Server uses Hugging Face for model downloading and management. When you specify a model ID that hasn't been downloaded yet, the framework will automatically download it. However, since download times can vary significantly:

- It's recommended to pre-download models through Hugging Face before using them in your service
- To use a locally downloaded model, simply set the `model` parameter to the local model path

```python
# Using a model from Hugging Face
response = client.chat.completions.create(
    model="mlx-community/gemma-3-1b-it-4bit-DWQ",  # Will download if not available
    messages=[{"role": "user", "content": "Hello"}]
)

# Using a local model
response = client.chat.completions.create(
    model="/path/to/your/local/model",  # Local model path
    messages=[{"role": "user", "content": "Hello"}]
)
```

You can also list the models currently available on the machine:

```bash
curl http://localhost:10240/v1/models
```
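
Or, equivalently, through the OpenAI SDK:

```python
# List the model IDs the server currently knows about
for model in client.models.list().data:
    print(model.id)
```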


### How do I specify which model to use?

Use the `model` parameter when creating a request:

```python
response = client.chat.completions.create(
    model="mlx-community/gemma-3-1b-it-4bit-DWQ",  # Specify model here
    messages=[{"role": "user", "content": "Hello"}]
)
```


### Can I use TestClient for development?

Yes, TestClient allows you to use the OpenAI client without starting a local server. This is particularly useful for development and testing scenarios:

```python
from openai import OpenAI
from fastapi.testclient import TestClient
from mlx_omni_server.main import app

# Use TestClient directly - no network service needed
client = OpenAI(
    http_client=TestClient(app),
    api_key="not-needed"  # the SDK still requires an api_key value, even though it is unused
)

# Now you can use the client just like with a running server
response = client.chat.completions.create(
    model="mlx-community/gemma-3-1b-it-4bit-DWQ",
    messages=[{"role": "user", "content": "Hello"}]
)
```

This approach bypasses the HTTP server entirely, making it ideal for unit testing and quick development iterations.


### What if I get errors when starting the server?

- Confirm you're using an Apple Silicon Mac (M1/M2/M3/M4)
- Check that your Python version is 3.11 or higher (the package requires `>=3.11`)
- Verify you have the latest version of mlx-omni-server installed
- Check the log output for more detailed error information
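
A quick shell check of the first three points (assuming `python3` and `pip` point at the environment where the server is installed):

```bash
# Should print "arm64" on Apple Silicon
uname -m

# Needs to be 3.11 or newer
python3 --version

# Show the installed mlx-omni-server version
pip show mlx-omni-server | grep -i version
```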


## Contributing

We welcome contributions! If you're interested in contributing to MLX Omni Server, please check out our [Development Guide](docs/development_guide.md)
for detailed information about:

- Setting up the development environment
- Running the server in development mode
- Contributing guidelines
- Testing and documentation

For major changes, please open an issue first to discuss what you would like to change.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- Built with [MLX](https://github.com/ml-explore/mlx) by Apple
- API design inspired by [OpenAI](https://openai.com)
- Uses [FastAPI](https://fastapi.tiangolo.com/) for the server implementation
- Chat (text generation) by [mlx-lm](https://github.com/ml-explore/mlx-examples/tree/main/llms/mlx_lm)
- Image generation by [mflux](https://github.com/filipstrand/mflux)
- Text-to-Speech by [lucasnewman/f5-tts-mlx](https://github.com/lucasnewman/f5-tts-mlx) & [Blaizzy/mlx-audio](https://github.com/Blaizzy/mlx-audio)
- Speech-to-Text by [mlx-whisper](https://github.com/ml-explore/mlx-examples/blob/main/whisper/README.md)
- Embeddings by [mlx-embeddings](https://github.com/Blaizzy/mlx-embeddings)

## Disclaimer

This project is not affiliated with or endorsed by OpenAI or Apple. It's an independent implementation that provides OpenAI-compatible APIs using
Apple's MLX framework.

## Star History 🌟

[![Star History Chart](https://api.star-history.com/svg?repos=madroidmaq/mlx-omni-server&type=Date)](https://star-history.com/#madroidmaq/mlx-omni-server&Date)