# mlx-gui

**PyPI metadata**

- **Name:** mlx-gui
- **Version:** 1.2.4
- **Summary:** A lightweight RESTful wrapper around Apple's MLX engine for dynamically loading and serving MLX-compatible models
- **Upload time:** 2025-07-22 23:24:26
- **Requires Python:** >=3.11
- **License:** GPL-3.0
- **Keywords:** apple, gui, machine-learning, mlx, rest-api

```
███╗   ███╗██╗     ██╗  ██╗      ██████╗ ██╗   ██╗██╗
████╗ ████║██║     ╚██╗██╔╝     ██╔════╝ ██║   ██║██║
██╔████╔██║██║      ╚███╔╝█████╗██║  ███╗██║   ██║██║
██║╚██╔╝██║██║      ██╔██╗╚════╝██║   ██║██║   ██║██║
██║ ╚═╝ ██║███████╗██╔╝ ██╗     ╚██████╔╝╚██████╔╝██║
╚═╝     ╚═╝╚══════╝╚═╝  ╚═╝      ╚═════╝  ╚═════╝ ╚═╝
```


[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![Apple Silicon](https://img.shields.io/badge/Apple%20Silicon-Required-orange.svg)](https://support.apple.com/en-us/HT211814)
[![MLX Compatible](https://img.shields.io/badge/MLX-Compatible-green.svg)](https://github.com/ml-explore/mlx)

<div align="center">
<img src="media/video.gif" >
</div>

**The Swiss Army Knife of Apple Silicon AI - A lightweight Inference Server for Apple's MLX engine with a GUI.**

>*TL;DR - An OpenRouter-style v1 API for MLX with Ollama-like model management, featuring auto-queuing, on-demand model loading, and multi-user serving via a single Mac app.*

## 📦 Latest Release

### 🎉 v1.2.4 - Universal AI Ecosystem (July 22 2025)

**From Whisper to Embeddings in One API** - 23 embedding models, 99 languages, complete audio/vision/text pipeline. Production-ready, not promises.

<div align="center">
<img src="media/audio.gif" width="500">
</div>

#### 🚀 **NEW: Advanced Audio Intelligence**
- 🎙️ **Complete Whisper Ecosystem** - All variants (Tiny to Large v3) with automatic fallback - never fails!
- 🌍 **99+ Languages** - Auto-detection with no configuration needed
- ⏱️ **Word-Level Timestamps** - Perfect for subtitles, content indexing, and meeting analysis
- 📼 **Universal Format Support** - WAV, MP3, MP4, M4A, FLAC, OGG, WebM - any audio format works
- 🎯 **Parakeet TDT** - Lightning-fast transcription for real-time applications
- 🎨 **Beautiful Audio UI** - Drag-and-drop interface with 11 languages and 5 output formats



#### 🧠 **NEW: Complete Embedding Ecosystem**
- 🌟 **23+ Models, 13 Families** - E5, ModernBERT, Arctic, GTE, BGE, MiniLM, Qwen3, SentenceT5, Jina AI, and more!
- 🔧 **Triple Library Support** - Seamlessly integrates mlx_embedding_models, mlx_embeddings, AND sentence-transformers
- 🧪 **Battle-Tested** - 553 lines of embedding tests + 338 lines of audio tests ensure reliability
- 📏 **Any Dimension** - From efficient 384-dim to powerful 4096-dim embeddings
- 🎯 **Smart Architecture Detection** - Automatically optimizes extraction for each model type
- 🔢 **L2-Normalized Vectors** - Production-ready for similarity search and RAG applications

#### 🤖 **NEW: Mistral Small Integration**
- ✨ **24B Parameter Model** - Full support for Mistral-Small-3.2-24B-Instruct
- 🎨 **Vision-Text Capability** - Advanced multimodal processing via MLX-VLM
- 🧪 **Test Suite Integration** - Comprehensive testing ensuring reliable performance
- 🔧 **Smart Classification** - Automatic detection and proper model type assignment

#### 🛠️ **Technical Excellence**
- 🧪 **900+ Lines of Tests** - Comprehensive test coverage for production reliability
- 🔍 **New Discovery Endpoint** - `/v1/discover/stt` for easy speech-to-text model discovery
- 🎯 **Never-Fail Architecture** - Smart Whisper fallback ensures audio transcription always works
- 📊 **Enhanced Memory Management** - Optimized loading for large embedding and audio models
- 🔄 **Intelligent Queue System** - Handles diverse result types (lists, arrays, dicts) seamlessly
- ⚡ **Performance Optimization** - Faster model switching and concurrent processing

---

## 📚 Previous Releases

<details>
<summary><strong>v1.2.3</strong> - Real-Time Model Status & Model Support (July 19 2025)</summary>

**Key Features:**
- 🚀 Real-time status monitoring with download progress
- 🧪 Built-in API test console with response analytics
- 🎨 15+ new verified models including SmolLM3, Kimi-K2, Gemma-3n
- 🧠 Trillion-parameter model support
- 🔧 Enhanced model type classification

</details>

<details>
<summary><strong>v1.2.0-v1.2.2</strong> - Memory Management & Vision Compatibility</summary>

**Key Features:**
- 🧠 Revolutionary auto-unload system with LRU eviction
- 🖼️ Complete CyberAI image compatibility
- 🔄 Three-layer memory protection
- 📸 Enhanced VLM stability for vision models
- 🛠️ Raw base64 image support

</details>

**Download:** [Latest Release](https://github.com/RamboRogers/mlx-gui/releases/latest)

## Why?

 1. ✅ Why MLX? Llama.cpp and Ollama are great, but they are slower than MLX, a native framework optimized for Apple Silicon. Plus, it's free and open source, and this one has a nice GUI.

 2. ⚡️ I wanted to turn my Mac mini and Mac Studio into useful multi-user inference servers that I don't have to manage.

 3. 🏗️ I just want to build AI things rather than manage inference servers or pay for expensive services, while maintaining sovereignty over my data.



<div align="center">
<table>
<th colspan=2>GUI</th>
<tr><td><img src="media/models.png"></td><td><img src="media/search.png"></td></tr>
<tr><td><img src="media/status.png"></td><td><img src="media/settings.png"></td></tr>
<th colspan=2>Mac Native</th>
<tr><td><img src="media/trayicon.png"></td><td><img src="media/app.png"></td></tr>
</table>
</div>





## 🚀 Features

### 🎯 **Universal AI Capabilities**
- **🧠 MLX Engine Integration** - Native Apple Silicon acceleration via MLX
- **🎙️ Advanced Audio Intelligence** - Complete Whisper & Parakeet support with multi-format processing
- **🔢 Production Embeddings** - Multi-architecture support (BGE, MiniLM, Qwen3, Arctic, E5)
- **🖼️ Vision Models** - Image understanding with Gemma-3n, Qwen2-VL, Mistral Small (enhanced stability)
- **🤖 Large Language Models** - Full support for instruction-tuned and reasoning models

### 🛠️ **Enterprise-Grade Infrastructure**
- **🔄 Intelligent Memory Management** - Advanced auto-unload system with LRU eviction
- **🛡️ Three-Layer Memory Protection** - Proactive cleanup, concurrent limits, emergency recovery
- **⚡ OpenAI Compatibility** - Drop-in replacement for OpenAI API endpoints
- **🌐 REST API Server** - Complete API for model management and inference
- **📊 Real-Time Monitoring** - System status, memory usage, and model performance

### 🎨 **User Experience**
- **🎨 Beautiful Admin Interface** - Modern web GUI for model management
- **🔍 HuggingFace Integration** - Discover and install MLX-compatible models
- **🍎 macOS System Tray** - Native menu bar integration
- **📱 Standalone App** - Packaged macOS app bundle (no Python required)

## 🤖 Tested Models

**Text Generation**
- `qwen3-8b-6bit` - Qwen3 8B quantized model
- `deepseek-r1-0528-qwen3-8b-mlx-8bit` - DeepSeek R1 reasoning model
- `smollm3-3b-4bit` / `smollm3-3b-bf16` - SmolLM3 multilingual models
- `gemma-3-27b-it-qat-4bit` - Google Gemma 3 27B instruction-tuned
- `mistral-small-3-2-24b-instruct-2506-mlx-4bit` - Mistral Small 24B with vision
- `devstral-small-2507-mlx-4bit` - Devstral coding model

**Vision Models**
- `gemma-3n-e4b-it` / `gemma-3n-e4b-it-mlx-8bit` - Gemma 3n vision models
- `mistral-small-3-2-24b-instruct-2506-mlx-4bit` - Multimodal capabilities

**Audio Transcription**
- `whisper-large-v3-turbo` - OpenAI Whisper Turbo for fast transcription
- `parakeet-tdt-0-6b-v2` - Ultra-fast Parakeet speech-to-text

**Text Embeddings**
- `qwen3-embedding-4b-4bit-dwq` - Qwen3 embeddings (2560 dimensions)
- `bge-small-en-v1-5-bf16` - BGE embeddings (384 dimensions)
- `all-minilm-l6-v2-4bit` / `all-minilm-l6-v2-bf16` - MiniLM embeddings
- `snowflake-arctic-embed-l-v2-0-4bit` - Arctic embeddings (1024 dimensions)

## 📋 Requirements

- **macOS** (Apple Silicon M1/M2/M3/M4 required)
- **Python 3.11+** (for development)
- **8GB+ RAM** (16GB+ recommended for larger models)

## 🏃‍♂️ Quick Start

### Option 1: Download Standalone App (Recommended for Mac Users)
1. Download the latest `.app` from [Releases](https://github.com/RamboRogers/mlx-gui/releases)
2. Drag to `/Applications`
3. Launch - no Python installation required!
4. From the menu bar, click the MLX icon to open the admin interface.
5. Discover and install models from HuggingFace.
6. Connect your AI app to the API endpoint.

> *📝 Models may take a few minutes to load. They are gigabytes in size and will download at your internet speed.*

### Option 2: Install from PyPI
```bash
# Install MLX-GUI
pip install mlx-gui

# Launch with system tray
mlx-gui tray

# Or launch server only
mlx-gui start --port 8000
```

### Option 3: Install from Source
```bash
# Clone the repository
git clone https://github.com/RamboRogers/mlx-gui.git
cd mlx-gui

# Install dependencies
pip install -e ".[app]"

# Launch with system tray
mlx-gui tray

# Or launch server only
mlx-gui start --port 8000
```

## 🎮 Usage

### An API Endpoint for [Jan](https://jan.ai) or any other AI app

Simply configure the API endpoint in the app settings to point to your MLX-GUI server. This works with any AI app that supports the OpenAI API. Enter anything for the API key.

<div align="center">
<table>
<tr><td><img src="media/setup.png"></td><td><img src="media/usage.png"></td></tr>
</table>
</div>
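Because the endpoints are OpenAI-compatible, the same configuration works from a script. Here is a minimal sketch using only the Python standard library; the model name and port come from the curl examples in this README, and `build_chat_request`/`chat` are illustrative helper names, not part of the project:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # default mlx-gui port

def build_chat_request(model: str, prompt: str, max_tokens: int = 100) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(model: str, prompt: str) -> str:
    """POST to /v1/chat/completions and return the assistant's reply."""
    body = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# With the server running and a model installed:
# reply = chat("qwen3-8b-6bit", "Hello!")
```

Any OpenAI SDK works the same way: point its base URL at `http://localhost:8000/v1` and pass any string as the API key.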

### System Tray (Recommended)

Launch the app and look for **MLX** in your menu bar:

<div align="center">
<img src="media/trayicon.png" width="500">
</div>

- **Open Admin Interface** - Web GUI for model management
- **System Status** - Real-time monitoring
- **Unload All Models** - Free up memory
- **Network Settings** - Configure binding options

### Web Admin Interface
Navigate to `http://localhost:8000/admin` for:
- 🔍 **Discover Tab** - Browse and install MLX models from HuggingFace
- 🧠 **Models Tab** - Manage installed models (load/unload/remove)
- 📊 **Monitor Tab** - System statistics and performance
- ⚙️ **Settings Tab** - Configure server and model options

### API Usage

#### OpenAI-Compatible Chat
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-6bit",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 100
  }'
```

#### Vision Models with Images
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2-vl-2b-instruct",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What do you see in this image?"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,/9j/4AAQ..."}}
      ]
    }],
    "max_tokens": 200
  }'
```

#### Audio Transcription
```bash
curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F "file=@audio.wav" \
  -F "model=parakeet-tdt-0-6b-v2"
```
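For scripted uploads without curl, the multipart body can be assembled by hand with the standard library. The `file` and `model` field names mirror the curl example above; `build_multipart` is an illustrative helper, not the project's official client:

```python
import uuid

def build_multipart(model: str, filename: str, audio_bytes: bytes) -> tuple[bytes, str]:
    """Assemble a multipart/form-data body for /v1/audio/transcriptions.

    Returns (body, content_type) ready to send with urllib or http.client.
    """
    boundary = uuid.uuid4().hex
    model_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model"\r\n\r\n{model}\r\n'
    ).encode()
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + file_header + audio_bytes + closing
    content_type = f"multipart/form-data; boundary={boundary}"
    return body, content_type
```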

#### Text Embeddings
```bash
curl -X POST http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": ["Hello world", "How are you?"],
    "model": "qwen3-embedding-0-6b-4bit"
  }'
```
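The release notes state the returned vectors are L2-normalized, so cosine similarity reduces to a plain dot product - handy for RAG and similarity search. A toy sketch (the 3-dim vectors stand in for real embedding output):

```python
def cosine_similarity(a: list[float], b: list[float]) -> float:
    """For L2-normalized embeddings, cosine similarity is just the dot product."""
    return sum(x * y for x, y in zip(a, b))

# Toy unit vectors standing in for the "embedding" fields of the API response:
v1 = [0.6, 0.8, 0.0]
v2 = [0.8, 0.6, 0.0]
print(cosine_similarity(v1, v1))  # ≈ 1.0 (identical vectors)
print(cosine_similarity(v1, v2))  # ≈ 0.96 (similar direction)
```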

#### Install Models
```bash
# Install text model
curl -X POST http://localhost:8000/v1/models/install \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "mlx-community/Qwen2.5-7B-Instruct-4bit",
    "name": "qwen-7b-4bit"
  }'

# Install audio model
curl -X POST http://localhost:8000/v1/models/install \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "mlx-community/parakeet-tdt-0.6b-v2",
    "name": "parakeet-tdt-0-6b-v2"
  }'

# Install vision model
curl -X POST http://localhost:8000/v1/models/install \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "mlx-community/Qwen2-VL-2B-Instruct-4bit",
    "name": "qwen2-vl-2b-instruct"
  }'

# Install embedding model
curl -X POST http://localhost:8000/v1/models/install \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ",
    "name": "qwen3-embedding-0-6b-4bit"
  }'
```
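Install and load can be chained programmatically via the `POST /v1/models/install` and `POST /v1/models/{name}/load` endpoints listed under Model Management. A sketch assuming the default port; `install_payload`, `load_url`, and `install_and_load` are illustrative names:

```python
import json
import urllib.request

BASE = "http://localhost:8000/v1"  # default mlx-gui port

def install_payload(model_id: str, name: str) -> dict:
    """JSON body for POST /v1/models/install, matching the curl examples."""
    return {"model_id": model_id, "name": name}

def load_url(name: str) -> str:
    """URL for POST /v1/models/{name}/load."""
    return f"{BASE}/models/{name}/load"

def install_and_load(model_id: str, name: str) -> None:
    """Install a model from HuggingFace, then load it into memory."""
    body = json.dumps(install_payload(model_id, name)).encode()
    req = urllib.request.Request(
        f"{BASE}/models/install",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req).close()
    urllib.request.urlopen(urllib.request.Request(load_url(name), data=b"")).close()

# With the server running:
# install_and_load("mlx-community/Qwen2.5-7B-Instruct-4bit", "qwen-7b-4bit")
```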

## 🏗️ Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  System Tray    │    │   Web Admin GUI  │    │   REST API      │
│  (macOS)        │◄──►│  (localhost:8000)│◄──►│  (/v1/*)        │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                                         │
                       ┌─────────────────┐              │
                       │ Model Manager   │◄─────────────┘
                       │ (Queue/Memory)  │
                       └─────────────────┘
                                │
                       ┌─────────────────┐
                       │  MLX Engine     │
                       │ (Apple Silicon) │
                       └─────────────────┘
```

## 📚 API Documentation

Full API documentation is available at `/v1/docs` when the server is running, or see [API.md](API.md) for complete endpoint reference.

### Key Endpoints

#### 🎯 **Core AI Services**
- `POST /v1/chat/completions` - OpenAI-compatible chat (text + images + Mistral Small)
- `POST /v1/embeddings` - **NEW:** Multi-architecture embeddings (BGE, MiniLM, Qwen3, Arctic)
- `POST /v1/audio/transcriptions` - **NEW:** Enhanced audio transcription (Whisper Turbo, Parakeet)

#### 🛠️ **Model Management**
- `GET /v1/models` - List installed models
- `POST /v1/models/install` - Install from HuggingFace
- `POST /v1/models/{name}/load` - Load model into memory
- `GET /v1/discover/models` - Search HuggingFace for MLX models
- `GET /v1/discover/embeddings` - **NEW:** Search for embedding models
- `GET /v1/discover/stt` - **NEW:** Search for audio transcription models

#### 📊 **System Operations**
- `GET /v1/system/status` - System and memory status
- `GET /v1/manager/status` - Detailed model manager status

## 🛠️ Development

### Setup Development Environment
```bash
git clone https://github.com/RamboRogers/mlx-gui.git
cd mlx-gui

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install in development mode with audio and vision support
pip install -e ".[dev,audio,vision]"

# Run tests
pytest

# Start development server
mlx-gui start --reload
```

### Build Standalone App
```bash
# Install build dependencies with audio and vision support
pip install rumps pyinstaller mlx-whisper parakeet-mlx mlx-vlm

# Build macOS app bundle
./build_app.sh

# Result: dist/MLX-GUI.app
```

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## ⚖️ License

<p>
MLX-GUI is licensed under the GNU General Public License v3.0 (GPLv3).<br>
<em>Free Software</em>
</p>

[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg?style=for-the-badge)](https://www.gnu.org/licenses/gpl-3.0)

### Connect With Me 🤝

[![GitHub](https://img.shields.io/badge/GitHub-RamboRogers-181717?style=for-the-badge&logo=github)](https://github.com/RamboRogers)
[![Twitter](https://img.shields.io/badge/Twitter-@rogerscissp-1DA1F2?style=for-the-badge&logo=twitter)](https://x.com/rogerscissp)
[![Website](https://img.shields.io/badge/Web-matthewrogers.org-00ADD8?style=for-the-badge&logo=google-chrome)](https://matthewrogers.org)

## 🙏 Acknowledgments

- [Apple MLX Team](https://github.com/ml-explore/mlx) - For the incredible MLX framework
- [MLX-LM](https://github.com/ml-explore/mlx-examples/tree/main/llms) - MLX language model implementations
- [HuggingFace](https://huggingface.co) - For the model hub and transformers library


## ⭐ Star History

[![Star History Chart](https://api.star-history.com/svg?repos=RamboRogers/mlx-gui&type=Timeline)](https://www.star-history.com/#RamboRogers/mlx-gui&Timeline)

---

<div align="center">
  <strong>Made with ❤️ for the Apple Silicon community</strong>
</div>

            

Open a Pull Request\n\n## \u2696\ufe0f License\n\n<p>\nMLX-GUI is licensed under the GNU General Public License v3.0 (GPLv3).<br>\n<em>Free Software</em>\n</p>\n\n[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg?style=for-the-badge)](https://www.gnu.org/licenses/gpl-3.0)\n\n### Connect With Me \ud83e\udd1d\n\n[![GitHub](https://img.shields.io/badge/GitHub-RamboRogers-181717?style=for-the-badge&logo=github)](https://github.com/RamboRogers)\n[![Twitter](https://img.shields.io/badge/Twitter-@rogerscissp-1DA1F2?style=for-the-badge&logo=twitter)](https://x.com/rogerscissp)\n[![Website](https://img.shields.io/badge/Web-matthewrogers.org-00ADD8?style=for-the-badge&logo=google-chrome)](https://matthewrogers.org)\n\n## \ud83d\ude4f Acknowledgments\n\n- [Apple MLX Team](https://github.com/ml-explore/mlx) - For the incredible MLX framework\n- [MLX-LM](https://github.com/ml-explore/mlx-examples/tree/main/llms) - MLX language model implementations\n- [HuggingFace](https://huggingface.co) - For the model hub and transformers library\n\n\n## \u2b50 Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=RamboRogers/mlx-gui&type=Timeline)](https://www.star-history.com/#RamboRogers/mlx-gui&Timeline)\n\n---\n\n<div align=\"center\">\n  <strong>Made with \u2764\ufe0f for the Apple Silicon community</strong>\n</div>\n",
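
## 🐍 Python Client Sketch

Because the endpoints are OpenAI-compatible, the curl examples in the API Usage section translate directly to any HTTP client. As an illustrative sketch (not part of the project itself), here is the same chat call using only Python's standard library; it assumes the server is running on the default `localhost:8000` bind and that a model named `qwen3-8b-6bit` is installed:

```python
import json
import urllib.request

def build_request(path: str, payload: dict,
                  base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a JSON POST request for one of the /v1/* endpoints."""
    return urllib.request.Request(
        f"{base_url}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Same chat call as the curl example above.
req = build_request("/v1/chat/completions", {
    "model": "qwen3-8b-6bit",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 100,
})

# Uncomment with a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The official `openai` Python client should also work by pointing its `base_url` at `http://localhost:8000/v1` with any placeholder API key, since the server accepts anything for the key.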
        