fastmlxOns

Name	fastmlxOns
Version	0.2.1
home_page	None
Summary	FastMLX is a high performance production ready API to host MLX models.
upload_time	2024-09-06 00:59:14
maintainer	None
docs_url	None
author	None
requires_python	>=3.8
license	Apache Software License 2.0
keywords	fastmlx mlx apple mlx vision language models vlms large language models llms
requirements	mlx mlx-lm mlx-vlm fastapi jinja2
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# FastMLX

[![image](https://img.shields.io/pypi/v/fastmlx.svg)](https://pypi.python.org/pypi/fastmlx)
[![image](https://img.shields.io/conda/vn/conda-forge/fastmlx.svg)](https://anaconda.org/conda-forge/fastmlx)
[![image](https://pyup.io/repos/github/Blaizzy/fastmlx/shield.svg)](https://pyup.io/repos/github/Blaizzy/fastmlx)

**FastMLX is a high-performance, production-ready API for hosting MLX models, including Vision Language Models (VLMs) and Language Models (LMs).**

-   Free software: Apache Software License 2.0
-   Documentation: https://Blaizzy.github.io/fastmlx

## Features

- **OpenAI-compatible API**: Easily integrate with existing applications that use OpenAI's API.
- **Dynamic Model Loading**: Load MLX models on the fly or use pre-loaded models for better performance.
- **Support for Multiple Model Types**: Compatible with various MLX model architectures.
- **Image Processing Capabilities**: Handle both text and image inputs for versatile model interactions.
- **Efficient Resource Management**: Optimized for high performance and scalability.
- **Error Handling**: Robust error management for production environments.
- **Customizable**: Easily extendable to accommodate specific use cases and model types.

## Usage

1. **Installation**

   ```bash
   pip install fastmlx
   ```

2. **Running the Server**

   Start the FastMLX server:
   ```bash
   fastmlx
   ```
   or

   ```bash
   uvicorn fastmlx:app --reload --workers 0
   ```

   > [!WARNING]
   > The `--reload` flag should not be used in production. It is only intended for development purposes.

   ### Running with Multiple Workers (Parallel Processing)

   For improved performance and parallel processing capabilities, you can specify either the absolute number of worker processes or the fraction of CPU cores to use. This is particularly useful for handling multiple requests simultaneously.

   You can also set the `FASTMLX_NUM_WORKERS` environment variable to specify the number of workers or the fraction of CPU cores to use. `workers` defaults to 2 if not passed explicitly or set via the environment variable.

   In order of precedence (highest to lowest), the number of workers is determined by the following:
   - Explicitly passed as a command-line argument
     - `--workers 4` will set the number of workers to 4
     - `--workers 0.5` will set the number of workers to half the number of CPU cores available (minimum of 1)
   - Set via the `FASTMLX_NUM_WORKERS` environment variable
   - Default value of 2

   To use all available CPU cores, set the value to **1.0**.

   Example:
   ```bash
   fastmlx --workers 4
   ```
   or

   ```bash
   uvicorn fastmlx:app --workers 4
   ```

   > [!NOTE]
   > - The `--reload` flag is not compatible with multiple workers.
   > - The number of workers should typically not exceed the number of CPU cores available on your machine for optimal performance.

   ### Considerations for Multi-Worker Setup

   1. **Stateless Application**: Ensure your FastMLX application is stateless, as each worker process operates independently.
   2. **Database Connections**: If your app uses a database, make sure your connection pooling is configured to handle multiple workers.
   3. **Resource Usage**: Monitor your system's resource usage to find the optimal number of workers for your specific hardware and application needs. Additionally, you can remove any unused models using the delete model endpoint.
   4. **Load Balancing**: When running with multiple workers, incoming requests are automatically load-balanced across the worker processes.

   By leveraging multiple workers, you can significantly improve the throughput and responsiveness of your FastMLX application, especially under high load conditions.
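
The precedence rules for the worker count described in this section can be expressed as a small function. This is only an illustrative sketch, not FastMLX's actual implementation: `resolve_workers` and its parameters are hypothetical names, and treating any value that does not parse as an integer as a fraction of the CPU cores is an assumption made to match the `--workers 4` and `--workers 0.5` examples.

```python
import os


def resolve_workers(cli_value=None, env=None, cpu_count=None, default=2):
    """Hypothetical helper: pick a worker count using the documented precedence."""
    env = os.environ if env is None else env
    cpu_count = cpu_count or os.cpu_count() or 1

    # 1. Command-line argument, 2. environment variable, 3. default of 2.
    raw = cli_value if cli_value is not None else env.get("FASTMLX_NUM_WORKERS")
    if raw is None:
        return default

    text = str(raw).strip()
    try:
        # Whole numbers such as "4" are an absolute worker count.
        return int(text)
    except ValueError:
        # Assumption: anything else (e.g. "0.5", "1.0") is a fraction of the
        # available CPU cores, with a minimum of one worker.
        return max(1, int(cpu_count * float(text)))
```

Under these assumptions, `resolve_workers("0.5", env={}, cpu_count=8)` yields 4 workers, and a command-line value always wins over `FASTMLX_NUM_WORKERS`.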

3. **Making API Calls**

   Call the API the same way you would call OpenAI's chat completions endpoint:

   **Vision Language Model**
   ```python
   import requests
   import json

   url = "http://localhost:8000/v1/chat/completions"
   headers = {"Content-Type": "application/json"}
   data = {
       "model": "mlx-community/nanoLLaVA-1.5-4bit",
       "image": "http://images.cocodataset.org/val2017/000000039769.jpg",
       "messages": [{"role": "user", "content": "What are these?"}],
       "max_tokens": 100
   }

   response = requests.post(url, headers=headers, data=json.dumps(data))
   print(response.json())
   ```

   With streaming:
   ```python
   import requests
   import json

   def process_sse_stream(url, headers, data):
       response = requests.post(url, headers=headers, json=data, stream=True)

       if response.status_code != 200:
           print(f"Error: Received status code {response.status_code}")
           print(response.text)
           return None

       full_content = ""

       try:
           for line in response.iter_lines():
               if line:
                   line = line.decode('utf-8')
                   if line.startswith('data: '):
                       event_data = line[6:]  # Remove 'data: ' prefix
                       if event_data == '[DONE]':
                           print("\nStream finished. ✅")
                           break
                       try:
                           chunk_data = json.loads(event_data)
                           content = chunk_data['choices'][0]['delta']['content']
                           full_content += content
                           print(content, end='', flush=True)
                       except json.JSONDecodeError:
                           print(f"\nFailed to decode JSON: {event_data}")
                       except KeyError:
                           print(f"\nUnexpected data structure: {chunk_data}")

       except KeyboardInterrupt:
           print("\nStream interrupted by user.")
       except requests.exceptions.RequestException as e:
           print(f"\nAn error occurred: {e}")

       # Return whatever text was received, even if the stream ended early.
       return full_content

   if __name__ == "__main__":
       url = "http://localhost:8000/v1/chat/completions"
       headers = {"Content-Type": "application/json"}
       data = {
           "model": "mlx-community/nanoLLaVA-1.5-4bit",
           "image": "http://images.cocodataset.org/val2017/000000039769.jpg",
           "messages": [{"role": "user", "content": "What are these?"}],
           "max_tokens": 500,
           "stream": True
       }
       process_sse_stream(url, headers, data)
   ```

   **Language Model**
   ```python
   import requests
   import json

   url = "http://localhost:8000/v1/chat/completions"
   headers = {"Content-Type": "application/json"}
   data = {
       "model": "mlx-community/gemma-2-9b-it-4bit",
       "messages": [{"role": "user", "content": "What is the capital of France?"}],
       "max_tokens": 100
   }

   response = requests.post(url, headers=headers, data=json.dumps(data))
   print(response.json())
   ```

   With streaming:
   ```python
   import requests
   import json

   def process_sse_stream(url, headers, data):
       response = requests.post(url, headers=headers, json=data, stream=True)

       if response.status_code != 200:
           print(f"Error: Received status code {response.status_code}")
           print(response.text)
           return None

       full_content = ""

       try:
           for line in response.iter_lines():
               if line:
                   line = line.decode('utf-8')
                   if line.startswith('data: '):
                       event_data = line[6:]  # Remove 'data: ' prefix
                       if event_data == '[DONE]':
                           print("\nStream finished. ✅")
                           break
                       try:
                           chunk_data = json.loads(event_data)
                           content = chunk_data['choices'][0]['delta']['content']
                           full_content += content
                           print(content, end='', flush=True)
                       except json.JSONDecodeError:
                           print(f"\nFailed to decode JSON: {event_data}")
                       except KeyError:
                           print(f"\nUnexpected data structure: {chunk_data}")

       except KeyboardInterrupt:
           print("\nStream interrupted by user.")
       except requests.exceptions.RequestException as e:
           print(f"\nAn error occurred: {e}")

       # Return whatever text was received, even if the stream ended early.
       return full_content

   if __name__ == "__main__":
       url = "http://localhost:8000/v1/chat/completions"
       headers = {"Content-Type": "application/json"}
       data = {
           "model": "mlx-community/gemma-2-9b-it-4bit",
           "messages": [{"role": "user", "content": "Hi, how are you?"}],
           "max_tokens": 500,
           "stream": True
       }
       process_sse_stream(url, headers, data)
   ```
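
The streaming examples above mix network I/O with parsing. The parsing step can be isolated into a pure function, which makes it easier to test without a running server. The following is a minimal sketch: `parse_sse_line` and the `DONE` sentinel are hypothetical names, the chunk layout (`choices[0].delta.content`) and the `[DONE]` marker are taken from the examples above, and treating a chunk without `content` as an empty string is an assumption.

```python
import json

DONE = object()  # sentinel returned when the stream signals completion


def parse_sse_line(raw_line):
    """Hypothetical helper: extract the text delta from one SSE line.

    Returns DONE for the '[DONE]' marker, a (possibly empty) string for a
    data line, and None for lines that carry no data.
    """
    line = raw_line.decode("utf-8") if isinstance(raw_line, bytes) else raw_line
    if not line.startswith("data: "):
        return None
    event_data = line[len("data: "):]
    if event_data == "[DONE]":
        return DONE
    chunk = json.loads(event_data)
    delta = chunk["choices"][0].get("delta", {})
    # Assumption: a chunk without "content" contributes no text.
    return delta.get("content") or ""
```

Inside a `for line in response.iter_lines()` loop, a caller could stop when the result is `DONE` and append every string result to the accumulated text.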

4. **Function Calling**

   FastMLX now supports tool calling in accordance with the OpenAI API specification. This feature is available for the following models:

   - Llama 3.1
   - Arcee Agent
   - C4ai-Command-R-Plus
   - Firefunction
   - xLAM

   Supported modes:
   - Without Streaming
   - Parallel Tool Calling

   > Note: Tool choice and OpenAI-compliant streaming for function calling are currently under development.

   Here's an example of how to use function calling with FastMLX:

   ```python
   import requests
   import json

   url = "http://localhost:8000/v1/chat/completions"
   headers = {"Content-Type": "application/json"}
   data = {
     "model": "mlx-community/Meta-Llama-3.1-8B-Instruct-8bit",
     "messages": [
       {
         "role": "user",
         "content": "What's the weather like in San Francisco and Washington?"
       }
     ],
     "tools": [
       {
         "name": "get_current_weather",
         "description": "Get the current weather",
         "parameters": {
           "type": "object",
           "properties": {
             "location": {
               "type": "string",
               "description": "The city and state, e.g. San Francisco, CA"
             },
             "format": {
               "type": "string",
               "enum": ["celsius", "fahrenheit"],
               "description": "The temperature unit to use. Infer this from the user's location."
             }
           },
           "required": ["location", "format"]
         }
       }
     ],
     "max_tokens": 150,
     "temperature": 0.7,
     "stream": False,
   }

   response = requests.post(url, headers=headers, data=json.dumps(data))
   print(response.json())
   ```

   This example shows how to offer the `get_current_weather` tool to the Llama 3.1 model. The model processes the user's question and, where appropriate, responds with a call to the provided tool; executing that call and fetching the actual data remains the responsibility of your application.

   Please note that while streaming is available for regular text generation, the streaming implementation for function calling is still in development and does not yet fully comply with the OpenAI specification.
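
Because the model generates the tool arguments, it can be useful to check them against the tool's `parameters` schema before acting on them. The sketch below assumes the arguments have already been decoded into a Python dict; how the response encodes tool calls is not shown here and should be taken from the full documentation. `validate_tool_arguments` is a hypothetical helper that only checks required keys, unknown keys, and `enum` values. It is not a full JSON Schema validator.

```python
def validate_tool_arguments(tool, arguments):
    """Hypothetical helper: return a list of problems found in `arguments`."""
    schema = tool.get("parameters", {})
    properties = schema.get("properties", {})
    problems = []

    # Every name listed under "required" must be present.
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")

    # Every supplied argument must be declared and respect any enum.
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
            continue
        allowed = properties[name].get("enum")
        if allowed is not None and value not in allowed:
            problems.append(f"invalid value for {name}: {value!r}")

    return problems


# Same tool shape as in the request above (descriptions omitted for brevity).
weather_tool = {
    "name": "get_current_weather",
    "description": "Get the current weather",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location", "format"],
    },
}
```

An empty list means the arguments satisfy these basic checks; any other result can be logged or sent back to the model as an error message.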

5. **Listing Supported Models**

   To see all vision and language models supported by MLX:

   ```python
   import requests

   url = "http://localhost:8000/v1/supported_models"
   response = requests.get(url)
   print(response.json())
   ```

6. **Adding Models**

   You can add new models to the API:

   ```python
   import requests

   url = "http://localhost:8000/v1/models"
   params = {
       "model_name": "hf-repo-or-path",
   }

   response = requests.post(url, params=params)
   print(response.json())
   ```

7. **Listing Available Models**

   To see all available models:

   ```python
   import requests

   url = "http://localhost:8000/v1/models"
   response = requests.get(url)
   print(response.json())
   ```

8. **Deleting Models**

   To remove a model that is loaded in memory:

   ```python
   import requests

   url = "http://localhost:8000/v1/models"
   params = {
      "model_name": "hf-repo-or-path",
   }
   response = requests.delete(url, params=params)
   print(response)
   ```
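
The add, list, and delete examples above all target the same `/v1/models` path and differ only in HTTP method and query string. As a rough sketch, that mapping can be captured in one helper that builds the request without sending it. `model_request` and `BASE_URL` are hypothetical names, and the standard library's `urllib` is used here in place of `requests` only so the snippet has no third-party dependency.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # same host and port as the examples above


def model_request(action, model_name=None, base_url=BASE_URL):
    """Hypothetical helper: build (but do not send) a model-management request."""
    methods = {"list": "GET", "add": "POST", "delete": "DELETE"}
    if action not in methods:
        raise ValueError(f"unknown action: {action}")
    url = f"{base_url}/v1/models"
    if action in ("add", "delete"):
        if not model_name:
            raise ValueError(f"'{action}' requires a model name")
        # The examples above pass the model as a `model_name` query parameter.
        url += "?" + urlencode({"model_name": model_name})
    return Request(url, method=methods[action])
```

Passing the returned object to `urllib.request.urlopen` would send it to a running server.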

For more detailed usage instructions and API documentation, please refer to the [full documentation](https://Blaizzy.github.io/fastmlx).

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "fastmlxOns",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "fastmlx, MLX, Apple MLX, vision language models, VLMs, large language models, LLMs",
    "author": null,
    "author_email": "Prince Canuma <prince.gdt@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/83/36/cc9f05d92e0b2eb670e74bb8674938996d89b8c1b8ce5f1891493feaf734/fastmlxons-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache Software License 2.0",
    "summary": "FastMLX is a high performance production ready API to host MLX models.",
    "version": "0.2.1",
    "project_urls": {
        "Homepage": "https://github.com/Blaizzy/fastmlx"
    },
    "split_keywords": [
        "fastmlx",
        " mlx",
        " apple mlx",
        " vision language models",
        " vlms",
        " large language models",
        " llms"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e1087e52c90f482c43757abff3a6484f5e64de8398645f72cdd9cad469626b88",
                "md5": "4ce75185599ecdf286a676922f40eeee",
                "sha256": "3c1ad11074ea910193295ba9cba792e47405c963b480c1a3c7b1b083dba39a63"
            },
            "downloads": -1,
            "filename": "fastmlxOns-0.2.1-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4ce75185599ecdf286a676922f40eeee",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": ">=3.8",
            "size": 16907,
            "upload_time": "2024-09-06T00:59:11",
            "upload_time_iso_8601": "2024-09-06T00:59:11.787646Z",
            "url": "https://files.pythonhosted.org/packages/e1/08/7e52c90f482c43757abff3a6484f5e64de8398645f72cdd9cad469626b88/fastmlxOns-0.2.1-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8336cc9f05d92e0b2eb670e74bb8674938996d89b8c1b8ce5f1891493feaf734",
                "md5": "244b64925d3ca54c874eb8fdcfe31860",
                "sha256": "6bc457030888313df98c4237852eb7afcfc28dedbb1ff5362e18ca4952907b42"
            },
            "downloads": -1,
            "filename": "fastmlxons-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "244b64925d3ca54c874eb8fdcfe31860",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 25552,
            "upload_time": "2024-09-06T00:59:14",
            "upload_time_iso_8601": "2024-09-06T00:59:14.398958Z",
            "url": "https://files.pythonhosted.org/packages/83/36/cc9f05d92e0b2eb670e74bb8674938996d89b8c1b8ce5f1891493feaf734/fastmlxons-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-06 00:59:14",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Blaizzy",
    "github_project": "fastmlx",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "mlx",
            "specs": [
                [
                    ">=",
                    "0.15"
                ]
            ]
        },
        {
            "name": "mlx-lm",
            "specs": [
                [
                    ">=",
                    "0.15.2"
                ]
            ]
        },
        {
            "name": "mlx-vlm",
            "specs": [
                [
                    ">=",
                    "0.0.12"
                ]
            ]
        },
        {
            "name": "fastapi",
            "specs": [
                [
                    ">=",
                    "0.111.0"
                ]
            ]
        },
        {
            "name": "jinja2",
            "specs": []
        }
    ],
    "lcname": "fastmlxons"
}
