| Field | Value |
|-----------------|-------|
| Name | mflux |
| Version | 0.4.1 |
| Summary | A MLX port of FLUX based on the Huggingface Diffusers implementation. |
| upload_time | 2024-10-29 19:44:40 |
| home_page | None |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.10 |
| license | MIT License Copyright (c) 2024 Filip Strand Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| keywords | diffusers, flux, mlx |
| requirements | No requirements were recorded. |
![image](src/mflux/assets/logo.png)
*A MLX port of FLUX based on the Huggingface Diffusers implementation.*
### About
Run the powerful [FLUX](https://blackforestlabs.ai/#get-flux) models from [Black Forest Labs](https://blackforestlabs.ai) locally on your Mac!
### Table of contents
<!-- TOC start (generated with https://github.com/derlin/bitdowntoc) -->
- [Philosophy](#philosophy)
- [💿 Installation](#-installation)
- [🖼️ Generating an image](#%EF%B8%8F-generating-an-image)
 * [📜 Full list of Command-Line Arguments](#-full-list-of-command-line-arguments)
- [⏱️ Image generation speed (updated)](#%EF%B8%8F-image-generation-speed-updated)
- [↔️ Equivalent to Diffusers implementation](#%EF%B8%8F-equivalent-to-diffusers-implementation)
- [🗜️ Quantization](#%EF%B8%8F-quantization)
 * [📊 Size comparisons for quantized models](#-size-comparisons-for-quantized-models)
 * [💾 Saving a quantized version to disk](#-saving-a-quantized-version-to-disk)
 * [💽 Loading and running a quantized version from disk](#-loading-and-running-a-quantized-version-from-disk)
- [💽 Running a non-quantized model directly from disk](#-running-a-non-quantized-model-directly-from-disk)
- [🎨 Image-to-Image](#-image-to-image)
- [🔌 LoRA](#-lora)
 * [Multi-LoRA](#multi-lora)
 * [Supported LoRA formats (updated)](#supported-lora-formats-updated)
- [🕹️ Controlnet](#%EF%B8%8F-controlnet)
- [🚧 Current limitations](#-current-limitations)
- [💡 Workflow tips](#workflow-tips)
- [✅ TODO](#-todo)
- [License](#license)
<!-- TOC end -->
### Philosophy
MFLUX is a line-by-line port of the FLUX implementation in the [Huggingface Diffusers](https://github.com/huggingface/diffusers) library to [Apple MLX](https://github.com/ml-explore/mlx).
MFLUX is purposefully kept minimal and explicit: network architectures are hardcoded and no config files are used
except for the tokenizers. The aim is a tiny codebase with the single purpose of expressing these models
(thereby avoiding too many abstractions). While MFLUX prioritizes readability over generality and performance, [it can still be quite fast](#%EF%B8%8F-image-generation-speed-updated), [and even faster quantized](#%EF%B8%8F-quantization).
All models are implemented from scratch in MLX and only the tokenizers are used via the
[Huggingface Transformers](https://github.com/huggingface/transformers) library. Other than that, there are only minimal dependencies
like [Numpy](https://numpy.org) and [Pillow](https://pypi.org/project/pillow/) for simple image post-processing.
### 💿 Installation
For users, the easiest way to install MFLUX is with `uv tool`. If you have [installed `uv`](https://github.com/astral-sh/uv?tab=readme-ov-file#installation), simply run:
```sh
uv tool install --upgrade mflux
```
to get the `mflux-generate` and related command line executables. You can skip to the usage guides below.
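A quick way to verify the install is the standard `--help` flag (assuming the usual CLI help behavior), which also lists every available option:

```sh
# Confirm the executables are on your PATH and list all supported flags
mflux-generate --help
```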
<details>
<summary>For the classic way using a virtual environment (click to expand)</summary>

```sh
mkdir -p mflux && cd mflux && python3 -m venv .venv && source .venv/bin/activate
```

This creates and activates a virtual environment in the `mflux` folder. After that, install MFLUX via pip:

```sh
pip install -U mflux
```

</details>
<details>
<summary>For contributors (click to expand)</summary>
1. Clone the repo:
```sh
git clone git@github.com:filipstrand/mflux.git
```
2. Install the application:
```sh
make install
```
3. Run the test suite:
```sh
make test
```
4. Run format and lint checks prior to submitting pull requests. The recommended `make lint` and `make format` targets install and use [`ruff`](https://github.com/astral-sh/ruff). You can set up your editor/IDE to lint/format automatically, or use the provided `make` helpers:
 - `make format` - formats your code
 - `make lint` - shows your lint errors and warnings, but does not auto fix
 - `make check` - via `pre-commit` hooks, formats your code **and** attempts to auto fix lint errors
 - consult the official [`ruff` documentation](https://docs.astral.sh/ruff/) for advanced usage
</details>
### 🖼️ Generating an image
Run the command `mflux-generate` with a prompt, a model, and any optional arguments. For example, here we use a quantized version of the `schnell` model for 2 steps:
```sh
mflux-generate --model schnell --prompt "Luxury food photograph" --steps 2 --seed 2 -q 8
```
This example uses the more powerful `dev` model with 25 time steps:
```sh
mflux-generate --model dev --prompt "Luxury food photograph" --steps 25 --seed 2 -q 8
```
⚠️ *If the model is not already downloaded on your machine, the weights will be fetched automatically (~34GB each for the Schnell and Dev models). See the [quantization](#%EF%B8%8F-quantization) section for running compressed versions of the model.* ⚠️
*By default, model files are downloaded to the `.cache` folder within your home directory. For example, in my setup, the path looks like this:*
```
/Users/filipstrand/.cache/huggingface/hub/models--black-forest-labs--FLUX.1-dev
```
*To change this default location, modify the `HF_HOME` environment variable. For more details on this setting, refer to the [Hugging Face documentation](https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables)*.
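For example, to keep all Huggingface downloads (including the FLUX weights) on an external drive, something like the following works; the path here is purely illustrative:

```sh
# Redirect the Huggingface cache before generating, so weights land on the external drive
export HF_HOME="/Volumes/ExternalSSD/huggingface"
mflux-generate --model schnell --prompt "Luxury food photograph" --steps 2 --seed 2 -q 8
```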
🔒 [FLUX.1-dev currently requires granted access to its Huggingface repo. For troubleshooting, see the issue tracker](https://github.com/filipstrand/mflux/issues/14) 🔒
#### 📜 Full list of Command-Line Arguments
- **`--prompt`** (required, `str`): Text description of the image to generate.
- **`--model`** or **`-m`** (required, `str`): Model to use for generation (`"schnell"` or `"dev"`).
- **`--output`** (optional, `str`, default: `"image.png"`): Output image filename.
- **`--seed`** (optional, `int`, default: `None`): Seed for random number generation. Default is time-based.
- **`--height`** (optional, `int`, default: `1024`): Height of the output image in pixels.
- **`--width`** (optional, `int`, default: `1024`): Width of the output image in pixels.
- **`--steps`** (optional, `int`, default: `4`): Number of inference steps.
- **`--guidance`** (optional, `float`, default: `3.5`): Guidance scale (only used for `"dev"` model).
- **`--path`** (optional, `str`, default: `None`): Path to a local model on disk.
- **`--quantize`** or **`-q`** (optional, `int`, default: `None`): [Quantization](#%EF%B8%8F-quantization) (choose between `4` or `8`).
- **`--lora-paths`** (optional, `[str]`, default: `None`): The paths to the [LoRA](#-LoRA) weights.
- **`--lora-scales`** (optional, `[float]`, default: `None`): The scale for each respective [LoRA](#-LoRA) (defaults to `1.0` if not specified and only one LoRA weight is loaded).
- **`--metadata`** (optional): Exports a `.json` file with the same name as the image, containing the generation metadata. (Even without this flag, the metadata is embedded in the image and can be viewed using `exiftool image.png`.)
- **`--controlnet-image-path`** (required, `str`): Path to the local image used by ControlNet to guide output generation.
- **`--controlnet-strength`** (optional, `float`, default: `0.4`): Degree of influence the control image has on the output. Ranges from `0.0` (no influence) to `1.0` (full influence).
- **`--controlnet-save-canny`** (optional, `bool`, default: `False`): If set, saves the Canny edge detection reference image used by ControlNet.
- **`--init-image-path`** (optional, `str`, default: `None`): Local path to the initial image for image-to-image generation.
- **`--init-image-strength`** (optional, `float`, default: `0.4`): Controls how strongly the initial image influences the output image. A value of `0.0` means no influence.
- **`--config-from-metadata`** or **`-C`** (optional, `str`): [EXPERIMENTAL] Path to a prior file saved via `--metadata`, or a compatible handcrafted config file adhering to the expected args schema (a usage sketch follows the details section below).
<details>
<summary>parameters supported by config files</summary>
#### How configs are used
- All config properties are optional and are applied to the image generation where applicable
- Invalid or incompatible properties are ignored
#### Config schema
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"seed": {
"type": ["integer", "null"]
},
"steps": {
"type": ["integer", "null"]
},
"guidance": {
"type": ["number", "null"]
},
"quantize": {
"type": ["null", "string"]
},
"lora_paths": {
"type": ["array", "null"],
"items": {
"type": "string"
}
},
"lora_scales": {
"type": ["array", "null"],
"items": {
"type": "number"
}
},
"prompt": {
"type": ["string", "null"]
}
}
}
```
#### Example
```json
{
"model": "dev",
"seed": 42,
"steps": 8,
"guidance": 3.0,
"quantize": 4,
"lora_paths": [
"/some/path1/to/subject.safetensors",
"/some/path2/to/style.safetensors"
],
"lora_scales": [
0.8,
0.4
],
"prompt": "award winning modern art, MOMA"
}
```
</details>
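As a sketch of the config round trip (the `image.json` filename is an assumption based on the `--metadata` description above, and the exact precedence between config values and explicit flags may differ):

```sh
# First run: export the generation settings alongside the image (image.png -> image.json)
mflux-generate --model dev --prompt "award winning modern art, MOMA" --steps 8 --seed 42 -q 8 --metadata

# Later run: reuse the saved settings as a starting configuration
mflux-generate -C image.json --model dev --prompt "award winning modern art, MOMA"
```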
Alternatively, with the correct Python environment active, create and run a separate script like the following:
```python
from mflux import Flux1, Config
# Load the model
flux = Flux1.from_alias(
alias="schnell", # "schnell" or "dev"
quantize=8, # 4 or 8
)
# Generate an image
image = flux.generate_image(
seed=2,
prompt="Luxury food photograph",
config=Config(
num_inference_steps=2, # "schnell" works well with 2-4 steps, "dev" works well with 20-25 steps
height=1024,
width=1024,
)
)
image.save(path="image.png")
```
For more options on how to configure MFLUX, please see [generate.py](src/mflux/generate.py).
### ⏱️ Image generation speed (updated)
These numbers are based on the non-quantized `schnell` model, with the configuration provided in the code snippet below.
To time your machine, run the following:
```sh
time mflux-generate \
--prompt "Luxury food photograph" \
--model schnell \
--steps 2 \
--seed 2 \
--height 1024 \
--width 1024
```
| Device | User | Reported Time | Notes |
|--------------------|------------------------------------------------------------------------------------------------------------------------------|---------------|---------------------------|
| M3 Max | [@karpathy](https://gist.github.com/awni/a67d16d50f0f492d94a10418e0592bde?permalink_comment_id=5153531#gistcomment-5153531) | ~20s | |
| M2 Ultra | [@awni](https://x.com/awnihannun/status/1823515121827897385) | <15s | |
| 2023 M2 Max (96GB) | [@explorigin](https://github.com/filipstrand/mflux/issues/6) | ~25s | |
| 2021 M1 Pro (16GB) | [@qw-in](https://github.com/filipstrand/mflux/issues/7) | ~175s | Might freeze your mac |
| 2023 M3 Pro (36GB) | [@kush-gupt](https://github.com/filipstrand/mflux/issues/11) | ~80s | |
| 2020 M1 (8GB) | [@mbvillaverde](https://github.com/filipstrand/mflux/issues/13) | ~335s | With resolution 512 x 512 |
| 2022 M1 MAX (64GB) | [@BosseParra](https://x.com/BosseParra/status/1826191780812877968) | ~55s | |
| 2023 M2 Pro (32GB) | [@leekichko](https://github.com/filipstrand/mflux/issues/85) | ~54s | |
| 2021 M1 Pro (32GB) | @filipstrand | ~160s | |
| 2023 M2 Max (32GB) | @filipstrand | ~70s | |
*Note that these numbers include starting the application from scratch, which means doing model I/O, setting/quantizing weights, etc.
To see the duration of the denoising loop alone (excluding text embedding), inspect the image metadata using `exiftool image.png`.*
### ↔️ Equivalent to Diffusers implementation
There is only a single source of randomness when generating an image: the initial latent array.
In this implementation, this initial latent is deterministically controlled by the input `seed` parameter.
However, if we import a fixed instance of this latent array saved from the Diffusers implementation, then MFLUX produces an image identical to the Diffusers one (assuming a fixed prompt and the default parameter settings in the Diffusers setup).
The images below illustrate this equivalence.
In all cases the Schnell model was run for 2 time steps.
The Diffusers implementation ran in CPU mode.
The precision for MFLUX can be set in the [Config](src/mflux/config/config.py) class.
There is typically a noticeable but very small difference in the final image when switching between 16-bit and 32-bit precision.
---
```
Luxury food photograph
```
![image](src/mflux/assets/comparison1.jpg)
---
```
detailed cinematic dof render of an old dusty detailed CRT monitor on a wooden desk in a dim room with items around, messy dirty room. On the screen are the letters "FLUX" glowing softly. High detail hard surface render
```
![image](src/mflux/assets/comparison2.jpg)
---
```
photorealistic, lotr, A tiny red dragon curled up asleep inside a nest, (Soft Focus) , (f_stop 2.8) , (focal_length 50mm) macro lens f/2. 8, medieval wizard table, (pastel) colors, (cozy) morning light filtering through a nearby window, (whimsical) steam shapes, captured with a (Canon EOS R5) , highlighting (serene) comfort, medieval, dnd, rpg, 3d, 16K, 8K
```
![image](src/mflux/assets/comparison3.jpg)
---
```
A weathered fisherman in his early 60s stands on the deck of his boat, gazing out at a stormy sea. He has a thick, salt-and-pepper beard, deep-set blue eyes, and skin tanned and creased from years of sun exposure. He's wearing a yellow raincoat and hat, with water droplets clinging to the fabric. Behind him, dark clouds loom ominously, and waves crash against the side of the boat. The overall atmosphere is one of tension and respect for the power of nature.
```
![image](src/mflux/assets/comparison4.jpg)
---
```
Luxury food photograph of an italian Linguine pasta alle vongole dish with lots of clams. It has perfect lighting and a cozy background with big bokeh and shallow depth of field. The mood is a sunset balcony in tuscany. The photo is taken from the side of the plate. The pasta is shiny with sprinkled parmesan cheese and basil leaves on top. The scene is complemented by a warm, inviting light that highlights the textures and colors of the ingredients, giving it an appetizing and elegant look.
```
![image](src/mflux/assets/comparison5.jpg)
---
### 🗜️ Quantization
MFLUX supports running FLUX in 4-bit or 8-bit quantized mode. Running a quantized version can greatly speed up the
generation process and reduce the memory consumption by several gigabytes. [Quantized models also take up less disk space](#-size-comparisons-for-quantized-models).
```sh
mflux-generate \
--model schnell \
--steps 2 \
--seed 2 \
--quantize 8 \
--height 1920 \
--width 1024 \
--prompt "Tranquil pond in a bamboo forest at dawn, the sun is barely starting to peak over the horizon, panda practices Tai Chi near the edge of the pond, atmospheric perspective through the mist of morning dew, sunbeams, its movements are graceful and fluid β creating a sense of harmony and balance, the pondβs calm waters reflecting the scene, inviting a sense of meditation and connection with nature, style of Howard Terpning and Jessica Rossier"
```
![image](src/mflux/assets/comparison6.jpg)
*In this example, weights are quantized at **runtime** - this is convenient if you don't want to [save a quantized copy of the weights to disk](#-saving-a-quantized-version-to-disk), but still want to benefit from the potential speedup and RAM reduction quantization might bring.*
By setting the `--quantize` (`-q`) flag to `4` or `8`, or removing it entirely, we get the three images above. As can be seen, there is very little difference between the images (especially between the 8-bit and the non-quantized result).
Image generation times in this example are based on a 2021 M1 Pro (32GB) machine. Even though the images are almost identical, there is a ~2x speedup from
running the 8-bit quantized version on this particular machine. Unlike the non-quantized version, the 8-bit version drastically reduces swap memory usage and keeps GPU utilization close to 100% during the whole generation. Results can vary across machines.
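To reproduce a comparison like this on your own machine, the same seed can be run at each quantization level (output filenames are illustrative):

```sh
# Same prompt and seed at 4-bit, 8-bit, and full 16-bit precision
mflux-generate --model schnell --prompt "Luxury food photograph" --steps 2 --seed 2 -q 4 --output q4.png
mflux-generate --model schnell --prompt "Luxury food photograph" --steps 2 --seed 2 -q 8 --output q8.png
mflux-generate --model schnell --prompt "Luxury food photograph" --steps 2 --seed 2 --output full16.png
```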
#### 📊 Size comparisons for quantized models
The model sizes for both `schnell` and `dev` at various quantization levels are as follows:
| 4 bit | 8 bit | Original (16 bit) |
|--------|---------|-------------------|
| 9.85GB | 18.16GB | 33.73GB |
The reason the sizes are not cut fully in half is that a small number of weights are not quantized and are kept at full precision.
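Once quantized copies have been [saved to disk](#-saving-a-quantized-version-to-disk) (next section), the difference is easy to verify; the paths below are illustrative:

```sh
# Compare the on-disk footprint of saved model directories
du -sh /Users/filipstrand/Desktop/schnell_4bit /Users/filipstrand/Desktop/schnell_8bit
```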
#### 💾 Saving a quantized version to disk
To save a local copy of the quantized weights, run the `mflux-save` command like so:
```sh
mflux-save \
--path "/Users/filipstrand/Desktop/schnell_8bit" \
--model schnell \
--quantize 8
```
*Note that when saving a quantized version, you will need the original huggingface weights.*
It is also possible to specify [LoRA](#-lora) adapters when saving the model, e.g.:
```sh
mflux-save \
--path "/Users/filipstrand/Desktop/schnell_8bit" \
--model schnell \
--quantize 8 \
--lora-paths "/path/to/lora.safetensors" \
--lora-scales 0.7
```
When generating images with a model saved this way, no LoRA adapter needs to be specified, since
it is already baked into the saved quantized weights.
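A minimal sketch of generating from such a model; since the adapter is baked in, no `--lora-paths` flag is passed:

```sh
# The LoRA effect is part of the saved weights, so only --path is needed
mflux-generate \
  --path "/Users/filipstrand/Desktop/schnell_8bit" \
  --model schnell \
  --steps 2 \
  --seed 2 \
  --prompt "Luxury food photograph"
```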
#### 💽 Loading and running a quantized version from disk
To generate a new image from the quantized model, simply provide a `--path` to where it was saved:
```sh
mflux-generate \
--path "/Users/filipstrand/Desktop/schnell_8bit" \
--model schnell \
--steps 2 \
--seed 2 \
--height 1920 \
--width 1024 \
--prompt "Tranquil pond in a bamboo forest at dawn, the sun is barely starting to peak over the horizon, panda practices Tai Chi near the edge of the pond, atmospheric perspective through the mist of morning dew, sunbeams, its movements are graceful and fluid β creating a sense of harmony and balance, the pondβs calm waters reflecting the scene, inviting a sense of meditation and connection with nature, style of Howard Terpning and Jessica Rossier"
```
*Note: When loading a quantized model from disk, there is no need to pass the `-q` flag, since this can be inferred from the weight metadata.*
*Also note: Once a local model (quantized [or not](#-running-a-non-quantized-model-directly-from-disk)) is specified via the `--path` argument, the Huggingface cache models are no longer required to launch it.
In other words, you can reclaim the 34GB of disk space (per model) by deleting the full 16-bit model from the [Huggingface cache](#%EF%B8%8F-generating-an-image) if you choose.*
*If you don't want to download the full models and quantize them yourself, the 4-bit weights are available for direct download (see the download sketch after these links):*
- [madroid/flux.1-schnell-mflux-4bit](https://huggingface.co/madroid/flux.1-schnell-mflux-4bit)
- [madroid/flux.1-dev-mflux-4bit](https://huggingface.co/madroid/flux.1-dev-mflux-4bit)
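One way to fetch them is with the `huggingface-cli` tool; the local directory below is an assumption, and any location passed later via `--path` works:

```sh
# Download the pre-quantized 4-bit schnell weights without touching the full 16-bit model
huggingface-cli download madroid/flux.1-schnell-mflux-4bit --local-dir ~/mflux-models/schnell-4bit
mflux-generate --path ~/mflux-models/schnell-4bit --model schnell --steps 2 --seed 2 --prompt "Luxury food photograph"
```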
### 💽 Running a non-quantized model directly from disk
MFLUX also supports running a non-quantized model directly from a custom location.
In the example below, the model is placed in `/Users/filipstrand/Desktop/schnell`:
```sh
mflux-generate \
--path "/Users/filipstrand/Desktop/schnell" \
--model schnell \
--steps 2 \
--seed 2 \
--prompt "Luxury food photograph"
```
Note that the `--model` flag must be set when loading a model from disk.
Also note that, unlike the typical `alias` way of initializing the model (which internally ensures that the required resources are downloaded),
loading a model directly from disk requires the downloaded weights to be laid out as follows:
```
.
├── text_encoder
│   └── model.safetensors
├── text_encoder_2
│   ├── model-00001-of-00002.safetensors
│   └── model-00002-of-00002.safetensors
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── tokenizer_2
│   ├── special_tokens_map.json
│   ├── spiece.model
│   ├── tokenizer.json
│   └── tokenizer_config.json
├── transformer
│   ├── diffusion_pytorch_model-00001-of-00003.safetensors
│   ├── diffusion_pytorch_model-00002-of-00003.safetensors
│   └── diffusion_pytorch_model-00003-of-00003.safetensors
└── vae
    └── diffusion_pytorch_model.safetensors
```
This mirrors how the resources are placed in the [HuggingFace Repo](https://huggingface.co/black-forest-labs/FLUX.1-schnell/tree/main) for FLUX.1.
*Huggingface weights, unlike quantized ones exported directly from this project, have to be
processed a bit differently, which is why we require the structure above.*
---
### 🎨 Image-to-Image
One way to condition the image generation is to start from an existing image and let MFLUX produce new variations.
Use the `--init-image-path` flag to specify the reference image, and `--init-image-strength` to control how much the reference
image should guide the generation. For example, given the reference image below, the following command produced the first
image using the [Sketching](https://civitai.com/models/803456/sketching?modelVersionId=898364) LoRA:
```sh
mflux-generate \
--prompt "sketching of an Eiffel architecture, masterpiece, best quality. The site is lit by lighting professionals, creating a subtle illumination effect. Ink on paper with very fine touches with colored markers, (shadings:1.1), loose lines, Schematic, Conceptual, Abstract, Gestural. Quick sketches to explore ideas and concepts." \
--init-image-path "reference.png" \
--init-image-strength 0.3 \
--lora-paths Architectural_Sketching.safetensors \
--lora-scales 1.0 \
--model dev \
--steps 20 \
--seed 43 \
--guidance 4.0 \
--quantize 8 \
--height 1024 \
--width 1024
```
Like with [Controlnet](#-controlnet), this technique combines well with [LoRA](#-lora) adapters:
![image](src/mflux/assets/img2img.jpg)
In the examples above, the following LoRAs are used: [Sketching](https://civitai.com/models/803456/sketching?modelVersionId=898364), [Animation Shot](https://civitai.com/models/883914/animation-shot-flux-xl-ponyrealism) and [flux-film-camera](https://civitai.com/models/874708?modelVersionId=979175).
---
### 🔌 LoRA
MFLUX supports loading trained [LoRA](https://huggingface.co/docs/diffusers/en/training/lora) adapters (actual training support is coming).
The following example uses the [The_Hound](https://huggingface.co/TheLastBen/The_Hound) LoRA from [@TheLastBen](https://github.com/TheLastBen):
```sh
mflux-generate --prompt "sandor clegane" --model dev --steps 20 --seed 43 -q 8 --lora-paths "sandor_clegane_single_layer.safetensors"
```
![image](src/mflux/assets/lora1.jpg)
---
The following example is [Flux_1_Dev_LoRA_Paper-Cutout-Style](https://huggingface.co/Norod78/Flux_1_Dev_LoRA_Paper-Cutout-Style) LoRA from [@Norod78](https://huggingface.co/Norod78):
```sh
mflux-generate --prompt "pikachu, Paper Cutout Style" --model schnell --steps 4 --seed 43 -q 8 --lora-paths "Flux_1_Dev_LoRA_Paper-Cutout-Style.safetensors"
```
![image](src/mflux/assets/lora2.jpg)
*Note that LoRA weights are typically trained with a **trigger word or phrase**. For example, in the latter case, the prompt should include the phrase **"Paper Cutout Style"**.*
*Also note that the same LoRA weights can work well with both the `schnell` and `dev` models. Refer to the original LoRA repository to see which model it was trained for.*
#### Multi-LoRA
Multiple LoRAs can be loaded together to combine the effects of the individual adapters. The following example combines both of the above LoRAs:
```sh
mflux-generate \
--prompt "sandor clegane in a forest, Paper Cutout Style" \
--model dev \
--steps 20 \
--seed 43 \
--lora-paths sandor_clegane_single_layer.safetensors Flux_1_Dev_LoRA_Paper-Cutout-Style.safetensors \
--lora-scales 1.0 1.0 \
-q 8
```
![image](src/mflux/assets/lora3.jpg)
To illustrate the difference, this image displays the four cases: having both adapters fully active, partially active, and no LoRA at all.
The example above also shows the usage of the `--lora-scales` flag.
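For example, a variation of the command above that emphasizes the subject adapter over the style adapter (the scale values are just a suggestion to experiment with):

```sh
# Uneven blend: subject LoRA at full strength, style LoRA at half strength
mflux-generate \
  --prompt "sandor clegane in a forest, Paper Cutout Style" \
  --model dev \
  --steps 20 \
  --seed 43 \
  --lora-paths sandor_clegane_single_layer.safetensors Flux_1_Dev_LoRA_Paper-Cutout-Style.safetensors \
  --lora-scales 1.0 0.5 \
  -q 8
```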
#### Supported LoRA formats (updated)
Since different fine-tuning services can use different implementations of FLUX, the corresponding
LoRA weights trained on these services can differ from one another. The aim of MFLUX is to support the most common formats.
The following table shows the currently supported formats:
| Supported | Name      | Example                                                                                                  | Notes                               |
|-----------|-----------|----------------------------------------------------------------------------------------------------------|-------------------------------------|
| ✅        | BFL       | [civitai - Impressionism](https://civitai.com/models/545264/impressionism-sdxl-pony-flux)                 | Many things on civitai seem to work |
| ✅        | Diffusers | [Flux_1_Dev_LoRA_Paper-Cutout-Style](https://huggingface.co/Norod78/Flux_1_Dev_LoRA_Paper-Cutout-Style/)  | |
| ❌        | XLabs-AI  | [flux-RealismLora](https://huggingface.co/XLabs-AI/flux-RealismLora/tree/main)                            | |
To report additional formats, examples, or any other suggestions related to LoRA format support, please see [issue #47](https://github.com/filipstrand/mflux/issues/47).
---
### 🕹️ Controlnet
MFLUX has [Controlnet](https://huggingface.co/docs/diffusers/en/using-diffusers/controlnet) support for even more fine-grained control
over the image generation. By providing a reference image via `--controlnet-image-path` and a strength parameter via `--controlnet-strength`, you can guide the generation toward the reference image.
```sh
mflux-generate-controlnet \
--prompt "A comic strip with a joker in a purple suit" \
--model dev \
--steps 20 \
--seed 1727047657 \
--height 1066 \
--width 692 \
-q 8 \
--lora-paths "Dark Comic - s0_8 g4.safetensors" \
--controlnet-image-path "reference.png" \
--controlnet-strength 0.5 \
--controlnet-save-canny
```
![image](src/mflux/assets/controlnet1.jpg)
*This example combines the controlnet reference image with the LoRA [Dark Comic Flux](https://civitai.com/models/742916/dark-comic-flux)*.
⚠️ *Note: Controlnet requires an additional one-time download of ~3.58GB of weights from Huggingface. This happens automatically the first time you run the `mflux-generate-controlnet` command.
At the moment, the Controlnet used is [InstantX/FLUX.1-dev-Controlnet-Canny](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny), which was trained for the `dev` model.
It can work well with `schnell`, but performance is not guaranteed.*
⚠️ *Note: The output can be highly sensitive to the controlnet strength and is very much dependent on the reference image.
Settings that are too high will corrupt the image. A recommended starting point is a value around 0.4; from there, experiment with the strength (see the sweep sketch below).*
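One practical approach is a small sweep around that starting point; this sketch assumes the same inputs as the example above, with illustrative output names:

```sh
# Sweep controlnet strength around the recommended 0.4 starting point
for s in 0.2 0.4 0.6; do
  mflux-generate-controlnet \
    --prompt "A comic strip with a joker in a purple suit" \
    --model dev \
    --steps 20 \
    --seed 1727047657 \
    -q 8 \
    --controlnet-image-path "reference.png" \
    --controlnet-strength "$s" \
    --output "controlnet_${s}.png"
done
```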
Controlnet can also work well together with [LoRA adapters](#-lora). In the example below the same reference image is used as a controlnet input
with different prompts and LoRA adapters active.
![image](src/mflux/assets/controlnet2.jpg)
### 🚧 Current limitations
- Images are generated one by one.
- Negative prompts are not supported.
- LoRA weights are only supported for the transformer part of the network.
- Some LoRA adapters do not work.
- Currently, the supported controlnet is the [canny-only version](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny).
### 💡 Workflow Tips
- To hide the model fetching status progress bars, `export HF_HUB_DISABLE_PROGRESS_BARS=1`
- Use config files to save complex job parameters in a file instead of passing many `--args`
- Set up shell aliases for required args, for example (see the usage sketch after this list):
- shortcut for dev model: `alias mflux-dev='mflux-generate --model dev'`
- shortcut for schnell model *and* always save metadata: `alias mflux-schnell='mflux-generate --model schnell --metadata'`
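With those aliases defined, day-to-day invocations shrink to:

```sh
# Using the aliases from the list above
mflux-dev --prompt "Luxury food photograph" --steps 25 --seed 2 -q 8
mflux-schnell --prompt "Luxury food photograph" --steps 2 --seed 2 -q 8
```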
### ✅ TODO
- [ ] LoRA fine-tuning (now also in [mlx-examples](https://github.com/ml-explore/mlx-examples/pull/1028) for reference)
- [ ] Frontend support (Gradio/Streamlit/Other?)
- [ ] [ComfyUI](https://github.com/filipstrand/mflux/issues/56) support?
- [ ] [Image2Image](https://github.com/filipstrand/mflux/pull/16) support (upcoming)
- [ ] Support for [PuLID](https://github.com/ToTheBeginning/PuLID)
- [ ] Support for [depth based controlnet](https://huggingface.co/InstantX/SD3-Controlnet-Depth) via [ml-depth-pro](https://github.com/apple/ml-depth-pro) or similar?
### License
This project is licensed under the [MIT License](LICENSE).