LET’S TRY QWEN IMAGE EDIT 2509 WITH OUR NEW CUSTOM NODES!


  • November 14, 2025

Hi folks, this is IAMCCS, and today we’ll dive into the real thing — the acclaimed model from Alibaba, Qwen Image Edit 2509, working side-by-side with our new custom nodes: IAMCCS_QE_Prompt_Enhancer and my little baby, IAMCCS_annotate! We’ll use models like the Nunchaku version and the normal diffusers (with a deeper dive into the Nunchaku installation process).

P.S. You can grab our custom nodes here (before they land in the ComfyUI Manager, just like their friend IAMCCS_nodes):

https://github.com/IAMCCS/IAMCCS_annotate.git

https://github.com/IAMCCS/IAMCCS_QE_PROMPT_ENHANCER.git

Take a look at this post’s twin, where we present our new tools so you can get to know them a little better:
https://www.patreon.com/posts/two-brand-new-qe-142842818?utm_medium=clipboard_copy&utm_source=copyLink&utm_campaign=postshare_creator&utm_content=join_link

What’s Inside This Post

This is a comprehensive walkthrough, so here’s what we’ll cover:

✓ Model Overview — What makes Qwen Image Edit 2509 special
✓ Installation — Both standard Diffusers and Nunchaku versions
✓ Workflows Breakdown — How the Prompt Enhancer integrates with Qwen
✓ Multi-Image Editing — Using up to 3 images for complex compositions
✓ Real Examples — Practical generations and use cases
✓ Technical Deep Dive — DiT architecture, VAE decoding, and semantic conditioning

A Few Words About Qwen Image Edit 2509

Qwen Image Edit 2509 is an advanced multi-conditional image editing model that excels at high-consistency subject preservation while executing complex transformations. Its key functions include:

Identity Consistency: Maintaining the character’s face, features, and text fidelity during major edits (superior to older models).

Multi-Image Fusion: Combining elements (subject, style, pose) from up to three different input images in one generation.

Advanced Control: Supporting external control inputs like pose maps (via tools like DWPreprocessor) for precise structural changes and full body manipulation.

Versatile Editing: Performing high-quality style transfers, background swaps, inpainting, and outpainting with exceptional coherence.

Installing Models…

For the Diffusers models, I’ll attach the links to the repos below; the procedure is always the same: download the files and drop them into the corresponding model folders (there’s also a small scripted-download sketch after the links, if you prefer doing it from Python).

GGUF models: https://huggingface.co/QuantStack/Qwen-Image-Edit-2509-GGUF/tree/main

Diffusion models: https://huggingface.co/Comfy-Org/Qwen-Image-Edit_ComfyUI/tree/main/split_files/diffusion_models

Text encoders: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main/split_files/text_encoders

VAE: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors

Lightx2v LoRAs: https://huggingface.co/lightx2v/Qwen-Image-Lightning/tree/main

LoRAs for multi-angle presets: https://huggingface.co/dx8152/Qwen-Edit-2509-Multiple-angles

LoRA for photo-realistic enhancement: https://civitai.com/models/2121900?modelVersionId=2400325

LoRA fusion: https://huggingface.co/dx8152/Qwen-Image-Edit-2509-Fusion/tree/main
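
If you prefer scripting the downloads, here’s a minimal Python sketch using the huggingface_hub package (my assumption; downloading through the browser and dropping the files into the model folders works just as well). The paths are examples, so adjust them to your own ComfyUI install:

from huggingface_hub import hf_hub_download
import shutil

# Grab the Qwen VAE from the Comfy-Org repo (huggingface_hub caches it locally)...
vae_path = hf_hub_download(
    repo_id="Comfy-Org/Qwen-Image_ComfyUI",
    filename="split_files/vae/qwen_image_vae.safetensors",
)

# ...then copy it into ComfyUI's vae folder (the folder must already exist).
shutil.copy(vae_path, "ComfyUI/models/vae/qwen_image_vae.safetensors")

The same pattern works for the diffusion models and text encoders above: just swap repo_id, filename, and the destination folder.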

A Little Trickier: The Nunchaku Version Model Installation

If you want some deeper knowledge about it, here’s the paper: [2411.05007] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

…but to put it simply, the Nunchaku team has optimized the architecture for blazing-fast speed, drastically reducing the sampling steps to a historic low of 4 or 8 and making your creative process dramatically faster.

The Installation of the Nunchaku Model

This setup isn’t a simple image generator; it’s a genuine pre-visualization tool and conceptual manipulation instrument.

First of all, you have to download the “ComfyUI-nunchaku” node through the ComfyUI Manager.

If you have some problems with it, you can directly clone it here: https://github.com/nunchaku-tech/ComfyUI-nunchaku.git

Open your terminal, go to your custom_nodes folder and type:

git clone https://github.com/nunchaku-tech/ComfyUI-nunchaku.git

Restart ComfyUI.

Now you have to download the wheels to start the Nunchaku engine. You’ll find the installation workflow inside their custom node package, or you can install them directly through the workflow I’ll attach below.

SECOND STEP

DOWNLOAD MODELS:

Qwen: https://huggingface.co/nunchaku-tech/nunchaku-qwen-image-edit-2509

Flux Krea: https://huggingface.co/nunchaku-tech/nunchaku-flux.1-krea-dev

If you have less than 12 GB of VRAM, aim for the INT4 R32 models (lighter and faster); if you’re above that, use the R128 for richer, more detailed image quality.

Now it’s time to download the Text Encoder: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors

VAE: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors
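
If you want to script this part as well, here’s a hedged sketch with huggingface_hub’s snapshot_download (same assumption as before; the destination path is illustrative, and the "r32"/"r128" filename patterns are an assumption on my side, so check the repo page). Simply downloading the single file you need from the browser works too:

import torch
from huggingface_hub import snapshot_download

# Rule of thumb from above: INT4 R32 below ~12 GB of VRAM, R128 above that.
vram_gb = torch.cuda.get_device_properties(0).total_memory / 1e9 if torch.cuda.is_available() else 0
pattern = "*r32*" if vram_gb < 12 else "*r128*"

snapshot_download(
    repo_id="nunchaku-tech/nunchaku-qwen-image-edit-2509",
    allow_patterns=[pattern],
    local_dir="ComfyUI/models/diffusion_models",  # adjust to your own install
)

The text encoder and VAE linked above can be fetched exactly like in the earlier hf_hub_download sketch.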

I’ve been DYING to show you this workflow in action…

The First Workflow (Simple)

Ladies and gentlemen, I’m happy to present to you my happy ANNOTATED workflow canvas 🙂

Yes, I know, there may be some chaos in there, but hey, as my friend Nietzsche once said:

«Ich sage euch: Man muss noch Chaos in sich haben, um einen tanzenden Stern gebären zu können.» — (from Also sprach Zarathustra, Prolog 5)

“One must still have chaos in oneself to be able to give birth to a dancing star.”

So we’re destined to be a future star 🙂

Of course, the notes were made with the first of our new custom nodes, IAMCCS_annotate (I discuss it in the post linked below, so take a look).

Text Conditioning – Where the Prompt Enhancer Lives

The first section is about what you want to tell the model. Normally, you would connect a simple text node → CLIPTextEncode → Qwen model. But this limits you to one raw text string.

Instead, the IAMCCS_QE_Prompt_Enhancer node replaces that basic input with a dynamic semantic layer.

Here’s what happens:

  1. You choose a preset (e.g., “Change image 1 to the back view” or “Low Angle Shot”) from the visual grid.

  2. The node outputs a structured, ready-to-encode text built from your chosen preset + any custom additions.

  3. If you toggle Maintain Consistency or Get Pose of Image 3, it appends Boolean-based clauses to the text, such as “| Maintain the consistency” or “| Get the pose of image 3”. (Check the custom nodes introduction post for further info!)

  4. This string is then sent to CLIPTextEncode, which transforms it into a conditioning tensor — the actual language the Qwen DiT understands.

In other words:
💬 Prompt Enhancer = your visual instruction layer
🧩 CLIP Encoder = translator between your intent and the model’s latent space

This approach makes it possible to trigger context-aware edits directly from buttons instead of typing every variation by hand.
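
To make that flow concrete, here’s a rough, stand-alone Python sketch of the preset-plus-toggles idea. To be clear, this is not the node’s actual source code, just an illustration of how a structured string can be assembled before it reaches CLIPTextEncode:

# Illustrative only: mimics the "preset + custom text + Boolean clauses" assembly.
def build_prompt(preset, custom_text="", maintain_consistency=False, get_pose_of_image_3=False):
    parts = [preset]
    if custom_text:
        parts.append(custom_text)
    if maintain_consistency:
        parts.append("Maintain the consistency")
    if get_pose_of_image_3:
        parts.append("Get the pose of image 3")
    return " | ".join(parts)

print(build_prompt("Change image 1 to the back view", maintain_consistency=True))
# -> "Change image 1 to the back view | Maintain the consistency"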

Model Loading and Sampling – The Qwen DiT Heart

Once the text conditioning is ready, it flows into the Qwen Image Edit 2509 model — you can run this with either Nunchaku models (for quantized speed and low VRAM usage) or the standard Diffusers-style Qwen models for maximum detail.

Here’s the logic:

  • The loader initializes Qwen’s Diffusion Transformer (DiT) backbone, a modern alternative to the traditional U-Net architecture used in Stable Diffusion.

  • The DiT combines multi-modal attention layers (text + image embeddings) with a temporal coherence system for sequential editing.

  • It processes one or more image inputs — image1, image2, image3 — each corresponding to different conditional roles (subject, style, pose, or background).

  • The conditioning tensors from the Prompt Enhancer/CLIP pipeline are merged inside the DiT’s latent space, influencing every generation step.

The result: Qwen interprets your preset not as a “prompt string” but as a weighted instruction map, aligning visual logic with linguistic meaning.

For example:

“Copy Pose 🤸” + “Maintain the consistency”

tells Qwen to keep the subject’s identity from image1, but adopt the pose of image3, maintaining lighting and proportion.

Very powerful, huh?
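
If you like seeing the mechanics spelled out, here’s a toy PyTorch sketch of the joint-attention idea behind the DiT bullets above. It’s purely illustrative (tiny arbitrary dimensions, nothing like Qwen’s real architecture), but it shows how text and image tokens can share a single attention pass so that each one influences the other:

import torch
import torch.nn as nn

dim = 64
text_tokens = torch.randn(1, 20, dim)    # stand-in for the encoded prompt
image_tokens = torch.randn(1, 256, dim)  # stand-in for the image latents

# Concatenate both modalities and let one self-attention layer mix them.
joint = torch.cat([text_tokens, image_tokens], dim=1)
attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
mixed, _ = attn(joint, joint, joint)

print(mixed.shape)  # torch.Size([1, 276, 64])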

Let’s Try with Our Friend from Another Dimension

Can you understand the tool in our fragile filmmaker’s hands?

Another Try

Now it’s time to take flight…

Yes, I’m scared of flying too…

Image Management – The Multi-Input Logic

Qwen Image Edit 2509 supports multi-image editing — a defining feature. The workflow lets you load up to three images:

Image 1: Base subject (the main character or scene)
Image 2: Style, pose, or background reference
Image 3: Optional secondary control (used for advanced pose or light transfer)

When you select a Multi-Image Edit preset in the Prompt Enhancer (e.g., “Merge Background 🏞️” or “Add Character 👤”), the text conditioning automatically activates the correct semantic instructions, while the Qwen DiT internally fuses the image embeddings.

This fusion isn’t a naive blend — it’s attention-based composition, meaning each pixel’s generation depends on both spatial correlation and linguistic context.

So instead of just “mixing images,” Qwen understands which part belongs where, guided by the meaning of the prompt.

Example for Multi-Image Compositing:

Of course, you can be more specific by adding a second prompt to the final prompt for more consistent results, but you can already see how powerful this stuff is!
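
Continuing the little build_prompt sketch from earlier (again, illustrative only, not the node’s real code), a multi-image instruction with a second prompt bolted on might come out like this:

prompt = build_prompt(
    "Add the character from image 2 into image 1",
    custom_text="keep the original lighting and proportions",
    maintain_consistency=True,
)
print(prompt)
# -> "Add the character from image 2 into image 1 | keep the original lighting and proportions | Maintain the consistency"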

Decoding and Output – The Cinematic Result

I have to mention the VAE step. After diffusion sampling (which can run in as few as 4–8 steps in Nunchaku mode thanks to DiT acceleration, or 4 steps with the standard diffusion model thanks to the lightx2v Lightning LoRA), the output latent is sent through the Qwen VAE decoder, which is responsible for reconstructing the visual result from the compressed latent space.

The VAE ensures that colors, sharpness, and depth match the high-fidelity tone defined by the prompt. Finally, as we’ve already seen, your output node shows a reconstructed image — one that visually obeys the semantic rules of your Prompt Enhancer setup.
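
If it helps to visualize that last step, here’s a toy PyTorch stand-in for a VAE decoder (again, nothing like the real Qwen VAE, whose channel counts and layers I’m not reproducing here); it just shows the direction of the operation, a small compressed latent being upsampled back into RGB pixels:

import torch
import torch.nn as nn

latent = torch.randn(1, 16, 64, 64)  # a compressed latent (sizes are arbitrary for the demo)

# Toy decoder: two transposed convolutions upsample the latent back to an RGB image.
decoder = nn.Sequential(
    nn.ConvTranspose2d(16, 64, kernel_size=4, stride=2, padding=1),
    nn.SiLU(),
    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),
)

image = decoder(latent)
print(image.shape)  # torch.Size([1, 3, 256, 256])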

The Second Workflow (First Image)

This workflow introduces Image-1 generation via Flux Krea, letting you choose between the Nunchaku variant or the classic diffusion model.
All steps are annotated right inside the workflow. 🙂

Specific LoRAs in Action

The prompt–LoRA combination adds a powerful consistency boost to your target look.
Just hit Ctrl+B to mute any LoRA you don’t want active.

The QE_PROMPT_ENHANCER Beauty

You can also add a second prompt to help the model better understand what you want and get more consistent results — pretty powerful stuff!

Why the Integration Matters

Without the Prompt Enhancer, you’d have to type verbose, structured instructions every time:

“Change image 1 to a low-angle shot while maintaining face identity and copying pose from image 2.”

The enhancer replaces that friction with a single click, guaranteeing consistent, precise language and semantic accuracy. It also reduces prompt fatigue and creates a library of reusable creative gestures — something essential for cinematic or storyboard-based AI workflows.

Until it lands in the ComfyUI Manager, you can grab it at: https://github.com/IAMCCS/IAMCCS_QE_PROMPT_ENHANCER.git

And the IAMCCS_annotate is here:
https://github.com/IAMCCS/IAMCCS_annotate.git

(P.S. To hide the notes, you can press Alt (Option) + S and all the notes are gone!)

Check out our previous post:
https://www.patreon.com/posts/two-brand-new-qe-142842818?utm_medium=clipboard_copy&utm_source=copyLink&utm_campaign=postshare_creator&utm_content=join_link

In Practice, About Our Custom Nodes:

Annotate gives you a visual space to plan your compositions directly on the workflow.

Prompt Enhancer gives you the linguistic and structural bridge to execute them through Qwen.

Together, they form a visual–semantic pipeline: draw it, describe it, generate it.

Attached below you’ll find the IAMCCS QE Prompt Enhancer Simple Workflow, designed as a clean entry point for Qwen Image Edit 2509.

In our workflow, the Prompt Enhancer bridges your concept and Qwen’s understanding — automatically expanding your prompt into semantic instructions that match the model’s conditioning.

This Is the End, My Only Friend, the End…

This long post is coming to an end. I’m already happy if the workflow and my nodes can be helpful to some of you, my friends.

If you create something cool with this setup, tag me or share it in the community — I’d love to see what you build! 🚀

Coming Soon

More Qwen-specific workflows are coming soon – a face-swap workflow is next! – and that’s just the beginning. I’ve been developing a new node for weeks now, and honestly, it’s going to be a game changer for me and anyone who works like I do. Stay tuned.

