Hi folks, this is CCS.
First of all — thank you. Genuinely. Every share and subscription helps me keep pushing this work forward. You give me real reasons to keep going.
As you already know, each of my posts introduces something new from a node standpoint — not just a workflow to copy-paste, but an actual technical advancement designed for filmmakers first, and for content creators in general. This one is no different.
In the introduction video of this post, you've already seen a video generated with these workflows — a famous musical from the world of ideas and projects that were never truly born 😉
What this is for
This pipeline (and the new nodes ;)) is designed for anyone who wants a complete, end-to-end production workflow for music videos and short films — and needs longer clips to actually work with.
More room to edit. More room to grade. More room to play.
With a 3-segment setup like this — which is exactly what both workflows shared here implement — you can reach up to 60 seconds of fully audio-synchronized video in a single run. And yes: depending on your VRAM and RAM, you can push that to 60 seconds at 1920×1080 with enough headroom.
That said — and I mean this — start at 1280×720.
Understand how the segments connect, how the audio math works, how the extensions chain together before pushing the resolution. This pipeline rewards understanding before brute force.
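To make the "audio math" concrete, here is a back-of-the-envelope sketch with assumed numbers — 25 fps, 44.1 kHz, 20 s per segment. The real values come from the workflow's own widgets, so treat these as placeholders:

```python
# Back-of-the-envelope timing math with assumed numbers; the actual values
# are set by the workflow widgets (fps, frames per segment, sample rate).
FPS = 25             # assumed output frame rate
SAMPLE_RATE = 44100  # assumed audio sample rate
SEG_SECONDS = 20     # 3 segments x 20 s = 60 s total

frames_per_segment = FPS * SEG_SECONDS           # 500 frames per segment
samples_per_segment = SAMPLE_RATE * SEG_SECONDS  # 882,000 samples per segment

# Each segment's audio window starts exactly where the previous one ended,
# so the reassembled track lines up with the video timeline sample-for-sample.
windows = [(i * samples_per_segment, (i + 1) * samples_per_segment)
           for i in range(3)]
```

The point of the exercise: once you know your fps and sample rate, every segment boundary is just multiplication, and that is what keeps audio and video locked together across generation passes.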
Here’s a short demo (kept brief for clarity):
The workflow: LTX 2.3 — 3-Segment Audio+Video Extension
The base workflow is a 3-segment LTX 2.3 image-to-video pipeline with full audio integration. You feed in a starting image, a full audio track, and the workflow generates three consecutive video segments that are stitched together temporally — all while keeping the audio correctly sliced, positioned, and reassembled for each generation pass.
Each segment knows where it sits on the audio timeline. Each segment knows exactly how much of the track it needs to condition on. And at the end, the three audio slices are assembled back into a single coherent track that matches the video timeline precisely.
This is not audio simply concatenated at the end. The audio drives the generation from within.
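As a rough illustration of what "sliced, positioned, and reassembled" means — a toy sketch of the core idea, not the actual AudioExtensionMath/AudioExtender code:

```python
# Toy sketch of per-segment audio slicing and reassembly. NOT the node
# implementation: each segment conditions on its own window of the track,
# and the windows concatenate back into the original track losslessly.
def slice_for_segment(track, seg_index, samples_per_segment):
    """Return the audio window that segment `seg_index` conditions on."""
    start = seg_index * samples_per_segment
    return track[start : start + samples_per_segment]

def reassemble(slices):
    """Concatenate the per-segment slices back into one continuous track."""
    out = []
    for s in slices:
        out.extend(s)
    return out

samples_per_segment = 8                        # tiny placeholder value
track = list(range(3 * samples_per_segment))   # placeholder 3-segment track
slices = [slice_for_segment(track, i, samples_per_segment) for i in range(3)]
rebuilt = reassemble(slices)                   # identical to the input track
```

Because each slice is cut at an exact segment boundary, reassembly is lossless — no resampling, no drift between the audio and the stitched video.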
For a quick overview of the new nodes (IAMCCS-nodes version 1.4.0), I'll be publishing a dedicated post as soon as possible.
And of course, for a complete understanding of how they work, I highly recommend checking out the supporters post! 😉
The pipeline — quick walkthrough
The structure is readable and intentional:
- Load your LTX 2.3 model, VAE (video + audio), and CLIP — once, shared across all three segments
- Resize your starting image once via ResizeImagesByLongerEdge
- Segment 0: feed the starting image into LTXVPreprocess → LTXVImgToVideoInplace, encode audio with the correct audio slice for this segment (from AudioExtensionMath + AudioExtender), set the noise mask, and sample
- Separate the audio and video latents (LTXVSeparateAVLatent), then decode the video latent with IAMCCS_VAEDecodeTiledSafe
- Repeat for Segments 1 and 2 — each receives the tail frames of the previous segment as its new starting image via LTXVConcatAVLatent, with the audio cursor tracking forward automatically
- Final assembly: the three decoded video pieces and the assembled audio are combined into the final output video
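The walkthrough above can be condensed into a small runnable toy. Everything here is a placeholder — the helper names are hypothetical stand-ins for the real nodes, "frames" are just integers, and "audio" is a flat list — but the chaining and cursor logic mirror the graph:

```python
# Conceptual toy of the 3-segment chaining logic. The functions below are
# hypothetical stand-ins, NOT the real ComfyUI nodes.
def slice_audio(audio, i, seg_samples):
    # stand-in for AudioExtensionMath + AudioExtender
    return audio[i * seg_samples : (i + 1) * seg_samples]

def generate_segment(start_frame, audio_slice, seg_frames):
    # stand-in for preprocess -> img2video -> sample -> separate -> decode
    return [start_frame + k for k in range(seg_frames)]

def run_pipeline(start_frame, audio, n_segments=3, seg_frames=4, seg_samples=8):
    video, audio_out = [], []
    frame = start_frame
    for i in range(n_segments):
        a = slice_audio(audio, i, seg_samples)   # audio cursor moves forward
        seg = generate_segment(frame, a, seg_frames)
        video.extend(seg)
        audio_out.extend(a)
        frame = seg[-1]  # tail frame of this segment seeds the next one
    return video, audio_out

video, audio_out = run_pipeline(0, list(range(24)))
```

Note how each segment starts from the tail of the previous one while the audio cursor advances in lockstep — that is the whole trick that keeps a 3-segment run feeling like one continuous shot.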
The Low RAM variant
There’s a second workflow included: LOW_RAM. The logic is identical to the standard version, but it adds one key difference in the decode step: instead of keeping all decoded frames in memory, it uses IAMCCS_VAEDecodeToDisk to write frames directly to disk as they are decoded, chunk by chunk.
This means the full image tensor never needs to exist in RAM all at once. For long segments at higher resolutions on machines with limited system RAM, this is what makes the pipeline actually runnable.
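A minimal sketch of that idea — assumed behavior for illustration, not IAMCCS_VAEDecodeToDisk's actual implementation:

```python
# Sketch of chunked decode-to-disk: decode the latent a chunk at a time and
# stream each chunk straight to a file, so only one chunk is ever in memory.
import os
import tempfile

def decode_chunk(latent_chunk):
    # stand-in for the real VAE decode of one chunk of latent frames
    return bytes(latent_chunk)

def decode_to_disk(latent, chunk_size, out_path):
    with open(out_path, "wb") as f:
        for start in range(0, len(latent), chunk_size):
            chunk = latent[start : start + chunk_size]
            f.write(decode_chunk(chunk))  # written and freed before the next chunk

latent = list(range(100))  # placeholder "latent" of 100 tiny frames
path = os.path.join(tempfile.gettempdir(), "decoded_frames.bin")
decode_to_disk(latent, chunk_size=16, out_path=path)
```

Peak memory is bounded by the chunk size rather than the clip length, which is exactly why this variant stays runnable on machines where the standard decode would not.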
I won’t go into the internal mechanics here — tiling strategy, seam handling, chunk sizing — that’s covered in the supporter post. But if you’re on a machine with 16–24 GB of RAM and you see memory errors during the decode phase, the Low RAM workflow is where to start.
What you need installed
- IAMCCS_nodes (updated) — for all the nodes described above
- ComfyUI-LTXVideo — for LTXVPreprocess, LTXVImgToVideoInplace, LTXVAudioVAEEncode, and the rest of the LTX 2.3 native nodes
- ComfyUI-KJNodes — for VAELoaderKJ and a few helpers used in the graph
Where to go from here
This post gives you everything you need to load the workflow, understand each block, and start generating.
Start at 1280×720, learn how the segments stitch together, then scale up.
If you want the full technical breakdown — how IAMCCS_VAEDecodeToDisk and the GlobalPlanner node automatically configure clip settings based on your input, how seams between latent chunks are handled, how to tune overlap and frame math for different FPS configurations, and the reasoning behind every key widget — that's in the supporter post.
Supporting is also the most direct way to keep this work moving — it funds the time to build, test, document, and share real production-tested pipelines back to the open-source community.
More soon.