Audio visualizer with Image Generator protocol

From other posts here, I see I’m not alone in struggling to convert a real-time composition for offline movie export.

My composition basically has a Play Audio File, a Fire Periodically and a Build List node. In the realtime version, the Fire Periodically is set to a standard video frame rate (23.976 fps, i.e. a period of 0.0417083 seconds) and is connected to the Build List, which assembles the layers fed by the FFT process done on the audio coming from the Play Audio File.
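As a quick check on the period quoted above, here is a minimal sketch of the arithmetic. It assumes that "23.976 fps" is the usual shorthand for the NTSC film rate 24000/1001; the function name is made up for illustration.

```python
from fractions import Fraction

# Assumption: "23.976 fps" stands for the NTSC film rate 24000/1001.
NTSC_FILM_FPS = Fraction(24000, 1001)


def frame_period_seconds(fps):
    """Return the time between two frames, in seconds, for a given frame rate."""
    return float(1 / Fraction(fps))


exact_period = frame_period_seconds(NTSC_FILM_FPS)          # 1001/24000 s
rounded_period = frame_period_seconds(Fraction("23.976"))   # 1/23.976 s

print(round(exact_period, 7))
print(round(rounded_period, 7))
```

Under that assumption, the 0.0417083 value matches 1001/24000 seconds, while 1/23.976 rounds to 0.0417084. The difference is tiny, but it shows which definition of the frame rate the period came from.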

In realtime, using Save to Movie works, but I want to raise the quality with offline export. How do I get the Play Audio File and Fire Periodically to be synchronized together on the frame rate? I’ve tried things mentioned in the above referenced posts, but still no go…

Hey @Kewl. Maybe post the composition so we can take a look. If I understand you correctly, you want to offline-render a video file that is audio-reactive to an audio file. You also need to trigger nodes at both frame-time rate and audio-time rate.

Since Offline Movie Export currently doesn’t support audio export in movie files, I don’t believe there is an ‘audio-time’ port in generator protocol mode (I could be wrong). Without an audio-time value, it’s almost impossible to sync the audio events to video frames.

Not to mention that the Play Audio File node doesn’t have a time input port anyway. Video has dual playback nodes (Play Video and Get Video Image): one is for online rendering, the other for offline.

Hopefully I am wrong, and it’s not a FR! ;-)


Time event as an event only cable (option key drag cable) and plug that into Build List

does the trick for the build list. Thanks!

The Play Audio File speed is off though: it’s too slow. Does it not need to be controlled via its Set Time port? As it stands right now, in the resulting video file, each frame is repeated three to five times: it really looks like the Play Audio File is dragging.

As for the audio not being exported, once the video is done, I rejoin the audio and video in Pro Tools (which also allows for sync check).  

Sorry for changing my previous post, I’m still researching it all a bit, (and didn’t want to post info that wasn’t helpful to you!).

Glad that one suggestion of mine fixed half of the issue.

As per my updated post above, I think Play Audio File isn’t designed for offline rendering. What happens is that it plays the audio file at the correct speed, while Vuo tries to render the video frames at the fastest speed possible to save you time. The same problem happens with Fire Periodically: that’s online only.

When Vuo does offline rendering, it just tries to render everything as quickly as possible. This could be faster or slower than real time depending on your quality settings, but either way normal time is no longer respected and only protocol time is used. That is why you need to connect the time port in the generator protocol to everything; otherwise it wouldn’t know when it should do things.
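To illustrate the idea of protocol time, here is a conceptual sketch in Python. It is not Vuo’s actual implementation: `render_offline` and `render_frame` are hypothetical names, and the only point is that each frame’s content is computed from the time value handed to it, never from the wall clock.

```python
def render_offline(frame_count, fps, render_frame):
    """Render frames as fast as the machine allows.

    Each frame receives its own timestamp (frame index / fps), so the
    result is the same however long each frame takes to compute.
    """
    frames = []
    for index in range(frame_count):
        protocol_time = index / fps   # seconds on the movie timeline
        frames.append(render_frame(protocol_time))
    return frames


# Hypothetical "composition": the output depends on time only.
def example_frame(time_seconds):
    return time_seconds * 10


print(render_offline(3, 2.0, example_frame))
```

Anything in the chain that keeps its own clock (a periodic timer, real-time audio playback) falls outside this loop, which is consistent with the repeated frames described earlier in the thread.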

So I can’t see it working 100% without a special node that spits out audio buffers at a certain frame-time.

A super ‘hacky’ way to try to make it work would be to plug the time port into Set Time on Play Audio File, and make sure to feed an Allow First Event into Play Audio File’s Play port. (As you suggested, but I’m not at a Mac now, so I can’t test currently…)

Looks like we need a version of Decode Movie Image for audio to make offline audio work well (at least for generating graphics with audio files).

I do hope I am wrong!

A larger-picture idea could be for Team Vuo to quarantine offline vs. online nodes (or at least show a warning). Quarantine would simply mean not being able to use online nodes in offline mode. I don’t know the best way for that to work, though; possibly there should be a new mode, offline render, which would be the only protocol to have disabled nodes?

Are you noticing a substantial difference between online and offline rendering? (Obviously, if your composition is very processor-heavy, that’s a given.)

Are you noticing a substantial difference between online and offline rendering?

Well, I didn’t have a chance to compare yet, since we’re still looking for a solution.

But still, I imagine that I’m not alone in wanting to use audio in offline video rendering. Thanks for your help.

As @alexmitchellmus said, one solution would be a Decode Audio Frame node suitable for offline rendering. (Feel free to create a feature request.)

The attached compositions demonstrate another solution: Make a composition with Play Audio that writes the audio amplitudes to a file. Make the image generator composition read the file and select the appropriate amplitude for each movie frame based on the time input.

analyze-audio.vuo (3.29 KB)

render-audio.vuo (5.08 KB)
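The lookup step of that two-pass approach can be sketched as follows. This is only an illustration in Python, not the contents of the attached compositions: the one-value-per-line file layout, the fixed `readings_per_second` rate and both function names are assumptions.

```python
def parse_amplitudes(text):
    """Parse one amplitude per line (assumed file layout), skipping blank lines."""
    return [float(line) for line in text.splitlines() if line.strip()]


def amplitude_for_frame(amplitudes, readings_per_second, time_seconds):
    """Return the stored amplitude in effect at the given movie time.

    The index is clamped so that times before the start or after the end
    of the recording fall back to the first or last reading.
    """
    if not amplitudes:
        raise ValueError("no amplitude data")
    index = int(time_seconds * readings_per_second)
    index = max(0, min(index, len(amplitudes) - 1))
    return amplitudes[index]


# Hypothetical data: four readings stored at 2 readings per second.
data = parse_amplitudes("0.0\n0.1\n0.2\n0.3\n")
print(amplitude_for_frame(data, 2, 0.6))
```

Because the lookup depends only on the time input, it gives the same answer whether the frame is rendered in real time or offline.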

Oh, that’s cool! I’ll see if I can adapt it to Calculate Amplitude for Frequencies


This is what I got so far: it basically works, but it’s too slow and it’s only for one channel.

fft_test.vuo (3.91 KB)

For the composition in the previous comment, one way to circumvent the slowness would be to start and stop Play Audio File every 512 samples, so that each block of samples can be analyzed and reach the Enqueue, as a text item, in time. Would it be possible to somehow connect the Enqueue to Play Audio File, saying something like “I received the data from the most recent 512 samples, now play the next 512 samples”?
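The handshake described here amounts to pulling blocks on demand instead of having them pushed at playback speed. A minimal Python sketch of that idea, with hypothetical names and a stand-in analysis function rather than a real FFT:

```python
def blocks(samples, block_size=512):
    """Yield consecutive full blocks; a trailing partial block is dropped."""
    for start in range(0, len(samples) - block_size + 1, block_size):
        yield samples[start:start + block_size]


def analyze_all(samples, analyze, block_size=512):
    """Analyze every block in order.

    The next block is only requested once the previous result has been
    stored, so a slow analysis step cannot cause blocks to be skipped.
    """
    results = []
    for block in blocks(samples, block_size):
        results.append(analyze(block))
    return results


# Stand-in analysis (peak absolute value); a real chain would run an FFT here.
def peak(block):
    return max(abs(sample) for sample in block)


print(analyze_all(list(range(1300)), peak))
```

The trade-off is that the analysis pass no longer runs in real time, which is acceptable when its only job is to write data for a later offline render.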

The composition needs a Hold List node between the FFT and Get Item from List. Does that fix the slowness?

More info: Why is the output of Process List getting jumbled?

fft_test.vuo (4.38 KB)

The Hold List stabilizes the speed, but it’s still too slow.

Ideally, I would like to store the FFT reading for every block of 512 samples (93.75 readings/second). To test, I have a 10-second pink noise file that, at the end, outputs a text file with 155 items (without the Hold List, it gives a different number each time I test it, but it’s around 170 items). If it were possible to store all the FFT readings, it would give 937 items. The FFT itself outputs without any slowdown, but the rest of the chain can’t keep up. That’s why the idea of the start/stop playback so that every block of 512 samples can be analyzed and stored to text.
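The figures above can be checked with a few lines of Python. The 48 kHz sample rate is an assumption: it is not stated in the post, but it is the rate implied by 93.75 readings per second with 512-sample blocks.

```python
SAMPLE_RATE_HZ = 48_000   # assumption: implied by 93.75 readings/s * 512 samples
BLOCK_SIZE = 512


def readings_per_second(sample_rate_hz=SAMPLE_RATE_HZ, block_size=BLOCK_SIZE):
    """Number of block-sized readings produced per second of audio."""
    return sample_rate_hz / block_size


def expected_readings(duration_seconds, sample_rate_hz=SAMPLE_RATE_HZ,
                      block_size=BLOCK_SIZE):
    """Number of complete blocks in a file of the given duration."""
    return int(duration_seconds * sample_rate_hz) // block_size


print(readings_per_second())        # readings per second
print(expected_readings(10))        # complete blocks in 10 seconds
print(155 / expected_readings(10))  # fraction of readings actually stored
```

With those assumptions, the 155 stored items are roughly one sixth of the 937 complete blocks in the 10-second file.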

Thanks for the help.

I see what you mean about the chain not being able to keep up. Specifically, the Build List isn’t able to loop 2047 times x 93.75 times per second.

That’s why the idea of the start/stop playback so that every block of 512 samples can be analyzed and stored to text.

That seems to work, at least for my short test audio — SaveDataFromFFT-StartStop.vuo.

It might be less ugly with the proposed Decode Audio Frame node.

I was going to suggest another possibility — Instead of trying to convert reals → texts with each audio event, move that processing to the end, after the audio has finished. Unfortunately, SaveDataFromFFT-ProcessAtEnd.vuo doesn’t work because the Cut List node is slow for large lists.

Or another possibility would be a node (doesn’t yet exist) that converts a list of reals to a list of texts, without having to go through the Build List loop.

Anyway, hope that helps.

SaveDataFromFFT-StartStop.vuo (6.3 KB)

SaveDataFromFFT-ProcessAtEnd.vuo (6.82 KB)

GraphFFTData.vuo (5.28 KB)

Hey, thanks! Attached are stereo versions, without the Hold List: seems to work.

Is the Decode Audio Frame a FR yet? @alexmitchellmus?  

SaveDataStereoFFT-StartStop.vuo (6.9 KB)

ProcessStereoFFTData.vuo (3.66 KB)

Totally agree with this topic! Nice job!

I’m jumping in just to ask if anybody knows what the Convert Frame to Timestamp node is about (full name “vuo.type.audioframe.real”).
The name says audio frame, but the type of the port is video frame, and I can’t connect a video frame or anything else to it ;)

Maybe a very dumb question!

I’m still having trouble with the “speed”:

When I test this composition in real time, I get around 60 fps. When I export it as a movie (1920×1080, 23.976 fps, H.264), each frame is repeated three or four times. Is the video compression causing the problem? Should I connect the Build List to something else? I’m sorry to be so obtuse about this, I’m usually decent enough at dataflow programming…

The included text file is 10 seconds of pink noise FFT data.

test.vuo (4.4 KB) (9.25 MB)

So the video codec has an impact: with H.264, each image is repeated on two to four frames. With ProRes (422 or 4444), each image is repeated on two frames for the first second to second and a half, and then it settles to the right “speed” of one frame for one image.

What is interesting is that, by putting a Blend Image with Feedback with an X-axis translation in the feedback, I can see that every frame in the exported video is indeed different from the others, whatever the chosen codec.

So when exporting to a movie, the video codec somehow has an impact on the flow of data in the first, upstream nodes, but not on the downstream nodes, since the Blend Image with Feedback does produce a new image for each video frame.


ProRes 4444


@Bodysoulspirit, there are a couple of issues with the vuo.type.audioframe.real node. The port popover should say “Audio Frame” instead of “Video Frame”; that’s a bug. And actually, the node isn’t useful yet since there are no nodes that work with Audio Frames.

@Kewl, could you post a complete composition (test.vuo doesn’t have any cables to outputImage) and a screenshot of your movie export settings?

Either that

or that

I’ve sent you a message via Vuo’s contact page.

Thanks for the additional info, @Kewl.

Looks like another case where extraneous events are sneaking into the Build List loop and making things wonky. Hold List nodes are needed to block the events. I’ll reply to your contact message with a revised composition.

I suspect the variation between codecs is because they take a different amount of (real) time to export each frame. Different timings, different wonkiness.
