ShaderToy GLSL performance in 2.0.0 Beta 1

iaian7 · September 16, 2019, 8:49pm

Vuo 2 is looking more and more promising, thanks for all the hard work!

As a test this afternoon I ported a screensaver I’d made in Quartz Composer earlier this year (relying very heavily on the GLSL patch at the time). Using the Make Image with ShaderToy node, I was able to port it to Vuo surprisingly easily. I lost all of my custom shader inputs, of course, which had to be hardcoded into the shader, and it couldn’t load more than 4 images, but it did…work. I was able to make things a little more efficient while I was at it, which was nice. And on top of that, I was then able to (with the help of Chrome Console to load custom images) get it running on ShaderToy too!

However…full screen performance in Vuo, and performance of the packaged screensaver, is unusable. I’m using the same 4k textures across all platforms, and since the code really is the same in ShaderToy as it is in Vuo, I’m not sure what the problem could be.

The Quartz composition runs great in full screen (multiple monitors, 60fps confirmed)
Chrome runs great in full screen (~60fps)
Running it in a Vuo window appears to be fine (~60fps)
Running in full screen isn’t great (~30fps with frequent frame drops and stuttering)
Exporting as a screen saver is worse (not sure, but <20fps with lots of frame drops and stuttering).

Is this a known issue? Has anyone else tested performance between Quartz Composer GLSL, Vuo, and ShaderToy, especially in fullscreen and/or as a screensaver?

Asking here before I file a bug report. :)

iaian7 · September 16, 2019, 8:50pm

I’ll post an example file if I get time to make a generic test, but for now it contains branding material I’m not entirely comfortable sharing on behalf of my employer.

jstrecker · September 16, 2019, 10:20pm

One possible difference is that Vuo renders using Retina resolution when possible, whereas QC doesn’t (it always renders low resolution). So the performance difference might be because the shader is trying to run at twice the resolution (4 times the number of pixels).

If you add a Divide node to divide the width and height each by 2 before rendering, is the performance more equal?
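The "twice the resolution, 4 times the number of pixels" arithmetic can be sketched in a few lines of Python. This is only an illustration: the 1920x1080 view size and the `pixel_count` helper are assumptions made up for the example, not anything taken from Vuo.

```python
def pixel_count(width_pt, height_pt, backing_scale):
    """Pixels a fragment shader must fill for a view measured in points."""
    return int(width_pt * backing_scale) * int(height_pt * backing_scale)

# Hypothetical fullscreen view of 1920x1080 points.
low_res = pixel_count(1920, 1080, 1.0)   # non-Retina rendering
retina = pixel_count(1920, 1080, 2.0)    # Retina rendering at 2x

# Doubling each axis quadruples the per-frame shader workload.
ratio = retina / low_res
print(low_res, retina, ratio)
```

Since a ShaderToy-style shader runs once per pixel, the per-frame cost scales roughly with this pixel count.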

iaian7 · September 16, 2019, 11:29pm

Excellent point! I hadn’t thought of that, and thanks for your quick reply.

Resolution multiplied by 0.5:

Vuo Composition Loader in fullscreen now performs the same as Quartz Composer in fullscreen (Vuo periodically drops a few frames, but it’s certainly improved, and maybe even better than a .QTZ running as a screensaver, which appears to drop in performance ever so slightly from the original Quartz Composer fullscreen window)
Vuo screensaver renders with improved performance over previous screensaver attempts, but only in a quarter of the screen with stretched pixels filling left/top/right (whoops…clearly not compatible)

Multiplied by 0.5, then scaled back up to the original screen size (Resize Image node):

Vuo screensaver renders full screen at somewhat improved performance, but still renders much slower than the Vuo Composition Loader, and is still dropping a noticeable number of frames. Very jerky.

So it helps a little, but not enough to make it usable as a screensaver. In addition, this doesn’t seem to explain (from my uneducated perspective) why in my initial testing the Vuo Composition Loader rendered more reliably when it was in a full screen window versus true full screen (I wouldn’t have expected the pixel space of a dock and menubar to cause any measurable performance difference). And since Chrome appears to render the exact same code without issues in true full screen (not sure if it’s retina!)…this is…curious.

Scratchpole · September 17, 2019, 5:54pm

I think the jerky frame rate would be solved if you fed the composition a constant time/frame rate.
When I run the shadertoys I’ve copied into Freeframes for Resolume they play jerkily until I cap the frame rate.

iaian7 · September 18, 2019, 1:31am

Thanks! I’ll give it a try. Up to this point I’ve been using “Fire on Display Refresh” as recommended in the “Make Image with Shadertoy” node documentation.

Test 1 - try “time” first

I figured I’d go ahead and test all the timing options, starting with the default “time” input from the Image Generator protocol. When playing in the Vuo Composition Loader in full screen mode, it performs about the same as fire on display refresh. Some problematic frame dropping in fullscreen, not as bad in a windowed view.

However, using the “time” input and exporting it as a screensaver boosts performance when compared to display refresh (which practically dies when used in a screensaver), up to the same levels as the Vuo Composition Loader in full screen mode (albeit still not reaching Quartz performance levels). It appears “Fire on Display Refresh” really isn’t happy when embedded as a screensaver.

And that’s all rendering at retina resolution (sorta; it’s set to 1.5x since I’m on a Mac Pro with dual 24" 4k displays). Curiously, using “time” in conjunction with the half-resolution-then-scale-back-up technique described in a previous post actually kills screensaver performance again. Completely unusable frame rate, super jerky.

At which point I realised this was going to require documentation with a lot more detail…

Test 2 - test everything over again from scratch

Of note: performance in the Vuo Composition Loader was tested both in a window scaled to fill the entire screen (minus the dock/menubar), and in fullscreen mode, which uses the newer MacOS fullscreen setup (where it takes over both monitors and leaves one of them blank, very frustrating). Performance is consistently better in windowed mode. Quartz Composer suffers no such performance hit: windowed mode and fullscreen mode are identical. However, it’s not the “new” fullscreen: it simply fills one monitor, and the second monitor is fully usable (oh how I long for the old days of MacOS supporting dual monitors nicely without using them as separate spaces, sigh).

Quartz: perfect performance in all situations
Display Refresh: not compatible with screensavers
Time: much better, not perfect
Periodically 0.0166… (~60fps): worse again
Periodically 0.0333… (~30fps): minimal improvement

…and also…

Half-res (tested individually with each of the above except for Quartz): seems to make no difference in screensaver mode

Also note: halving the resolution doesn’t actually match my monitor scaling factor, so it’s both significantly lower performance and noticeably lower quality than Quartz, which renders at a somewhat higher resolution (my monitors are set to 1.5x scale, so Quartz would be rendering at 66.6666% of full resolution instead of 50%).
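For reference, the scaling comparison above works out as follows. This is a rough sketch that assumes the percentages refer to linear resolution per axis; the helper names are made up for the example.

```python
def linear_fraction(display_scale):
    """Per-axis fraction of full resolution when rendering at 1x points."""
    return 1.0 / display_scale

def pixel_fraction(linear):
    """Fraction of the full pixel count for a given per-axis fraction."""
    return linear * linear

quartz_linear = linear_fraction(1.5)   # point-resolution rendering on a 1.5x display
half_linear = 0.5                      # the divide-by-2 workaround

quartz_pixels = pixel_fraction(quartz_linear)
half_pixels = pixel_fraction(half_linear)
print(f"{quartz_linear:.4f} linear -> {quartz_pixels:.4f} of the pixels")
print(f"{half_linear:.4f} linear -> {half_pixels:.4f} of the pixels")
```

Under that assumption the half-res setup fills about a quarter of the full pixel count, while a point-resolution render on a 1.5x display fills a bit under half of it.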

Attached you’ll find the spreadsheet of results. The percentages are simply my subjective evaluation of performance, with anything under 95% being (in my opinion) unreliable, and anything under 90% being unusable (noticeably jerky or low framerate). Unfortunately, limiting the frame rate by using a “Fire Periodically” node doesn’t appear to improve performance at all, and in some cases, actually makes it worse. Is there another, more correct way to limit frame rate?

And just for reference, here’s the node graph in Vuo:

Yellow = GLSL shader (almost exactly the same as what I’m using in Quartz, but with none of the exposed controls)
Green = Timing (this input is what I was testing with different firing options, not pictured here) (the add node combines the time input with a random number so each monitor displays a different pattern…I didn’t have the Allow First Event hooked up to the Make Random Value node, so it wasn’t working for my initial tests and this screenshot, whoops! Fixed now, working as expected)
Blue = Image loading (four different 4k textures, same ones I’m using in Quartz)
Orange = Resolution scaling (hooked up to a boolean so I can toggle it on and off for testing…based on the results above, it really needs to stay off for anything used as a screensaver!)

Scratchpole · September 18, 2019, 10:14am

I think it’s a matter of finding the happy medium between resolution and fps. You could also try lowering the resolution of the input images.
I’m surprised to hear qtz is running it smoothly.

Fire periodically is what I had expected to work.

I have noticed a big drop in performance with multi-pass shaders…reducing to really low res, sometimes lower than 1280x720 and 25fps to make some function smoothly on my crappy mbp.

iaian7 · September 18, 2019, 4:27pm

Not only is Quartz running this smoothly, so is Shadertoy via Google Chrome (100% performance in fullscreen). Using a .qtz as a screensaver, I have this same GLSL code running as a background element along with more complex animation layers in the office lobby, pulling settings and text content from a web server running on the same system. And that’s all running on a stock 2011 Mac Mini; though there are a few frame drops periodically, it runs surprisingly well, and has for many months…after years of reliably running other versions of Quartz Composer screensavers.

Based on the poor performance I’m getting with Vuo on a newer Mac Pro with far higher end GPUs, I’m hoping this is a beta issue that will be resolved by the time it’s released. That’s why I’m bringing it up now! There’s no “happy medium” if Vuo is a trade down from a more reliable and performant (though deprecated) system like Quartz Composer.

Bodysoulspirit · September 23, 2019, 1:43pm

My 2 cents on a non-technical level:

I usually see some frame drops going fullscreen in Vuo too, compared to non-fullscreen even with a window the size of the screen.
Funnily enough usually checking the port events it does NOT register frame drops, but you can still see some chops.
But I did blame old 2011 hardware, or maybe the small amount of extra pixels rendered instead of the window bar.
There are also 2 fullscreen modes in Vuo if I’m right, cmd-f in the composition goes Vuo fullscreen, and the regular macOS fullscreen (green button).

Regarding timing, I guess the time published port instead of display refresh is the way to go for best results.
And instead of Fire Periodically use Allow Periodic Events.
Not that it will make a difference I guess, but it allows the events to be synced with the main time.

MartinusMagneson · September 23, 2019, 1:45pm

A couple of things come to mind. First, it seems like you’re always creating the full resolution image. In the case of the low-res selection, you only see the lower res one, but in the comp you also create the full-res one anyway. This is one of the major differences between QC and Vuo. In QC you wouldn’t have created the one that wasn’t selected, as it pulls the values to the output. Vuo, however, pushes all calculations from the inputs through to the conclusion, which enables background calculations, but at the price of having to be a bit more careful about the values and blocking events.
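The push-versus-pull point can be illustrated with a toy model in Python. This is a simplified sketch, not how Vuo or Quartz Composer actually schedule work; `count_renders` and the two graph functions are hypothetical names for the example.

```python
def count_renders(graph, frames, use_low):
    """Run a toy graph for some frames and count renders per branch."""
    calls = {"full": 0, "low": 0}

    def render(kind):
        # Stand-in for an expensive image render; it only counts calls.
        calls[kind] += 1
        return f"{kind}-res image"

    for _ in range(frames):
        graph(render, use_low)
    return calls

def select_after_rendering(render, use_low):
    # Push-style graph with the selection at the END of the chain:
    # the event reaches both branches, so both render every frame.
    full = render("full")
    low = render("low")
    return low if use_low else full

def route_before_rendering(render, use_low):
    # Routing the event at the START of the chain: only the chosen
    # branch receives the event, so only that branch renders.
    return render("low") if use_low else render("full")

eager = count_renders(select_after_rendering, frames=60, use_low=True)
routed = count_renders(route_before_rendering, frames=60, use_low=True)
print(eager, routed)
```

In the first graph the unselected branch still does all of its work each frame; in the second it does none, which is the behaviour a pull-based system gives for free.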

I also wonder how you deal with the wrap mode in QC? I haven’t used this (Vuo) node myself, but if I understand it correctly, in this instance it would check four 4k frames/textures at every frame cycle. That is a pretty hefty operation, especially for older Mac hardware. I would try removing those nodes to check performance.

Bodysoulspirit · September 23, 2019, 2:03pm

True what Martinus says, your upscaled version is also being rendered even if you select low res.
Add a Select Output to the node chain to prevent that.

iaian7 · September 26, 2019, 1:14am

Thanks for the notes and help! I find myself continually forgetting that Vuo pushes updates instead of pulling them. Ok, a few things to cover now…

Glad I’m not the only one seeing performance drops when switching to fullscreen. I’d been using the native MacOS mode, totally missing that cmd-F uses the old style fullscreen (so much more useful!). Either way, I’m still seeing the same performance drop in both modes. I wouldn’t blame it on old hardware if nearly-fullscreen windows are fine, right? There’s something weird going on with how the full screen view is being rendered? I have a somewhat newer and much more powerful machine, and I’m seeing the same issues.
The Change Wrap Mode nodes shouldn’t be modifying the image pixels, they should be modifying the metadata regarding how the images are rendered in a GLSL context, defining how the textures behave at UV borders (<0 and >1). Additionally, they’re set up to process only once when the image is loaded (Allow Once node feeds into the image loading), so even if they were processing pixels, it’d only happen once. In Quartz Composer the wrap mode of any image is automatically set to repeat and I didn’t need to customise it (I can’t remember if that’s an option in the QC mipmapping node? My memory is failing and I’m not sure you can change it to anything but repeat). Unity also defaults to repeat, as does Shadertoy. Vuo defaults to Clamp Edge, so the setting has to be customised to work as expected (the animation I’m working on right now animates the textures in an infinite UV scroll, thus the need for repeat).
I’ve implemented the Select Output node, even for the math operation just in case (seems silly, but hey, gotta try!). Checking the Show Events feature confirms that Resize Image is not being processed when LowRes is toggled off, and the resolutions being fed through the graph all check out as expected. Yay, fixed! But…
I’ve gone through performance testing after fixing my mistake on the resize image setup, and strangely…no change. I really thought that’d help. But I guess it makes some sense: I was having major performance issues before I created my testing setup, which is when I introduced the scaling selection error. I’m not seeing any improvements to the ratings I posted before. I’m still getting the same poor performance (low frame rate, dropped frames) when running in fullscreen or as a screensaver.
I’ve tested Allow Periodic Events versus Fire Periodically, and there doesn’t seem to be any difference. They both result in significantly worse performance than simply using Time.
I ended up using the mouse position as a hacky way to get some sort of data input, which allows me to correctly adjust the UV map scale within the GLSL depending on the high/low resolution toggle (something that’s specific to this animation). So that’s why I have a +1/-1 Select Input node.
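To make the wrap mode point concrete, here is a minimal Python sketch of how a single UV coordinate behaves outside the 0 to 1 range under "repeat" versus "clamp to edge". It is an approximation for illustration only: real GPUs clamp to the centre of the edge texel rather than exactly 0 and 1, and the function names are made up here.

```python
import math

def wrap_repeat(u):
    """Repeat: keep only the fractional part, so the texture tiles."""
    return u - math.floor(u)

def wrap_clamp_edge(u):
    """Clamp to edge (approximate): out-of-range values stick to the border."""
    return min(max(u, 0.0), 1.0)

# A UV scroll that runs past both borders.
for u in (-0.25, 0.5, 1.25):
    print(u, wrap_repeat(u), wrap_clamp_edge(u))
```

With repeat, an ever-increasing UV offset keeps sampling valid texture content; with clamp, everything past the border smears the edge pixels, which is why an infinite UV scroll needs repeat.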

jstrecker · September 27, 2019, 2:37am

@iaian7 — Thank you for sharing your test results.

Another thing you could try is to go to System Preferences > Mission Control and toggle the “Displays have separate Spaces” setting, then log out and log back in, and see how the performance compares.

To get a more objective measure of performance, you could add a visualization of skipped frames as in the attached composition.
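As a rough idea of what "skipped frames" means numerically, the Python sketch below counts them from a list of frame timestamps. It is not a description of how the attached composition works; the `count_skipped_frames` helper, the 60 Hz default, and the sample timestamps are all assumptions for illustration.

```python
def count_skipped_frames(timestamps, refresh_hz=60.0):
    """Count display refreshes that passed without a new frame."""
    interval = 1.0 / refresh_hz
    skipped = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        # Number of refresh intervals this frame stayed on screen.
        spans = round((cur - prev) / interval)
        skipped += max(0, spans - 1)
    return skipped

# Hypothetical capture: frames shown at refresh ticks 0, 1, 2, 4, 5, 8,
# so tick 3 and ticks 6 and 7 were missed.
frame_ticks = [0, 1, 2, 4, 5, 8]
timestamps = [tick / 60.0 for tick in frame_ticks]
skipped = count_skipped_frames(timestamps)
print(skipped)
```

A count like this gives a number that can be compared across windowed, fullscreen, and screensaver runs instead of a subjective percentage.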

One thing we’re thinking of changing based on your comments so far — As you observed, currently you need a Resize Image to scale the image back up to the screensaver’s expected size. We could change the wrap mode on the output image from “clamp” to “stretch” so you wouldn’t need the Resize Image (and extra processing that it incurs). Assuming that doesn’t break anything else.

ScreensaverPerformanceTest1.vuo (3.64 KB)

iaian7 · October 3, 2019, 1:19am

Jaymie — Thank you for checking out the tests, I appreciate the follow up.

On my main desktop I always run with “separate spaces” off, so I tested again with it on and saw no change in performance, except potentially fullscreen mode, which may have performed slightly better (still not great), and it used the “native” MacOS fullscreen method for both the expand button and command-f (makes sense, separate spaces means it doesn’t have to use the old style fullscreen in order to keep the other screen usable). Screensaver performance with separate spaces turned on was same/worse.

I’ve also tested the two final Vuo screensavers (full-res, half-res) on two laptops (retina with Mojave, non-retina with High Sierra). Neither was able to render either of the screensavers at an acceptable frame rate. Meanwhile, that same ancient non-retina High Sierra MBP can run significantly more complex Quartz screensavers at a flawless 60fps…so in theory (big caveat, haha) it’s not a computer performance issue, nor is it entirely a retina resolution issue…seeing as the non-retina laptop can run the same code (and more complex variations) in Quartz without a hiccup. Of note, I haven’t tested Vuo extensively on the laptops outside of the screensaver test; I need to check the app preview performance too and see how that compares.

Today for yet another test I ported everything over to an ISF generator (yay! I get to use input variables again!), and got the same poor performance I had with the ShaderToy node. On the upside, outside of the easily-broken header information, ISF was a lot more pleasant to work with (I could use Sublime Text to edit the module and Vuo + Vuo Preview window would update every time I saved, nice!). Though unfortunately not helpful for actual production work (I can’t store hundreds of modules across many different projects in a user library, especially when they have to be saved alongside the project for versioning and archival).

Thanks for the skipped frames test, that’s great! That said, even testing it alone without any additional graphics, I can’t get it to render smoothly as a screensaver. Completely erratic performance. If I get the time this week, I’ll try adding a skipped frames test to the Quartz compositions, then render tests across as many computer setups as I can (but I wouldn’t hold my breath…one of my teams is in the middle of a harrowing delivery week on a major mixed reality production!).

Magneson · October 3, 2019, 5:01pm

Can I ask which systems you run this on spec-wise? Do you also have a screenshot of the QC patch?

jstrecker · November 26, 2019, 3:45am

@iaian7 — I’m finally coming back to this… I set up a test on my computer (attached). Using Vuo 2.0.0-beta3, I ran the composition in fullscreen and as a screensaver for a couple minutes each. Not counting the first moments while the composition is getting started, I saw 0 frame drops in fullscreen and 2 in screensaver (a small enough difference that it might just be due to chance; unclear).

Have you learned anything further in your own investigations? Would it be possible for you to post your shader? And, as @MartinusMagneson suggested, could you post the system specs (macOS version and graphics card)?

ShaderScreensaverTest.zip (6.36 KB)

iaian7 · March 4, 2020, 12:13am

Thanks everyone! I know it’s been forever, and Vuo 2.0 is out of beta now…it’s been a busy half year and I simply wasn’t able to circle back till now.

System specs should be irrelevant when comparing relative performance, right? For example, my old 2010 MacBook Pro was able to render the same GLSL code in Quartz with significantly better performance than it was able to while using the Vuo beta last year (regardless of screensaver or dedicated app, IIRC). The fact that it can barely edit 1080p video doesn’t really factor in (haha), I didn’t care if it couldn’t run higher end graphics…the same code was great in Quartz Composer but terrible in Vuo. Of course, if it’s a machine specific bug in Vuo, that’s another matter (granted, I had the same issues on every computer I tried!). Systems I was testing on last year:

Mid 2010 MacBook Pro, 2.66GHz Core i7, 8GB ram, NVIDIA GeForce GT 330M 512 MB
Late 2013 Mac Pro, 2.7GHz 12-core Xeon E5, 64GB ram, dual AMD FirePro D700 6GB
Mid 2015 MacBook Pro, 2.8GHz 4-core i7, 16GB ram, AMD Radeon R9 M370X 2GB
Late 2015 iMac, 4GHz 4-core i7, 32GB ram, AMD Radeon R9 M395X 4GB

…I believe they were running a mix of Mojave and High Sierra at the time. They’re upgraded to a mix of Mojave and Catalina now, not to mention there have been a number of Vuo updates since then! My hope is that the bugs have been ironed out by now, I just haven’t been able to double check myself.

I didn’t learn anything additional at the time. Vuo had major performance issues and was rendering very erratically, but predominantly when using the screensaver export. Dedicated apps weren’t as bad, and in fact, performance sufficed to build out an all-new experience that now runs my company’s lobby screen (Vuo’s 4k performance on a brand new Mac Mini isn’t amazing, but we’re using the frame interpolation built into the TV to smooth things out…that thing I typically disable immediately actually proved useful! haha). Hopefully I can post more about it soon! :)
The company branding has changed dramatically in the past 6 months, so I don’t think there’s any need to keep our old files proprietary. It was kinda silly, but I just wasn’t sure I could share them before. Sorry for the delay. :D

Triangles.qtz - this is the background-only version I used for a lot of testing (it’s still the screensaver I have on the 2010 laptop, runs great) and I’m not seeing the matching Vuo composition on this computer. Sigh. Only the ones that included the isolines, so…that’s partially helpful.
Isolines.qtz - original screensaver I was trying to port into Vuo (this probably isn’t the one I was using for performance testing, which may have been more stripped down)
Isolines-ShaderToy.vuo - this should be pretty close to the qtz file above, using ShaderToy (all of the settings had to be baked internally, but it worked)
Isolines-ISF.vuo - this was the second port I made, using ISF (hey! it supports settings again! yay!)
Isolines-ISFsimple.vuo - lines-only version, which I naturally don’t have a matching quartz composition for, but hey, why not include it!
I’ve included the ISF shader files in the “Modules” folder, and the images required for running the Vuo compositions are in the folder root.

I’ve also made my ShaderToy pages public…

Isolines - Shader - Shadertoy BETA (you have to load the custom images using the browser console)
Triangles - Shader - Shadertoy BETA

GLSLtests.zip (4.5 MB)

iaian7 · March 4, 2020, 1:16am

Here’s a slightly modified version of the Vuo composition posted last year with an additional vertical offset and a periodic reset. I recreated it in Quartz as well, but had to test it in fullscreen mode instead of screensaver mode (none of the computers here at work are still on a MacOS version that supports .qtz screensavers). After 10 years of using Quartz Composer, I’ve never seen a performance difference in .qtz files between running in full screen from the editor and running in screensaver mode…but while I wish this could suffice as a usable test, Quartz is not rendering at retina resolution, so it’s very much not a direct comparison. The only real result is that the final release of Vuo 2.0.0 (build 11210) still does not solve the screensaver performance issues. I would hope that a composition this simple would render without major frame rate issues…no luck.

Quartz Composer editor fullscreen: solid frame rate, occasional dropped frames
Vuo screensaver on monitor 1: inconsistent frame rate (I could only use Screenshot.app’s 10 second delay, so results are limited to ~9 seconds)
Vuo screensaver on monitor 2: significantly worse with inconsistent frame rate and constantly dropped frames (I’m running two monitors as a combined space, not separate…but previous testing seemed to indicate that setting made no difference)

Tested on a late 2013 Mac Pro (2.7GHz 12-core Xeon E5, 64GB ram, dual AMD FirePro D700 6GB GPUs) running MacOS Mojave 10.14.6.

ScreensaverPerformanceTest.vuo (4.84 KB)

ScreensaverPerformanceTest.saver_.zip (2.75 MB)

ScreensaverPerformanceTest.qtz_.zip (10.5 KB)

iaian7 · March 8, 2020, 3:37am

Obviously an old laptop screen (non-retina) is going to be a lot easier to render than dual 4k monitors at 1.5x scale…but I’m still surprised at how much better an antiquated laptop runs this performance test than the Mac Pro posted above. Quartz performance appears identical (periodic dropped frames), and Vuo (as a screensaver) appears to run much less poorly (very rough startup, and still a little inconsistent, but not nearly as bad).

I ran through the tests both as screensavers (had to take the shots with my phone; delayed screen captures weren’t working on this machine) and from the editors (in fullscreen mode; I waited for a full loop reset to avoid window resizing performance issues and overlays showing up). Even with such a simple render test, Vuo screensaver performance is measurably worse than Vuo as a windowed app.

Tested on a Mid 2010 MacBook Pro (2.66GHz Core i7, 8GB ram, NVIDIA GeForce GT 330M 512 MB) running MacOS High Sierra 10.13.6.

jstrecker · March 27, 2020, 7:05pm

@iaian7 — Wow, thank you for sharing all of this information to help clarify the problem.

Why do you see lower performance (more skipped frames) when running a Vuo screen saver compared to running the same composition fullscreen from the Vuo editor?

I still haven’t been able to reproduce this issue myself, which is perhaps an interesting data point. (MacBook Pro, Mid 2012 or Early 2013, NVIDIA GeForce GT 650M, macOS Sierra.)

I discussed with the team and we can’t think of anything within Vuo that could cause a performance difference between running as a screen saver and running from the editor.

Could the OS be doing something that affects performance? For example: Do screensaver processes run at a lower priority than regular composition processes? Are they throttled in some way to limit the amount of resources they consume? Is there some other process that runs only when the screensaver is active or when the system is idle, that is using up resources? I don’t know.

Why do you see this difference in Vuo but not in Quartz Composer?

This is comparing apples and oranges. Quartz Composer’s graphics system works very differently from Vuo’s. In addition to using GPU and CPU in different proportions, Vuo exercises GPU functionality that Quartz Composer does not. Vuo supports Retina resolution and multisampling; Quartz Composer does not.

What’s the next step?

In the somewhat near future — after NDI nodes but before Windows support — we’re planning to overhaul Vuo’s graphics system (adding support for Metal since that is what Apple has moved to). That will probably fix a number of performance issues (and introduce some new ones to debug). Since we don’t have any leads on why you’re getting lower performance with screen savers, we’ll see if the Metal upgrade fixes it.