Cubix Xpander or
'How to connect FIVE high-end GPUs
to your Mac Pro.'
Posted Tuesday, August 6th, 2013 by rob-ART morgan, mad scientist
The trend in certain pro apps is to use multiple GPUs to render projects. One of our frustrations with "super charging" the 2009 - 2012 Mac Pro to do that well is that it only has two 16 lane PCIe 2.0 slots. And if you put two "double wide" or "thick" GPUs in those slots, you are left with only one "thin" 4 lane slot.
Cubix has provided a way to add four double wide GPUs externally by using the Xpander Desktop. Thanks to Mac Distributors and Cubix we were able to connect a total of FIVE high-end "thick" GPUs to our 2010 Mac Pro Hex-core. Check out the difference it makes with four "multi-GPU aware" pro apps. (The GPUs were loaned to us by MacVidCards and Mac*Pro.)
Five GPUs = GeForce GTX 680 inside slot one of the Mac Pro; Dual GeForce GTX 580 Classifieds and Dual GeForce GTX 770s inside the Cubix Xpander Desktop connected to the Cubix Host Interface Adapter in slot two of the Mac Pro.
Four GPUs = GeForce GTX 680 inside slot one of the Mac Pro; One GeForce GTX 680 and Dual GeForce GTX 770s inside the Cubix Xpander Desktop connected to the Cubix Host Interface Adapter in slot two of the Mac Pro.
Dual G580C = two GeForce GTX 580 Classifieds
Dual G770 = two GeForce GTX 770s
Dual G680 = two GeForce GTX 680s
One G580C = one GeForce GTX 580 Classified
One G770 = one GeForce GTX 770
One G680 = one GeForce GTX 680
"Test Mule" was a 2010 Mac Pro 3.33GHz Hex-core.
OctaneRender is a "GPU only" standalone renderer that can process scenes created in and exported from Maya, ArchiCAD, Cinema 4D, etc. -- and does so in a fraction of the time it takes with a CPU based renderer. However, currently it only works with CUDA capable NVIDIA graphics cards. The DEMO comes with a scene called octane_benchmark.ocs. For our test we selected RenderTarget PT (Path Tracing). The render time is tracked and displayed in total seconds.
You must "tell" Octane to render with multiple GPUs by going to Preferences > CUDA devices and putting a check mark next to all GPUs (CUDA devices) you want it to use. We selected all installed GPUs and confirmed that all were used in the rendering of the Benchmark scene. (FASTEST = the LOWEST time in minutes to the nearest hundredth.)
LuxMark OpenCL Benchmark
This is a benchmark that works with all GPUs that support OpenCL. Furthermore, the latest version 2.1b2 supports multiple GPUs. We feature here the results from rendering the Room scene, which is extremely complex (2,000,000+ triangles) and is available only in the 64-bit executables. Using OpenGL Driver Monitor, we confirmed that all installed GPUs were utilized in rendering. (FASTEST = HIGHEST number in thousands of samples per second.)
DaVinci Resolve 9.1.4 adds speed and power to color grading of HD video. It uses the GPU to apply the specified effects and play them back in real time -- no pre-rendering required. However, the more effect nodes created, the slower the playback. The full version supports noise reduction, which can seriously slow down playback unless you have multiple GPUs at work.
Our graph features the Candle project with the Parrot video and various node presets. Though the target speed is 25 FPS, we set the maximum playback framerate to 500 fps to force fastest possible playback speed. Results are average frames per second. (FASTEST = HIGHEST frames per second.)
The first graph features 32 nodes of blur effect. Note that the Four and Five GPU setups meet or exceed the 25 FPS target speed. We confirmed that all installed GPUs were used to render the nodes on the fly.
The After Effects CC Ray-traced 3D project of an animated robot uses CUDA capable GPUs exclusively for rendering. After Effects CS6 and CC automatically use all NVIDIA GPUs to render the project -- assuming the model name of your GPU already exists in, or is added to, the AE Whitelist of "raytracer_supported_cards." (FASTEST = the LOWEST time in minutes to the nearest hundredth.)
The graph above looks odd because the DUAL 580 Classifieds actually beat the Four and Five GPU configurations. Part of this can be explained by the law of diminishing returns. When you go from two to three, four or five GPUs, they are not fully stressed by After Effects. It's also because the Dual 580 Classifieds are 'beasts.'
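To put a number on "diminishing returns," one common approach is to compare each multi-GPU time against the single-GPU time. The sketch below is our own illustration, not part of the test procedure: the function name `scaling_report` is hypothetical, the timings in `example_times` are placeholders rather than results from our graphs, and the math assumes every GPU in the set is the same model (which was not true of our Four and Five GPU setups).

```python
def scaling_report(times_by_gpu_count):
    """Return {gpu_count: (speedup, efficiency)} relative to the 1-GPU time.

    times_by_gpu_count maps a GPU count to a render time. Any time unit works
    as long as every entry uses the same one. Lower time = faster.
    """
    baseline = times_by_gpu_count[1]
    report = {}
    for count, elapsed in sorted(times_by_gpu_count.items()):
        speedup = baseline / elapsed   # how many times faster than one GPU
        efficiency = speedup / count   # 1.0 would be perfect linear scaling
        report[count] = (speedup, efficiency)
    return report


# Placeholder timings for illustration only, NOT measurements from this article.
example_times = {1: 10.0, 2: 5.5, 4: 4.0}

for count, (speedup, efficiency) in scaling_report(example_times).items():
    print(f"{count} GPU(s): {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

When efficiency drops sharply as GPUs are added, the app is not keeping the extra cards busy, which is what we suspect is happening with After Effects.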
We also learned that when using multiple GPUs with After Effects, the CUDA compute level must be the same on every card. When we had three GPUs at CUDA 3.0 level and two GPUs at CUDA 2.0 level, the two CUDA 2.0 GPUs were ignored by After Effects. (To determine your NVIDIA GPU's CUDA Compute Level, see NVIDIA Developer CUDA Zone.)
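If you are planning a mixed set of cards, it may help to check the compute levels before you buy. Here is a minimal sketch of that check. The helper names (`group_by_compute_level`, `has_mixed_levels`) are our own and hypothetical; the compute levels in the table are the values NVIDIA publishes for these three models (the 580 Classified is a GTX 580). The script only flags a mismatch. It does not try to predict which cards After Effects will ignore.

```python
from collections import defaultdict

# CUDA compute capability per card model, as published by NVIDIA.
COMPUTE_CAPABILITY = {
    "GeForce GTX 580": "2.0",
    "GeForce GTX 680": "3.0",
    "GeForce GTX 770": "3.0",
}


def group_by_compute_level(installed_models):
    """Group a list of card model names by CUDA compute capability."""
    groups = defaultdict(list)
    for model in installed_models:
        groups[COMPUTE_CAPABILITY[model]].append(model)
    return dict(groups)


def has_mixed_levels(installed_models):
    """True when the cards do not all share one compute capability."""
    return len(group_by_compute_level(installed_models)) > 1


# The Five GPU configuration described in this article.
FIVE_GPU_SETUP = [
    "GeForce GTX 680",
    "GeForce GTX 580", "GeForce GTX 580",
    "GeForce GTX 770", "GeForce GTX 770",
]

for level, cards in sorted(group_by_compute_level(FIVE_GPU_SETUP).items()):
    print(f"CUDA {level}: {len(cards)} card(s)")
if has_mixed_levels(FIVE_GPU_SETUP):
    print("Mixed compute levels: After Effects may ignore some cards.")
```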
NOTE: The AE Ray-traced 3D animation we refer to as "robot" was provided courtesy of Juan Salvo and Danny Princz. Today, Danny has posted a compilation of render times featuring up to three GPUs.
The Cubix Xpander provides an impressive boost in three out of four examples of GPU accelerated pro apps. It also gives a temporary advantage to the Mac Pro tower over the soon-to-be-released Mac Pro "tube." Why? Because there is no spare PCIe slot in the 2013 Mac Pro for the Host Interface Controller required by the Xpander. You are limited to two factory AMD FirePro GPUs that do NOT support Ray-traced 3D in After Effects or OctaneRender.
But notice I used the word 'temporary.' That's because I fully expect third parties will develop a product similar to the Xpander that connects via Thunderbolt 2. Will the Thunderbolt 2 bandwidth be as effective as the 16 lane bandwidth of the Xpander? Only testing will tell. (I also expect that the creators of After Effects and OctaneRender will add support for the AMD FirePros -- hopefully sooner rather than later.)
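For a rough sense of the gap on paper, here is a back-of-the-envelope comparison. It is a sketch using published nominal figures, not anything we measured: PCIe 2.0 signals at 5 GT/s per lane with 8b/10b encoding, and Thunderbolt 2 is rated at 20 Gbit/s per channel. The function names are our own, and protocol overhead is ignored, so real-world throughput will be lower on both sides.

```python
PCIE2_GIGATRANSFERS_PER_S = 5.0   # PCIe 2.0 raw signaling rate per lane
PCIE2_ENCODING = 8 / 10           # 8b/10b: 8 data bits per 10 bits on the wire
THUNDERBOLT2_GBIT_PER_S = 20.0    # nominal Thunderbolt 2 channel rate


def pcie2_gbytes_per_s(lanes):
    """Nominal one-direction PCIe 2.0 bandwidth in GB/s for a given lane count."""
    gbit_per_lane = PCIE2_GIGATRANSFERS_PER_S * PCIE2_ENCODING
    return gbit_per_lane * lanes / 8  # 8 bits per byte


def thunderbolt2_gbytes_per_s():
    """Nominal Thunderbolt 2 channel bandwidth in GB/s."""
    return THUNDERBOLT2_GBIT_PER_S / 8


slot = pcie2_gbytes_per_s(16)
tb2 = thunderbolt2_gbytes_per_s()
print(f"PCIe 2.0 x16:  {slot:.1f} GB/s per direction")
print(f"Thunderbolt 2: {tb2:.1f} GB/s")
print(f"Ratio:         {slot / tb2:.1f}x")
```

Whether that paper difference matters depends on how much data each app moves across the bus while rendering, which is why only testing will tell.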
No drivers or patches were required to use the Xpander. It was truly plug and play using an industry standard Host Interface Controller (HIC) to connect the Xpander to the Mac Pro's second 16 lane PCIe 2.0 slot.
As you may have noticed, our testing included two power hungry GTX 580 Classifieds, which require three power feeds each. Because our Xpander only provided two power leads for each GPU, we used an external power supply to feed the third connector on each card. (You can custom order an Xpander with a more powerful power supply with three power leads for each GPU.) The 580 Classifieds are also taller than industry-standard GPUs, so leaving the cover off our Xpander was necessary. (When you order an Xpander, be sure to mention that you are using the 580 Classifieds. Cubix will gladly make a custom cover to make room for the extra height.)
The price of the four bay Xpander Desktop Elite is about $3000 -- without any GPUs. So if you don't already own four high-end GeForce GPUs, your total investment with the purchase of the four GPUs will be more like $5500. Order it with four Quadro K5000s and you are up to around $11,000.
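Those totals imply a rough per-card budget. The sketch below simply backs it out from the round numbers above; the function name is hypothetical, and the results are derived estimates rather than quoted prices.

```python
def implied_price_per_gpu(total_cost, enclosure_cost, gpu_count=4):
    """Back out the average per-GPU cost from a total and the enclosure price."""
    return (total_cost - enclosure_cost) / gpu_count


XPANDER_PRICE = 3000  # approximate Xpander Desktop Elite price, without GPUs

geforce = implied_price_per_gpu(5500, XPANDER_PRICE)
quadro = implied_price_per_gpu(11000, XPANDER_PRICE)

print(f"High-end GeForce: about ${geforce:,.0f} per card")
print(f"Quadro K5000:     about ${quadro:,.0f} per card")
```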
The Cubix Xpander is a cost-is-no-object solution for pros looking to significantly reduce GPU rendering time for apps like OctaneRender and DaVinci Resolve.
Comments? Suggestions? Email rob-ART morgan, mad scientist.
Follow me on Twitter @barefeats
WHERE TO BUY THE CUBIX XPANDER
The model we tested was the Xpander Desktop 4. The current models are called Xpander Desktop Elite. The main difference is the replacement of the LEDs on the front panel with an LCD display. It also offers a variable speed, temperature sensitive cooling fan. You can also order an Xpander Rackmount as well as an Xpander for Laptops.
Though we used the Xpander exclusively for GPU testing, it can also be used for other kinds of PCIe adapters.
WHERE TO BUY CUDA CAPABLE NVIDIA GPUs for your MAC PRO