Pro Apps on the Nehalem Mac Pro:
How many cores do you need?
How much memory is best?
Originally posted March 12th, 2009, by rob-ART morgan, mad scientist
"Reconstituted" on March 21st, 2009, with Compressor and Photoshop results.
March 22nd, 2009, updated After Effects results.
March 25th, 2009, came up with more revealing Compressor test.
We've been testing some core-intensive and memory-intensive apps to see how well they run on the new 2009 Nehalem-based Mac Pro. Take a look at the graphs below and then we'll discuss them.
2.93x8 = 'early 2009' Mac Pro 2.93GHz 8-core
2.26x8 = 'early 2009' Mac Pro 2.26GHz 8-core
2.93x4 = 'early 2009' Mac Pro 2.93GHz 4-core
2.66x4 = 'early 2009' Mac Pro 2.66GHz 4-core
3.2x8 = 'early 2008' Mac Pro 3.20GHz 8-core
2.8x8 = 'early 2008' Mac Pro 2.80GHz 8-core
3.1x2 = 'early 2009' iMac 3.06GHz Core 2 Duo
Memory configurations vary -- (4G), (6G), (12G), (16G)
In Compressor Graph, instances vary (2i), (8i), (16i)
Tech Specs on all models of Mac Pro
Special thanks to Other World Computing and some Remote Mad Scientists (RMS) for helping produce some of the test results in their test lab.
Many of you are trying to decide between the 2009 Nehalem-based Mac Pro and the 2008 "closeout" Mac Pros. In the three tests above, the slowest 8-core 2009 Mac Pro (2.26GHz) is faster than the fastest 8-core 2008 Mac Pro (3.2GHz). On the other hand, the 2008 8-cores are faster than the 2009 4-cores and are selling at big discounts.
For example, a 2.8GHz 8-core Mac Pro is selling for $2451 at PowerMax. That's a tempting alternative to the 2.66GHz 4-core 'Nehalem' Mac Pro based on the After Effects results. You can add a Radeon HD 4870 and 16G of RAM for about $3K total. Spend the same amount on the Nehalem and you only have 4 cores and 8GB of RAM.
It's a shame Apple didn't make the 4-core Nehalem with eight memory slots.
I included the 2009 iMac 3.06GHz because some of you indicated you are trying to decide between a 4-core Mac Pro and a 2-core iMac. I like the elegant iMac. It's quiet and capable, but it's not in the same league as the Mac Pro when running high-end Pro Apps.
Adobe After Effects CS4 (and CS3) spawns subprocesses -- one for each core -- when 'multiprocessing' is enabled in Preferences. Each of those subprocesses can grab up to 3GB of RAM. We typically use the Total Benchmark created by Brian Maffitt to benchmark with After Effects. If you run Activity Monitor during the rendering, you can not only observe the multiple subprocesses but you can observe how much real memory each of them is appropriating as the second phase of the project render progresses toward completion. When we had 16G installed, we observed 13GB of real memory in use.
As a general rule, the more cores you have and the more memory you have, the better After Effects can "breathe."
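To put rough numbers on that rule, here is a minimal sketch of the memory arithmetic. The 3GB cap per subprocess comes from the paragraph above; the 2GB set aside for OS X and other apps is purely an assumed figure, and the function names are made up for illustration.

```python
# Rough memory budget for After Effects multiprocessing renders.
# From the article: one subprocess per core, each able to grab up to 3 GB.
# The 2 GB reserve for OS X and other apps is an ASSUMPTION, not a measured value.

PER_PROCESS_CAP_GB = 3.0   # from the article
OS_RESERVE_GB = 2.0        # assumed figure


def worst_case_demand_gb(cores: int) -> float:
    """Upper bound on the RAM the render subprocesses could claim."""
    return cores * PER_PROCESS_CAP_GB


def ram_per_process_gb(installed_gb: float, cores: int) -> float:
    """RAM each subprocess gets if the usable memory is shared evenly."""
    usable = max(installed_gb - OS_RESERVE_GB, 0.0)
    return min(usable / cores, PER_PROCESS_CAP_GB)


if __name__ == "__main__":
    for cores, installed in [(4, 8), (8, 12), (8, 16)]:
        print(
            f"{cores} cores, {installed} GB installed: "
            f"worst case {worst_case_demand_gb(cores):.0f} GB, "
            f"about {ram_per_process_gb(installed, cores):.2f} GB per subprocess"
        )
```

Under these assumptions, 8 subprocesses could ask for up to 24GB in the worst case, which is why even 16GB does not leave every subprocess with its full 3GB.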
The same phenomenon is observed with Compressor (which is part of Final Cut Studio). Apple Qmaster can create a Quick Cluster with as many instances as you have cores (or virtual cores). Each instance is a subprocess that can grab up to 3GB, though we've never observed it using as much memory as After Effects. On the other hand, using all available instances doesn't always produce faster render times. You should experiment with various instance settings to find the sweet spot for the project you are encoding.
As you see in the graph above, we indicate the number of instances used in each case (e.g. 2i = 2 instances). Using the "Wildlife Reel" from the QuickTime HD Gallery, we encoded an "HD DVD: H.264 60min" preset. Notice that the Nehalem 2.93 rendered fastest with 6 instances. Beyond that, it began to slow.
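If you want to hunt for that sweet spot methodically, a small timing harness helps. The sketch below is only an illustration: run_encode is a hypothetical callable that you would write yourself to kick off an encode with a given number of instances, and fake_encode is a stand-in workload, not Compressor or Qmaster.

```python
import time
from typing import Callable, Dict, Iterable


def find_sweet_spot(run_encode: Callable[[int], None],
                    instance_counts: Iterable[int]) -> Dict[int, float]:
    """Time run_encode(n) for each instance count; return seconds per count."""
    timings = {}
    for n in instance_counts:
        start = time.perf_counter()
        run_encode(n)
        timings[n] = time.perf_counter() - start
    return timings


def fastest(timings: Dict[int, float]) -> int:
    """Instance count with the shortest elapsed time."""
    return min(timings, key=timings.get)


def fake_encode(instances: int) -> None:
    """Stand-in workload (NOT Compressor): built to be quickest at 6 instances."""
    time.sleep(0.01 + 0.002 * abs(instances - 6))


if __name__ == "__main__":
    results = find_sweet_spot(fake_encode, [2, 4, 6, 8, 16])
    for n, seconds in sorted(results.items()):
        print(f"{n:>2} instances: {seconds:.3f} s")
    print("fastest:", fastest(results), "instances")
```

Swap fake_encode for your own function and run each setting more than once; a single pass can be thrown off by disk caching or background activity.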
Only certain functions/filters in Photoshop use multiple cores. A few examples of "MP aware" functions include Rotate, Gaussian Blur, Lighting Effects, Lens Flare, Pointillize, and Sharpen Edges.
As for memory usage, though you can only specify up to 3GB of memory cache in the Performance Preference panel, Mac OS X is clever enough to use otherwise idle memory as a virtual scratch volume before handing the task off to the actual scratch disk. If you are editing RAW photos with lots of layers and lots of history states, filling all 8 memory slots in the 8-core Nehalem at dual-channel speeds can be better than running 6 sticks at triple-channel speeds. That's because slower memory transfers are better than really slow hard disk hits.
In the graph above, we used the action file created by Lloyd Chambers of DigLloyd.com and documented in his Mac Performance Guide. We have concluded that it more closely reflects the typical workflow of a photographer than our old benchmark.
Motion benefits from the same memory caching as Photoshop, courtesy of OS X Leopard. Just try doing a RAM Preview -> Play Range on the 900 frame Blocks-Detail.HD template and use Activity Monitor to observe how much RAM is used. In fact, unless you have at least 8GB of RAM, it won't render all 900 frames in RAM.
MEMORY RIDDLE: WHEN IS SIX MORE THAN EIGHT?
We were able to clearly illustrate the bandwidth advantage of three memory modules per memory bank in the Nehalem Mac Pro using the DigLloydTools (DLT) stress test, which does a memmove() across all unused physical memory. We installed 12GB (6 x 2G) first and ran the test, then installed 16GB (8 x 2G) and ran it again.
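For a feel of what a memory-copy test measures, here is a much smaller sketch of the same idea. It is not the DLT stress test: it copies one modest buffer from a single thread, so it will not saturate every memory channel, and the function name and default sizes are our own choices.

```python
import time


def copy_bandwidth_gb_s(size_mb: int = 256, passes: int = 5) -> float:
    """Copy a size_mb buffer `passes` times; return the best GB/s observed."""
    size = size_mb * 1024 * 1024
    src_view = memoryview(bytearray(size))
    dst_view = memoryview(bytearray(size))
    best = 0.0
    for _ in range(passes):
        start = time.perf_counter()
        dst_view[:] = src_view            # one bulk copy of the whole buffer
        elapsed = time.perf_counter() - start
        if elapsed > 0:
            best = max(best, size / elapsed / 1e9)
    return best


if __name__ == "__main__":
    print(f"best copy rate: {copy_bandwidth_gb_s():.2f} GB/s")
```

Run it with 6 modules installed, then with 8, and compare the numbers. Treat the result as a relative indicator only, since a single-threaded copy of this size understates what the memory system can deliver.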
If you have memory installed in all 8 slots of your 8-core Nehalem (or all 4 slots of a 4-core Nehalem), it may not penalize your real world application performance. The vast majority of real world applications do not saturate the memory bandwidth. Plus, it's better to drop from triple-channel to dual-channel performance than to run out of memory and start doing virtual memory disk swaps.
DO YOU NEED AN 8-CORE MAC PRO or will a 4-Core with lotz of RAM do the job?
Apple specifies 8GB as the limit on the 4-core Mac Pro Nehalem. Even if third parties figure out that 4GB modules work in the 4-core system with only 4 memory slots, it will cost you at least $2300 as of this writing to reach 16GB. With the 8-core (8 memory slots), you can buy 16G (8x2G) of memory for $300. The gap in cost wipes out the cost penalty for the extra cores. And trust me, if you are trying to run pro apps with only 8GB of RAM, you are handicapping yourself.
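The cost argument above is simple arithmetic, sketched below. The memory prices ($2300 and $300) come from the paragraph above. The base prices are an assumption on our part, not figures from this article: we plugged in $2499 for the 2.66GHz 4-core and $3299 for the 2.26GHz 8-core, so substitute whatever you are actually quoted.

```python
# Memory prices are from the article (March 2009).
# Base prices are ASSUMED list prices, not taken from the article.

FOUR_CORE_BASE = 2499      # assumed price, 2.66GHz 4-core
EIGHT_CORE_BASE = 3299     # assumed price, 2.26GHz 8-core
FOUR_CORE_16GB = 2300      # from the article: 4 x 4G modules
EIGHT_CORE_16GB = 300      # from the article: 8 x 2G modules


def total_cost(base_price: int, memory_price: int) -> int:
    """System price plus the cost of reaching 16GB."""
    return base_price + memory_price


if __name__ == "__main__":
    four = total_cost(FOUR_CORE_BASE, FOUR_CORE_16GB)
    eight = total_cost(EIGHT_CORE_BASE, EIGHT_CORE_16GB)
    print("memory gap:", FOUR_CORE_16GB - EIGHT_CORE_16GB)     # 2000
    print("core premium:", EIGHT_CORE_BASE - FOUR_CORE_BASE)   # 800
    print("4-core with 16GB:", four)                           # 4799
    print("8-core with 16GB:", eight)                          # 3599
```

With those assumed base prices, the $2000 memory gap more than covers the $800 premium for the extra cores.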
If you don't use 'heavy duty' pro apps or if the 'meanest' thing you run is 3D games, then a 4-core will be fully adequate.
DO YOUR OWN TESTING (D.Y.O.T.)
If you want to run our After Effects benchmark on your Mac, you can download a fully working 30-day TRIAL copy of Adobe After Effects CS4 and the Total Benchmark from Media-Motion. If you need instructions or want to share your results, email us. Ditto for Photoshop CS4, which is also available in 30-day TRIAL form, and the diglloydSpeed benchmark.
WHERE TO BUY APPLE PRODUCTS
When you purchase Apple USA products, please CLICK THIS LINK or any APPLE BANNERS at the top of our pages. It's a great way to support Bare Feats, since we earn a commission on each click-through that results in a sale.