This week SGI published three new world records with the Standard Performance Evaluation Corporation (SPEC) for SPECjbb2005, SPECfp_rate_base2006 and SPECint_rate_base2006. The SGI benchmarking team achieved these results at the Leibniz Supercomputing Centre (LRZ) in Garching, Germany, on an Altix 4700 with 1024 Itanium 9040 cores (1.6GHz, 18MB cache) running SLES10 with ProPack. The results reconfirm that the SGI Altix 4700 is the most scalable platform suitable for application fusion, as shown by the new world records in this series of SPEC benchmarks:
SPECjbb2005 evaluates the performance of servers running typical Java business applications for Internet, finance, enterprise and database workloads by emulating a three-tier client/server system, with emphasis on the middle tier. The benchmark exercises the Java Virtual Machine (JVM), the operating system, and the performance of CPUs, caches, the memory hierarchy and the scalability of the shared-memory system. SGI raised the bar again and recaptured leadership in this benchmark with 9,611,262 Business Operations per Second (BOPS) on an Altix 4700 with 512 Itanium 9040 cores (1.6GHz, 18MB cache) using the Oracle® JRockit JVM. The new SGI record is over 74 percent higher than the previous record.
The SPEC CPU2006 rate benchmarks measure the capacity of a system to complete a fixed number of tasks. In a large shared-memory environment, this test stresses the scalability of the operating system, the memory subsystem, and to some extent the I/O subsystem.
SPECfp_rate_base2006 is an indicator of system response for HPC workloads; it is a mix of floating-point-intensive applications from different domains that stress the platform in different ways. Running SPECfp_rate_base2006 at this scale is non-trivial because it puts significant stress on the kernel, scheduler, file system, memory bandwidth and I/O bandwidth. Using a partition of the Altix 4700 system at LRZ, configured with 1024 Itanium cores (1.6GHz, 18MB cache), SGI achieved the world record with a SPECfp_rate_base2006 score of 10600. This result is more than five times that of the closest Single System Image (SSI) competitor on the SPEC list.
SPECint_rate_base2006 is an industry-standard benchmark suite that measures system performance when running an integer-intensive workload. The challenges mentioned above apply equally to running this suite at scale. On the Altix 4700 at LRZ, the SGI benchmarking team set the SPECint_rate_base2006 world record, achieving a score of 9030, which is four times that of the next closest SSI competitor.
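To make the rate scores above more concrete, here is a rough sketch of how a SPEC CPU2006 rate metric is assembled, as I understand SPEC's published definition: each benchmark in the suite yields a ratio of copies run times reference time over elapsed time, and the overall score is the geometric mean of those ratios. The function names and all numbers in the usage note are hypothetical illustrations, not values from the runs described here.

```python
import math

def rate_ratio(copies, reference_seconds, elapsed_seconds):
    """Throughput ratio for one benchmark: copies * reference time / elapsed time."""
    return copies * reference_seconds / elapsed_seconds

def overall_rate(ratios):
    """Geometric mean of the per-benchmark ratios (computed via logs)."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

if __name__ == "__main__":
    # Hypothetical inputs: 4 copies of two made-up benchmarks.
    ratios = [
        rate_ratio(4, reference_seconds=1000.0, elapsed_seconds=500.0),
        rate_ratio(4, reference_seconds=2000.0, elapsed_seconds=4000.0),
    ]
    print(overall_rate(ratios))
```

Because the number of copies multiplies every ratio, the score rewards a system that can keep many copies running concurrently without slowing each one down, which is why the operating system and memory subsystem matter so much at 1024 cores.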
To complete this story, let me recall another important result. Since 2006, the SGI Altix 4700 installed at LRZ has held the world record for STREAM, the industry-standard benchmark for aggregate memory bandwidth, with 4.35TB/s, which is 5x that of the closest SSI competitor.
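For readers unfamiliar with STREAM, the following is a minimal sketch of the accounting behind its "triad" kernel, a[i] = b[i] + q * c[i]: the bytes read and written are divided by the best elapsed time. The real benchmark is written in C and Fortran; this pure-Python version is only an illustration, its interpreter overhead means the number it reports is not a meaningful memory bandwidth, and the function name and default sizes are my own assumptions.

```python
import time
from array import array

def triad_bandwidth(n=200_000, scalar=3.0, repeats=5):
    """Run the triad a[i] = b[i] + scalar * c[i] and return (bytes/second, a)."""
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    a = array("d", [0.0]) * n
    # Triad touches three arrays per iteration: read b, read c, write a.
    bytes_moved = 3 * a.itemsize * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for i in range(n):
            a[i] = b[i] + scalar * c[i]
        best = min(best, time.perf_counter() - start)
    return bytes_moved / best, a

if __name__ == "__main__":
    bandwidth, _ = triad_bandwidth()
    print(f"{bandwidth / 1e6:.1f} MB/s (illustrative only)")
```

On a shared-memory machine the aggregate figure comes from running this kind of kernel across all processors at once, so the result reflects how well total memory bandwidth scales with the size of the system.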
To summarize the facts, SGI Altix 4700 is proven to be:
4x higher in the number of cores in a Single System Image
5x higher in memory bandwidth
5x better in performance for floating point workloads
4x better in performance for integer workloads
This leads me to the conclusion that the Altix 4700 with Intel Itanium processors defines a new platform class: an extreme scale-up architecture that pushes the scale-up concept to new limits.
Why is scale-up relevant? Many key HPC problems, such as cryptography, fraud detection, search engines, complex event processing and graph-based problems, simply do not run on clusters. As an IDC study revealed, the majority of ISV applications run on a single node, because massive in-memory computation enables full-scale system simulation without the need to reduce resolution or precision, and without breaking the problem apart. Applications bound by random I/O data access can achieve huge performance gains when the entire dataset is brought into memory. On a cluster of small nodes, a load imbalance cannot be corrected without explicitly copying data between nodes; in a scale-up system, the work is simply directed to the available processor. Most importantly for developers, the single system image does not restrict the parallel programming models that can be used, including hybrid schemes, which enables key research into programming models and is especially important as we move into the new world of multi-core CPUs. And of course, system administration is much easier.
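The load-balancing point can be illustrated with a small sketch: in a shared-memory program, uneven work items sit in one queue and whichever worker becomes idle takes the next one, reading the shared dataset in place rather than receiving a copy. This is only an illustration of the scheduling pattern, using Python's standard thread pool; the function name, chunk boundaries and worker count are hypothetical, and Python threads do not give true CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

def process_chunks(data, chunk_bounds, workers=4):
    """Sum each (lo, hi) range of a shared list using a pool of worker threads."""
    def work(bounds):
        lo, hi = bounds
        # Index into the shared list directly; no chunk is copied to the worker.
        return sum(data[i] for i in range(lo, hi))

    # Idle workers pull the next pending chunk from the pool's internal queue,
    # so a worker that finishes a small chunk moves straight on to more work.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, chunk_bounds))

if __name__ == "__main__":
    data = list(range(100))
    # Deliberately uneven chunks to mimic an imbalanced workload.
    bounds = [(0, 5), (5, 60), (60, 100)]
    print(process_chunks(data, bounds))
```

On a distributed-memory cluster the equivalent rebalancing would require shipping the relevant slice of the data to another node before it could do the work.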
SPEC results are available at:
http://www.spec.org/jbb2005/results/res2009q3/jbb2005-20090727-00756.html
http://www.spec.org/cpu2006/results/res2009q3/cpu2006-20090802-08313.html
http://www.spec.org/cpu2006/results/res2009q3/cpu2006-20090802-08312.html
http://www.cs.virginia.edu/stream/top20/Bandwidth.html
SPEC, SPECint, SPECfp and SPECjbb are registered trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of 9/03/2009.