Red Storm upgrade lifts Sandia supercomputer to 2nd in world, but 1st
in scalability, say researchers
Sandia's long row of Red Storm cabinets gives a hint of the
supercomputer's dazzling scalability.
Credit: Sandia National Laboratories
ALBUQUERQUE, N.M. — A
$15 million upgrade to Sandia’s Red Storm computer has
increased its peak speed from 41.5 to 124.4 teraflops, in a
field where a single teraflop was a big deal only six years
ago.
The machine, built by Cray
Inc., is now rated second fastest in the world, with a Linpack
speed of 101.4 teraflops. The widely recognized Linpack test
measures a supercomputer’s speed in solving a dense system of
linear equations.
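As a rough illustration of how these figures relate, the short Python sketch below compares the reported Linpack result with the reported peak speed. The function and variable names are illustrative only and do not come from Sandia or Cray; the three teraflop figures are the ones quoted above.

```python
# Figures quoted in the article, in teraflops.
PEAK_BEFORE_TFLOPS = 41.5    # peak speed before the upgrade
PEAK_AFTER_TFLOPS = 124.4    # peak speed after the upgrade
LINPACK_TFLOPS = 101.4       # measured Linpack speed after the upgrade


def fraction_of_peak(measured_tflops, peak_tflops):
    """Return the measured speed as a fraction of the theoretical peak."""
    return measured_tflops / peak_tflops


# How much the upgrade multiplied the peak speed.
peak_gain = PEAK_AFTER_TFLOPS / PEAK_BEFORE_TFLOPS

# Share of the new peak speed that the Linpack run actually reached.
linpack_fraction = fraction_of_peak(LINPACK_TFLOPS, PEAK_AFTER_TFLOPS)

print(f"Peak speed gain from upgrade: {peak_gain:.2f}x")
print(f"Linpack speed as share of peak: {linpack_fraction:.1%}")
```

By this simple ratio, the upgrade roughly tripled peak speed, and the Linpack run reached roughly 81.5 percent of the new peak.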
“While not number one in
speed, in terms of scalability, Red Storm is the best in the
world,” says Bill Camp, director of Sandia’s
Computation, Computers, Information, and Math center.
Scalability refers to a
supercomputer’s computational efficiency as the number of
processors on a job is increased. “You want to use more
processors to get large jobs done more quickly,” says Camp,
“but if the computer doesn’t scale well you can lose
much of that speedup.” Red Storm loses little efficiency on
large numbers of processors.
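One common way to put a number on scalability is parallel efficiency: the speedup over a single processor divided by the number of processors used. The Python sketch below uses that textbook definition. The timings in it are invented for illustration and are not Red Storm measurements.

```python
def speedup(time_one_processor, time_many_processors):
    """How many times faster the job runs on many processors than on one."""
    return time_one_processor / time_many_processors


def parallel_efficiency(time_one_processor, time_many_processors, processors):
    """Speedup divided by processor count; 1.0 means perfect scaling."""
    return speedup(time_one_processor, time_many_processors) / processors


# Hypothetical job: 1,000 hours on one processor (made-up numbers).
TIME_ON_ONE = 1000.0

# A machine that scales well: 1,000 processors finish in 1.25 hours.
good = parallel_efficiency(TIME_ON_ONE, 1.25, 1000)

# A machine that scales poorly: the same 1,000 processors need 4 hours.
poor = parallel_efficiency(TIME_ON_ONE, 4.0, 1000)

print(f"Well-scaling machine:   {good:.0%} efficient")
print(f"Poorly scaling machine: {poor:.0%} efficient")
```

In this made-up case the first machine keeps 80 percent of the ideal speedup and the second keeps only 25 percent, which is the kind of loss Camp describes.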
“The Cray XT3
supercomputers now dominating the highest end of computing
worldwide are based upon Sandia’s Red Storm,” says
Camp, who together with Sandia colleague Jim Tomkins led the
design of the machine. “Scientists love it because they can
do bigger science more quickly on it than any other computer in
existence, except for molecular dynamics studies on BlueGene/L
(Lawrence Livermore National Lab's supercomputer).
Otherwise, it’s the best thing since night baseball.”
“The machine’s also
a computational workhorse. It gets the job done,” says
Sandia researcher Steve Attaway, a winner of several national
computing awards who runs large engineering simulations on the
machine.
Red Storm was designed under
the National Nuclear Security Administration’s Advanced
Simulation & Computing program and is used for NNSA’s
stockpile stewardship program, which helps ensure that the U.S.
nuclear weapons stockpile is safe and reliable without the
resumption of underground nuclear testing. This supercomputer
also runs computer codes used for conducting materials science
simulations critical to national security. Sandia is an NNSA
laboratory.
The Red Storm design became the
basis for the Cray XT3™ massively parallel processor (MPP)
supercomputer that has been installed at a number of prestigious
supercomputing centers around the world.
Purchasers of this design
include Oak Ridge National Laboratory, which will create an even bigger
supercomputer than Red Storm based on the same design, as well as
Lawrence Berkeley Labs, Pittsburgh Supercomputing Center (which is
the largest National Science Foundation site), the U.S. Army, the
United Kingdom’s Atomic Weapons Establishment (AWE) program,
the national computing centers in Finland, Switzerland and the
U.K., and other U.S. and allied government sites.
Red Storm is Sandia’s
largest high-performance computer, but is thrifty in its use of
power. It uses 2.2 megawatts, roughly half the power drawn by other
supercomputers in its class. This means that comparatively less
of Red Storm’s energy is converted to useless heat.
Red Storm also takes up a
relatively small area — about 3,500 square feet.
Its Linpack test demonstrated
high reliability, repeatedly running for nine hours on over
26,000 processor cores without a failure.
The machine took less than
three years to create from concept to customer shipment. It was
relatively inexpensive to develop and build — $77.5 million
including engineering and design costs — and is used for
large scientific and technical problems.
Sandia developed the
architectural specifications of the machine and did much of the
software development. “The hardware at Cray was built to
meet our specifications,” says Sandia Senior Scientist Jim
Tomkins.
The upgrade added a
fifth row of cabinets and refitted the entire
system with dual-core AMD Opteron™ processors, resulting in a
supercomputer with over 26,000 processor cores. Dual-core
technology fits two processor cores on a single die, doubling
processing capacity with minimal impact on power consumption and
temperature levels.
Why is Red Storm so efficient?
In part, says Sandia researcher Robert Ballance, because its
operating system is based on minimalist software — termed a
lightweight kernel — which carries just enough
functionality to load the job, put it on the network, and stop
it. Any other software is job-specific; thus, each computer node
(at which two chips are located) in effect lugs no useless
software on its back.
The original technology was
pioneered by Sandia on its ASCI Red machine, the world’s first
terascale supercomputer, built by Intel Corporation.
Source / Credit: Sandia National Laboratories