TeraGrid: the future of supercomputing

By Sterling Sanders

Imagine a scene out of a science fiction movie: you visit the headquarters of the National Center for Supercomputing Applications (NCSA) at the University of Illinois, Urbana-Champaign, and pull open the door to a completely white room with a ceramic-tiled floor. The room contains only a 20-foot-long, six-foot-tall, metal-framed black computer christened Mercury.

Stepping through the door, you feel the slight chill of air conditioning on your skin, while the loud mechanical hum of this massive machine and the white glow of fluorescent lights engulf your senses in a surreal experience.

It is at this point that you are told you are staring at what is currently the 35th most powerful supercomputer in the world. It delivers 2.662 teraflops of processing power (a teraflop is a trillion calculations per second) from 512 Intel 1.3 GHz Itanium 2 processors incorporated into an IBM supercluster, a large group of computers networked together in parallel to work as one.

As a first-phase machine in the National Science Foundation's (NSF) TeraGrid project, Mercury is one of four supercomputers in what is currently the most wide-ranging and in-depth attempt to build the world's largest and most powerful nationally distributed computing tool for scientific research.

The TeraGrid will allow researchers to reach the world's fastest computers and largest data archives from their desktop workstations, said Rob Pennington, interim director of the NCSA. The project, he said, is revolutionizing the capabilities of scientific research.

The system aims to bring together the best minds in the country with the newest tools and most powerful computers, a combination that Pennington said will solve the most pressing scientific problems of our time.

"The original concept of this project was to create a system that would not only link these powerful machines across the country, giving researchers access to a greater amount of resources and capabilities than just one specialized computer, but it was also a people effort to get people working together in a collaborative effort," said Trish Barker, public information specialist at the NCSA.

"The TeraGrid concept was launched by the NSF with $53 million in funding to four sites, the main two being at [San Diego Supercomputer Center (SDSC)] at the University of California, San Diego, and the other here at the NCSA at U of I," Barker said.

The TeraGrid is currently online and running at five sites: the NCSA at UIUC -- providing 2.0 teraflops; the SDSC at the University of California, San Diego -- providing 1.3 teraflops; Argonne National Laboratory in Argonne, IL -- providing 1.9 teraflops; the Center for Advanced Computing Research (CACR) at the California Institute of Technology in Pasadena -- providing 0.2 teraflops; and the Pittsburgh Supercomputing Center (PSC) at Carnegie Mellon University and the University of Pittsburgh -- providing 6 teraflops.
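For a sense of scale, the per-site figures can be tallied. The following is a minimal Python sketch that uses only the numbers quoted above; the variable names are illustrative and not part of any TeraGrid software.

```python
# Peak capacity per site in teraflops, as quoted in this article.
site_teraflops = {
    "NCSA": 2.0,
    "SDSC": 1.3,
    "Argonne": 1.9,
    "CACR": 0.2,
    "PSC": 6.0,
}

# Combined capacity across the five sites.
total = sum(site_teraflops.values())

# Site contributing the most, and its share of the combined figure.
largest_site = max(site_teraflops, key=site_teraflops.get)
share = site_teraflops[largest_site] / total

print(f"Combined: {total:.1f} teraflops")
print(f"Largest contributor: {largest_site} ({share:.0%} of the total)")
```

By this tally the five sites together provide about 11.4 teraflops, with the PSC accounting for roughly half.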

It would take a person about 60,000 years to perform the trillion calculations that a one-teraflop machine completes in a single second.
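The 60,000-year figure can be sanity-checked with simple arithmetic. The sketch below assumes the comparison refers to one trillion calculations (one second of work for a one-teraflop machine) performed by a person working without a break; that reading is an assumption, not something the source spells out.

```python
# Assumption: "one teraflop of calculations" means one trillion calculations.
CALCULATIONS = 1e12
YEARS = 60_000

# Seconds in an average year (365.25 days).
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# Pace a person would need to sustain, nonstop, to finish in 60,000 years.
seconds_per_calculation = YEARS * SECONDS_PER_YEAR / CALCULATIONS

print(f"{seconds_per_calculation:.2f} seconds per calculation")
```

Under that assumption the figure works out to a little under two seconds per calculation, around the clock, which is a plausible pace for a person with pencil and paper.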

"To put it in perspective, multiply the speed and power of a standard personal computer by 1,500, and you'll have near the capability of one of these machines," Barker said.

Each of the systems is connected through a 40-gigabit-per-second network, which keeps lag between facilities to a minimum when transferring massive amounts of data. This network is 40 times faster than the UIUC VPN and approximately 4,000 times faster than a standard cable modem.
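To make those ratios concrete, the sketch below estimates how long one dataset would take to cross each link. The 1-terabyte dataset size is a hypothetical chosen for illustration, the slower link speeds are derived purely from the "40 times" and "4,000 times" comparisons above, and the timings are ideal figures that ignore protocol overhead and congestion.

```python
BACKBONE_GBPS = 40.0  # TeraGrid network speed quoted in the article

# Slower links are derived from the article's ratios, not measured values.
links_gbps = {
    "TeraGrid backbone": BACKBONE_GBPS,
    "UIUC VPN": BACKBONE_GBPS / 40,
    "cable modem": BACKBONE_GBPS / 4000,
}

DATASET_BYTES = 1e12  # hypothetical 1-terabyte dataset


def transfer_seconds(size_bytes, gbps):
    """Ideal transfer time in seconds for size_bytes over a link of gbps."""
    return size_bytes * 8 / (gbps * 1e9)


for name, gbps in links_gbps.items():
    hours = transfer_seconds(DATASET_BYTES, gbps) / 3600
    print(f"{name}: {hours:.2f} hours")
```

On these assumptions the same terabyte moves across the backbone in a few minutes, takes a couple of hours at campus speeds, and takes more than a week over a cable modem.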

"The idea here is that you don't need all the resources in one place. Scientists will be able to sit at their computers and access a vast range of power, storage and tools -- such as visualization and simulation programs -- from across the country," Pennington said.

"This makes a lot more processing power available to researchers on a standard basis," said Michael Schneider, information specialist at the PSC.

Each of the sites plays a particular role in the TeraGrid project. While the majority of the computational power is being installed at the NCSA at UIUC, most of the storage facilities are being placed at the SDSC.

"With the addition of the [PSC] into the TeraGrid, we have become the lead in collaborative efforts for interoperability between each of the machines in the project," Schneider said.

"Before the PSC was added to the project, there was an almost homogeneous network of the computers on the system because they were all running IBM superclusters with nearly the same version of Linux," Schneider continued.

"But we run a six-teraflop HP Unix server here, so we've been leading the task force on getting our systems working flawlessly with one another so that researchers won't have to worry about incompatibility issues once they begin using the system," Schneider concluded.

"Basically, what we want to see in this project is researchers using the system without having to know how it works," Barker said.

"The TeraGrid is supposed to work like a light switch. You flick it on when you need the power, and even though that power could be coming from halfway across the country, you don't need to know where it's coming from or how it's working. It's just working when you need it to," she said.

The first phase of the project launched January 1, 2004, and since then, "about 12 teams of researchers have been making use of the system," Barker said.

So far, research conducted on the system includes molecular modeling for disease detection, cures and drug discovery; automobile crash simulations; black hole collision simulations; alternative energy source research; and climate and atmospheric simulations for more accurate weather predictions.

Michael Norman, a physics professor at the Center for Astrophysics and Space Science (CASS) at the University of California, San Diego (UCSD), created Enzo, the world's largest and most complex scientific simulation of the evolution of the universe, and has recently adapted it to run on the first-phase production model of the TeraGrid.

The simulation tracks the formation of vast, structured galaxies and gas clouds during the billions of years following the Big Bang.

Barbara Minsker, another researcher, is studying groundwater contamination in areas across America.

Her research on the TeraGrid aims to solve these contamination problems by expressing possible cleanup solutions as mathematical algorithms and letting the computer determine, through a process modeled on natural selection, the best solution for a given area.
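The "natural selection" approach described above resembles what computer scientists call a genetic algorithm. The following is a toy Python sketch of that general technique, not Minsker's actual method: the three-well pumping plan, the IDEAL_RATES target and the cost() function are all hypothetical stand-ins for what would, in real research, be a full groundwater simulation.

```python
import random

# Hypothetical stand-in for a groundwater model: each candidate is a list of
# pumping rates for three wells, and cost() scores how far it sits from an
# arbitrary "ideal" plan. A real study would run a contaminant simulation here.
IDEAL_RATES = [4.0, 1.5, 7.0]


def cost(candidate):
    """Lower is better: squared distance from the hypothetical ideal plan."""
    return sum((rate - ideal) ** 2 for rate, ideal in zip(candidate, IDEAL_RATES))


def evolve(generations=60, population_size=30, seed=0):
    """Return the best plan found and the best cost in the starting population."""
    rng = random.Random(seed)
    population = [[rng.uniform(0.0, 10.0) for _ in IDEAL_RATES]
                  for _ in range(population_size)]
    initial_best = min(cost(c) for c in population)
    for _ in range(generations):
        # Selection: the cheaper half of the population survives unchanged.
        population.sort(key=cost)
        survivors = population[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.sample(survivors, 2)
            # Crossover: take each rate from one parent or the other.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Mutation: nudge one rate by a small random amount.
            index = rng.randrange(len(child))
            child[index] += rng.gauss(0.0, 0.3)
            children.append(child)
        population = survivors + children
    return min(population, key=cost), initial_best


best_plan, initial_best_cost = evolve()
print("best plan:", [round(rate, 2) for rate in best_plan])
print("cost:", round(cost(best_plan), 4), "vs. initial best", round(initial_best_cost, 4))
```

Because the fitter half of each generation is carried forward untouched, the best plan can only stay the same or improve from one generation to the next. Evaluating thousands of candidate plans against a realistic model is the kind of workload that calls for TeraGrid-scale computing power.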

"As we like to say, the TeraGrid system allows scientists to do 'Big Science,' but to do 'Big Science,' we have to get down to the smallest particle, and for that we need power," Barker said.

To get access to the TeraGrid system, "scientists must submit an online proposal for what we call allocations," said Melissa Johnson, senior allocation project coordinator for the TeraGrid project.

"Once the application is submitted, it goes through a peer review process at which time we decide when this particular project would get access and what kind of allocation [or the amount of time, processing power and system tools] a particular researcher would receive," Johnson said.

The project continues to evolve as more tools are built to take advantage of this network of machines, Pennington said. "Computer-simulated research is definitely a wave of the future, and we're doing the best we can to make the results a reality," he said.

A completed version of the TeraGrid, with a capacity of 20 teraflops, is scheduled to be up and running by the end of 2004 and available to the entire scientific research community.