The challenge in moving large amounts of scientific data is that the open Internet is designed for transferring small amounts of data, like Web pages, said Thomas A. DeFanti, a specialist in scientific visualization at the California Institute for Telecommunications and Information Technology, or Calit2, at the University of California, San Diego. While a conventional network connection might be rated at 10 gigabits per second, in practice scientists trying to transfer large amounts of data often find that the real rate is only a fraction of that capacity.
The new network will also serve as a model for future computer networks in the same way the original NSFnet, created in 1985 to link research institutions, eventually became part of the backbone for the Internet, said Larry Smarr, an astrophysicist who is director of Calit2 and the principal investigator for the new project.
NSFnet connected five supercomputer centers with 56-kilobit-per-second modems. In the three decades since, network speeds have increased dramatically, but not nearly enough to handle a coming generation of computers capable of a quintillion operations per second. This week the Obama administration announced that the United States is committed to creating what is known as the “exascale” supercomputing era, with machines roughly 30 times faster than today’s fastest computer, which operates on what is called the “petascale.”
“I believe that this infrastructure will be for decades to come the kind of architecture by which you use petascale and exascale computers,” Smarr said.
Increasingly, digital science is generating torrents of data. For example, an astronomy effort called the Intermediate Palomar Transient Factory, at the Palomar Observatory in Southern California, continuously scans the dark sky looking for new phenomena. Overall, the Palomar observational system captures roughly 30 terabytes (TB) of data per night. By contrast, a Library of Congress project that archives the entire World Wide Web collects about 5 terabytes per month.
In addition to moving data between laboratories, the high-speed network will make new kinds of distributed computing for scientific applications possible. For example, physicists working with data collected by the Large Hadron Collider at CERN in Switzerland initially kept duplicate copies of files at many different computer clusters around the world, said Frank Wuerthwein, a physicist at the University of California, San Diego. More recently, he said, as high-speed links have become more widely available, experimental data is often kept in a single location and used for experiments by scientists running programs from remote locations, at significant cost savings.
Further, the new network has been designed with hardware security features to protect it from the attacks that routinely bedevil computers connected to the Internet. Recently, one server at the University of California, San Diego, that was connected to the open Internet counted 35,000 false login attempts in one day, Smarr said.
The new network is an extension of an existing intra-campus effort by the National Science Foundation to create islands of high-speed connectivity for campus researchers. In recent years the agency has invested more than $500,000 in each of roughly 100 campuses nationwide.