LoBoS History

Please click on the links in the side menu for a detailed overview of each LoBoS implementation. Read on below for the story of how LoBoS came to be.

The Early Years

The first Beowulf cluster was developed by a group including Donald Becker, Thomas Sterling, Jim Fischer, and others at NASA's Goddard Space Flight Center in 1993-1994. The aim of the Beowulf project was to provide supercomputing performance at a fraction of the traditional cost. This was made possible by two recent technological developments: first, the introduction of inexpensive Intel and Intel-clone microprocessors that performed respectably compared to DEC's Alpha CPU, Sun's SPARC and UltraSPARC lines, and other high-performance CPUs; and second, the availability of capable open-source operating systems, most notably Linux. The Beowulf project was a success and spawned a variety of imitators at research institutions that wanted supercomputing power without the traditional price tag.

The original iteration of LoBoS was conceived by Bernard Brooks and Eric Billings in the mid-1990s as an attempt to use the architecture developed by the NASA group to improve the cost-effectiveness of molecular modeling. The first LoBoS cluster was constructed between January and April of 1997 and remained in use until March 2000. It used hardware that was state of the art at the time. The network topology was a ring (each node had three NICs; see the LoBoS 1 page for more details) joined to the NIH campus network by a pair of high-speed interconnects. This cluster was able to take advantage of the recent parallelization of computational chemistry software such as CHARMM, and it was made available to collaborating researchers at NIH and other institutions.

Physical view of LoBoS 1

LoBoS through the Years

The LoBoS cluster, like the original Beowulf, proved to be a success. Researchers at NIH and collaborating institutions used it to develop large-scale parallel molecular modeling experiments and simulations. By 1998, however, the original cluster, whose nodes contained dual 200 MHz Pentium Pro processors, was becoming obsolete. A second cluster, LoBoS 2, was therefore constructed from nodes with dual 450 MHz Intel Pentium II processors. This cluster also abandoned the ring network topology in favor of a standard Ethernet bus, with both Fast and Gigabit Ethernet connections, a rarity for the late 1990s. Meanwhile, the original LoBoS cluster was converted for desktop use. This illustrated another advantage of the LoBoS approach: machines could be repurposed when newer technology became available for the cluster.

With the second incarnation of LoBoS, demand for cluster use continued to increase. To provide NIH and collaborating researchers with a top-of-the-line cluster environment, the Cluster 2000 committee was chartered to build a combined LoBoS 3/Biowulf cluster. The committee evaluated several options for processors, network interconnects, and other technologies.

Despite the existence of LoBoS 3/Biowulf, the CBS staff decided to construct a LoBoS 4 cluster, whose nodes used dual AMD Athlon MP processors. LoBoS 4 also added Myrinet, Myricom's proprietary high-speed, low-latency fiber network technology, which gave a significant performance improvement to parallel applications. The staff ran into power and reliability problems with this cluster; although these were mostly resolved, most of the LoBoS 4 nodes were returned to their vendor as trade-ins for LoBoS 5 nodes. LoBoS 5, completed in December 2004, was an evolutionary development of LoBoS 4, featuring nodes with dual Xeon processors and expanded use of Myrinet.

As LoBoS 5 began to age, plans were made for the construction of LoBoS 6, the first version of LoBoS to use 64-bit CPUs. The first batch of nodes, 52 systems with two dual-core Opterons each, was brought online in late summer of 2006. The next batch, 76 nodes with two quad-core Intel Clovertown processors each, is currently being brought online. The Opteron nodes are connected with single data rate (SDR; 10 Gbps) InfiniBand interconnects, while the Clovertown nodes use double data rate (DDR; 20 Gbps) InfiniBand.

The EonStor RAID arrays provide disk space for the LoBoS 5 cluster.

The Future

Thanks to the construction of a new data center, LoBoS has substantial capacity for expansion. The Laboratory of Computational Biology is considering several options for new nodes, but will likely continue to buy single- or dual-socket quad-core systems for the foreseeable future.


References

Biowulf Staff. NIH Biowulf home page.
Sterling, Thomas. "Beowulf Breakthroughs." Linux Magazine, June 2003.