Monday, June 30, 2008

Max Planck Institute's ATLAS Cluster Deploys Woven 10GigE Fabric

The ATLAS compute cluster at the Max Planck Institute for Gravitational Physics in Hannover, Germany, achieved top ranking among the 285 Gigabit Ethernet clusters in the most recent TOP500 list of supercomputer sites. The ATLAS cluster leverages the Dynamic Congestion Avoidance feature in Woven Systems' 10 Gigabit Ethernet (GE) fabric to reach a performance level of 32.8 Teraflops, making it the fastest Ethernet cluster in the world.

The ATLAS cluster consists of 1,342 compute nodes occupying 32 full racks. The use of Intel Xeon quad-core processors gives the cluster over 5,000 CPU cores. Each server node has its own dedicated Gigabit Ethernet connection to a Woven TRX 100 Ethernet switch, and each TRX 100 has four separate 10 GE uplinks to the EFX 1000 Ethernet Fabric Switch at the core of the Ethernet fabric. Because this configuration is not over-subscribed, it provides non-blocking throughput for both inter-processor communications and storage access.
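The arithmetic behind "not over-subscribed" can be sketched in a few lines of Python. This is only an illustration: the article does not say how many nodes attach to each TRX 100, so the per-switch node counts below are hypothetical, and the core count assumes one quad-core processor per node, which is consistent with the "over 5,000" figure but not stated outright.

```python
def oversubscription_ratio(nodes_per_switch, node_gbps=1.0,
                           uplinks=4, uplink_gbps=10.0):
    """Ratio of edge (node-facing) bandwidth to uplink bandwidth.

    A ratio of 1.0 or less means the uplinks can carry all node
    traffic at once, i.e. the edge switch is not over-subscribed.
    """
    downlink_gbps = nodes_per_switch * node_gbps
    uplink_total_gbps = uplinks * uplink_gbps
    return downlink_gbps / uplink_total_gbps


NODES = 1342            # from the article
CORES_PER_NODE = 4      # assumption: one quad-core Xeon per node

total_cores = NODES * CORES_PER_NODE

# Four 10 GE uplinks carry 40 Gbps, so at 1 Gbps per node the
# largest non-over-subscribed group is 40 nodes per edge switch.
max_nodes_non_blocking = int(4 * 10.0 / 1.0)

if __name__ == "__main__":
    print("total cores:", total_cores)
    print("max nodes per edge switch:", max_nodes_non_blocking)
    # Hypothetical per-switch node counts, for comparison only.
    for n in (32, 40, 48):
        print(n, "nodes ->", oversubscription_ratio(n))
```

Under these assumptions, any edge switch serving 40 or fewer Gigabit Ethernet nodes stays at or below a 1:1 ratio, which matches the non-blocking claim above.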

Woven Systems said its Dynamic Congestion Avoidance capability continuously balances traffic loads in real time across the available paths in the non-blocking 10 GE fabric, which keeps efficiency high across the cluster.

A cluster's computational efficiency is critical to its performance, and efficiency is determined by the network's ability to handle inter-processor communications. Efficiency is calculated from Linpack benchmark results as the ratio of actual performance to the theoretical peak performance of all processors in the cluster. Of the 285 Gigabit Ethernet clusters on the TOP500 list, only six achieved a computational efficiency greater than 60 percent, and of those, ATLAS was the most powerful. Of the remaining Gigabit Ethernet clusters, 76 had efficiencies of 50 percent or less, 97 had efficiencies of 51-55 percent, and 106 had efficiencies of 56-60 percent.
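As a rough sketch, the efficiency ratio and the distribution quoted above can be checked in Python. The article does not give the theoretical peak for ATLAS, so the code only derives the upper bound implied by an efficiency above 60 percent; the 50.0 Teraflops peak in the usage lines is a made-up value for illustration, not a published figure.

```python
def efficiency(rmax_tflops, rpeak_tflops):
    """Linpack efficiency: measured performance over theoretical peak."""
    if rpeak_tflops <= 0:
        raise ValueError("theoretical peak must be positive")
    return rmax_tflops / rpeak_tflops


ATLAS_RMAX_TFLOPS = 32.8   # measured Linpack performance, from the article

# Efficiency distribution of the Gigabit Ethernet clusters, from the article.
efficiency_bins = {
    "50% or less": 76,
    "51-55%": 97,
    "56-60%": 106,
    "above 60%": 6,
}
total_clusters = sum(efficiency_bins.values())

# An efficiency above 60 percent implies the theoretical peak is
# below Rmax / 0.60.
rpeak_upper_bound_tflops = ATLAS_RMAX_TFLOPS / 0.60

if __name__ == "__main__":
    print("clusters counted:", total_clusters)
    print("implied peak below (TFlops):", round(rpeak_upper_bound_tflops, 2))
    # Hypothetical peak value, for illustration only.
    print("efficiency at a 50.0 TFlops peak:", efficiency(ATLAS_RMAX_TFLOPS, 50.0))
```

The four bins sum to the 285 Gigabit Ethernet clusters cited, so the breakdown is internally consistent.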