This paper proposes a fast message passing (FMP) communication architecture that reduces the software overhead of communication, a major impediment to the performance of workstation clusters. An FMP system has been implemented on the SUN Ultra2-Myrinet platform. Measurements show that this implementation achieves a one-way latency of 11.2 µs for one-byte packets in network communication and only 4.9 µs in local communication. Its bandwidth reaches 338 Mb/s for 8 KB packets in network communication and more than 770 Mb/s in local message passing. These results indicate that FMP can effectively exploit the performance of supercomputers and high-speed networks.
Local communication is implemented through shared memory within a single host, while the overhead of network communication is reduced through a user-level communication protocol, pipelined transmission, credit-based flow control, and multithreading. Furthermore, the system provides the same interface for both network and local communication.
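
The idea behind credit flow control, one of the techniques named above, can be illustrated with a minimal sketch. It is not the FMP implementation: the class names (Sender, Receiver), the slot count, and the one-credit-per-message accounting are assumptions chosen for illustration. The sender holds one credit per free receive slot, spends a credit on every send, and regains credits only when the receiver reports that it has consumed messages, so the receive buffer cannot overflow.

```python
from collections import deque


class Receiver:
    """Hypothetical receiver with a fixed number of buffer slots."""

    def __init__(self, slots):
        self.slots = slots
        self.buffer = deque()

    def deliver(self, msg):
        # The sender's credit check should make overflow impossible.
        assert len(self.buffer) < self.slots, "receive buffer overflow"
        self.buffer.append(msg)

    def consume(self):
        # Remove the oldest message; one slot (one credit) is freed.
        return self.buffer.popleft(), 1


class Sender:
    """Hypothetical sender that spends one credit per message."""

    def __init__(self, receiver):
        self.receiver = receiver
        self.credits = receiver.slots  # one credit per free receive slot

    def try_send(self, msg):
        if self.credits == 0:
            return False  # no free slot known: the caller must wait
        self.credits -= 1
        self.receiver.deliver(msg)
        return True

    def add_credits(self, n):
        # Called when the receiver reports freed slots.
        self.credits += n


def demo():
    rx = Receiver(slots=2)
    tx = Sender(rx)
    # Two sends succeed; the third is refused for lack of credit.
    results = [tx.try_send(m) for m in ("a", "b", "c")]
    msg, freed = rx.consume()
    tx.add_credits(freed)
    # With one credit returned, the retry succeeds.
    results.append(tx.try_send("c"))
    return results, msg, list(rx.buffer)
```

In a real system the credit return would travel back over the network, typically piggybacked on other traffic, rather than being a direct method call as in this sketch.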