Challenges in Parallel and Distributed Computing


Boleslaw K. Szymanski


This issue closes the first volume of our journal. We have many new submissions in the pipeline. In fact, we have already accepted enough papers to fill the next two issues, and many more submissions are under review. In accordance with our policy of rapid publication, we do not want to have too many papers accepted in advance of publication; on the other hand, we need a constant stream of submissions to ensure the high quality of the journal. This success in the first year has confirmed our motivation for creating the journal, which was to provide a forum for the maturing field of parallel and distributed computing. This field has enormous potential to change computing, and we are witnessing the partial fulfillment of this potential. Today, both parallel and distributed computing have become ubiquitous. At the same time, quickly developing technology is fundamentally changing the balance among the costs of computing, communication and programming, and a new balance often leads to a change of paradigm. This editorial looks at current trends to suggest topics whose importance is likely to grow in the near future, with the goal of encouraging researchers working on those topics to submit their results to our journal.

At the hardware level, the essential aspect of the quickly changing landscape is the difference in the rates at which network bandwidth, processor speed and memory access time improve; they are listed here in descending order of their rate of improvement. All-optical networks and interconnects are changing the balance on the networking side: port throughput is now limited more by processor speed than by network bandwidth, which was the limiting factor in the past. At the same time, network latency is fundamentally limited by the speed of light and the distance that the transferred data must travel. Processor speed, in turn, is growing faster than memory access time is shrinking, because technological advances in memory are used to increase chip capacity rather than speed. The resulting use of buffering to mask the speed differences has led to the multi-level memory hierarchy, in which registers, primary cache, secondary cache and main memory are typical layers with progressively lower speed but larger capacity.

One result of these trends is the growing importance of data locality for the performance of a computer system, where architectural details dictate the structure of the most efficient object code. However, hardware speed is improving much faster than programming efficiency, which makes programmer time more and more expensive relative to the cost of hardware. Hence, expecting a programmer to find the optimum run-time structure would run contrary to this trend. As a result, programming is moving towards portability and reuse of software, both of which require abstraction from the architectural details of the run-time computing system. Compilers and run-time systems must therefore be responsible for tuning portable software for a particular architecture, and research on automatic optimization of data locality has been growing in importance.
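To make the locality argument concrete, the following minimal sketch in C contrasts two traversal orders of the same row-major matrix and shows a blocked (tiled) transpose of the kind a locality-optimizing compiler might generate. The function names and the `block` parameter are illustrative assumptions, not taken from any particular compiler or library, and the actual performance difference depends on the cache sizes of the target machine.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an n x n matrix stored in row-major order, walking memory
   contiguously: consecutive accesses tend to share a cache line. */
double sum_row_order(const double *a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            s += a[i * n + j];
    return s;
}

/* Same mathematical sum (up to floating-point rounding), but with
   stride-n access: each step jumps n doubles ahead, so for large n
   most accesses are likely to touch a different cache line. */
double sum_column_order(const double *a, size_t n) {
    double s = 0.0;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            s += a[i * n + j];
    return s;
}

/* Blocked transpose: out = transpose(in), processed in tiles of
   block x block elements so that both the rows read from `in` and
   the rows written to `out` can stay in cache while a tile is handled.
   `block` is a hypothetical tuning parameter chosen per architecture. */
void transpose_blocked(const double *in, double *out,
                       size_t n, size_t block) {
    assert(block > 0);
    for (size_t ii = 0; ii < n; ii += block)
        for (size_t jj = 0; jj < n; jj += block)
            for (size_t i = ii; i < ii + block && i < n; i++)
                for (size_t j = jj; j < jj + block && j < n; j++)
                    out[j * n + i] = in[i * n + j];
}
```

The two summation routines are interchangeable at the source level, which is exactly why the choice between them, and the choice of tile size in the transpose, is a natural task for a compiler or run-time system rather than for the programmer of portable code.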

The increasing complexity of interactions between processor, memory hierarchy and network in a parallel system must be encapsulated in a proper abstract model capable of providing a universal representation of parallel algorithms. For sequential programming, such a role is fulfilled by the Turing machine or its equivalent, the Random Access Machine (RAM). Unfortunately, the straightforward generalization of RAM, the Parallel RAM (PRAM), does not address the issue of data locality at all. Currently, the most promising candidates for such an abstraction are the Bulk Synchronous Parallel (BSP) model and the LogP machine, both of which provide an abstraction of processors interconnected by a network. It is not yet clear, at least to this editor, whether they are equally capable of representing a modern memory hierarchy. Hence, work on models of parallelism, modern computer architectures and parallel algorithms is of growing interest.
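As a rough illustration of what the two models named above abstract, the sketch below computes the standard BSP cost of one superstep, w + h*g + l, and the LogP time for delivering a single short message, L + 2o. The struct and function names are hypothetical, chosen only for this example; the formulas are the usual textbook forms of the two models.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the BSP machine parameters:
   g = cost per word of communication, l = barrier synchronization cost,
   both expressed in units of local computation steps. */
typedef struct {
    double g;
    double l;
} bsp_params;

/* Cost of one BSP superstep on p processors:
   (max local work) + (max words sent or received by any processor) * g + l.
   work[i] and h[i] describe processor i. */
double bsp_superstep_cost(const double *work, const double *h,
                          size_t p, bsp_params m) {
    assert(p > 0);
    double w_max = work[0];
    double h_max = h[0];
    for (size_t i = 1; i < p; i++) {
        if (work[i] > w_max) w_max = work[i];
        if (h[i] > h_max) h_max = h[i];
    }
    return w_max + h_max * m.g + m.l;
}

/* LogP time for one short point-to-point message: the sender's
   overhead o, the network latency L, and the receiver's overhead o. */
double logp_message_time(double L, double o) {
    return L + 2.0 * o;
}
```

Neither expression contains a term for cache or memory-hierarchy behavior within a processor, which is one way to see why the ability of these models to represent a modern memory hierarchy remains an open question.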

At the software level, object-oriented programming is entrenched in all modern programming. Even Fortran 90 programs can be written in the object-oriented style, and these capabilities are improved in Fortran 95. The two new trends in language design are portability, as represented by Java, and generic programming, whose first significant example is the Standard Template Library (STL). Portability, supported for example by the Java Virtual Machine (JVM), promotes the use of networks of workstations, or even Internet-connected computers, as parallel machines and helps close the gap between parallel and distributed computing.

Traditionally, distributed computing has focused on resource availability, result correctness, code portability and transparency of access to resources more than on efficiency and speed, which, in addition to scalability, are central to parallel computing. The low and ever-decreasing cost of hardware encourages configuring computer systems for peak demand, which is much higher than average demand, so computers are generally underutilized. Hence, large computational resources are available at any moment over the LAN (Local Area Network), the WAN (Wide Area Network) and the Internet. Distributed and parallel systems capable of exploiting such resources are of growing interest. However, the challenges in building them for truly universal use are formidable; among them are the security of the accessed machines, the system's ability to adapt to the changing availability of computers, fault tolerance, and the transparency of this form of parallelism to users.

We are finishing the first year of the journal's existence. The challenges and opportunities in parallel and distributed computing outlined above indicate that the journal focuses on a vital area of growing importance, one that will continue to be a rich source of exciting articles for many years to come.

Boleslaw K. Szymanski
Rensselaer Polytechnic Institute