Parallel and Distributed Real-Time Systems: An Introduction

Jan van Katwijk; Janusz Zalewski

Published: Mar 1, 2001

Abstract

In this Special Issue we have collected nine contributions on Parallel and Distributed Real-Time Systems. Below, we give an overview of all nine papers, which deal with problems and challenges in this rapidly growing domain. The individual papers tackle different issues related to developing applications. The topics covered in the first two groups of papers range from off-the-shelf components using CORBA, to scheduling mechanisms and load balancing in parallel real-time systems, to real-time communication protocols based on multicasting and providing fault tolerance, as well as standard protocols such as the Resource ReSerVation Protocol (RSVP). Another group of papers deals with verification principles in an environment supporting a radio broadcast paradigm and with two of the most challenging application domains for real-time distributed systems: on-board embedded systems for satellites and control of large high-energy physics experiments. Finally, we make some predictions on the most important aspects of research practice in the near future, which we believe will focus on system architectures and the verification process.

Keywords:

CORBA, parallel real-time computing, real-time computing, distributed real-time computations, real-time scheduling, load balancing, verification and validation, radio broadcast paradigm, on-board systems, experiment control.

1. Motivation

The development of real-time systems is receiving ever-increasing attention within the fields of computer science and engineering. This is no surprise, since the computerization of our society has intensified enormously in the last decade. Computerization and automation cause our lives to be surrounded by computerized products and services. On the one hand, there are simple devices we meet in our daily lives: the washing machine, the wristwatch and the central heater are classical examples of computer-controlled devices. On the other hand, we are surrounded by far more complex systems.

In our direct living environment, we are confronted with things like the telephone, the home computer and the television, all fairly complex devices nowadays. Outside the house, we encounter all kinds of systems influencing our behavior: traffic lights are controlled by complex distributed control systems, shopping centers are monitored via security cameras, air travel could not take place if flight control had not grown to the level it has reached today, and so on. These systems distinguish themselves by their enormous growth in the last decade, from relatively simple single-computer systems a couple of decades ago to vast distributed control systems nowadays.

The enormous impact of the area and its growing importance in our daily lives make it worthwhile to give an overview of current issues in the domain of parallel and distributed real-time software and systems. In this special issue of Parallel and Distributed Computing Practices, we therefore provide such an overview. Since an overview should be balanced in its treatment, we have chosen to include selected contributions from most subareas of the real-time software and systems discipline.

2. Real-Time Issues in Parallel and Distributed Computing

The easiest way to develop and run a parallel/distributed real-time system seems to be to take existing software, with a proven operational record in a parallel or distributed environment, and run it in real time. This approach, however, raises an obvious question: how would contemporary tools, suitable for traditional distributed applications, perform in real time? To find an answer to this question, we invited researchers we knew would address exactly this problem.

Polze, Malek, Wallnau and Plakosh discuss in their invited contribution the use of off-the-shelf components based on CORBA in the construction of distributed real-time software systems. They pinpoint several problems that arise with an intuitive, simplistic approach, and develop their own solution to most of them. Their solution is based on what they call Composite Objects. The main idea of composite objects is to keep programming at the high level of abstraction that CORBA offers and to concentrate all issues that require detailed knowledge of the objects' timing behavior in special extension mechanisms. The invited paper, titled Real-Time Computing with Off-the-Shelf Components: The Case for CORBA, serves as a good background to the rest of this issue, because it outlines a number of problems that need fundamental solutions, not just quick fixes.

What is crucial in presenting the challenges of the discipline, and how they are addressed, is to focus on new problems that emerge as the requirements for parallel and distributed real-time systems evolve, and to review the attempts at solving them. The features that distinguish parallel and distributed systems from traditional, mostly uniprocessor systems are primarily those that deal with the multiplicity of processors. The first issue is how to schedule multiple tasks on these processors so that they perform optimally in real time; the second is how to balance the load to minimize idle cycles, a different face of the same problem.
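To make the link between scheduling and load balancing concrete, the following minimal sketch assigns tasks to processors with a simple greedy rule: the longest remaining task goes to the processor with the smallest load so far. This is a generic textbook heuristic chosen for illustration, not a method taken from any paper in this issue; the function name, the task names and the durations are all hypothetical, and deadlines are deliberately ignored.

```python
import heapq

def assign_longest_first(durations, processors):
    """Greedy rule: give each task, longest first, to the least-loaded processor.

    `durations` maps a (hypothetical) task name to its execution time.
    Returns the per-processor task lists and the resulting per-processor loads.
    """
    heap = [(0, p) for p in range(processors)]  # entries are (current load, processor id)
    heapq.heapify(heap)
    plan = {p: [] for p in range(processors)}
    for task, duration in sorted(durations.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)           # least-loaded processor so far
        plan[p].append(task)
        heapq.heappush(heap, (load + duration, p))
    loads = {p: sum(durations[t] for t in plan[p]) for p in plan}
    return plan, loads

plan, loads = assign_longest_first({"a": 5, "b": 4, "c": 3, "d": 2}, processors=2)
idle = max(loads.values()) * len(loads) - sum(loads.values())
```

In this small example both processors end up with the same load, so `idle` is zero; with less convenient durations the same rule leaves some idle time, which is exactly what a load balancer tries to minimize.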

In the contribution of Fouad, Narahari and Hahn, the imprecise computation model is used as a framework for developing a real-time parallel scheduler and incorporating the effects of and possibilities for graceful degradation in soft real-time applications. In their contribution, A Real-Time Parallel Scheduler for the Imprecise Computation Model, the authors address this issue by establishing a set of constraints that must be adhered to in utilizing dynamic load balancing for a parallel real-time system such that schedules are not invalidated and the computational error is not increased.
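The basic idea behind imprecise computation can be sketched in a few lines, assuming the usual formulation in which each task has a mandatory part that must complete and an optional part that only refines the result. The sketch below is not the scheduler of Fouad, Narahari and Hahn: it handles a single processor and a single time budget, and the class, the function and the numbers are hypothetical. It only illustrates how graceful degradation shows up as a computational error, measured here as the optional time that could not be run.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    mandatory: float  # time that must be granted for an acceptable result
    optional: float   # extra time that only improves the result

def allocate(tasks, budget):
    """Grant every mandatory part, then hand out spare time to optional parts in order.

    Returns ({task name: granted time}, total error), where the error is the
    amount of optional time left unexecuted.
    """
    required = sum(t.mandatory for t in tasks)
    if required > budget:
        raise ValueError("mandatory parts alone exceed the time budget")
    spare = budget - required
    grants = {}
    error = 0.0
    for t in tasks:
        extra = min(t.optional, spare)  # as much refinement as the spare time allows
        spare -= extra
        grants[t.name] = t.mandatory + extra
        error += t.optional - extra
    return grants, error

tasks = [Task("filter", 2, 3), Task("track", 1, 2)]
grants, error = allocate(tasks, budget=5)
```

With a budget of 5 time units, the mandatory parts use 3 units and only 2 units of refinement fit, so the result is degraded but still valid; with a budget of 8 units the error drops to zero.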

Imprecision and uncertainty, when formalized, are also the basis for a novel real-time load balancing scheme discussed in the paper on Distributed Computations in Real Time Based on a Rough Grammar Principle, by Wójcik and Zalewski. The concept of a rough grammar is based on that of a rough set. More specifically, the paper shows that, using the rough grammar concept, one can dynamically sequence all tasks in a processor net in a pipeline fashion. The duration of each pipeline is normalized to the duration of one of the shortest tasks, which reduces idle and wait times to a minimum.

3. Real-Time Communication Protocols

The issues of scheduling and load balancing can never be fully resolved without thorough knowledge of communication protocols, which allow participating parties to exchange information. Real-time communication is particularly tricky due to uncertainties in the network, such as loss of messages or node failures, so that guaranteeing message delivery is difficult unless special measures are taken.

In this context, Tunali, Erciyes and Soysert discuss in their contribution, A Hierarchical Fault Tolerant Ring Protocol for Distributed Real-Time Systems, an approach to solving a communication problem in distributed real-time systems through a synchronous communication protocol operating on hierarchical rings. The fault-tolerant algorithm they developed allows the protocol to maintain communication in the presence of crash failures and is easily usable for real-time applications.
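As a rough illustration of how a ring can keep communicating when nodes crash, the sketch below passes a token around a single ring and bypasses any node known to have crashed. It is not the hierarchical protocol of Tunali, Erciyes and Soysert: there is only one ring, failure detection is assumed to be given as a set of crashed nodes, and all names are hypothetical.

```python
def next_alive(ring, current, crashed):
    """Return the first non-crashed successor of `current` on the ring, or None."""
    n = len(ring)
    start = ring.index(current)
    for step in range(1, n + 1):
        candidate = ring[(start + step) % n]
        if candidate not in crashed:
            return candidate
    return None  # every node on the ring has crashed

def circulate(ring, start, crashed, hops):
    """Pass a token `hops` times, skipping crashed nodes; return the visit order."""
    visited = [start]
    node = start
    for _ in range(hops):
        node = next_alive(ring, node, crashed)
        if node is None:
            break
        visited.append(node)
    return visited

order = circulate(["A", "B", "C", "D"], start="A", crashed={"B"}, hops=4)
```

Here node B is skipped on every round, so the token keeps circulating among A, C and D. A real protocol would also have to detect the crash within a bounded time, which is where the real-time difficulty lies.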

Gannod and Bhattacharya, in their paper titled Real-Time Multicast in Wireless Communication, present a new way to resolve multicast communication in wireless/mobile networks. In particular, they define a new performance metric for real-time multicast networks: the delay in reconstructing the multicast tree at the instant of node migration. This issue is important because nodes in a mobile network tend to migrate from cell to cell, which makes it necessary to rebuild their descriptive patterns. The authors also give simulation results showing the performance metrics as a function of different system configurations when node migration occurs.

The third contribution addressing the development and application of communication protocols in the real-time domain, by Benzekri and Sarafoglou, is titled Protocols for Real-Time Network Applications Programming. The authors discuss the entire hierarchy of protocols (protocol stack) usable in real-time communication, in particular IP Multicast, RTP/RTCP (Real-time Transport Protocol and RTP Control Protocol), and RSVP (Resource ReSerVation Protocol), and present related work on a video server application in a campus-wide network.
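To give a feel for what RTP adds on top of the transport layer, the sketch below builds and parses the 12-byte fixed RTP header (version, marker bit, payload type, sequence number, timestamp and synchronization source identifier) as laid out in the RTP specification. It is an illustration only and is not taken from the paper by Benzekri and Sarafoglou; the function names and the sample values are hypothetical, and padding, header extensions and CSRC entries are left out.

```python
import struct

RTP_VERSION = 2

def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (no padding, extension, or CSRC list)."""
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       sequence & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)

def unpack_rtp_header(data):
    """Parse the first 12 bytes of an RTP packet into a dictionary of fields."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

# Sample values are arbitrary; 96 lies in the dynamically assigned payload type range.
header = pack_rtp_header(payload_type=96, sequence=7, timestamp=90000, ssrc=0x1234, marker=True)
fields = unpack_rtp_header(header)
```

The sequence number and timestamp are what let a receiver detect loss and reconstruct timing, which is why a video server such as the one described above would carry its media in RTP rather than in bare datagrams.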

4. Real-Time System Development

One could think that resolving the issues of real-time scheduling, load balancing, and real-time communication would let researchers and practitioners sleep quietly, but that is not so. Knowing what algorithms and protocols to use is just the tip of the iceberg; one needs to apply these concepts in real circumstances. In other words, intensive development and verification procedures are needed to prove that at least some of these concepts work.

In their contribution, Software Development and Verification of Dynamic Real-Time Distributed Systems Based on the Radio Broadcast Paradigm, van Katwijk, de Rooij, Stuurman and Toetenel address the use of an architectural style, defined as the Radio Broadcast Paradigm, as a basis for software development. They develop formal verification techniques to analyze the required behavior of an underlying distributed system built according to this paradigm. Their approach relies on the successive use of experimentation, abstraction and verification, and is labeled bottom-up, as opposed to the traditionally used top-down methodologies. They discuss the application of their verification approach to determining the timing constraints of an implemented communication protocol.

Theoretical methods are only as good as the practical use they find. We therefore gave serious thought to which particular parallel or distributed applications we would like to see covered in this issue. From practical research work we knew that few applications in the development of parallel or distributed real-time systems are more challenging than those used in space research and high-energy physics. We were lucky enough to serve recently on the Ph.D. defense committees of two practitioners who agreed to make invited contributions based on their dissertation work.

In the space domain, we welcome the contribution of Vardanega, who discusses issues in the development of software for multiprocessor on-board satellite systems. In a paper titled On the Distribution of Control in New Generation On-board Embedded Real-Time Systems, he presents a sound engineering approach to distributing control functions onto multiple processing nodes in such systems. In the high-energy physics domain, the contribution of Gaspar, Franek and Schwarz, titled Architecture of a Distributed Real-Time System to Control Large High-Energy Physics Experiments, discusses a generic architecture and framework capable of handling the control and monitoring of all aspects of a high-energy physics experiment. In particular, they outline the tools they developed to support the design and implementation of such an architecture with respect to two main issues: control and communication.

Summary

When we compare this issue on Parallel and Distributed Real-Time Systems to the previous one, prepared for Informatica four years ago [1], a few thoughts on the evolution of the field come up. This time, we structured the papers into three major categories, encompassing the following topics:

timing problems, including parallel scheduling and load balancing
real-time communication protocols
system development strategies.

Papers in each of these categories were present in the previous edition, only in different numbers. The number of contributions on real-time communication protocols has increased, at the cost of papers on real-time scheduling algorithms. Furthermore, design issues were made more explicit here than in the previous edition. These facts reflect the actual tendency of intensified research in communication protocols that are crucial to the reliable operation of parallel and distributed real-time systems and achieving their high performance.

However, the key aspect of dealing with these systems is in their development procedures, where assessing directions of progress is the most important from both the research and practical standpoint. In this view, we can distinguish two crucial factors of development strategies for parallel and distributed real-time systems:

Having up-front a well defined system architecture, perhaps in a form of design patterns, would significantly facilitate the development process by allowing designers to focus on more specific aspects of the application.
Following a well-established verification process for real-time system development would be of tremendous advantage to developers, not only in the area of parallel and distributed real-time systems.

In both respects, we are glad to observe some notable attempts to respond to these issues. The papers by Polze et al. and by Gaspar et al. clearly identify certain fundamental characteristics of system architectures that are indispensable for the proper design of real-time distributed systems. These contributions acknowledge the need for further research on system and software architectures as a basis for the development of parallel and distributed real-time systems. The papers by van Katwijk et al. and by Vardanega emphasize the need to start with a solid architectural style, and address the need for feedback information as a necessary component of the development process. Indeed, it is only through feedback analysis that design and implementation can be adapted to meet the requirements. Improving the verification process is easier said than done, but both papers provide practical examples of how to incorporate sound and formalized verification methods into the development process, and how to do so successfully.

So, in summary, can we answer the question of where this technology is heading and how it will evolve? It is our opinion, based not only on beliefs and experience but primarily on hard facts, such as the contents and results of some of the above-mentioned papers, that in the foreseeable future the crucial aspects of parallel and distributed real-time systems will relate to necessary enhancements in their development methodologies, focusing on the system architecture and the verification process. This is where the most advanced research will be heading. And all of us have to watch the progress, because it is clear that sooner or later all computer systems (with very few exceptions) will become real-time systems. This is for the simple reason that real-time systems have much better performance characteristics.

One uncertain point we see at the moment is the role of the Unified Modeling Language (UML) in system development. Despite some claims, UML is not yet ready to play any major role, or even to prove its usefulness, in parallel or distributed real-time applications. But its time may come very soon, once its semantics are formalized, it is better standardized, and appropriate software tools are built. This is because UML is, in principle, compatible with the major factors that, as we believe, will determine progress in this area. First, any well-defined system architecture can be easily expressed in the UML notation. Second, the verification process can be substantially enhanced by a strict graphical notation (such as statecharts), which UML is hoped to provide.

References

[1] M. Paprzycki and J. Zalewski (Eds.), Special Issue on Parallel and Distributed Real-Time Systems, Informatica, Vol. 19, No. 1, February 1995, ISSN 0350-5596.

Jan van Katwijk
Department of Mathematics and Informatics,
Delft University of Technology, P.O. Box 356, 2600 AJ Delft,
The Netherlands.
E-mail: J.vanKatwijk@twi.tudelft.nl

and

Janusz Zalewski
Department of Electrical & Computer Engineering,
University of Central Florida,
Orlando, FL 32816-2450, USA.
E-mail: jza@ece.engr.ucf.edu

Issue

Vol. 2 No. 1 (1999)

Section

Introduction to the Special Issue
