Alain Darte, Yves Robert, and Frédéric Vivien
Birkhäuser, Boston, MA, 2000, 240 pp.
ISBN 0-8176-4149-1, $69.95
Program restructuring methods, particularly loop transformations, are important optimization techniques used in parallelizing compilers. New developments in loop transformations, based on unimodular transformations and general affine transformations, have emerged recently.
Scheduling and Automatic Parallelization explains how to incorporate these transformations in algorithms, which transformation to apply, and how to optimize them. It provides a full study of loop transformations and algorithms for parallel loop detection from a scheduling perspective, making the link with cyclic scheduling and systems of uniform recurrence equations.
This book is concerned with two principal topics of automatic parallelization: task graph scheduling and loop nest scheduling. Task graph scheduling aims at executing tasks linked by precedence constraints; it is a run-time activity. Loop nest scheduling aims at executing statement instances linked by data dependences; it is a compile-time activity. Most of the book, however, is devoted to loop nest scheduling.
As for loop nest scheduling, the authors' goal is to capture in a single place the remarkable developments of the last decade or so. Dozens of loop transformations were introduced (loop interchange, skewing, fusion, distribution, etc.) before a unifying theory emerged. The theory builds upon the pioneering papers of Karp, Miller, and Winograd and of Lamport, and it relies on sophisticated mathematical tools (unimodular transformations, parametric integer linear programming, Hermite decomposition, Smith decomposition, etc.).
The book is divided into two parts. Part I (the first three chapters) deals with unidimensional problems (task graph scheduling and cyclic scheduling), and Part II (Chapters 4 and 5) tackles multidimensional problems (recurrence equations and loop nests).
Chapters 1 and 2 present results from scheduling theory, both classical and recent, in the framework of task graph systems. These two chapters introduce techniques (such as list-scheduling heuristics) that will be used later in the book, and they also attempt to summarize state-of-the-art techniques in task graph scheduling. Chapter 1 deals with basic results on task graph scheduling. The authors use the original model in which only the vertices of the task system are weighted; no communication cost is taken into account. Scheduling such task graphs with unlimited resources is not difficult, but the problem with limited resources is NP-complete. Chapter 2 is the counterpart of Chapter 1 when communication costs are taken into account. The problem turns out to be surprisingly difficult; even the problem with unlimited resources is NP-complete. The chapter describes both theoretical results and cost-effective heuristics widely used in software systems.
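To make the flavor of these chapters concrete, here is a minimal sketch (not taken from the book) of the greedy list-scheduling heuristic in the no-communication-cost model: tasks carry vertex weights, and each ready task is placed on the earliest-available of m identical processors. The task names, weights, and the diamond-shaped example DAG are invented for illustration.

```python
import heapq

def list_schedule(weights, preds, m):
    """Greedy list scheduling of a weighted task graph on m identical
    processors (vertex weights only, no communication costs).
    Returns (schedule, makespan), with schedule[task] = (proc, start)."""
    # Topological order (Kahn's algorithm).
    succs = {t: [] for t in weights}
    indeg = {t: 0 for t in weights}
    for t, ps in preds.items():
        for p in ps:
            succs[p].append(t)
            indeg[t] += 1
    order = [t for t in weights if indeg[t] == 0]
    for t in order:                        # the list grows as we append
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                order.append(s)
    free = [(0, i) for i in range(m)]      # (time processor is free, id)
    heapq.heapify(free)
    finish, schedule = {}, {}
    for t in order:
        # A task may start once all predecessors have finished.
        earliest = max((finish[p] for p in preds[t]), default=0)
        avail, proc = heapq.heappop(free)  # earliest-available processor
        start = max(avail, earliest)
        finish[t] = start + weights[t]
        schedule[t] = (proc, start)
        heapq.heappush(free, (finish[t], proc))
    return schedule, max(finish.values())

# Hypothetical diamond DAG: b and c both depend on a, and d on both.
weights = {"a": 2, "b": 3, "c": 1, "d": 2}
preds = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
schedule, makespan = list_schedule(weights, preds, m=2)
print(makespan)  # 7: b and c run in parallel on the two processors
```

With limited resources this heuristic is not optimal in general, which is exactly where the NP-completeness results of Chapter 1 bite.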
Chapter 3 is devoted to cyclic scheduling, a basic instance of the software pipelining problem. Cyclic scheduling marks the transition from task graphs to loop nests because it operates on a reduced representation of the problem, the reduced dependence graph, which is manipulated at compile time. Because the problem is unidimensional, it is conceptually simpler than scheduling general loop nests, and the authors are able to derive bounds and heuristics in the presence of limited resources.
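A classical bound in this setting, which the chapter's techniques build on, is the initiation interval of a cyclic schedule: it cannot be smaller than the maximum, over all cycles of the reduced dependence graph, of (total delay along the cycle) / (total iteration distance along the cycle). The sketch below (not the book's code; the tiny example graph is invented) enumerates simple cycles by brute force, which is adequate only for small graphs.

```python
from collections import defaultdict

def min_initiation_interval(edges):
    """edges: list of (u, v, delay, distance) with integer node labels.
    Returns the lower bound on the initiation interval II, i.e. the
    maximum over all simple cycles of (sum of delays / sum of distances)."""
    adj = defaultdict(list)
    for u, v, d, k in edges:
        adj[u].append((v, d, k))
    best = 0.0

    def dfs(start, node, d_sum, k_sum, visited):
        nonlocal best
        for v, d, k in adj[node]:
            if v == start:                       # closed a cycle
                if k_sum + k > 0:
                    best = max(best, (d_sum + d) / (k_sum + k))
            elif v not in visited and v > start:  # canonical start = min node
                dfs(start, v, d_sum + d, k_sum + k, visited | {v})

    for s in sorted(adj):
        dfs(s, s, 0, 0, {s})
    return best

# Invented reduced dependence graph: cycle 0->1->0 has delay 3 over
# distance 1 (ratio 3); cycle 1->2->1 has delay 2 over distance 2 (ratio 1).
edges = [(0, 1, 1, 0), (1, 0, 2, 1), (1, 2, 1, 0), (2, 1, 1, 2)]
print(min_initiation_interval(edges))  # 3.0
```

Real software pipeliners compute this critical cycle ratio with polynomial algorithms and then search for a schedule meeting it under resource constraints.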
Chapter 4 is the most theoretical of the book; it provides the mathematical foundations for the next chapter. It addresses scheduling problems related to graphs whose edges are labeled by multidimensional integer vectors. Such graphs come from the computational model introduced by Karp, Miller, and Winograd, and are known as systems of uniform recurrence equations (SUREs). Computability issues, linear scheduling, and multidimensional scheduling are studied. It turns out that this study can be used to automatically parallelize codes defined by nested loops.
The last chapter, Chapter 5, deals with loop nests and loop parallelization algorithms. The different dependence representations used in the literature are recalled, and the optimality of each parallelism detection algorithm is studied with respect to its underlying dependence representation. The chapter covers most algorithms proposed in the past, from Lamport's hyperplane method and Allen and Kennedy's algorithm to unimodular transformations and general multidimensional affine scheduling techniques such as those developed in Feautrier's algorithm. All the classical loop transformations (loop distribution, loop skewing, general unimodular transformations, etc.) are presented from a uniform scheduling viewpoint.
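A small sketch of the scheduling viewpoint on one such transformation: loop skewing seen as the unimodular map T taking iteration (i, j) to (i + j, j), i.e. Lamport's hyperplane schedule i + j = const. If every transformed dependence distance has a strictly positive first component, the inner loop of the transformed nest carries no dependence and may run in parallel. The distance vectors below are an assumed 2D stencil, not an example from the book.

```python
def det2(T):
    # Determinant of a 2x2 integer matrix.
    return T[0][0] * T[1][1] - T[0][1] * T[1][0]

def apply(T, d):
    # Image of distance vector d under the transformation matrix T.
    return (T[0][0] * d[0] + T[0][1] * d[1],
            T[1][0] * d[0] + T[1][1] * d[1])

deps = [(1, 0), (0, 1)]   # dependence distances of an assumed 2D stencil
T = [[1, 1], [0, 1]]      # skewing: (i, j) -> (i + j, j)

assert abs(det2(T)) == 1  # unimodular: a bijection on integer points
# Outer loop carries all dependences iff every transformed distance
# has first component > 0; the inner loop is then fully parallel.
inner_parallel = all(apply(T, d)[0] > 0 for d in deps)
print(inner_parallel)     # True
```

Both distances (1, 0) and (0, 1) map to vectors with first component 1, so all dependences are carried by the outer (time) loop, which is exactly the wavefront parallelism the hyperplane method extracts.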
This book has the following features:
- State-of-the-art coverage of task graph scheduling
- Up-to-date results on acyclic scheduling in models with and without communication costs
- Scheduling cyclic graphs with unidimensional and multidimensional weights
- Synthesis of known results on systems of uniform recurrence equations
- Detailed coverage of parallel-loop detection algorithms
- Algorithms used to generate codes with directives for parallel loops
- Self-contained presentations, theorems, proofs, and algorithms
- Numerous detailed worked examples used in the text exposition
- End-of-chapter exercises not only apply the techniques but go further, allowing an exploration of more complicated questions
This book is neither an introduction to parallel processing nor an introduction to parallelizing compilers. The authors assume that readers are familiar with books such as High Performance Compilers for Parallel Computing by Wolfe and Supercompilers for Parallel and Vector Computers by Zima and Chapman, and that they want to know more about scheduling transformations. The authors list 125 references concerning parallelizing compilers and provide exercises at the end of each chapter. However, the book is not well suited to practicing engineers who develop parallelizing compilers. It can serve as an essential text and reference on the latest developments in automatic parallelization methods used for scheduling and for program transformations in compilers. Professionals, researchers, and graduate or postgraduate students who want to understand the foundations of automatic parallelization techniques will find it an authoritative resource. It will be very useful to researchers interested in scheduling, compilers, and program transformations.
The University of Aizu, Aizu, Japan