Abstracts Volume 7, No. 3, September 2006

SPECIAL ISSUE PAPERS

Empirical Parallel Performance Prediction From Semantics-Based Profiling
Norman Scaife, Greg Michaelson and Susumu Horiguchi

 The PMLS parallelizing compiler for Standard ML is based upon the automatic instantiation of algorithmic skeletons at sites of higher order function (HOF) use. Rather than mechanically replacing HOFs with skeletons, which in general leads to poor parallel performance, PMLS seeks to predict run-time parallel behaviour to optimise skeleton use.

Static extraction of analytic cost models from programs is undecidable, and practical heuristic approaches are intractable. In contrast, PMLS utilises a hybrid approach, combining static analytic cost models for skeletons with dynamic information gathered from the sequential instrumentation of HOF argument functions. Such instrumentation is provided by an implementation-independent SML interpreter, based on the language's Structural Operational Semantics (SOS), in the form of SOS rule counts. PMLS then tries to relate the rule counts to program execution times through numerical techniques.

This paper considers the design and implementation of the PMLS approach to parallel performance prediction. The formulation of a general rule count cost model as a set of over-determined linear equations is discussed, and the solution of these equations by singular value decomposition and by a genetic algorithm is presented.
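As a rough illustration of relating rule counts to execution times, the sketch below fits one weight per SOS rule to an over-determined system (more timed runs than rules) in the least-squares sense. It is not the PMLS implementation: it uses the normal equations with Gaussian elimination rather than singular value decomposition or a genetic algorithm, purely to stay dependency-free, and the function names and sample data are hypothetical.

```python
def solve_least_squares(counts, times):
    """Return weights w minimising the squared error of counts * w = times.

    counts: one row of rule counts per timed run (hypothetical data).
    times:  measured execution time of each run.
    Uses the normal equations (A^T A) w = A^T b, not SVD.
    """
    n_rules = len(counts[0])
    # Build A^T A and A^T b from the run data.
    ata = [[sum(row[i] * row[j] for row in counts) for j in range(n_rules)]
           for i in range(n_rules)]
    atb = [sum(row[i] * t for row, t in zip(counts, times))
           for i in range(n_rules)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    aug = [ata[i] + [atb[i]] for i in range(n_rules)]
    for col in range(n_rules):
        pivot = max(range(col, n_rules), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("rule counts are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n_rules):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n_rules] / aug[i][i] for i in range(n_rules)]


def predict_time(weights, rule_counts):
    """Predict execution time as the weighted sum of rule counts."""
    return sum(w * c for w, c in zip(weights, rule_counts))


# Hypothetical measurements: four runs, two rules (4 equations, 2 unknowns).
counts = [[10, 4], [3, 8], [7, 7], [1, 12]]
times = [22.0, 10.0, 17.5, 8.0]
weights = solve_least_squares(counts, times)
```

With the fitted weights, `predict_time(weights, [5, 5])` estimates the time of an unseen function from its rule counts. For ill-conditioned count matrices, an SVD-based solver would be the more robust choice, which is one reason to prefer it over the normal equations used here.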

download PDF SCPE_7_3_01.pdf (PDF, ~174KB) download PS SCPE_7_3_01.zip (zipped PS, ~314KB)

Managing Heterogeneity in a Grid Parallel Haskell
A. D. Al Zain, P. W. Trinder, G. J. Michaelson and H.-W. Loidl

 Computational Grids potentially offer cheap large-scale high-performance systems, but are a very challenging architecture, being heterogeneous, shared and hierarchical. Rather than requiring a programmer to explicitly manage this complex environment, we recommend using a high-level parallel functional language, like GpH, with largely automatic management of parallel coordination.

We present GridGUM, an initial port of the distributed virtual shared-memory implementation of GpH for computational Grids. We show that GridGUM delivers acceptable speedups on relatively low-latency homogeneous and heterogeneous computational Grids. Moreover, we find that for heterogeneous computational Grids, load management limits performance.

We present the initial design of GridGUM2, which incorporates new load management mechanisms that cheaply and effectively combine static and dynamic information to adapt to heterogeneous Grids. The mechanisms are evaluated by measuring four non-trivial programs with different parallel properties. The measurements show that the new mechanisms improve load distribution over the original implementation, reducing runtime by between 17% and 57%, with the greatest improvement obtained for the most dynamic program.

download PDF SCPE_7_3_02.pdf (PDF, ~792KB) download PS SCPE_7_3_02.zip (zipped PS, ~745KB)

Dynamic Memory Management in the Loci Framework
Yang Zhang and Edward A. Luke

 Resource management is a critical concern in high-performance computing software. While management of processing resources to increase performance is the most critical, efficient management of memory resources plays an important role in solving large problems. This paper presents a dynamic memory management scheme for a declarative high-performance data-parallel programming system, the Loci framework. In such systems, some sort of automatic resource management is a requirement. We present an automatic memory management scheme that provides a good compromise between memory utilization and speed. In addition to basic memory management, we also develop methods that take advantage of the cache memory subsystem and explore balances between memory utilization and parallel communication costs.

download PDF SCPE_7_3_03.pdf (PDF, ~175KB) download PS SCPE_7_3_03.zip (zipped PS, ~329KB)

Selected papers from the ISPDC'05 Conference

A Parallel Rule-based System and Its Experimental Usage in Membrane Computing
Dana Petcu

 Distributed or parallel rule-based systems are currently needed for real applications. The proposed architecture of such a system is based on a wrapper that allows cooperation between several instances of the rule-based system running on different computers of a cluster.

As a case study, a parallel version of the Java Expert System Shell (Jess) is built. Initial tests show its efficiency when running classical benchmarks. Moreover, this parallel version of Jess is successfully used to accelerate current simulators for membrane computing.

download PDF SCPE_7_3_04.pdf (PDF, ~575KB) download PS SCPE_7_3_04.zip (zipped PS, ~469KB)

An Efficient Fault-Tolerant Routing Strategy for Tori and Meshes
M. E. Gómez, P. López and J. Duato

 In massively parallel computing systems, high-performance interconnection networks are decisive in achieving maximum performance. While routing is one of the most important design issues of interconnection networks, fault-tolerance is another issue of growing importance in these machines, since the huge amount of hardware increases the probability of failure. This paper proposes a mechanism that provides both scalable routing and fault-tolerance for commercial switches used to build direct regular topologies, which are the topologies used in large machines. The mechanism is very flexible and the hardware required is not complex. Furthermore, it tolerates a high number of faults with minimal effect on performance.

download PDF SCPE_7_3_05.pdf (PDF, ~1.2MB) download PS SCPE_7_3_05.zip (zipped PS, ~3.1MB)

Afpac: Enforcing consistency during the adaptation of a parallel component
J. Buisson, F. André and J.-L. Pazat

 Grid architectures are execution environments that are known to be at the same time distributed, parallel, heterogeneous and dynamic. While current tools focus on hiding distribution, parallelism and heterogeneity, this approach does not fit their dynamic aspect well. Indeed, if applications are able to adapt themselves to environmental changes, they can benefit from those changes to achieve better performance. This article presents Afpac, a model extending Dynaco for designing self-adaptable parallel components that can be assembled to build applications for the Grid. This model includes the definition of a consistency criterion for the dynamic adaptation of SPMD components. We propose a solution to implement this criterion. It has been evaluated using both synthetic and real codes to exhibit the behavior of several proposed strategies.

download PDF SCPE_7_3_06.pdf (PDF, ~182KB) download PS SCPE_7_3_06.zip (zipped PS, ~323KB)

WebCom-G and MPICH-G2 Jobs
Padraig J. O'Dowd, Adarsh Patil and John P. Morrison

 This paper discusses using WebCom-G to handle the management and scheduling of MPICH-G2 (MPI) jobs. Users can submit their MPI applications to a WebCom-G portal via a web interface. WebCom-G then selects the machines on which to execute the application, depending on the machines available to it and the number of machines requested by the user. WebCom-G automatically and dynamically constructs an RSL script with the selected machines and schedules the job for execution on these machines. Once the MPI application has finished executing, results are stored on the portal server, where the user can collect them. A main advantage of this system is fault survival: if any of the machines fail during the execution of a job, WebCom-G can automatically handle such failures. Following a machine failure, WebCom-G can create a new RSL script with the failed machines removed, incorporate new machines (if they are available) to replace the failed ones and re-launch the job without any intervention from the user. The probability of failures in a Grid environment is high, so fault survival becomes an important issue.
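The fault-survival step described above can be sketched as a simple machine-list update: drop the failed machines, then top the list back up from any spare machines that are available. This is only an illustrative sketch, not WebCom-G code; the function name and host names are hypothetical, and it deliberately does not generate an actual RSL script.

```python
def reschedule(selected, failed, spares):
    """Return a new machine list for re-launching a job after failures.

    selected: machines originally chosen for the job (hypothetical names).
    failed:   set of machines that failed during execution.
    spares:   other machines currently available, in preference order.
    """
    # Keep every originally selected machine that is still alive.
    survivors = [m for m in selected if m not in failed]
    # Replace as many failed machines as the spare pool allows.
    needed = len(selected) - len(survivors)
    replacements = [m for m in spares
                    if m not in failed and m not in survivors][:needed]
    return survivors + replacements


# Hypothetical job on three machines; node2 fails and node4 replaces it.
new_machines = reschedule(["node1", "node2", "node3"],
                          {"node2"},
                          ["node4", "node5"])
```

If no spare machines are available, the job is simply re-launched on the survivors, which matches the abstract's "if they are available" qualification.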

download PDF SCPE_7_3_07.pdf (PDF, ~161KB) download PS SCPE_7_3_07.zip (zipped PS, ~333KB)

RESEARCH PAPERS

Pion: A Problem Solving Environment for Parallel Multivariate Integration
Shujun Li, Elise de Doncker and Karlis Kaugars

 PARINT is a package for parallel multivariate numerical integration. This paper describes the design and implementation of a problem solving environment based on PARINT and Web technology. We call it PARINT ONline (Pion). It enables both ordinary end-users and experts to solve computationally intensive numerical integration problems in parallel. Users need no parallel programming experience or knowledge of Unix/Linux operating systems. When the user submits an integration problem to Pion via a Web browser, the problem solving environment compiles the integrand function and links it dynamically with the PARINT package so that the execution can be done in parallel on high performance computing servers. The system was designed to be a globally accessible integration platform that operates as a black box, taking user data and producing the results.

download PDF SCPE_7_3_08.pdf (PDF, ~246KB) download PS SCPE_7_3_08.zip (zipped PS, ~430KB)

The Great Plains Network (GPN) Middleware Test Bed
Amy W. Apon, Gregory E. Monaco and Gordon K. Springer

 GPN (Great Plains Network) is a consortium of public universities in seven mid-western states. GPN goals include regional strategic planning and the development of a collaboration environment, middleware services and a regional grid for sharing computational, storage and data resources. A major challenge is to arrive at a common authentication and authorization service, based on the set of heterogeneous identity providers at each institution.

GPN has built a prototype middleware test bed that includes Shibboleth and other NMI-EDIT middleware components. The test bed includes several prototype end-user applications, and is being used to further our research into fine-grained access control for virtual organizations. The GPN prototype applications and namespace form a basis for the design and deployment of a robust and scalable attribute management architecture.

download PDF SCPE_7_3_09.pdf (PDF, ~800KB) download PS SCPE_7_3_09.zip (zipped PS, ~1.2MB)