Abstracts Volume 6, No. 1, March 2005

SPECIAL ISSUE PAPERS

Enabling Rich Service and Resource Discovery with a Database for Dynamic Distributed Content
Wolfgang Hoschek

In a distributed system such as a Data Grid, it is desirable to maintain and query dynamic and timely information about active participants such as services, resources and user communities. This enables information discovery and collective collaborative functionality that operate on the system as a whole, rather than on a given part of it. However, it is not obvious how a database (registry) should maintain information populated from a large variety of unreliable, frequently changing, autonomous and heterogeneous remote data sources. In particular, how can one avoid sacrificing reliability, predictability and simplicity while still allowing powerful queries to be expressed over time-sensitive dynamic information? We propose the so-called hyper registry, which has a number of key properties. An XML data model allows for structured and semi-structured data, which is important for integration of heterogeneous content. The XQuery language allows for powerful searching, which is critical for non-trivial applications. Database state maintenance is based on soft state, which enables reliable, predictable and simple content integration from a large number of autonomous distributed content providers. Content link, content cache and a hybrid pull/push communication model allow for a wide range of dynamic content freshness policies, which may be driven by all three system components: content provider, hyper registry and client.
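The soft-state idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the hyper registry itself: the class name `SoftStateRegistry`, the methods `publish` and `query`, and the plain Python predicate (standing in for an XQuery expression) are all assumptions made for illustration. The only point shown is that an entry disappears on its own unless its provider refreshes it before a time-to-live runs out.

```python
import time


class SoftStateRegistry:
    """Toy registry (hypothetical): entries vanish unless refreshed in time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock, so expiry can be simulated
        self._entries = {}       # key -> (content, expiry timestamp)

    def publish(self, key, content, ttl):
        """Insert or refresh an entry; it stays valid for ttl seconds."""
        self._entries[key] = (content, self._clock() + ttl)

    def query(self, predicate=lambda content: True):
        """Return live entries that match predicate, dropping expired ones."""
        now = self._clock()
        self._entries = {k: v for k, v in self._entries.items() if v[1] > now}
        return {k: c for k, (c, _) in self._entries.items() if predicate(c)}


# Simulated time: a provider that stops refreshing silently drops out.
fake_now = [0.0]
registry = SoftStateRegistry(clock=lambda: fake_now[0])
registry.publish("svc-a", {"type": "storage"}, ttl=10)
registry.publish("svc-b", {"type": "compute"}, ttl=30)
fake_now[0] = 20.0
live = registry.query()   # svc-a has expired, svc-b is still live
```

With this design the registry never needs an explicit "unregister" message from a failed or unreachable provider, which is one reason soft state is often described as simple and predictable.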

download PDF: SCPE_6_1_01.pdf (PDF, ~233KB); download PS: SCPE_6_1_01.zip (zipped PS, ~526KB)


Ad Hoc Metacomputing with Compeer
Keith Power and John P. Morrison

Metacomputing allows the exploitation of geographically separate, heterogeneous networks and resources. Most metacomputers are feature rich and carry a long, complicated installation, requiring knowledge of accounting procedures, access control lists and user management, all of which differ from system to system. Metacomputers can have a high administrative overhead and a steep learning curve, which restricts their utility to organisations that can afford these costs. This paper describes the Compeer system, which attempts to make metacomputing more accessible by employing an implicitly parallel computing model, support for programming this model with a Java-like language and the construction of a dynamic ad hoc metacomputer that can be temporarily instantiated for the purpose of executing applications.

download PDF: SCPE_6_1_02.pdf (PDF, ~221KB); download PS: SCPE_6_1_02.zip (zipped PS, ~351KB)


The Role of XML Within the WebCom Metacomputing Platform
John P. Morrison, Philip D. Healy, David A. Power and Keith J. Power

Implementation details of the Nectere distributed computing platform are presented, focusing in particular on the benefits gained through the use of XML for expressing, executing and pickling computations. The operation of various Nectere features implemented with the aid of XML is examined, including communication between Nectere servers, specifying computations, code distribution, and exception handling. The area of interoperability with common middleware protocols is also explored.

download PDF: SCPE_6_1_03.pdf (PDF, ~351KB); download PS: SCPE_6_1_03.zip (zipped PS, ~333KB)


Impact of Realistic Workload in Peer-to-Peer Systems a Case Study: Freenet
Da Costa Georges and Olivier Richard

This article addresses the problem of evaluating the performance and behavior of large-scale Peer-to-Peer file-sharing systems. In particular, the impact of a realistic workload is considered by evaluating the Freenet system. This evaluation is carried out by simulation. A set of inputs and their distribution laws is determined in order to generate a more realistic workload. One of these inputs is an original characterization of user requests. Another contribution is to show the impact of these more realistic inputs on overall system performance. Notably, new abrupt behaviors in the learning process are described.

download PDF: SCPE_6_1_04.pdf (PDF, ~216KB); download PS: SCPE_6_1_04.zip (zipped PS, ~317KB)


Parallel Extension of a Dynamic Performance Forecasting Tool
Eddy Caron, Frederic Desprez and Frederic Suter

This paper presents an extension of a performance evaluation library called Fast to handle parallel routines. Fast is a dynamic performance forecasting tool for grid environments. We propose to combine the estimations given by Fast about sequential computation routines and network availability with parallel routine models derived from code analysis.

download PDF: SCPE_6_1_05.pdf (PDF, ~214KB); download PS: SCPE_6_1_05.zip (zipped PS, ~351KB)


Probes Coordination Protocol for Network Performance Measurement in GRID Computing Environment
Robert Harakaly, Pascale Primet, Franck Bonnassieux, Benjamin Gaidioz

The fast expansion of Grid technologies emphasizes the importance of network performance measurement. Some network measurement methods, like TCP throughput or latency evaluation, are very sensitive to concurrent measurements that may devalue the results. This paper presents the Probes Coordination Protocol (PCP), which can be used to schedule different network monitoring tasks. In addition, the paper discusses the main properties of the protocol: flexibility, efficiency, robustness, scalability and security. It also presents the results of the protocol's evaluation and of experiment periodicity measurements.

download PDF: SCPE_6_1_06.pdf (PDF, ~397KB); download PS: SCPE_6_1_06.zip (zipped PS, ~323KB)


Serialization of Distributed Threads in Java
Danny Weyns, Eddy Truyen and Pierre Verbaeten

In this paper we present a mechanism for serializing the execution-state of a distributed Java application that is implemented on a conventional Object Request Broker (ORB) architecture such as Java Remote Method Invocation (RMI). To support serialization of distributed execution-state, we developed a byte code transformer and associated management subsystem that adds this functionality to a Java application by extracting execution-state from the application code. An important benefit of our mechanism is its portability. It can transparently be integrated into any legacy Java application. Furthermore, it requires no modifications to the Java Virtual Machine (JVM) or to the underlying ORB. Our serialization mechanism can serve many purposes such as migrating execution-state over the network or storing it on disk. In particular, we describe the implementation of a prototype for repartitioning distributed Java applications at run-time. Proper partitioning of distributed objects over the different machines is critical to the global performance of the distributed application. Methods for partitioning exist, and employ a graph-based model of the application being partitioned. Our mechanism then enables these methods to be applied at any point in an ongoing distributed computation. In the implementation of the management subsystem, we experienced the problem of losing logical thread identity when the distributed control flow crosses address space boundaries. We solved this well-known problem by introducing the generic notion of distributed thread identity in Java programming. Propagation of a globally unique, distributed thread identity provides a uniform mechanism by which all the program's constituent objects involved in a distributed control flow can uniquely refer to that distributed thread as one and the same computational entity.

download PDF: SCPE_6_1_07.pdf (PDF, ~246KB); download PS: SCPE_6_1_07.zip (zipped PS, ~400KB)


Distributed Data Mining
Valerie Fiolet, Bernard Toursel

Knowledge discovery in databases, also called Data Mining, is an increasingly valuable engineering tool. The amount of data to process keeps growing and requires parallel processing.
Special interest is given to the search for association rules, and a distributed approach to the problem is considered. Such an approach requires that data be distributed so that the various parts can be processed independently. The search for association rules is generally based on a global criterion on the entire dataset. Existing algorithms employ a large number of communication actions, which is unsuited to a distributed approach on a network of workstations (NOW).
Therefore, heuristic approaches are sought for distributing the database in a coherent way so as to minimize the number of rules lost in the distributed computation.

download PDF: SCPE_6_1_08.pdf (PDF, ~149KB); download PS: SCPE_6_1_08.zip (zipped PS, ~301KB)


The Problem of Agent-Client Communication on the Internet
Maciej Gawinecki, Minor Gordon, Pawel Kaczmarek, Marcin Paprzycki

 In order for software agent technology to come to full fruition, it must be integrated in a realistic way with existing production technologies. In this paper we address one of the interesting problems of real-world agent integration: the interaction between agents and non-agents. The proposed solution is designed to provide non-agents (client software in particular) access to agent services, without restricting the capabilities of agents providing them.

download PDF: SCPE_6_1_09.pdf (PDF, ~244KB); download PS: SCPE_6_1_09.zip (zipped PS, ~407KB)

RESEARCH PAPERS

Static Analysis for Java with Alias Representation Reference-Set in High-Performance Computing
Jongwook Woo

Static analysis of aliases is needed for High-Performance Computing in Java. However, existing alias analyses built around the * operator of C/C++ are difficult to apply to Java and are even imprecise and unsafe. In this paper, we propose an alias analysis for Java that is more efficient than previous analyses for C++ and at least as precise. First, the differences between C/C++ and Java are explained and a reference-set alias representation is proposed. Second, we present flow-sensitive intraprocedural and context-insensitive interprocedural rules for the reference-set alias representation. Third, for type determination, we build a type table containing the reference variables and all their possible types. Fourth, a static alias analysis algorithm is proposed that uses the popular iterative loop method with a structural traversal of a CFG. Fifth, we show that our reference-set representation gives better performance for the alias analysis algorithm than the existing object-pair representation. Finally, we analyze the experimental results.
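As a rough intuition for why a set-based alias representation can be cheaper than a pair-based one, consider the sketch below. It is a generic illustration, not the paper's definitions: the function `as_pairs` and the variable names are assumptions, and the paper's reference-set and object-pair representations may differ in detail. The sketch only shows that n mutually aliased references fit in one set of size n, while listing them pairwise takes n(n-1)/2 entries.

```python
from itertools import combinations


def as_pairs(reference_set):
    """Expand one set of mutually aliased references into unordered pairs."""
    return {frozenset(p) for p in combinations(sorted(reference_set), 2)}


# Four reference variables assumed to refer to the same object.
aliases = {"a", "b", "c", "d"}

pairs = as_pairs(aliases)   # 6 pairs, versus a single set of 4 names
```

The gap widens quadratically as more references alias the same object, so an analysis that stores and merges sets has less data to propagate at each step than one that stores every pair.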

download PDF: SCPE_6_1_10.pdf (PDF, ~298KB); download PS: SCPE_6_1_10.zip (zipped PS, ~411KB)