| Abstracts | SCPE Volume 6, No. 3, September 2005 |
|
Many research and engineering fields, such as Bioinformatics and Particle Physics,
are counting on the development of Grid technologies to provide the huge
amounts of computational and storage resources they require. Although several
projects are working on creating a reliable infrastructure consisting of
persistent resources and services, the truth is that the Grid will become an
increasingly dynamic entity as it grows. In this paper, we present a new tool that
hides the complexity and dynamism of the Grid from developers and users,
allowing the resolution of large computational experiments in a Grid
environment by adapting the scheduling and execution of jobs to the changing
Grid conditions and dynamic application demands.
SCPE_6_3_01.pdf (PDF, ~364KB)
SCPE_6_3_01.zip (zipped PS, ~295KB)
|
|
Distributed computing continues to be an alphabet-soup of
services and protocols for managing computation and storage.
To live in this environment, applications require
middleware that can transparently adapt
standard interfaces to new distributed systems;
such middleware is known as an interposition agent.
In this paper, we present several lessons
learned about interposition agents
via a progressive study of design possibilities.
Although performance is an important concern, we pay
special attention to less tangible issues such as portability,
reliability, and compatibility. We begin with a comparison
of seven methods of interposition and select one method,
the debugger trap, that is the slowest but also the most reliable.
Using this method, we implement a complete interposition
agent, Parrot, that splices existing remote I/O systems into
the namespace of standard applications.
The primary design problem of Parrot is the mapping of
fixed application semantics into the semantics of
the available I/O systems.
We offer a detailed discussion of how errors and
other unexpected conditions must be carefully managed in
order to keep this mapping intact.
We conclude with an evaluation of the performance of the
I/O protocols employed by Parrot, and use an Andrew-like
benchmark to demonstrate that semantic differences have
consequences in performance.
SCPE_6_3_02.pdf (PDF, ~243KB)
SCPE_6_3_02.zip (zipped PS, ~240KB)
|
|
Grid programming environments need to be both portable
and efficient to exploit the computational power of dynamically
available resources.
In previous work, we have presented the divide-and-conquer based
Satin model for parallel computing
on clustered wide-area systems.
In this paper, we present the Satin implementation on top of our new Ibis
platform, which combines Java's "write once, run everywhere" portability with efficient
communication between JVMs. We evaluate Satin/Ibis
on the testbed of the EU-funded GridLab project,
showing that
Satin's load-balancing
algorithm automatically adapts both to heterogeneous processor
speeds and varying network performance, resulting in efficient utilization
of the computing resources. Our results show that when the wide-area links suffer from congestion,
Satin's load-balancing algorithm can still achieve around 80% efficiency, while
an algorithm that is not grid aware drops to 26% or less.
SCPE_6_3_03.pdf (PDF, ~427KB)
SCPE_6_3_03.zip (zipped PS, ~487KB)
|
|
The Grid presents a continuously changing environment and introduces a new set of failures. The data grid initiative has made it possible to run data-intensive applications on the grid. Data-intensive grid applications consist of two parts: a data placement part and a computation part. The data placement part is responsible for transferring the input data to the compute node and the result of the computation to the appropriate storage system. While work has been done on making computation adapt to changing conditions, little work has been done on making data placement adapt to them. In this work, we have developed an infrastructure that observes the environment and enables run-time adaptation of data placement jobs. We have enabled Stork, a scheduler for data placement jobs in heterogeneous environments like the grid, to use this infrastructure and adapt the data placement job to the environment just before execution. We have also added dynamic protocol selection and alternate protocol fall-back capability to Stork to provide superior performance and fault tolerance.
SCPE_6_3_04.pdf (PDF, ~310KB)
SCPE_6_3_04.zip (zipped PS, ~401KB)
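The dynamic protocol selection and alternate protocol fall-back described in the abstract above can be pictured with a minimal sketch. This is not Stork's implementation or API: the function names, the handler signature, and the idea of ranking protocols by observed throughput are assumptions made only for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical handler type: moves src to dst and raises an exception on failure.
TransferFn = Callable[[str, str], None]


def rank_protocols(observed_mbps: Dict[str, float]) -> List[str]:
    """Order protocol names from fastest to slowest observed throughput."""
    return sorted(observed_mbps, key=observed_mbps.get, reverse=True)


def transfer_with_fallback(
    src: str,
    dst: str,
    protocols: List[str],
    handlers: Dict[str, TransferFn],
) -> str:
    """Try each protocol in order and return the name of the first that succeeds."""
    errors = []
    for name in protocols:
        try:
            handlers[name](src, dst)
            return name
        except Exception as exc:  # any failure triggers fall-back to the next protocol
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all protocols failed: " + "; ".join(errors))
```

In this sketch the ranking step stands in for selection just before execution, and the loop stands in for fall-back: a failed transfer is not fatal as long as another protocol is available.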
|
|
We address the challenge of managing large amounts of numerical data within
computing grids consisting of a federation of clusters. We claim that
storing, accessing, updating and sharing such data should be considered by
applications as an external service. We propose a hierarchical
architecture for this service, based on a peer-to-peer approach.
This architecture is illustrated through a software platform called JuxMem
(for Juxtaposed Memory), which provides transparent access to mutable data,
while enhancing data persistence in a dynamic environment. Particular emphasis
is placed on managing the volatility of storage resources. As a proof
of concept, we describe a prototype implementation on top of the JXTA
peer-to-peer framework, and we report on a preliminary experimental
evaluation.
SCPE_6_3_05.pdf (PDF, ~232KB)
SCPE_6_3_05.zip (zipped PS, ~228KB)
|
|
The size of data sets produced on remote supercomputer facilities
frequently exceeds the processing capabilities of local
visualization workstations. This phenomenon increasingly
limits scientists when analyzing results of
large-scale scientific simulations. The problem becomes even more
prominent in scientific collaborations that span large virtual
organizations and work on shared data sets distributed
in Grid environments. In the visualization
community, this problem is addressed by distributing the
visualization pipeline. In particular, early stages of the
pipeline are executed on resources closer to the initial (remote)
locations of the data sets.
This paper presents an efficient technique for placing the first
two stages of the visualization pipeline (data access and data
filter) onto remote resources. This is realized by exploiting the
"extended retrieve" feature of GridFTP for flexible, high-performance
access to very large HDF5 files. We reduce the number of
network transactions for filtering operations by
utilizing a server-side data processing plugin, and hence reduce
latency overhead compared to GridFTP partial file access. The paper further
describes the application of hierarchical rendering techniques on
remote uniform data sets, which make use of the remote data
SCPE_6_3_06.pdf (PDF, ~846KB)
SCPE_6_3_06.zip (zipped PS, ~2.8MB)
|
|
This paper describes a data distribution algorithm suitable for copying large files to many nodes in multiple clusters in
wide-area networks. It is a self-organizing algorithm that achieves pipeline transfers, fault tolerance, scalability, and efficient route
selection. It works in the presence of today's typical network restrictions such as firewalls and Network Address Translations, making
it suitable for wide-area settings. Experimental results indicate our algorithm is able to automatically build a transfer route close to the
optimal. Propagation of a 300MB file from one root node to over 150 nodes takes about 1.5 times as long as the best time obtained by
the manually optimized transfer route.
SCPE_6_3_07.pdf (PDF, ~313KB)
SCPE_6_3_07.zip (zipped PS, ~247KB)
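To see why pipeline transfers matter when copying a large file along a chain of nodes, a back-of-the-envelope model helps. The sketch below is not the paper's algorithm. It assumes a simple chain of hops with identical link speeds and a file split into equal chunks, and the function names are hypothetical.

```python
def store_and_forward_time(num_chunks: int, num_hops: int, chunk_time: float) -> float:
    """Each node receives the whole file before forwarding it to the next node."""
    return num_hops * num_chunks * chunk_time


def pipelined_time(num_chunks: int, num_hops: int, chunk_time: float) -> float:
    """Each node forwards a chunk as soon as it arrives, so the hops overlap in time."""
    return (num_chunks + num_hops - 1) * chunk_time
```

Under these assumptions, a file of 300 chunks sent over 5 hops at one time unit per chunk takes 1500 units with store-and-forward but only 304 units when pipelined, so the cost of adding hops nearly vanishes for large files.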
|
|
The problem of data movement is central to distributed computing
paradigms like the Grid. While often overlooked, the time to stage
data and binaries can be a significant contributor to the wall-clock
program execution time in current Grid environments.
This paper describes a simple scheduler for network data movement in
Grid systems that can adaptively determine data distribution schedules
at runtime on the basis of Network Weather Service (NWS) performance
predictions. These schedules take the form of "spanning trees".
The distribution mechanism is an enhancement to the Logistical Session
Layer (LSL), a system for optimizing data transfers using
"logistics".
SCPE_6_3_08.pdf (PDF, ~243KB)
SCPE_6_3_08.zip (zipped PS, ~302KB)
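The abstract above does not say how the spanning trees are derived from the performance predictions. One plausible approach, shown here purely as an assumption, is a Prim-style greedy construction that always attaches the next host through the link with the highest predicted bandwidth. The `bandwidth` dictionary is a hypothetical stand-in for NWS forecasts and does not use any real NWS or LSL interface.

```python
import heapq
from typing import Dict, Tuple


def distribution_tree(
    root: str, bandwidth: Dict[Tuple[str, str], float]
) -> Dict[str, str]:
    """Return a child -> parent map for a spanning tree rooted at `root`.

    `bandwidth` maps (sender, receiver) pairs to predicted bandwidth.
    Hosts are attached greedily via the best available link from the tree.
    """
    hosts = {host for pair in bandwidth for host in pair}
    parent: Dict[str, str] = {}
    in_tree = {root}
    # Max-heap on bandwidth, implemented by negating values for heapq's min-heap.
    heap = [(-bw, s, d) for (s, d), bw in bandwidth.items() if s == root]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(hosts):
        _, sender, receiver = heapq.heappop(heap)
        if receiver in in_tree:
            continue  # already reached through a better link
        parent[receiver] = sender
        in_tree.add(receiver)
        # The newly attached host can now relay data to hosts outside the tree.
        for (s, d), bw in bandwidth.items():
            if s == receiver and d not in in_tree:
                heapq.heappush(heap, (-bw, s, d))
    return parent
```

With links `a->b` at 10, `a->c` at 1, and `b->c` at 8, the sketch routes `c` through `b` rather than over the slow direct link. Rebuilding the tree whenever forecasts change is what would make such a schedule adaptive at runtime.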
|
|
The Grid approach provides a vision to access, use, and manage heterogeneous resources in virtual organizations across multiple domains and organizations. This paper first analyzes some of the issues related to establishing trust and reputation in a Grid. Integrating reputation into quality management provides a way to reevaluate resource selection and service level agreement mechanisms. We introduce a reputation management framework for Grids to facilitate the complex task of improving the quality of resource selection. Based on community experience, we adapt the trust and reputation of entities through specialized services. Simple contextual quality statements are evaluated in order to affect the reputation of a monitored resource. Additionally, we introduce a novel algorithm for evaluating Grid reputation by combining two known concepts: using eigenvectors to compute reputation and integrating global trust.
SCPE_6_3_09.pdf (PDF, ~670KB)
SCPE_6_3_09.zip (zipped PS, ~826KB)
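The abstract above does not spell out the algorithm, so the following is only a generic sketch of how an eigenvector-based reputation score with a global trust term can be computed. It uses power iteration on a row-normalized trust matrix. The blending weight `alpha`, the `pretrust` vector, and all function names are assumptions, not the paper's definitions.

```python
from typing import List


def reputation_scores(
    local_trust: List[List[float]],
    pretrust: List[float],
    alpha: float = 0.15,
    iterations: int = 100,
    tol: float = 1e-10,
) -> List[float]:
    """Approximate the principal eigenvector of the blended trust matrix.

    local_trust[i][j] is how much entity i trusts entity j (non-negative).
    pretrust is a global trust distribution that sums to 1.
    """
    n = len(local_trust)
    # Row-normalize so each entity distributes a total trust of 1;
    # an entity with no opinions falls back to the global distribution.
    normalized = []
    for row in local_trust:
        total = sum(row)
        normalized.append([v / total for v in row] if total > 0 else list(pretrust))
    scores = list(pretrust)
    for _ in range(iterations):
        updated = [
            (1 - alpha) * sum(normalized[i][j] * scores[i] for i in range(n))
            + alpha * pretrust[j]
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores
```

In this sketch an entity's score grows when highly rated entities trust it, while the `pretrust` term keeps a baseline of global trust in the result and guarantees convergence.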
|
|
The Non-Dedicated Distributed Environment (NDDE) aims to muster the idle processing power of
interactive computers (workstations or PCs) into a virtual resource for parallel applications
and grid computing. NDDE is novel in the sense that it allows for safe and continuous use of
idle cycles. Unlike existing solutions, NDDE applications run inside a virtual machine
rather than in the user environment. Besides safe and continuous cycle exploitation, this
approach enables NDDE applications to run on an operating system other than that used
interactively. Our preliminary results suggest that NDDE can in fact harvest most of the idle
cycles and has almost no impact on the interactive user.
SCPE_6_3_10.pdf (PDF, ~179KB)
SCPE_6_3_10.zip (zipped PS, ~245KB)
|