# Challenges Concerning Symbolic Computations on Grids

## ##plugins.themes.bootstrap3.article.main##

## Abstract

Symbolic and algebraic computations are currently ones of fastest growing areas of scientific computing. For a long time, the numerical approach to computational solution of mathematical problems had an advantage of being capable of solving a substantially larger set of problems than the other approach, the symbolic one. Only recently the symbolic approach gained more recognition as a viable tool for solving large-scale problems from physics, engineering or economics, reasoning, robotics or life sciences. Developments in symbolic computing were lagging relative to numerical computing, mainly due to the inadequacy of available computational resources, most importantly computer memory, but also processor power. Continuous growth in the capabilities of computer hardware led naturally to an increasing interest in symbolic calculations and resulted, among others things, in development of sophisticated Computer Algebra Systems (CASs).

CASs allow users to study computational problems on the basis of their mathematical formulations and to focus on the problems themselves instead of spending time transforming the problems into forms that are numerically solvable. While their major purpose is to manipulate formulas symbolically, many systems have substantially extended their capabilities, offering nowadays functionalities like graphics allowing a comprehensive approach to problem solving. While, typically, CAS systems are utilized in an interactive mode, in order to solve large problems they can be also used in a batch mode and programmed using languages that are close to common mathematical notation.

As CASs become capable of solving large problems, they follow the course of development that has already been taken by numerical software: from sequential computers to parallel machines to distributed computing and finally to the grid. It is particularly the grid that has the highest potential as a discovery accelerator. Currently, its widespread adoption is still impeded by a number of problems, one of which is difficulty of developing and implementing grid-enabled programs. That it is also the case for grid-enabled symbolic computations.

There are several classes of symbolic and algebraic algorithms that can perform better in parallel and distributing computing environments. For example for multiprecision integer arithmetic, that appears among others in factorizations, were developed already twenty years ago systolic algorithms and implementations on massive parallel processors, and more recently, on the Internet. Another class that utilize significant amount of computational resources is related to the implementations of polynomial arithmetic: knowledge based algorithms such as symbolic differentiation, factorization of polynomials, greatest common divisor, or, more complicated, Groebner base computations. For example, in the latest case, the size of the computation and the irregular data structures make the parallel or distributed implementation not only an attractive option for improving the algorithm performance, but also a challenge for the computational environment. A third class of algorithms that can benefit from multiple resources in parallel and distributed environments is concerning the exact solvers of large systems of equations.

The main reason driving the development of parallel and distributed algorithms for symbolic computations is the ability to solve problems that are memory bound, i.e. that cannot fit into memory of a single computer. An argument for this statement relies on the observation that the input size of a symbolic or algebraic computation can be small, but the memory used in the intermediate stages of the computation may grow considerably.

Modern CASs increase their utility not only through new symbolic capabilities, but also expending their applicability using visualization or numerical modules and becoming more than only specific computational kernels. They are real problem solving environments based on interfaces to a significant number of computational engines. In this context it appears also the need to address the ability to reduce the wall-clock time by using parallel or distributed computing environment. A simple example is the case of rendering the images for a simulation animation.

Several approaches can be identified in the historical evolution of parallel and distributed CASs: developing versions for shared memory architectures, developing computer algebra hardware, adding facilities for communication and cooperation between existing CASs, or building distributed systems for distributed memory parallel machines or even across Internet.

Developing completely new parallel or distributed systems, although efficient, in most cases is rather difficult. Only a few parallel or distributed algorithms within such a system are fully implemented and tested. Still there are several successful special libraries and systems falling in this category: ParSac-2 system, the parallel version of SAC-2, Paclib system, the parallel extension of Saclib, FLATS based on special hardware, STAR/MPI, the parallel version of GAP, ParForm, the parallel version of Form, Cabal, MuPAD, or the recent Givaro, for parallel computing environments, FoxBox or DSC, for distributed computing environments.

An alternative approach to build parallel and distributed CASs is to add the new value, the parallelism or the distribution, to an existing system. The number of parallel and distributed versions of most popular CASs is impressive and it can be explained by the different requirements or targeted architectures. For example, for Maple there are several implementations on parallel machines, like the one for Intel Paragon or ||Maple||, and several implementations on networks of workstations, like Distributed Maple or PVMaple. For Mathematica there is a Parallel Computing Toolkit, a Distributed Mathematica and a gridMathematica (for dedicated clusters). Matlab that provides a Symbolic Math Toolbox based on a Maple kernel has more than twenty different parallel or distributed versions: DP-Toolbox, MPITB/PVMTB, MultiMatlab, Matlab Parallelization Toolkit, ParMatlab, PMI, MatlabMPI, MATmarks, Matlab*p, Conlab, Otter and others.

More recent web-enabled systems were proved to be efficient in number theory for finding large prime numbers, factoring large numbers, or finding collisions on known encryption algorithms. Online systems for complicated symbolic computations were also built: e.g. OGB for Groebner basis computations. A framework for description and provision of web-based mathematical services was recently designed within the Monet project and a symbolic solver wrapper was build to provide an environment that encapsulates CASs and expose their functionalities through symbolic services (Maple and Axiom were chosen as computing engines). Another platform is MapleNet build on client-server architecture: the server manages concurrent Maple instances launched to server client requests for mathematical computations. WebMathematica is a similar system that offers access to Mathematica applications through a web browser.

Grid-oriented projects that involve CASs were only recent initiated. The well-known NetSolve system was one of the earliest grid system developed. Version 2 released in 2003 introduces GridSolve for interoperability with the grid based on agent technologies. APIs are available for Mathematica, Octave and Matlab. The Genss project (Grid Enabled Numerical and Symbolic Services) follows the ideas of the Monet project and intends also to combine grid computing and mathematical web services using a common agent-based framework. Several projects are porting Matlab on grids: from small ones, like Matlab*g, to very complex ones, like Geodise. Maple2g and MathGridLink are two different approaches for grid-enabled version of Maple and Mathematica. Simple to use front-end were recently build in projects like Gemlca and Websolve to deploy legacy code applications as grid services and to allows the submission of computational requests.

The vision of grid computing is that of a simple and low cost access to computing resources without artificial barriers of physical location or ownership. Unfortunately, none of the above mentioned grid-enabled CAS is responding simultaneously to some elementary requirements of a possible implementation of this vision: deploy grid symbolic services, access within CAS to available grid services, and couple different grid symbolic services.

Moreover a number of major obstacles remain to be addressed. Amongst the most important are mechanisms for adapting to dynamic changes in either computations or systems. This is especially important for symbolic computations, which may be highly irregular in terms of data and general computational demands. Such demands received until now relatively little attention from the research community.

In the context of a growing interest in symbolic computations, powerful computer algebra systems are required for complex applications. Freshly started projects shows that porting a CAS to a current distributed environment like a grid is not a trivial task not only from technological point of view but also from algorithmic point of view. Already existing tools are allowing experimental work to be initiated, but a long way is still to be cross until real-world problems will be solved using symbolic computations on grids.

Dana Petcu,

Western University of Timisoara