The research focuses on algorithms, tools, and applications for scalable
high-performance computer systems. One important issue is the scalability of a
parallel algorithm (or application), which is a measure of its capability to
effectively use an increasing number of processors. Current research topics
include high-performance and portable linear algebra kernels, direct and
iterative methods for linear systems and eigenvalue problems, development tools
for parallel computing, and parallel algorithms in optimization
(see Section~\ref{OPT}).
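One common way to make this notion of scalability quantitative (a standard
textbook definition, assumed here rather than taken from the description above)
uses the speedup $S(p)$ and the parallel efficiency $E(p)$ on $p$ processors,
where $T(p)$ denotes the execution time on $p$ processors:
\[
  S(p) = \frac{T(1)}{T(p)}, \qquad E(p) = \frac{S(p)}{p}.
\]
Under this definition, an algorithm scales well if $E(p)$ stays close to $1$ as
$p$ grows. As a purely illustrative calculation, $T(1) = 100$~s and
$T(16) = 8$~s give $S(16) = 12.5$ and $E(16) = 12.5/16 \approx 0.78$.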
The process of developing an efficient parallel algorithm includes
two major components.
First, the overall problem must be identified and specified as a set of tasks
that can be performed concurrently. Second, these tasks must be mapped onto
different processors so that the overall communication overhead is minimized.
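The mapping step can be illustrated with a minimal sketch. It assumes the tasks are numbered so that neighbouring tasks share data, in which case assigning contiguous blocks of tasks to each processor tends to keep communication low. The function name `block_partition` and this choice of mapping are hypothetical illustrations, not a method described above.

```python
def block_partition(n_tasks, n_procs):
    """Assign tasks 0..n_tasks-1 to processors in contiguous blocks.

    Returns one half-open (start, stop) range per processor.
    Block sizes differ by at most one task, so the load is balanced.
    (Hypothetical helper, for illustration only.)
    """
    base, extra = divmod(n_tasks, n_procs)
    ranges = []
    start = 0
    for p in range(n_procs):
        # The first `extra` processors take one additional task each.
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for p, (lo, hi) in enumerate(block_partition(10, 4)):
        print(f"processor {p}: tasks {lo}..{hi - 1}")
```

A cyclic (round-robin) mapping would balance the load equally well, but it would place neighbouring tasks on different processors, which is why a block mapping is the one sketched here.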
The memory organization in current advanced computer architectures is
hierarchical.
Accesses to data in the upper levels of the memory hierarchy (registers, cache,
and/or local memory) are much faster than accesses to data in the lower levels
(off-processor and shared memory). To approach peak performance (measured in
Mflops), the computations must be organized so that the reuse of data in the
upper levels of the memory hierarchy is maximized.
By maximizing data locality (or data reuse), we minimize the data movement
within the memory hierarchy and the communication overhead between processors.
There is often a tradeoff between maximizing concurrency and maximizing data
locality.
One technique is to reorganize standard algorithms for linear algebra
applications so that they perform matrix--matrix (level~3) operations in their
inner loops.
This approach has been used successfully in the development of the
high-performance library LAPACK.
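The idea behind such a reorganization can be sketched with a blocked matrix product. The following is a minimal illustration in plain Python, not LAPACK code. The function names and the block size `bs` are assumptions made for this example. The blocked version updates one small block of the result at a time, so the blocks of the operands it touches can stay in fast memory while they are reused.

```python
def matmul_naive(A, B):
    """Reference triple loop: C = A * B for lists of lists."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for l in range(k):
                s += A[i][l] * B[l][j]
            C[i][j] = s
    return C


def matmul_blocked(A, B, bs=2):
    """Same product, computed block by block (block size bs is illustrative)."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, bs):
        for j0 in range(0, m, bs):
            for l0 in range(0, k, bs):
                # Block update: C[i0:, j0:] += A[i0:, l0:] * B[l0:, j0:].
                # Each entry of the A and B blocks is reused up to bs times.
                for i in range(i0, min(i0 + bs, n)):
                    for l in range(l0, min(l0 + bs, k)):
                        a = A[i][l]
                        for j in range(j0, min(j0 + bs, m)):
                            C[i][j] += a * B[l][j]
    return C


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[7, 8], [9, 10], [11, 12]]
    print(matmul_blocked(A, B, bs=2))
```

In interpreted Python the blocking brings no real speedup, so the sketch only shows the loop structure. In a compiled setting the innermost block update would typically be a single call to a tuned level~3 BLAS routine such as DGEMM.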
(To be revised! Add new information.)
Some ongoing projects:
Members of the group
Former group members