Towards Peak Performance on Hierarchical Memory Architectures - New Recursive Blocked Data Formats and BLAS's
Fred Gustavson1, Isak Jonsson2, Bo Kågström2, and Per Ling2

Abstract

In 1998, a first version of the GEMM-based Level 3 BLAS for
superscalar type processors was announced. This version was adopted by many organizations including the ATLAS and PHiPAC projects. Here, we describe
recent developments in the project that incorporate techniques for handling symmetric multiprocessing (SMP) and deep memory hierarchies
efficiently. These techniques include recursive algorithms and recursive blocked data formats, which lead to routines that automatically adapt to
complex memory hierarchies and, owing to improved data locality, naturally facilitate parallelization.

1 IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY 10598, U.S.A.
E-mail: gustav@watson.ibm.com
2 Department of Computing Science and HPC2N, Umeå University, SE-901 87 Umeå, Sweden
E-mail: {isak,bokg,pol}@cs.umu.se