Data distribution for dense factorization on computers with memory heterogeneity

Title: Data distribution for dense factorization on computers with memory heterogeneity
Publication Type: Journal Article
Year of Publication: 2007
Authors: Lastovetsky, A., and R. Reddy
Journal Title: Parallel Computing
Volume: 33
Issue: 12
Pages: 757-779
Journal Date: 12/2007
Keywords: computation performance models, fupermod
Abstract

In this paper, we study the problem of optimal matrix partitioning for parallel dense factorization on heterogeneous processors. First, we outline existing algorithms that solve the problem using a constant performance model of processors, in which the relative speed of each processor is represented by a positive constant. We also propose a new efficient algorithm, called the Reverse algorithm, that solves the problem under the constant performance model. We then extend the presented algorithms to the functional performance model, which represents the speed of a processor by a continuous function of the task size. This model, in particular, accounts for memory heterogeneity and paging effects, which cause significant variations in the relative speeds of the processors as the task size increases. We experimentally demonstrate that the functional extension of the Reverse algorithm outperforms the functional extensions of traditional algorithms.
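
The difference between the two performance models can be illustrated with a minimal Python sketch. It is not the Reverse algorithm or any other algorithm from the paper, and it ignores the shrinking active matrix that makes dense factorization harder than a one-shot split. It only distributes n equal work units greedily, so that the same routine can be driven by constant speeds or by speed functions of the task size. The names `partition`, `constant_speed`, and `paging_speed`, and the shape of the paging curve, are assumptions made for illustration.

```python
def partition(n, speed_fns):
    """Greedily assign n equal work units, one at a time, to the processor
    that would finish soonest after receiving one more unit.

    speed_fns[i](x) is the assumed speed of processor i (units per second)
    when it holds a task of size x.
    """
    shares = [0] * len(speed_fns)
    for _ in range(n):
        # Estimated finish time of processor j if it took one more unit.
        i = min(range(len(speed_fns)),
                key=lambda j: (shares[j] + 1) / speed_fns[j](shares[j] + 1))
        shares[i] += 1
    return shares


def constant_speed(s):
    """Constant performance model: speed is a positive constant."""
    return lambda x: s


def paging_speed(peak, mem_limit):
    """Hypothetical functional model: full speed while the task fits in
    memory, then speed decaying as the task grows (a stand-in for paging)."""
    return lambda x: peak if x <= mem_limit else peak * mem_limit / x


# Constant model: shares follow the fixed 3:2:1 speed ratio.
print(partition(12, [constant_speed(3), constant_speed(2), constant_speed(1)]))

# Functional model: the nominally faster processor pages beyond 10 units,
# so for a large task it receives far less than its 2:1 constant share.
print(partition(60, [paging_speed(4, 10), constant_speed(2)]))
```

With constant speeds the shares are proportional to the speeds regardless of n. With the size-dependent speed, the split depends on n itself, which is the effect the functional performance model is meant to capture.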

Attachment: sdarticle.pdf (714.34 KB)