
   Description of the test
   ========================

   This test executes a parallel matrix-matrix multiplication
   on a two-dimensional grid of processes.

   The matrices are distributed using a two-dimensional
   heterogeneous block-cyclic distribution.

   The test uses "HMPI_Timeof" to find the optimal generalized block size.
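
   The search for a block size can be pictured as a loop that asks for a
   predicted execution time for each candidate and keeps the smallest.
   The following is only a minimal sketch under stated assumptions: the
   names "estimate_fn", "toy_estimate" and "find_best_block" are
   hypothetical, the callback merely stands in for a prediction such as
   the one obtained from HMPI_Timeof (whose real arguments differ), and
   the candidate range is an assumption, not the range used in mxm_i.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a predicted-time query; the real
   HMPI_Timeof call takes different arguments. */
typedef double (*estimate_fn)(int row_block, int col_block);

/* Toy estimator used only for illustration: smallest at (3, 7). */
double toy_estimate(int row_block, int col_block)
{
    double dr = (double)(row_block - 3);
    double dc = (double)(col_block - 7);
    return dr * dr + dc * dc;
}

/* Try every (row, column) block size in [min_block, max_block] and keep
   the pair with the smallest predicted time. Returns that time, or -1.0
   if the range is empty (the outputs are then left untouched). */
double find_best_block(estimate_fn estimate, int min_block, int max_block,
                       int *best_row, int *best_col)
{
    assert(estimate != NULL && best_row != NULL && best_col != NULL);
    double best_time = -1.0;
    for (int row = min_block; row <= max_block; row++) {
        for (int col = min_block; col <= max_block; col++) {
            double t = estimate(row, col);
            if (best_time < 0.0 || t < best_time) {
                best_time = t;
                *best_row = row;
                *best_col = col;
            }
        }
    }
    return best_time;
}
```

   Passing the estimator as a function pointer keeps the search loop
   independent of how the time is actually predicted.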

   CONDITIONS
   ----------
   N must be a multiple of r.
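
   The condition is easy to check before starting a run. A minimal
   sketch, assuming N and r are plain positive integers; the helper name
   "is_valid_size" is hypothetical and is not part of the test sources.

```c
/* Returns 1 when n is a positive multiple of r, 0 otherwise. */
int is_valid_size(int n, int r)
{
    if (n <= 0 || r <= 0)
        return 0;
    return n % r == 0;
}
```

   For example, N=480 passes the check with r=16 and with r=32.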

   FILES
   -----

   mxm_i.h         ----> Header containing the function and variable declarations
   mxm_i.c         ----> Contains the parallel matrix-matrix multiplication algorithm
                         using the heterogeneous block-cyclic distribution of matrices.
                         Uses HMPI_Timeof to find the optimal generalized block size
   mxm.c           ----> Contains the main function
   Load_balance.c  ----> Distributes matrices A, B, and C among the processes
                         (in proportion to the speeds of the processors)
   counter.h       ----> Contains the parameters
                         N = size of the matrix to solve
                         r = granularity, or communication-to-computation ratio (typical values: 16, 32)
                         p = number of processes along the row
                         q = number of processes along the column
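
   Distributing work in proportion to processor speeds can be sketched as
   follows. This is an illustrative sketch only, not the code in
   Load_balance.c: the function name "split_by_speed", the rounding rule
   and the assumption that every speed is a positive number are all
   hypothetical choices made for the example.

```c
#include <assert.h>

/* Split `total` rows among `nprocs` processes in proportion to `speeds`
   (assumed positive). Each share is rounded down first; every leftover
   row then goes to the process that would finish its rows soonest,
   i.e. the one with the smallest (rows + 1) / speed. */
void split_by_speed(int total, const double *speeds, int nprocs, int *rows)
{
    assert(total >= 0 && nprocs > 0);

    double sum = 0.0;
    for (int i = 0; i < nprocs; i++)
        sum += speeds[i];
    assert(sum > 0.0);

    int assigned = 0;
    for (int i = 0; i < nprocs; i++) {
        rows[i] = (int)(total * speeds[i] / sum);
        assigned += rows[i];
    }

    while (assigned < total) {
        int best = 0;
        for (int i = 1; i < nprocs; i++) {
            if ((rows[i] + 1) / speeds[i] < (rows[best] + 1) / speeds[best])
                best = i;
        }
        rows[best]++;
        assigned++;
    }
}
```

   With relative speeds 1, 2, 3, 2 and 16 rows, the shares come out as
   2, 4, 6 and 4 rows, so each process needs about the same time.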

   HOW TO RUN
   ----------
shell$ hmpicc ParallelAxB.mpc

shell$ hmpibcast mxm.c mxm_i.c mxm_i.h ParallelAxB.c counter.h Load_balance.c

shell$ hmpiload -o mxm mxm.c

shell$ hmpirun mxm
===================================
=========row block size=4, column block size=60=============
TIMEOF = 0.000021
===================================
=========row block size=5, column block size=4=============
TIMEOF = 0.167114
===================================
=========row block size=5, column block size=5=============
TIMEOF = 0.083557
===================================
=========row block size=5, column block size=6=============
TIMEOF = 0.167114

Optimal generalised block size row = 1, Optimal generalised block size col = 10


Starting the matrix-matrix multiplication
N=480, p=1, q=4, block size row=1, block size column=10, time(sec)=1.100
