
   Description of the test
   ========================

   This test executes parallel matrix-matrix multiplication
   on a one-dimensional array of processes.
   The matrices are distributed using a naive one-dimensional
   heterogeneous distribution based on horizontal striping.
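
   The idea of a heterogeneous horizontal striping can be sketched as
   follows: each process receives a number of matrix rows roughly
   proportional to its relative speed. The helper partition_rows and its
   parameters are hypothetical names used only for illustration; they are
   not taken from mxm_i.c, and the actual partitioning functions of the
   test may work differently.

```c
#include <assert.h>

/* Hypothetical helper, not part of the test sources: splits n matrix rows
 * among p processes in proportion to their relative speeds.
 * rows[i] receives the number of rows in the horizontal stripe of process i. */
void partition_rows(int n, int p, const double *speeds, int *rows)
{
    double total = 0.0;
    int assigned = 0;

    assert(n >= 0 && p > 0);
    for (int i = 0; i < p; i++) {
        assert(speeds[i] > 0.0);
        total += speeds[i];
    }

    /* First pass: each process gets the integer part of its proportional share. */
    for (int i = 0; i < p; i++) {
        rows[i] = (int)(n * speeds[i] / total);
        assigned += rows[i];
    }

    /* Second pass: hand out leftover rows one at a time to the process
     * that would finish soonest after taking one more row. */
    while (assigned < n) {
        int best = 0;
        for (int i = 1; i < p; i++) {
            if ((rows[i] + 1) / speeds[i] < (rows[best] + 1) / speeds[best])
                best = i;
        }
        rows[best]++;
        assigned++;
    }
}
```

   With n = 640 and relative speeds 1, 2, 3 and 4, this sketch assigns
   stripes of 64, 128, 192 and 256 rows.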

   The test uses "HMPI_Group_auto_create" to determine the optimal number of
   processes for executing the algorithm.

   The execution time of the algorithm is returned.

   CONDITIONS
   ----------
   N must be a multiple of recon_r
   recon_t must be a multiple of recon_r
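
   The two conditions can be checked before a run with a small helper
   such as the one below. This is only a sketch: the function name
   conditions_hold is hypothetical, and it assumes that N, recon_r and
   recon_t are plain integer parameters.

```c
#include <assert.h>

/* Hypothetical check, not part of the test sources.
 * Returns 1 when both conditions hold, 0 otherwise. */
int conditions_hold(int n, int recon_r, int recon_t)
{
    if (recon_r <= 0)
        return 0;   /* guards the modulo operations below */
    return (n % recon_r == 0) && (recon_t % recon_r == 0);
}
```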

   Files
   -----

   mxm.c           ---> Contains the algorithm execution code.
   counter.h       ---> Contains the parameters (the size of the problem).
   mxm_i.c         ---> Partitioning functions to partition the matrix.
   mxm_recon.c     ---> HMPI_Recon call to refresh the speeds.
   ParallelAxB.mpc ---> Performance model description of the algorithm.
   ParallelAxB.c   ---> Code generated from ParallelAxB.mpc.

   HOW TO RUN
   ----------
shell$ hmpicc ParallelAxB.mpc

shell$ hmpibcast mxm.c mxm_i.c mxm_i.h ParallelAxB.c counter.h mxm_recon.c

shell$ hmpiload -o mxm mxm.c

shell$ hmpirun mxm
N=640, Optimal number of processes=4, time(sec)=0.998
