
   Description of the test
   ========================

   This test executes parallel matrix-matrix multiplication
   on a one-dimensional array of processes.
   The matrices are distributed using a naive one-dimensional
   heterogeneous distribution with horizontal striping.
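
   As an illustration of horizontal striping on heterogeneous processors,
   the sketch below splits n rows among p processes in proportion to their
   speeds. It is not the code in mxm_i.c: the function name partition_rows,
   its signature, and the rule for handing out leftover rows are assumptions
   made for this example only.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not the routine in mxm_i.c).
 * Splits n matrix rows among p processes in proportion to speeds[],
 * storing the row count of process i in rows[i].
 * Returns 0 on success and -1 on invalid input.
 */
int partition_rows(int n, int p, const double *speeds, int *rows)
{
    double total = 0.0;
    int assigned = 0;
    int i;

    if (n < 0 || p <= 0 || speeds == NULL || rows == NULL)
        return -1;

    for (i = 0; i < p; i++) {
        if (speeds[i] <= 0.0)
            return -1;
        total += speeds[i];
    }

    /* Start from the rounded-down proportional share of each process. */
    for (i = 0; i < p; i++) {
        rows[i] = (int)((double)n * speeds[i] / total);
        assigned += rows[i];
    }

    /* Give each leftover row to the process that would finish it soonest,
     * estimating time as (rows + 1) / speed. Ties go to the lowest index. */
    while (assigned < n) {
        int best = 0;
        for (i = 1; i < p; i++) {
            if ((rows[i] + 1) / speeds[i] < (rows[best] + 1) / speeds[best])
                best = i;
        }
        rows[best]++;
        assigned++;
    }
    return 0;
}
```

   For example, with relative speeds 1, 2, 1 and n = 8, this sketch assigns
   2, 4 and 2 rows to the three processes.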

   The function HMPI_Group_auto_create is used to find the optimal 
   number of processes needed to execute the algorithm. 
   
   The execution time of the algorithm is reported in seconds.

   CONDITIONS
   ----------
   N must be a multiple of recon_r
   recon_t must be a multiple of recon_r
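
   The two conditions can be checked before starting a run. The sketch
   below is a minimal example; the function name check_conditions and the
   use of int for all three parameters are assumptions, and where N,
   recon_r and recon_t are actually set is not stated here.

```c
#include <stdio.h>

/*
 * Hypothetical check of the documented conditions.
 * Returns 1 if both hold, 0 otherwise (with a message on stderr).
 */
int check_conditions(int N, int recon_r, int recon_t)
{
    if (recon_r <= 0) {
        fprintf(stderr, "recon_r must be positive\n");
        return 0;
    }
    if (N % recon_r != 0) {
        fprintf(stderr, "N (%d) is not a multiple of recon_r (%d)\n",
                N, recon_r);
        return 0;
    }
    if (recon_t % recon_r != 0) {
        fprintf(stderr, "recon_t (%d) is not a multiple of recon_r (%d)\n",
                recon_t, recon_r);
        return 0;
    }
    return 1;
}
```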

   FILES
   -----

   ParallelAxB.mpc ----> Performance model definition for the 1D algorithm
                         of parallel matrix-matrix multiplication
   ParallelAxB.c   ----> Generated code of the performance model definition
   mxm_recon.c     ----> Calls HMPI_Recon to refresh the processor speeds
   mxm_i.c         ----> Contains functions that partition the matrix according to the
                         speeds of the processors
   mxm.c           ----> Contains the algorithm execution code
   counter.h       ----> Contains the parameters:
                         Size of the problem

   HOW TO RUN
   ----------
   shell$ hmpicc ParallelAxB.mpc

   shell$ hmpibcast mxm.c mxm_recon.c mxm_i.h mxm_i.c ParallelAxB.c counter.h

   shell$ hmpiload -o mxm mxm.c 

   shell$ hmpirun mxm
   Optimal number of processes = 4
   N=640, t(sec)=3.346649964
